API Reference

All public exports come from:

import kayak

This page focuses on the stable Python surface and the backend contract.

Hosted-engine Python note:

this API page covers import kayak
the hosted-engine Python surface lives under import kayak_engine
for pinned hosted snapshots and the local exact-search scheduler, see Hosted Engine Python

Choose The API Section

I Need The Main Entry Point

Start with:

kayak.open_text_retriever(...) for one high-level plain-text workflow
kayak.open_encoder(...) if you need encoder control
kayak.open_store(...) if you need persistence control

I Already Have Vectors

Start with:

kayak.query(...)
kayak.documents(...).pack()
kayak.search(...)
kayak.search_batch(...)

I Need A Database Adapter

Start with:

kayak.open_store(...)
store.load_index(...)

I Need Plans Or Candidate Stages

Jump to:

Candidate Generation And Plans
Search Plans

I Need Hosted Snapshot Reuse

This is the wrong page.

Use:

Hosted Engine Python

Common Calls Cheat Sheet

If you want to...	Use...	Read...
ingest plain text and search with one object	`kayak.open_text_retriever(...)`	Usage Patterns
encode with a ColBERT checkpoint from Hugging Face	`kayak.open_encoder("colbert", model_name="...")`	Text Encoders
wrap your own token-level model methods	`kayak.open_encoder("callable", ...)`	Text Encoders
run one exact search from vectors	`kayak.query(...)` + `kayak.documents(...).pack()` + `kayak.search(...)`	Quickstart
get the full score vector instead of only top-k	`kayak.maxsim(...)`	Usage Patterns
run many queries against one fixed index	`kayak.query_batch(...)` + `kayak.search_batch(...)`	Usage Patterns
keep an existing vector database for storage	`kayak.open_store(...)` + `store.load_index(...)`	Storage + Search
run a stage-1 candidate step plus exact rerank	`kayak.document_proxy_search_plan(...)` + `kayak.search_with_plan(...)`	Search Plans
check whether Mojo is actually available	`kayak.available_backends()` + `kayak.backend_info(...)`	Using the Mojo Backend
inspect the Python-to-Mojo bridge directly	`kayak.mojo_bridge_info()`	Installation
get generated terminal help from the public API itself	`kayak.help(...)`	Installation
look up one retriever method from the REPL	`kayak.help("search_text")`	Usage Patterns
use normal Python hover/help in an editor or notebook	`inspect.getdoc(...)` and `inspect.signature(...)`	Installation

Constructors

`kayak.open_encoder(kind, **kwargs)`

Open a public text encoder from the stable factory.

encoder = kayak.open_encoder(
    "callable",
    query_encoder=my_query_encoder,
    document_encoder=my_document_encoder,
)

Built-in kinds:

"callable"
"colbert"

Use:

"colbert" when the model is a ColBERT checkpoint and model_name is the Hugging Face repo id
"callable" when you already have query/document methods that emit token vectors

`kayak.open_store(kind, **kwargs)`

Open a public persistence adapter from the stable factory.

store = kayak.open_store("kayak", path="./kayak-index")

Built-in kinds:

"memory"
"kayak" and "directory"
"lancedb"
"pgvector" with aliases "postgres" and "postgresql"
"qdrant"
"weaviate"
"chromadb" and "chroma"

`kayak.open_text_retriever(...)`

Open one high-level text retriever that composes:

one encoder
one store
one default backend

retriever = kayak.open_text_retriever(
    encoder="callable",
    store="memory",
    encoder_kwargs={
        "query_encoder": my_query_encoder,
        "document_encoder": my_document_encoder,
    },
)

The encoder and store arguments can be either:

a registered factory kind like "colbert" or "kayak"
an already-constructed encoder or store object

Backend policy for this high-level constructor:

prefers kayak.MOJO_EXACT_CPU_BACKEND when Mojo is available in the active environment
falls back to kayak.NUMPY_REFERENCE_BACKEND otherwise
accepts backend=... when you want to override the default explicitly
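
The policy above can be sketched as a small selection function. This is an illustration only, not Kayak's implementation: pick_backend is a hypothetical helper, and the string values are assumed to match the names shown in the available_backends() output under Backends.

```python
# Hypothetical helper illustrating the documented policy; not part of kayak.
MOJO = "mojo_exact_cpu"      # assumed value of kayak.MOJO_EXACT_CPU_BACKEND
NUMPY = "numpy_reference"    # assumed value of kayak.NUMPY_REFERENCE_BACKEND


def pick_backend(available, override=None):
    """Return the explicit override if given, else prefer Mojo, else NumPy."""
    if override is not None:
        return override
    return MOJO if MOJO in available else NUMPY


with_mojo = pick_backend(("numpy_reference", "mojo_exact_cpu"))
without_mojo = pick_backend(("numpy_reference",))
forced = pick_backend(("numpy_reference", "mojo_exact_cpu"), override=NUMPY)
```

In real code, the available tuple would come from kayak.available_backends().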

This constructor expects text strings plus an explicit encoder. It does not own OCR, document parsing, extraction, or answer generation.

`kayak.query(token_vectors, *, text=None)`

Build a LateQuery from a 2D token-vector matrix.

Accepted inputs are objects that can be converted to a 2D array-like matrix, including NumPy arrays and torch tensors.

query = kayak.query(vectors)
query_with_text = kayak.query(vectors, text="founded in 1984 in a church")

`kayak.query_batch(token_vectors)`

Build a LateQueryBatch from a sequence of query matrices.

Each query can have a different number of vectors. All queries must share the same vector dimension.

batch = kayak.query_batch([query_a_vectors, query_b_vectors])
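
The shape rule (ragged token counts, one shared vector dimension) can be checked before building a batch. The sketch below uses plain nested lists and a hypothetical check_query_batch helper; it does not reflect how Kayak validates input internally.

```python
def check_query_batch(queries):
    """Validate a ragged batch: token counts may differ, vector_dim may not."""
    dims = {len(vec) for q in queries for vec in q}
    if len(dims) != 1:
        raise ValueError(f"expected one shared vector dimension, got {sorted(dims)}")
    return dims.pop(), [len(q) for q in queries]


query_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 tokens, dim 2
query_b = [[1.0, 1.0]]                           # 1 token, dim 2
vector_dim, token_counts = check_query_batch([query_a, query_b])
```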

`kayak.documents(doc_ids, token_vectors, *, texts=None)`

Build LateDocuments from:

document ids
one token-vector matrix per document
optional document texts

documents = kayak.documents(
    ["doc-a", "doc-b"],
    [doc_a_vectors, doc_b_vectors],
    texts=["text a", "text b"],
)

Layout Constructors

These are direct constructors for already-materialized layouts.

`kayak.packed_index(doc_ids, doc_offsets, token_vectors, *, doc_texts=None)`

Build a LateIndex directly in packed layout.

Use this only when you already own packed storage fields.
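
For orientation, one common way to define a packed layout is a single concatenated token matrix plus an offsets array marking where each document starts and ends. The sketch below assumes that convention: offsets of length n_docs + 1, with document i occupying rows doc_offsets[i]:doc_offsets[i + 1]. This page does not spell out the exact field semantics, so verify them before relying on this. pack_documents and document_rows are hypothetical helpers, not kayak functions.

```python
def pack_documents(per_doc_vectors):
    """Concatenate per-document token matrices and record row offsets."""
    doc_offsets = [0]
    token_vectors = []
    for vectors in per_doc_vectors:
        token_vectors.extend(vectors)
        doc_offsets.append(len(token_vectors))
    return doc_offsets, token_vectors


def document_rows(doc_offsets, token_vectors, i):
    """Recover the token rows of document i from the packed fields."""
    return token_vectors[doc_offsets[i]:doc_offsets[i + 1]]


doc_a = [[1.0, 0.0], [0.0, 1.0]]  # 2 tokens
doc_b = [[0.5, 0.5]]              # 1 token
doc_offsets, token_vectors = pack_documents([doc_a, doc_b])
```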

`kayak.hybrid_flat_dim128_index(doc_ids, doc_offsets, token_values, *, doc_texts=None)`

Build a LateIndex directly in hybrid_flat_dim128 layout.

Requires a 128-dimensional flattened token-value buffer.

`kayak.flat_query_dim128(token_values, *, text=None)`

Build a LateQuery directly in flat_dim128 layout from flat values.

Requires 128-dimensional vectors.
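
To make "flat values" concrete, the sketch below flattens nested 128-dimensional token vectors into one value list and rejects any other dimension. The row-major ordering and the helper name to_flat_dim128 are assumptions made for illustration; only the 128-dimension requirement comes from this page.

```python
FLAT_DIM = 128


def to_flat_dim128(token_vectors):
    """Flatten a list of 128-d token vectors into one row-major value list."""
    for vec in token_vectors:
        if len(vec) != FLAT_DIM:
            raise ValueError(f"flat_dim128 requires vector_dim == 128, got {len(vec)}")
    return [value for vec in token_vectors for value in vec]


nested = [[0.0] * FLAT_DIM, [1.0] * FLAT_DIM]  # two tokens
flat = to_flat_dim128(nested)
```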

Object Methods

`LateDocuments.pack()`

Pack LateDocuments into a searchable LateIndex.

index = documents.pack()

Text Encoders

`kayak.CallableLateTextEncoder(query_encoder, document_encoder)`

Wrap user-provided Python callables behind Kayak's text encoder contract.

Those callables can come from plain functions or bound model methods. They are called one text at a time by the public SDK surface.

`kayak.ColBERTTextEncoder(model_name='colbert-ir/colbertv2.0', *, checkpoint=None, gpus=0)`

Encode text with a ColBERT checkpoint into LateQuery and LateDocuments.

model_name is the Hugging Face repo id for the ColBERT checkpoint. The public encoder path encodes one query or document text at a time.

`kayak.register_encoder(kind, factory, *, replace=False)`

Register a custom encoder factory behind open_encoder(...).
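
The registration pattern can be illustrated with a plain dictionary registry. This is a sketch of the general factory-registry idea, not Kayak's internals: the _REGISTRY dict, the ValueError raised on a duplicate kind, and the toy factory are all assumptions.

```python
_REGISTRY = {}  # hypothetical stand-in for the library's internal registry


def register(kind, factory, *, replace=False):
    """Register a factory under `kind`; refuse silent overwrites by default."""
    if kind in _REGISTRY and not replace:
        raise ValueError(f"kind {kind!r} is already registered")
    _REGISTRY[kind] = factory


def open_kind(kind, **kwargs):
    """Look up the factory for `kind` and call it with the given kwargs."""
    return _REGISTRY[kind](**kwargs)


register("toy", lambda scale=1.0: {"kind": "toy", "scale": scale})
encoder = open_kind("toy", scale=2.0)

# A second registration under the same kind is rejected unless replace=True.
try:
    register("toy", lambda: None)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

register("toy", lambda: "replaced", replace=True)
replaced = open_kind("toy")
```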

Stores

All public stores support:

close()
with ... as store:
load_index(...) to materialize one exact LateIndex

`kayak.MemoryLateStore()`

Keep late-interaction documents in memory and materialize LateIndex objects on demand.

`kayak.DirectoryLateStore(path)`

Persist one local packed Kayak snapshot on disk and materialize it back into LateIndex.

`kayak.LanceDBLateStore(path, *, table_name='late_documents')`

Persist documents in LanceDB row storage and materialize LateIndex objects from the table.

where= is applied exactly, but after Arrow materialization in the current public adapter.

`kayak.PgVectorLateStore(dsn=None, *, connection=None, table_name='late_documents', schema_name='public', ensure_extension=True)`

Persist documents in Postgres with the pgvector extension and materialize exact LateIndex objects from stored rows.

dsn and connection are mutually exclusive. Simple scalar where= filters are pushed into Postgres through JSONB containment before load, and Kayak still applies the exact filter after fetch.

`kayak.QdrantLateStore(path=None, *, client=None, collection_name='late_documents')`

Persist documents in a Qdrant collection with native multivectors and materialize LateIndex objects from stored rows.

Simple scalar where= filters are pushed into Qdrant before load.

`kayak.WeaviateLateStore(persistence_path=None, *, client=None, collection_name='LateDocument', vector_name='colbert', environment_variables=None)`

Persist documents in a Weaviate collection with a named self-provided multivector and materialize LateIndex objects from stored rows.

where= is applied exactly, but after collection iteration in the current public adapter.

`kayak.ChromaLateStore(path=None, *, client=None, collection_name='late_documents')`

Persist documents in a Chroma collection and materialize exact LateIndex objects from stored rows.

The adapter stores one pooled dense vector per document in Chroma plus the exact token matrix in metadata so Kayak can reconstruct the exact index.

`kayak.register_store(kind, factory, *, replace=False)`

Register a custom store factory behind open_store(...).

Text Retrievers

`kayak.LateTextRetriever`

High-level workflow object for plain-text ingest plus search.

Important methods:

upsert_texts(doc_ids, texts, metadata=None)
delete(doc_ids)
close()
load_index(...)
search_text(...)
search_query(...)
search_text_batch(...)
search_query_batch(...)
search_text_with_plan(...)
search_query_with_plan(...)

load_index(...) is the reusable exact slice. Use it when the store stays fixed and only the queries change.

The public retriever/store contract does not currently promise generic thread-safe concurrent use of the same store instance across all adapters. The verified reusable path is:

call load_index(...) once
reuse that LateIndex with search(...) or search_batch(...)

If you need one explicit same-process multi-caller surface around a fixed hosted snapshot, use import kayak_engine and prepare_exact_search_runtime(...).

`LateQuery.to_layout(layout)`

Convert a query between:

"nested"
"flat_dim128"

"flat_dim128" requires vector_dim == 128.

`LateIndex.to_layout(layout)`

Convert an index between:

"packed"
"hybrid_flat_dim128"

"hybrid_flat_dim128" requires vector_dim == 128.

`LateIndex.select(doc_ids)`

Select a subset of documents and return a new index in the same layout.

This is the mechanism search plans use when they materialize candidate windows for exact reranking.

Exact Operations

`kayak.maxsim(query, index, *, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Return exact scores for every document in the index.

scores = kayak.maxsim(
    query,
    index,
    backend=kayak.MOJO_EXACT_CPU_BACKEND,
)

Returns LateScores.
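
For readers new to late interaction, the per-document score is commonly defined as in ColBERT: for each query token, take the maximum similarity over the document's tokens, then sum over query tokens. The pure-Python sketch below assumes that definition with dot-product similarity. It illustrates the operator, not Kayak's backend code, and maxsim_score is a hypothetical name.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vectors, doc_vectors):
    """Sum over query tokens of the best dot product against any doc token."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc-a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    "doc-b": [[1.0, 0.0]],              # matches only the first query token
}
scores = {doc_id: maxsim_score(query, vecs) for doc_id, vecs in docs.items()}
```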

`kayak.maxsim_batch(batch, index, *, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Return one LateScores object per query in the batch.

`kayak.search(query, index, *, k, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Return exact top-k SearchHit tuples.

hits = kayak.search(
    query,
    index,
    k=10,
    backend=kayak.MOJO_EXACT_CPU_BACKEND,
)
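
Conceptually, exact top-k search is a full score computation followed by selection of the k best documents. The sketch below shows only the selection step over precomputed scores. The tie-breaking rule (ascending document id) is an assumption, since this page does not specify one, and top_k is a hypothetical helper.

```python
import heapq


def top_k(scores, k):
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    # Key: higher score first, then doc_id ascending for deterministic ties.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))


scores = {"doc-a": 2.0, "doc-b": 1.0, "doc-c": 2.0, "doc-d": 0.5}
hits = top_k(scores, k=3)
```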

`kayak.search_batch(batch, index, *, k, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Return exact top-k hits for each query in the batch.

Candidate Generation And Plans

`kayak.generate_candidates(query, index, generator, *, k, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Run stage-1 candidate generation directly.

result = kayak.generate_candidates(
    query,
    index,
    kayak.document_proxy_candidate_generator(),
    k=100,
    backend=kayak.MOJO_EXACT_CPU_BACKEND,
)

Returns CandidateStageResult.

`kayak.search_with_plan(query, index, plan, *, backend=kayak.NUMPY_REFERENCE_BACKEND)`

Run an explicit search plan and return SearchPlanResult.

result = kayak.search_with_plan(
    query,
    index,
    plan,
    backend=kayak.MOJO_EXACT_CPU_BACKEND,
)

Plan Builders

`kayak.exact_full_scan_search_plan(final_k, *, candidate_k=None, stage2_reference_operator=None, stage3_verifier=None)`

Build an exact full-scan plan.

This is the correctness baseline.

`kayak.exact_full_scan_clause_text_search_plan(final_k, *, candidate_k=None)`

Build an exact full-scan plan with clause-text stage-3 verification.

`kayak.document_proxy_search_plan(final_k, candidate_k, *, query_vector_budget=0, document_vector_budget=0, stage2_reference_operator=None, stage3_verifier=None)`

Build an approximate-candidate plus exact-rerank plan.

This is the main staged retrieval API.
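
The staged shape (cheap candidate generation, then exact rerank over the survivors) can be sketched in pure Python. The proxy used here, one mean-pooled vector per query and per document, is an assumption chosen for illustration. This page does not define how Kayak builds its document proxies, and every function name below is hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_pool(vectors):
    """Average token vectors into one proxy vector."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def maxsim_score(query_vectors, doc_vectors):
    """Exact late-interaction score: sum of per-query-token maxima."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def staged_search(query, docs, final_k, candidate_k):
    # Stage 1: rank all documents by a cheap proxy score, keep candidate_k.
    q_proxy = mean_pool(query)
    by_proxy = sorted(docs, key=lambda d: -dot(q_proxy, mean_pool(docs[d])))
    candidates = by_proxy[:candidate_k]
    # Stage 2: exact MaxSim rerank over the candidate window only.
    reranked = sorted(candidates, key=lambda d: -maxsim_score(query, docs[d]))
    return [(d, maxsim_score(query, docs[d])) for d in reranked[:final_k]]


query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc-a": [[1.0, 0.0], [0.0, 1.0]],
    "doc-b": [[1.0, 0.0]],
    "doc-c": [[-1.0, -1.0]],
}
hits = staged_search(query, docs, final_k=1, candidate_k=2)
```

In this sketch, setting candidate_k to the number of documents makes stage 2 score every document, which is the behavior of an exact full scan.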

Stage Components

Candidate generators

kayak.exact_full_scan_candidate_generator()
kayak.document_proxy_candidate_generator(*, query_vector_budget=0, document_vector_budget=0)

Stage-2 reference operators

kayak.exact_late_interaction_stage2_reference_operator()
kayak.noop_topk_stage2_reference_operator()

Stage-3 verifiers

kayak.clause_text_stage3_verifier_operator()
kayak.none_stage3_verifier_operator()

Backends

Constants

kayak.NUMPY_REFERENCE_BACKEND
kayak.MOJO_EXACT_CPU_BACKEND

`kayak.available_backends()`

Return a tuple of available backend names.

('numpy_reference', 'mojo_exact_cpu')

or, if Mojo is unavailable:

('numpy_reference',)

`kayak.backend_info(name)`

Return a BackendInfo record with:

name
available
requires_mojo
query_layouts
index_layouts
availability_reason

Use this instead of guessing why the Mojo backend is or is not available.

`kayak.mojo_bridge_info(*, probe_load=False)`

Return structured diagnostics for the Python-to-Mojo bridge behind mojo_exact_cpu.

Use this when you want one public SDK call that answers:

which Mojo command Kayak would invoke
whether the bridge is using repo sources or bundled wheel artifacts
what Mojo version the active CLI reports
what Mojo version built the bundled wheel artifact, when that metadata exists

info = kayak.mojo_bridge_info()

If you want the stronger check that actually tries to build or load the bridge, use:

info = kayak.mojo_bridge_info(probe_load=True)

That sets:

module_loaded
load_error

so the caller can distinguish "Mojo is discoverable" from "the exact bridge really loaded successfully."
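
A caller might branch on those two fields as follows. The sketch uses stand-in objects so it runs without Kayak installed. Only the field names module_loaded and load_error come from this page; describe_bridge is a hypothetical helper, the error string is made up, and the sketch assumes load_error is None when nothing went wrong.

```python
from types import SimpleNamespace


def describe_bridge(info):
    """Summarize a bridge-info record using module_loaded and load_error."""
    if info.module_loaded:
        return "bridge loaded"
    if info.load_error:
        return f"bridge failed to load: {info.load_error}"
    return "bridge not probed or not loaded"


# Stand-ins for what kayak.mojo_bridge_info(probe_load=True) might return.
ok = SimpleNamespace(module_loaded=True, load_error=None)
broken = SimpleNamespace(module_loaded=False, load_error="example load failure")
unprobed = SimpleNamespace(module_loaded=False, load_error=None)
```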

`kayak.help(topic=None)`

Return generated public help from the exported API, signatures, and docstrings.

Use this when you want terminal-friendly help that stays aligned with the current public package surface instead of a second handwritten registry.

Examples:

print(kayak.help())
print(kayak.help("search"))
print(kayak.help("stores"))
print(kayak.help(kayak.LateTextRetriever))

Types

Core objects

LateQuery
LateQueryBatch
LateDocuments
LateIndex

Scoring and hits

LateScores
SearchHit

Plan and stage results

SearchPlan
SearchPlanResult
CandidateStageResult
SearchStageProfile
StageArtifactMaterialization

Descriptors

CandidateGenerator
Stage2ReferenceOperator
Stage3VerifierOperator
ReferenceScoringSemantics
BackendInfo

Notes On Mojo Usage

If you want your own codebase to behave as "use Mojo by default," define a backend constant and thread it through the operations you call:

BACKEND = kayak.MOJO_EXACT_CPU_BACKEND

That is the supported way to make Mojo your application default without hiding which executor is running.
