Usage Patterns
This page is the shortest answer to:
- what object to build
- which function to call
- when to use each API
For a complete runnable example, download one of the executed notebooks:
- full local SDK walkthrough
- batch-search-on-one-loaded-lancedb-slice.ipynb
- lancedb-to-kayak-reranking.ipynb
If you already have a vector database and want Kayak to own retrieval, see:
If you are choosing an encoder first, read:
Recommendation First
If you just want the recommended public path:
- use kayak.open_text_retriever(...) when your app starts from text
- use kayak.query(...) plus kayak.documents(...).pack() when you already own token vectors
- reuse one loaded or packed LateIndex when many queries hit the same slice
- use kayak.search_batch(...) for repeated-query throughput on that same index
- use kayak.search(...) when you only need top-k hits
- use kayak.maxsim(...) only when you actually need the full score vector
Current low-level fast path, as implemented:
- kayak.search(...) or kayak.search_batch(...)
- backend=kayak.MOJO_EXACT_CPU_BACKEND
- nested query layout
That is why this page steers you toward index reuse and batch search rather than backend-specific application primitives.
Choose The API Shape
Use one retriever object:
- kayak.open_text_retriever(...)
- retriever.upsert_texts(...)
- retriever.search_text(...)
- retriever.search_text_batch(...) for repeated query traffic
Choose this when your application inputs are plain text and you do not want to manage encoder and store objects separately.
Use the low-level exact search surface:
- kayak.query(...)
- kayak.documents(...).pack()
- kayak.search(...)
- kayak.search_batch(...) when the same index serves many queries
Choose this when your application already owns token-level query and document vectors.
Keep the database for persistence and let Kayak materialize the searchable slice:
- kayak.open_store(...)
- store.load_index(...)
- kayak.search(...) or kayak.search_batch(...)
Choose this when LanceDB, PgVector, Qdrant, Weaviate, or Chroma is already part of your system.
Use an explicit search plan:
- kayak.exact_full_scan_search_plan(...)
- kayak.document_proxy_search_plan(...)
- kayak.search_with_plan(...)
Choose this when you need stage accounting, candidate windows, or exact reranking after an approximate first stage.
Plain-Text Ingest Plus Search
Use this when your application starts from text and you want one SDK object for ingest plus retrieval.
This is not a document-parsing path. If your inputs are PDFs, scans, tables, or other mixed-layout files, parse or extract them first and pass the resulting text or token-level vectors into Kayak.
import kayak

retriever = kayak.open_text_retriever(
    encoder="colbert",
    store="kayak",
    encoder_kwargs={"model_name": "colbert-ir/colbertv2.0"},
    store_kwargs={"path": "./kayak-index"},
)
retriever.upsert_texts(doc_ids, texts, metadata=metadata_rows)
hits = retriever.search_text(query_text, k=5, where={"tenant": "acme"})
Use:
- kayak.open_text_retriever(...) for one high-level text workflow object
- retriever.upsert_texts(...) when the source corpus starts as plain text
- retriever.search_text(...) when you want the retriever to encode the query and load the store slice
- retriever.load_index(...) when you want to reuse one materialized slice across many queries
- omit backend=... if the default backend behavior is enough for your application
Default advice:
- use the retriever first for simplicity
- move to explicit load_index(...) plus search_batch(...) when repeated-query performance matters
Choose the encoder this way:
- "colbert" when you have a ColBERT checkpoint on Hugging Face
- "callable" when you already have your own model methods
Exact Search For One Query
Use this when you already have one query embedding and one packed index.
import kayak

BACKEND = kayak.MOJO_EXACT_CPU_BACKEND

query = kayak.query(query_vectors, text=query_text)
index = kayak.documents(
    doc_ids,
    document_vectors,
    texts=document_texts,
).pack()
hits = kayak.search(query, index, k=5, backend=BACKEND)
Use:
- kayak.query(...) for one query
- kayak.documents(...).pack() for one reusable index
- kayak.search(...) when you want top-k hits only
Exact Scores For All Documents
Use this when you need the full score vector, not only top-k hits.
Use:
- kayak.maxsim(...) for one query against all indexed documents
- LateScores.topk(k) when you want hits after inspecting scores
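To make "full score vector" concrete, here is a minimal pure-Python sketch of late-interaction MaxSim scoring followed by a top-k cut. It does not call Kayak and is not Kayak's implementation. The helper names (`maxsim_score`, `maxsim_all`, `topk`) are hypothetical, and it assumes the usual ColBERT-style definition: for each query vector take the best dot product over a document's vectors, then sum those maxima.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """For each query vector keep its best-matching document vector, then sum."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def maxsim_all(query_vectors: Sequence[Vector], documents: Sequence[Sequence[Vector]]) -> List[float]:
    """Full score vector: one score per document, in document order."""
    return [maxsim_score(query_vectors, doc) for doc in documents]


def topk(scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Return (document position, score) for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): 2 query vectors, 3 documents with ragged vector counts.
query_vectors = [[1.0, 0.0], [0.0, 1.0]]
documents = [
    [[1.0, 0.0], [0.0, 1.0]],   # matches both query vectors
    [[0.5, 0.5]],               # partial match on both
    [[0.0, 1.0], [0.25, 0.0]],  # strong on the second, weak on the first
]

scores = maxsim_all(query_vectors, documents)  # every document gets a score
hits = topk(scores, k=2)                       # cut to top-k only afterwards
```

The score list has one entry per indexed document, which is why the full-score path is only worth it when you actually inspect those scores before cutting to top-k.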
Batch Search Against The Same Index
Use this when the index stays fixed and you want to run multiple queries.
batch = kayak.query_batch([query_a_vectors, query_b_vectors, query_c_vectors])
hits_by_query = kayak.search_batch(
    batch,
    index,
    k=5,
    backend=BACKEND,
)
Use:
- kayak.query_batch(...) to keep ragged query vector counts explicit
- kayak.search_batch(...) for top-k hits per query
- kayak.maxsim_batch(...) if you need full score vectors per query
If you are already using the high-level retriever and your queries still start as text, use:
hits_by_query = retriever.search_text_batch(
    [query_a_text, query_b_text, query_c_text],
    k=5,
    where={"tenant": "acme"},
)
- retriever.search_text_batch(...) when you want one store load plus one batch search
- retriever.search_query_batch(...) when you already built a LateQueryBatch
This is the right public SDK shape when many queries target the same already loaded slice. It is different from generic concurrent use of one store object.
This is also the current documented fast path for repeated-query public SDK use.
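The shape of batch search over one fixed index can be sketched without Kayak. The example below is an illustration only: `search_batch_sketch` is a hypothetical helper, the "index" is just a list of per-document vector lists, and scoring assumes standard MaxSim. The point it shows is that the document side is built once and every query in the ragged batch reuses it.

```python
import heapq
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def maxsim_score(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """Sum over query vectors of the best dot product against the document."""
    return sum(
        max(sum(x * y for x, y in zip(q, d)) for d in doc_vectors)
        for q in query_vectors
    )


def search_batch_sketch(
    query_batch: Sequence[Sequence[Vector]],
    documents: Sequence[Sequence[Vector]],
    k: int,
) -> List[List[Tuple[int, float]]]:
    """Return one top-k hit list per query; `documents` is shared by all queries."""
    results = []
    for query_vectors in query_batch:
        scored = (
            (maxsim_score(query_vectors, doc), pos)
            for pos, doc in enumerate(documents)
        )
        best = heapq.nlargest(k, scored)  # best score first
        results.append([(pos, score) for score, pos in best])
    return results


# Built once, reused for every query in the batch (hypothetical toy data).
documents = [
    [[1.0, 0.0]],
    [[0.0, 1.0]],
    [[0.5, 0.5], [0.0, 0.25]],
]

# Ragged batch: the first query has 1 vector, the second has 2.
query_a_vectors = [[1.0, 0.0]]
query_b_vectors = [[0.0, 1.0], [0.0, 2.0]]

hits_by_query = search_batch_sketch([query_a_vectors, query_b_vectors], documents, k=2)
```

Each inner list lines up with the query at the same position in the batch, so callers can zip queries with their hits.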
Exact Baseline With Stage Profiles
Use this when you want exact retrieval plus explicit stage accounting.
plan = kayak.exact_full_scan_search_plan(final_k=5)
result = kayak.search_with_plan(
    query,
    index,
    plan,
    backend=BACKEND,
)
Use:
- kayak.exact_full_scan_search_plan(...) as the exact baseline
- kayak.search_with_plan(...) when you want profiles, candidate-stage output, and stage metadata
Approximate Candidate Stage Plus Exact Rerank
Use this when you want a smaller candidate window before exact late interaction.
plan = kayak.document_proxy_search_plan(
    final_k=5,
    candidate_k=100,
    query_vector_budget=32,
    document_vector_budget=64,
)
result = kayak.search_with_plan(
    query,
    index,
    plan,
    backend=BACKEND,
)
Use:
- candidate_k to control candidate-window size
- query_vector_budget and document_vector_budget to control proxy-stage cost
- result.candidate_stage.profile and result.stage2 to inspect the actual work done
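The two-stage idea (cheap candidate window, then exact rerank) can be illustrated in plain Python. This is a sketch under stated assumptions, not Kayak's document-proxy stage: the proxy here is a single mean-pooled vector per query and per document, which is a stand-in chosen for brevity, and it does not model query_vector_budget or document_vector_budget. `two_stage_search` and its helpers are hypothetical names.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors: Sequence[Vector]) -> List[float]:
    """Collapse a set of vectors into one average vector (the cheap proxy)."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def maxsim_score(query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]) -> float:
    """Exact late-interaction score used for the rerank stage."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def two_stage_search(
    query_vectors: Sequence[Vector],
    documents: Sequence[Sequence[Vector]],
    candidate_k: int,
    final_k: int,
) -> Tuple[List[int], List[Tuple[int, float]]]:
    # Stage 1: rank every document by the proxy score, keep candidate_k.
    query_proxy = mean_pool(query_vectors)
    proxy_ranked = sorted(
        ((dot(query_proxy, mean_pool(doc)), pos) for pos, doc in enumerate(documents)),
        reverse=True,
    )
    candidates = [pos for _, pos in proxy_ranked[:candidate_k]]

    # Stage 2: exact MaxSim only on the surviving candidates, keep final_k.
    exact = sorted(
        ((maxsim_score(query_vectors, documents[pos]), pos) for pos in candidates),
        reverse=True,
    )
    return candidates, [(pos, score) for score, pos in exact[:final_k]]


# Hypothetical toy data: the proxy prefers document 1, exact scoring prefers document 0.
query_vectors = [[1.0, 0.0], [0.0, 1.0]]
documents = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.5, 0.0]],
    [[0.0, 0.25]],
    [[-1.0, 0.0]],
]

candidates, hits = two_stage_search(query_vectors, documents, candidate_k=2, final_k=2)
```

In this toy run the exact stage reorders the candidate window, which is the reason to rerank. It also shows the trade-off behind the candidate window size: a document dropped in stage 1 can never be recovered in stage 2.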
Vector DB For Storage, Kayak For Search
Use this when your database is the durable store and Kayak is the search engine.
rows = vector_db.fetch_all()
index = kayak.documents(
    [row["doc_id"] for row in rows],
    [row["vector"] for row in rows],
    texts=[row["text"] for row in rows],
).pack()
hits = kayak.search(query, index, k=10, backend=BACKEND)
Use:
- the vector database for saving
- kayak.open_store(...) when you want Kayak to materialize that database into one searchable slice directly
- kayak.documents(...).pack() as the reusable in-process search index
- kayak.search(...) or kayak.search_with_plan(...) for query-time retrieval
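Before handing database rows to an index builder, it can help to turn them into the three parallel lists (ids, vectors, texts) and fail early on bad rows. The sketch below assumes the same hypothetical row shape used in the example above (`doc_id`, `vector`, `text` keys, with `vector` holding that document's token vectors). `rows_to_columns` is an illustrative helper, not part of Kayak or of any vector database client.

```python
from typing import Any, Dict, List, Sequence, Tuple


def rows_to_columns(
    rows: Sequence[Dict[str, Any]],
) -> Tuple[List[str], List[Any], List[str]]:
    """Split row dicts into parallel lists, rejecting duplicates and empty documents."""
    doc_ids: List[str] = []
    vectors: List[Any] = []
    texts: List[str] = []
    seen = set()
    for row in rows:
        doc_id = row["doc_id"]
        if doc_id in seen:
            raise ValueError(f"duplicate doc_id: {doc_id!r}")
        if not row["vector"]:
            raise ValueError(f"document {doc_id!r} has no token vectors")
        seen.add(doc_id)
        doc_ids.append(doc_id)
        vectors.append(row["vector"])
        texts.append(row["text"])
    return doc_ids, vectors, texts


# Hypothetical rows, shaped like the fetch result in the example above.
rows = [
    {"doc_id": "a", "vector": [[1.0, 0.0], [0.0, 1.0]], "text": "first"},
    {"doc_id": "b", "vector": [[0.5, 0.5]], "text": "second"},
]

doc_ids, document_vectors, document_texts = rows_to_columns(rows)
```

The three returned lists keep the same order, so position i in each list always refers to the same document.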
For the benchmark-backed version of this pattern, see:
Clause-Text Verification
Use this only when you want the text-family verifier.
You must attach:
- text= to the query
- texts= to the documents or index
query = kayak.query(query_vectors, text=query_text)
index = kayak.documents(
    doc_ids,
    document_vectors,
    texts=document_texts,
).pack()
plan = kayak.exact_full_scan_search_plan(
    final_k=5,
    candidate_k=20,
    stage3_verifier=kayak.clause_text_stage3_verifier_operator(),
)
result = kayak.search_with_plan(
    query,
    index,
    plan,
    backend=BACKEND,
)
Explicit 128-Dim Layouts
Use this when your embeddings are 128-dimensional and you want the layout choice to be explicit in code.
flat_query = query.to_layout("flat_dim128")
hybrid_index = index.to_layout("hybrid_flat_dim128")
scores = kayak.maxsim(
    flat_query,
    hybrid_index,
    backend=BACKEND,
)
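Since these layouts are described as 128-dimensional, a cheap guard before requesting one is to confirm that every vector really has 128 components. The check below is a plain-Python sketch with hypothetical helper names (`vector_dims`, `require_dim`). It makes no claim about what Kayak itself validates or how it reports a mismatch.

```python
from typing import Sequence, Set

Vector = Sequence[float]


def vector_dims(items: Sequence[Sequence[Vector]]) -> Set[int]:
    """Collect the distinct vector lengths across all items (queries or documents)."""
    return {len(vector) for item in items for vector in item}


def require_dim(items: Sequence[Sequence[Vector]], expected_dim: int = 128) -> None:
    """Raise if any vector has a different length, or if there are no vectors at all."""
    dims = vector_dims(items)
    if dims != {expected_dim}:
        raise ValueError(
            f"expected only {expected_dim}-dim vectors, found dims {sorted(dims)}"
        )


# Hypothetical data: one query with 4 vectors, two documents with 3 and 7 vectors.
query_vectors = [[0.0] * 128 for _ in range(4)]
document_vectors = [
    [[0.0] * 128 for _ in range(3)],
    [[0.0] * 128 for _ in range(7)],
]

require_dim([query_vectors])    # a single query is wrapped as a one-item list
require_dim(document_vectors)   # passes silently when every vector is 128-dim
```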
Backend Selection
If your application should normally use one explicit backend, define it once and reuse it.
If you want to inspect what the runtime can actually use:
Minimal API Map
| Task | API |
|---|---|
| Plain-text ingest plus text search | kayak.open_text_retriever(...) |
| One query, top-k hits | kayak.search(...) |
| One query, all exact scores | kayak.maxsim(...) |
| Many queries, top-k hits | kayak.search_batch(...) |
| Many queries, all exact scores | kayak.maxsim_batch(...) |
| Exact baseline with stage metadata | kayak.exact_full_scan_search_plan(...) + kayak.search_with_plan(...) |
| Approximate candidates + exact rerank | kayak.document_proxy_search_plan(...) + kayak.search_with_plan(...) |
| Text-family verification | kayak.clause_text_stage3_verifier_operator() |
| 128-dim explicit layouts | query.to_layout(...), index.to_layout(...) |
Notebook
The notebook uses a real text corpus, encodes it to ColBERT-style 128-dimensional vectors on CPU, builds a Kayak index, and runs:
- exact search
- flat/hybrid layout scoring
- approximate candidate generation plus exact reranking
- batch search
- clause-text verification
Notebook:
For vector-database handoff patterns, see: