What I was about to do
I needed semantic search over structured content. My plan was the usual pipeline: export documents, chunk them, embed them, push the embeddings to a vector store, and build a sync job to keep it all current.
Before wiring any of that, I checked whether Sanity had something built in. It did.
The embeddings index
Sanity has a first-party embeddings index that runs against your existing dataset. You define which document types and fields to include, it handles chunking and embedding, and it stays in sync automatically as content changes.
Configuration is minimal. A few lines in sanity.config.ts:
```typescript
import { defineConfig } from "sanity"
import { embeddingsIndexDashboard } from "@sanity/embeddings-index-ui"

export default defineConfig({
  // ...
  plugins: [
    embeddingsIndexDashboard(),
  ],
})
```
Then you create an index from the Studio dashboard or via the CLI, specifying document types and which fields to embed. After that, the index is live and queryable.
Querying it
The search endpoint accepts a natural language query and returns ranked document IDs with similarity scores:
```typescript
const hits = await client.request({
  url: `/vX/embeddings-index/search/${indexName}`,
  method: "POST",
  body: {
    query: "onboarding flow for enterprise accounts",
    maxResults: 5,
  },
})
```
Response shape:
```json
[
  { "id": "doc-abc123", "score": 0.87 },
  { "id": "doc-def456", "score": 0.81 }
]
```
You then fetch full document content with a GROQ query using those IDs. Two round trips, no external service in the middle.
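One detail worth handling in that second round trip: a GROQ `_id in $ids` filter returns documents in dataset order, not in the ranked order the semantic endpoint gave you. A small sketch of restoring the ranking after the fetch (`rankByHits` is a hypothetical helper, not part of the Sanity client):

```typescript
// Shape of a hit from the embeddings search endpoint, per the response above.
type Hit = { id: string; score: number }

// GROQ does not preserve the order of $ids, so re-sort the fetched
// documents by the rank of their corresponding semantic hit.
function rankByHits<T extends { _id: string }>(docs: T[], hits: Hit[]): T[] {
  const rank = new Map(hits.map((h, i) => [h.id, i]))
  return [...docs]
    .filter((d) => rank.has(d._id))
    .sort((a, b) => rank.get(a._id)! - rank.get(b._id)!)
}
```

The filter step also quietly drops any hit whose document the GROQ query excluded, which is usually what you want.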
What works well
The index stays current without any intervention. Publish a document in Sanity and it is searchable within seconds.
Combining semantic results with GROQ filters is clean in practice. You get the ranked IDs from the semantic endpoint, then scope them with GROQ to filter by type, publication status, or any field:
```typescript
const docs = await client.fetch(
  `*[_id in $ids && _type == "article" && status == "published"]`,
  { ids: hits.map(h => h.id) }
)
```
The semantic search narrows by meaning; GROQ enforces policy. Together they handle most retrieval scenarios without needing a separate query planner.
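A third knob that pairs naturally with the two above is a score cutoff: below some similarity, a hit is usually a surface-level match not worth fetching. A sketch, with the caveat that the 0.75 threshold is an arbitrary assumption you would tune per index:

```typescript
type Hit = { id: string; score: number }

// Drop low-confidence hits before the GROQ round trip. The cutoff here is
// a placeholder; useful values depend on the index and embedding model.
function confidentHits(hits: Hit[], minScore = 0.75): Hit[] {
  return hits.filter((h) => h.score >= minScore)
}
```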
What to watch
Index quality depends on what you embed. If you include raw Portable Text without prose-level chunking, results for long documents can be inconsistent. I got better results indexing specific fields (title, summary, structured body paragraphs) separately rather than dumping the whole document in.
Short or ambiguous queries are also less reliable. A two-word query with no domain context matches on surface similarity more than intent. Adding a short system prompt or a query expansion step upstream helps.
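The crudest version of that upstream step is prefixing short queries with fixed domain context before they hit the embeddings endpoint. A sketch under that assumption (the wording and cutoff are made up; in practice you might have an LLM rewrite the query instead):

```typescript
// Prepend domain context to short, ambiguous queries. Queries that
// already carry enough words pass through unchanged.
function expandQuery(query: string, domainContext: string): string {
  const words = query.trim().split(/\s+/)
  if (words.length > 3) return query // enough context already
  return `${domainContext}: ${query}`
}
```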
The part I did not expect
Every result carries the document _id.
Sanity documents also have _rev, a revision hash that changes on every edit.
Fetch both at query time and you have an exact record of which version of which document was in scope for any given search. That is useful for debugging retrieval, and more useful if you are passing results to an LLM and want to audit what the model actually saw.
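A sketch of what that audit record can look like, assuming the GROQ fetch projected `_id` and `_rev` (all names here are hypothetical, not a Sanity API):

```typescript
type Hit = { id: string; score: number }
type VersionedDoc = { _id: string; _rev: string }
type AuditEntry = { id: string; rev: string; score: number }

// Join semantic hits with the fetched documents to record exactly which
// revision of each document was in scope for this query.
function auditRetrieval(hits: Hit[], docs: VersionedDoc[]): AuditEntry[] {
  const revById = new Map(docs.map((d) => [d._id, d._rev]))
  return hits
    .filter((h) => revById.has(h.id))
    .map((h) => ({ id: h.id, rev: revById.get(h.id)!, score: h.score }))
}
```

Log one of these per search and you can later reconstruct, for any LLM call, the exact document revisions that were retrieved for it.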
More on that in a follow-up post.