Setting refresh=True forces OpenSearch to rebuild the HNSW segment immediately, which is expensive for vector indexes. Use it only in tests. For production ingestion, rely on the index’s refresh_interval setting.
Index and Query Vectors via the OpenSearch API
Last verified 28 Apr 2026
DigitalOcean Managed OpenSearch for vector search uses the same managed OpenSearch engine available under Managed Databases. It bundles the k-NN, ML Commons, and Neural Search plugins for vector similarity search, hybrid vector and keyword search, and remote embedding models.
OpenSearch treats vectors as first-class fields in any index that has k-NN enabled. Once the index exists, you load data with the same REST APIs you use for text documents and query it with a dedicated knn query type.
This guide covers the write and read path: single-document and bulk ingestion, the basic k-NN query, filtered k-NN, and exact k-NN. Examples use curl and Python (opensearch-py).
Prerequisites
- A k-NN-enabled index. See Create a k-NN Index. The examples below assume an index named documents with a 1024-dimensional embedding field.
- An embedding pipeline. You generate embeddings in your application (OpenAI, Cohere, Voyage, sentence-transformers, DigitalOcean Serverless Inference, etc.) and send the resulting float arrays to OpenSearch. For a server-side alternative, see Register a Remote Embedding Model.
- The Python client (optional). Install with pip install opensearch-py.
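Because the example index expects 1024-dimensional vectors, it can help to check each embedding before sending it. The following is a minimal sketch, not part of any client library: validate_embedding and EMBEDDING_DIM are hypothetical names, and the checks assume the index mapping described above.

```python
import math
from typing import List, Sequence

# Assumption: matches the 1024-dimensional embedding field of the example index.
EMBEDDING_DIM = 1024


def validate_embedding(vec: Sequence[float], dim: int = EMBEDDING_DIM) -> List[float]:
    """Return vec as a list of floats, or raise ValueError if it is unusable."""
    if len(vec) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vec)}")
    out = [float(x) for x in vec]
    # NaN and infinity cannot be represented in standard JSON.
    if not all(math.isfinite(x) for x in out):
        raise ValueError("embedding contains NaN or infinite values")
    return out
```

Calling this on the output of your embedding pipeline surfaces a dimension mismatch in your application rather than as an indexing error.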
Index a Single Document
curl -X POST "$OS/documents/_doc/doc-1" -H 'Content-Type: application/json' -d '{
  "title": "Introduction to OpenSearch vector search",
  "body": "OpenSearch stores vector embeddings in knn_vector fields...",
  "source": "blog",
  "embedding": [0.0123, -0.0456, 0.0789, "..."]
}'

In Python:
import os

from opensearchpy import OpenSearch

# Connection details are read from environment variables.
client = OpenSearch(
    hosts=[{
        "host": os.environ["OPENSEARCH_HOST"],
        "port": int(os.environ.get("OPENSEARCH_PORT", 25060)),
    }],
    http_auth=(os.environ["OPENSEARCH_USER"], os.environ["OPENSEARCH_PASSWORD"]),
    use_ssl=True,
    verify_certs=True,
)

client.index(
    index="documents",
    id="doc-1",
    body={
        "title": "Introduction to OpenSearch vector search",
        "body": "OpenSearch stores vector embeddings...",
        "source": "blog",
        # embed_one is your application's embedding function (not defined here).
        "embedding": embed_one("Introduction to OpenSearch vector search"),
    },
    # Makes the document searchable immediately; use only in tests.
    refresh=True,
)

Bulk Index Documents
For anything more than a handful of documents, use the _bulk API. It accepts NDJSON: an action line followed by a source line, repeated.
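The action-line and source-line pairing can be sketched in Python as follows. This is an illustration only: to_bulk_ndjson is a hypothetical helper, and it assumes each input document is a dict whose "id" key becomes the document ID.

```python
import json


def to_bulk_ndjson(docs, index="documents"):
    """Serialize docs into a _bulk request body (NDJSON).

    Each document yields two lines: an "index" action line, then the
    source line. The body must end with a newline.
    """
    lines = []
    for d in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": d["id"]}}))
        # Everything except the ID goes into the document source.
        lines.append(json.dumps({k: v for k, v in d.items() if k != "id"}))
    return "".join(line + "\n" for line in lines)
```

Writing the returned string to a file gives you a body you can send with the curl command shown next.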
curl -X POST "$OS/_bulk" \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary @documents.ndjson

Where documents.ndjson looks like:
{ "index": { "_index": "documents", "_id": "doc-1" } }
{ "title": "...", "body": "...", "source": "blog", "embedding": [0.01, "..."] }
{ "index": { "_index": "documents", "_id": "doc-2" } }
{ "title": "...", "body": "...", "source": "blog", "embedding": [0.02, "..."] }

Efficient Bulk Ingestion in Python
The helpers.bulk function consumes a generator lazily, splits it into batched requests, and can retry chunks rejected with HTTP 429 when you set max_retries:
from opensearchpy import helpers


def gen_actions(docs):
    """Yield one bulk index action per input document."""
    for d in docs:
        yield {
            "_op_type": "index",
            "_index": "documents",
            "_id": d["id"],
            "_source": {
                "title": d["title"],
                "body": d["body"],
                "source": d["source"],
                "embedding": d["embedding"],
            },
        }


# client is the OpenSearch client created earlier; docs is your iterable of documents.
helpers.bulk(client, gen_actions(docs), chunk_size=500, request_timeout=60)
# Refresh once after the load so the new documents become searchable.
client.indices.refresh(index="documents")

For very large initial loads, set refresh_interval to -1 and number_of_replicas to 0 during the load, then restore both after ingestion finishes. This can reduce load time by a factor of 3 to 5 on larger clusters.
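One way to make sure both settings are restored even if the load fails is a context manager. The sketch below is an assumption-laden illustration: bulk_load_settings is a hypothetical helper, client is an opensearch-py client, and the restore values ("1s" and 1 replica) are placeholders that you should replace with your index's actual settings.

```python
from contextlib import contextmanager


@contextmanager
def bulk_load_settings(client, index, refresh_interval="1s", replicas=1):
    """Disable refresh and replicas for a bulk load, then restore them.

    refresh_interval and replicas are the values to restore afterwards;
    the defaults here are assumptions, not values read from the cluster.
    """
    client.indices.put_settings(
        index=index,
        body={"index": {"refresh_interval": "-1", "number_of_replicas": 0}},
    )
    try:
        yield
    finally:
        # Runs even if the load raises, so the index is not left unrefreshed.
        client.indices.put_settings(
            index=index,
            body={"index": {
                "refresh_interval": refresh_interval,
                "number_of_replicas": replicas,
            }},
        )
        client.indices.refresh(index=index)
```

You would wrap the bulk call in it, for example with bulk_load_settings(client, "documents"): followed by the helpers.bulk call in the with block.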
Run a k-NN Query
The knn query takes a field name, a query vector, and a k (the number of neighbors to return). OpenSearch returns documents sorted by similarity, with _score normalized so that higher is better.
curl -X POST "$OS/documents/_search" -H 'Content-Type: application/json' -d '{
  "size": 5,
  "_source": ["title", "source"],
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.013, -0.041, "..."],
        "k": 5
      }
    }
  }
}'

The number of results returned is min(size, k). Keep them equal for the simplest mental model. Use _source to limit the fields returned, since vector fields are large and rarely useful in the response.
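The same request can be built in Python. This is a minimal sketch: knn_search_body is a hypothetical helper that mirrors the curl body above and keeps size equal to k.

```python
def knn_search_body(vector, k=5, source_fields=("title", "source"), field="embedding"):
    """Build an approximate k-NN search body with size equal to k."""
    return {
        "size": k,
        # Return only the listed fields; the vector itself is left out.
        "_source": list(source_fields),
        "query": {
            "knn": {
                field: {
                    "vector": list(vector),
                    "k": k,
                }
            }
        },
    }
```

With an opensearch-py client, you would pass the result to client.search(index="documents", body=...) and read matches from response["hits"]["hits"], where each hit carries its _score.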
Filter k-NN Results
Real applications almost always want to narrow results by metadata. Lucene’s and Faiss’s HNSW engines both support efficient pre-filtering through the filter clause inside the knn query.
curl -X POST "$OS/documents/_search" -H 'Content-Type: application/json' -d '{
  "size": 10,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.013, -0.041, "..."],
        "k": 10,
        "filter": {
          "bool": {
            "must": [
              { "term": { "source": "blog" } },
              { "range": { "created_at": { "gte": "now-30d/d" } } }
            ]
          }
        }
      }
    }
  }
}'

Both engines apply filters during graph traversal, so even restrictive filters return accurate top-k results. Each engine falls back to an exact scan when a filter is extremely selective.
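A Python sketch of the filtered query follows. filtered_knn_body is a hypothetical helper; it assumes the source and created_at fields used in the curl example exist in your mapping, and it omits the filter clause when no conditions are given.

```python
def filtered_knn_body(vector, k=10, source=None, created_since=None, field="embedding"):
    """Build a k-NN search body with an optional filter inside the knn clause."""
    must = []
    if source is not None:
        must.append({"term": {"source": source}})
    if created_since is not None:
        # Accepts date math such as "now-30d/d".
        must.append({"range": {"created_at": {"gte": created_since}}})

    knn_clause = {"vector": list(vector), "k": k}
    if must:
        # The filter sits inside the knn clause, next to vector and k.
        knn_clause["filter"] = {"bool": {"must": must}}
    return {"size": k, "query": {"knn": {field: knn_clause}}}
```

For example, filtered_knn_body(query_vector, source="blog", created_since="now-30d/d") reproduces the body of the curl request above, with query_vector standing in for your own embedding.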
Run Exact k-NN for Small or Heavily Filtered Datasets
When you need perfect recall, or when a filter reduces the candidate set below about 10,000 documents, exact k-NN with a script_score query is a good choice. It scans every matching document and computes the distance exactly.
curl -X POST "$OS/documents/_search" -H 'Content-Type: application/json' -d '{
  "size": 10,
  "query": {
    "script_score": {
      "query": { "term": { "source": "internal-wiki" } },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "embedding",
          "query_value": [0.013, -0.041, "..."],
          "space_type": "cosinesimil"
        }
      }
    }
  }
}'

The space_type in the script parameters must match the one used when creating the index.
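The choice between exact and approximate search can be made at query time by counting the documents that match the filter first. The sketch below is one possible approach, not a documented recipe: build_search_body and EXACT_THRESHOLD are hypothetical names, client is assumed to be an opensearch-py client, and the threshold simply encodes the "about 10,000 documents" guidance from this section.

```python
# Assumption: the rough cutoff suggested in this guide.
EXACT_THRESHOLD = 10_000


def build_search_body(client, vector, filter_query, index="documents",
                      size=10, field="embedding", space_type="cosinesimil"):
    """Return an exact (script_score) body for small candidate sets,
    otherwise an approximate filtered k-NN body."""
    matching = client.count(index=index, body={"query": filter_query})["count"]
    if matching < EXACT_THRESHOLD:
        return {
            "size": size,
            "query": {
                "script_score": {
                    "query": filter_query,
                    "script": {
                        "source": "knn_score",
                        "lang": "knn",
                        "params": {
                            "field": field,
                            "query_value": list(vector),
                            # Must match the index's space type.
                            "space_type": space_type,
                        },
                    },
                }
            },
        }
    return {
        "size": size,
        "query": {
            "knn": {field: {"vector": list(vector), "k": size, "filter": filter_query}}
        },
    }
```

The extra count request adds a round trip, so this pattern fits best when filter selectivity varies widely between queries.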
Check Index Statistics
These requests are useful when debugging ingestion performance:
curl "$OS/_plugins/_knn/stats?pretty"
curl "$OS/documents/_count"
curl "$OS/_cat/indices/documents?v&h=index,docs.count,store.size,pri.store.size"
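The same three requests can be issued from Python. This is a rough sketch assuming an opensearch-py client: index_snapshot is a hypothetical helper, and the k-NN stats endpoint is called through the client's generic transport rather than a dedicated method.

```python
def index_snapshot(client, index="documents"):
    """Collect document count, index size columns, and k-NN plugin stats."""
    count = client.count(index=index)["count"]
    # format="json" returns a list of row dicts instead of a text table.
    cat_rows = client.cat.indices(
        index=index,
        format="json",
        h="index,docs.count,store.size,pri.store.size",
    )
    # Same endpoint as the first curl command above.
    knn_stats = client.transport.perform_request("GET", "/_plugins/_knn/stats")
    return {"count": count, "cat": cat_rows, "knn_stats": knn_stats}
```

Calling it before and after a bulk load gives a quick view of how many documents arrived and how much storage they use.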