DigitalOcean Vector Databases are billed by the hour for as long as they exist. If you are experimenting, destroy the cluster from the Settings tab when you are finished. Destroying a cluster deletes all indexes and vectors irreversibly.
Vector Search Quickstart for OpenSearch
Last verified 28 Apr 2026
DigitalOcean Managed OpenSearch for vector search uses the same managed OpenSearch engine available under Managed Databases. It bundles the k-NN, ML Commons, and Neural Search plugins for vector similarity search, hybrid vector and keyword search, and remote embedding models.
This quickstart walks you through creating a DigitalOcean Managed OpenSearch cluster, configuring it as a vector store, indexing a handful of sample embeddings, and running a k-Nearest Neighbor (k-NN) similarity query. It takes about 15 minutes.
Because OpenSearch 2.19 ships with these plugins, they are preinstalled on DigitalOcean Managed OpenSearch clusters. You can create a k-NN index and run similarity queries as soon as the cluster is online.
Prerequisites
To complete this quickstart, you need:
- A DigitalOcean account. You provision the cluster from the Control Panel.
- A terminal with curl installed, or Python 3.9 or later with opensearch-py (>=2.4.0). Both are shown below. The connectivity check in Step 2 also pipes its output through jq.
If you already have a managed OpenSearch cluster running version 2.14 or later, you can skip Step 1 and start at Step 2 by pointing the requests in this guide at your existing cluster.
Step 1: Create an OpenSearch Vector Database
- In the Control Panel, click Create, then Vector Database.
- Select OpenSearch 2.19 as the engine.
- For this quickstart, the default Basic Shared CPU plan (1 vCPU, 2 GB RAM, 40 GiB disk) is enough.
- Choose the region closest to your application and name the cluster.
- Click Create Vector Database Cluster.
For production vector workloads, size for RAM: OpenSearch holds the HNSW graph in memory. See Create a Cluster for full sizing guidance.
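As a rough way to reason about RAM, the upstream OpenSearch k-NN documentation gives a rule of thumb of about 1.1 * (4 * dimension + 8 * m) bytes per vector for an HNSW graph. The sketch below applies that estimate. The function name is hypothetical, the formula is documented for the native-library engines and is only an approximation elsewhere, and it ignores replicas, JVM heap, and everything else the cluster keeps in memory.

```python
def estimate_hnsw_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    """Approximate HNSW graph memory in bytes (rule of thumb, not a guarantee)."""
    # 4 bytes per float32 component plus roughly 8 bytes per graph link,
    # with a 10% overhead factor, as in the upstream rule of thumb.
    per_vector = 1.1 * (4 * dimension + 8 * m)
    return num_vectors * per_vector


if __name__ == "__main__":
    # Compare common embedding sizes for one million vectors.
    for dim in (384, 768, 1536):
        gib = estimate_hnsw_bytes(1_000_000, dim) / 2**30
        print(f"{dim:>5} dims, 1M vectors: ~{gib:.2f} GiB")
```

Treat the result as a lower bound when you pick a plan, and leave headroom for indexing and queries.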
Step 2: Secure the Cluster and Collect Connection Details
While the cluster provisions, open its Overview tab:
- Under Trusted Sources, add your workstation IP or a DigitalOcean resource. Only listed sources can connect.
- Copy the host, port, and doadmin password.
- Export them as environment variables:
export OPENSEARCH_HOST="<your-cluster-host>"
export OPENSEARCH_PORT="25060"
export OPENSEARCH_USER="doadmin"
export OPENSEARCH_PASSWORD="<your-doadmin-password>"
export OS="https://$OPENSEARCH_USER:$OPENSEARCH_PASSWORD@$OPENSEARCH_HOST:$OPENSEARCH_PORT"
- Verify connectivity:
curl -sS "$OS/" | jq '.version.number'
You should see "2.19.x".
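The OS variable embeds the password directly in the URL, which can break if the password contains reserved characters such as @, / or :. As a minimal sketch, assuming you prefer to assemble the URL in Python, the helper below percent-encodes the credentials with the standard library. The name build_base_url and the example host and password are hypothetical.

```python
from urllib.parse import quote


def build_base_url(host: str, port: int, user: str, password: str) -> str:
    """Return an https base URL with percent-encoded credentials."""
    # safe="" forces reserved characters such as "@", "/" and ":" to be encoded.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"https://{user_enc}:{password_enc}@{host}:{port}"


if __name__ == "__main__":
    # Example values only; substitute the host and password from the Overview tab.
    print(build_base_url("db.example.com", 25060, "doadmin", "p@ss/w:rd"))
    # prints: https://doadmin:p%40ss%2Fw%3Ard@db.example.com:25060
```

You can export the printed value as OS and use it with the curl commands in this guide.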
Step 3: Create a k-NN Index
Create an index that stores 4-dimensional vectors. In production you typically use 384, 768, 1024, or 1536 dimensions depending on your embedding model.
curl -X PUT "$OS/articles" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "body": { "type": "text" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 4,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "cosinesimil",
          "parameters": { "m": 16, "ef_construction": 128 }
        }
      }
    }
  }
}'
- "knn": true enables the k-NN plugin for this index.
- "type": "knn_vector" declares the vector field; dimension must match your embedding model.
- "engine": "lucene" uses OpenSearch’s native HNSW, which supports efficient filtered search. Use Faiss for very large indexes (greater than 10 million vectors).
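Because dimension must match your embedding model, it can help to generate the index body rather than edit JSON by hand. The sketch below rebuilds the same settings and mappings as the request in this step, with the dimension and HNSW parameters as arguments. The function name knn_index_body is hypothetical, and 384 is only one of the example sizes mentioned in this step.

```python
import json


def knn_index_body(dimension: int, m: int = 16, ef_construction: int = 128) -> dict:
    """Build an index body equivalent to the curl example for a given dimension."""
    if dimension < 1:
        raise ValueError("dimension must be a positive integer")
    return {
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": 100}},
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "body": {"type": "text"},
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                        "parameters": {"m": m, "ef_construction": ef_construction},
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON you would send for a 384-dimensional embedding model.
    print(json.dumps(knn_index_body(384), indent=2))
```

With opensearch-py, a body like this can be passed to client.indices.create(index="articles", body=...), or you can pipe the printed JSON to curl.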
Step 4: Index Sample Vectors
Load four tiny documents with pre-computed embeddings.
curl -X POST "$OS/articles/_bulk" -H 'Content-Type: application/x-ndjson' --data-binary '
{ "index": { "_id": "1" } }
{ "title": "Coffee brewing basics", "body": "Pour-over, espresso, and cold brew compared.", "embedding": [0.91, 0.10, 0.05, 0.02] }
{ "index": { "_id": "2" } }
{ "title": "Best espresso machines", "body": "A buyer guide for home espresso setups.", "embedding": [0.88, 0.15, 0.07, 0.04] }
{ "index": { "_id": "3" } }
{ "title": "Intro to deep learning", "body": "Neural networks, backpropagation, activations.", "embedding": [0.05, 0.92, 0.18, 0.10] }
{ "index": { "_id": "4" } }
{ "title": "Hiking trails near Denver","body": "Five scenic day hikes within an hour of the city.","embedding": [0.12, 0.08, 0.90, 0.22] }
'
OpenSearch responds with "errors": false when the bulk request succeeds.
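For more than a handful of documents, you would normally generate the bulk payload rather than type it. The sketch below builds the same newline-delimited format from a dictionary of documents and pulls failed items out of a bulk response when "errors" is true. Both function names are hypothetical helpers, not part of opensearch-py, and the response shape assumed is the standard bulk response with an "items" list.

```python
import json


def build_bulk_body(docs: dict) -> str:
    """Serialize {doc_id: source} into NDJSON action/source line pairs."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response: dict) -> list:
    """Return (doc_id, error) pairs for items that failed, or an empty list."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        for result in item.values():
            if "error" in result:
                failures.append((result.get("_id"), result["error"]))
    return failures


if __name__ == "__main__":
    sample = {
        "1": {"title": "Coffee brewing basics", "embedding": [0.91, 0.10, 0.05, 0.02]},
        "3": {"title": "Intro to deep learning", "embedding": [0.05, 0.92, 0.18, 0.10]},
    }
    print(build_bulk_body(sample), end="")
```

Checking individual items matters because a bulk request can return HTTP 200 while some documents are rejected, for example when a vector has the wrong number of dimensions.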
Step 5: Run Your First k-NN Query
Find the two documents closest to a query vector that looks like a coffee-related embedding:
curl -X POST "$OS/articles/_search" -H 'Content-Type: application/json' -d '{
  "size": 2,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.90, 0.12, 0.06, 0.03],
        "k": 2
      }
    }
  }
}'
You should see the two coffee articles ranked highest, with _score values close to 1.0. OpenSearch normalizes cosine similarity so that higher is better.
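To see why the two coffee articles win, you can compute cosine similarity locally for the sample vectors from Step 4. The sketch below uses only the standard library. The rescaling (1 + cos) / 2 printed in the last column is an assumption based on the upstream description of cosinesimil scoring for the Lucene engine, so treat it as an approximation of _score rather than an exact prediction.

```python
import math

# Sample embeddings from Step 4, keyed by document ID.
DOCS = {
    "1": [0.91, 0.10, 0.05, 0.02],
    "2": [0.88, 0.15, 0.07, 0.04],
    "3": [0.05, 0.92, 0.18, 0.10],
    "4": [0.12, 0.08, 0.90, 0.22],
}
QUERY = [0.90, 0.12, 0.06, 0.03]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query: list, docs: dict) -> list:
    """Return (doc_id, cosine) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for doc_id, cos in rank(QUERY, DOCS):
        # Columns: document ID, raw cosine, assumed Lucene-style score.
        print(doc_id, round(cos, 4), round((1 + cos) / 2, 4))
```

Documents 1 and 2 have a cosine similarity above 0.99 with the query vector, while the deep learning and hiking documents score far lower, which matches the ranking the cluster returns.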
Optional: The Same Query in Python
import os

from opensearchpy import OpenSearch

# Connect over TLS using the environment variables exported in Step 2.
client = OpenSearch(
    hosts=[{
        "host": os.environ["OPENSEARCH_HOST"],
        "port": int(os.environ.get("OPENSEARCH_PORT", 25060)),
    }],
    http_auth=(os.environ["OPENSEARCH_USER"], os.environ["OPENSEARCH_PASSWORD"]),
    use_ssl=True,
    verify_certs=True,
)

# Same k-NN query as the curl request in Step 5.
resp = client.search(
    index="articles",
    body={
        "size": 2,
        "query": {
            "knn": {
                "embedding": {
                    "vector": [0.90, 0.12, 0.06, 0.03],
                    "k": 2,
                }
            }
        },
    },
)

# Print the score and title of each hit, best match first.
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
Next Steps
- Create a k-NN Index: tune engines, space types, and HNSW parameters for your workload.
- Index and Query Vectors: bulk ingestion, filtered k-NN, and exact search.
- Run Hybrid Searches: combine BM25 with vector similarity.
- Register a Remote Embedding Model: let OpenSearch call your embedding service directly.
For upstream OpenSearch vector documentation, see the official OpenSearch vector-search docs.