Qdrant Vector Search Client

A comprehensive Python client for the Qdrant vector search engine. This skill helps you integrate Qdrant into your applications for semantic search, similarity matching, and vector operations with support for local development, cloud deployment, and embedded inference.

What This Skill Does

This skill enables you to:

Set up and connect to Qdrant vector databases (local, server, or cloud)

Create and manage vector collections with custom configurations

Insert, update, and search vector embeddings efficiently

Use local inference with FastEmbed for CPU/GPU-accelerated embeddings

Perform filtered vector searches with complex conditions

Handle both synchronous and asynchronous operations

Switch seamlessly between development and production modes

Instructions

1. Install the Qdrant Client

Install the base client:

```bash
pip install qdrant-client
```

For local embedding generation with FastEmbed (CPU):

```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "qdrant-client[fastembed]"
```

For GPU-accelerated embeddings:

```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "qdrant-client[fastembed-gpu]"
```
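To confirm which of the installs above actually landed in the current environment, a small standard-library check can help. This is only a sketch: the helper names `installed_version` and `has_module` are made up for illustration, and it assumes the FastEmbed extra exposes a module named `fastembed`.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_module(module: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return util.find_spec(module) is not None


# Prints None / False when the corresponding package is not installed.
print("qdrant-client version:", installed_version("qdrant-client"))
print("fastembed importable:", has_module("fastembed"))
```

If the first line prints `None`, the base client is missing; if the second prints `False`, local inference with FastEmbed is not available in this environment.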

2. Initialize the Client

**Local mode (in-memory, perfect for development/testing):**

```python
from qdrant_client import QdrantClient

# In-memory mode: data is lost when the process exits
client = QdrantClient(":memory:")

# Or persist to disk
client = QdrantClient(path="path/to/db")
```

**Connect to Qdrant server:**

```python
from qdrant_client import QdrantClient

# By host and port
client = QdrantClient(host="localhost", port=6333)

# Or by URL
client = QdrantClient(url="http://localhost:6333")
```

**Connect to Qdrant Cloud:**

```python
client = QdrantClient(
    url="https://your-cluster.cloud.qdrant.io:6333",
    api_key="<your-api-key>",
)
```

**Enable gRPC for faster uploads:**

```python
client = QdrantClient(host="localhost", grpc_port=6334, prefer_grpc=True)
```
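Because the connection modes above differ only in the arguments passed to `QdrantClient`, one way to move between development and production is to build those arguments from configuration. The sketch below assumes three environment variables, `QDRANT_URL`, `QDRANT_API_KEY`, and `QDRANT_PATH`; these names are hypothetical and are not read by the client itself.

```python
import os
from typing import Any, Dict, Mapping


def qdrant_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build QdrantClient keyword arguments from environment-style settings.

    QDRANT_URL, QDRANT_API_KEY and QDRANT_PATH are hypothetical names chosen
    for this sketch.
    """
    url = env.get("QDRANT_URL")
    if url:
        # Server or cloud mode; the API key is optional for a local server.
        kwargs: Dict[str, Any] = {"url": url}
        api_key = env.get("QDRANT_API_KEY")
        if api_key:
            kwargs["api_key"] = api_key
        return kwargs

    path = env.get("QDRANT_PATH")
    if path:
        # Local mode persisted to disk.
        return {"path": path}

    # Fallback: in-memory local mode for tests and prototyping.
    return {"location": ":memory:"}


print(qdrant_client_kwargs({}))
print(qdrant_client_kwargs({"QDRANT_URL": "http://localhost:6333"}))
```

The result would then be passed along as `QdrantClient(**qdrant_client_kwargs())`, so application code stays the same across environments.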

3. Create a Collection

Define your vector space with size and distance metric:

```python
from qdrant_client.models import Distance, VectorParams

client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(
        size=100,  # Dimension of your vectors
        distance=Distance.COSINE,  # or Distance.EUCLID, Distance.DOT
    ),
)
```
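To make the choice of distance metric more concrete, the following pure-Python sketch computes the three quantities behind `COSINE`, `DOT`, and `EUCLID` for a few toy vectors. It illustrates the math only and does not call Qdrant; the helper names are made up for the example.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))


def euclid(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot product of the vectors after length normalization."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

print("a vs b:", cosine(a, b), dot(a, b), euclid(a, b))
print("a vs c:", cosine(a, c), dot(a, c), euclid(a, c))
```

Here `a` and `b` point the same way, so cosine similarity treats them as identical even though their dot product and Euclidean distance both reflect the difference in length.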

4. Insert Vectors

Upload vectors with optional metadata payloads:

```python
import numpy as np
from qdrant_client.models import PointStruct

vectors = np.random.rand(100, 100)  # 100 vectors of dimension 100

client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(
            id=idx,
            vector=vector.tolist(),
            payload={"color": "red", "category": idx % 10},
        )
        for idx, vector in enumerate(vectors)
    ],
)
```
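A single `upsert` call with every point in one list works for small datasets. For larger ones, you may prefer to send points in fixed-size chunks. The generator below is a minimal sketch of that chunking step; the name `batched` and the batch size are illustrative choices, not part of the client.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Ten items in batches of four: the last batch holds the remainder.
sizes = [len(batch) for batch in batched(range(10), 4)]
print(sizes)  # [4, 4, 2]
```

Each yielded batch could then be passed as the `points=` argument of a separate `client.upsert(...)` call, assuming the items are `PointStruct` objects.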

5. Search for Similar Vectors

Basic similarity search:

```python
query_vector = np.random.rand(100)

hits = client.query_points(
    collection_name="my_collection",
    query=query_vector,
    limit=5,  # Return top 5 matches
)

for hit in hits.points:
    print(f"ID: {hit.id}, Score: {hit.score}, Payload: {hit.payload}")
```
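To show what a top-k similarity query returns conceptually, here is an exact brute-force version in plain Python: score every stored vector against the query, sort by score, and keep the best `limit` results. This is an illustration only. Qdrant uses an approximate HNSW index rather than a linear scan, and the function names below are made up for the sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    points: Dict[int, Sequence[float]],
    limit: int,
) -> List[Tuple[int, float]]:
    """Return (id, score) pairs for the limit most similar points, best first."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in points.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


points = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
results = top_k([1.0, 0.1], points, limit=2)
for pid, score in results:
    print(f"ID: {pid}, Score: {score:.3f}")
```

The query points almost along the first axis, so point 1 ranks first and the diagonal point 3 second, while the orthogonal point 2 falls outside the limit.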

6. Search with Filters

Apply metadata filters to narrow results:

```python
from qdrant_client.models import Filter, FieldCondition, Range

hits = client.query_points(
    collection_name="my_collection",
    query=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                range=Range(gte=3, lte=7),  # category between 3 and 7, inclusive
            )
        ]
    ),
    limit=5,
)
```
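The range condition above can be read as a predicate over each point's payload. The sketch below mimics that predicate in plain Python against the same `category = idx % 10` payloads used in the insert step, to show which points would be eligible. It mirrors the inclusive `gte`/`lte` semantics only and is not how Qdrant evaluates filters internally. Treating a missing key as a non-match is an assumption of this sketch.

```python
from typing import Any, Dict, Optional


def range_matches(
    payload: Dict[str, Any],
    key: str,
    gte: Optional[float] = None,
    lte: Optional[float] = None,
) -> bool:
    """Return True if payload[key] is a number within the inclusive bounds."""
    value = payload.get(key)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False  # missing or non-numeric values never match in this sketch
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


payloads = {idx: {"color": "red", "category": idx % 10} for idx in range(20)}
matching_ids = [
    idx for idx, payload in payloads.items()
    if range_matches(payload, "category", gte=3, lte=7)
]
print(matching_ids)
```

Only the points whose `category` falls in 3 through 7 remain candidates; the vector similarity ranking is then applied to that subset.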

7. Use Local Inference with FastEmbed

Generate embeddings automatically without external APIs:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(":memory:")
model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Create collection
client.create_collection(
    collection_name="demo_collection",
    vectors_config=models.VectorParams(
        size=client.get_embedding_size(model_name),
        distance=models.Distance.COSINE,
    ),
)

# Prepare documents
payload = [
    {"document": "Qdrant has Langchain integrations", "source": "Langchain-docs"},
    {"document": "Qdrant also has Llama Index integrations", "source": "LlamaIndex-docs"},
]
docs = [models.Document(text=data["document"], model=model_name) for data in payload]

# Upload with automatic embedding
client.upload_collection(
    collection_name="demo_collection",
    vectors=docs,
    ids=[1, 2],
    payload=payload,
)

# Search with automatic query embedding
search_result = client.query_points(
    collection_name="demo_collection",
    query=models.Document(text="This is a query document", model=model_name),
).points
```
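The upload above uses hand-picked integer IDs. Since Qdrant point IDs can also be UUIDs, one option for real document sets is to derive a stable UUID from each document's text, so that uploading the same document twice overwrites the existing point instead of creating a new one. The sketch below assumes that approach; the namespace URL and the helper name `point_id_for` are hypothetical.

```python
import uuid

# Hypothetical namespace for this collection; any fixed UUID would work.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/demo_collection")


def point_id_for(text: str) -> str:
    """Derive a deterministic UUID string from the document text."""
    return str(uuid.uuid5(NAMESPACE, text))


first = point_id_for("Qdrant has Langchain integrations")
again = point_id_for("Qdrant has Langchain integrations")
other = point_id_for("Qdrant also has Llama Index integrations")

print(first == again)  # True: same text, same ID
print(first == other)  # False: different text, different ID
```

With the `payload` list from the example above, the IDs could be built as `ids=[point_id_for(data["document"]) for data in payload]`.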

**Enable GPU acceleration:**

```python
docs = [
    models.Document(
        text="Your text here",
        model=model_name,
        options={"cuda": True},
    )
]
```

8. Use Async Client

For async applications (requires async runtime):

```python
import asyncio

import numpy as np
from qdrant_client import AsyncQdrantClient, models


async def main():
    client = AsyncQdrantClient(url="http://localhost:6333")

    await client.create_collection(
        collection_name="async_collection",
        vectors_config=models.VectorParams(size=10, distance=models.Distance.COSINE),
    )

    await client.upsert(
        collection_name="async_collection",
        points=[
            models.PointStruct(id=i, vector=np.random.rand(10).tolist())
            for i in range(100)
        ],
    )

    res = await client.query_points(
        collection_name="async_collection",
        query=np.random.rand(10).tolist(),
        limit=10,
    )
    print(res)


asyncio.run(main())
```
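The main benefit of the async client is that several requests can be in flight at once. A common pattern is to run many queries concurrently while capping how many run at the same time. The sketch below shows that pattern with `asyncio.gather` and a semaphore. `fake_search` is a stand-in coroutine invented for the example; in a real application it would wrap something like `await client.query_points(...)`.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_bounded(
    queries: Sequence[str],
    search: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Run search() for every query, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query: str) -> str:
        async with semaphore:
            return await search(query)

    # gather() returns results in the same order as the input queries.
    return await asyncio.gather(*(guarded(q) for q in queries))


async def fake_search(query: str) -> str:
    """Stand-in for a real async search call (hypothetical)."""
    await asyncio.sleep(0.01)
    return f"result for {query}"


results = asyncio.run(run_bounded(["a", "b", "c"], fake_search, max_concurrency=2))
print(results)
```

The concurrency cap of 2 here is arbitrary; a suitable value depends on the server and workload.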

9. Remote Inference with Qdrant Cloud

Use cloud-hosted models (paid plans only):

```python
client = QdrantClient(
    url="https://your-cluster.cloud.qdrant.io:6333",
    api_key="<your-api-key>",
    cloud_inference=True,  # Enable remote inference
)

# Use the same Document API as FastEmbed
# Images must be base64 encoded strings or URLs
```
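Since images have to be supplied as base64 strings or URLs, raw image bytes need an encoding step first. The sketch below shows only that step, using the standard library; the helper name `encode_image_bytes` is made up, and how the resulting string is passed to the client is not covered here.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


# The 8-byte PNG file signature stands in for real image content.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = encode_image_bytes(png_signature)
print(encoded)

# Decoding restores the original bytes exactly.
print(base64.b64decode(encoded) == png_signature)
```

For a file on disk, the bytes would typically come from `pathlib.Path(...).read_bytes()` before encoding.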

Best Practices

1. **Development workflow**: Use local mode (`:memory:` or `path`) for prototyping, then switch to server/cloud for production

2. **Batch uploads**: Use `upload_collection()` or `upload_points()` for large datasets to avoid payload size limits

3. **gRPC for performance**: Enable gRPC with `prefer_grpc=True` (and the matching `grpc_port`) when bulk uploads are a bottleneck, since it is typically faster than the REST interface for large transfers

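4. **Distance metrics**: Choose `COSINE` when only the direction of the vectors matters (Qdrant normalizes vectors on upload for this metric), `EUCLID` when absolute distances matter, and `DOT` when magnitude should influence the score or the vectors are already normalized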

5. **Filter efficiency**: Structure payloads with filterable fields to enable fast metadata-based searches

6. **Async for scale**: Use `AsyncQdrantClient` in async applications to handle concurrent requests efficiently

Common Use Cases

**Semantic search**: Find documents similar to a query using text embeddings

**Recommendation systems**: Discover similar items based on vector representations

**Image search**: Match images using vision model embeddings

**Duplicate detection**: Identify near-duplicate content with similarity thresholds

**RAG pipelines**: Build retrieval-augmented generation systems with vector context

**Anomaly detection**: Flag outliers using distance from cluster centroids
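As an illustration of the duplicate-detection use case, the sketch below compares every pair of vectors and reports those whose cosine similarity meets a threshold. It is a brute-force, pure-Python example with made-up vectors and an arbitrary threshold of 0.95. With Qdrant, you would more likely query each point against the collection and apply a cutoff such as the `score_threshold` argument of `query_points`.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def near_duplicates(
    vectors: Dict[str, Sequence[float]],
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.99, 0.05, 0.0],  # nearly the same direction as doc-a
    "doc-c": [0.0, 1.0, 0.0],
}
pairs = near_duplicates(vectors, threshold=0.95)
for id_a, id_b, score in pairs:
    print(f"{id_a} ~ {id_b}: {score:.4f}")
```

Only `doc-a` and `doc-b` are reported, since `doc-c` is close to orthogonal to both.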

Additional Resources

[Qdrant Documentation](https://qdrant.tech/documentation/)

[API Reference](https://api.qdrant.tech/)

[Qdrant Cloud](https://cloud.qdrant.io/)

[FastEmbed Library](https://github.com/qdrant/fastembed)

[GitHub Repository](https://github.com/qdrant/qdrant-client)
