Vector Search MCP - Documentation
A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.
🔍 Overview
This package provides a production-ready MCP server that enables semantic search capabilities through a unified interface. It supports multiple vector database backends while maintaining type safety and comprehensive test coverage.
Key Features
- 🔌 Pluggable Backends: Abstract engine interface for easy backend integration
- 🛡️ Type Safety: Full generic typing with Rust-like associated types pattern
- ⚡ Performance: Caching and async/await throughout
- 🧪 Well Tested: 62 tests with 100% critical path coverage
- 📚 Comprehensive Docs: Detailed docstrings and examples
Supported Backends
- Qdrant ✅ Fully implemented with async client
- Cosmos DB 🚧 Planned (interface ready)
🏗️ Architecture
Core Components
graph TB
A[MCP Server] --> B[BaseEngine Abstract Class]
B --> C[QdrantEngine]
B --> D[CosmosEngine - Future]
C --> E[Qdrant AsyncClient]
F[Factory with Overloads] --> B
G[Generic Type System] --> B
Design Patterns
1. Abstract Factory with Overloaded Types
# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT) # Returns: QdrantEngine
# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend) # Returns: BaseEngine
2. Generic Interface (Rust-like Associated Types)
class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...
3. Template Method Pattern
async def semantic_search(self, ...):
    """Public interface orchestrates the workflow"""
    conditions = self.transform_conditions(...)      # Abstract
    response = await self.run_similarity_query(...)  # Abstract
    return self.transform_response(response)         # Abstract
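The three-step workflow above can be sketched end to end with a toy in-memory backend. This is an illustration only: `ToyEngine`, its tuple-based response type, the dict-based conditions, and the simplified `SearchRow` are assumptions made for the example, not the package's actual classes or signatures.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


@dataclass
class SearchRow:
    """Simplified stand-in for the package's SearchRow."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    async def semantic_search(self, embedding: list[float], limit: int = 10,
                              conditions: dict[str, Any] | None = None) -> list[SearchRow]:
        """Template method: the workflow is fixed, the steps are backend-specific."""
        native = self.transform_conditions(conditions)
        response = await self.run_similarity_query(embedding, limit, native)
        return self.transform_response(response)

    @abstractmethod
    def transform_conditions(self, conditions: dict[str, Any] | None) -> ConditionType | None: ...

    @abstractmethod
    async def run_similarity_query(self, embedding: list[float], limit: int,
                                   conditions: ConditionType | None) -> ResponseType: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[SearchRow]: ...


class ToyEngine(BaseEngine[list[tuple[str, float]], str]):
    """Hypothetical backend: responses are (id, score) tuples, filters are strings."""

    def transform_conditions(self, conditions):
        if not conditions:
            return None
        return " AND ".join(f"{k}:{v}" for k, v in conditions.items())

    async def run_similarity_query(self, embedding, limit, conditions):
        # A real backend would query its index here; this returns canned hits.
        hits = [("doc_1", 0.9), ("doc_2", 0.7), ("doc_3", 0.4)]
        return hits[:limit]

    def transform_response(self, response):
        return [SearchRow(chunk_id=i, score=s, payload={}) for i, s in response]


rows = asyncio.run(ToyEngine().semantic_search([0.1, 0.2], limit=2))
```

Because `semantic_search` lives only in the base class, a new backend in this sketch supplies the three abstract steps and inherits the orchestration unchanged.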
📖 API Documentation
Main Entry Points
run(transport: Transport = "sse")
Starts the MCP server with specified transport protocol.
Parameters:
transport: Either "sse" (Server-Sent Events) or "stdio"
Example:
from vector_search_mcp import run
run("sse") # Start server
get_engine(backend: Backend) -> BaseEngine
Factory function creating cached engine instances.
Parameters:
backend: Backend enum value (Backend.QDRANT, Backend.COSMOS)
Returns:
- Typed engine instance (QdrantEngine for QDRANT)
Example:
from vector_search_mcp.engine import get_engine, Backend
engine = get_engine(Backend.QDRANT)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3],
    collection="documents",
    limit=10
)
Core Classes
BaseEngine[ResponseType, ConditionType]
Abstract base class defining the engine interface.
Generic Parameters:
- ResponseType: Backend's native response format
- ConditionType: Backend's native filter format
Key Methods:
- semantic_search(): Main public interface
- transform_conditions(): Convert generic to backend conditions
- transform_response(): Convert backend to generic results
- run_similarity_query(): Execute backend-specific search
QdrantEngine(BaseEngine[list[ScoredPoint], Filter])
Concrete Qdrant implementation.
Features:
- Async Qdrant client with connection pooling
- Automatic payload filtering (excludes null payloads)
- Support for Match, MatchAny, MatchExclude conditions
- Named vector support
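The payload filtering listed above can be illustrated with a small conversion function. This is a sketch under stated assumptions: `RawHit` is a hypothetical stand-in that mirrors only the `id`, `score`, and `payload` fields of a backend hit, and `to_search_rows` is an illustrative name, not a function from the package.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RawHit:
    """Hypothetical stand-in for a backend hit (id, score, optional payload)."""
    id: int
    score: float
    payload: Optional[dict[str, Any]]


@dataclass
class SearchRow:
    """Simplified stand-in for the package's SearchRow."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


def to_search_rows(hits: list[RawHit]) -> list[SearchRow]:
    """Convert raw hits to SearchRow objects, dropping hits without a payload."""
    return [
        SearchRow(chunk_id=str(hit.id), score=hit.score, payload=hit.payload)
        for hit in hits
        if hit.payload is not None
    ]


hits = [RawHit(1, 0.92, {"text": "hello"}), RawHit(2, 0.88, None)]
rows = to_search_rows(hits)  # only the first hit survives
```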
Data Models
SearchRow
Standardized search result format.
SearchRow(
    chunk_id="doc_123",            # Document identifier
    score=0.95,                    # Similarity score (0.0-1.0)
    payload={"text": "...", ...}   # Metadata dictionary
)
Condition Types
Match - Exact field matching
Match(key="category", value="technology")
MatchAny - Match any of provided values
MatchAny(key="tags", any=["python", "rust", "go"])
MatchExclude - Exclude specified values
MatchExclude(key="status", exclude=["draft", "deleted"])
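To make the semantics of the three condition types concrete, the sketch below evaluates them against a plain payload dictionary. The dataclasses are simplified stand-ins for the package's models, and `satisfies` and `matches` are hypothetical helpers. The sketch assumes scalar payload fields and treats a missing key as `None`; a real backend may handle list-valued or missing fields differently.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Match:
    key: str
    value: Any


@dataclass(frozen=True)
class MatchAny:
    key: str
    any: list[Any]


@dataclass(frozen=True)
class MatchExclude:
    key: str
    exclude: list[Any]


Condition = Union[Match, MatchAny, MatchExclude]


def satisfies(payload: dict[str, Any], condition: Condition) -> bool:
    """Check one condition against a payload (scalar fields only)."""
    field = payload.get(condition.key)  # missing keys are treated as None
    if isinstance(condition, Match):
        return field == condition.value
    if isinstance(condition, MatchAny):
        return field in condition.any
    return field not in condition.exclude


def matches(payload: dict[str, Any], conditions: list[Condition]) -> bool:
    """A payload matches when every condition holds (logical AND)."""
    return all(satisfies(payload, c) for c in conditions)


doc = {"category": "technology", "language": "rust", "status": "published"}
conds = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "deleted"]),
]
is_match = matches(doc, conds)
```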
🛡️ Type Safety
Generic Type System
The package uses generics so that static type checkers such as mypy can verify backend-specific types before the code runs, while the interface stays flexible:
# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type
        ...

    async def run_similarity_query(self, ...) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
        ...
Factory Type Overloads
@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...
@overload
def get_engine(backend: Backend) -> BaseEngine: ...
# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT) # Type: QdrantEngine
engine2 = get_engine(some_variable) # Type: BaseEngine
🧪 Testing
Test Coverage
- 62 Tests Total across 4 test modules
- 100% Critical Path Coverage for search workflows
- Integration Testing with full mock environments
- Type Safety Validation with runtime checks
Test Structure
tests/test_engine/
├── test_base_engine.py # Abstract interface tests (12 tests)
├── test_qdrant_engine.py # Qdrant implementation (20 tests)
├── test_factory.py # Factory and typing tests (17 tests)
├── test_integration.py # End-to-end workflows (13 tests)
├── conftest.py # Shared fixtures and mocks
└── README.md # Testing documentation
Running Tests
# Run all engine tests
uv run pytest tests/test_engine/ -v
# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html
# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v
Key Testing Features
- Cache Management: Auto-clearing fixtures prevent test interference
- Mock Isolation: Comprehensive mocking prevents real network calls
- Async Testing: Full async/await support with proper event loops
- Type Validation: Runtime checks for generic type correctness
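The mock-isolation and async-testing points above can be illustrated with the standard library alone. The sketch below is not taken from the package's test suite: `EngineUnderTest` and its client interface are hypothetical, and `asyncio.run` stands in for whatever async test plugin the project actually uses.

```python
import asyncio
from unittest.mock import AsyncMock


class EngineUnderTest:
    """Hypothetical engine that delegates to an injected async client."""

    def __init__(self, client):
        self.client = client

    async def semantic_search(self, embedding, collection, limit=10):
        hits = await self.client.search(collection=collection, vector=embedding, limit=limit)
        # Drop hits without a payload before returning them.
        return [hit for hit in hits if hit["payload"] is not None]


def test_search_uses_mocked_client():
    # AsyncMock makes client.search awaitable without any network access.
    client = AsyncMock()
    client.search.return_value = [
        {"id": 1, "score": 0.9, "payload": {"text": "a"}},
        {"id": 2, "score": 0.8, "payload": None},
    ]
    engine = EngineUnderTest(client)

    results = asyncio.run(engine.semantic_search([0.1, 0.2], "documents", limit=5))

    assert [r["id"] for r in results] == [1]
    client.search.assert_awaited_once_with(
        collection="documents", vector=[0.1, 0.2], limit=5
    )
```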
🛠️ Development
Prerequisites
# Install dependencies with uv
uv sync
# Or with pip
pip install -e .
Code Quality
The package maintains high code quality standards:
# Linting and formatting
uv run ruff check # Check for issues
uv run ruff check --fix # Auto-fix issues
uv run ruff format # Format code
# Type checking
uv run mypy src/
# Run tests
uv run pytest
Adding New Backends
- Define Types: Determine ResponseType and ConditionType for your backend
- Implement Engine: Create class extending BaseEngine[ResponseType, ConditionType]
- Add to Factory: Update Backend enum and get_engine() function
- Write Tests: Follow existing test patterns
- Update Documentation: Add examples and API docs
Example template:
class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format
        ...

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects
        ...

    async def run_similarity_query(self, ...) -> MyResponseType:
        # Execute backend-specific search
        ...
💡 Examples
Basic Usage
from vector_search_mcp.engine import get_engine, Backend
from vector_search_mcp.models import Match, MatchAny
# Create engine
engine = get_engine(Backend.QDRANT)
# Simple search
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
    collection="documents",
    limit=10
)
for result in results:
    print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")
Advanced Filtering
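# Complex conditions (engine is created as in Basic Usage; query_vector is your query embedding)
# MatchExclude is assumed to be exported from the same module as Match and MatchAny
from vector_search_mcp.models import Match, MatchAny, MatchExclude

conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]
results = await engine.semantic_search(
    embedding=query_vector,
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)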
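To show how `limit` and `threshold` interact, here is a brute-force sketch in pure Python. It is only an illustration of the idea: the cosine metric, the inclusive threshold comparison, and the `brute_force_search` helper are assumptions, since the actual scoring is performed by the backend and depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_search(query, vectors, limit=10, threshold=None):
    """Score every stored vector, drop low scores, return the top `limit` hits."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    if threshold is not None:
        # Assumption: the threshold is inclusive.
        scored = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
hits = brute_force_search([1.0, 0.0], vectors, limit=2, threshold=0.5)
```

Here the threshold removes `"b"` (similarity 0.0) before the limit is applied, so the two remaining hits are `"a"` and `"c"` in descending score order.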
Custom Backend Implementation
from vector_search_mcp.engine.base_engine import BaseEngine
from vector_search_mcp.models import SearchRow, Condition
class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
        # Convert to custom query string format.
        # Note: only Match carries a `value` attribute; MatchAny and
        # MatchExclude would need their own handling here.
        return " AND ".join(f"{c.key}:{c.value}" for c in conditions)

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> dict:
        # Custom backend API call
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )
MCP Server Integration
# Start the MCP server
from vector_search_mcp import run
# With Server-Sent Events (web-based clients)
run("sse")
# With stdio (terminal/CLI clients)
run("stdio")
📚 Additional Resources
- Source Code: Fully documented with comprehensive docstrings
- Test Suite: Located in tests/test_engine/ with detailed README
- Type Definitions: All public APIs have complete type annotations
- Examples: See examples/ directory (if available) for more use cases
This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.