# Vector Search MCP - Documentation

A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.

## 📋 Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [API Documentation](#api-documentation)
- [Type Safety](#type-safety)
- [Testing](#testing)
- [Development](#development)
- [Examples](#examples)

## 🔍 Overview

This package provides a production-ready MCP server that exposes semantic search through a unified interface. It supports multiple vector database backends while maintaining type safety and comprehensive test coverage.

### Key Features

- **🔌 Pluggable Backends**: Abstract engine interface for easy backend integration
- **🛡️ Type Safety**: Full generic typing with a Rust-like associated types pattern
- **⚡ Performance**: Cached engine instances and async/await throughout
- **🧪 Well Tested**: 62 tests with 100% critical path coverage
- **📚 Comprehensive Docs**: Detailed docstrings and examples

### Supported Backends

- **Qdrant** ✅ Fully implemented with async client
- **Cosmos DB** 🚧 Planned (interface ready)

## 🏗️ Architecture

### Core Components

```mermaid
graph TB
    A[MCP Server] --> B[BaseEngine Abstract Class]
    B --> C[QdrantEngine]
    B --> D[CosmosEngine - Future]
    C --> E[Qdrant AsyncClient]
    F[Factory with Overloads] --> B
    G[Generic Type System] --> B
```

### Design Patterns

#### 1. **Abstract Factory with Overloaded Types**
```python
# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT)  # Returns: QdrantEngine

# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend)  # Returns: BaseEngine
```

#### 2. **Generic Interface (Rust-like Associated Types)**
```python
class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...
```

#### 3. **Template Method Pattern**
```python
async def semantic_search(self, ...):
    """Public interface orchestrates the workflow"""
    conditions = self.transform_conditions(...)      # Abstract
    response = await self.run_similarity_query(...)  # Abstract
    return self.transform_response(response)         # Abstract
```

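To make the three patterns above concrete, here is a minimal, self-contained sketch of a generic base class whose `semantic_search` template method drives three abstract hooks. It is not the package's source: the `SearchRow` dataclass, the exact method signatures, and the toy `InMemoryEngine` (which scores rows by dot product) are assumptions made only for illustration.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


@dataclass
class SearchRow:
    """Stand-in for the package's SearchRow model (assumed shape)."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    async def semantic_search(self, embedding, collection, limit=10,
                              conditions=None, threshold=None) -> list[SearchRow]:
        """Template method: the workflow is fixed, the steps are backend-specific."""
        backend_conditions = self.transform_conditions(conditions)
        response = await self.run_similarity_query(
            embedding, collection, limit, backend_conditions, threshold
        )
        return self.transform_response(response)

    @abstractmethod
    def transform_conditions(self, conditions) -> ConditionType | None: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[SearchRow]: ...

    @abstractmethod
    async def run_similarity_query(self, embedding, collection, limit,
                                   conditions, threshold) -> ResponseType: ...


class InMemoryEngine(BaseEngine[list[tuple[str, float]], None]):
    """Hypothetical toy backend: scores stored vectors by dot product."""

    def __init__(self, rows: dict[str, list[float]]) -> None:
        self.rows = rows

    def transform_conditions(self, conditions) -> None:
        return None  # the toy backend ignores filters

    def transform_response(self, response):
        return [SearchRow(chunk_id=i, score=s, payload={}) for i, s in response]

    async def run_similarity_query(self, embedding, collection, limit,
                                   conditions, threshold):
        scored = [
            (row_id, sum(a * b for a, b in zip(embedding, vec)))
            for row_id, vec in self.rows.items()
        ]
        if threshold is not None:
            scored = [item for item in scored if item[1] >= threshold]
        return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]


engine = InMemoryEngine({"a": [1.0, 0.0], "b": [0.0, 1.0]})
results = asyncio.run(engine.semantic_search([1.0, 0.0], "demo", limit=1))
print(results)
```

Callers only ever touch `semantic_search`; a new backend supplies the three hooks and inherits the workflow unchanged.
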
## 📖 API Documentation

### Main Entry Points

#### `run(transport: Transport = "sse")`
Starts the MCP server with the specified transport protocol.

**Parameters:**
- `transport`: Either `"sse"` (Server-Sent Events) or `"stdio"`

**Example:**
```python
from vector_search_mcp import run
run("sse")  # Start server
```

#### `get_engine(backend: Backend) -> BaseEngine`
Factory function creating cached engine instances.

**Parameters:**
- `backend`: Backend enum value (`Backend.QDRANT`, `Backend.COSMOS`)

**Returns:**
- Typed engine instance (`QdrantEngine` for `QDRANT`)

**Example:**
```python
from vector_search_mcp.engine import get_engine, Backend

engine = get_engine(Backend.QDRANT)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3],
    collection="documents",
    limit=10
)
```

### Core Classes

#### `BaseEngine[ResponseType, ConditionType]`
Abstract base class defining the engine interface.

**Generic Parameters:**
- `ResponseType`: Backend's native response format
- `ConditionType`: Backend's native filter format

**Key Methods:**
- `semantic_search()`: Main public interface
- `transform_conditions()`: Convert generic to backend conditions
- `transform_response()`: Convert backend to generic results
- `run_similarity_query()`: Execute backend-specific search

#### `QdrantEngine(BaseEngine[list[ScoredPoint], Filter])`
Concrete Qdrant implementation.

**Features:**
- Async Qdrant client with connection pooling
- Automatic payload filtering (excludes null payloads)
- Support for Match, MatchAny, MatchExclude conditions
- Named vector support

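As an illustration of the "excludes null payloads" behaviour listed above, the sketch below shows what a `transform_response` step could look like. `FakeScoredPoint` is a stand-in carrying only the `id`, `score`, and `payload` fields of Qdrant's `ScoredPoint`, and `SearchRow` is an assumed dataclass; the real `QdrantEngine` may differ in detail.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeScoredPoint:
    """Stand-in for qdrant_client's ScoredPoint (only the fields used here)."""
    id: int | str
    score: float
    payload: dict[str, Any] | None


@dataclass
class SearchRow:
    """Assumed shape of the package's standardized result row."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


def transform_response(points: list[FakeScoredPoint]) -> list[SearchRow]:
    """Convert backend points to SearchRow objects, skipping null payloads."""
    return [
        SearchRow(chunk_id=str(p.id), score=p.score, payload=p.payload)
        for p in points
        if p.payload is not None
    ]


rows = transform_response([
    FakeScoredPoint(id=1, score=0.91, payload={"text": "kept"}),
    FakeScoredPoint(id=2, score=0.88, payload=None),  # dropped
])
print(rows)
```
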
### Data Models

#### `SearchRow`
Standardized search result format.

```python
SearchRow(
    chunk_id="doc_123",           # Document identifier
    score=0.95,                   # Similarity score (0.0-1.0)
    payload={"text": "...", ...}  # Metadata dictionary
)
```

#### Condition Types

**`Match`** - Exact field matching
```python
Match(key="category", value="technology")
```

**`MatchAny`** - Match any of provided values
```python
MatchAny(key="tags", any=["python", "rust", "go"])
```

**`MatchExclude`** - Exclude specified values
```python
MatchExclude(key="status", exclude=["draft", "deleted"])
```

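The sketch below spells out one plausible reading of these three condition types by evaluating them client-side against a payload dictionary. In the package the filtering is delegated to the backend, so this is only a mental model: the dataclasses are stand-ins, and both the AND-combination of conditions and the treatment of missing keys are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Match:
    key: str
    value: Any


@dataclass
class MatchAny:
    key: str
    any: list[Any]


@dataclass
class MatchExclude:
    key: str
    exclude: list[Any]


def _field_values(payload: dict[str, Any], key: str) -> list[Any]:
    """Return the field as a list so scalar and list fields are handled alike."""
    value = payload.get(key)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def satisfies(payload: dict[str, Any], condition) -> bool:
    values = _field_values(payload, condition.key)
    if isinstance(condition, Match):
        return condition.value in values
    if isinstance(condition, MatchAny):
        return any(v in condition.any for v in values)
    if isinstance(condition, MatchExclude):
        return all(v not in condition.exclude for v in values)
    raise TypeError(f"Unsupported condition: {condition!r}")


def passes(payload: dict[str, Any], conditions: list) -> bool:
    """Assumption: all conditions must hold (logical AND)."""
    return all(satisfies(payload, c) for c in conditions)


conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="tags", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "deleted"]),
]
doc = {"category": "technology", "tags": ["python", "mcp"], "status": "published"}
print(passes(doc, conditions))
```
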
## 🛡️ Type Safety

### Generic Type System

The package uses a generic type system that gives static type safety (verified by a type checker such as mypy, not at runtime) while keeping backends flexible:

```python
# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type
        ...

    async def run_similarity_query(...) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
        ...
```

### Factory Type Overloads

```python
@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...

@overload
def get_engine(backend: Backend) -> BaseEngine: ...

# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT)  # Type: QdrantEngine
engine2 = get_engine(some_variable)   # Type: BaseEngine
```

## 🧪 Testing

### Test Coverage

- **62 Tests Total** across 4 test modules
- **100% Critical Path Coverage** for search workflows
- **Integration Testing** with full mock environments
- **Type Safety Validation** with runtime checks

### Test Structure

```
tests/test_engine/
├── test_base_engine.py      # Abstract interface tests (12 tests)
├── test_qdrant_engine.py    # Qdrant implementation (20 tests)
├── test_factory.py          # Factory and typing tests (17 tests)
├── test_integration.py      # End-to-end workflows (13 tests)
├── conftest.py              # Shared fixtures and mocks
└── README.md                # Testing documentation
```

### Running Tests

```bash
# Run all engine tests
uv run pytest tests/test_engine/ -v

# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html

# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v
```

### Key Testing Features

- **Cache Management**: Auto-clearing fixtures prevent test interference
- **Mock Isolation**: Comprehensive mocking prevents real network calls
- **Async Testing**: Full async/await support with proper event loops
- **Type Validation**: Runtime checks for generic type correctness

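The mock-isolation and async-testing points can be illustrated with the standard library alone. The sketch below is not taken from the project's `conftest.py`: `TinyEngine` and its `client.search` call are hypothetical, and the test drives the coroutine with `asyncio.run` rather than whichever pytest plugin the suite actually uses.

```python
import asyncio
from unittest.mock import AsyncMock


class TinyEngine:
    """Hypothetical engine that delegates to an injected async client."""

    def __init__(self, client) -> None:
        self.client = client

    async def semantic_search(self, embedding, collection, limit=10):
        hits = await self.client.search(collection, embedding, limit)
        # Mirror the documented behaviour of dropping null payloads.
        return [hit for hit in hits if hit["payload"] is not None]


def test_semantic_search_uses_mocked_client():
    # AsyncMock stands in for the network client, so no real call is made.
    client = AsyncMock()
    client.search.return_value = [
        {"id": 1, "score": 0.9, "payload": {"text": "hello"}},
        {"id": 2, "score": 0.8, "payload": None},
    ]
    engine = TinyEngine(client)

    results = asyncio.run(engine.semantic_search([0.1, 0.2], "documents", limit=5))

    assert [hit["id"] for hit in results] == [1]
    client.search.assert_awaited_once_with("documents", [0.1, 0.2], 5)


test_semantic_search_uses_mocked_client()
```
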
## 🛠️ Development

### Prerequisites

```bash
# Install with uv
uv sync

# Or with pip
pip install -e .
```

### Code Quality

The package maintains high code quality standards:

```bash
# Linting and formatting
uv run ruff check          # Check for issues
uv run ruff check --fix    # Auto-fix issues
uv run ruff format         # Format code

# Type checking
uv run mypy src/

# Run tests
uv run pytest
```

### Adding New Backends

1. **Define Types**: Determine `ResponseType` and `ConditionType` for your backend
2. **Implement Engine**: Create a class extending `BaseEngine[ResponseType, ConditionType]`
3. **Add to Factory**: Update the `Backend` enum and the `get_engine()` function
4. **Write Tests**: Follow existing test patterns
5. **Update Documentation**: Add examples and API docs

Example template:
```python
class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format
        ...

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects
        ...

    async def run_similarity_query(...) -> MyResponseType:
        # Execute backend-specific search
        ...
```

## 💡 Examples

### Basic Usage

```python
from vector_search_mcp.engine import get_engine, Backend
from vector_search_mcp.models import Match, MatchAny

# Create engine
engine = get_engine(Backend.QDRANT)

# Simple search
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
    collection="documents",
    limit=10
)

for result in results:
    print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")
```

### Advanced Filtering

```python
# MatchExclude is assumed to be exported alongside Match and MatchAny
from vector_search_mcp.models import Match, MatchAny, MatchExclude

# Complex conditions
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]

results = await engine.semantic_search(
    embedding=query_vector,  # Query embedding computed elsewhere
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)
```

### Custom Backend Implementation

```python
from vector_search_mcp.engine.base_engine import BaseEngine
from vector_search_mcp.models import SearchRow, Condition

class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
        # Convert to custom query string format.
        # Note: only Match carries a `.value`; MatchAny (`.any`) and
        # MatchExclude (`.exclude`) would need their own handling here.
        return " AND ".join([f"{c.key}:{c.value}" for c in conditions])

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> dict:
        # Custom backend API call; `self.custom_client` is a placeholder
        # for your backend's async client, which you must create yourself.
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )
```

### MCP Server Integration

```python
# Start the MCP server (each call starts a server, so use one or the other)
from vector_search_mcp import run

# With Server-Sent Events (web-based clients)
run("sse")

# With stdio (terminal/CLI clients)
run("stdio")
```

---

## 📚 Additional Resources

- **Source Code**: Fully documented with comprehensive docstrings
- **Test Suite**: Located in `tests/test_engine/` with detailed README
- **Type Definitions**: All public APIs have complete type annotations
- **Examples**: See `examples/` directory (if available) for more use cases

This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.