Anibal Angulo b44a209d42 Add docstrings

2025-09-26 15:45:13 +00:00

Vector Search MCP - Documentation

A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.

📋 Table of Contents

Overview
Architecture
API Documentation
Type Safety
Testing
Development
Examples

🔍 Overview

This package provides a production-ready MCP server that enables semantic search capabilities through a unified interface. It supports multiple vector database backends while maintaining type safety and comprehensive test coverage.

Key Features

🔌 Pluggable Backends: Abstract engine interface for easy backend integration
🛡️ Type Safety: Full generic typing with Rust-like associated types pattern
⚡ Performance: Caching and async/await throughout
🧪 Well Tested: 62 tests with 100% critical path coverage
📚 Comprehensive Docs: Detailed docstrings and examples

Supported Backends

Qdrant ✅ Fully implemented with async client
Cosmos DB 🚧 Planned (interface ready)

🏗️ Architecture

Core Components

graph TB
    A[MCP Server] --> B[BaseEngine Abstract Class]
    B --> C[QdrantEngine]
    B --> D[CosmosEngine - Future]
    C --> E[Qdrant AsyncClient]
    F[Factory with Overloads] --> B
    G[Generic Type System] --> B

Design Patterns

1. Abstract Factory with Overloaded Types

# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT)  # Returns: QdrantEngine

# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend)  # Returns: BaseEngine

2. Generic Interface (Rust-like Associated Types)

class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...

3. Template Method Pattern

async def semantic_search(self, ...):
    """Public interface orchestrates the workflow"""
    conditions = self.transform_conditions(...)  # Abstract
    response = await self.run_similarity_query(...)  # Abstract
    return self.transform_response(response)  # Abstract
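
The workflow above can be sketched end to end with a toy backend. This is a minimal sketch, not the package's code: only the three hook names and `semantic_search` come from the text above. `ToyEngine`, its in-memory `rows`, the dict-shaped conditions, and the reduced parameter list are assumptions made for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Generic, Optional, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    """Stand-in base class: semantic_search is the template method."""

    @abstractmethod
    def transform_conditions(self, conditions: Optional[dict]) -> Optional[ConditionType]: ...

    @abstractmethod
    async def run_similarity_query(
        self, embedding: list[float], limit: int, conditions: Optional[ConditionType]
    ) -> ResponseType: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[dict]: ...

    async def semantic_search(self, embedding, limit=10, conditions=None):
        # The order of steps is fixed here; subclasses only supply the hooks.
        native = self.transform_conditions(conditions)
        response = await self.run_similarity_query(embedding, limit, native)
        return self.transform_response(response)


class ToyEngine(BaseEngine[list[tuple[str, float]], str]):
    """Hypothetical backend holding (id, score, category) rows in memory."""

    rows = [("a", 0.9, "tech"), ("b", 0.8, "news"), ("c", 0.7, "tech")]

    def transform_conditions(self, conditions):
        # Native filter format of this toy backend: a bare category string.
        return conditions.get("category") if conditions else None

    async def run_similarity_query(self, embedding, limit, conditions):
        # The embedding is ignored; scores are precomputed in `rows`.
        hits = [(i, s) for i, s, cat in self.rows if conditions in (None, cat)]
        return hits[:limit]

    def transform_response(self, response):
        return [{"chunk_id": i, "score": s} for i, s in response]


results = asyncio.run(
    ToyEngine().semantic_search([0.1, 0.2], conditions={"category": "tech"})
)
```

Because the orchestration lives in the base class, a new backend cannot accidentally skip the condition or response transformation.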

📖 API Documentation

Main Entry Points

`run(transport: Transport = "sse")`

Starts the MCP server with specified transport protocol.

Parameters:

transport: Either "sse" (Server-Sent Events) or "stdio"

Example:

from vector_search_mcp import run
run("sse")  # Start server
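
When the transport comes from user input (an environment variable or CLI flag), it may help to validate it before calling `run`. The sketch below assumes `Transport` is a `Literal` of the two documented values; that definition and the `parse_transport` helper are hypothetical and not part of the package.

```python
from typing import Literal, get_args

# Assumed definition; the package may declare Transport differently.
Transport = Literal["sse", "stdio"]


def parse_transport(value: str) -> Transport:
    """Validate a user-supplied string before handing it to run()."""
    allowed = get_args(Transport)
    if value not in allowed:
        raise ValueError(f"unsupported transport {value!r}; expected one of {allowed}")
    return value  # type: ignore[return-value]


transport = parse_transport("stdio")
```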

`get_engine(backend: Backend) -> BaseEngine`

Factory function creating cached engine instances.

Parameters:

backend: Backend enum value (Backend.QDRANT, Backend.COSMOS)

Returns:

Typed engine instance (QdrantEngine for QDRANT)

Example:

from vector_search_mcp.engine import get_engine, Backend

engine = get_engine(Backend.QDRANT)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3],
    collection="documents",
    limit=10
)
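
The text says the factory returns cached instances but not how. One plausible way to get that behavior is `functools.cache`, sketched below with stand-in classes. The caching mechanism and the `NotImplementedError` for unimplemented backends are assumptions.

```python
from enum import Enum
from functools import cache


class Backend(Enum):
    QDRANT = "qdrant"
    COSMOS = "cosmos"


class QdrantEngine:
    """Stand-in for the real engine, which would hold a client connection."""


@cache
def get_engine(backend: Backend) -> QdrantEngine:
    # The body runs once per distinct backend; later calls reuse the result.
    if backend is Backend.QDRANT:
        return QdrantEngine()
    raise NotImplementedError(f"{backend.name} backend is not implemented yet")


first = get_engine(Backend.QDRANT)
second = get_engine(Backend.QDRANT)
same_instance = first is second
```

Sharing one engine per backend means the underlying client is created once and reused rather than rebuilt on every call.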

Core Classes

`BaseEngine[ResponseType, ConditionType]`

Abstract base class defining the engine interface.

Generic Parameters:

ResponseType: Backend's native response format
ConditionType: Backend's native filter format

Key Methods:

semantic_search(): Main public interface
transform_conditions(): Convert generic to backend conditions
transform_response(): Convert backend to generic results
run_similarity_query(): Execute backend-specific search

`QdrantEngine(BaseEngine[list[ScoredPoint], Filter])`

Concrete Qdrant implementation.

Features:

Async Qdrant client with connection pooling
Automatic payload filtering (excludes null payloads)
Support for Match, MatchAny, MatchExclude conditions
Named vector support
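
The null-payload filtering mentioned above could look roughly like this. It is a sketch with dataclass stand-ins: the local `ScoredPoint` only mimics the shape of a backend hit, and the real `SearchRow` and `transform_response` may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ScoredPoint:
    """Hypothetical stand-in for a backend hit."""
    id: str
    score: float
    payload: Optional[dict[str, Any]] = None


@dataclass
class SearchRow:
    chunk_id: str
    score: float
    payload: dict[str, Any]


def transform_response(points: list[ScoredPoint]) -> list[SearchRow]:
    # Skip hits without a payload so callers never receive payload=None.
    return [
        SearchRow(chunk_id=str(p.id), score=p.score, payload=p.payload)
        for p in points
        if p.payload is not None
    ]


rows = transform_response([
    ScoredPoint("doc_1", 0.95, {"text": "hello"}),
    ScoredPoint("doc_2", 0.90, None),
])
```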

Data Models

`SearchRow`

Standardized search result format.

SearchRow(
    chunk_id="doc_123",           # Document identifier
    score=0.95,                   # Similarity score (range depends on the distance metric)
    payload={"text": "...", ...}  # Metadata dictionary
)

Condition Types

Match - Exact field matching

Match(key="category", value="technology")

MatchAny - Match any of provided values

MatchAny(key="tags", any=["python", "rust", "go"])

MatchExclude - Exclude specified values

MatchExclude(key="status", exclude=["draft", "deleted"])
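
To make the semantics of the three conditions concrete, here is a small sketch that evaluates them against a payload dictionary. The field names (`key`, `value`, `any`, `exclude`) follow the examples above. The dataclass definitions, the `satisfies` helper, and the AND combination are assumptions, since the real filtering happens inside the backend.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Match:
    key: str
    value: Any


@dataclass
class MatchAny:
    key: str
    any: list


@dataclass
class MatchExclude:
    key: str
    exclude: list


Condition = Union[Match, MatchAny, MatchExclude]


def satisfies(payload: dict[str, Any], cond: Condition) -> bool:
    field = payload.get(cond.key)
    if isinstance(cond, Match):
        return field == cond.value          # exact match
    if isinstance(cond, MatchAny):
        return field in cond.any            # any of the listed values
    return field not in cond.exclude        # none of the listed values


def matches_all(payload: dict[str, Any], conditions: list[Condition]) -> bool:
    # Conditions are combined with AND in this sketch.
    return all(satisfies(payload, c) for c in conditions)


payload = {"category": "technology", "language": "rust", "status": "published"}
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "deleted"]),
]
ok = matches_all(payload, conditions)
```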

🛡️ Type Safety

Generic Type System

The package uses a generic type system that gives static (type-check-time) safety while keeping backends flexible:

# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type
        ...

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
        ...

Factory Type Overloads

@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...

@overload
def get_engine(backend: Backend) -> BaseEngine: ...

# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT)      # Type: QdrantEngine
engine2 = get_engine(some_variable)       # Type: BaseEngine
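
The `@overload` stubs only inform the type checker. At runtime they must be followed by a single undecorated implementation. A self-contained sketch with stand-in classes (the implementation body is an assumption):

```python
from enum import Enum
from typing import Literal, overload


class Backend(Enum):
    QDRANT = "qdrant"
    COSMOS = "cosmos"


class BaseEngine:
    """Stand-in for the abstract base class."""


class QdrantEngine(BaseEngine):
    """Stand-in for the Qdrant implementation."""


@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...
@overload
def get_engine(backend: Backend) -> BaseEngine: ...


def get_engine(backend: Backend) -> BaseEngine:
    # The one function that actually runs; the stubs above have no runtime effect.
    if backend is Backend.QDRANT:
        return QdrantEngine()
    raise NotImplementedError(f"{backend.name} backend is not implemented yet")


engine = get_engine(Backend.QDRANT)  # inferred as QdrantEngine by a type checker
```

Listing the `Literal` overload first matters: type checkers pick the first matching overload, so the more specific signature has to precede the general one.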

🧪 Testing

Test Coverage

62 Tests Total across 4 test modules
100% Critical Path Coverage for search workflows
Integration Testing with full mock environments
Type Safety Validation with runtime checks

Test Structure

tests/test_engine/
├── test_base_engine.py      # Abstract interface tests (12 tests)
├── test_qdrant_engine.py    # Qdrant implementation (20 tests)
├── test_factory.py          # Factory and typing tests (17 tests)
├── test_integration.py      # End-to-end workflows (13 tests)
├── conftest.py              # Shared fixtures and mocks
└── README.md                # Testing documentation

Running Tests

# Run all engine tests
uv run pytest tests/test_engine/ -v

# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html

# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v

Key Testing Features

Cache Management: Auto-clearing fixtures prevent test interference
Mock Isolation: Comprehensive mocking prevents real network calls
Async Testing: Full async/await support with proper event loops
Type Validation: Runtime checks for generic type correctness
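
The cache-management and mock-isolation points can be illustrated with the standard library alone. This sketch uses a hypothetical `Engine` wrapper and a cached `get_engine`. In the real suite the `cache_clear()` call would presumably live in an auto-use pytest fixture, which is not shown here.

```python
import asyncio
from functools import cache
from unittest.mock import AsyncMock


class Engine:
    """Hypothetical engine wrapping an async client."""

    def __init__(self, client):
        self.client = client

    async def search(self, embedding, limit=10):
        return await self.client.search(vector=embedding, limit=limit)


@cache
def get_engine() -> Engine:
    # AsyncMock stands in for the network client, so no real calls are made.
    return Engine(client=AsyncMock())


def run_isolated_test():
    get_engine.cache_clear()  # what an auto-clearing fixture would do
    engine = get_engine()
    engine.client.search.return_value = [{"id": "doc_1", "score": 0.9}]
    hits = asyncio.run(engine.search([0.1, 0.2], limit=5))
    engine.client.search.assert_awaited_once_with(vector=[0.1, 0.2], limit=5)
    return engine, hits


engine_a, hits_a = run_isolated_test()
engine_b, hits_b = run_isolated_test()
```

Without the `cache_clear()` call, the second run would reuse the first engine and its mock, and `assert_awaited_once_with` would fail because the mock had already been awaited.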

🛠️ Development

Prerequisites

# Install with uv
uv sync

# Or with pip
pip install -e .

Code Quality

The package maintains high code quality standards:

# Linting and formatting
uv run ruff check          # Check for issues
uv run ruff check --fix    # Auto-fix issues
uv run ruff format         # Format code

# Type checking
uv run mypy src/

# Run tests
uv run pytest

Adding New Backends

Define Types: Determine ResponseType and ConditionType for your backend
Implement Engine: Create class extending BaseEngine[ResponseType, ConditionType]
Add to Factory: Update Backend enum and get_engine() function
Write Tests: Follow existing test patterns
Update Documentation: Add examples and API docs

Example template:

class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format
        ...

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects
        ...

    async def run_similarity_query(self, embedding, collection, limit=10,
                                   conditions=None, threshold=None) -> MyResponseType:
        # Execute backend-specific search
        ...

💡 Examples

Basic Usage

from vector_search_mcp.engine import get_engine, Backend
from vector_search_mcp.models import Match, MatchAny

# Create engine
engine = get_engine(Backend.QDRANT)

# Simple search
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
    collection="documents",
    limit=10
)

for result in results:
    print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")

Advanced Filtering

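from vector_search_mcp.models import Match, MatchAny, MatchExclude

# Complex conditions (query_vector is a query embedding computed elsewhere)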
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]

results = await engine.semantic_search(
    embedding=query_vector,
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)
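
To show what `limit` and `threshold` mean for the result set, here is a brute-force sketch in pure Python. It assumes cosine similarity as the metric. In the package the backend performs this work, and the helper names here are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, vectors, limit, threshold=None):
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    if threshold is not None:
        # Drop anything below the minimum similarity score.
        scored = [(d, s) for d, s in scored if s >= threshold]
    # Highest score first, then keep at most `limit` results.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


vectors = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}
hits = top_matches([1.0, 0.0], vectors, limit=2, threshold=0.5)
```

Here `c` is orthogonal to the query and falls below the threshold, so only `a` and `b` are returned, in descending score order.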

Custom Backend Implementation

from vector_search_mcp.engine.base_engine import BaseEngine
from vector_search_mcp.models import SearchRow, Condition

class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
        # Convert to custom query string format (assumes every condition is a
        # Match; MatchAny and MatchExclude have no `value` attribute)
        return " AND ".join(f"{c.key}:{c.value}" for c in conditions)

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                 conditions=None, threshold=None) -> dict:
        # Custom backend API call (self.custom_client is a placeholder for your own client)
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )

MCP Server Integration

# Start the MCP server
from vector_search_mcp import run

# With Server-Sent Events (web-based clients)
run("sse")

# With stdio (terminal/CLI clients)
run("stdio")

📚 Additional Resources

Source Code: Fully documented with comprehensive docstrings
Test Suite: Located in tests/test_engine/ with detailed README
Type Definitions: All public APIs have complete type annotations
Examples: See examples/ directory (if available) for more use cases

This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.