# Vector Search MCP - Documentation

A comprehensive Model Context Protocol (MCP) server for vector similarity search operations with pluggable backend support.

## 📋 Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [API Documentation](#api-documentation)
- [Type Safety](#type-safety)
- [Testing](#testing)
- [Development](#development)
- [Examples](#examples)

## 🔍 Overview

This package provides a production-ready MCP server that enables semantic search capabilities through a unified interface. It supports multiple vector database backends while maintaining type safety and comprehensive test coverage.

### Key Features

- **🔌 Pluggable Backends**: Abstract engine interface for easy backend integration
- **🛡️ Type Safety**: Full generic typing with Rust-like associated types pattern
- **⚡ Performance**: Caching and async/await throughout
- **🧪 Well Tested**: 62 tests with 100% critical path coverage
- **📚 Comprehensive Docs**: Detailed docstrings and examples

### Supported Backends

- **Qdrant** ✅ Fully implemented with async client
- **Cosmos DB** 🚧 Planned (interface ready)

## 🏗️ Architecture

### Core Components

```mermaid
graph TB
    A[MCP Server] --> B[BaseEngine Abstract Class]
    B --> C[QdrantEngine]
    B --> D[CosmosEngine - Future]
    C --> E[Qdrant AsyncClient]
    F[Factory with Overloads] --> B
    G[Generic Type System] --> B
```

### Design Patterns

#### 1. **Abstract Factory with Overloaded Types**
```python
# Type checker knows exact return type for literals
engine = get_engine(Backend.QDRANT)  # Returns: QdrantEngine

# Generic typing for variables
backend: Backend = some_variable
engine = get_engine(backend)  # Returns: BaseEngine
```

#### 2. **Generic Interface (Rust-like Associated Types)**
```python
class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    # ResponseType: Backend-specific raw response (e.g., list[ScoredPoint])
    # ConditionType: Backend-specific filter type (e.g., models.Filter)
    ...

class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    # Concrete implementation with Qdrant types
    ...
```

#### 3. **Template Method Pattern**
```python
async def semantic_search(self, ...):
    """Public interface orchestrates the workflow"""
    conditions = self.transform_conditions(...)  # Abstract
    response = await self.run_similarity_query(...)  # Abstract
    return self.transform_response(response)  # Abstract
```
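
To make the workflow above concrete, here is a minimal, self-contained sketch of the same template-method shape. It is not the package's source: `SearchRow` is a stand-in dataclass, `InMemoryEngine` is a hypothetical toy backend that scores by dot product, and the method signatures are assumptions based on the examples in this README.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar

ResponseType = TypeVar("ResponseType")
ConditionType = TypeVar("ConditionType")


@dataclass
class SearchRow:
    """Stand-in for the package's SearchRow model."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


class BaseEngine(ABC, Generic[ResponseType, ConditionType]):
    async def semantic_search(self, embedding, collection, limit=10,
                              conditions=None, threshold=None) -> list[SearchRow]:
        # Template method: the order of the steps is fixed here,
        # while subclasses supply each step.
        backend_conditions = self.transform_conditions(conditions)
        response = await self.run_similarity_query(
            embedding, collection, limit, backend_conditions, threshold
        )
        return self.transform_response(response)

    @abstractmethod
    def transform_conditions(self, conditions) -> Optional[ConditionType]: ...

    @abstractmethod
    def transform_response(self, response: ResponseType) -> list[SearchRow]: ...

    @abstractmethod
    async def run_similarity_query(self, embedding, collection, limit,
                                   conditions, threshold) -> ResponseType: ...


class InMemoryEngine(BaseEngine[list[tuple[str, float]], None]):
    """Hypothetical toy backend: ranks stored vectors by dot product."""

    def __init__(self, vectors: dict[str, list[float]]):
        self.vectors = vectors

    def transform_conditions(self, conditions):
        return None  # the toy backend ignores filters

    async def run_similarity_query(self, embedding, collection, limit,
                                   conditions, threshold):
        scored = [
            (chunk_id, sum(a * b for a, b in zip(embedding, vector)))
            for chunk_id, vector in self.vectors.items()
        ]
        if threshold is not None:
            scored = [item for item in scored if item[1] >= threshold]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:limit]

    def transform_response(self, response):
        return [SearchRow(chunk_id=cid, score=score, payload={}) for cid, score in response]


engine = InMemoryEngine({"a": [1.0, 0.0], "b": [0.0, 1.0]})
rows = asyncio.run(engine.semantic_search([1.0, 0.0], "demo", limit=1))
print(rows)
```

Because the orchestration lives only in `semantic_search`, a new backend implements the three abstract methods and leaves the public workflow untouched.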

## 📖 API Documentation

### Main Entry Points

#### `run(transport: Transport = "sse")`
Starts the MCP server with specified transport protocol.

**Parameters:**
- `transport`: Either `"sse"` (Server-Sent Events) or `"stdio"`

**Example:**
```python
from vector_search_mcp import run
run("sse")  # Start server
```

#### `get_engine(backend: Backend) -> BaseEngine`
Factory function creating cached engine instances.

**Parameters:**
- `backend`: Backend enum value (Backend.QDRANT, Backend.COSMOS)

**Returns:**
- Typed engine instance (QdrantEngine for QDRANT)

**Example:**
```python
from vector_search_mcp.engine import get_engine, Backend

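engine = get_engine(Backend.QDRANT)

# `await` needs an async context: call this from an `async def`
# and drive it with `asyncio.run(...)`.
results = await engine.semantic_search(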
    embedding=[0.1, 0.2, 0.3],
    collection="documents",
    limit=10
)
```

### Core Classes

#### `BaseEngine[ResponseType, ConditionType]`
Abstract base class defining the engine interface.

**Generic Parameters:**
- `ResponseType`: Backend's native response format
- `ConditionType`: Backend's native filter format

**Key Methods:**
- `semantic_search()`: Main public interface
- `transform_conditions()`: Convert generic to backend conditions
- `transform_response()`: Convert backend to generic results
- `run_similarity_query()`: Execute backend-specific search

#### `QdrantEngine(BaseEngine[list[ScoredPoint], Filter])`
Concrete Qdrant implementation.

**Features:**
- Async Qdrant client with connection pooling
- Automatic payload filtering (excludes null payloads)
- Support for Match, MatchAny, MatchExclude conditions
- Named vector support
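
As a rough illustration of the condition support listed above, the sketch below maps the three condition types onto Qdrant's JSON filter syntax (`match.value`, `match.any`, `match.except`) using plain dicts. The dataclasses are stand-ins for the package's models, `to_qdrant_filter` is a hypothetical helper, and combining every clause under `must` (logical AND) is an assumption. The real `QdrantEngine.transform_conditions` returns a `models.Filter` and may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Match:
    key: str
    value: Any


@dataclass(frozen=True)
class MatchAny:
    key: str
    any: list


@dataclass(frozen=True)
class MatchExclude:
    key: str
    exclude: list


Condition = Union[Match, MatchAny, MatchExclude]


def to_qdrant_filter(conditions: Optional[list]) -> Optional[dict]:
    """Hypothetical helper: build a Qdrant-style JSON filter from generic conditions."""
    if not conditions:
        return None
    must = []
    for condition in conditions:
        if isinstance(condition, Match):
            match = {"value": condition.value}
        elif isinstance(condition, MatchAny):
            match = {"any": list(condition.any)}
        elif isinstance(condition, MatchExclude):
            match = {"except": list(condition.exclude)}
        else:
            raise TypeError(f"Unsupported condition: {condition!r}")
        must.append({"key": condition.key, "match": match})
    # Assumption: all conditions must hold at once (AND semantics).
    return {"must": must}


print(to_qdrant_filter([
    Match(key="category", value="technology"),
    MatchExclude(key="status", exclude=["draft", "deleted"]),
]))
```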

### Data Models

#### `SearchRow`
Standardized search result format.

```python
SearchRow(
    chunk_id="doc_123",           # Document identifier
    score=0.95,                   # Similarity score (0.0-1.0)
    payload={"text": "...", ...}  # Metadata dictionary
)
```
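
The following sketch shows one way backend hits could be normalized into this shape while dropping hits without a payload, as the Qdrant engine's payload filtering is described above. `FakeScoredPoint` and `to_search_rows` are hypothetical names, `SearchRow` is redefined here as a plain dataclass so the snippet runs on its own, and deriving `chunk_id` from `str(id)` is an assumption rather than the package's confirmed behavior.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass
class FakeScoredPoint:
    """Stand-in for a backend hit with only the fields used below."""
    id: Union[int, str]
    score: float
    payload: Optional[dict[str, Any]]


@dataclass
class SearchRow:
    """Stand-in for the package's SearchRow model."""
    chunk_id: str
    score: float
    payload: dict[str, Any]


def to_search_rows(points: list[FakeScoredPoint]) -> list[SearchRow]:
    # Skip hits whose payload is None; an empty dict is still kept.
    return [
        SearchRow(chunk_id=str(point.id), score=point.score, payload=point.payload)
        for point in points
        if point.payload is not None
    ]


rows = to_search_rows([
    FakeScoredPoint("doc_123", 0.95, {"text": "hello"}),
    FakeScoredPoint(7, 0.80, None),
])
print(rows)
```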

#### Condition Types

**`Match`** - Exact field matching
```python
Match(key="category", value="technology")
```

**`MatchAny`** - Match any of provided values
```python
MatchAny(key="tags", any=["python", "rust", "go"])
```

**`MatchExclude`** - Exclude specified values
```python
MatchExclude(key="status", exclude=["draft", "deleted"])
```

## 🛡️ Type Safety

### Generic Type System

The package uses a generic type system that gives static type safety (checked by tools such as mypy rather than at runtime) while keeping backends flexible:

```python
# Engine implementations specify their exact types
class QdrantEngine(BaseEngine[list[models.ScoredPoint], models.Filter]):
    def transform_response(self, response: list[models.ScoredPoint]) -> list[SearchRow]:
        # Type checker validates response parameter type

    async def run_similarity_query(...) -> list[models.ScoredPoint]:
        # Type checker validates return type matches generic parameter
```

### Factory Type Overloads

```python
@overload
def get_engine(backend: Literal[Backend.QDRANT]) -> QdrantEngine: ...

@overload
def get_engine(backend: Backend) -> BaseEngine: ...

# Usage provides different type information:
engine1 = get_engine(Backend.QDRANT)      # Type: QdrantEngine
engine2 = get_engine(some_variable)       # Type: BaseEngine
```

## 🧪 Testing

### Test Coverage

- **62 Tests Total** across 4 test modules
- **100% Critical Path Coverage** for search workflows
- **Integration Testing** with full mock environments
- **Type Safety Validation** with runtime checks

### Test Structure

```
tests/test_engine/
├── test_base_engine.py      # Abstract interface tests (12 tests)
├── test_qdrant_engine.py    # Qdrant implementation (20 tests)
├── test_factory.py          # Factory and typing tests (17 tests)
├── test_integration.py      # End-to-end workflows (13 tests)
├── conftest.py              # Shared fixtures and mocks
└── README.md                # Testing documentation
```

### Running Tests

```bash
# Run all engine tests
uv run pytest tests/test_engine/ -v

# Run with coverage
uv run pytest tests/test_engine/ --cov=src/vector_search_mcp/engine --cov-report=html

# Run specific test categories
uv run pytest tests/test_engine/test_integration.py -v
```

### Key Testing Features

- **Cache Management**: Auto-clearing fixtures prevent test interference
- **Mock Isolation**: Comprehensive mocking prevents real network calls
- **Async Testing**: Full async/await support with proper event loops
- **Type Validation**: Runtime checks for generic type correctness
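
The mock-isolation and async-testing ideas above can be sketched with the standard library alone. This is not taken from the package's test suite: `search_texts` is a hypothetical function under test, the client's `search` method is invented for the example, and the test drives the coroutine with `asyncio.run` instead of a pytest async plugin.

```python
import asyncio
from unittest.mock import AsyncMock


async def search_texts(client, vector, limit=10):
    """Hypothetical code under test: return the text of each hit that has a payload."""
    hits = await client.search(vector=vector, limit=limit)
    return [hit["payload"]["text"] for hit in hits if hit.get("payload")]


def test_search_texts_skips_null_payloads():
    # AsyncMock stands in for the network client, so no real call is made.
    client = AsyncMock()
    client.search.return_value = [
        {"payload": {"text": "hello"}},
        {"payload": None},
    ]

    result = asyncio.run(search_texts(client, [0.1, 0.2], limit=2))

    assert result == ["hello"]
    client.search.assert_awaited_once_with(vector=[0.1, 0.2], limit=2)


test_search_texts_skips_null_payloads()
```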

## 🛠️ Development

### Prerequisites

```bash
# Install dependencies with uv
uv sync

# Or with pip
pip install -e .
```

### Code Quality

The package maintains high code quality standards:

```bash
# Linting and formatting
uv run ruff check          # Check for issues
uv run ruff check --fix    # Auto-fix issues
uv run ruff format         # Format code

# Type checking
uv run mypy src/

# Run tests
uv run pytest
```

### Adding New Backends

1. **Define Types**: Determine ResponseType and ConditionType for your backend
2. **Implement Engine**: Create class extending `BaseEngine[ResponseType, ConditionType]`
3. **Add to Factory**: Update `Backend` enum and `get_engine()` function
4. **Write Tests**: Follow existing test patterns
5. **Update Documentation**: Add examples and API docs

Example template:
```python
class MyEngine(BaseEngine[MyResponseType, MyConditionType]):
    def transform_conditions(self, conditions: list[Condition] | None) -> MyConditionType | None:
        # Convert generic conditions to backend format

    def transform_response(self, response: MyResponseType) -> list[SearchRow]:
        # Convert backend response to SearchRow objects

    async def run_similarity_query(...) -> MyResponseType:
        # Execute backend-specific search
```

## 💡 Examples

### Basic Usage

```python
from vector_search_mcp.engine import get_engine, Backend
from vector_search_mcp.models import Match, MatchAny

# Create engine
engine = get_engine(Backend.QDRANT)

# Simple search (run inside an `async def`, e.g. via `asyncio.run(main())`)
results = await engine.semantic_search(
    embedding=[0.1, 0.2, 0.3, 0.4, 0.5],
    collection="documents",
    limit=10
)

for result in results:
    print(f"Score: {result.score:.3f} - {result.payload['text'][:50]}...")
```

### Advanced Filtering

```python
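from vector_search_mcp.models import Match, MatchAny, MatchExclude

# Complex conditions (reuses `engine` from Basic Usage)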
conditions = [
    Match(key="category", value="technology"),
    MatchAny(key="language", any=["python", "rust", "go"]),
    MatchExclude(key="status", exclude=["draft", "archived"])
]

results = await engine.semantic_search(
    embedding=query_vector,  # query embedding computed elsewhere
    collection="tech_docs",
    limit=20,
    conditions=conditions,
    threshold=0.75  # Minimum similarity score
)
```

### Custom Backend Implementation

```python
from vector_search_mcp.engine.base_engine import BaseEngine
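from vector_search_mcp.models import Condition, Match, MatchAny, MatchExclude, SearchRow

class CustomEngine(BaseEngine[dict, str]):
    """Example custom backend implementation."""

    def transform_conditions(self, conditions: list[Condition] | None) -> str | None:
        if not conditions:
            return None
        # Convert to an illustrative query string format.
        # Only Match has `.value`; MatchAny and MatchExclude carry lists.
        clauses = []
        for c in conditions:
            if isinstance(c, Match):
                clauses.append(f"{c.key}:{c.value}")
            elif isinstance(c, MatchAny):
                clauses.append(f"{c.key}:({' OR '.join(map(str, c.any))})")
            elif isinstance(c, MatchExclude):
                clauses.append(f"NOT {c.key}:({' OR '.join(map(str, c.exclude))})")
        return " AND ".join(clauses)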

    def transform_response(self, response: dict) -> list[SearchRow]:
        # Convert custom response to SearchRow objects
        return [
            SearchRow(
                chunk_id=str(item['id']),
                score=item['similarity'],
                payload=item['metadata']
            )
            for item in response.get('results', [])
        ]

    async def run_similarity_query(self, embedding, collection, limit=10,
                                 conditions=None, threshold=None) -> dict:
        # Custom backend API call (`self.custom_client` is a placeholder
        # for your backend's async client, set up in `__init__`)
        return await self.custom_client.search(
            vector=embedding,
            index=collection,
            limit=limit,
            filter=conditions,
            min_score=threshold
        )
```

### MCP Server Integration

```python
# Start the MCP server
from vector_search_mcp import run

# With Server-Sent Events (web-based clients)
run("sse")

# With stdio (terminal/CLI clients)
run("stdio")
```

---

## 📚 Additional Resources

- **Source Code**: Fully documented with comprehensive docstrings
- **Test Suite**: Located in `tests/test_engine/` with detailed README
- **Type Definitions**: All public APIs have complete type annotations
- **Examples**: See `examples/` directory (if available) for more use cases

This documentation covers the current state of the Vector Search MCP package. The architecture is designed for extensibility, type safety, and production use.