First commit

Anibal Angulo
2026-02-18 19:57:43 +00:00
commit a53f8fcf62
115 changed files with 9957 additions and 0 deletions


# Search Evaluation
This package contains scripts to evaluate the performance of the vector search component.
## Evaluation
The `search-eval` script evaluates search performance. It can source data from either BigQuery or local files.
### Local File Evaluation
To run the evaluation using a local file, use the `--input-file` option.
```bash
uv run search-eval -- --input-file /path/to/your/data.csv
```
Or for a SQLite database:
```bash
uv run search-eval -- --input-file /path/to/your/data.db
```
#### Input File Structures
**CSV File**

The CSV file must contain the following columns:

| Column | Description |
|--------|-----------------------------------------------|
| `input` | The question to be used for the search query. |
| `source` | The expected document path for the question. |
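A minimal example input file can be created as below; the column names come from the table above, while the questions and document paths are illustrative placeholders:

```bash
# Write a two-row example CSV; the questions and paths below
# are made up for illustration.
cat > eval_data.csv <<'EOF'
input,source
"What is the refund policy?",docs/policies/refunds.md
"How do I reset my password?",docs/account/password-reset.md
EOF
```

The resulting file can then be passed directly via `--input-file eval_data.csv`.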
**SQLite Database**

The SQLite database must contain a table named `evaluation_data` with the following columns:

| Column | Description |
|--------|-----------------------------------------------|
| `input` | The question to be used for the search query. |
| `source` | The expected document path for the question. |
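One way to build a matching database is with the `sqlite3` CLI. This is a sketch assuming the two columns described above are plain `TEXT`; the inserted row is a placeholder:

```bash
# Create eval_data.db with the table the script expects;
# the sample row is an illustrative placeholder.
sqlite3 eval_data.db <<'EOF'
CREATE TABLE IF NOT EXISTS evaluation_data (
  input  TEXT NOT NULL,  -- question used for the search query
  source TEXT NOT NULL   -- expected document path
);
INSERT INTO evaluation_data (input, source)
VALUES ('What is the refund policy?', 'docs/policies/refunds.md');
EOF
```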
### BigQuery Evaluation
The `search-eval-bq` script evaluates search performance, reading its input data from BigQuery and writing its results back to BigQuery.
### BigQuery Table Structures
#### Input Table
The input table must contain the following columns:

| Column | Type | Description |
| --------------- | ------- | --------------------------------------------------------------------------- |
| `id` | STRING | A unique identifier for each question. |
| `question` | STRING | The question to be used for the search query. |
| `document_path` | STRING | The expected document path for the given question. |
| `question_type` | STRING | The type of question. Rows where `question_type` is 'Unanswerable' are ignored. |
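For reference, a BigQuery DDL sketch matching the input-table schema above (the table name is taken from the usage example below; it is not prescribed by the script):

```sql
-- Sketch only; the column names and types come from the table above.
CREATE TABLE IF NOT EXISTS `my-gcp-project.search_eval.synthetic_questions` (
  id            STRING,
  question      STRING,
  document_path STRING,
  question_type STRING  -- rows with 'Unanswerable' are ignored by the script
);
```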
#### Output Table
The output table will be created by the script if it doesn't exist, or appended to if it does. It will have the following structure:

| Column | Type | Description |
| ------------------------ | --------- | ------------------------------------------------------------------------ |
| `id` | STRING | The unique identifier for the question from the input table. |
| `question` | STRING | The question used for the search query. |
| `expected_document` | STRING | The expected document for the given question. |
| `retrieved_documents` | STRING[] | An array of document IDs retrieved from the vector search. |
| `retrieved_distances` | FLOAT64[] | An array of distance scores for the retrieved documents. |
| `is_expected_in_results` | BOOLEAN | A flag indicating whether the expected document was in the search results. |
| `evaluation_timestamp` | TIMESTAMP | The timestamp of when the evaluation was run. |
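A corresponding DDL sketch for the output table (again using the table name from the usage example below; in GoogleSQL DDL the array and boolean columns are spelled `ARRAY<...>` and `BOOL`):

```sql
-- Sketch only; columns mirror the output-table spec above.
CREATE TABLE IF NOT EXISTS `my-gcp-project.search_eval.results` (
  id                     STRING,
  question               STRING,
  expected_document      STRING,
  retrieved_documents    ARRAY<STRING>,    -- STRING[] above
  retrieved_distances    ARRAY<FLOAT64>,   -- FLOAT64[] above
  is_expected_in_results BOOL,             -- BOOLEAN above
  evaluation_timestamp   TIMESTAMP
);
```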
### Usage
To run the BigQuery evaluation script, use the `uv run search-eval-bq` command with the following options:
```bash
uv run search-eval-bq -- --input-table <project.dataset.table> --output-table <project.dataset.table> [--project-id <gcp-project-id>]
```
**Arguments:**
* `--input-table`: **(Required)** The full BigQuery table name for the input data (e.g., `my-gcp-project.my_dataset.questions`).
* `--output-table`: **(Required)** The full BigQuery table name for the output results (e.g., `my-gcp-project.my_dataset.eval_results`).
* `--project-id`: (Optional) The Google Cloud project ID. If not provided, it will use the `project_id` from the `config.yaml` file.
**Example:**
```bash
uv run search-eval-bq -- \
--input-table "my-gcp-project.search_eval.synthetic_questions" \
--output-table "my-gcp-project.search_eval.results" \
--project-id "my-gcp-project"
```