# Search Evaluation

This package contains scripts to evaluate the performance of the vector search component.

## Evaluation

The `search-eval` script evaluates search performance. It can source data from either BigQuery or local files.

### Local File Evaluation

To run the evaluation using a local file, use the `--input-file` option.

```bash
uv run search-eval -- --input-file /path/to/your/data.csv
```

Or for a SQLite database:

```bash
uv run search-eval -- --input-file /path/to/your/data.db
```

#### Input File Structures

**CSV File**

The CSV file must contain the following columns:

| Column   | Description                                   |
|----------|-----------------------------------------------|
| `input`  | The question to be used for the search query. |
| `source` | The expected document path for the question.  |

**SQLite Database**

The SQLite database must contain a table named `evaluation_data` with the following columns:

| Column   | Description                                   |
|----------|-----------------------------------------------|
| `input`  | The question to be used for the search query. |
| `source` | The expected document path for the question.  |

### BigQuery Evaluation

The `search-eval-bq` script evaluates search performance using data sourced from and written to BigQuery.

#### BigQuery Table Structures

**Input Table**

The input table must contain the following columns:

| Column          | Type   | Description                                                                     |
|-----------------|--------|---------------------------------------------------------------------------------|
| `id`            | STRING | A unique identifier for each question.                                          |
| `question`      | STRING | The question to be used for the search query.                                   |
| `document_path` | STRING | The expected document path for the given question.                              |
| `question_type` | STRING | The type of question. Rows where `question_type` is 'Unanswerable' are ignored. |

**Output Table**

The output table will be created by the script if it doesn't exist, or appended to if it does.
It will have the following structure:

| Column                   | Type      | Description                                                                |
|--------------------------|-----------|----------------------------------------------------------------------------|
| `id`                     | STRING    | The unique identifier for the question from the input table.               |
| `question`               | STRING    | The question used for the search query.                                    |
| `expected_document`      | STRING    | The expected document for the given question.                              |
| `retrieved_documents`    | STRING[]  | An array of document IDs retrieved from the vector search.                 |
| `retrieved_distances`    | FLOAT64[] | An array of distance scores for the retrieved documents.                   |
| `is_expected_in_results` | BOOLEAN   | A flag indicating whether the expected document was in the search results. |
| `evaluation_timestamp`   | TIMESTAMP | The timestamp of when the evaluation was run.                              |

### Usage

To run the BigQuery evaluation script, use the `uv run search-eval-bq` command with the following options:

```bash
uv run search-eval-bq -- --input-table <input_table> --output-table <output_table> [--project-id <project_id>]
```

**Arguments:**

* `--input-table`: **(Required)** The full BigQuery table name for the input data (e.g., `my-gcp-project.my_dataset.questions`).
* `--output-table`: **(Required)** The full BigQuery table name for the output results (e.g., `my-gcp-project.my_dataset.eval_results`).
* `--project-id`: (Optional) The Google Cloud project ID. If not provided, it will use the `project_id` from the `config.yaml` file.

**Example:**

```bash
uv run search-eval-bq -- \
  --input-table "my-gcp-project.search_eval.synthetic_questions" \
  --output-table "my-gcp-project.search_eval.results" \
  --project-id "my-gcp-project"
```
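
### Preparing Local Input Files

A minimal sketch of producing local input files in the shapes described above, using only the Python standard library. The file names and sample rows are illustrative, not part of the package:

```python
import csv
import sqlite3

# Illustrative rows matching the required `input` / `source` columns.
rows = [
    ("What is the refund policy?", "docs/policies/refunds.md"),
    ("How do I reset my password?", "docs/account/password.md"),
]

# CSV input: a header row naming the two required columns, then the data.
with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["input", "source"])
    writer.writerows(rows)

# SQLite input: the same rows in a table named `evaluation_data`.
conn = sqlite3.connect("data.db")
conn.execute("CREATE TABLE IF NOT EXISTS evaluation_data (input TEXT, source TEXT)")
conn.executemany("INSERT INTO evaluation_data VALUES (?, ?)", rows)
conn.commit()
conn.close()
```

Either file can then be passed to `search-eval` via `--input-file`.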
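
### Interpreting the Output

A minimal sketch (not part of the package) of how output-table rows could be summarized into a hit rate: the fraction of questions whose expected document appeared among the retrieved results. The row dictionaries mirror the output-table columns; the values are hypothetical:

```python
# Hypothetical output-table rows; `is_expected_in_results` is True when
# `expected_document` appears in `retrieved_documents`.
rows = [
    {"id": "q1", "expected_document": "docs/a.md",
     "retrieved_documents": ["docs/a.md", "docs/b.md"],
     "is_expected_in_results": True},
    {"id": "q2", "expected_document": "docs/c.md",
     "retrieved_documents": ["docs/b.md", "docs/d.md"],
     "is_expected_in_results": False},
]

# Hit rate: share of questions where the expected document was retrieved.
hit_rate = sum(r["is_expected_in_results"] for r in rows) / len(rows)
print(f"hit rate: {hit_rate:.2f}")  # hit rate: 0.50
```

The same aggregation could be done directly in BigQuery with a `COUNTIF` over `is_expected_in_results`.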