Add RAG client

2026-02-22 22:45:47 +00:00
parent 3c1c1a246a
commit 10520012d4
189 changed files with 10690 additions and 31 deletions

@@ -0,0 +1,268 @@
# RAG API Specification
## Overview
This document defines the API contract between the integration layer (`capa-de-integracion`) and the RAG server.
The RAG server replaces Dialogflow CX for intent detection and response generation using Retrieval-Augmented Generation.
## Base URL
```
https://your-rag-server.com/api/v1
```
## Authentication
- Method: API Key (optional)
- Header: `X-API-Key: <your-api-key>`
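Because the API key is optional, a client would attach the header only when a key is configured. A minimal sketch, assuming a hypothetical helper named `build_headers` (not part of the integration layer's actual code):

```python
from typing import Optional


def build_headers(api_key: Optional[str] = None) -> dict[str, str]:
    """Return request headers; X-API-Key is added only when a key is configured."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Header name taken from this spec; the key value is deployment-specific.
        headers["X-API-Key"] = api_key
    return headers
```

Calling `build_headers()` with no argument yields only the `Content-Type` header, which matches the unauthenticated case.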
---
## Endpoint: Query
### **POST /query**
Process a user message or notification and return a generated response.
### Request
**Headers:**
- `Content-Type: application/json`
- `X-API-Key: <api-key>` (optional)
**Body:**
```json
{
  "phone_number": "string (required)",
  "text": "string (required - obfuscated user input or notification text)",
  "type": "string (optional: 'conversation' or 'notification')",
  "notification": {
    "text": "string (optional - original notification text)",
    "parameters": {
      "key": "value"
    }
  },
  "language_code": "string (optional, default: 'es')"
}
```
**Field Descriptions:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `phone_number` | string | ✅ Yes | User's phone number (used by RAG for internal conversation history tracking) |
| `text` | string | ✅ Yes | Obfuscated user input (already processed by DLP in integration layer) |
| `type` | string | ❌ No | Request type: `"conversation"` (default) or `"notification"` |
| `notification` | object | ❌ No | Present only when processing a notification-related query |
| `notification.text` | string | ❌ No | Original notification text (obfuscated) |
| `notification.parameters` | object | ❌ No | Key-value pairs of notification metadata |
| `language_code` | string | ❌ No | Language code (e.g., `"es"`, `"en"`). Defaults to `"es"` |
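The field rules above can be sketched as a small payload builder. This is an illustration only: the function name `build_query_payload` and its validation messages are assumptions, and the defaults (`"conversation"`, `"es"`) are the ones stated in the table.

```python
from typing import Any, Optional

VALID_TYPES = {"conversation", "notification"}


def build_query_payload(
    phone_number: str,
    text: str,
    request_type: str = "conversation",
    notification: Optional[dict[str, Any]] = None,
    language_code: str = "es",
) -> dict[str, Any]:
    """Build the POST /query body, enforcing the two required fields."""
    if not phone_number:
        raise ValueError("Missing required field: phone_number")
    if not text:
        raise ValueError("Missing required field: text")
    if request_type not in VALID_TYPES:
        raise ValueError(f"Unsupported type: {request_type!r}")

    payload: dict[str, Any] = {
        "phone_number": phone_number,
        "text": text,  # expected to be already obfuscated by DLP
        "type": request_type,
        "language_code": language_code,
    }
    # The notification object is sent only when one is provided.
    if notification is not None:
        payload["notification"] = notification
    return payload
```

Validating on the client side mirrors the `400 Bad Request` cases, so malformed requests fail before a network call is made.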
### Response
**Status Code:** `200 OK`
**Body:**
```json
{
  "response_id": "string (unique identifier for this response)",
  "response_text": "string (generated response)",
  "parameters": {
    "key": "value"
  },
  "confidence": 0.95
}
```
**Field Descriptions:**
| Field | Type | Description |
|-------|------|-------------|
| `response_id` | string | Unique identifier for this RAG response (for tracking/logging) |
| `response_text` | string | The generated response text to send back to the user |
| `parameters` | object | Optional key-value pairs extracted or computed by RAG (can be empty) |
| `confidence` | number | Optional confidence score (0.0 - 1.0) |
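On the client side, the response body could be parsed into a typed object. The sketch below assumes that `response_id` and `response_text` are always present (the table only marks `parameters` and `confidence` as optional); the names `RagResponse` and `parse_rag_response` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RagResponse:
    response_id: str
    response_text: str
    parameters: dict[str, Any] = field(default_factory=dict)
    confidence: Optional[float] = None


def parse_rag_response(body: dict[str, Any]) -> RagResponse:
    """Convert a decoded JSON body into a RagResponse, checking the confidence range."""
    confidence = body.get("confidence")
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return RagResponse(
        response_id=body["response_id"],      # assumed required
        response_text=body["response_text"],  # assumed required
        parameters=body.get("parameters") or {},
        confidence=confidence,
    )
```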
---
## Error Responses
### **400 Bad Request**
Invalid request format or missing required fields.
```json
{
  "error": "Bad Request",
  "message": "Missing required field: phone_number",
  "status": 400
}
```
### **500 Internal Server Error**
The RAG server encountered an error while processing the request (the client retries on this status).
```json
{
  "error": "Internal Server Error",
  "message": "Failed to generate response",
  "status": 500
}
```
### **503 Service Unavailable**
The RAG server is temporarily unavailable (triggers a retry in the client).
```json
{
  "error": "Service Unavailable",
  "message": "RAG service is currently unavailable",
  "status": 503
}
```
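All three error bodies share the same shape (`error`, `message`, `status`), so a client could map them to a single exception type. A minimal sketch, assuming hypothetical names `RagApiError` and `error_from_body`; the retryable set comes from the retry strategy listed under Non-Functional Requirements.

```python
from typing import Any

# Statuses the client retries, per the Reliability requirements in this spec.
RETRYABLE_STATUSES = {500, 503, 504}


class RagApiError(Exception):
    """Error returned by the RAG server, built from the JSON error body."""

    def __init__(self, status: int, error: str, message: str) -> None:
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def error_from_body(body: dict[str, Any]) -> RagApiError:
    """Build a RagApiError from a decoded error response body."""
    return RagApiError(
        status=int(body["status"]),
        error=str(body.get("error", "")),
        message=str(body.get("message", "")),
    )
```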
---
## Example Requests
### Example 1: Regular Conversation
`POST /api/v1/query`
```json
{
  "phone_number": "573001234567",
  "text": "¿Cuál es el estado de mi solicitud?",
  "type": "conversation",
  "language_code": "es"
}
```
**Response:**
```json
{
  "response_id": "rag-resp-12345-67890",
  "response_text": "Tu solicitud está en proceso de revisión. Te notificaremos cuando esté lista.",
  "parameters": {},
  "confidence": 0.92
}
```
### Example 2: Notification Flow
`POST /api/v1/query`
```json
{
  "phone_number": "573001234567",
  "text": "necesito más información",
  "type": "notification",
  "notification": {
    "text": "Tu documento ha sido aprobado. Descárgalo desde el portal.",
    "parameters": {
      "document_id": "DOC-2025-001",
      "status": "approved"
    }
  },
  "language_code": "es"
}
```
**Response:**
```json
{
  "response_id": "rag-resp-12345-67891",
  "response_text": "Puedes descargar tu documento aprobado ingresando al portal con tu número de documento DOC-2025-001.",
  "parameters": {
    "document_id": "DOC-2025-001"
  },
  "confidence": 0.88
}
```
---
## Design Decisions
### 1. **RAG Handles Conversation History Internally**
- The RAG server maintains its own conversation history indexed by `phone_number`
- The integration layer will continue to store conversation history (redundant for now)
- This redundancy allows a gradual migration with minimal risk
### 2. **No Session ID Required**
- Unlike Dialogflow CX, which requires complex session paths, RAG uses `phone_number` as the session identifier
- Simpler and aligns with RAG's internal tracking
### 3. **Notifications Are Contextual**
- When a notification is active, the integration layer passes both:
  - The user's query (`text`)
  - The notification context (`notification.text` and `notification.parameters`)
- RAG uses this context to generate relevant responses
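The notification behavior described above can be sketched as a function that enriches a base payload only when a notification is active. The name `with_notification_context` and the shape of `active_notification` (a dict with `text` and optional `parameters`) are assumptions made for illustration.

```python
from typing import Any, Optional


def with_notification_context(
    payload: dict[str, Any],
    active_notification: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Return a copy of payload, adding notification context when one is active."""
    enriched = dict(payload)  # shallow copy, so the caller's payload is not mutated
    if active_notification is None:
        return enriched
    enriched["type"] = "notification"
    enriched["notification"] = {
        "text": active_notification["text"],
        "parameters": dict(active_notification.get("parameters", {})),
    }
    return enriched
```

With no active notification the payload passes through unchanged, so the same call path serves both request types.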
### 4. **Minimal Parameter Passing**
- Only essential data is sent to RAG
- The integration layer can store additional metadata internally without sending it to RAG
- RAG can return parameters if needed (e.g., extracted entities)
### 5. **Obfuscation Stays in Integration Layer**
- DLP obfuscation happens before calling RAG
- RAG receives already-obfuscated text
- This maintains the existing security boundary
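The ordering matters here: obfuscation must run before the request is built, so raw text never reaches the RAG server. A minimal sketch of that boundary, where `obfuscate` and `send` are hypothetical callables standing in for the DLP step and the HTTP call:

```python
from typing import Any, Callable


def query_rag_obfuscated(
    raw_text: str,
    phone_number: str,
    obfuscate: Callable[[str], str],
    send: Callable[[dict[str, Any]], dict[str, Any]],
) -> dict[str, Any]:
    """Obfuscate first, then call RAG, so raw text never crosses the boundary."""
    safe_text = obfuscate(raw_text)
    return send({"phone_number": phone_number, "text": safe_text})
```

Passing both steps in as callables keeps the sketch independent of any specific DLP or HTTP library.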
---
## Non-Functional Requirements
### Performance
- **Target Response Time:** < 2 seconds (p95)
- **Timeout:** 30 seconds (configurable in client)
### Reliability
- **Availability:** 99.5%+
- **Retry Strategy:** The client retries on 500, 503, and 504 responses, using exponential backoff
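The retry strategy can be sketched as follows. The spec only says "exponential backoff", so the attempt count (3) and base delay (0.5 s) are assumed values, and `call_with_retry` and `send` are hypothetical names. The per-request timeout would be applied inside `send`.

```python
import time
from typing import Any, Callable

RETRYABLE_STATUSES = {500, 503, 504}


def call_with_retry(
    send: Callable[[], tuple[int, Any]],
    max_attempts: int = 3,       # assumed; not specified in this document
    base_delay: float = 0.5,     # assumed; not specified in this document
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, Any]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE_STATUSES:
            break
        # Delay doubles on each retry: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * (2 ** attempt))
        status, body = send()
    return status, body
```

The `sleep` parameter is injectable so the backoff schedule can be verified in tests without waiting. If every attempt fails, the last retryable response is returned to the caller.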
### Scalability
- **Concurrent Requests:** Support 100+ concurrent requests
- **Rate Limiting:** None currently specified (to be confirmed with the RAG team)
---
## Migration Notes
### What the Integration Layer Will Do:
- ✅ Continue to obfuscate text via DLP before calling RAG
- ✅ Continue to store conversation history in Memorystore + Firestore (redundant but safe)
- ✅ Continue to manage session timeouts (30 minutes)
- ✅ Continue to handle notification storage and retrieval
- ✅ Map `DetectIntentRequestDTO` → RAG request format
- ✅ Map RAG response → `DetectIntentResponseDTO`
### What the RAG Server Will Do:
- ✅ Maintain its own conversation history by `phone_number`
- ✅ Use notification context when provided to generate relevant responses
- ✅ Generate responses using RAG (retrieval + generation)
- ✅ Return structured responses with optional parameters
### What We're NOT Changing:
- ❌ External API contracts (controllers remain unchanged)
- ❌ DTO structures (`DetectIntentRequestDTO`, `DetectIntentResponseDTO`)
- ❌ Conversation storage logic (Memorystore + Firestore)
- ❌ DLP obfuscation flow
- ❌ Session management (30-minute timeout)
- ❌ Notification storage
---
## Questions for RAG Team
Before implementation:
1. **Endpoint URL:** What is the actual RAG server URL?
2. **Authentication:** Do we need API key authentication? If yes, what's the header format?
3. **Timeout:** What's a reasonable timeout? (We're using 30s as default)
4. **Rate Limiting:** Any rate limits we should be aware of?
5. **Conversation History:** Does RAG need explicit conversation history, or does it fetch by `phone_number` internally?
6. **Response Parameters:** Will RAG return any extracted parameters, or just `response_text`?
7. **Health Check:** Is there a `/health` endpoint for monitoring?
8. **Versioning:** Should we use `/api/v1/query` or a different version?
---
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-02-22 | Initial specification based on 3 core requirements |