Compare commits

8 Commits

Author SHA1 Message Date
Anibal Angulo
3401af3952 update redis data 2025-11-10 21:52:35 -06:00
Anibal Angulo
19c4841afc localize UI to english 2025-11-09 11:14:42 -06:00
Anibal Angulo
1ce4162e4a add analysis component 2025-11-09 10:24:58 -06:00
Anibal Angulo
77a11ef32e add agent context 2025-11-09 08:35:01 -06:00
Sebastian
a23f45ca6d Web search agent built and implemented (Tavily .env example in readme) 2025-11-08 10:18:32 +00:00
Sebastian
70f2a42502 Fixed Qdrant bug and uploaded extracted data to Redis with a reference to the document 2025-11-07 23:30:10 +00:00
Anibal Angulo
c9a63e129d initial agent 2025-11-07 11:19:43 -06:00
Anibal Angulo
af9b5fed01 wip chat 2025-11-07 09:41:18 -06:00
50 changed files with 7201 additions and 522 deletions


@@ -0,0 +1,3 @@
This is how the .env should look for the web search agent
TAVILY_API_KEY=

384
ai-elements-tools.md Normal file

@@ -0,0 +1,384 @@
# Generative User Interfaces
Generative user interfaces (generative UI) is the process of allowing a large language model (LLM) to go beyond text and "generate UI". This creates a more engaging and AI-native experience for users.
<WeatherSearch />
At the core of generative UI are [tools](/docs/ai-sdk-core/tools-and-tool-calling), which are functions you provide to the model to perform specialized tasks like getting the weather in a location. The model can decide when and how to use these tools based on the context of the conversation.
Generative UI is the process of connecting the results of a tool call to a React component. Here's how it works:
1. You provide the model with a prompt or conversation history, along with a set of tools.
2. Based on the context, the model may decide to call a tool.
3. If a tool is called, it will execute and return data.
4. This data can then be passed to a React component for rendering.
By passing the tool results to React components, you can create a generative UI experience that's more engaging and adaptive to your needs.
## Build a Generative UI Chat Interface
Let's create a chat interface that handles text-based conversations and incorporates dynamic UI elements based on model responses.
### Basic Chat Implementation
Start with a basic chat implementation using the `useChat` hook:
```tsx filename="app/page.tsx"
'use client';
import { useChat } from '@ai-sdk/react';
import { useState } from 'react';
export default function Page() {
const [input, setInput] = useState('');
const { messages, sendMessage } = useChat();
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
sendMessage({ text: input });
setInput('');
};
return (
<div>
{messages.map(message => (
<div key={message.id}>
<div>{message.role === 'user' ? 'User: ' : 'AI: '}</div>
<div>
{message.parts.map((part, index) => {
if (part.type === 'text') {
return <span key={index}>{part.text}</span>;
}
return null;
})}
</div>
</div>
))}
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={e => setInput(e.target.value)}
placeholder="Type a message..."
/>
<button type="submit">Send</button>
</form>
</div>
);
}
```
To handle the chat requests and model responses, set up an API route:
```ts filename="app/api/chat/route.ts"
import { openai } from '@ai-sdk/openai';
import { streamText, convertToModelMessages, UIMessage, stepCountIs } from 'ai';
export async function POST(request: Request) {
const { messages }: { messages: UIMessage[] } = await request.json();
const result = streamText({
model: openai('gpt-4o'),
system: 'You are a friendly assistant!',
messages: convertToModelMessages(messages),
stopWhen: stepCountIs(5),
});
return result.toUIMessageStreamResponse();
}
```
This API route uses the `streamText` function to process chat messages and stream the model's responses back to the client.
### Create a Tool
Before enhancing your chat interface with dynamic UI elements, you need to create a tool and corresponding React component. A tool will allow the model to perform a specific action, such as fetching weather information.
Create a new file called `ai/tools.ts` with the following content:
```ts filename="ai/tools.ts"
import { tool as createTool } from 'ai';
import { z } from 'zod';
export const weatherTool = createTool({
description: 'Display the weather for a location',
inputSchema: z.object({
location: z.string().describe('The location to get the weather for'),
}),
execute: async function ({ location }) {
await new Promise(resolve => setTimeout(resolve, 2000));
return { weather: 'Sunny', temperature: 75, location };
},
});
export const tools = {
displayWeather: weatherTool,
};
```
In this file, you've created a tool called `weatherTool`. It simulates fetching weather information for a given location and returns hard-coded data after a 2-second delay. In a real-world application, you would replace this simulation with an actual API call to a weather service.
### Update the API Route
Update the API route to include the tool you've defined:
```ts filename="app/api/chat/route.ts" highlight="3,8,14"
import { openai } from '@ai-sdk/openai';
import { streamText, convertToModelMessages, UIMessage, stepCountIs } from 'ai';
import { tools } from '@/ai/tools';
export async function POST(request: Request) {
const { messages }: { messages: UIMessage[] } = await request.json();
const result = streamText({
model: openai('gpt-4o'),
system: 'You are a friendly assistant!',
messages: convertToModelMessages(messages),
stopWhen: stepCountIs(5),
tools,
});
return result.toUIMessageStreamResponse();
}
```
Now that you've defined the tool and added it to your `streamText` call, let's build a React component to display the weather information it returns.
### Create UI Components
Create a new file called `components/weather.tsx`:
```tsx filename="components/weather.tsx"
type WeatherProps = {
temperature: number;
weather: string;
location: string;
};
export const Weather = ({ temperature, weather, location }: WeatherProps) => {
return (
<div>
<h2>Current Weather for {location}</h2>
<p>Condition: {weather}</p>
<p>Temperature: {temperature}°F</p>
</div>
);
};
```
This component will display the weather information for a given location. It takes three props: `temperature`, `weather`, and `location` (exactly what the `weatherTool` returns).
### Render the Weather Component
Now that you have your tool and corresponding React component, let's integrate them into your chat interface. You'll render the Weather component when the model calls the weather tool.
To check if the model has called a tool, you can check the `parts` array of the UIMessage object for tool-specific parts. In AI SDK 5.0, tool parts use typed naming: `tool-${toolName}` instead of generic types.
Update your `page.tsx` file:
```tsx filename="app/page.tsx" highlight="4,9,14-15,19-46"
'use client';
import { useChat } from '@ai-sdk/react';
import { useState } from 'react';
import { Weather } from '@/components/weather';
export default function Page() {
const [input, setInput] = useState('');
const { messages, sendMessage } = useChat();
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
sendMessage({ text: input });
setInput('');
};
return (
<div>
{messages.map(message => (
<div key={message.id}>
<div>{message.role === 'user' ? 'User: ' : 'AI: '}</div>
<div>
{message.parts.map((part, index) => {
if (part.type === 'text') {
return <span key={index}>{part.text}</span>;
}
if (part.type === 'tool-displayWeather') {
switch (part.state) {
case 'input-available':
return <div key={index}>Loading weather...</div>;
case 'output-available':
return (
<div key={index}>
<Weather {...part.output} />
</div>
);
case 'output-error':
return <div key={index}>Error: {part.errorText}</div>;
default:
return null;
}
}
return null;
})}
</div>
</div>
))}
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={e => setInput(e.target.value)}
placeholder="Type a message..."
/>
<button type="submit">Send</button>
</form>
</div>
);
}
```
In this updated code snippet, you:
1. Use manual input state management with `useState` instead of the built-in `input` and `handleInputChange`.
2. Use `sendMessage` instead of `handleSubmit` to send messages.
3. Check the `parts` array of each message for different content types.
4. Handle tool parts with type `tool-displayWeather` and their different states (`input-available`, `output-available`, `output-error`).
This approach allows you to dynamically render UI components based on the model's responses, creating a more interactive and context-aware chat experience.
## Expanding Your Generative UI Application
You can enhance your chat application by adding more tools and components, creating a richer and more versatile user experience. Here's how you can expand your application:
### Adding More Tools
To add more tools, simply define them in your `ai/tools.ts` file:
```ts
// Add a new stock tool
export const stockTool = createTool({
description: 'Get price for a stock',
inputSchema: z.object({
symbol: z.string().describe('The stock symbol to get the price for'),
}),
execute: async function ({ symbol }) {
// Simulated API call
await new Promise(resolve => setTimeout(resolve, 2000));
return { symbol, price: 100 };
},
});
// Update the tools object
export const tools = {
displayWeather: weatherTool,
getStockPrice: stockTool,
};
```
Now, create a new file called `components/stock.tsx`:
```tsx
type StockProps = {
price: number;
symbol: string;
};
export const Stock = ({ price, symbol }: StockProps) => {
return (
<div>
<h2>Stock Information</h2>
<p>Symbol: {symbol}</p>
<p>Price: ${price}</p>
</div>
);
};
```
Finally, update your `page.tsx` file to include the new Stock component:
```tsx
'use client';
import { useChat } from '@ai-sdk/react';
import { useState } from 'react';
import { Weather } from '@/components/weather';
import { Stock } from '@/components/stock';
export default function Page() {
const [input, setInput] = useState('');
const { messages, sendMessage } = useChat();
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
sendMessage({ text: input });
setInput('');
};
return (
<div>
{messages.map(message => (
<div key={message.id}>
<div>{message.role}</div>
<div>
{message.parts.map((part, index) => {
if (part.type === 'text') {
return <span key={index}>{part.text}</span>;
}
if (part.type === 'tool-displayWeather') {
switch (part.state) {
case 'input-available':
return <div key={index}>Loading weather...</div>;
case 'output-available':
return (
<div key={index}>
<Weather {...part.output} />
</div>
);
case 'output-error':
return <div key={index}>Error: {part.errorText}</div>;
default:
return null;
}
}
if (part.type === 'tool-getStockPrice') {
switch (part.state) {
case 'input-available':
return <div key={index}>Loading stock price...</div>;
case 'output-available':
return (
<div key={index}>
<Stock {...part.output} />
</div>
);
case 'output-error':
return <div key={index}>Error: {part.errorText}</div>;
default:
return null;
}
}
return null;
})}
</div>
</div>
))}
<form onSubmit={handleSubmit}>
<input
type="text"
value={input}
onChange={e => setInput(e.target.value)}
/>
<button type="submit">Send</button>
</form>
</div>
);
}
```
By following this pattern, you can continue to add more tools and components, expanding the capabilities of your Generative UI application.

105
backend/RATE_LIMITING.md Normal file

@@ -0,0 +1,105 @@
# Rate Limiting Configuration for Azure OpenAI
This document explains how to configure rate limiting to avoid `429 RateLimitReached` errors in Azure OpenAI.
## Environment Variables
Add these variables to your `.env` file:
```bash
# Rate limiting for embeddings
EMBEDDING_BATCH_SIZE=16
EMBEDDING_DELAY_BETWEEN_BATCHES=1.0
EMBEDDING_MAX_RETRIES=5
```
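As a minimal sketch of how a Python backend might read these three variables, the snippet below falls back to the values listed above when a variable is unset. The `EmbeddingRateLimitSettings` class and the `load_rate_limit_settings` function are hypothetical names used for illustration; the project's actual settings loader may look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbeddingRateLimitSettings:
    """Hypothetical container for the three rate-limiting variables."""

    batch_size: int
    delay_between_batches: float
    max_retries: int


def load_rate_limit_settings(
    env: Optional[Mapping[str, str]] = None,
) -> EmbeddingRateLimitSettings:
    """Read the variables from `env` (defaults to os.environ).

    The fallback values mirror the example .env block above.
    """
    source = os.environ if env is None else env
    return EmbeddingRateLimitSettings(
        batch_size=int(source.get("EMBEDDING_BATCH_SIZE", "16")),
        delay_between_batches=float(
            source.get("EMBEDDING_DELAY_BETWEEN_BATCHES", "1.0")
        ),
        max_retries=int(source.get("EMBEDDING_MAX_RETRIES", "5")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the loader easy to test.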
## Configuration by Azure OpenAI Tier
### **S0 Tier (Free/Basic)**
- **Limit**: ~1-3 requests/minute, ~1,000 tokens/minute
- **Recommended configuration**:
```bash
EMBEDDING_BATCH_SIZE=16
EMBEDDING_DELAY_BETWEEN_BATCHES=1.0
EMBEDDING_MAX_RETRIES=5
```
### **Standard Tier**
- **Limit**: ~10-20 requests/second, ~100,000 tokens/minute
- **Recommended configuration**:
```bash
EMBEDDING_BATCH_SIZE=50
EMBEDDING_DELAY_BETWEEN_BATCHES=0.5
EMBEDDING_MAX_RETRIES=3
```
### **Premium Tier**
- **Limit**: ~100+ requests/second, ~500,000+ tokens/minute
- **Recommended configuration**:
```bash
EMBEDDING_BATCH_SIZE=100
EMBEDDING_DELAY_BETWEEN_BATCHES=0.1
EMBEDDING_MAX_RETRIES=3
```
## How Rate Limiting Works
### 1. **Batching**
Texts are split into batches of size `EMBEDDING_BATCH_SIZE`. A smaller batch reduces the likelihood of exceeding the rate limit.
### 2. **Delays between Batches**
After processing each batch, the system waits `EMBEDDING_DELAY_BETWEEN_BATCHES` seconds before processing the next one.
### 3. **Retry with Exponential Backoff**
If a 429 (rate limit) error occurs:
- **Retry 1**: waits 2 seconds
- **Retry 2**: waits 4 seconds
- **Retry 3**: waits 8 seconds
- **Retry 4**: waits 16 seconds
- **Retry 5**: waits 32 seconds
After `EMBEDDING_MAX_RETRIES` retries, the process fails.
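The three mechanisms above can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: `RateLimitError` is a hypothetical stand-in for whatever exception the embeddings client raises on a 429, `embed_batch` is a caller-supplied function, and `sleep` is injectable only so the sketch can be exercised without real waiting.

```python
import time
from typing import Callable, List, Sequence


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's 429 error."""


def backoff_seconds(attempt: int) -> int:
    """Wait before retry number `attempt` (1-based): 2, 4, 8, 16, 32, ..."""
    return 2 ** attempt


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 16,
    delay_between_batches: float = 1.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[float]]:
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        attempt = 0
        while True:
            try:
                vectors.extend(embed_batch(batch))
                break
            except RateLimitError:
                attempt += 1
                if attempt > max_retries:
                    # All retries used: let the error propagate.
                    raise
                sleep(backoff_seconds(attempt))
        # Pause only when another batch is still waiting.
        if start + batch_size < len(texts):
            sleep(delay_between_batches)
    return vectors
```

Doubling the wait on each retry gives the service progressively more time to recover, while the fixed delay between batches keeps the steady-state request rate low.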
## Log Monitoring
When you process documents, you will see logs like the following (the application emits them in Spanish):
```
📊 Procesando batch 1/10 (16 textos)...
✓ Batch 1/10 completado exitosamente
📊 Procesando batch 2/10 (16 textos)...
⚠️ Rate limit alcanzado en batch 2/10. Reintento 1/5 en 2s...
✓ Batch 2/10 completado exitosamente
...
✅ Embeddings generados exitosamente: 150 vectores de 3072D
```
## Processing Time Calculation
To estimate how long processing will take:
```
Estimated time = (total_chunks / EMBEDDING_BATCH_SIZE) * EMBEDDING_DELAY_BETWEEN_BATCHES
```
**Examples**:
- 100 chunks with the S0 config: `(100/16) * 1.0 = ~6.25 seconds` (not counting retries)
- 1000 chunks with the S0 config: `(1000/16) * 1.0 = ~62.5 seconds` (not counting retries)
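The estimate can be reproduced with a short Python sketch. The function names are hypothetical. `estimate_seconds` applies the formula exactly as written, `batch_count` shows the whole number of batches actually sent (the last one may be partial), and `worst_case_retry_seconds` adds up the backoff waits from the retry schedule for a single batch that uses every retry. None of these account for the duration of the requests themselves.

```python
import math


def estimate_seconds(total_chunks: int, batch_size: int, delay: float) -> float:
    """The formula above: (total_chunks / batch_size) * delay."""
    return (total_chunks / batch_size) * delay


def batch_count(total_chunks: int, batch_size: int) -> int:
    """Number of batches sent; a partial final batch still counts as one."""
    return math.ceil(total_chunks / batch_size)


def worst_case_retry_seconds(max_retries: int) -> int:
    """Backoff total for one batch that needs every retry: 2 + 4 + ... + 2**n."""
    return sum(2 ** attempt for attempt in range(1, max_retries + 1))


print(estimate_seconds(100, 16, 1.0))   # 6.25
print(estimate_seconds(1000, 16, 1.0))  # 62.5
print(batch_count(100, 16))             # 7
print(worst_case_retry_seconds(5))      # 62
```

With the S0 settings, a single batch that exhausts all five retries adds 62 seconds of waiting, which can outweigh the inter-batch delays for small jobs.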
## Dynamic Adjustment
If you experience many 429 errors:
1. **Reduce** `EMBEDDING_BATCH_SIZE` (e.g., from 16 to 8)
2. **Increase** `EMBEDDING_DELAY_BETWEEN_BATCHES` (e.g., from 1.0 to 2.0)
3. **Increase** `EMBEDDING_MAX_RETRIES` (e.g., from 5 to 10)
If processing is very slow and you have NO 429 errors:
1. **Increase** `EMBEDDING_BATCH_SIZE` (e.g., from 16 to 32)
2. **Reduce** `EMBEDDING_DELAY_BETWEEN_BATCHES` (e.g., from 1.0 to 0.5)
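The two adjustment rules can be expressed as a small helper. This is only a sketch: `RateLimitSettings` and `suggest_settings` are hypothetical names, and the halving and doubling factors are assumptions chosen to match the example values in the lists above, not tuned recommendations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RateLimitSettings:
    batch_size: int = 16
    delay_between_batches: float = 1.0
    max_retries: int = 5


def suggest_settings(
    current: RateLimitSettings, rate_limit_errors: int, too_slow: bool
) -> RateLimitSettings:
    """Return adjusted settings following the two rules above."""
    if rate_limit_errors > 0:
        # Back off: smaller batches, longer pauses, more retries.
        return replace(
            current,
            batch_size=max(1, current.batch_size // 2),
            delay_between_batches=current.delay_between_batches * 2,
            max_retries=current.max_retries * 2,
        )
    if too_slow:
        # Speed up only when no 429 errors were observed.
        return replace(
            current,
            batch_size=current.batch_size * 2,
            delay_between_batches=current.delay_between_batches / 2,
        )
    return current
```

The 429 check comes first so that a slow run with rate-limit errors is never sped up.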
## Upgrading Your Azure OpenAI Tier
To increase your limit, visit:
https://aka.ms/oai/quotaincrease
After the upgrade, adjust the environment variables to match your new tier.


@@ -0,0 +1,112 @@
from __future__ import annotations

from typing import Any, Iterable, List

from app.agents.form_auditor.models import ExtractedIrsForm990PfDataSchema

from .agent import agent
from .metrics import SnapshotBundle, build_key_metrics, build_snapshots
from .models import AnalystReport, AnalystState

__all__ = ["build_performance_report"]


def _resolve_year(
    entry: dict[str, Any], extraction: ExtractedIrsForm990PfDataSchema
) -> int:
    candidates: Iterable[Any] = (
        entry.get("calendar_year"),
        entry.get("year"),
        entry.get("tax_year"),
        entry.get("return_year"),
        entry.get("metadata", {}).get("return_year")
        if isinstance(entry.get("metadata"), dict)
        else None,
        entry.get("metadata", {}).get("tax_year")
        if isinstance(entry.get("metadata"), dict)
        else None,
        extraction.core_organization_metadata.calendar_year,
    )
    for candidate in candidates:
        if candidate in (None, ""):
            continue
        try:
            return int(candidate)
        except (TypeError, ValueError):
            continue
    raise ValueError("Unable to determine filing year for one of the payload entries.")


async def build_performance_report(payloads: List[dict[str, Any]]) -> AnalystReport:
    if not payloads:
        raise ValueError("At least one payload is required for performance analysis.")

    bundles: List[SnapshotBundle] = []
    organisation_name = ""
    organisation_ein = ""

    for entry in payloads:
        if not isinstance(entry, dict):
            raise TypeError("Each payload entry must be a dict.")
        extraction_payload = entry.get("extraction") if "extraction" in entry else entry
        extraction = ExtractedIrsForm990PfDataSchema.model_validate(extraction_payload)
        year = _resolve_year(entry, extraction)
        if not organisation_ein:
            organisation_ein = extraction.core_organization_metadata.ein
            organisation_name = extraction.core_organization_metadata.legal_name
        else:
            if extraction.core_organization_metadata.ein != organisation_ein:
                raise ValueError(
                    "All payload entries must belong to the same organization."
                )
        bundles.append(SnapshotBundle(year=year, extraction=extraction))

    bundles.sort(key=lambda bundle: bundle.year)
    snapshots = build_snapshots(bundles)
    metrics = build_key_metrics(snapshots)

    notes = []
    if metrics:
        revenue_metric = metrics[0]
        expense_metric = metrics[1] if len(metrics) > 1 else None
        if revenue_metric.cagr is not None:
            notes.append(f"Revenue CAGR: {revenue_metric.cagr:.2%}")
        if expense_metric and expense_metric.cagr is not None:
            notes.append(f"Expense CAGR: {expense_metric.cagr:.2%}")
        surplus_metric = next(
            (m for m in metrics if m.name == "Operating Surplus"), None
        )
        if surplus_metric:
            last_value = surplus_metric.points[-1].value if surplus_metric.points else 0
            notes.append(f"Latest operating surplus: {last_value:,.0f}")

    state = AnalystState(
        organisation_name=organisation_name,
        organisation_ein=organisation_ein,
        series=snapshots,
        key_metrics=metrics,
        notes=notes,
    )
    prompt = (
        "Analyze the provided multi-year financial context. Quantify notable trends, "
        "call out risks or strengths, and supply actionable recommendations. "
        "Capture both positive momentum and areas requiring attention."
    )
    result = await agent.run(prompt, deps=state)
    report = result.output
    years = [snapshot.year for snapshot in snapshots]
    return report.model_copy(
        update={
            "organisation_name": organisation_name,
            "organisation_ein": organisation_ein,
            "years_analyzed": years,
            "key_metrics": metrics,
        }
    )


@@ -0,0 +1,33 @@
from __future__ import annotations

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

from app.core.config import settings

from .models import AnalystReport, AnalystState

provider = AzureProvider(
    azure_endpoint=settings.AZURE_OPENAI_ENDPOINT,
    api_version=settings.AZURE_OPENAI_API_VERSION,
    api_key=settings.AZURE_OPENAI_API_KEY,
)
model = OpenAIChatModel(model_name="gpt-4o", provider=provider)

agent = Agent(
    model=model,
    name="MultiYearAnalyst",
    deps_type=AnalystState,
    output_type=AnalystReport,
    system_prompt=(
        "You are a nonprofit financial analyst. You receive multi-year Form 990 extractions "
        "summarized into deterministic metrics (series, ratios, surplus, CAGR). Use the context "
        "to highlight performance trends, governance implications, and forward-looking risks. "
        "Focus on numeric trends: revenue growth, expense discipline, surplus stability, "
        "program-vs-admin mix, and fundraising efficiency. Provide concise bullet insights, "
        "clear recommendations tied to the data, and a balanced outlook (strengths vs watch items). "
        "Only cite facts available in the provided series—do not invent figures."
    ),
)


@@ -0,0 +1,197 @@
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple

from app.agents.form_auditor.models import ExtractedIrsForm990PfDataSchema

from .models import TrendDirection, TrendMetric, TrendMetricPoint, YearlySnapshot


@dataclass
class SnapshotBundle:
    year: int
    extraction: ExtractedIrsForm990PfDataSchema


def _safe_ratio(numerator: float, denominator: float) -> float | None:
    if denominator in (0, None):
        return None
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


def _growth(current: float, previous: float | None) -> float | None:
    if previous in (None, 0):
        return None
    try:
        return (current - previous) / previous
    except ZeroDivisionError:
        return None


def _direction_from_points(values: Sequence[float | None]) -> TrendDirection:
    clean = [value for value in values if value is not None]
    if len(clean) < 2:
        return TrendDirection.STABLE
    start, end = clean[0], clean[-1]
    if start is None or end is None:
        return TrendDirection.STABLE
    delta = end - start
    tolerance = abs(start) * 0.02 if start else 0.01
    if abs(delta) <= tolerance:
        return TrendDirection.STABLE
    if len(clean) > 2:
        swings = sum(
            1
            for idx in range(1, len(clean) - 1)
            if (clean[idx] - clean[idx - 1]) * (clean[idx + 1] - clean[idx]) < 0
        )
        if swings >= len(clean) // 2:
            return TrendDirection.VOLATILE
    return TrendDirection.IMPROVING if delta > 0 else TrendDirection.DECLINING


def _cagr(start: float | None, end: float | None, periods: int) -> float | None:
    if start is None or end is None or start <= 0 or end <= 0 or periods <= 0:
        return None
    return (end / start) ** (1 / periods) - 1


def build_snapshots(bundles: Sequence[SnapshotBundle]) -> List[YearlySnapshot]:
    snapshots: List[YearlySnapshot] = []
    previous_revenue = None
    previous_expenses = None
    for bundle in bundles:
        rev = bundle.extraction.revenue_breakdown.total_revenue
        exp = bundle.extraction.expenses_breakdown.total_expenses
        program = bundle.extraction.expenses_breakdown.program_services_expenses
        admin = bundle.extraction.expenses_breakdown.management_general_expenses
        fundraising = bundle.extraction.expenses_breakdown.fundraising_expenses
        snapshots.append(
            YearlySnapshot(
                year=bundle.year,
                total_revenue=rev,
                total_expenses=exp,
                revenue_growth=_growth(rev, previous_revenue),
                expense_growth=_growth(exp, previous_expenses),
                surplus=rev - exp,
                program_ratio=_safe_ratio(program, exp),
                admin_ratio=_safe_ratio(admin, exp),
                fundraising_ratio=_safe_ratio(fundraising, exp),
                net_margin=_safe_ratio(rev - exp, rev),
            )
        )
        previous_revenue = rev
        previous_expenses = exp
    return snapshots


def _metric_from_series(
    name: str,
    unit: str,
    description: str,
    values: Iterable[Tuple[int, float | None]],
) -> TrendMetric:
    points = [
        TrendMetricPoint(year=year, value=value or 0.0, growth=None)
        for year, value in values
    ]
    for idx in range(1, len(points)):
        prev = points[idx - 1].value
        curr = points[idx].value
        points[idx].growth = _growth(curr, prev)
    data_values = [point.value for point in points]
    direction = _direction_from_points(data_values)
    cagr = None
    if len(points) >= 2:
        cagr = _cagr(points[0].value, points[-1].value, len(points) - 1)
    return TrendMetric(
        name=name,
        unit=unit,
        description=description,
        points=points,
        cagr=cagr,
        direction=direction,
    )


def build_key_metrics(snapshots: Sequence[YearlySnapshot]) -> List[TrendMetric]:
    if not snapshots:
        return []
    metrics = [
        _metric_from_series(
            "Total Revenue",
            "USD",
            "Reported total revenue in Part I.",
            [(snap.year, snap.total_revenue) for snap in snapshots],
        ),
        _metric_from_series(
            "Total Expenses",
            "USD",
            "Reported total expenses in Part I.",
            [(snap.year, snap.total_expenses) for snap in snapshots],
        ),
        _metric_from_series(
            "Operating Surplus",
            "USD",
            "Difference between total revenue and total expenses.",
            [(snap.year, snap.surplus) for snap in snapshots],
        ),
        _metric_from_series(
            "Program Service Ratio",
            "Ratio",
            "Program service expenses divided by total expenses.",
            [
                (
                    snap.year,
                    snap.program_ratio if snap.program_ratio is not None else 0.0,
                )
                for snap in snapshots
            ],
        ),
        _metric_from_series(
            "Administrative Ratio",
            "Ratio",
            "Management & general expenses divided by total expenses.",
            [
                (snap.year, snap.admin_ratio if snap.admin_ratio is not None else 0.0)
                for snap in snapshots
            ],
        ),
        _metric_from_series(
            "Fundraising Ratio",
            "Ratio",
            "Fundraising expenses divided by total expenses.",
            [
                (
                    snap.year,
                    snap.fundraising_ratio
                    if snap.fundraising_ratio is not None
                    else 0.0,
                )
                for snap in snapshots
            ],
        ),
    ]
    for metric in metrics:
        if metric.name.endswith("Ratio"):
            metric.notes = "Higher values indicate greater spending share."
        elif metric.name == "Operating Surplus":
            metric.notes = "Positive surplus implies revenues exceeded expenses."
    return metrics


@@ -0,0 +1,74 @@
from __future__ import annotations

from enum import Enum
from typing import List

from pydantic import BaseModel, Field


class TrendDirection(str, Enum):
    IMPROVING = "Improving"
    DECLINING = "Declining"
    STABLE = "Stable"
    VOLATILE = "Volatile"


class TrendMetricPoint(BaseModel):
    year: int
    value: float
    growth: float | None = Field(
        default=None, description="Year-over-year growth expressed as a decimal."
    )


class TrendMetric(BaseModel):
    name: str
    unit: str
    description: str
    points: List[TrendMetricPoint]
    cagr: float | None = Field(
        default=None,
        description="Compound annual growth rate across the analyzed period.",
    )
    direction: TrendDirection = Field(
        default=TrendDirection.STABLE, description="Overall direction of the metric."
    )
    notes: str | None = None


class TrendInsight(BaseModel):
    category: str
    direction: TrendDirection
    summary: str
    confidence: float = Field(default=0.7, ge=0.0, le=1.0)


class AnalystReport(BaseModel):
    organisation_name: str
    organisation_ein: str
    years_analyzed: List[int] = Field(default_factory=list)
    key_metrics: List[TrendMetric] = Field(default_factory=list)
    insights: List[TrendInsight] = Field(default_factory=list)
    recommendations: List[str] = Field(default_factory=list)
    outlook: str = "Pending analysis"


class YearlySnapshot(BaseModel):
    year: int
    total_revenue: float
    total_expenses: float
    revenue_growth: float | None = None
    expense_growth: float | None = None
    surplus: float | None = None
    program_ratio: float | None = None
    admin_ratio: float | None = None
    fundraising_ratio: float | None = None
    net_margin: float | None = None


class AnalystState(BaseModel):
    organisation_name: str
    organisation_ein: str
    series: List[YearlySnapshot]
    key_metrics: List[TrendMetric]
    notes: List[str] = Field(default_factory=list)


@@ -0,0 +1,47 @@
from __future__ import annotations

from typing import Any

from .agent import agent, prepare_initial_findings
from .models import (
    AuditReport,
    ExtractedIrsForm990PfDataSchema,
    ValidatorState,
)


async def build_audit_report(payload: dict[str, Any]) -> AuditReport:
    metadata_raw: Any = None
    extraction_payload: Any = None
    if isinstance(payload, dict) and "extraction" in payload:
        extraction_payload = payload.get("extraction")
        metadata_raw = payload.get("metadata")
    else:
        extraction_payload = payload
    if extraction_payload is None:
        raise ValueError("Payload missing extraction data.")

    extraction = ExtractedIrsForm990PfDataSchema.model_validate(extraction_payload)
    initial_findings = prepare_initial_findings(extraction)

    metadata: dict[str, Any] = {}
    if isinstance(metadata_raw, dict):
        metadata = {str(k): v for k, v in metadata_raw.items()}

    state = ValidatorState(
        extraction=extraction,
        initial_findings=initial_findings,
        metadata=metadata,
    )
    prompt = (
        "Review the Form 990 extraction and deterministic checks. Validate or adjust "
        "the findings, add any additional issues or mitigations, and craft narrative "
        "section summaries that highlight the most material points. Focus on concrete "
        "evidence; do not fabricate figures."
    )
    result = await agent.run(prompt, deps=state)
    return result.output


@@ -0,0 +1,155 @@
from __future__ import annotations

from collections.abc import Iterable

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

from app.core.config import settings

from .checks import (
    aggregate_findings,
    build_section_summaries,
    check_balance_sheet_presence,
    check_board_engagement,
    check_expense_totals,
    check_fundraising_alignment,
    check_governance_policies,
    check_missing_operational_details,
    check_revenue_totals,
    compose_overall_summary,
    irs_ein_lookup,
)
from .models import (
    AuditFinding,
    AuditReport,
    ExtractedIrsForm990PfDataSchema,
    Severity,
    ValidatorState,
)

provider = AzureProvider(
    azure_endpoint=settings.AZURE_OPENAI_ENDPOINT,
    api_version=settings.AZURE_OPENAI_API_VERSION,
    api_key=settings.AZURE_OPENAI_API_KEY,
)
model = OpenAIChatModel(model_name="gpt-4o", provider=provider)
# NOTE: this bare Agent is rebound by the fully configured Agent defined below.
agent = Agent(model=model)


def prepare_initial_findings(
    extraction: ExtractedIrsForm990PfDataSchema,
) -> list[AuditFinding]:
    findings = [
        check_revenue_totals(extraction),
        check_expense_totals(extraction),
        check_fundraising_alignment(extraction),
        check_balance_sheet_presence(extraction),
        check_board_engagement(extraction),
        check_missing_operational_details(extraction),
    ]
    findings.extend(check_governance_policies(extraction))
    return findings


def _merge_findings(
    findings: Iterable[AuditFinding],
    added: Iterable[AuditFinding],
) -> list[AuditFinding]:
    # Later findings replace earlier ones that share the same check_id.
    existing = {finding.check_id: finding for finding in findings}
    for finding in added:
        existing[finding.check_id] = finding
    return list(existing.values())


agent = Agent(
    model=model,
    name="FormValidator",
    deps_type=ValidatorState,
    output_type=AuditReport,
    system_prompt=(
        "You are a Form 990 auditor. Review the extraction data and deterministic "
        "checks provided in deps. Use tools to confirm calculations, add or adjust "
        "findings, supply mitigation guidance, and craft concise section summaries. "
        "The AuditReport must include severity (`Pass`, `Warning`, `Error`), "
        "confidence scores, mitigation advice, section summaries, and an overall summary. "
        "Ground every statement in supplied data; do not invent financial figures."
    ),
)


@agent.tool
def revenue_check(ctx: RunContext[ValidatorState]) -> AuditFinding:
    return check_revenue_totals(ctx.deps.extraction)


@agent.tool
def expense_check(ctx: RunContext[ValidatorState]) -> AuditFinding:
    return check_expense_totals(ctx.deps.extraction)


@agent.tool
def fundraising_alignment_check(ctx: RunContext[ValidatorState]) -> AuditFinding:
    return check_fundraising_alignment(ctx.deps.extraction)


@agent.tool
async def verify_ein(ctx: RunContext[ValidatorState]) -> AuditFinding:
    ein = ctx.deps.extraction.core_organization_metadata.ein
    exists, confidence, note = await irs_ein_lookup(ein)
    if exists:
        return AuditFinding(
            check_id="irs_ein_match",
            category="Compliance",
            severity=Severity.PASS,
            message="EIN confirmed against IRS index.",
            mitigation="Document verification in the filing workpapers.",
            confidence=confidence,
        )
    return AuditFinding(
        check_id="irs_ein_match",
        category="Compliance",
        severity=Severity.WARNING,
        message=f"EIN {ein} could not be confirmed. {note}",
        mitigation="Verify the EIN against the IRS EO BMF or IRS determination letter.",
        confidence=confidence,
    )


@agent.output_validator
def finalize_report(
    ctx: RunContext[ValidatorState],
    report: AuditReport,
) -> AuditReport:
    merged_findings = _merge_findings(ctx.deps.initial_findings, report.findings)
    overall = aggregate_findings(merged_findings)
    sections = build_section_summaries(merged_findings)
    overall_summary = compose_overall_summary(merged_findings)

    metadata = ctx.deps.metadata
    notes = report.notes
    if notes is None and isinstance(metadata, dict) and metadata.get("source"):
        notes = f"Reviewed data source: {metadata['source']}."

    year: int | None = None
    if isinstance(metadata, dict):
        metadata_year = metadata.get("return_year")
        if metadata_year is not None:
            try:
                year = int(metadata_year)
            except (TypeError, ValueError):
                pass

    core = ctx.deps.extraction.core_organization_metadata
    organisation_name = core.legal_name or report.organisation_name
    organisation_ein = core.ein or report.organisation_ein
    return report.model_copy(
        update={
            "organisation_ein": organisation_ein,
            "organisation_name": organisation_name,
            "year": year,
            "findings": merged_findings,
            "overall_severity": overall,
            "sections": sections,
            "overall_summary": overall_summary,
            "notes": notes,
        }
    )


@@ -0,0 +1,282 @@
from __future__ import annotations

from collections import Counter, defaultdict

from .models import (
    AuditFinding,
    AuditSectionSummary,
    ExtractedIrsForm990PfDataSchema,
    Severity,
)


def aggregate_findings(findings: list[AuditFinding]) -> Severity:
    order = {Severity.ERROR: 3, Severity.WARNING: 2, Severity.PASS: 1}
    overall = Severity.PASS
    for finding in findings:
        if order[finding.severity] > order[overall]:
            overall = finding.severity
    return overall


def check_revenue_totals(data: ExtractedIrsForm990PfDataSchema) -> AuditFinding:
    subtotal = sum(
        value
        for key, value in data.revenue_breakdown.model_dump().items()
        if key != "total_revenue"
    )
    if abs(subtotal - data.revenue_breakdown.total_revenue) <= 1:
        return AuditFinding(
            check_id="revenue_totals",
            category="Revenue",
            severity=Severity.PASS,
            message=f"Revenue categories sum (${subtotal:,.2f}) matches total revenue.",
            mitigation="Maintain detailed support for each revenue source to preserve reconciliation trail.",
            confidence=0.95,
        )
    return AuditFinding(
        check_id="revenue_totals",
        category="Revenue",
        severity=Severity.ERROR,
        message=(
            f"Revenue categories sum (${subtotal:,.2f}) does not equal reported total "
            f"(${data.revenue_breakdown.total_revenue:,.2f})."
        ),
        mitigation="Recalculate revenue totals and correct line items or Schedule A before filing.",
        confidence=0.95,
    )
def check_expense_totals(data: ExtractedIrsForm990PfDataSchema) -> AuditFinding:
subtotal = (
data.expenses_breakdown.program_services_expenses
+ data.expenses_breakdown.management_general_expenses
+ data.expenses_breakdown.fundraising_expenses
)
if abs(subtotal - data.expenses_breakdown.total_expenses) <= 1:
return AuditFinding(
check_id="expense_totals",
category="Expenses",
severity=Severity.PASS,
message="Functional expenses match total expenses.",
mitigation="Keep functional allocation workpapers to support the reconciliation.",
confidence=0.95,
)
return AuditFinding(
check_id="expense_totals",
category="Expenses",
severity=Severity.ERROR,
message=(
f"Functional expenses (${subtotal:,.2f}) do not reconcile to total expenses "
f"(${data.expenses_breakdown.total_expenses:,.2f})."
),
mitigation="Review Part I, lines 23-27 and reclassify functional expenses to tie to Part II totals.",
confidence=0.95,
)
def check_fundraising_alignment(
data: ExtractedIrsForm990PfDataSchema,
) -> AuditFinding:
reported_fundraising = data.expenses_breakdown.fundraising_expenses
event_expenses = data.fundraising_grantmaking.total_fundraising_event_expenses
difference = abs(reported_fundraising - event_expenses)
if difference <= 1:
return AuditFinding(
check_id="fundraising_alignment",
category="Fundraising",
severity=Severity.PASS,
message="Fundraising functional expenses align with reported event expenses.",
mitigation="Retain event ledgers and allocations to support matching totals.",
confidence=0.9,
)
severity = (
Severity.WARNING
if reported_fundraising and difference <= reported_fundraising * 0.1
else Severity.ERROR
)
return AuditFinding(
check_id="fundraising_alignment",
category="Fundraising",
severity=severity,
message=(
f"Fundraising functional expenses (${reported_fundraising:,.2f}) differ from "
f"reported event expenses (${event_expenses:,.2f}) by ${difference:,.2f}."
),
mitigation="Reconcile Schedule G and Part I allocations to eliminate the variance.",
confidence=0.85,
)
def check_balance_sheet_presence(
data: ExtractedIrsForm990PfDataSchema,
) -> AuditFinding:
if data.balance_sheet:
return AuditFinding(
check_id="balance_sheet_present",
category="Balance Sheet",
severity=Severity.PASS,
message="Balance sheet data is present.",
mitigation="Ensure ending net assets tie to Part I, line 30.",
confidence=0.7,
)
return AuditFinding(
check_id="balance_sheet_absent",
category="Balance Sheet",
severity=Severity.WARNING,
message="Balance sheet section is empty; confirm Part II filing requirements.",
mitigation="Populate assets, liabilities, and net assets or attach supporting schedules.",
confidence=0.6,
)
def check_governance_policies(
data: ExtractedIrsForm990PfDataSchema,
) -> list[AuditFinding]:
gm = data.governance_management_disclosure
findings: list[AuditFinding] = []
policy_fields = {
"conflict_of_interest_policy": "Document the policy in Part VI or adopt one prior to filing.",
"whistleblower_policy": "Document whistleblower protections for staff and volunteers.",
"document_retention_policy": "Adopt and document a record retention policy.",
}
affirmative_fields = {
"financial_statements_reviewed": "Capture whether the board reviewed or audited year-end financials.",
"form_990_provided_to_governing_body": "Provide Form 990 to the board before submission and note the date of review.",
}
for field, mitigation in policy_fields.items():
value = (getattr(gm, field) or "").strip()
if not value or value.lower() in {"no", "n", "false"}:
findings.append(
AuditFinding(
check_id=f"{field}_missing",
category="Governance",
severity=Severity.WARNING,
message=f"{field.replace('_', ' ').title()} not reported or marked 'No'.",
mitigation=mitigation,
confidence=0.55,
)
)
for field, mitigation in affirmative_fields.items():
value = (getattr(gm, field) or "").strip()
if not value:
findings.append(
AuditFinding(
check_id=f"{field}_blank",
category="Governance",
severity=Severity.WARNING,
message=f"{field.replace('_', ' ').title()} left blank.",
mitigation=mitigation,
confidence=0.5,
)
)
return findings
def check_board_engagement(data: ExtractedIrsForm990PfDataSchema) -> AuditFinding:
hours = [
member.average_hours_per_week
for member in data.officers_directors_trustees_key_employees
if member.average_hours_per_week is not None
]
total_hours = sum(hours)
if total_hours >= 5:
return AuditFinding(
check_id="board_hours",
category="Governance",
severity=Severity.PASS,
message="Officer and director time commitments appear reasonable.",
mitigation="Continue documenting board attendance and oversight responsibilities.",
confidence=0.7,
)
return AuditFinding(
check_id="board_hours",
category="Governance",
severity=Severity.WARNING,
message=(
f"Aggregate reported board hours ({total_hours:.1f} per week) are low; "
"confirm entries reflect actual governance involvement."
),
mitigation="Verify hours in Part VII; update if officers volunteer significant time.",
confidence=0.6,
)
def check_missing_operational_details(
data: ExtractedIrsForm990PfDataSchema,
) -> AuditFinding:
descriptors = (
data.functional_operational_data.fundraising_method_descriptions or ""
).strip()
if descriptors:
return AuditFinding(
check_id="fundraising_methods_documented",
category="Operations",
severity=Severity.PASS,
message="Fundraising method descriptions provided.",
mitigation="Update narratives annually to reflect any new campaigns or joint ventures.",
confidence=0.65,
)
return AuditFinding(
check_id="fundraising_methods_missing",
category="Operations",
severity=Severity.WARNING,
message="Fundraising method descriptions are blank.",
mitigation="Add a brief Schedule G narrative describing major fundraising approaches.",
confidence=0.55,
)
def build_section_summaries(findings: list[AuditFinding]) -> list[AuditSectionSummary]:
grouped: defaultdict[str, list[AuditFinding]] = defaultdict(list)
for finding in findings:
grouped[finding.category].append(finding)
summaries: list[AuditSectionSummary] = []
severity_order = {Severity.ERROR: 3, Severity.WARNING: 2, Severity.PASS: 1}
for category, category_findings in grouped.items():
counter = Counter(f.severity for f in category_findings)
severity = aggregate_findings(category_findings)
summary = ", ".join(
f"{count} {label}"
for label, count in (
("passes", counter.get(Severity.PASS, 0)),
("warnings", counter.get(Severity.WARNING, 0)),
("errors", counter.get(Severity.ERROR, 0)),
)
)
summary_text = f"{category} review: {summary}."
confidence = sum(f.confidence for f in category_findings) / len(
category_findings
)
summaries.append(
AuditSectionSummary(
section=category,
severity=severity,
summary=summary_text,
confidence=confidence,
)
)
summaries.sort(key=lambda s: (-severity_order[s.severity], s.section.lower()))
return summaries
def compose_overall_summary(findings: list[AuditFinding]) -> str:
if not findings:
return "No automated findings generated."
counter = Counter(f.severity for f in findings)
parts = []
if counter.get(Severity.ERROR):
parts.append(f"{counter[Severity.ERROR]} error(s)")
if counter.get(Severity.WARNING):
parts.append(f"{counter[Severity.WARNING]} warning(s)")
if counter.get(Severity.PASS):
parts.append(f"{counter[Severity.PASS]} check(s) passed")
summary = "Overall results: " + ", ".join(parts) + "."
return summary
async def irs_ein_lookup(_ein: str) -> tuple[bool, float, str]:
return False, 0.2, "IRS verification unavailable in current environment."
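
The severity roll-up performed by `aggregate_findings` can be exercised on its own. The following is a minimal sketch, not the package's code: it re-declares `Severity` locally and assumes a simplified stand-in, `Finding` (a hypothetical dataclass, not the `AuditFinding` model), so it runs without the rest of the package. `worst_severity` is likewise a hypothetical name for the same loop.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(str, Enum):
    PASS = "Pass"
    WARNING = "Warning"
    ERROR = "Error"


@dataclass
class Finding:
    """Hypothetical stand-in for AuditFinding; only the fields used here."""

    check_id: str
    severity: Severity


# Higher rank wins when findings are rolled up.
_RANK = {Severity.ERROR: 3, Severity.WARNING: 2, Severity.PASS: 1}


def worst_severity(findings: list[Finding]) -> Severity:
    """Return the most severe level present, defaulting to PASS when empty."""
    overall = Severity.PASS
    for finding in findings:
        if _RANK[finding.severity] > _RANK[overall]:
            overall = finding.severity
    return overall


sample = [
    Finding("revenue_totals", Severity.PASS),
    Finding("board_hours", Severity.WARNING),
    Finding("expense_totals", Severity.ERROR),
]
overall = worst_severity(sample)
```

Note that an empty list rolls up to `Severity.PASS`, which is why `compose_overall_summary` handles the "no findings" case separately with its own message.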


@@ -0,0 +1,38 @@
from __future__ import annotations

import argparse
import asyncio
import json
from pathlib import Path

from . import build_audit_report

__all__ = ["build_audit_report", "main"]


def _load_payload(path: Path) -> dict:
    text = path.read_text(encoding="utf-8")
    return json.loads(text)


def _print_report(report: dict) -> None:
    print(json.dumps(report, indent=2))


def main(argv: list[str] | None = None) -> None:
    parser = argparse.ArgumentParser(
        description="Validate a Form 990 extraction payload using the Form Auditor agent."
    )
    parser.add_argument(
        "payload",
        nargs="?",
        default="example_data.json",
        help="Path to a JSON file containing the extraction payload.",
    )
    args = parser.parse_args(argv)
    payload_path = Path(args.payload).expanduser()
    payload = _load_payload(payload_path)
    report = asyncio.run(build_audit_report(payload))
    _print_report(report.model_dump())
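
The `payload` argument above is an optional positional: `nargs="?"` together with `default=` means the CLI falls back to `example_data.json` when no path is given. A minimal sketch of just that behaviour, assuming a hypothetical `build_parser` helper and a made-up file name `filing_2024.json` for illustration:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring only the positional argument of main()."""
    parser = argparse.ArgumentParser(description="Sketch of the payload argument.")
    parser.add_argument(
        "payload",
        nargs="?",  # zero or one value; the default applies when omitted
        default="example_data.json",
    )
    return parser


# No arguments: the default path is used.
default_args = build_parser().parse_args([])

# One argument: it overrides the default.
explicit_args = build_parser().parse_args(["filing_2024.json"])
```

Because `main` accepts `argv`, the same lists can be passed to it directly in tests instead of patching `sys.argv`.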


@@ -0,0 +1,797 @@
from __future__ import annotations
import re
from enum import Enum
from typing import Any
from pydantic import BaseModel, Field, model_validator
class Severity(str, Enum):
PASS = "Pass"
WARNING = "Warning"
ERROR = "Error"
class AuditFinding(BaseModel):
check_id: str
category: str
severity: Severity
message: str
mitigation: str | None = None
confidence: float = Field(ge=0.0, le=1.0)
class AuditSectionSummary(BaseModel):
section: str
severity: Severity
summary: str
confidence: float = Field(ge=0.0, le=1.0)
class AuditReport(BaseModel):
organisation_ein: str
organisation_name: str
year: int | None
overall_severity: Severity
findings: list[AuditFinding]
sections: list[AuditSectionSummary] = Field(default_factory=list)
overall_summary: str | None = None
notes: str | None = None
class CoreOrgMetadata(BaseModel):
ein: str
legal_name: str
return_type: str
accounting_method: str
incorporation_state: str | None = None
class CoreOrganizationMetadata(BaseModel):
ein: str = Field(
...,
description="Unique IRS identifier for the organization.",
title="Employer Identification Number (EIN)",
)
legal_name: str = Field(
...,
description="Official registered name of the organization.",
title="Legal Name of Organization",
)
phone_number: str = Field(
..., description="Primary contact phone number.", title="Phone Number"
)
website_url: str = Field(
..., description="Organization's website address.", title="Website URL"
)
return_type: str = Field(
...,
description="Type of IRS return filed (e.g., 990, 990-EZ, 990-PF).",
title="Return Type",
)
amended_return: str = Field(
...,
description="Indicates if the return is amended.",
title="Amended Return Flag",
)
group_exemption_number: str = Field(
...,
description="IRS group exemption number, if applicable.",
title="Group Exemption Number",
)
subsection_code: str = Field(
...,
description="IRS subsection code (e.g., 501(c)(3)).",
title="Subsection Code",
)
ruling_date: str = Field(
...,
description="Date of IRS ruling or determination letter.",
title="Ruling/Determination Letter Date",
)
accounting_method: str = Field(
...,
description="Accounting method used (cash, accrual, other).",
title="Accounting Method",
)
organization_type: str = Field(
...,
description="Legal structure (corporation, trust, association, etc.).",
title="Organization Type",
)
year_of_formation: str = Field(
..., description="Year the organization was formed.", title="Year of Formation"
)
incorporation_state: str = Field(
..., description="State of incorporation.", title="Incorporation State"
)
calendar_year: str | None = Field(
default=None,
description="Calendar year covered by the return (if different from fiscal year).",
title="Calendar Year",
)
class RevenueBreakdown(BaseModel):
total_revenue: float = Field(
..., description="Sum of all revenue sources.", title="Total Revenue"
)
contributions_gifts_grants: float = Field(
...,
description="Revenue from donations and grants.",
title="Contributions, Gifts, and Grants",
)
program_service_revenue: float = Field(
...,
description="Revenue from program services.",
title="Program Service Revenue",
)
membership_dues: float = Field(
..., description="Revenue from membership dues.", title="Membership Dues"
)
investment_income: float = Field(
...,
description="Revenue from interest and dividends.",
title="Investment Income",
)
gains_losses_sales_assets: float = Field(
...,
description="Net gains or losses from asset sales.",
title="Gains/Losses from Sales of Assets",
)
rental_income: float = Field(
...,
description="Income from rental of real estate or equipment.",
title="Rental Income",
)
related_organizations_revenue: float = Field(
...,
description="Revenue from related organizations.",
title="Related Organizations Revenue",
)
gaming_revenue: float = Field(
..., description="Revenue from gaming activities.", title="Gaming Revenue"
)
other_revenue: float = Field(
..., description="Miscellaneous revenue sources.", title="Other Revenue"
)
government_grants: float = Field(
...,
description="Revenue from government grants.",
title="Revenue from Government Grants",
)
foreign_contributions: float = Field(
..., description="Revenue from foreign sources.", title="Foreign Contributions"
)
class ExpensesBreakdown(BaseModel):
total_expenses: float = Field(
..., description="Sum of all expenses.", title="Total Functional Expenses"
)
program_services_expenses: float = Field(
...,
description="Expenses for program services.",
title="Program Services Expenses",
)
management_general_expenses: float = Field(
...,
description="Administrative and management expenses.",
title="Management & General Expenses",
)
fundraising_expenses: float = Field(
...,
description="Expenses for fundraising activities.",
title="Fundraising Expenses",
)
grants_us_organizations: float = Field(
...,
description="Grants and assistance to U.S. organizations.",
title="Grants to U.S. Organizations",
)
grants_us_individuals: float = Field(
...,
description="Grants and assistance to U.S. individuals.",
title="Grants to U.S. Individuals",
)
grants_foreign_organizations: float = Field(
...,
description="Grants and assistance to foreign organizations.",
title="Grants to Foreign Organizations",
)
grants_foreign_individuals: float = Field(
...,
description="Grants and assistance to foreign individuals.",
title="Grants to Foreign Individuals",
)
compensation_officers: float = Field(
...,
description="Compensation paid to officers and key employees.",
title="Compensation of Officers/Key Employees",
)
compensation_other_staff: float = Field(
...,
description="Compensation paid to other staff.",
title="Compensation of Other Staff",
)
payroll_taxes_benefits: float = Field(
...,
description="Payroll taxes and employee benefits.",
title="Payroll Taxes and Benefits",
)
professional_fees: float = Field(
...,
description="Legal, accounting, and lobbying fees.",
title="Professional Fees",
)
office_occupancy_costs: float = Field(
...,
description="Office and occupancy expenses.",
title="Office and Occupancy Costs",
)
information_technology_costs: float = Field(
..., description="IT-related expenses.", title="Information Technology Costs"
)
travel_conference_expenses: float = Field(
...,
description="Travel and conference costs.",
title="Travel and Conference Expenses",
)
depreciation_amortization: float = Field(
...,
description="Depreciation and amortization expenses.",
title="Depreciation and Amortization",
)
insurance: float = Field(..., description="Insurance expenses.", title="Insurance")
class OfficersDirectorsTrusteesKeyEmployee(BaseModel):
name: str = Field(..., description="Full name of the individual.", title="Name")
title_position: str = Field(
..., description="Role or position held.", title="Title/Position"
)
average_hours_per_week: float = Field(
...,
description="Average weekly hours devoted to position.",
title="Average Hours Per Week",
)
related_party_transactions: str = Field(
...,
description="Indicates if related-party transactions occurred.",
title="Related-Party Transactions",
)
former_officer: str = Field(
...,
description="Indicates if the individual is a former officer.",
title="Former Officer Indicator",
)
governance_role: str = Field(
...,
description="Role in governance (voting, independent, etc.).",
title="Governance Role",
)
class GovernanceManagementDisclosure(BaseModel):
governing_body_size: float = Field(
...,
description="Number of voting members on the governing body.",
title="Governing Body Size",
)
independent_members: float = Field(
...,
description="Number of independent voting members.",
title="Number of Independent Members",
)
financial_statements_reviewed: str = Field(
...,
description="Indicates if financial statements were reviewed or audited.",
title="Financial Statements Reviewed/Audited",
)
form_990_provided_to_governing_body: str = Field(
...,
description="Indicates if Form 990 was provided to governing body before filing.",
title="Form 990 Provided to Governing Body",
)
conflict_of_interest_policy: str = Field(
...,
description="Indicates if a conflict-of-interest policy is in place.",
title="Conflict-of-Interest Policy",
)
whistleblower_policy: str = Field(
...,
description="Indicates if a whistleblower policy is in place.",
title="Whistleblower Policy",
)
document_retention_policy: str = Field(
...,
description="Indicates if a document retention/destruction policy is in place.",
title="Document Retention/Destruction Policy",
)
ceo_compensation_review_process: str = Field(
...,
description="Description of CEO compensation review process.",
title="CEO Compensation Review Process",
)
public_disclosure_practices: str = Field(
...,
description="Description of public disclosure practices.",
title="Public Disclosure Practices",
)
class ProgramServiceAccomplishment(BaseModel):
program_name: str = Field(
..., description="Name of the program.", title="Program Name"
)
program_description: str = Field(
..., description="Description of the program.", title="Program Description"
)
expenses: float = Field(
..., description="Expenses for the program.", title="Program Expenses"
)
grants: float = Field(
..., description="Grants made under the program.", title="Program Grants"
)
revenue_generated: float = Field(
..., description="Revenue generated by the program.", title="Revenue Generated"
)
quantitative_outputs: str = Field(
...,
description="Quantitative outputs (e.g., number served, events held).",
title="Quantitative Outputs",
)
class FundraisingGrantmaking(BaseModel):
total_fundraising_event_revenue: float = Field(
...,
description="Total revenue from fundraising events.",
title="Total Fundraising Event Revenue",
)
total_fundraising_event_expenses: float = Field(
...,
description="Total direct expenses for fundraising events.",
title="Total Fundraising Event Expenses",
)
professional_fundraiser_fees: float = Field(
...,
description="Fees paid to professional fundraisers.",
title="Professional Fundraiser Fees",
)
class FunctionalOperationalData(BaseModel):
number_of_employees: float = Field(
..., description="Total number of employees.", title="Number of Employees"
)
number_of_volunteers: float = Field(
..., description="Total number of volunteers.", title="Number of Volunteers"
)
occupancy_costs: float = Field(
..., description="Total occupancy costs.", title="Occupancy Costs"
)
fundraising_method_descriptions: str = Field(
...,
description="Descriptions of fundraising methods used.",
title="Fundraising Method Descriptions",
)
joint_ventures_disregarded_entities: str = Field(
...,
description="Details of joint ventures and disregarded entities.",
title="Joint Ventures and Disregarded Entities",
)
class CompensationDetails(BaseModel):
base_compensation: float = Field(
..., description="Base salary or wages.", title="Base Compensation"
)
bonus: float = Field(
..., description="Bonus or incentive compensation.", title="Bonus Compensation"
)
incentive: float = Field(
..., description="Incentive compensation.", title="Incentive Compensation"
)
other: float = Field(
..., description="Other forms of compensation.", title="Other Compensation"
)
non_fixed_compensation: str = Field(
...,
description="Indicates if compensation is non-fixed.",
title="Non-Fixed Compensation Flag",
)
first_class_travel: str = Field(
...,
description="Indicates if first-class travel was provided.",
title="First-Class Travel",
)
housing_allowance: str = Field(
...,
description="Indicates if housing allowance was provided.",
title="Housing Allowance",
)
expense_account_usage: str = Field(
...,
description="Indicates if expense account was used.",
title="Expense Account Usage",
)
supplemental_retirement: str = Field(
...,
description="Indicates if supplemental retirement or deferred comp was provided.",
title="Supplemental Retirement/Deferred Comp",
)
class PoliticalLobbyingActivities(BaseModel):
lobbying_expenditures_direct: float = Field(
...,
description="Direct lobbying expenditures.",
title="Direct Lobbying Expenditures",
)
lobbying_expenditures_grassroots: float = Field(
...,
description="Grassroots lobbying expenditures.",
title="Grassroots Lobbying Expenditures",
)
election_501h_status: str = Field(
...,
description="Indicates if 501(h) election was made.",
title="501(h) Election Status",
)
political_campaign_expenditures: float = Field(
...,
description="Expenditures for political campaigns.",
title="Political Campaign Expenditures",
)
related_organizations_affiliates: str = Field(
...,
description="Details of related organizations or affiliates involved.",
title="Related Organizations/Affiliates Involved",
)
class InvestmentsEndowment(BaseModel):
investment_types: str = Field(
...,
description="Types of investments held (securities, partnerships, real estate).",
title="Investment Types",
)
donor_restricted_endowment_values: float = Field(
...,
description="Value of donor-restricted endowments.",
title="Donor-Restricted Endowment Values",
)
net_appreciation_depreciation: float = Field(
...,
description="Net appreciation or depreciation of investments.",
title="Net Appreciation/Depreciation",
)
related_organization_transactions: str = Field(
...,
description="Details of transactions with related organizations.",
title="Related Organization Transactions",
)
loans_to_from_related_parties: str = Field(
...,
description="Details of loans to or from related parties.",
title="Loans to/from Related Parties",
)
class TaxCompliancePenalties(BaseModel):
penalties_excise_taxes_reported: str = Field(
...,
description="Reported penalties or excise taxes.",
title="Penalties or Excise Taxes Reported",
)
unrelated_business_income_disclosure: str = Field(
...,
description="Disclosure of unrelated business income (UBI).",
title="Unrelated Business Income Disclosure",
)
foreign_bank_account_reporting: str = Field(
...,
description="Disclosure of foreign bank accounts (FBAR equivalent).",
title="Foreign Bank Account Reporting",
)
schedule_o_narrative_explanations: str = Field(
...,
description="Narrative explanations from Schedule O.",
title="Schedule O Narrative Explanations",
)
_OFFICER_HOURS_PATTERN = re.compile(r"([\d.]+)\s*hrs?/wk", re.IGNORECASE)
def _parse_officer_list(entries: list[str] | None) -> list[dict[str, Any]]:
if not entries:
return []
parsed: list[dict[str, Any]] = []
for raw in entries:
if not isinstance(raw, str):
continue
parts = [part.strip() for part in raw.split(",")]
name = parts[0] if parts else ""
title = parts[1] if len(parts) > 1 else ""
role = parts[3] if len(parts) > 3 else ""
hours = 0.0
match = _OFFICER_HOURS_PATTERN.search(raw)
if match:
try:
hours = float(match.group(1))
except ValueError:
hours = 0.0
parsed.append(
{
"name": name,
"title_position": title,
"average_hours_per_week": hours,
"related_party_transactions": "",
"former_officer": "",
"governance_role": role,
}
)
return parsed
def _build_program_accomplishments(
descriptions: list[str] | None,
) -> list[dict[str, Any]]:
if not descriptions:
return []
programs: list[dict[str, Any]] = []
for idx, description in enumerate(descriptions, start=1):
if not isinstance(description, str):
continue
programs.append(
{
"program_name": f"Program {idx}",
"program_description": description.strip(),
"expenses": 0.0,
"grants": 0.0,
"revenue_generated": 0.0,
"quantitative_outputs": "",
}
)
return programs
def _transform_flat_payload(data: dict[str, Any]) -> dict[str, Any]:
def get_str(key: str) -> str:
value = data.get(key)
if value is None:
return ""
return str(value)
def get_value(key: str, default: Any = 0) -> Any:
return data.get(key, default)
transformed: dict[str, Any] = {
"core_organization_metadata": {
"ein": get_str("ein"),
"legal_name": get_str("legal_name"),
"phone_number": get_str("phone_number"),
"website_url": get_str("website_url"),
"return_type": get_str("return_type"),
"amended_return": get_str("amended_return"),
"group_exemption_number": get_str("group_exemption_number"),
"subsection_code": get_str("subsection_code"),
"ruling_date": get_str("ruling_date"),
"accounting_method": get_str("accounting_method"),
"organization_type": get_str("organization_type"),
"year_of_formation": get_str("year_of_formation"),
"incorporation_state": get_str("incorporation_state"),
"calendar_year": get_str("calendar_year"),
},
"revenue_breakdown": {
"total_revenue": get_value("total_revenue"),
"contributions_gifts_grants": get_value("contributions_gifts_grants"),
"program_service_revenue": get_value("program_service_revenue"),
"membership_dues": get_value("membership_dues"),
"investment_income": get_value("investment_income"),
"gains_losses_sales_assets": get_value("gains_losses_sales_assets"),
"rental_income": get_value("rental_income"),
"related_organizations_revenue": get_value("related_organizations_revenue"),
"gaming_revenue": get_value("gaming_revenue"),
"other_revenue": get_value("other_revenue"),
"government_grants": get_value("government_grants"),
"foreign_contributions": get_value("foreign_contributions"),
},
"expenses_breakdown": {
"total_expenses": get_value("total_expenses"),
"program_services_expenses": get_value("program_services_expenses"),
"management_general_expenses": get_value("management_general_expenses"),
"fundraising_expenses": get_value("fundraising_expenses"),
"grants_us_organizations": get_value("grants_us_organizations"),
"grants_us_individuals": get_value("grants_us_individuals"),
"grants_foreign_organizations": get_value("grants_foreign_organizations"),
"grants_foreign_individuals": get_value("grants_foreign_individuals"),
"compensation_officers": get_value("compensation_officers"),
"compensation_other_staff": get_value("compensation_other_staff"),
"payroll_taxes_benefits": get_value("payroll_taxes_benefits"),
"professional_fees": get_value("professional_fees"),
"office_occupancy_costs": get_value("office_occupancy_costs"),
"information_technology_costs": get_value("information_technology_costs"),
"travel_conference_expenses": get_value("travel_conference_expenses"),
"depreciation_amortization": get_value("depreciation_amortization"),
"insurance": get_value("insurance"),
},
"balance_sheet": data.get("balance_sheet") or {},
"officers_directors_trustees_key_employees": _parse_officer_list(
data.get("officers_list")
),
"governance_management_disclosure": {
"governing_body_size": get_value("governing_body_size"),
"independent_members": get_value("independent_members"),
"financial_statements_reviewed": get_str("financial_statements_reviewed"),
"form_990_provided_to_governing_body": get_str(
"form_990_provided_to_governing_body"
),
"conflict_of_interest_policy": get_str("conflict_of_interest_policy"),
"whistleblower_policy": get_str("whistleblower_policy"),
"document_retention_policy": get_str("document_retention_policy"),
"ceo_compensation_review_process": get_str(
"ceo_compensation_review_process"
),
"public_disclosure_practices": get_str("public_disclosure_practices"),
},
"program_service_accomplishments": _build_program_accomplishments(
data.get("program_accomplishments_list")
),
"fundraising_grantmaking": {
"total_fundraising_event_revenue": get_value(
"total_fundraising_event_revenue"
),
"total_fundraising_event_expenses": get_value(
"total_fundraising_event_expenses"
),
"professional_fundraiser_fees": get_value("professional_fundraiser_fees"),
},
"functional_operational_data": {
"number_of_employees": get_value("number_of_employees"),
"number_of_volunteers": get_value("number_of_volunteers"),
"occupancy_costs": get_value("occupancy_costs"),
"fundraising_method_descriptions": get_str(
"fundraising_method_descriptions"
),
"joint_ventures_disregarded_entities": get_str(
"joint_ventures_disregarded_entities"
),
},
"compensation_details": {
"base_compensation": get_value("base_compensation"),
"bonus": get_value("bonus"),
"incentive": get_value("incentive"),
"other": get_value("other_compensation", get_value("other", 0)),
"non_fixed_compensation": get_str("non_fixed_compensation"),
"first_class_travel": get_str("first_class_travel"),
"housing_allowance": get_str("housing_allowance"),
"expense_account_usage": get_str("expense_account_usage"),
"supplemental_retirement": get_str("supplemental_retirement"),
},
"political_lobbying_activities": {
"lobbying_expenditures_direct": get_value("lobbying_expenditures_direct"),
"lobbying_expenditures_grassroots": get_value(
"lobbying_expenditures_grassroots"
),
"election_501h_status": get_str("election_501h_status"),
"political_campaign_expenditures": get_value(
"political_campaign_expenditures"
),
"related_organizations_affiliates": get_str(
"related_organizations_affiliates"
),
},
"investments_endowment": {
"investment_types": get_str("investment_types"),
"donor_restricted_endowment_values": get_value(
"donor_restricted_endowment_values"
),
"net_appreciation_depreciation": get_value("net_appreciation_depreciation"),
"related_organization_transactions": get_str(
"related_organization_transactions"
),
"loans_to_from_related_parties": get_str("loans_to_from_related_parties"),
},
"tax_compliance_penalties": {
"penalties_excise_taxes_reported": get_str(
"penalties_excise_taxes_reported"
),
"unrelated_business_income_disclosure": get_str(
"unrelated_business_income_disclosure"
),
"foreign_bank_account_reporting": get_str("foreign_bank_account_reporting"),
"schedule_o_narrative_explanations": get_str(
"schedule_o_narrative_explanations"
),
},
}
return transformed
class ExtractedIrsForm990PfDataSchema(BaseModel):
    core_organization_metadata: CoreOrganizationMetadata = Field(
        ...,
        description="Essential identifiers and attributes for normalizing entities across filings and years.",
        title="Core Organization Metadata",
    )
    revenue_breakdown: RevenueBreakdown = Field(
        ...,
        description="Detailed breakdown of revenue streams for the fiscal year.",
        title="Revenue Breakdown",
    )
    expenses_breakdown: ExpensesBreakdown = Field(
        ...,
        description="Detailed breakdown of expenses for the fiscal year.",
        title="Expenses Breakdown",
    )
    balance_sheet: dict[str, Any] = Field(
        default_factory=dict,
        description="Assets, liabilities, and net assets at year end.",
        title="Balance Sheet Data",
    )
    officers_directors_trustees_key_employees: list[
        OfficersDirectorsTrusteesKeyEmployee
    ] = Field(
        ...,
        description="List of key personnel and their compensation.",
        title="Officers, Directors, Trustees, Key Employees",
    )
    governance_management_disclosure: GovernanceManagementDisclosure = Field(
        ...,
        description="Governance and management practices, policies, and disclosures.",
        title="Governance, Management, and Disclosure",
    )
    program_service_accomplishments: list[ProgramServiceAccomplishment] = Field(
        ...,
        description="Major programs and their outputs for the fiscal year.",
        title="Program Service Accomplishments",
    )
    fundraising_grantmaking: FundraisingGrantmaking = Field(
        ...,
        description="Fundraising event details and grantmaking activities.",
        title="Fundraising & Grantmaking",
    )
    functional_operational_data: FunctionalOperationalData = Field(
        ...,
        description="Operational metrics and related-organization relationships.",
        title="Functional & Operational Data",
    )
    compensation_details: CompensationDetails = Field(
        ...,
        description="Detailed breakdown of officer compensation and benefits.",
        title="Compensation Details",
    )
    political_lobbying_activities: PoliticalLobbyingActivities = Field(
        ...,
        description="Details of political and lobbying expenditures and affiliations.",
        title="Political & Lobbying Activities",
    )
    investments_endowment: InvestmentsEndowment = Field(
        ...,
        description="Investment holdings, endowment values, and related transactions.",
        title="Investments & Endowment",
    )
    tax_compliance_penalties: TaxCompliancePenalties = Field(
        ...,
        description="Tax compliance indicators, penalties, and narrative explanations.",
        title="Tax Compliance / Penalties",
    )

    @model_validator(mode="before")
    @classmethod
    def _ensure_structure(cls, value: Any) -> Any:
        if not isinstance(value, dict):
            return value
        if "core_organization_metadata" in value:
            return value
        return _transform_flat_payload(value)


class ValidatorState(BaseModel):
    extraction: ExtractedIrsForm990PfDataSchema
    initial_findings: list[AuditFinding] = Field(default_factory=list)
    metadata: dict[str, Any] = Field(default_factory=dict)
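
The `_ensure_structure` validator above runs before field validation and has three branches: non-dict input passes through, an already nested payload passes through, and a flat payload is nested first. A minimal sketch of that dispatch without pydantic follows. `transform_flat_payload` here is a hypothetical two-field stand-in, not the real `_transform_flat_payload`, whose full body is not shown in this part of the diff.

```python
from typing import Any


def transform_flat_payload(flat: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in: nests two flat keys under one section."""
    return {
        "core_organization_metadata": {
            "ein": str(flat.get("ein", "")),
            "legal_name": str(flat.get("legal_name", "")),
        },
    }


def ensure_structure(value: Any) -> Any:
    """Mirror the three branches of _ensure_structure."""
    if not isinstance(value, dict):
        return value  # leave non-dicts for normal validation to reject
    if "core_organization_metadata" in value:
        return value  # already in the nested shape
    return transform_flat_payload(value)  # flat payload: nest it first


nested = {"core_organization_metadata": {"ein": "84-2674654"}}
flat = {"ein": "84-2674654", "legal_name": "07 IN HEAVEN MEMORIAL SCHOLARSHIP"}
```

Keying the check on a single section name keeps the validator cheap, at the cost of treating any dict that happens to contain that key as fully nested.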

@@ -0,0 +1,37 @@
from __future__ import annotations

from .agent import agent
from .models import WebSearchResponse, WebSearchState


async def search_web(
    query: str,
    max_results: int = 5,
    include_raw_content: bool = False,
) -> WebSearchResponse:
    """
    Execute a web search through the Tavily-backed agent.

    Args:
        query: Search query string
        max_results: Maximum number of results to return (1-10)
        include_raw_content: Whether to include full content in results

    Returns:
        WebSearchResponse with results and summary
    """
    state = WebSearchState(
        user_query=query,
        max_results=max_results,
        include_raw_content=include_raw_content,
    )
    prompt = (
        f"Search the web for: {query}\n\n"
        f"Return the top {max_results} most relevant results. "
        "Provide a concise summary of the key findings."
    )
    # Run the agent, which calls the Tavily API directly
    result = await agent.run(prompt, deps=state)
    return result.output
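
`search_web` documents `max_results` as 1-10, and `WebSearchState` in the models file of this diff enforces that range with `ge=1, le=10`, so an out-of-range value raises a pydantic validation error instead of being capped. If capping is preferred, one possible approach is to clamp before building the state. `clamp_max_results` is a hypothetical helper, not part of the diff.

```python
def clamp_max_results(requested: int, lower: int = 1, upper: int = 10) -> int:
    """Cap a requested result count to the inclusive range [lower, upper]."""
    return max(lower, min(upper, requested))
```

Whether to clamp silently or surface the validation error is a design choice; clamping is friendlier for tool calls issued by a model, while raising makes bad callers visible.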

@@ -0,0 +1,72 @@
from __future__ import annotations

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider
from tavily import TavilyClient

from app.core.config import settings

from .models import WebSearchResponse, WebSearchState, SearchResult

provider = AzureProvider(
    azure_endpoint=settings.AZURE_OPENAI_ENDPOINT,
    api_version=settings.AZURE_OPENAI_API_VERSION,
    api_key=settings.AZURE_OPENAI_API_KEY,
)
model = OpenAIChatModel(model_name="gpt-4o", provider=provider)
tavily_client = TavilyClient(api_key=settings.TAVILY_API_KEY)

agent = Agent(
    model=model,
    name="WebSearchAgent",
    deps_type=WebSearchState,
    output_type=WebSearchResponse,
    system_prompt=(
        "You are a web search assistant powered by Tavily. "
        "Use the tavily_search tool to find relevant, up-to-date information. "
        "Return a structured WebSearchResponse with results and a concise summary. "
        "Always cite your sources with URLs."
    ),
)


@agent.tool
def tavily_search(ctx: RunContext[WebSearchState], query: str) -> list[SearchResult]:
    """Search the web using Tavily API for up-to-date information."""
    response = tavily_client.search(
        query=query,
        max_results=ctx.deps.max_results,
        search_depth="basic",
        include_raw_content=ctx.deps.include_raw_content,
    )
    results = []
    for item in response.get("results", []):
        results.append(
            SearchResult(
                title=item.get("title", ""),
                url=item.get("url", ""),
                content=item.get("content", ""),
                score=item.get("score"),
            )
        )
    return results


@agent.output_validator
def finalize_response(
    ctx: RunContext[WebSearchState],
    response: WebSearchResponse,
) -> WebSearchResponse:
    """Post-process and validate the search response"""
    return response.model_copy(
        update={
            "query": ctx.deps.user_query,
            "total_results": len(response.results),
        }
    )

@@ -0,0 +1,29 @@
from __future__ import annotations

from pydantic import BaseModel, Field


class SearchResult(BaseModel):
    """Individual search result from web search"""

    title: str
    url: str
    content: str
    score: float | None = None


class WebSearchState(BaseModel):
    """State passed to agent tools via deps"""

    user_query: str
    max_results: int = Field(default=5, ge=1, le=10)
    include_raw_content: bool = False


class WebSearchResponse(BaseModel):
    """Structured response from the web search agent"""

    query: str
    results: list[SearchResult]
    summary: str
    total_results: int

@@ -37,10 +37,17 @@ class Settings(BaseSettings):
     # Azure OpenAI configuration
     AZURE_OPENAI_ENDPOINT: str
     AZURE_OPENAI_API_KEY: str
-    AZURE_OPENAI_API_VERSION: str = "2024-02-01"
+    AZURE_OPENAI_API_VERSION: str = "2024-08-01-preview"
     AZURE_OPENAI_EMBEDDING_MODEL: str = "text-embedding-3-large"
     AZURE_OPENAI_EMBEDDING_DEPLOYMENT: str = "text-embedding-3-large"
+    # Rate limiting for embeddings (tune to the Azure OpenAI tier)
+    # S0 tier: batch_size=16, delay=1.0 is safe
+    # Higher tier: batch_size=100, delay=0.1
+    EMBEDDING_BATCH_SIZE: int = 16
+    EMBEDDING_DELAY_BETWEEN_BATCHES: float = 1.0
+    EMBEDDING_MAX_RETRIES: int = 5
     # Google Cloud / Vertex AI configuration
     GOOGLE_APPLICATION_CREDENTIALS: str
     GOOGLE_CLOUD_PROJECT: str
@@ -51,6 +58,9 @@ class Settings(BaseSettings):
     LANDINGAI_API_KEY: str
     LANDINGAI_ENVIRONMENT: str = "production"  # "production" or "eu"
+    TAVILY_API_KEY: str
     # Schemas storage
     SCHEMAS_DIR: str = "./data/schemas"
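
The three `EMBEDDING_*` settings describe a batch size, a pause between batches, and a retry limit. The embedding service that consumes them is not shown in this part of the diff, so the following is only a sketch of how such settings are commonly applied. `embed_in_batches`, `EmbedFn`, and the exponential backoff are assumptions, not the project's implementation.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical signature for a function that embeds one batch of texts.
EmbedFn = Callable[[Sequence[str]], Awaitable[list[list[float]]]]


async def embed_in_batches(
    texts: Sequence[str],
    embed_batch: EmbedFn,
    batch_size: int = 16,
    delay_between_batches: float = 1.0,
    max_retries: int = 5,
) -> list[list[float]]:
    """Embed texts batch by batch, pausing between batches and retrying failures."""
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        for attempt in range(max_retries):
            try:
                vectors.extend(await embed_batch(batch))
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # retries exhausted: surface the last error
                # Assumed policy: back off exponentially before retrying.
                await asyncio.sleep(delay_between_batches * 2**attempt)
        if start + batch_size < len(texts):
            await asyncio.sleep(delay_between_batches)  # pace the next batch
    return vectors
```

With the defaults above, 100 texts would go out as seven requests with a one-second pause between them, which matches the conservative S0 guidance in the comments.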

@@ -0,0 +1,608 @@
{
"extraction": {
"core_organization_metadata": {
"ein": "84-2674654",
"legal_name": "07 IN HEAVEN MEMORIAL SCHOLARSHIP",
"phone_number": "(262) 215-0300",
"website_url": "",
"return_type": "990-PF",
"amended_return": "No",
"group_exemption_number": "",
"subsection_code": "501(c)(3)",
"ruling_date": "",
"accounting_method": "Cash",
"organization_type": "corporation",
"year_of_formation": "",
"incorporation_state": "WI"
},
"revenue_breakdown": {
"total_revenue": 5227,
"contributions_gifts_grants": 5227,
"program_service_revenue": 0,
"membership_dues": 0,
"investment_income": 0,
"gains_losses_sales_assets": 0,
"rental_income": 0,
"related_organizations_revenue": 0,
"gaming_revenue": 0,
"other_revenue": 0,
"government_grants": 0,
"foreign_contributions": 0
},
"expenses_breakdown": {
"total_expenses": 2104,
"program_services_expenses": 0,
"management_general_expenses": 0,
"fundraising_expenses": 2104,
"grants_us_organizations": 0,
"grants_us_individuals": 0,
"grants_foreign_organizations": 0,
"grants_foreign_individuals": 0,
"compensation_officers": 0,
"compensation_other_staff": 0,
"payroll_taxes_benefits": 0,
"professional_fees": 0,
"office_occupancy_costs": 0,
"information_technology_costs": 0,
"travel_conference_expenses": 0,
"depreciation_amortization": 0,
"insurance": 0
},
"balance_sheet": {},
"officers_directors_trustees_key_employees": [
{
"name": "REBECCA TERPSTRA",
"title_position": "PRESIDENT",
"average_hours_per_week": 0.1,
"related_party_transactions": "",
"former_officer": "",
"governance_role": ""
},
{
"name": "ROBERT GUZMAN",
"title_position": "VICE PRESDEINT",
"average_hours_per_week": 0.1,
"related_party_transactions": "",
"former_officer": "",
"governance_role": ""
},
{
"name": "ANDREA VALENTI",
"title_position": "TREASURER",
"average_hours_per_week": 0.1,
"related_party_transactions": "",
"former_officer": "",
"governance_role": ""
},
{
"name": "BETHANY WALSH",
"title_position": "SECRETARY",
"average_hours_per_week": 0.1,
"related_party_transactions": "",
"former_officer": "",
"governance_role": ""
}
],
"governance_management_disclosure": {
"governing_body_size": 4,
"independent_members": 4,
"financial_statements_reviewed": "",
"form_990_provided_to_governing_body": "",
"conflict_of_interest_policy": "",
"whistleblower_policy": "",
"document_retention_policy": "",
"ceo_compensation_review_process": "",
"public_disclosure_practices": "Yes"
},
"program_service_accomplishments": [],
"fundraising_grantmaking": {
"total_fundraising_event_revenue": 0,
"total_fundraising_event_expenses": 2104,
"professional_fundraiser_fees": 0
},
"functional_operational_data": {
"number_of_employees": 0,
"number_of_volunteers": 0,
"occupancy_costs": 0,
"fundraising_method_descriptions": "",
"joint_ventures_disregarded_entities": ""
},
"compensation_details": {
"base_compensation": 0,
"bonus": 0,
"incentive": 0,
"other": 0,
"non_fixed_compensation": "",
"first_class_travel": "",
"housing_allowance": "",
"expense_account_usage": "",
"supplemental_retirement": ""
},
"political_lobbying_activities": {
"lobbying_expenditures_direct": 0,
"lobbying_expenditures_grassroots": 0,
"election_501h_status": "",
"political_campaign_expenditures": 0,
"related_organizations_affiliates": ""
},
"investments_endowment": {
"investment_types": "",
"donor_restricted_endowment_values": 0,
"net_appreciation_depreciation": 0,
"related_organization_transactions": "",
"loans_to_from_related_parties": ""
},
"tax_compliance_penalties": {
"penalties_excise_taxes_reported": "No",
"unrelated_business_income_disclosure": "No",
"foreign_bank_account_reporting": "No",
"schedule_o_narrative_explanations": ""
}
},
"extraction_metadata": {
"core_organization_metadata": {
"ein": {
"value": "84-2674654",
"references": ["0-7"]
},
"legal_name": {
"value": "07 IN HEAVEN MEMORIAL SCHOLARSHIP",
"references": ["0-6"]
},
"phone_number": {
"value": "(262) 215-0300",
"references": ["0-a"]
},
"website_url": {
"value": "",
"references": []
},
"return_type": {
"value": "990-PF",
"references": ["4ade8ed0-bce7-4bd5-bd8d-190e3e4be95b"]
},
"amended_return": {
"value": "No",
"references": ["4ac9edc4-e9bb-430f-b4c4-a42bf4c04b28"]
},
"group_exemption_number": {
"value": "",
"references": []
},
"subsection_code": {
"value": "501(c)(3)",
"references": ["4ac9edc4-e9bb-430f-b4c4-a42bf4c04b28"]
},
"ruling_date": {
"value": "",
"references": []
},
"accounting_method": {
"value": "Cash",
"references": ["0-d"]
},
"organization_type": {
"value": "corporation",
"references": ["4ac9edc4-e9bb-430f-b4c4-a42bf4c04b28"]
},
"year_of_formation": {
"value": "",
"references": []
},
"incorporation_state": {
"value": "WI",
"references": ["4ac9edc4-e9bb-430f-b4c4-a42bf4c04b28"]
}
},
"revenue_breakdown": {
"total_revenue": {
"value": 5227,
"references": ["0-1z"]
},
"contributions_gifts_grants": {
"value": 5227,
"references": ["0-m"]
},
"program_service_revenue": {
"value": 0,
"references": []
},
"membership_dues": {
"value": 0,
"references": []
},
"investment_income": {
"value": 0,
"references": []
},
"gains_losses_sales_assets": {
"value": 0,
"references": []
},
"rental_income": {
"value": 0,
"references": []
},
"related_organizations_revenue": {
"value": 0,
"references": []
},
"gaming_revenue": {
"value": 0,
"references": []
},
"other_revenue": {
"value": 0,
"references": []
},
"government_grants": {
"value": 0,
"references": []
},
"foreign_contributions": {
"value": 0,
"references": []
}
},
"expenses_breakdown": {
"total_expenses": {
"value": 2104,
"references": ["0-2S"]
},
"program_services_expenses": {
"value": 0,
"references": []
},
"management_general_expenses": {
"value": 0,
"references": []
},
"fundraising_expenses": {
"value": 2104,
"references": ["13-d"]
},
"grants_us_organizations": {
"value": 0,
"references": []
},
"grants_us_individuals": {
"value": 0,
"references": []
},
"grants_foreign_organizations": {
"value": 0,
"references": []
},
"grants_foreign_individuals": {
"value": 0,
"references": []
},
"compensation_officers": {
"value": 0,
"references": ["5-1q", "5-1w", "5-1C", "5-1I"]
},
"compensation_other_staff": {
"value": 0,
"references": []
},
"payroll_taxes_benefits": {
"value": 0,
"references": []
},
"professional_fees": {
"value": 0,
"references": []
},
"office_occupancy_costs": {
"value": 0,
"references": []
},
"information_technology_costs": {
"value": 0,
"references": []
},
"travel_conference_expenses": {
"value": 0,
"references": []
},
"depreciation_amortization": {
"value": 0,
"references": []
},
"insurance": {
"value": 0,
"references": []
}
},
"balance_sheet": {},
"officers_directors_trustees_key_employees": [
{
"name": {
"value": "REBECCA TERPSTRA",
"references": ["5-1o"]
},
"title_position": {
"value": "PRESIDENT",
"references": ["5-1p"]
},
"average_hours_per_week": {
"value": 0.1,
"references": ["5-1p"]
},
"related_party_transactions": {
"value": "",
"references": []
},
"former_officer": {
"value": "",
"references": []
},
"governance_role": {
"value": "",
"references": []
}
},
{
"name": {
"value": "ROBERT GUZMAN",
"references": ["5-1u"]
},
"title_position": {
"value": "VICE PRESDEINT",
"references": ["5-1v"]
},
"average_hours_per_week": {
"value": 0.1,
"references": ["5-1v"]
},
"related_party_transactions": {
"value": "",
"references": []
},
"former_officer": {
"value": "",
"references": []
},
"governance_role": {
"value": "",
"references": []
}
},
{
"name": {
"value": "ANDREA VALENTI",
"references": ["5-1A"]
},
"title_position": {
"value": "TREASURER",
"references": ["5-1B"]
},
"average_hours_per_week": {
"value": 0.1,
"references": ["5-1B"]
},
"related_party_transactions": {
"value": "",
"references": []
},
"former_officer": {
"value": "",
"references": []
},
"governance_role": {
"value": "",
"references": []
}
},
{
"name": {
"value": "BETHANY WALSH",
"references": ["5-1G"]
},
"title_position": {
"value": "SECRETARY",
"references": ["5-1H"]
},
"average_hours_per_week": {
"value": 0.1,
"references": ["5-1H"]
},
"related_party_transactions": {
"value": "",
"references": []
},
"former_officer": {
"value": "",
"references": []
},
"governance_role": {
"value": "",
"references": []
}
}
],
"governance_management_disclosure": {
"governing_body_size": {
"value": 4,
"references": ["5-1o", "5-1u", "5-1A", "5-1G"]
},
"independent_members": {
"value": 4,
"references": ["5-1o", "5-1u", "5-1A", "5-1G"]
},
"financial_statements_reviewed": {
"value": "",
"references": []
},
"form_990_provided_to_governing_body": {
"value": "",
"references": []
},
"conflict_of_interest_policy": {
"value": "",
"references": []
},
"whistleblower_policy": {
"value": "",
"references": []
},
"document_retention_policy": {
"value": "",
"references": []
},
"ceo_compensation_review_process": {
"value": "",
"references": []
},
"public_disclosure_practices": {
"value": "Yes",
"references": ["4-g"]
}
},
"program_service_accomplishments": [],
"fundraising_grantmaking": {
"total_fundraising_event_revenue": {
"value": 0,
"references": []
},
"total_fundraising_event_expenses": {
"value": 2104,
"references": ["13-d"]
},
"professional_fundraiser_fees": {
"value": 0,
"references": []
}
},
"functional_operational_data": {
"number_of_employees": {
"value": 0,
"references": []
},
"number_of_volunteers": {
"value": 0,
"references": []
},
"occupancy_costs": {
"value": 0,
"references": []
},
"fundraising_method_descriptions": {
"value": "",
"references": []
},
"joint_ventures_disregarded_entities": {
"value": "",
"references": []
}
},
"compensation_details": {
"base_compensation": {
"value": 0,
"references": ["5-1q", "5-1w"]
},
"bonus": {
"value": 0,
"references": []
},
"incentive": {
"value": 0,
"references": []
},
"other": {
"value": 0,
"references": []
},
"non_fixed_compensation": {
"value": "",
"references": []
},
"first_class_travel": {
"value": "",
"references": []
},
"housing_allowance": {
"value": "",
"references": []
},
"expense_account_usage": {
"value": "",
"references": []
},
"supplemental_retirement": {
"value": "",
"references": []
}
},
"political_lobbying_activities": {
"lobbying_expenditures_direct": {
"value": 0,
"references": []
},
"lobbying_expenditures_grassroots": {
"value": 0,
"references": []
},
"election_501h_status": {
"value": "",
"references": []
},
"political_campaign_expenditures": {
"value": 0,
"references": []
},
"related_organizations_affiliates": {
"value": "",
"references": []
}
},
"investments_endowment": {
"investment_types": {
"value": "",
"references": []
},
"donor_restricted_endowment_values": {
"value": 0,
"references": []
},
"net_appreciation_depreciation": {
"value": 0,
"references": []
},
"related_organization_transactions": {
"value": "",
"references": []
},
"loans_to_from_related_parties": {
"value": "",
"references": []
}
},
"tax_compliance_penalties": {
"penalties_excise_taxes_reported": {
"value": "No",
"references": ["3-I"]
},
"unrelated_business_income_disclosure": {
"value": "No",
"references": ["3-Y"]
},
"foreign_bank_account_reporting": {
"value": "No",
"references": ["4-H"]
},
"schedule_o_narrative_explanations": {
"value": "",
"references": []
}
}
},
"metadata": {
"filename": "markdown.md",
"org_id": null,
"duration_ms": 16656,
"credit_usage": 27.2,
"job_id": "nnmr8lcxtykk5ll5wodjtrnn6",
"version": "extract-20250930"
}
}

@@ -11,6 +11,7 @@ from .routers.agent import router as agent_router
 from .routers.chunking import router as chunking_router
 from .routers.chunking_landingai import router as chunking_landingai_router
 from .routers.dataroom import router as dataroom_router
+from .routers.extracted_data import router as extracted_data_router
 from .routers.files import router as files_router
 from .routers.schemas import router as schemas_router
 from .routers.vectors import router as vectors_router
@@ -123,6 +124,9 @@ app.include_router(schemas_router)
 # Chunking LandingAI router (new)
 app.include_router(chunking_landingai_router)
+# Extracted data router (new)
+app.include_router(extracted_data_router)
 app.include_router(dataroom_router, prefix="/api/v1")
 app.include_router(agent_router)

@@ -0,0 +1,68 @@
"""
Redis-OM model for storing data extracted from documents.
Allows fast lookup of structured data without vector search.
"""

from datetime import datetime
from typing import Optional, Dict, Any
from redis_om import HashModel, Field, Migrator
import json


class ExtractedDocument(HashModel):
    """
    Model for storing data extracted from documents in Redis.

    Usage:
        1. A PDF is processed with a schema and data is extracted
        2. The chunks go to Qdrant (for RAG)
        3. The extracted data goes to Redis (for structured search)

    Advantages:
        - Fast search by file_name, tema, collection_name
        - Direct access to extracted data without vector search
        - Supports filters and aggregations
    """

    # Identifiers
    file_name: str = Field(index=True)
    tema: str = Field(index=True)
    collection_name: str = Field(index=True)

    # Extracted data (serialized JSON)
    # Redis-OM HashModel does not support Dict directly, so we store a str and serialize
    extracted_data_json: str

    # Metadata
    extraction_timestamp: str  # ISO format

    class Meta:
        database = None  # Configured at runtime
        global_key_prefix = "extracted_doc"
        model_key_prefix = "doc"

    def set_extracted_data(self, data: Dict[str, Any]) -> None:
        """Helper to serialize extracted data to JSON"""
        self.extracted_data_json = json.dumps(data, ensure_ascii=False, indent=2)

    def get_extracted_data(self) -> Dict[str, Any]:
        """Helper to deserialize extracted data from JSON"""
        return json.loads(self.extracted_data_json)

    @classmethod
    def find_by_file(cls, file_name: str):
        """Find all extracted documents for a file"""
        return cls.find(cls.file_name == file_name).all()

    @classmethod
    def find_by_tema(cls, tema: str):
        """Find all extracted documents for a tema (topic)"""
        return cls.find(cls.tema == tema).all()

    @classmethod
    def find_by_collection(cls, collection_name: str):
        """Find all documents in a collection"""
        return cls.find(cls.collection_name == collection_name).all()


# Run the migration to create the indexes in Redis
Migrator().run()
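
Because the model stores the extraction as a JSON string, `set_extracted_data` and `get_extracted_data` amount to a `json.dumps` / `json.loads` round trip. The sketch below shows that round trip in isolation, using only the standard library. The function names and the sample payload are illustrative, not part of the diff.

```python
import json
from typing import Any


def serialize_extracted(data: dict[str, Any]) -> str:
    """Same call as set_extracted_data: keep non-ASCII readable, indent by 2."""
    return json.dumps(data, ensure_ascii=False, indent=2)


def deserialize_extracted(raw: str) -> dict[str, Any]:
    """Same call as get_extracted_data."""
    return json.loads(raw)


# Illustrative payload with a non-ASCII character.
sample = {"legal_name": "Fundación Ejemplo", "total_revenue": 5227}
raw = serialize_extracted(sample)
```

`ensure_ascii=False` stores accented characters as-is rather than as `\uXXXX` escapes, which keeps the stored value readable when inspected directly in Redis; `indent=2` trades some storage size for the same readability.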

@@ -58,7 +58,7 @@ class CustomSchema(BaseModel):
     schema_id: Optional[str] = Field(None, description="ID único del schema (generado automáticamente si no se provee)")
     schema_name: str = Field(..., description="Nombre descriptivo del schema", min_length=1, max_length=100)
     description: str = Field(..., description="Descripción de qué extrae este schema", min_length=1, max_length=500)
-    fields: List[SchemaField] = Field(..., description="Lista de campos a extraer", min_items=1, max_items=50)
+    fields: List[SchemaField] = Field(..., description="Lista de campos a extraer", min_items=1, max_items=200)
     # Metadata
     created_at: Optional[str] = Field(None, description="Timestamp de creación ISO")

@@ -1,12 +1,19 @@
-from fastapi import APIRouter
-from pydantic_ai import Agent
+import json
+import logging
+from dataclasses import dataclass
+from typing import Annotated, Any
+from fastapi import APIRouter, Header
+from pydantic_ai import Agent, RunContext
 from pydantic_ai.models.openai import OpenAIChatModel
 from pydantic_ai.providers.azure import AzureProvider
 from pydantic_ai.ui.vercel_ai import VercelAIAdapter
 from starlette.requests import Request
 from starlette.responses import Response
+from app.agents import analyst, form_auditor, web_search
 from app.core.config import settings
+from app.services.extracted_data_service import get_extracted_data_service
 provider = AzureProvider(
     azure_endpoint=settings.AZURE_OPENAI_ENDPOINT,
@@ -14,11 +21,65 @@ provider = AzureProvider(
     api_key=settings.AZURE_OPENAI_API_KEY,
 )
 model = OpenAIChatModel(model_name="gpt-4o", provider=provider)
-agent = Agent(model=model)
+@dataclass
+class Deps:
+    extracted_data: list[dict[str, Any]]
+agent = Agent(model=model, deps_type=Deps)
 router = APIRouter(prefix="/api/v1/agent", tags=["Agent"])
+logger = logging.getLogger(__name__)
+@agent.tool
+async def build_audit_report(ctx: RunContext[Deps]):
+    """Calls the audit subagent to get a full audit report of the organization"""
+    data = ctx.deps.extracted_data[0]
+    result = await form_auditor.build_audit_report(data)
+    return result.model_dump()
+@agent.tool
+async def build_analysis_report(ctx: RunContext[Deps]):
+    """Calls the analyst subagent to get a full report of the organization's performance across years"""
+    data = ctx.deps.extracted_data
+    if not data:
+        raise ValueError("No extracted data available for analysis.")
+    if len(data) == 1:
+        logger.info(
+            "build_analysis_report called with single-year data; report will still be generated but trends may be limited."
+        )
+    result = await analyst.build_performance_report(data)
+    return result.model_dump()
+@agent.tool_plain
+async def search_web_information(query: str, max_results: int = 5):
+    """Search the web for up-to-date information using Tavily. Use this when you need current information, news, research, or facts not in your knowledge base."""
+    result = await web_search.search_web(query=query, max_results=max_results)
+    return result.model_dump()
 @router.post("/chat")
-async def chat(request: Request) -> Response:
-    return await VercelAIAdapter.dispatch_request(request, agent=agent)
+async def chat(request: Request, tema: Annotated[str, Header()]) -> Response:
+    extracted_data_service = get_extracted_data_service()
+    data = await extracted_data_service.get_by_tema(tema)
+    extracted_data = [doc.get_extracted_data() for doc in data]
+    logger.info(f"Extracted data amount: {len(extracted_data)}")
+    deps = Deps(extracted_data=extracted_data)
+    return await VercelAIAdapter.dispatch_request(request, agent=agent, deps=deps)
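
In this router, `build_analysis_report` checks for empty data before calling the analyst, while `build_audit_report` indexes `extracted_data[0]` directly, which would raise an `IndexError` if no documents exist for the given `tema`. A small guard along the following lines might make the two tools behave consistently. `first_extraction` is a hypothetical helper, not part of the diff.

```python
from typing import Any


def first_extraction(extracted_data: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the first extracted document, or fail with a descriptive error."""
    if not extracted_data:
        raise ValueError("No extracted data available for audit.")
    return extracted_data[0]
```

Raising a `ValueError` with a clear message mirrors the check already used in `build_analysis_report`.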

@@ -2,17 +2,19 @@
Router para procesamiento de PDFs con LandingAI. Router para procesamiento de PDFs con LandingAI.
Soporta dos modos: rápido (solo parse) y extracción (parse + extract con schema). Soporta dos modos: rápido (solo parse) y extracción (parse + extract con schema).
""" """
import logging import logging
import time import time
from typing import List, Literal, Optional
from fastapi import APIRouter, HTTPException from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from typing import Optional, List, Literal
from langchain_core.documents import Document from langchain_core.documents import Document
from pydantic import BaseModel, Field
from ..services.landingai_service import get_landingai_service
from ..services.chunking_service import get_chunking_service
from ..repositories.schema_repository import get_schema_repository from ..repositories.schema_repository import get_schema_repository
from ..services.chunking_service import get_chunking_service
from ..services.landingai_service import get_landingai_service
from ..services.extracted_data_service import get_extracted_data_service
from ..utils.chunking.token_manager import TokenManager from ..utils.chunking.token_manager import TokenManager
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -22,6 +24,7 @@ router = APIRouter(prefix="/api/v1/chunking-landingai", tags=["chunking-landinga
class ProcessLandingAIRequest(BaseModel): class ProcessLandingAIRequest(BaseModel):
"""Request para procesar PDF con LandingAI""" """Request para procesar PDF con LandingAI"""
file_name: str = Field(..., description="Nombre del archivo PDF") file_name: str = Field(..., description="Nombre del archivo PDF")
tema: str = Field(..., description="Tema/carpeta del archivo") tema: str = Field(..., description="Tema/carpeta del archivo")
collection_name: str = Field(..., description="Colección de Qdrant") collection_name: str = Field(..., description="Colección de Qdrant")
@@ -29,34 +32,33 @@ class ProcessLandingAIRequest(BaseModel):
# Modo de procesamiento # Modo de procesamiento
mode: Literal["quick", "extract"] = Field( mode: Literal["quick", "extract"] = Field(
default="quick", default="quick",
description="Modo: 'quick' (solo parse) o 'extract' (parse + datos estructurados)" description="Modo: 'quick' (solo parse) o 'extract' (parse + datos estructurados)",
) )
# Schema (obligatorio si mode='extract') # Schema (obligatorio si mode='extract')
schema_id: Optional[str] = Field( schema_id: Optional[str] = Field(
None, None, description="ID del schema a usar (requerido si mode='extract')"
description="ID del schema a usar (requerido si mode='extract')"
) )
# Configuración de chunks # Configuración de chunks
include_chunk_types: List[str] = Field( include_chunk_types: List[str] = Field(
default=["text", "table"], default=["text", "table"],
description="Tipos de chunks a incluir: text, table, figure, etc." description="Tipos de chunks a incluir: text, table, figure, etc.",
) )
max_tokens_per_chunk: int = Field( max_tokens_per_chunk: int = Field(
default=1500, default=1500,
ge=500, ge=500,
le=3000, le=3000,
description="Tokens máximos por chunk (flexible para tablas/figuras)" description="Tokens máximos por chunk (flexible para tablas/figuras)",
) )
merge_small_chunks: bool = Field( merge_small_chunks: bool = Field(
default=True, default=True, description="Unir chunks pequeños de la misma página y tipo"
description="Unir chunks pequeños de la misma página y tipo"
) )
class ProcessLandingAIResponse(BaseModel): class ProcessLandingAIResponse(BaseModel):
"""Response del procesamiento con LandingAI""" """Response del procesamiento con LandingAI"""
success: bool success: bool
mode: str mode: str
processing_time_seconds: float processing_time_seconds: float
@@ -97,21 +99,22 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
start_time = time.time() start_time = time.time()
try: try:
logger.info(f"\n{'='*60}") logger.info(f"\n{'=' * 60}")
logger.info(f"INICIANDO PROCESAMIENTO CON LANDINGAI") logger.info("INICIANDO PROCESAMIENTO CON LANDINGAI")
logger.info(f"{'='*60}") logger.info(f"{'=' * 60}")
logger.info(f"Archivo: {request.file_name}") logger.info(f"Archivo: {request.file_name}")
logger.info(f"Tema: {request.tema}") logger.info(f"Tema: {request.tema}")
logger.info(f"Modo: {request.mode}") logger.info(f"Modo: {request.mode}")
logger.info(f"Colección: {request.collection_name}") logger.info(f"Colección: {request.collection_name}")
logger.info(f"Schema ID recibido: '{request.schema_id}' (tipo: {type(request.schema_id).__name__})")
# 1. Validar schema si es modo extract # 1. Validar schema si es modo extract
custom_schema = None custom_schema = None
if request.mode == "extract": if request.mode == "extract":
if not request.schema_id: if not request.schema_id or request.schema_id.strip() == "":
raise HTTPException( raise HTTPException(
status_code=400, status_code=400,
detail="schema_id es requerido cuando mode='extract'" detail="schema_id es requerido cuando mode='extract'",
) )
schema_repo = get_schema_repository() schema_repo = get_schema_repository()
@@ -119,8 +122,7 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
if not custom_schema: if not custom_schema:
raise HTTPException( raise HTTPException(
status_code=404, status_code=404, detail=f"Schema no encontrado: {request.schema_id}"
detail=f"Schema no encontrado: {request.schema_id}"
) )
logger.info(f"Schema seleccionado: {custom_schema.schema_name}") logger.info(f"Schema seleccionado: {custom_schema.schema_name}")
@@ -131,14 +133,12 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
try: try:
pdf_bytes = await chunking_service.download_pdf_from_blob( pdf_bytes = await chunking_service.download_pdf_from_blob(
request.file_name, request.file_name, request.tema
request.tema
) )
except Exception as e: except Exception as e:
logger.error(f"Error descargando PDF: {e}") logger.error(f"Error descargando PDF: {e}")
raise HTTPException( raise HTTPException(
status_code=404, status_code=404, detail=f"No se pudo descargar el PDF: {str(e)}"
detail=f"No se pudo descargar el PDF: {str(e)}"
) )
# 3. Procesar con LandingAI # 3. Procesar con LandingAI
@@ -150,13 +150,12 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
pdf_bytes=pdf_bytes, pdf_bytes=pdf_bytes,
file_name=request.file_name, file_name=request.file_name,
custom_schema=custom_schema, custom_schema=custom_schema,
include_chunk_types=request.include_chunk_types include_chunk_types=request.include_chunk_types,
) )
except Exception as e: except Exception as e:
logger.error(f"Error en LandingAI: {e}") logger.error(f"Error en LandingAI: {e}")
raise HTTPException( raise HTTPException(
status_code=500, status_code=500, detail=f"Error procesando con LandingAI: {str(e)}"
detail=f"Error procesando con LandingAI: {str(e)}"
) )
documents = result["chunks"] documents = result["chunks"]
@@ -164,7 +163,7 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
if not documents: if not documents:
raise HTTPException( raise HTTPException(
status_code=400, status_code=400,
detail="No se generaron chunks después del procesamiento" detail="No se generaron chunks después del procesamiento",
) )
# 4. Apply flexible token control # 4. Apply flexible token control
@@ -172,7 +171,7 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
documents = _apply_flexible_token_control( documents = _apply_flexible_token_control(
documents, documents,
max_tokens=request.max_tokens_per_chunk, max_tokens=request.max_tokens_per_chunk,
merge_small=request.merge_small_chunks merge_small=request.merge_small_chunks,
) )
# 5. Generate embeddings # 5. Generate embeddings
@@ -180,13 +179,16 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
texts = [doc.page_content for doc in documents] texts = [doc.page_content for doc in documents]
try: try:
embeddings = await chunking_service.embedding_service.generate_embeddings_batch(texts) embeddings = (
await chunking_service.embedding_service.generate_embeddings_batch(
texts
)
)
logger.info(f"Embeddings generados: {len(embeddings)} vectores") logger.info(f"Embeddings generados: {len(embeddings)} vectores")
except Exception as e: except Exception as e:
logger.error(f"Error generando embeddings: {e}") logger.error(f"Error generando embeddings: {e}")
raise HTTPException( raise HTTPException(
status_code=500, status_code=500, detail=f"Error generando embeddings: {str(e)}"
detail=f"Error generando embeddings: {str(e)}"
) )
# 6. Prepare chunks for Qdrant with deterministic IDs # 6. Prepare chunks for Qdrant with deterministic IDs
@@ -198,38 +200,54 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
chunk_id = chunking_service._generate_deterministic_id( chunk_id = chunking_service._generate_deterministic_id(
file_name=request.file_name, file_name=request.file_name,
page=doc.metadata.get("page", 1), page=doc.metadata.get("page", 1),
chunk_index=doc.metadata.get("chunk_id", str(idx)) chunk_index=doc.metadata.get("chunk_id", str(idx)),
) )
qdrant_chunks.append({ qdrant_chunks.append(
{
"id": chunk_id, "id": chunk_id,
"vector": embedding, "vector": embedding,
"payload": { "payload": {
"page_content": doc.page_content, "page_content": doc.page_content,
"metadata": doc.metadata # Rich metadata from LandingAI "metadata": doc.metadata, # Rich metadata from LandingAI
},
} }
}) )
# 7. Upload to Qdrant # 7. Upload to Qdrant
try: try:
upload_result = await chunking_service.vector_db.add_chunks( upload_result = await chunking_service.vector_db.add_chunks(
request.collection_name, request.collection_name, qdrant_chunks
qdrant_chunks
) )
logger.info(f"Subida completada: {upload_result['chunks_added']} chunks") logger.info(f"Subida completada: {upload_result['chunks_added']} chunks")
except Exception as e: except Exception as e:
logger.error(f"Error subiendo a Qdrant: {e}") logger.error(f"Error subiendo a Qdrant: {e}")
raise HTTPException( raise HTTPException(
status_code=500, status_code=500, detail=f"Error subiendo a Qdrant: {str(e)}"
detail=f"Error subiendo a Qdrant: {str(e)}"
) )
# 8. Save extracted data to Redis (if extracted_data exists)
if result.get("extracted_data") and result["extracted_data"].get("extraction"):
try:
logger.info("\n[6/6] Guardando datos extraídos en Redis...")
extracted_data_service = get_extracted_data_service()
await extracted_data_service.save_extracted_data(
file_name=request.file_name,
tema=request.tema,
collection_name=request.collection_name,
extracted_data=result["extracted_data"]["extraction"]
)
except Exception as e:
# Do not fail if Redis fails; just log a warning
logger.warning(f"⚠️ No se pudieron guardar datos en Redis (no crítico): {e}")
# Total time # Total time
processing_time = time.time() - start_time processing_time = time.time() - start_time
logger.info(f"\n{'='*60}") logger.info(f"\n{'=' * 60}")
logger.info(f"PROCESAMIENTO COMPLETADO") logger.info(f"PROCESAMIENTO COMPLETADO")
logger.info(f"{'='*60}") logger.info(f"{'=' * 60}")
logger.info(f"Tiempo: {processing_time:.2f}s") logger.info(f"Tiempo: {processing_time:.2f}s")
logger.info(f"Chunks procesados: {len(documents)}") logger.info(f"Chunks procesados: {len(documents)}")
logger.info(f"Chunks subidos: {upload_result['chunks_added']}") logger.info(f"Chunks subidos: {upload_result['chunks_added']}")
@@ -245,23 +263,18 @@ async def process_with_landingai(request: ProcessLandingAIRequest):
schema_used=custom_schema.schema_id if custom_schema else None, schema_used=custom_schema.schema_id if custom_schema else None,
extracted_data=result.get("extracted_data"), extracted_data=result.get("extracted_data"),
parse_metadata=result["parse_metadata"], parse_metadata=result["parse_metadata"],
message=f"PDF procesado exitosamente en modo {request.mode}" message=f"PDF procesado exitosamente en modo {request.mode}",
) )
except HTTPException: except HTTPException:
raise raise
except Exception as e: except Exception as e:
logger.error(f"Error inesperado en procesamiento: {e}") logger.error(f"Error inesperado en procesamiento: {e}")
raise HTTPException( raise HTTPException(status_code=500, detail=f"Error inesperado: {str(e)}")
status_code=500,
detail=f"Error inesperado: {str(e)}"
)
def _apply_flexible_token_control( def _apply_flexible_token_control(
documents: List[Document], documents: List[Document], max_tokens: int, merge_small: bool
max_tokens: int,
merge_small: bool
) -> List[Document]: ) -> List[Document]:
""" """
Apply flexible token control (Option C of the design). Apply flexible token control (Option C of the design).
@@ -306,14 +319,10 @@ def _apply_flexible_token_control(
else: else:
# Try to merge if the chunk is small # Try to merge if the chunk is small
if ( if merge_small and tokens < max_tokens * 0.5 and i < len(documents) - 1:
merge_small and
tokens < max_tokens * 0.5 and
i < len(documents) - 1
):
next_doc = documents[i + 1] next_doc = documents[i + 1]
if _can_merge(doc, next_doc, max_tokens, token_manager): if _can_merge(doc, next_doc, max_tokens, token_manager):
logger.debug(f"Merging chunks {i} y {i+1}") logger.debug(f"Merging chunks {i} y {i + 1}")
doc = _merge_documents(doc, next_doc) doc = _merge_documents(doc, next_doc)
i += 1 # Skip next i += 1 # Skip next
@@ -326,9 +335,7 @@ def _apply_flexible_token_control(
def _split_large_chunk( def _split_large_chunk(
doc: Document, doc: Document, max_tokens: int, token_manager: TokenManager
max_tokens: int,
token_manager: TokenManager
) -> List[Document]: ) -> List[Document]:
"""Split a large chunk into sub-chunks""" """Split a large chunk into sub-chunks"""
content = doc.page_content content = doc.page_content
@@ -343,8 +350,7 @@ def _split_large_chunk(
# Save the current chunk # Save the current chunk
sub_content = " ".join(current_chunk) sub_content = " ".join(current_chunk)
sub_doc = Document( sub_doc = Document(
page_content=sub_content, page_content=sub_content, metadata={**doc.metadata, "is_split": True}
metadata={**doc.metadata, "is_split": True}
) )
sub_chunks.append(sub_doc) sub_chunks.append(sub_doc)
current_chunk = [word] current_chunk = [word]
@@ -357,8 +363,7 @@ def _split_large_chunk(
if current_chunk: if current_chunk:
sub_content = " ".join(current_chunk) sub_content = " ".join(current_chunk)
sub_doc = Document( sub_doc = Document(
page_content=sub_content, page_content=sub_content, metadata={**doc.metadata, "is_split": True}
metadata={**doc.metadata, "is_split": True}
) )
sub_chunks.append(sub_doc) sub_chunks.append(sub_doc)
@@ -366,10 +371,7 @@ def _split_large_chunk(
def _can_merge( def _can_merge(
doc1: Document, doc1: Document, doc2: Document, max_tokens: int, token_manager: TokenManager
doc2: Document,
max_tokens: int,
token_manager: TokenManager
) -> bool: ) -> bool:
"""Check whether two docs can be merged""" """Check whether two docs can be merged"""
# Same page # Same page
@@ -391,6 +393,5 @@ def _merge_documents(doc1: Document, doc2: Document) -> Document:
"""Merge two documents""" """Merge two documents"""
merged_content = f"{doc1.page_content}\n\n{doc2.page_content}" merged_content = f"{doc1.page_content}\n\n{doc2.page_content}"
return Document( return Document(
page_content=merged_content, page_content=merged_content, metadata={**doc1.metadata, "is_merged": True}
metadata={**doc1.metadata, "is_merged": True}
) )
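The split/merge flow in `_apply_flexible_token_control` can be illustrated with a minimal, self-contained sketch. It is not the code from this changeset: `Doc` and `count_tokens` are hypothetical stand-ins for `Document` and `TokenManager` (tokens are approximated by whitespace-separated words), and the merge condition assumes that `_can_merge` requires the same page and a combined size within `max_tokens`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for the Document type used in the diff."""
    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


def count_tokens(text: str) -> int:
    # Assumption: whitespace words approximate tokens; the real TokenManager is not shown.
    return len(text.split())


def split_large(doc: Doc, max_tokens: int) -> List[Doc]:
    """Split an oversized chunk into word-bounded sub-chunks."""
    words = doc.page_content.split()
    return [
        Doc(" ".join(words[i:i + max_tokens]), {**doc.metadata, "is_split": True})
        for i in range(0, len(words), max_tokens)
    ]


def control_tokens(docs: List[Doc], max_tokens: int, merge_small: bool) -> List[Doc]:
    """Split chunks above max_tokens; optionally merge small neighbours."""
    out: List[Doc] = []
    i = 0
    while i < len(docs):
        doc = docs[i]
        tokens = count_tokens(doc.page_content)
        if tokens > max_tokens:
            out.extend(split_large(doc, max_tokens))
        else:
            # A chunk under half the budget is a merge candidate.
            if merge_small and tokens < max_tokens * 0.5 and i < len(docs) - 1:
                nxt = docs[i + 1]
                same_page = doc.metadata.get("page") == nxt.metadata.get("page")
                fits = tokens + count_tokens(nxt.page_content) <= max_tokens
                if same_page and fits:
                    doc = Doc(
                        f"{doc.page_content}\n\n{nxt.page_content}",
                        {**doc.metadata, "is_merged": True},
                    )
                    i += 1  # skip the chunk that was merged in
            out.append(doc)
        i += 1
    return out
```

With `max_tokens=5`, two two-word chunks on the same page are merged into one, while a twelve-word chunk is split into pieces of 5, 5 and 2 words.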


@@ -1,10 +1,12 @@
import logging import logging
from typing import Optional
from fastapi import APIRouter, HTTPException from fastapi import APIRouter, HTTPException
from pydantic import BaseModel from pydantic import BaseModel
from ..models.dataroom import DataRoom from ..models.dataroom import DataRoom
from ..models.vector_models import CollectionCreateRequest from ..models.vector_models import CollectionCreateRequest
from ..services.azure_service import azure_service
from ..services.vector_service import vector_service from ..services.vector_service import vector_service
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -12,13 +14,146 @@ logger = logging.getLogger(__name__)
class DataroomCreate(BaseModel): class DataroomCreate(BaseModel):
name: str name: str
collection: str = ""
storage: str = "" @property
def collection(self) -> str:
return self.name.lower().replace(" ", "_")
@property
def storage(self) -> str:
return self.name.lower().replace(" ", "_")
class DataroomInfo(BaseModel):
name: str
collection: str
storage: str
file_count: int
total_size_bytes: int
total_size_mb: float
collection_exists: bool
vector_count: Optional[int]
collection_info: Optional[dict]
file_types: dict
recent_files: list
router = APIRouter(prefix="/dataroom", tags=["Dataroom"]) router = APIRouter(prefix="/dataroom", tags=["Dataroom"])
@router.get("/{dataroom_name}/info")
async def dataroom_info(dataroom_name: str) -> DataroomInfo:
"""
Get detailed information for a specific dataroom
"""
try:
# Find the dataroom in Redis
datarooms = DataRoom.find().all()
dataroom = None
for room in datarooms:
if room.name == dataroom_name:
dataroom = room
break
if not dataroom:
raise HTTPException(
status_code=404, detail=f"Dataroom '{dataroom_name}' not found"
)
# Get file information from Azure Storage
try:
files_data = await azure_service.list_files(dataroom_name)
except Exception as e:
logger.warning(f"Could not fetch files for dataroom '{dataroom_name}': {e}")
files_data = []
# Calculate file metrics
file_count = len(files_data)
total_size_bytes = sum(file_data.get("size", 0) for file_data in files_data)
total_size_mb = (
round(total_size_bytes / (1024 * 1024), 2) if total_size_bytes > 0 else 0.0
)
# Analyze file types
file_types = {}
recent_files = []
for file_data in files_data:
# Count file types by extension
filename = file_data.get("name", "")
if "." in filename:
ext = filename.split(".")[-1].lower()
file_types[ext] = file_types.get(ext, 0) + 1
# Collect a summary of every file; the newest five are selected after the loop
recent_files.append(
{
"name": filename,
"size_mb": round(file_data.get("size", 0) / (1024 * 1024), 2),
"last_modified": file_data.get("last_modified"),
}
)
# Sort by last modified (newest first), then keep the five most recent
recent_files.sort(key=lambda x: str(x.get("last_modified") or ""), reverse=True)
recent_files = recent_files[:5]
# Get vector collection information
collection_exists = False
vector_count = None
collection_info = None
try:
collection_exists_response = await vector_service.check_collection_exists(
dataroom_name
)
collection_exists = collection_exists_response.exists
if collection_exists:
collection_info_response = await vector_service.get_collection_info(
dataroom_name
)
if collection_info_response:
collection_info = {
"vectors_count": collection_info_response.vectors_count,
"indexed_vectors_count": collection_info_response.vectors_count,
"points_count": collection_info_response.vectors_count,
"segments_count": collection_info_response.vectors_count,
"status": collection_info_response.status,
}
vector_count = collection_info_response.vectors_count
except Exception as e:
logger.warning(
f"Could not fetch collection info for '{dataroom_name}': {e}"
)
logger.info(
f"Retrieved info for dataroom '{dataroom_name}': {file_count} files, {total_size_mb}MB"
)
return DataroomInfo(
name=dataroom.name,
collection=dataroom.collection,
storage=dataroom.storage,
file_count=file_count,
total_size_bytes=total_size_bytes,
total_size_mb=total_size_mb,
collection_exists=collection_exists,
vector_count=vector_count,
collection_info=collection_info,
file_types=file_types,
recent_files=recent_files,
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Error getting dataroom info for '{dataroom_name}': {e}")
raise HTTPException(
status_code=500, detail=f"Error getting dataroom info: {str(e)}"
)
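The file-metrics step of `dataroom_info` can be sketched on its own. This is a minimal illustration under assumptions, not the endpoint's code: `summarize_files` is a hypothetical helper, and each file entry is assumed to be a dict with `name`, `size` and `last_modified` keys, as the endpoint reads them. It sorts before truncating, so the result holds the newest files rather than the first ones listed.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def summarize_files(
    files: List[Dict[str, Any]], limit: int = 5
) -> Tuple[Dict[str, int], List[Dict[str, Any]]]:
    """Return (extension counts, the `limit` most recently modified files)."""
    types: Counter = Counter()
    for f in files:
        name = f.get("name", "")
        if "." in name:
            types[name.rsplit(".", 1)[-1].lower()] += 1

    # Sort first, then slice; str() keeps the key comparable when a value is None.
    newest = sorted(
        files, key=lambda f: str(f.get("last_modified") or ""), reverse=True
    )[:limit]
    recent = [
        {
            "name": f.get("name", ""),
            "size_mb": round(f.get("size", 0) / (1024 * 1024), 2),
            "last_modified": f.get("last_modified"),
        }
        for f in newest
    ]
    return dict(types), recent
```

Sorting on the string form assumes timestamps share one format (for example ISO 8601), which is what makes lexical order match chronological order.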
@router.get("/") @router.get("/")
async def list_datarooms(): async def list_datarooms():
""" """


@@ -0,0 +1,141 @@
"""
Router for querying extracted data stored in Redis.
"""
import logging
from typing import List, Optional
from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel
from ..services.extracted_data_service import get_extracted_data_service
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/v1/extracted-data", tags=["extracted-data"])
class ExtractedDataResponse(BaseModel):
"""Response with the data extracted from a document"""
pk: str
file_name: str
tema: str
collection_name: str
extracted_data: dict
extraction_timestamp: str
class ExtractedDataListResponse(BaseModel):
"""Response with a list of extracted data"""
total: int
documents: List[ExtractedDataResponse]
@router.get("/by-file/{file_name}", response_model=ExtractedDataListResponse)
async def get_by_file(file_name: str):
"""
Returns all extracted data for a specific file.
Args:
file_name: Name of the file
Returns:
List of documents with extracted data
"""
try:
service = get_extracted_data_service()
docs = await service.get_by_file(file_name)
documents = [
ExtractedDataResponse(
pk=doc.pk,
file_name=doc.file_name,
tema=doc.tema,
collection_name=doc.collection_name,
extracted_data=doc.get_extracted_data(),
extraction_timestamp=doc.extraction_timestamp
)
for doc in docs
]
return ExtractedDataListResponse(
total=len(documents),
documents=documents
)
except Exception as e:
logger.error(f"Error obteniendo datos extraídos por archivo: {e}")
raise HTTPException(status_code=500, detail=str(e))
@router.get("/by-tema/{tema}", response_model=ExtractedDataListResponse)
async def get_by_tema(tema: str):
"""
Returns all extracted data for a specific topic.
Args:
tema: Name of the topic
Returns:
List of documents with extracted data
"""
try:
service = get_extracted_data_service()
docs = await service.get_by_tema(tema)
documents = [
ExtractedDataResponse(
pk=doc.pk,
file_name=doc.file_name,
tema=doc.tema,
collection_name=doc.collection_name,
extracted_data=doc.get_extracted_data(),
extraction_timestamp=doc.extraction_timestamp
)
for doc in docs
]
return ExtractedDataListResponse(
total=len(documents),
documents=documents
)
except Exception as e:
logger.error(f"Error obteniendo datos extraídos por tema: {e}")
raise HTTPException(status_code=500, detail=str(e))
@router.get("/by-collection/{collection_name}", response_model=ExtractedDataListResponse)
async def get_by_collection(collection_name: str):
"""
Returns all extracted data for a specific collection.
Args:
collection_name: Name of the collection
Returns:
List of documents with extracted data
"""
try:
service = get_extracted_data_service()
docs = await service.get_by_collection(collection_name)
documents = [
ExtractedDataResponse(
pk=doc.pk,
file_name=doc.file_name,
tema=doc.tema,
collection_name=doc.collection_name,
extracted_data=doc.get_extracted_data(),
extraction_timestamp=doc.extraction_timestamp
)
for doc in docs
]
return ExtractedDataListResponse(
total=len(documents),
documents=documents
)
except Exception as e:
logger.error(f"Error obteniendo datos extraídos por colección: {e}")
raise HTTPException(status_code=500, detail=str(e))
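The three endpoints above build their responses with the same list comprehension. One possible way to factor that out is sketched below, using plain dataclasses and dicts instead of the Pydantic and Redis models so the example runs on its own. `StoredDoc` and `to_list_response` are hypothetical names, and storing the extraction as a JSON string is an assumption based on the `extracted_data_json` field and the `get_extracted_data()` accessor seen in this changeset.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass
class StoredDoc:
    """Hypothetical stand-in for ExtractedDocument."""
    pk: str
    file_name: str
    tema: str
    collection_name: str
    extracted_data_json: str
    extraction_timestamp: str

    def get_extracted_data(self) -> Dict[str, Any]:
        # Assumption: the extraction is stored as a JSON string.
        return json.loads(self.extracted_data_json)


def to_list_response(docs: Iterable[StoredDoc]) -> Dict[str, Any]:
    """Build the {total, documents} payload shared by all three endpoints."""
    documents = [
        {
            "pk": doc.pk,
            "file_name": doc.file_name,
            "tema": doc.tema,
            "collection_name": doc.collection_name,
            "extracted_data": doc.get_extracted_data(),
            "extraction_timestamp": doc.extraction_timestamp,
        }
        for doc in docs
    ]
    return {"total": len(documents), "documents": documents}
```

Each endpoint would then reduce to fetching the documents and returning the helper's result.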


@@ -66,6 +66,8 @@ class ChunkingService:
""" """
Download a PDF from Azure Blob Storage. Download a PDF from Azure Blob Storage.
NOTE: All blobs are stored in lowercase in Azure.
Args: Args:
file_name: Name of the file file_name: Name of the file
tema: Topic/folder of the file tema: Topic/folder of the file
@@ -77,8 +79,9 @@ class ChunkingService:
Exception: If downloading the file fails Exception: If downloading the file fails
""" """
try: try:
blob_path = f"{tema}/{file_name}" # Convert to lowercase because all blobs are stored in lowercase
logger.info(f"Descargando PDF: {blob_path}") blob_path = f"{tema.lower()}/{file_name.lower()}"
logger.info(f"Descargando PDF: {blob_path} (tema original: {tema}, file original: {file_name})")
blob_client = self.blob_service.get_blob_client( blob_client = self.blob_service.get_blob_client(
container=self.container_name, container=self.container_name,


@@ -1,10 +1,12 @@
""" """
Embedding service using Azure OpenAI. Embedding service using Azure OpenAI.
Generates embeddings for text chunks using text-embedding-3-large (3072 dimensions). Generates embeddings for text chunks using text-embedding-3-large (3072 dimensions).
Includes rate-limit handling with exponential-backoff retries and delays between batches.
""" """
import asyncio
import logging import logging
from typing import List from typing import List
from openai import AzureOpenAI from openai import AzureOpenAI, RateLimitError
from ..core.config import settings from ..core.config import settings
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -63,28 +65,47 @@ class EmbeddingService:
async def generate_embeddings_batch( async def generate_embeddings_batch(
self, self,
texts: List[str], texts: List[str],
batch_size: int = 100 batch_size: int | None = None,
delay_between_batches: float | None = None,
max_retries: int | None = None
) -> List[List[float]]: ) -> List[List[float]]:
""" """
Generates embeddings for multiple texts in batches. Generates embeddings for multiple texts in batches with rate-limit handling.
Args: Args:
texts: List of texts to embed texts: List of texts to embed
batch_size: Batch size for processing (default: 100) batch_size: Batch size (None = use the value from settings)
delay_between_batches: Seconds to wait between batches (None = use the value from settings)
max_retries: Maximum number of retries (None = use the value from settings)
Returns: Returns:
List of embedding vectors List of embedding vectors
Raises: Raises:
Exception: If embedding generation fails Exception: If embedding generation still fails after all retries
""" """
# Use settings when a value is not provided; compare against None so an explicit 0 delay or 0 retries is respected
batch_size = batch_size or settings.EMBEDDING_BATCH_SIZE
delay_between_batches = settings.EMBEDDING_DELAY_BETWEEN_BATCHES if delay_between_batches is None else delay_between_batches
max_retries = settings.EMBEDDING_MAX_RETRIES if max_retries is None else max_retries
try: try:
embeddings = [] embeddings = []
total_batches = (len(texts) - 1) // batch_size + 1
logger.info(f"Iniciando generación de embeddings: {len(texts)} textos en {total_batches} batches")
logger.info(f"Configuración: batch_size={batch_size}, delay={delay_between_batches}s, max_retries={max_retries}")
for i in range(0, len(texts), batch_size): for i in range(0, len(texts), batch_size):
batch = texts[i:i + batch_size] batch = texts[i:i + batch_size]
logger.info(f"Procesando lote {i//batch_size + 1}/{(len(texts)-1)//batch_size + 1}") batch_num = i // batch_size + 1
logger.info(f"📊 Procesando batch {batch_num}/{total_batches} ({len(batch)} textos)...")
# Retry with exponential backoff
retry_count = 0
while retry_count <= max_retries:
try:
response = self.client.embeddings.create( response = self.client.embeddings.create(
input=batch, input=batch,
model=self.model model=self.model
@@ -101,8 +122,32 @@ class EmbeddingService:
) )
embeddings.extend(batch_embeddings) embeddings.extend(batch_embeddings)
logger.info(f"✓ Batch {batch_num}/{total_batches} completado exitosamente")
break # Success, exit the retry loop
logger.info(f"Generados {len(embeddings)} embeddings exitosamente") except RateLimitError as e:
retry_count += 1
if retry_count > max_retries:
logger.error(f"❌ Rate limit excedido después de {max_retries} reintentos")
raise
# Exponential backoff: 2^retry_count seconds
wait_time = 2 ** retry_count
logger.warning(
f"⚠️ Rate limit alcanzado en batch {batch_num}/{total_batches}. "
f"Reintento {retry_count}/{max_retries} en {wait_time}s..."
)
await asyncio.sleep(wait_time)
except Exception as e:
logger.error(f"❌ Error en batch {batch_num}/{total_batches}: {e}")
raise
# Delay between batches to respect the rate limit (except after the last one)
if i + batch_size < len(texts):
await asyncio.sleep(delay_between_batches)
logger.info(f"✅ Embeddings generados exitosamente: {len(embeddings)} vectores de {self.embedding_dimension}D")
return embeddings return embeddings
except Exception as e: except Exception as e:
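The retry loop added to `generate_embeddings_batch` follows a common exponential-backoff pattern, which can be isolated as a small helper. This is a sketch under assumptions: `RateLimited` is a hypothetical stand-in for `openai.RateLimitError`, `with_backoff` and `base_delay` are illustrative names, and the wait of `base_delay * 2 ** retry` matches the diff's `2 ** retry_count` seconds only when `base_delay` is 1.0.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for openai.RateLimitError."""


async def with_backoff(
    call: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await `call`, retrying on RateLimited with exponentially growing waits."""
    retry = 0
    while True:
        try:
            return await call()
        except RateLimited:
            retry += 1
            if retry > max_retries:
                raise  # retries exhausted; surface the rate-limit error
            # Waits grow as 2, 4, 8, ... times base_delay.
            await asyncio.sleep(base_delay * 2 ** retry)
```

Any other exception propagates immediately, mirroring the diff's separate `except Exception` branch that logs and re-raises without retrying.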


@@ -0,0 +1,131 @@
"""
Service for storing extracted data in Redis.
"""
import logging
from datetime import datetime
from typing import Dict, Any, List, Optional
from ..models.extracted_data import ExtractedDocument
logger = logging.getLogger(__name__)
class ExtractedDataService:
"""Service for saving and retrieving data extracted from documents"""
async def save_extracted_data(
self,
file_name: str,
tema: str,
collection_name: str,
extracted_data: Dict[str, Any]
) -> ExtractedDocument:
"""
Saves the data extracted from a document to Redis.
Args:
file_name: Name of the file
tema: Topic of the document
collection_name: Qdrant collection
extracted_data: Extracted data (dict)
Returns:
The saved ExtractedDocument
"""
try:
# Create the model instance
doc = ExtractedDocument(
file_name=file_name,
tema=tema,
collection_name=collection_name,
extracted_data_json="", # Set afterwards
extraction_timestamp=datetime.utcnow().isoformat()
)
# Serialize the extracted data
doc.set_extracted_data(extracted_data)
# Save to Redis
doc.save()
logger.info(
f"💾 Datos extraídos guardados en Redis: {file_name} "
f"({len(extracted_data)} campos)"
)
return doc
except Exception as e:
logger.error(f"Error guardando datos extraídos en Redis: {e}")
raise
async def get_by_file(self, file_name: str) -> List[ExtractedDocument]:
"""
Returns all extracted documents for a file.
Args:
file_name: Name of the file
Returns:
List of ExtractedDocument
"""
try:
docs = ExtractedDocument.find_by_file(file_name)
logger.info(f"Encontrados {len(docs)} documentos extraídos para {file_name}")
return docs
except Exception as e:
logger.error(f"Error buscando documentos por archivo: {e}")
return []
async def get_by_tema(self, tema: str) -> List[ExtractedDocument]:
"""
Returns all extracted documents for a topic.
Args:
tema: Topic to search for
Returns:
List of ExtractedDocument
"""
try:
docs = ExtractedDocument.find_by_tema(tema)
logger.info(f"Encontrados {len(docs)} documentos extraídos para tema {tema}")
return docs
except Exception as e:
logger.error(f"Error buscando documentos por tema: {e}")
return []
async def get_by_collection(self, collection_name: str) -> List[ExtractedDocument]:
"""
Returns all documents in a collection.
Args:
collection_name: Name of the collection
Returns:
List of ExtractedDocument
"""
try:
docs = ExtractedDocument.find_by_collection(collection_name)
logger.info(f"Encontrados {len(docs)} documentos en colección {collection_name}")
return docs
except Exception as e:
logger.error(f"Error buscando documentos por colección: {e}")
return []
# Global singleton instance
_extracted_data_service: Optional[ExtractedDataService] = None
def get_extracted_data_service() -> ExtractedDataService:
"""
Returns the singleton instance of the service.
Returns:
ExtractedDataService instance
"""
global _extracted_data_service
if _extracted_data_service is None:
_extracted_data_service = ExtractedDataService()
return _extracted_data_service
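The lazy module-level singleton above can also be written with `functools.lru_cache`, which removes the `global` statement. This is only an alternative sketch, not what the changeset does; `ExtractedDataServiceStub` and `get_service` are hypothetical names used so the example is self-contained.

```python
from functools import lru_cache


class ExtractedDataServiceStub:
    """Hypothetical stand-in for ExtractedDataService."""


@lru_cache(maxsize=None)
def get_service() -> ExtractedDataServiceStub:
    # The first call constructs the instance; later calls return the cached one.
    return ExtractedDataServiceStub()
```

Both forms hand every caller the same instance; the cached version additionally exposes `get_service.cache_clear()`, which can be convenient for resetting state between tests.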

File diff suppressed because one or more lines are too long


@@ -0,0 +1,776 @@
{
"schema_id": "schema_103b7090a542",
"schema_name": "Form 990-PF Data Extraction",
"description": "Comprehensive data extraction schema for IRS Form 990-PF (Private Foundation) including financial, governance, and operational information",
"fields": [
{
"name": "ein",
"type": "string",
"description": "Federal Employer Identification Number of the organization",
"required": true,
"min_value": null,
"max_value": null,
"pattern": "^\\d{2}-\\d{7}$"
},
{
"name": "calendar_year",
"type": "integer",
"description": "Calendar year for which the data is reported",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "legal_name",
"type": "string",
"description": "Official registered name of the organization",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "phone_number",
"type": "string",
"description": "Primary contact phone number",
"required": true,
"min_value": null,
"max_value": null,
"pattern": "^\\([0-9]{3}\\) [0-9]{3}-[0-9]{4}$"
},
{
"name": "website_url",
"type": "string",
"description": "Organization's website address",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "return_type",
"type": "string",
"description": "Type of IRS return filed (990-PF for private foundations)",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "amended_return",
"type": "string",
"description": "Indicates if this is an amended return (Yes/No)",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "group_exemption_number",
"type": "string",
"description": "IRS group exemption number, if applicable",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "subsection_code",
"type": "string",
"description": "IRS subsection code (typically 501(c)(3) for foundations)",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "ruling_date",
"type": "string",
"description": "Date of IRS ruling or determination letter",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "accounting_method",
"type": "string",
"description": "Accounting method used (Cash, Accrual, or Other)",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "organization_type",
"type": "string",
"description": "Legal structure (corporation, trust, association, etc.)",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "year_of_formation",
"type": "string",
"description": "Year the organization was established",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "incorporation_state",
"type": "string",
"description": "State where the organization was incorporated",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "total_revenue",
"type": "float",
"description": "Sum of all revenue sources for the year",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "contributions_gifts_grants",
"type": "float",
"description": "Revenue from donations, contributions, and grants",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "program_service_revenue",
"type": "float",
"description": "Revenue generated from program services",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "membership_dues",
"type": "float",
"description": "Revenue from membership dues and assessments",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "investment_income",
"type": "float",
"description": "Income from interest, dividends, and other investments",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "gains_losses_sales_assets",
"type": "float",
"description": "Net gains or losses from sale of investments and assets",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "rental_income",
"type": "float",
"description": "Income from rental of real estate or equipment",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "related_organizations_revenue",
"type": "float",
"description": "Revenue received from related organizations",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "gaming_revenue",
"type": "float",
"description": "Revenue from gaming and gambling activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "other_revenue",
"type": "float",
"description": "All other revenue not categorized elsewhere",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "government_grants",
"type": "float",
"description": "Revenue from federal, state, and local government grants",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "foreign_contributions",
"type": "float",
"description": "Revenue from foreign sources and contributors",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "total_expenses",
"type": "float",
"description": "Sum of all organizational expenses for the year",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "program_services_expenses",
"type": "float",
"description": "Direct expenses for charitable program activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "management_general_expenses",
"type": "float",
"description": "Administrative and general operating expenses",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "fundraising_expenses",
"type": "float",
"description": "Expenses related to fundraising activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "grants_us_organizations",
"type": "float",
"description": "Grants and assistance provided to domestic organizations",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "grants_us_individuals",
"type": "float",
"description": "Grants and assistance provided to domestic individuals",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "grants_foreign_organizations",
"type": "float",
"description": "Grants and assistance provided to foreign organizations",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "grants_foreign_individuals",
"type": "float",
"description": "Grants and assistance provided to foreign individuals",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "compensation_officers",
"type": "float",
"description": "Total compensation paid to officers and key employees",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "compensation_other_staff",
"type": "float",
"description": "Compensation paid to other employees",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "payroll_taxes_benefits",
"type": "float",
"description": "Payroll taxes, pension plans, and employee benefits",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "professional_fees",
"type": "float",
"description": "Legal, accounting, and other professional service fees",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "office_occupancy_costs",
"type": "float",
"description": "Rent, utilities, and facility-related expenses",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "information_technology_costs",
"type": "float",
"description": "IT equipment, software, and technology expenses",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "travel_conference_expenses",
"type": "float",
"description": "Travel, conferences, conventions, and meetings",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "depreciation_amortization",
"type": "float",
"description": "Depreciation of equipment and amortization of intangibles",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "insurance",
"type": "float",
"description": "Insurance premiums and related costs",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "officers_list",
"type": "array_string",
"description": "JSON array of officers, directors, trustees, and key employees with their details",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "governing_body_size",
"type": "integer",
"description": "Total number of voting members on the governing body",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "independent_members",
"type": "integer",
"description": "Number of independent voting members",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "financial_statements_reviewed",
"type": "string",
"description": "Whether financial statements were reviewed or audited",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "form_990_provided_to_governing_body",
"type": "string",
"description": "Whether Form 990 was provided to governing body before filing",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "conflict_of_interest_policy",
"type": "string",
"description": "Whether organization has a conflict of interest policy",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "whistleblower_policy",
"type": "string",
"description": "Whether organization has a whistleblower policy",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "document_retention_policy",
"type": "string",
"description": "Whether organization has a document retention and destruction policy",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "ceo_compensation_review_process",
"type": "string",
"description": "Process used to determine compensation of organization's top management",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "public_disclosure_practices",
"type": "string",
"description": "How organization makes its governing documents and annual returns available to the public",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "program_accomplishments_list",
"type": "array_string",
"description": "JSON array of program service accomplishments with descriptions and financial details",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "total_fundraising_event_revenue",
"type": "float",
"description": "Total revenue from all fundraising events",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "total_fundraising_event_expenses",
"type": "float",
"description": "Total direct expenses for all fundraising events",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "professional_fundraiser_fees",
"type": "float",
"description": "Fees paid to professional fundraising services",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "number_of_employees",
"type": "integer",
"description": "Total number of employees during the year",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "number_of_volunteers",
"type": "integer",
"description": "Estimate of volunteers who provided services",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "occupancy_costs",
"type": "float",
"description": "Total costs for office space and facilities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "fundraising_method_descriptions",
"type": "string",
"description": "Description of methods used for fundraising",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "joint_ventures_disregarded_entities",
"type": "string",
"description": "Information about joint ventures and disregarded entities",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "base_compensation",
"type": "float",
"description": "Base salary or wages paid to key personnel",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "bonus",
"type": "float",
"description": "Bonus and incentive compensation paid",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "incentive",
"type": "float",
"description": "Other incentive compensation",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "other_compensation",
"type": "float",
"description": "Other forms of compensation",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "non_fixed_compensation",
"type": "string",
"description": "Whether compensation arrangement is non-fixed",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "first_class_travel",
"type": "string",
"description": "Whether first-class or charter travel was provided",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "housing_allowance",
"type": "string",
"description": "Whether housing allowance or residence was provided",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "expense_account_usage",
"type": "string",
"description": "Whether payments for business use of personal residence were made",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "supplemental_retirement",
"type": "string",
"description": "Whether supplemental nonqualified retirement plan was provided",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "lobbying_expenditures_direct",
"type": "float",
"description": "Amount spent on direct lobbying activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "lobbying_expenditures_grassroots",
"type": "float",
"description": "Amount spent on grassroots lobbying activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "election_501h_status",
"type": "string",
"description": "Whether the organization made a Section 501(h) election",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "political_campaign_expenditures",
"type": "float",
"description": "Amount spent on political campaign activities",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "related_organizations_affiliates",
"type": "string",
"description": "Information about related organizations involved in political activities",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "investment_types",
"type": "string",
"description": "Description of types of investments held",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "donor_restricted_endowment_values",
"type": "float",
"description": "Value of permanently restricted endowment funds",
"required": true,
"min_value": 0,
"max_value": null,
"pattern": null
},
{
"name": "net_appreciation_depreciation",
"type": "float",
"description": "Net appreciation or depreciation in fair value of investments",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "related_organization_transactions",
"type": "string",
"description": "Information about transactions with related organizations",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "loans_to_from_related_parties",
"type": "string",
"description": "Information about loans to or from related parties",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "penalties_excise_taxes_reported",
"type": "string",
"description": "Whether the organization reported any penalties or excise taxes",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "unrelated_business_income_disclosure",
"type": "string",
"description": "Whether the organization had unrelated business income",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "foreign_bank_account_reporting",
"type": "string",
"description": "Whether the organization had foreign bank accounts or assets",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
},
{
"name": "schedule_o_narrative_explanations",
"type": "string",
"description": "Additional narrative explanations from Schedule O",
"required": true,
"min_value": null,
"max_value": null,
"pattern": null
}
],
"created_at": "2025-11-07T23:45:00.000000",
"updated_at": "2025-11-07T23:45:00.000000",
"tema": "IRS_FORM_990PF",
"is_global": true
}
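
The field definitions above all share one shape: `name`, `type`, `required`, `min_value`, `max_value`, and `pattern`. Below is a minimal sketch of how a list of such definitions could be checked against an extracted record. It is an illustration, not the project's validator: the function name `validate_record`, the error messages, and the mapping from the `type` strings to Python types are assumptions. In particular, `array_string` is assumed to accept either a list or a JSON-encoded string.

```python
import re
from typing import Any

# Assumed mapping from the schema's "type" strings to Python types.
# "array_string" accepting a list or a JSON-encoded string is an assumption.
TYPE_CHECKS: dict[str, tuple[type, ...]] = {
    "float": (int, float),
    "integer": (int,),
    "string": (str,),
    "array_string": (list, str),
}


def validate_record(fields: list[dict[str, Any]], record: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the record passed."""
    errors: list[str] = []
    for field in fields:
        name = field["name"]
        value = record.get(name)
        if value is None:
            # A missing or null value only matters for required fields.
            if field.get("required"):
                errors.append(f"{name}: missing required value")
            continue
        expected = TYPE_CHECKS.get(field["type"])
        if expected is None:
            errors.append(f"{name}: unknown type {field['type']!r}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {field['type']}")
            continue
        # Range bounds are only applied to numeric values.
        if isinstance(value, (int, float)):
            lo, hi = field.get("min_value"), field.get("max_value")
            if lo is not None and value < lo:
                errors.append(f"{name}: {value} is below min_value {lo}")
            if hi is not None and value > hi:
                errors.append(f"{name}: {value} is above max_value {hi}")
        # Patterns are only applied to string values.
        pattern = field.get("pattern")
        if pattern is not None and isinstance(value, str):
            if re.fullmatch(pattern, value) is None:
                errors.append(f"{name}: does not match pattern {pattern!r}")
    return errors
```

Collecting every problem in a list, rather than raising on the first one, lets a caller report all failing fields of a record at once.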

@@ -28,8 +28,14 @@ dependencies = [
 # LandingAI Document AI
 "landingai-ade>=0.2.1",
 "redis-om>=0.3.5",
-"pydantic-ai-slim[google,openai]>=1.11.1",
+"pydantic-ai-slim[google,openai,mcp]>=1.11.1",
+"tavily-python>=0.5.0",
 ]
 [project.scripts]
 dev = "uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload"
 start = "uvicorn app.main:app --host 0.0.0.0 --port 8000"
+[dependency-groups]
+dev = [
+"ruff>=0.14.4",
+]
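
The change to `pydantic-ai-slim` in this hunk affects only the extras list inside the square brackets; the version bound stays the same. Below is a small sketch that splits such requirement strings into name, extras, and version specifier, so the difference can be computed. The helper `split_requirement` is hypothetical and handles only the simple `name[extras]specifier` form seen here, not the full dependency-specifier grammar. For real use, `packaging.requirements.Requirement` is the more robust choice.

```python
import re

# Simplified pattern: name, optional [extras], then whatever version specifier follows.
_REQ = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"(?:\[(?P<extras>[^\]]*)\])?"
    r"(?P<spec>.*)$"
)


def split_requirement(req: str) -> tuple[str, set[str], str]:
    """Split 'name[extra1,extra2]>=1.0' into (name, extras, specifier)."""
    match = _REQ.match(req.strip())
    if match is None:
        raise ValueError(f"unsupported requirement string: {req!r}")
    raw_extras = match.group("extras") or ""
    extras = {part.strip() for part in raw_extras.split(",") if part.strip()}
    return match.group("name"), extras, match.group("spec").strip()


old = split_requirement("pydantic-ai-slim[google,openai]>=1.11.1")
new = split_requirement("pydantic-ai-slim[google,openai,mcp]>=1.11.1")
added_extras = new[1] - old[1]  # extras present only in the new requirement
print(sorted(added_extras))
```

The newly added extra is what pulls in the additional `mcp` package recorded in the `uv.lock` changes.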

239
backend/uv.lock generated
View File

@@ -1,5 +1,5 @@
 version = 1
-revision = 3
+revision = 2
 requires-python = ">=3.12"
 resolution-markers = [
 "python_full_version >= '3.14'",
@@ -30,6 +30,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/6f/12/e5e0282d673bb9746bacfb6e2dba8719989d3660cdb2ea79aee9a9651afb/anyio-4.10.0-py3-none-any.whl", hash = "sha256:60e474ac86736bbfd6f210f7a61218939c318f43f9972497381f1c5e930ed3d1", size = 107213, upload-time = "2025-08-04T08:54:24.882Z" }, { url = "https://files.pythonhosted.org/packages/6f/12/e5e0282d673bb9746bacfb6e2dba8719989d3660cdb2ea79aee9a9651afb/anyio-4.10.0-py3-none-any.whl", hash = "sha256:60e474ac86736bbfd6f210f7a61218939c318f43f9972497381f1c5e930ed3d1", size = 107213, upload-time = "2025-08-04T08:54:24.882Z" },
] ]
[[package]]
name = "attrs"
version = "25.4.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/6b/5c/685e6633917e101e5dcb62b9dd76946cbb57c26e133bae9e0cd36033c0a9/attrs-25.4.0.tar.gz", hash = "sha256:16d5969b87f0859ef33a48b35d55ac1be6e42ae49d5e853b597db70c35c57e11", size = 934251, upload-time = "2025-10-06T13:54:44.725Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/3a/2a/7cc015f5b9f5db42b7d48157e23356022889fc354a2813c15934b7cb5c0e/attrs-25.4.0-py3-none-any.whl", hash = "sha256:adcf7e2a1fb3b36ac48d97835bb6d8ade15b8dcce26aba8bf1d14847b57a3373", size = 67615, upload-time = "2025-10-06T13:54:43.17Z" },
]
[[package]] [[package]]
name = "azure-core" name = "azure-core"
version = "1.35.0" version = "1.35.0"
@@ -74,18 +83,24 @@ dependencies = [
 { name = "openai" },
 { name = "pdf2image" },
 { name = "pillow" },
-{ name = "pydantic-ai-slim", extra = ["google", "openai"] },
+{ name = "pydantic-ai-slim", extra = ["google", "mcp", "openai"] },
 { name = "pydantic-settings" },
 { name = "pypdf" },
 { name = "python-dotenv" },
 { name = "python-multipart" },
 { name = "qdrant-client" },
 { name = "redis-om" },
+{ name = "tavily-python" },
 { name = "tiktoken" },
 { name = "uvicorn", extra = ["standard"] },
 { name = "websockets" },
 ]
+[package.dev-dependencies]
+dev = [
+{ name = "ruff" },
+]
 [package.metadata]
 requires-dist = [
 { name = "azure-storage-blob", specifier = ">=12.26.0" },
[package.metadata] [package.metadata]
requires-dist = [ requires-dist = [
{ name = "azure-storage-blob", specifier = ">=12.26.0" }, { name = "azure-storage-blob", specifier = ">=12.26.0" },
@@ -98,18 +113,22 @@ requires-dist = [
 { name = "openai", specifier = ">=1.59.6" },
 { name = "pdf2image", specifier = ">=1.17.0" },
 { name = "pillow", specifier = ">=11.0.0" },
-{ name = "pydantic-ai-slim", extras = ["google", "openai"], specifier = ">=1.11.1" },
+{ name = "pydantic-ai-slim", extras = ["google", "openai", "mcp"], specifier = ">=1.11.1" },
 { name = "pydantic-settings", specifier = ">=2.10.1" },
 { name = "pypdf", specifier = ">=5.1.0" },
 { name = "python-dotenv", specifier = ">=1.1.1" },
 { name = "python-multipart", specifier = ">=0.0.20" },
 { name = "qdrant-client", specifier = ">=1.15.1" },
 { name = "redis-om", specifier = ">=0.3.5" },
+{ name = "tavily-python", specifier = ">=0.5.0" },
 { name = "tiktoken", specifier = ">=0.8.0" },
 { name = "uvicorn", extras = ["standard"], specifier = ">=0.35.0" },
 { name = "websockets", specifier = ">=14.1" },
 ]
+[package.metadata.requires-dev]
+dev = [{ name = "ruff", specifier = ">=0.14.4" }]
 [[package]]
 name = "cachetools"
 version = "6.2.1"
[[package]] [[package]]
name = "cachetools" name = "cachetools"
version = "6.2.1" version = "6.2.1"
@@ -785,6 +804,15 @@ http2 = [
{ name = "h2" }, { name = "h2" },
] ]
[[package]]
name = "httpx-sse"
version = "0.4.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/0f/4c/751061ffa58615a32c31b2d82e8482be8dd4a89154f003147acee90f2be9/httpx_sse-0.4.3.tar.gz", hash = "sha256:9b1ed0127459a66014aec3c56bebd93da3c1bc8bb6618c8082039a44889a755d", size = 15943, upload-time = "2025-10-10T21:48:22.271Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/d2/fd/6668e5aec43ab844de6fc74927e155a3b37bf40d7c3790e49fc0406b6578/httpx_sse-0.4.3-py3-none-any.whl", hash = "sha256:0ac1c9fe3c0afad2e0ebb25a934a59f4c7823b60792691f779fad2c5568830fc", size = 8960, upload-time = "2025-10-10T21:48:21.158Z" },
]
[[package]] [[package]]
name = "hyperframe" name = "hyperframe"
version = "6.1.0" version = "6.1.0"
@@ -913,6 +941,33 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/71/92/5e77f98553e9e75130c78900d000368476aed74276eb8ae8796f65f00918/jsonpointer-3.0.0-py2.py3-none-any.whl", hash = "sha256:13e088adc14fca8b6aa8177c044e12701e6ad4b28ff10e65f2267a90109c9942", size = 7595, upload-time = "2024-06-10T19:24:40.698Z" }, { url = "https://files.pythonhosted.org/packages/71/92/5e77f98553e9e75130c78900d000368476aed74276eb8ae8796f65f00918/jsonpointer-3.0.0-py2.py3-none-any.whl", hash = "sha256:13e088adc14fca8b6aa8177c044e12701e6ad4b28ff10e65f2267a90109c9942", size = 7595, upload-time = "2024-06-10T19:24:40.698Z" },
] ]
[[package]]
name = "jsonschema"
version = "4.25.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "attrs" },
{ name = "jsonschema-specifications" },
{ name = "referencing" },
{ name = "rpds-py" },
]
sdist = { url = "https://files.pythonhosted.org/packages/74/69/f7185de793a29082a9f3c7728268ffb31cb5095131a9c139a74078e27336/jsonschema-4.25.1.tar.gz", hash = "sha256:e4a9655ce0da0c0b67a085847e00a3a51449e1157f4f75e9fb5aa545e122eb85", size = 357342, upload-time = "2025-08-18T17:03:50.038Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/bf/9c/8c95d856233c1f82500c2450b8c68576b4cf1c871db3afac5c34ff84e6fd/jsonschema-4.25.1-py3-none-any.whl", hash = "sha256:3fba0169e345c7175110351d456342c364814cfcf3b964ba4587f22915230a63", size = 90040, upload-time = "2025-08-18T17:03:48.373Z" },
]
[[package]]
name = "jsonschema-specifications"
version = "2025.9.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "referencing" },
]
sdist = { url = "https://files.pythonhosted.org/packages/19/74/a633ee74eb36c44aa6d1095e7cc5569bebf04342ee146178e2d36600708b/jsonschema_specifications-2025.9.1.tar.gz", hash = "sha256:b540987f239e745613c7a9176f3edb72b832a4ac465cf02712288397832b5e8d", size = 32855, upload-time = "2025-09-08T01:34:59.186Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/41/45/1a4ed80516f02155c51f51e8cedb3c1902296743db0bbc66608a0db2814f/jsonschema_specifications-2025.9.1-py3-none-any.whl", hash = "sha256:98802fee3a11ee76ecaca44429fda8a41bff98b00a0f2838151b113f210cc6fe", size = 18437, upload-time = "2025-09-08T01:34:57.871Z" },
]
[[package]] [[package]]
name = "landingai-ade" name = "landingai-ade"
version = "0.20.3" version = "0.20.3"
@@ -1057,6 +1112,29 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/58/c7/b06a83df678fca882c24fb498e628e0406bdb95ffdfa7ae43ecc0a714d52/logfire_api-4.14.2-py3-none-any.whl", hash = "sha256:aa4af2ecb007c3e0095e25ba4526fd8c0e2c0be2ceceac71ca651c4ad86dc713", size = 95021, upload-time = "2025-10-24T20:14:36.161Z" }, { url = "https://files.pythonhosted.org/packages/58/c7/b06a83df678fca882c24fb498e628e0406bdb95ffdfa7ae43ecc0a714d52/logfire_api-4.14.2-py3-none-any.whl", hash = "sha256:aa4af2ecb007c3e0095e25ba4526fd8c0e2c0be2ceceac71ca651c4ad86dc713", size = 95021, upload-time = "2025-10-24T20:14:36.161Z" },
] ]
[[package]]
name = "mcp"
version = "1.21.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
{ name = "httpx" },
{ name = "httpx-sse" },
{ name = "jsonschema" },
{ name = "pydantic" },
{ name = "pydantic-settings" },
{ name = "pyjwt", extra = ["crypto"] },
{ name = "python-multipart" },
{ name = "pywin32", marker = "sys_platform == 'win32'" },
{ name = "sse-starlette" },
{ name = "starlette" },
{ name = "uvicorn", marker = "sys_platform != 'emscripten'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/33/54/dd2330ef4611c27ae59124820863c34e1d3edb1133c58e6375e2d938c9c5/mcp-1.21.0.tar.gz", hash = "sha256:bab0a38e8f8c48080d787233343f8d301b0e1e95846ae7dead251b2421d99855", size = 452697, upload-time = "2025-11-06T23:19:58.432Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/39/47/850b6edc96c03bd44b00de9a0ca3c1cc71e0ba1cd5822955bc9e4eb3fad3/mcp-1.21.0-py3-none-any.whl", hash = "sha256:598619e53eb0b7a6513db38c426b28a4bdf57496fed04332100d2c56acade98b", size = 173672, upload-time = "2025-11-06T23:19:56.508Z" },
]
[[package]] [[package]]
name = "more-itertools" name = "more-itertools"
version = "10.8.0" version = "10.8.0"
@@ -1447,6 +1525,9 @@ wheels = [
google = [ google = [
{ name = "google-genai" }, { name = "google-genai" },
] ]
mcp = [
{ name = "mcp" },
]
openai = [ openai = [
{ name = "openai" }, { name = "openai" },
] ]
@@ -1531,6 +1612,11 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/61/ad/689f02752eeec26aed679477e80e632ef1b682313be70793d798c1d5fc8f/PyJWT-2.10.1-py3-none-any.whl", hash = "sha256:dcdd193e30abefd5debf142f9adfcdd2b58004e644f25406ffaebd50bd98dacb", size = 22997, upload-time = "2024-11-28T03:43:27.893Z" }, { url = "https://files.pythonhosted.org/packages/61/ad/689f02752eeec26aed679477e80e632ef1b682313be70793d798c1d5fc8f/PyJWT-2.10.1-py3-none-any.whl", hash = "sha256:dcdd193e30abefd5debf142f9adfcdd2b58004e644f25406ffaebd50bd98dacb", size = 22997, upload-time = "2024-11-28T03:43:27.893Z" },
] ]
[package.optional-dependencies]
crypto = [
{ name = "cryptography" },
]
[[package]] [[package]]
name = "pypdf" name = "pypdf"
version = "6.1.3" version = "6.1.3"
@@ -1672,6 +1758,20 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e4/60/2cc6753c2c36a2a5dded8c380c6cad67a26c5878cd7aad56de2eee1d63c8/redis_om-0.3.5-py3-none-any.whl", hash = "sha256:99ab40f696028ce47c5e2eb5118a1ffc1fd193005428df89c8cf77ad35a0177a", size = 86634, upload-time = "2025-04-04T12:54:50.07Z" }, { url = "https://files.pythonhosted.org/packages/e4/60/2cc6753c2c36a2a5dded8c380c6cad67a26c5878cd7aad56de2eee1d63c8/redis_om-0.3.5-py3-none-any.whl", hash = "sha256:99ab40f696028ce47c5e2eb5118a1ffc1fd193005428df89c8cf77ad35a0177a", size = 86634, upload-time = "2025-04-04T12:54:50.07Z" },
] ]
[[package]]
name = "referencing"
version = "0.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "attrs" },
{ name = "rpds-py" },
{ name = "typing-extensions", marker = "python_full_version < '3.13'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/22/f5/df4e9027acead3ecc63e50fe1e36aca1523e1719559c499951bb4b53188f/referencing-0.37.0.tar.gz", hash = "sha256:44aefc3142c5b842538163acb373e24cce6632bd54bdb01b21ad5863489f50d8", size = 78036, upload-time = "2025-10-13T15:30:48.871Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2c/58/ca301544e1fa93ed4f80d724bf5b194f6e4b945841c5bfd555878eea9fcb/referencing-0.37.0-py3-none-any.whl", hash = "sha256:381329a9f99628c9069361716891d34ad94af76e461dcb0335825aecc7692231", size = 26766, upload-time = "2025-10-13T15:30:47.625Z" },
]
[[package]] [[package]]
name = "regex" name = "regex"
version = "2025.11.3" version = "2025.11.3"
@@ -1777,6 +1877,87 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/3f/51/d4db610ef29373b879047326cbf6fa98b6c1969d6f6dc423279de2b1be2c/requests_toolbelt-1.0.0-py2.py3-none-any.whl", hash = "sha256:cccfdd665f0a24fcf4726e690f65639d272bb0637b9b92dfd91a5568ccf6bd06", size = 54481, upload-time = "2023-05-01T04:11:28.427Z" }, { url = "https://files.pythonhosted.org/packages/3f/51/d4db610ef29373b879047326cbf6fa98b6c1969d6f6dc423279de2b1be2c/requests_toolbelt-1.0.0-py2.py3-none-any.whl", hash = "sha256:cccfdd665f0a24fcf4726e690f65639d272bb0637b9b92dfd91a5568ccf6bd06", size = 54481, upload-time = "2023-05-01T04:11:28.427Z" },
] ]
[[package]]
name = "rpds-py"
version = "0.28.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/48/dc/95f074d43452b3ef5d06276696ece4b3b5d696e7c9ad7173c54b1390cd70/rpds_py-0.28.0.tar.gz", hash = "sha256:abd4df20485a0983e2ca334a216249b6186d6e3c1627e106651943dbdb791aea", size = 27419, upload-time = "2025-10-22T22:24:29.327Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b8/5c/6c3936495003875fe7b14f90ea812841a08fca50ab26bd840e924097d9c8/rpds_py-0.28.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:6b4f28583a4f247ff60cd7bdda83db8c3f5b05a7a82ff20dd4b078571747708f", size = 366439, upload-time = "2025-10-22T22:22:04.525Z" },
{ url = "https://files.pythonhosted.org/packages/56/f9/a0f1ca194c50aa29895b442771f036a25b6c41a35e4f35b1a0ea713bedae/rpds_py-0.28.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:d678e91b610c29c4b3d52a2c148b641df2b4676ffe47c59f6388d58b99cdc424", size = 348170, upload-time = "2025-10-22T22:22:06.397Z" },
{ url = "https://files.pythonhosted.org/packages/18/ea/42d243d3a586beb72c77fa5def0487daf827210069a95f36328e869599ea/rpds_py-0.28.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e819e0e37a44a78e1383bf1970076e2ccc4dc8c2bbaa2f9bd1dc987e9afff628", size = 378838, upload-time = "2025-10-22T22:22:07.932Z" },
{ url = "https://files.pythonhosted.org/packages/e7/78/3de32e18a94791af8f33601402d9d4f39613136398658412a4e0b3047327/rpds_py-0.28.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:5ee514e0f0523db5d3fb171f397c54875dbbd69760a414dccf9d4d7ad628b5bd", size = 393299, upload-time = "2025-10-22T22:22:09.435Z" },
{ url = "https://files.pythonhosted.org/packages/13/7e/4bdb435afb18acea2eb8a25ad56b956f28de7c59f8a1d32827effa0d4514/rpds_py-0.28.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5f3fa06d27fdcee47f07a39e02862da0100cb4982508f5ead53ec533cd5fe55e", size = 518000, upload-time = "2025-10-22T22:22:11.326Z" },
{ url = "https://files.pythonhosted.org/packages/31/d0/5f52a656875cdc60498ab035a7a0ac8f399890cc1ee73ebd567bac4e39ae/rpds_py-0.28.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:46959ef2e64f9e4a41fc89aa20dbca2b85531f9a72c21099a3360f35d10b0d5a", size = 408746, upload-time = "2025-10-22T22:22:13.143Z" },
{ url = "https://files.pythonhosted.org/packages/3e/cd/49ce51767b879cde77e7ad9fae164ea15dce3616fe591d9ea1df51152706/rpds_py-0.28.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8455933b4bcd6e83fde3fefc987a023389c4b13f9a58c8d23e4b3f6d13f78c84", size = 386379, upload-time = "2025-10-22T22:22:14.602Z" },
{ url = "https://files.pythonhosted.org/packages/6a/99/e4e1e1ee93a98f72fc450e36c0e4d99c35370220e815288e3ecd2ec36a2a/rpds_py-0.28.0-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:ad50614a02c8c2962feebe6012b52f9802deec4263946cddea37aaf28dd25a66", size = 401280, upload-time = "2025-10-22T22:22:16.063Z" },
{ url = "https://files.pythonhosted.org/packages/61/35/e0c6a57488392a8b319d2200d03dad2b29c0db9996f5662c3b02d0b86c02/rpds_py-0.28.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:e5deca01b271492553fdb6c7fd974659dce736a15bae5dad7ab8b93555bceb28", size = 412365, upload-time = "2025-10-22T22:22:17.504Z" },
{ url = "https://files.pythonhosted.org/packages/ff/6a/841337980ea253ec797eb084665436007a1aad0faac1ba097fb906c5f69c/rpds_py-0.28.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:735f8495a13159ce6a0d533f01e8674cec0c57038c920495f87dcb20b3ddb48a", size = 559573, upload-time = "2025-10-22T22:22:19.108Z" },
{ url = "https://files.pythonhosted.org/packages/e7/5e/64826ec58afd4c489731f8b00729c5f6afdb86f1df1df60bfede55d650bb/rpds_py-0.28.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:961ca621ff10d198bbe6ba4957decca61aa2a0c56695384c1d6b79bf61436df5", size = 583973, upload-time = "2025-10-22T22:22:20.768Z" },
{ url = "https://files.pythonhosted.org/packages/b6/ee/44d024b4843f8386a4eeaa4c171b3d31d55f7177c415545fd1a24c249b5d/rpds_py-0.28.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2374e16cc9131022e7d9a8f8d65d261d9ba55048c78f3b6e017971a4f5e6353c", size = 553800, upload-time = "2025-10-22T22:22:22.25Z" },
{ url = "https://files.pythonhosted.org/packages/7d/89/33e675dccff11a06d4d85dbb4d1865f878d5020cbb69b2c1e7b2d3f82562/rpds_py-0.28.0-cp312-cp312-win32.whl", hash = "sha256:d15431e334fba488b081d47f30f091e5d03c18527c325386091f31718952fe08", size = 216954, upload-time = "2025-10-22T22:22:24.105Z" },
{ url = "https://files.pythonhosted.org/packages/af/36/45f6ebb3210887e8ee6dbf1bc710ae8400bb417ce165aaf3024b8360d999/rpds_py-0.28.0-cp312-cp312-win_amd64.whl", hash = "sha256:a410542d61fc54710f750d3764380b53bf09e8c4edbf2f9141a82aa774a04f7c", size = 227844, upload-time = "2025-10-22T22:22:25.551Z" },
{ url = "https://files.pythonhosted.org/packages/57/91/f3fb250d7e73de71080f9a221d19bd6a1c1eb0d12a1ea26513f6c1052ad6/rpds_py-0.28.0-cp312-cp312-win_arm64.whl", hash = "sha256:1f0cfd1c69e2d14f8c892b893997fa9a60d890a0c8a603e88dca4955f26d1edd", size = 217624, upload-time = "2025-10-22T22:22:26.914Z" },
{ url = "https://files.pythonhosted.org/packages/d3/03/ce566d92611dfac0085c2f4b048cd53ed7c274a5c05974b882a908d540a2/rpds_py-0.28.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:e9e184408a0297086f880556b6168fa927d677716f83d3472ea333b42171ee3b", size = 366235, upload-time = "2025-10-22T22:22:28.397Z" },
{ url = "https://files.pythonhosted.org/packages/00/34/1c61da1b25592b86fd285bd7bd8422f4c9d748a7373b46126f9ae792a004/rpds_py-0.28.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:edd267266a9b0448f33dc465a97cfc5d467594b600fe28e7fa2f36450e03053a", size = 348241, upload-time = "2025-10-22T22:22:30.171Z" },
{ url = "https://files.pythonhosted.org/packages/fc/00/ed1e28616848c61c493a067779633ebf4b569eccaacf9ccbdc0e7cba2b9d/rpds_py-0.28.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:85beb8b3f45e4e32f6802fb6cd6b17f615ef6c6a52f265371fb916fae02814aa", size = 378079, upload-time = "2025-10-22T22:22:31.644Z" },
{ url = "https://files.pythonhosted.org/packages/11/b2/ccb30333a16a470091b6e50289adb4d3ec656fd9951ba8c5e3aaa0746a67/rpds_py-0.28.0-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:d2412be8d00a1b895f8ad827cc2116455196e20ed994bb704bf138fe91a42724", size = 393151, upload-time = "2025-10-22T22:22:33.453Z" },
{ url = "https://files.pythonhosted.org/packages/8c/d0/73e2217c3ee486d555cb84920597480627d8c0240ff3062005c6cc47773e/rpds_py-0.28.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:cf128350d384b777da0e68796afdcebc2e9f63f0e9f242217754e647f6d32491", size = 517520, upload-time = "2025-10-22T22:22:34.949Z" },
{ url = "https://files.pythonhosted.org/packages/c4/91/23efe81c700427d0841a4ae7ea23e305654381831e6029499fe80be8a071/rpds_py-0.28.0-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a2036d09b363aa36695d1cc1a97b36865597f4478470b0697b5ee9403f4fe399", size = 408699, upload-time = "2025-10-22T22:22:36.584Z" },
{ url = "https://files.pythonhosted.org/packages/ca/ee/a324d3198da151820a326c1f988caaa4f37fc27955148a76fff7a2d787a9/rpds_py-0.28.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b8e1e9be4fa6305a16be628959188e4fd5cd6f1b0e724d63c6d8b2a8adf74ea6", size = 385720, upload-time = "2025-10-22T22:22:38.014Z" },
{ url = "https://files.pythonhosted.org/packages/19/ad/e68120dc05af8b7cab4a789fccd8cdcf0fe7e6581461038cc5c164cd97d2/rpds_py-0.28.0-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:0a403460c9dd91a7f23fc3188de6d8977f1d9603a351d5db6cf20aaea95b538d", size = 401096, upload-time = "2025-10-22T22:22:39.869Z" },
{ url = "https://files.pythonhosted.org/packages/99/90/c1e070620042459d60df6356b666bb1f62198a89d68881816a7ed121595a/rpds_py-0.28.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d7366b6553cdc805abcc512b849a519167db8f5e5c3472010cd1228b224265cb", size = 411465, upload-time = "2025-10-22T22:22:41.395Z" },
{ url = "https://files.pythonhosted.org/packages/68/61/7c195b30d57f1b8d5970f600efee72a4fad79ec829057972e13a0370fd24/rpds_py-0.28.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:5b43c6a3726efd50f18d8120ec0551241c38785b68952d240c45ea553912ac41", size = 558832, upload-time = "2025-10-22T22:22:42.871Z" },
{ url = "https://files.pythonhosted.org/packages/b0/3d/06f3a718864773f69941d4deccdf18e5e47dd298b4628062f004c10f3b34/rpds_py-0.28.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:0cb7203c7bc69d7c1585ebb33a2e6074492d2fc21ad28a7b9d40457ac2a51ab7", size = 583230, upload-time = "2025-10-22T22:22:44.877Z" },
{ url = "https://files.pythonhosted.org/packages/66/df/62fc783781a121e77fee9a21ead0a926f1b652280a33f5956a5e7833ed30/rpds_py-0.28.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7a52a5169c664dfb495882adc75c304ae1d50df552fbd68e100fdc719dee4ff9", size = 553268, upload-time = "2025-10-22T22:22:46.441Z" },
{ url = "https://files.pythonhosted.org/packages/84/85/d34366e335140a4837902d3dea89b51f087bd6a63c993ebdff59e93ee61d/rpds_py-0.28.0-cp313-cp313-win32.whl", hash = "sha256:2e42456917b6687215b3e606ab46aa6bca040c77af7df9a08a6dcfe8a4d10ca5", size = 217100, upload-time = "2025-10-22T22:22:48.342Z" },
{ url = "https://files.pythonhosted.org/packages/3c/1c/f25a3f3752ad7601476e3eff395fe075e0f7813fbb9862bd67c82440e880/rpds_py-0.28.0-cp313-cp313-win_amd64.whl", hash = "sha256:e0a0311caedc8069d68fc2bf4c9019b58a2d5ce3cd7cb656c845f1615b577e1e", size = 227759, upload-time = "2025-10-22T22:22:50.219Z" },
{ url = "https://files.pythonhosted.org/packages/e0/d6/5f39b42b99615b5bc2f36ab90423ea404830bdfee1c706820943e9a645eb/rpds_py-0.28.0-cp313-cp313-win_arm64.whl", hash = "sha256:04c1b207ab8b581108801528d59ad80aa83bb170b35b0ddffb29c20e411acdc1", size = 217326, upload-time = "2025-10-22T22:22:51.647Z" },
{ url = "https://files.pythonhosted.org/packages/5c/8b/0c69b72d1cee20a63db534be0df271effe715ef6c744fdf1ff23bb2b0b1c/rpds_py-0.28.0-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:f296ea3054e11fc58ad42e850e8b75c62d9a93a9f981ad04b2e5ae7d2186ff9c", size = 355736, upload-time = "2025-10-22T22:22:53.211Z" },
{ url = "https://files.pythonhosted.org/packages/f7/6d/0c2ee773cfb55c31a8514d2cece856dd299170a49babd50dcffb15ddc749/rpds_py-0.28.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5a7306c19b19005ad98468fcefeb7100b19c79fc23a5f24a12e06d91181193fa", size = 342677, upload-time = "2025-10-22T22:22:54.723Z" },
{ url = "https://files.pythonhosted.org/packages/e2/1c/22513ab25a27ea205144414724743e305e8153e6abe81833b5e678650f5a/rpds_py-0.28.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e5d9b86aa501fed9862a443c5c3116f6ead8bc9296185f369277c42542bd646b", size = 371847, upload-time = "2025-10-22T22:22:56.295Z" },
{ url = "https://files.pythonhosted.org/packages/60/07/68e6ccdb4b05115ffe61d31afc94adef1833d3a72f76c9632d4d90d67954/rpds_py-0.28.0-cp313-cp313t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e5bbc701eff140ba0e872691d573b3d5d30059ea26e5785acba9132d10c8c31d", size = 381800, upload-time = "2025-10-22T22:22:57.808Z" },
{ url = "https://files.pythonhosted.org/packages/73/bf/6d6d15df80781d7f9f368e7c1a00caf764436518c4877fb28b029c4624af/rpds_py-0.28.0-cp313-cp313t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9a5690671cd672a45aa8616d7374fdf334a1b9c04a0cac3c854b1136e92374fe", size = 518827, upload-time = "2025-10-22T22:22:59.826Z" },
{ url = "https://files.pythonhosted.org/packages/7b/d3/2decbb2976cc452cbf12a2b0aaac5f1b9dc5dd9d1f7e2509a3ee00421249/rpds_py-0.28.0-cp313-cp313t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:9f1d92ecea4fa12f978a367c32a5375a1982834649cdb96539dcdc12e609ab1a", size = 399471, upload-time = "2025-10-22T22:23:01.968Z" },
{ url = "https://files.pythonhosted.org/packages/b1/2c/f30892f9e54bd02e5faca3f6a26d6933c51055e67d54818af90abed9748e/rpds_py-0.28.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8d252db6b1a78d0a3928b6190156042d54c93660ce4d98290d7b16b5296fb7cc", size = 377578, upload-time = "2025-10-22T22:23:03.52Z" },
{ url = "https://files.pythonhosted.org/packages/f0/5d/3bce97e5534157318f29ac06bf2d279dae2674ec12f7cb9c12739cee64d8/rpds_py-0.28.0-cp313-cp313t-manylinux_2_31_riscv64.whl", hash = "sha256:d61b355c3275acb825f8777d6c4505f42b5007e357af500939d4a35b19177259", size = 390482, upload-time = "2025-10-22T22:23:05.391Z" },
{ url = "https://files.pythonhosted.org/packages/e3/f0/886bd515ed457b5bd93b166175edb80a0b21a210c10e993392127f1e3931/rpds_py-0.28.0-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:acbe5e8b1026c0c580d0321c8aae4b0a1e1676861d48d6e8c6586625055b606a", size = 402447, upload-time = "2025-10-22T22:23:06.93Z" },
{ url = "https://files.pythonhosted.org/packages/42/b5/71e8777ac55e6af1f4f1c05b47542a1eaa6c33c1cf0d300dca6a1c6e159a/rpds_py-0.28.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:8aa23b6f0fc59b85b4c7d89ba2965af274346f738e8d9fc2455763602e62fd5f", size = 552385, upload-time = "2025-10-22T22:23:08.557Z" },
{ url = "https://files.pythonhosted.org/packages/5d/cb/6ca2d70cbda5a8e36605e7788c4aa3bea7c17d71d213465a5a675079b98d/rpds_py-0.28.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:7b14b0c680286958817c22d76fcbca4800ddacef6f678f3a7c79a1fe7067fe37", size = 575642, upload-time = "2025-10-22T22:23:10.348Z" },
{ url = "https://files.pythonhosted.org/packages/4a/d4/407ad9960ca7856d7b25c96dcbe019270b5ffdd83a561787bc682c797086/rpds_py-0.28.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:bcf1d210dfee61a6c86551d67ee1031899c0fdbae88b2d44a569995d43797712", size = 544507, upload-time = "2025-10-22T22:23:12.434Z" },
{ url = "https://files.pythonhosted.org/packages/51/31/2f46fe0efcac23fbf5797c6b6b7e1c76f7d60773e525cb65fcbc582ee0f2/rpds_py-0.28.0-cp313-cp313t-win32.whl", hash = "sha256:3aa4dc0fdab4a7029ac63959a3ccf4ed605fee048ba67ce89ca3168da34a1342", size = 205376, upload-time = "2025-10-22T22:23:13.979Z" },
{ url = "https://files.pythonhosted.org/packages/92/e4/15947bda33cbedfc134490a41841ab8870a72a867a03d4969d886f6594a2/rpds_py-0.28.0-cp313-cp313t-win_amd64.whl", hash = "sha256:7b7d9d83c942855e4fdcfa75d4f96f6b9e272d42fffcb72cd4bb2577db2e2907", size = 215907, upload-time = "2025-10-22T22:23:15.5Z" },
{ url = "https://files.pythonhosted.org/packages/08/47/ffe8cd7a6a02833b10623bf765fbb57ce977e9a4318ca0e8cf97e9c3d2b3/rpds_py-0.28.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:dcdcb890b3ada98a03f9f2bb108489cdc7580176cb73b4f2d789e9a1dac1d472", size = 353830, upload-time = "2025-10-22T22:23:17.03Z" },
{ url = "https://files.pythonhosted.org/packages/f9/9f/890f36cbd83a58491d0d91ae0db1702639edb33fb48eeb356f80ecc6b000/rpds_py-0.28.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:f274f56a926ba2dc02976ca5b11c32855cbd5925534e57cfe1fda64e04d1add2", size = 341819, upload-time = "2025-10-22T22:23:18.57Z" },
{ url = "https://files.pythonhosted.org/packages/09/e3/921eb109f682aa24fb76207698fbbcf9418738f35a40c21652c29053f23d/rpds_py-0.28.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4fe0438ac4a29a520ea94c8c7f1754cdd8feb1bc490dfda1bfd990072363d527", size = 373127, upload-time = "2025-10-22T22:23:20.216Z" },
{ url = "https://files.pythonhosted.org/packages/23/13/bce4384d9f8f4989f1a9599c71b7a2d877462e5fd7175e1f69b398f729f4/rpds_py-0.28.0-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8a358a32dd3ae50e933347889b6af9a1bdf207ba5d1a3f34e1a38cd3540e6733", size = 382767, upload-time = "2025-10-22T22:23:21.787Z" },
{ url = "https://files.pythonhosted.org/packages/23/e1/579512b2d89a77c64ccef5a0bc46a6ef7f72ae0cf03d4b26dcd52e57ee0a/rpds_py-0.28.0-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e80848a71c78aa328fefaba9c244d588a342c8e03bda518447b624ea64d1ff56", size = 517585, upload-time = "2025-10-22T22:23:23.699Z" },
{ url = "https://files.pythonhosted.org/packages/62/3c/ca704b8d324a2591b0b0adcfcaadf9c862375b11f2f667ac03c61b4fd0a6/rpds_py-0.28.0-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f586db2e209d54fe177e58e0bc4946bea5fb0102f150b1b2f13de03e1f0976f8", size = 399828, upload-time = "2025-10-22T22:23:25.713Z" },
{ url = "https://files.pythonhosted.org/packages/da/37/e84283b9e897e3adc46b4c88bb3f6ec92a43bd4d2f7ef5b13459963b2e9c/rpds_py-0.28.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5ae8ee156d6b586e4292491e885d41483136ab994e719a13458055bec14cf370", size = 375509, upload-time = "2025-10-22T22:23:27.32Z" },
{ url = "https://files.pythonhosted.org/packages/1a/c2/a980beab869d86258bf76ec42dec778ba98151f253a952b02fe36d72b29c/rpds_py-0.28.0-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:a805e9b3973f7e27f7cab63a6b4f61d90f2e5557cff73b6e97cd5b8540276d3d", size = 392014, upload-time = "2025-10-22T22:23:29.332Z" },
{ url = "https://files.pythonhosted.org/packages/da/b5/b1d3c5f9d3fa5aeef74265f9c64de3c34a0d6d5cd3c81c8b17d5c8f10ed4/rpds_py-0.28.0-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:5d3fd16b6dc89c73a4da0b4ac8b12a7ecc75b2864b95c9e5afed8003cb50a728", size = 402410, upload-time = "2025-10-22T22:23:31.14Z" },
{ url = "https://files.pythonhosted.org/packages/74/ae/cab05ff08dfcc052afc73dcb38cbc765ffc86f94e966f3924cd17492293c/rpds_py-0.28.0-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:6796079e5d24fdaba6d49bda28e2c47347e89834678f2bc2c1b4fc1489c0fb01", size = 553593, upload-time = "2025-10-22T22:23:32.834Z" },
{ url = "https://files.pythonhosted.org/packages/70/80/50d5706ea2a9bfc9e9c5f401d91879e7c790c619969369800cde202da214/rpds_py-0.28.0-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:76500820c2af232435cbe215e3324c75b950a027134e044423f59f5b9a1ba515", size = 576925, upload-time = "2025-10-22T22:23:34.47Z" },
{ url = "https://files.pythonhosted.org/packages/ab/12/85a57d7a5855a3b188d024b099fd09c90db55d32a03626d0ed16352413ff/rpds_py-0.28.0-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:bbdc5640900a7dbf9dd707fe6388972f5bbd883633eb68b76591044cfe346f7e", size = 542444, upload-time = "2025-10-22T22:23:36.093Z" },
{ url = "https://files.pythonhosted.org/packages/6c/65/10643fb50179509150eb94d558e8837c57ca8b9adc04bd07b98e57b48f8c/rpds_py-0.28.0-cp314-cp314-win32.whl", hash = "sha256:adc8aa88486857d2b35d75f0640b949759f79dc105f50aa2c27816b2e0dd749f", size = 207968, upload-time = "2025-10-22T22:23:37.638Z" },
{ url = "https://files.pythonhosted.org/packages/b4/84/0c11fe4d9aaea784ff4652499e365963222481ac647bcd0251c88af646eb/rpds_py-0.28.0-cp314-cp314-win_amd64.whl", hash = "sha256:66e6fa8e075b58946e76a78e69e1a124a21d9a48a5b4766d15ba5b06869d1fa1", size = 218876, upload-time = "2025-10-22T22:23:39.179Z" },
{ url = "https://files.pythonhosted.org/packages/0f/e0/3ab3b86ded7bb18478392dc3e835f7b754cd446f62f3fc96f4fe2aca78f6/rpds_py-0.28.0-cp314-cp314-win_arm64.whl", hash = "sha256:a6fe887c2c5c59413353b7c0caff25d0e566623501ccfff88957fa438a69377d", size = 212506, upload-time = "2025-10-22T22:23:40.755Z" },
{ url = "https://files.pythonhosted.org/packages/51/ec/d5681bb425226c3501eab50fc30e9d275de20c131869322c8a1729c7b61c/rpds_py-0.28.0-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:7a69df082db13c7070f7b8b1f155fa9e687f1d6aefb7b0e3f7231653b79a067b", size = 355433, upload-time = "2025-10-22T22:23:42.259Z" },
{ url = "https://files.pythonhosted.org/packages/be/ec/568c5e689e1cfb1ea8b875cffea3649260955f677fdd7ddc6176902d04cd/rpds_py-0.28.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:b1cde22f2c30ebb049a9e74c5374994157b9b70a16147d332f89c99c5960737a", size = 342601, upload-time = "2025-10-22T22:23:44.372Z" },
{ url = "https://files.pythonhosted.org/packages/32/fe/51ada84d1d2a1d9d8f2c902cfddd0133b4a5eb543196ab5161d1c07ed2ad/rpds_py-0.28.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5338742f6ba7a51012ea470bd4dc600a8c713c0c72adaa0977a1b1f4327d6592", size = 372039, upload-time = "2025-10-22T22:23:46.025Z" },
{ url = "https://files.pythonhosted.org/packages/07/c1/60144a2f2620abade1a78e0d91b298ac2d9b91bc08864493fa00451ef06e/rpds_py-0.28.0-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e1460ebde1bcf6d496d80b191d854adedcc619f84ff17dc1c6d550f58c9efbba", size = 382407, upload-time = "2025-10-22T22:23:48.098Z" },
{ url = "https://files.pythonhosted.org/packages/45/ed/091a7bbdcf4038a60a461df50bc4c82a7ed6d5d5e27649aab61771c17585/rpds_py-0.28.0-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e3eb248f2feba84c692579257a043a7699e28a77d86c77b032c1d9fbb3f0219c", size = 518172, upload-time = "2025-10-22T22:23:50.16Z" },
{ url = "https://files.pythonhosted.org/packages/54/dd/02cc90c2fd9c2ef8016fd7813bfacd1c3a1325633ec8f244c47b449fc868/rpds_py-0.28.0-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:bd3bbba5def70b16cd1c1d7255666aad3b290fbf8d0fe7f9f91abafb73611a91", size = 399020, upload-time = "2025-10-22T22:23:51.81Z" },
{ url = "https://files.pythonhosted.org/packages/ab/81/5d98cc0329bbb911ccecd0b9e19fbf7f3a5de8094b4cda5e71013b2dd77e/rpds_py-0.28.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3114f4db69ac5a1f32e7e4d1cbbe7c8f9cf8217f78e6e002cedf2d54c2a548ed", size = 377451, upload-time = "2025-10-22T22:23:53.711Z" },
{ url = "https://files.pythonhosted.org/packages/b4/07/4d5bcd49e3dfed2d38e2dcb49ab6615f2ceb9f89f5a372c46dbdebb4e028/rpds_py-0.28.0-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:4b0cb8a906b1a0196b863d460c0222fb8ad0f34041568da5620f9799b83ccf0b", size = 390355, upload-time = "2025-10-22T22:23:55.299Z" },
{ url = "https://files.pythonhosted.org/packages/3f/79/9f14ba9010fee74e4f40bf578735cfcbb91d2e642ffd1abe429bb0b96364/rpds_py-0.28.0-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:cf681ac76a60b667106141e11a92a3330890257e6f559ca995fbb5265160b56e", size = 403146, upload-time = "2025-10-22T22:23:56.929Z" },
{ url = "https://files.pythonhosted.org/packages/39/4c/f08283a82ac141331a83a40652830edd3a4a92c34e07e2bbe00baaea2f5f/rpds_py-0.28.0-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:1e8ee6413cfc677ce8898d9cde18cc3a60fc2ba756b0dec5b71eb6eb21c49fa1", size = 552656, upload-time = "2025-10-22T22:23:58.62Z" },
{ url = "https://files.pythonhosted.org/packages/61/47/d922fc0666f0dd8e40c33990d055f4cc6ecff6f502c2d01569dbed830f9b/rpds_py-0.28.0-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:b3072b16904d0b5572a15eb9d31c1954e0d3227a585fc1351aa9878729099d6c", size = 576782, upload-time = "2025-10-22T22:24:00.312Z" },
{ url = "https://files.pythonhosted.org/packages/d3/0c/5bafdd8ccf6aa9d3bfc630cfece457ff5b581af24f46a9f3590f790e3df2/rpds_py-0.28.0-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:b670c30fd87a6aec281c3c9896d3bae4b205fd75d79d06dc87c2503717e46092", size = 544671, upload-time = "2025-10-22T22:24:02.297Z" },
{ url = "https://files.pythonhosted.org/packages/2c/37/dcc5d8397caa924988693519069d0beea077a866128719351a4ad95e82fc/rpds_py-0.28.0-cp314-cp314t-win32.whl", hash = "sha256:8014045a15b4d2b3476f0a287fcc93d4f823472d7d1308d47884ecac9e612be3", size = 205749, upload-time = "2025-10-22T22:24:03.848Z" },
{ url = "https://files.pythonhosted.org/packages/d7/69/64d43b21a10d72b45939a28961216baeb721cc2a430f5f7c3bfa21659a53/rpds_py-0.28.0-cp314-cp314t-win_amd64.whl", hash = "sha256:7a4e59c90d9c27c561eb3160323634a9ff50b04e4f7820600a2beb0ac90db578", size = 216233, upload-time = "2025-10-22T22:24:05.471Z" },
]
[[package]]
name = "rsa"
version = "4.9.1"
@@ -1789,6 +1970,32 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/64/8d/0133e4eb4beed9e425d9a98ed6e081a55d195481b7632472be1af08d2f6b/rsa-4.9.1-py3-none-any.whl", hash = "sha256:68635866661c6836b8d39430f97a996acbd61bfa49406748ea243539fe239762", size = 34696, upload-time = "2025-04-16T09:51:17.142Z" },
]
[[package]]
name = "ruff"
version = "0.14.4"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/df/55/cccfca45157a2031dcbb5a462a67f7cf27f8b37d4b3b1cd7438f0f5c1df6/ruff-0.14.4.tar.gz", hash = "sha256:f459a49fe1085a749f15414ca76f61595f1a2cc8778ed7c279b6ca2e1fd19df3", size = 5587844, upload-time = "2025-11-06T22:07:45.033Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/17/b9/67240254166ae1eaa38dec32265e9153ac53645a6c6670ed36ad00722af8/ruff-0.14.4-py3-none-linux_armv6l.whl", hash = "sha256:e6604613ffbcf2297cd5dcba0e0ac9bd0c11dc026442dfbb614504e87c349518", size = 12606781, upload-time = "2025-11-06T22:07:01.841Z" },
{ url = "https://files.pythonhosted.org/packages/46/c8/09b3ab245d8652eafe5256ab59718641429f68681ee713ff06c5c549f156/ruff-0.14.4-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:d99c0b52b6f0598acede45ee78288e5e9b4409d1ce7f661f0fa36d4cbeadf9a4", size = 12946765, upload-time = "2025-11-06T22:07:05.858Z" },
{ url = "https://files.pythonhosted.org/packages/14/bb/1564b000219144bf5eed2359edc94c3590dd49d510751dad26202c18a17d/ruff-0.14.4-py3-none-macosx_11_0_arm64.whl", hash = "sha256:9358d490ec030f1b51d048a7fd6ead418ed0826daf6149e95e30aa67c168af33", size = 11928120, upload-time = "2025-11-06T22:07:08.023Z" },
{ url = "https://files.pythonhosted.org/packages/a3/92/d5f1770e9988cc0742fefaa351e840d9aef04ec24ae1be36f333f96d5704/ruff-0.14.4-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:81b40d27924f1f02dfa827b9c0712a13c0e4b108421665322218fc38caf615c2", size = 12370877, upload-time = "2025-11-06T22:07:10.015Z" },
{ url = "https://files.pythonhosted.org/packages/e2/29/e9282efa55f1973d109faf839a63235575519c8ad278cc87a182a366810e/ruff-0.14.4-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f5e649052a294fe00818650712083cddc6cc02744afaf37202c65df9ea52efa5", size = 12408538, upload-time = "2025-11-06T22:07:13.085Z" },
{ url = "https://files.pythonhosted.org/packages/8e/01/930ed6ecfce130144b32d77d8d69f5c610e6d23e6857927150adf5d7379a/ruff-0.14.4-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:aa082a8f878deeba955531f975881828fd6afd90dfa757c2b0808aadb437136e", size = 13141942, upload-time = "2025-11-06T22:07:15.386Z" },
{ url = "https://files.pythonhosted.org/packages/6a/46/a9c89b42b231a9f487233f17a89cbef9d5acd538d9488687a02ad288fa6b/ruff-0.14.4-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:1043c6811c2419e39011890f14d0a30470f19d47d197c4858b2787dfa698f6c8", size = 14544306, upload-time = "2025-11-06T22:07:17.631Z" },
{ url = "https://files.pythonhosted.org/packages/78/96/9c6cf86491f2a6d52758b830b89b78c2ae61e8ca66b86bf5a20af73d20e6/ruff-0.14.4-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a9f3a936ac27fb7c2a93e4f4b943a662775879ac579a433291a6f69428722649", size = 14210427, upload-time = "2025-11-06T22:07:19.832Z" },
{ url = "https://files.pythonhosted.org/packages/71/f4/0666fe7769a54f63e66404e8ff698de1dcde733e12e2fd1c9c6efb689cb5/ruff-0.14.4-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:95643ffd209ce78bc113266b88fba3d39e0461f0cbc8b55fb92505030fb4a850", size = 13658488, upload-time = "2025-11-06T22:07:22.32Z" },
{ url = "https://files.pythonhosted.org/packages/ee/79/6ad4dda2cfd55e41ac9ed6d73ef9ab9475b1eef69f3a85957210c74ba12c/ruff-0.14.4-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:456daa2fa1021bc86ca857f43fe29d5d8b3f0e55e9f90c58c317c1dcc2afc7b5", size = 13354908, upload-time = "2025-11-06T22:07:24.347Z" },
{ url = "https://files.pythonhosted.org/packages/b5/60/f0b6990f740bb15c1588601d19d21bcc1bd5de4330a07222041678a8e04f/ruff-0.14.4-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:f911bba769e4a9f51af6e70037bb72b70b45a16db5ce73e1f72aefe6f6d62132", size = 13587803, upload-time = "2025-11-06T22:07:26.327Z" },
{ url = "https://files.pythonhosted.org/packages/c9/da/eaaada586f80068728338e0ef7f29ab3e4a08a692f92eb901a4f06bbff24/ruff-0.14.4-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:76158a7369b3979fa878612c623a7e5430c18b2fd1c73b214945c2d06337db67", size = 12279654, upload-time = "2025-11-06T22:07:28.46Z" },
{ url = "https://files.pythonhosted.org/packages/66/d4/b1d0e82cf9bf8aed10a6d45be47b3f402730aa2c438164424783ac88c0ed/ruff-0.14.4-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:f3b8f3b442d2b14c246e7aeca2e75915159e06a3540e2f4bed9f50d062d24469", size = 12357520, upload-time = "2025-11-06T22:07:31.468Z" },
{ url = "https://files.pythonhosted.org/packages/04/f4/53e2b42cc82804617e5c7950b7079d79996c27e99c4652131c6a1100657f/ruff-0.14.4-py3-none-musllinux_1_2_i686.whl", hash = "sha256:c62da9a06779deecf4d17ed04939ae8b31b517643b26370c3be1d26f3ef7dbde", size = 12719431, upload-time = "2025-11-06T22:07:33.831Z" },
{ url = "https://files.pythonhosted.org/packages/a2/94/80e3d74ed9a72d64e94a7b7706b1c1ebaa315ef2076fd33581f6a1cd2f95/ruff-0.14.4-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5a443a83a1506c684e98acb8cb55abaf3ef725078be40237463dae4463366349", size = 13464394, upload-time = "2025-11-06T22:07:35.905Z" },
{ url = "https://files.pythonhosted.org/packages/54/1a/a49f071f04c42345c793d22f6cf5e0920095e286119ee53a64a3a3004825/ruff-0.14.4-py3-none-win32.whl", hash = "sha256:643b69cb63cd996f1fc7229da726d07ac307eae442dd8974dbc7cf22c1e18fff", size = 12493429, upload-time = "2025-11-06T22:07:38.43Z" },
{ url = "https://files.pythonhosted.org/packages/bc/22/e58c43e641145a2b670328fb98bc384e20679b5774258b1e540207580266/ruff-0.14.4-py3-none-win_amd64.whl", hash = "sha256:26673da283b96fe35fa0c939bf8411abec47111644aa9f7cfbd3c573fb125d2c", size = 13635380, upload-time = "2025-11-06T22:07:40.496Z" },
{ url = "https://files.pythonhosted.org/packages/30/bd/4168a751ddbbf43e86544b4de8b5c3b7be8d7167a2a5cb977d274e04f0a1/ruff-0.14.4-py3-none-win_arm64.whl", hash = "sha256:dd09c292479596b0e6fec8cd95c65c3a6dc68e9ad17b8f2382130f87ff6a75bb", size = 12663065, upload-time = "2025-11-06T22:07:42.603Z" },
]
[[package]]
name = "setuptools"
version = "80.9.0"
@@ -1867,6 +2074,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
]
[[package]]
name = "sse-starlette"
version = "3.0.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
]
sdist = { url = "https://files.pythonhosted.org/packages/db/3c/fa6517610dc641262b77cc7bf994ecd17465812c1b0585fe33e11be758ab/sse_starlette-3.0.3.tar.gz", hash = "sha256:88cfb08747e16200ea990c8ca876b03910a23b547ab3bd764c0d8eb81019b971", size = 21943, upload-time = "2025-10-30T18:44:20.117Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/23/a0/984525d19ca5c8a6c33911a0c164b11490dd0f90ff7fd689f704f84e9a11/sse_starlette-3.0.3-py3-none-any.whl", hash = "sha256:af5bf5a6f3933df1d9c7f8539633dc8444ca6a97ab2e2a7cd3b6e431ac03a431", size = 11765, upload-time = "2025-10-30T18:44:18.834Z" },
]
[[package]]
name = "starlette"
version = "0.47.3"
@@ -1880,6 +2099,20 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/ce/fd/901cfa59aaa5b30a99e16876f11abe38b59a1a2c51ffb3d7142bb6089069/starlette-0.47.3-py3-none-any.whl", hash = "sha256:89c0778ca62a76b826101e7c709e70680a1699ca7da6b44d38eb0a7e61fe4b51", size = 72991, upload-time = "2025-08-24T13:36:40.887Z" },
]
[[package]]
name = "tavily-python"
version = "0.7.12"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "httpx" },
{ name = "requests" },
{ name = "tiktoken" },
]
sdist = { url = "https://files.pythonhosted.org/packages/3e/42/ce2329635b844dda548110a5dfa0ab5631cdc1085e15c2d68b1850a2d112/tavily_python-0.7.12.tar.gz", hash = "sha256:661945bbc9284cdfbe70fb50de3951fd656bfd72e38e352481d333a36ae91f5a", size = 17282, upload-time = "2025-09-10T17:02:01.281Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/9a/e2/dbc246d9fb24433f77b17d9ee4e750a1e2718432ebde2756589c9154cbad/tavily_python-0.7.12-py3-none-any.whl", hash = "sha256:00d09b9de3ca02ef9a994cf4e7ae43d4ec9d199f0566ba6e52cbfcbd07349bd1", size = 15473, upload-time = "2025-09-10T17:01:59.859Z" },
]
[[package]]
name = "tenacity"
version = "9.1.2"


@@ -20,17 +20,22 @@ services:
    volumes:
      - ./backend/app:/app/app
      - ./backend/.secrets:/app/.secrets
      - ./backend/data:/app/data
    env_file:
      - .env
    networks:
      - app-network
  db:
    # docker run -p 6379:6379 -p 8001:8001 redis/redis-stack
    image: redis/redis-stack:latest
    ports:
      - 6379:6379
      - 8001:8001
    environment:
      REDIS_ARGS: --appendonly yes
    volumes:
      - ./redis_data:/data
    restart: unless-stopped
    networks:
      - app-network


@@ -8,6 +8,7 @@
"name": "frontend",
"version": "0.0.0",
"dependencies": {
"@ai-sdk/react": "^2.0.89",
"@radix-ui/react-checkbox": "^1.3.3",
"@radix-ui/react-collapsible": "^1.1.12",
"@radix-ui/react-dialog": "^1.1.15",
@@ -110,6 +111,30 @@
"zod": "^3.25.76 || ^4.1.8"
}
},
"node_modules/@ai-sdk/react": {
"version": "2.0.89",
"resolved": "https://registry.npmjs.org/@ai-sdk/react/-/react-2.0.89.tgz",
"integrity": "sha512-r2uCqx042JOjNrSlDrjh7ufSIfU2BM6Lo4qe47KHkYuJjPfssxhLpJUCFLB01iV7Foyn/xpbq06Zr6WI4qUDgw==",
"license": "Apache-2.0",
"dependencies": {
"@ai-sdk/provider-utils": "3.0.16",
"ai": "5.0.89",
"swr": "^2.2.5",
"throttleit": "2.1.0"
},
"engines": {
"node": ">=18"
},
"peerDependencies": {
"react": "^18 || ^19 || ^19.0.0-rc",
"zod": "^3.25.76 || ^4.1.8"
},
"peerDependenciesMeta": {
"zod": {
"optional": true
}
}
},
"node_modules/@alloc/quick-lru": {
"version": "5.2.0",
"resolved": "https://registry.npmjs.org/@alloc/quick-lru/-/quick-lru-5.2.0.tgz",
@@ -8818,6 +8843,19 @@
"node": ">=8"
}
},
"node_modules/swr": {
"version": "2.3.6",
"resolved": "https://registry.npmjs.org/swr/-/swr-2.3.6.tgz",
"integrity": "sha512-wfHRmHWk/isGNMwlLGlZX5Gzz/uTgo0o2IRuTMcf4CPuPFJZlq0rDaKUx+ozB5nBOReNV1kiOyzMfj+MBMikLw==",
"license": "MIT",
"dependencies": {
"dequal": "^2.0.3",
"use-sync-external-store": "^1.4.0"
},
"peerDependencies": {
"react": "^16.11.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
}
},
"node_modules/tailwind-merge": {
"version": "3.3.1",
"resolved": "https://registry.npmjs.org/tailwind-merge/-/tailwind-merge-3.3.1.tgz",
@@ -8857,6 +8895,18 @@
"url": "https://opencollective.com/webpack"
}
},
"node_modules/throttleit": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/throttleit/-/throttleit-2.1.0.tgz",
"integrity": "sha512-nt6AMGKW1p/70DF/hGBdJB57B8Tspmbp5gfJ8ilhLnt7kkr2ye7hzD6NVG8GGErk2HWF34igrL2CXmNIkzKqKw==",
"license": "MIT",
"engines": {
"node": ">=18"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/tiny-invariant": {
"version": "1.3.3",
"resolved": "https://registry.npmjs.org/tiny-invariant/-/tiny-invariant-1.3.3.tgz",


@@ -10,6 +10,7 @@
"preview": "vite preview"
},
"dependencies": {
"@ai-sdk/react": "^2.0.89",
"@radix-ui/react-checkbox": "^1.3.3",
"@radix-ui/react-collapsible": "^1.1.12",
"@radix-ui/react-dialog": "^1.1.15",


@@ -0,0 +1,230 @@
import {
TrendingUp,
TrendingDown,
Activity,
Info,
Target,
Lightbulb,
ArrowRight,
AlertTriangle,
} from "lucide-react";
type TrendDirection = "Improving" | "Declining" | "Stable" | "Volatile";
interface TrendMetricPoint {
year: number;
value: number;
growth?: number | null;
}
interface TrendMetric {
name: string;
unit: string;
description: string;
points: TrendMetricPoint[];
cagr?: number | null;
direction: TrendDirection;
notes?: string | null;
}
interface TrendInsight {
category: string;
direction: TrendDirection;
summary: string;
confidence: number;
}
interface AnalystReportData {
organisation_name: string;
organisation_ein: string;
years_analyzed: number[];
key_metrics: TrendMetric[];
insights: TrendInsight[];
recommendations: string[];
outlook: string;
}
interface AnalystReportProps {
data: AnalystReportData;
}
const directionBadgeClasses: Record<TrendDirection, string> = {
Improving: "bg-green-100 text-green-700 border-green-200",
Declining: "bg-red-100 text-red-700 border-red-200",
Stable: "bg-blue-100 text-blue-700 border-blue-200",
Volatile: "bg-yellow-100 text-yellow-700 border-yellow-200",
};
const directionIcon = (direction: TrendDirection) => {
switch (direction) {
case "Improving":
return <TrendingUp className="w-4 h-4" />;
case "Declining":
return <TrendingDown className="w-4 h-4" />;
case "Volatile":
return <Activity className="w-4 h-4" />;
default:
return <Info className="w-4 h-4" />;
}
};
const formatNumber = (value: number, unit: string) => {
if (unit === "USD") {
return `$${value.toLocaleString("en-US", {
maximumFractionDigits: 0,
})}`;
}
if (unit === "Ratio") {
return `${(value * 100).toFixed(1)}%`;
}
return value.toLocaleString("en-US", { maximumFractionDigits: 2 });
};
const formatPercent = (value?: number | null) => {
if (value === undefined || value === null || Number.isNaN(value)) {
return "—";
}
return `${(value * 100).toFixed(1)}%`;
};
export function AnalystReport({ data }: AnalystReportProps) {
const {
organisation_name,
organisation_ein,
years_analyzed,
key_metrics,
insights,
recommendations,
outlook,
} = data;
return (
<div className="w-full bg-white border border-gray-200 rounded-lg shadow-sm overflow-hidden">
<div className="bg-gradient-to-r from-emerald-50 to-sky-50 px-4 py-3 border-b border-emerald-100">
<div className="flex items-center justify-between">
<div>
<p className="text-xs text-gray-500 uppercase tracking-wide">
Multi-year performance analysis
</p>
<h3 className="text-lg font-semibold text-gray-900">
{organisation_name}
</h3>
<p className="text-xs text-gray-600">
EIN {organisation_ein} {years_analyzed.join(" ")}
</p>
</div>
<div className="text-right text-sm text-gray-600">
<span className="font-medium text-gray-900">Outlook</span>
<p className="text-xs text-gray-600 max-w-xs">{outlook}</p>
</div>
</div>
</div>
{/* Key Metrics */}
{key_metrics.length > 0 && (
<div className="p-4 space-y-3">
<h4 className="text-sm font-semibold text-gray-800 flex items-center gap-2">
<Activity className="w-4 h-4 text-emerald-600" />
Core Trend Metrics
</h4>
<div className="grid gap-3 md:grid-cols-2">
{key_metrics.slice(0, 4).map((metric, index) => {
const latest = metric.points[metric.points.length - 1];
const prior =
metric.points.length > 1
? metric.points[metric.points.length - 2]
: null;
const yoy = latest && prior ? formatPercent(latest.growth) : "—";
return (
<div
key={`${metric.name}-${index}`}
className="border rounded-lg p-3 bg-gray-50"
>
<div className="flex items-center justify-between mb-1">
<div className="font-medium text-sm text-gray-900">
{metric.name}
</div>
<span
className={`inline-flex items-center gap-1 text-xs px-2 py-0.5 rounded-full border ${directionBadgeClasses[metric.direction]}`}
>
{directionIcon(metric.direction)}
{metric.direction}
</span>
</div>
<p className="text-2xl font-semibold text-gray-900">
{latest ? formatNumber(latest.value, metric.unit) : "—"}
</p>
<div className="flex justify-between text-xs text-gray-600 mt-1">
<span>YoY: {yoy}</span>
<span>CAGR: {formatPercent(metric.cagr)}</span>
</div>
{metric.notes && (
<p className="text-xs text-gray-500 mt-2">{metric.notes}</p>
)}
</div>
);
})}
</div>
</div>
)}
{/* Insights */}
{insights.length > 0 && (
<div className="px-4 pb-4">
<h4 className="text-sm font-semibold text-gray-800 mb-2 flex items-center gap-2">
<Lightbulb className="w-4 h-4 text-amber-500" />
Key Insights
</h4>
<div className="space-y-2">
{insights.map((insight, index) => (
<div
key={`${insight.category}-${index}`}
className="border rounded-lg p-3 bg-white"
>
<div className="flex justify-between items-center mb-1">
<div className="flex items-center gap-2 text-sm font-medium text-gray-900">
{directionIcon(insight.direction)}
{insight.category}
</div>
<span className="text-xs text-gray-500">
{Math.round(insight.confidence * 100)}% confidence
</span>
</div>
<p className="text-sm text-gray-700">{insight.summary}</p>
</div>
))}
</div>
</div>
)}
{/* Recommendations */}
{recommendations.length > 0 && (
<div className="px-4 pb-4">
<h4 className="text-sm font-semibold text-gray-800 mb-2 flex items-center gap-2">
<Target className="w-4 h-4 text-indigo-500" />
Recommended Actions
</h4>
<ul className="space-y-1 text-sm text-gray-700">
{recommendations.map((rec, index) => (
<li
key={`rec-${index}`}
className="flex items-start gap-2 bg-gray-50 border border-gray-100 rounded-lg p-2"
>
<ArrowRight className="w-4 h-4 text-gray-500 mt-0.5" />
<span>{rec}</span>
</li>
))}
</ul>
</div>
)}
{/* Empty states fallback */}
{recommendations.length === 0 && insights.length === 0 && (
<div className="px-4 py-6 text-sm text-gray-600 flex items-center gap-2">
<AlertTriangle className="w-4 h-4 text-gray-400" />
No trend insights available yet. Try requesting an annual comparison.
</div>
)}
</div>
);
}


@@ -0,0 +1,326 @@
import React, { useState } from "react";
import {
AlertTriangle,
CheckCircle,
XCircle,
FileText,
AlertCircle,
ChevronDown,
ChevronRight,
Shield,
Info,
} from "lucide-react";
type Severity = "Pass" | "Warning" | "Error";
interface AuditFinding {
check_id: string;
category: string;
severity: Severity;
message: string;
mitigation?: string;
confidence: number;
}
interface AuditSectionSummary {
section: string;
severity: Severity;
summary: string;
confidence: number;
}
interface AuditReportData {
organisation_ein: string;
organisation_name: string;
year?: number;
overall_severity: Severity;
findings: AuditFinding[];
sections: AuditSectionSummary[];
overall_summary?: string;
notes?: string;
}
interface AuditReportProps {
data: AuditReportData;
}
const getSeverityIcon = (severity: Severity, size = "w-4 h-4") => {
switch (severity) {
case "Pass":
return <CheckCircle className={`${size} text-green-600`} />;
case "Warning":
return <AlertTriangle className={`${size} text-yellow-600`} />;
case "Error":
return <XCircle className={`${size} text-red-600`} />;
default:
return <AlertCircle className={`${size} text-gray-600`} />;
}
};
const getSeverityBadge = (severity: Severity) => {
const colors = {
Pass: "bg-green-100 text-green-800 border-green-200",
Warning: "bg-yellow-100 text-yellow-800 border-yellow-200",
Error: "bg-red-100 text-red-800 border-red-200",
};
return (
<span
className={`inline-flex items-center gap-1 px-2 py-1 rounded-full text-xs font-medium border ${colors[severity]}`}
>
{getSeverityIcon(severity, "w-3 h-3")}
{severity}
</span>
);
};
export const AuditReport: React.FC<AuditReportProps> = ({ data }) => {
const [expandedSections, setExpandedSections] = useState<Set<string>>(
new Set(),
);
const [showAllFindings, setShowAllFindings] = useState(false);
const {
organisation_name,
organisation_ein,
year,
overall_severity,
findings,
sections,
overall_summary,
notes,
} = data;
const severityStats = {
Pass: findings.filter((f) => f.severity === "Pass").length,
Warning: findings.filter((f) => f.severity === "Warning").length,
Error: findings.filter((f) => f.severity === "Error").length,
};
const toggleSection = (section: string) => {
const newExpanded = new Set(expandedSections);
if (newExpanded.has(section)) {
newExpanded.delete(section);
} else {
newExpanded.add(section);
}
setExpandedSections(newExpanded);
};
const criticalFindings = findings.filter((f) => f.severity === "Error");
return (
<div className="w-full bg-white border border-gray-200 rounded-lg shadow-sm overflow-hidden">
{/* Compact Header */}
<div className="bg-linear-to-r from-blue-50 to-indigo-50 p-4">
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<Shield className="w-6 h-6 text-blue-600" />
<div>
<h3 className="font-semibold text-gray-900">
{organisation_name}
</h3>
<div className="flex items-center gap-3 text-xs text-gray-600">
<span>EIN: {organisation_ein}</span>
{year && <span>{year}</span>}
</div>
</div>
</div>
{getSeverityBadge(overall_severity)}
</div>
</div>
{/* Quick Stats */}
<div className="bg-gray-50 px-4 py-3 border-b">
<div className="flex items-center justify-between">
<span className="text-sm font-medium text-gray-700">
Audit Results
</span>
<div className="flex items-center gap-4 text-sm">
{severityStats.Pass > 0 && (
<div className="flex items-center gap-1">
<CheckCircle className="w-3 h-3 text-green-600" />
<span className="font-medium text-green-700">
{severityStats.Pass}
</span>
</div>
)}
{severityStats.Warning > 0 && (
<div className="flex items-center gap-1">
<AlertTriangle className="w-3 h-3 text-yellow-600" />
<span className="font-medium text-yellow-700">
{severityStats.Warning}
</span>
</div>
)}
{severityStats.Error > 0 && (
<div className="flex items-center gap-1">
<XCircle className="w-3 h-3 text-red-600" />
<span className="font-medium text-red-700">
{severityStats.Error}
</span>
</div>
)}
</div>
</div>
</div>
<div className="p-4 space-y-4">
{/* Summary */}
{overall_summary && (
<div className="bg-blue-50 border border-blue-200 rounded-lg p-3">
<div className="flex items-start gap-2">
<Info className="w-4 h-4 text-blue-600 mt-0.5 shrink-0" />
<div>
<h4 className="font-medium text-blue-900 text-sm mb-1">
Summary
</h4>
<p className="text-sm text-blue-800">{overall_summary}</p>
</div>
</div>
</div>
)}
{/* Critical Issues (if any) */}
{criticalFindings.length > 0 && (
<div className="bg-red-50 border border-red-200 rounded-lg p-3">
<h4 className="font-medium text-red-900 text-sm mb-2 flex items-center gap-2">
<XCircle className="w-4 h-4" />
Critical Issues ({criticalFindings.length})
</h4>
<div className="space-y-2">
{(showAllFindings ? criticalFindings : criticalFindings.slice(0, 2)).map((finding, index) => (
<div key={index} className="text-sm text-red-800">
<span className="font-medium">{finding.category}:</span>{" "}
{finding.message}
</div>
))}
{criticalFindings.length > 2 && (
<button
onClick={() => setShowAllFindings(!showAllFindings)}
className="text-xs text-red-700 hover:text-red-800 font-medium"
>
{showAllFindings
? "Show less"
: `+${criticalFindings.length - 2} more issues`}
</button>
)}
</div>
</div>
)}
{/* Sections Overview */}
{sections.length > 0 && (
<div>
<button
onClick={() => toggleSection("sections")}
className="flex items-center gap-2 w-full text-left p-2 hover:bg-gray-50 rounded-lg"
>
{expandedSections.has("sections") ? (
<ChevronDown className="w-4 h-4" />
) : (
<ChevronRight className="w-4 h-4" />
)}
<span className="font-medium text-sm">
Section Analysis ({sections.length})
</span>
</button>
{expandedSections.has("sections") && (
<div className="mt-2 grid gap-2 sm:grid-cols-2">
{sections.map((section, index) => (
<div key={index} className="border rounded-lg p-3 bg-gray-50">
<div className="flex items-center justify-between mb-1">
<span className="font-medium text-sm">
{section.section}
</span>
<div className="flex items-center gap-1">
{getSeverityIcon(section.severity)}
<span className="text-xs text-gray-600">
{Math.round(section.confidence * 100)}%
</span>
</div>
</div>
<p className="text-xs text-gray-700">{section.summary}</p>
</div>
))}
</div>
)}
</div>
)}
{/* All Findings */}
<div>
<button
onClick={() => toggleSection("findings")}
className="flex items-center gap-2 w-full text-left p-2 hover:bg-gray-50 rounded-lg"
>
{expandedSections.has("findings") ? (
<ChevronDown className="w-4 h-4" />
) : (
<ChevronRight className="w-4 h-4" />
)}
<span className="font-medium text-sm">
All Findings ({findings.length})
</span>
</button>
{expandedSections.has("findings") && (
<div className="mt-2 space-y-2">
{findings.map((finding, index) => (
<div key={index} className="border rounded-lg p-3 bg-gray-50">
<div className="flex items-start justify-between mb-2">
<div className="flex items-center gap-2">
{getSeverityIcon(finding.severity)}
<div>
<span className="font-medium text-sm">
{finding.category}
</span>
<span className="text-xs text-gray-500 ml-1">
#{finding.check_id}
</span>
</div>
</div>
<span className="text-xs text-gray-600">
{Math.round(finding.confidence * 100)}% confidence
</span>
</div>
<p className="text-sm text-gray-700 mb-2">
{finding.message}
</p>
{finding.mitigation && (
<div className="bg-white rounded p-2 border">
<span className="text-xs font-medium text-gray-600">
Recommended:
</span>
<p className="text-xs text-gray-700 mt-1">
{finding.mitigation}
</p>
</div>
)}
</div>
))}
</div>
)}
</div>
{/* Notes */}
{notes && (
<div className="border-t pt-3">
<div className="flex items-start gap-2">
<FileText className="w-4 h-4 text-gray-400 mt-0.5 shrink-0" />
<div>
<h4 className="font-medium text-gray-700 text-sm mb-1">
Notes
</h4>
<p className="text-sm text-gray-600 italic">{notes}</p>
</div>
</div>
</div>
)}
</div>
</div>
);
};
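Review note: the expand/collapse state and the per-severity counts in this component can be expressed as two small pure functions. The sketch below is illustrative only and is not part of the diff; `toggleKey` and `countBySeverity` are hypothetical names, and the sample findings are invented.

```typescript
type Severity = "Pass" | "Warning" | "Error";

// Returns a new Set with `key` toggled. The input Set is never mutated,
// so the result is safe to hand to a React state setter.
function toggleKey(current: Set<string>, key: string): Set<string> {
  const next = new Set(current);
  if (next.has(key)) {
    next.delete(key);
  } else {
    next.add(key);
  }
  return next;
}

// Counts findings per severity in a single pass over the array,
// instead of filtering the array once per severity.
function countBySeverity(
  findings: { severity: Severity }[],
): Record<Severity, number> {
  const counts: Record<Severity, number> = { Pass: 0, Warning: 0, Error: 0 };
  for (const finding of findings) {
    counts[finding.severity] += 1;
  }
  return counts;
}

// Invented sample data, for illustration only.
const expanded = toggleKey(new Set<string>(), "sections");
const stats = countBySeverity([
  { severity: "Pass" },
  { severity: "Error" },
  { severity: "Error" },
]);
console.log(expanded.has("sections"), stats);
```

Keeping these as pure functions makes the toggle and the counts easy to unit test without rendering the component.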


@@ -1,92 +1,412 @@
import { MessageCircle, Send, Bot, User } from "lucide-react"; import { Message, MessageContent } from "@/components/ai-elements/message";
import { Button } from "@/components/ui/button"; import {
import { Input } from "@/components/ui/input"; PromptInput,
PromptInputActionAddAttachments,
PromptInputActionMenu,
PromptInputActionMenuContent,
PromptInputActionMenuTrigger,
PromptInputAttachment,
PromptInputAttachments,
PromptInputBody,
PromptInputHeader,
type PromptInputMessage,
PromptInputSubmit,
PromptInputTextarea,
PromptInputFooter,
PromptInputTools,
} from "@/components/ai-elements/prompt-input";
import { Action, Actions } from "@/components/ai-elements/actions";
import { Fragment, useState, useEffect, useMemo } from "react";
import { useChat } from "@ai-sdk/react";
import { Response } from "@/components/ai-elements/response";
import {
CopyIcon,
RefreshCcwIcon,
MessageCircle,
Bot,
AlertCircle,
PaperclipIcon,
User,
} from "lucide-react";
import { AuditReport } from "./AuditReport";
import { AnalystReport } from "./AnalystReport";
import { WebSearchResults } from "./WebSearchResults";
import { Loader } from "@/components/ai-elements/loader";
import { DefaultChatTransport } from "ai";
interface ChatTabProps { interface ChatTabProps {
selectedTema: string | null; selectedTema: string | null;
} }
export function ChatTab({ selectedTema }: ChatTabProps) { export function ChatTab({ selectedTema }: ChatTabProps) {
const [input, setInput] = useState("");
const [error, setError] = useState<string | null>(null);
const {
messages,
sendMessage,
status,
regenerate,
error: chatError,
} = useChat({
transport: new DefaultChatTransport({
api: "/api/v1/agent/chat",
headers: {
tema: selectedTema || "",
},
}),
onError: (error) => {
setError(`Chat error: ${error.message}`);
},
});
// Clear error when starting new conversation
useEffect(() => {
if (status === "streaming") {
setError(null);
}
}, [status]);
const handleSubmit = (message: PromptInputMessage) => {
const hasText = Boolean(message.text?.trim());
const hasAttachments = Boolean(message.files?.length);
if (!(hasText || hasAttachments)) {
return;
}
setError(null);
sendMessage(
{
text: message.text || "Sent with attachments",
files: message.files,
},
{
body: {
dataroom: selectedTema,
context: `User is asking about the dataroom: ${selectedTema}`,
},
},
);
setInput("");
};
const hasActiveToolRequest = useMemo(() => {
return messages.some((message) =>
message.parts.some(
(part: any) =>
typeof part?.type === "string" &&
part.type.startsWith("tool-") &&
part.state === "input-available",
),
);
}, [messages]);
const shouldShowGlobalLoader =
(status === "streaming" || status === "submitted") && !hasActiveToolRequest;
if (!selectedTema) { if (!selectedTema) {
return ( return (
<div className="flex flex-col items-center justify-center h-64"> <div className="flex flex-col items-center justify-center h-64">
<MessageCircle className="w-12 h-12 text-gray-400 mb-4" /> <MessageCircle className="w-12 h-12 text-gray-400 mb-4" />
<p className="text-gray-500"> <p className="text-gray-500">Select a dataroom to start chatting</p>
Selecciona un dataroom para iniciar el chat
</p>
</div> </div>
); );
} }
return ( return (
<div className="flex flex-col h-full"> <div className="flex flex-col h-[638px] max-h-[638px]">
{/* Chat Header */} {/* Chat Content */}
<div className="border-b border-gray-200 px-6 py-4"> <div className="flex-1 min-h-0 overflow-y-auto">
<div className="flex items-center gap-3"> <div className="max-w-4xl mx-auto w-full space-y-6 p-6">
<div className="p-2 bg-blue-100 rounded-lg">
<MessageCircle className="w-5 h-5 text-blue-600" />
</div>
<div>
<h3 className="text-lg font-semibold text-gray-900">
Chat con {selectedTema}
</h3>
<p className="text-sm text-gray-600">
Haz preguntas sobre los documentos de este dataroom
</p>
</div>
</div>
</div>
{/* Chat Messages Area */}
<div className="flex-1 overflow-y-auto p-6">
<div className="max-w-4xl mx-auto space-y-4">
{/* Welcome Message */} {/* Welcome Message */}
<div className="flex items-start gap-3"> {messages.length === 0 && (
<div className="flex items-start gap-3 mb-6">
<div className="p-2 bg-blue-100 rounded-full"> <div className="p-2 bg-blue-100 rounded-full">
<Bot className="w-4 h-4 text-blue-600" /> <Bot className="w-4 h-4 text-blue-600" />
</div> </div>
<div className="flex-1 bg-gray-50 rounded-lg p-4"> <div className="flex-1 bg-gray-50 rounded-lg p-4">
<p className="text-sm text-gray-800"> <p className="text-sm text-gray-800">
¡Hola! Soy tu asistente de IA para el dataroom <strong>{selectedTema}</strong>. Hi! I'm your AI assistant for dataroom{" "}
Puedes hacerme preguntas sobre los documentos almacenados aquí. <strong>{selectedTema}</strong>. Ask me anything about the
stored documents.
</p> </p>
</div> </div>
</div> </div>
)}
{/* Placeholder for future messages */} {/* Error Message */}
<div className="text-center py-8"> {error && (
<MessageCircle className="w-16 h-16 text-gray-300 mx-auto mb-4" /> <div className="flex items-start gap-3 mb-4">
<h4 className="text-lg font-medium text-gray-900 mb-2"> <div className="p-2 bg-red-100 rounded-full">
Chat Inteligente <AlertCircle className="w-4 h-4 text-red-600" />
</h4> </div>
<p className="text-gray-500 max-w-md mx-auto"> <div className="flex-1 bg-red-50 rounded-lg p-4 border border-red-200">
El chat estará disponible próximamente. Podrás hacer preguntas sobre los <p className="text-sm text-red-800">{error}</p>
documentos y obtener respuestas basadas en el contenido del dataroom.
</p>
</div>
</div> </div>
</div> </div>
)}
{/* Chat Input Area */} {/* Chat Messages */}
<div className="border-t border-gray-200 p-6"> {messages.map((message) => (
<div className="max-w-4xl mx-auto"> <div key={message.id}>
<div className="flex items-center gap-3"> {message.parts.map((part, i) => {
switch (part.type) {
case "text":
return (
<Fragment key={`${message.id}-${i}`}>
{message.role === "user" ? (
<div className="flex items-start gap-3 justify-end">
<div className="flex-1"> <div className="flex-1">
<Input <Message
placeholder={`Pregunta algo sobre ${selectedTema}...`} from={message.role}
disabled className="max-w-none"
className="w-full" >
/> <MessageContent>
<Response>{part.text}</Response>
</MessageContent>
</Message>
</div> </div>
<Button disabled className="gap-2"> <div className="p-2 rounded-full flex-shrink-0 mt-1 bg-gray-100">
<Send className="w-4 h-4" /> <User className="w-4 h-4 text-gray-600" />
Enviar
</Button>
</div> </div>
<p className="text-xs text-gray-500 mt-2"> </div>
Esta funcionalidad estará disponible próximamente ) : (
<div className="flex items-start gap-3">
<div className="p-2 rounded-full flex-shrink-0 mt-1 bg-blue-100">
<Bot className="w-4 h-4 text-blue-600" />
</div>
<div className="flex-1">
<Message
from={message.role}
className="max-w-none"
>
<MessageContent>
<Response>{part.text}</Response>
</MessageContent>
</Message>
</div>
</div>
)}
{message.role === "assistant" &&
i === message.parts.length - 1 && (
<div className="ml-12">
<Actions className="mt-2">
<Action
onClick={() => regenerate()}
label="Regenerate"
disabled={status === "streaming"}
>
<RefreshCcwIcon className="size-3" />
</Action>
<Action
onClick={() =>
navigator.clipboard.writeText(part.text)
}
label="Copy"
>
<CopyIcon className="size-3" />
</Action>
</Actions>
</div>
)}
</Fragment>
);
case "tool-build_audit_report":
switch (part.state) {
case "input-available":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 flex items-center gap-2 p-4 bg-blue-50 rounded-lg border border-blue-200"
>
<div className="animate-spin rounded-full h-4 w-4 border-b-2 border-blue-600"></div>
<span className="text-sm text-blue-700">
Generating audit report
</span>
</div>
);
case "output-available":
return (
<div
key={`${message.id}-${i}`}
className="mt-4 mb-4 w-full"
>
<div className="max-w-full overflow-hidden">
<AuditReport data={part.output} />
</div>
</div>
);
case "output-error":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 p-4 bg-red-50 border border-red-200 rounded-lg"
>
<div className="flex items-center gap-2">
<AlertCircle className="w-4 h-4 text-red-600" />
<span className="text-sm font-medium text-red-800">
Failed to generate audit report
</span>
</div>
<p className="text-sm text-red-600 mt-1">
{part.errorText}
</p> </p>
</div> </div>
);
default:
return null;
}
case "tool-build_analysis_report":
switch (part.state) {
case "input-available":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 flex items-center gap-2 p-4 bg-purple-50 rounded-lg border border-purple-200"
>
<div className="animate-spin rounded-full h-4 w-4 border-b-2 border-purple-600"></div>
<span className="text-sm text-purple-700">
Generating performance analysis
</span>
</div>
);
case "output-available":
return (
<div
key={`${message.id}-${i}`}
className="mt-4 mb-4 w-full"
>
<div className="max-w-full overflow-hidden">
<AnalystReport data={part.output} />
</div>
</div>
);
case "output-error":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 p-4 bg-red-50 border border-red-200 rounded-lg"
>
<div className="flex items-center gap-2">
<AlertCircle className="w-4 h-4 text-red-600" />
<span className="text-sm font-medium text-red-800">
Failed to generate performance analysis
</span>
</div>
<p className="text-sm text-red-600 mt-1">
{part.errorText}
</p>
</div>
);
default:
return null;
}
case "tool-search_web_information":
switch (part.state) {
case "input-available":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 flex items-center gap-2 p-4 bg-green-50 rounded-lg border border-green-200"
>
<div className="animate-spin rounded-full h-4 w-4 border-b-2 border-green-600"></div>
<span className="text-sm text-green-700">
Searching the web
</span>
</div>
);
case "output-available":
return (
<div
key={`${message.id}-${i}`}
className="mt-4 mb-4 w-full"
>
<div className="max-w-full overflow-hidden">
<WebSearchResults data={part.output} />
</div>
</div>
);
case "output-error":
return (
<div
key={`${message.id}-${i}`}
className="mb-4 p-4 bg-red-50 border border-red-200 rounded-lg"
>
<div className="flex items-center gap-2">
<AlertCircle className="w-4 h-4 text-red-600" />
<span className="text-sm font-medium text-red-800">
Failed to search the web
</span>
</div>
<p className="text-sm text-red-600 mt-1">
{part.errorText}
</p>
</div>
);
default:
return null;
}
default:
return null;
}
})}
</div>
))}
{shouldShowGlobalLoader && <Loader />}
</div>
</div>
{/* Chat Input */}
<div className="border-t border-gray-200 p-3 bg-gray-50/50 flex-shrink-0">
<PromptInput
onSubmit={handleSubmit}
className="max-w-4xl mx-auto border-2 border-gray-200 rounded-xl focus-within:border-slate-500 transition-colors duration-200 bg-white"
globalDrop
multiple
>
<PromptInputHeader className="p-2 pb-0">
<PromptInputAttachments>
{(attachment) => <PromptInputAttachment data={attachment} />}
</PromptInputAttachments>
</PromptInputHeader>
<PromptInputBody>
<PromptInputTextarea
onChange={(e) => setInput(e.target.value)}
value={input}
placeholder={`Ask something about ${selectedTema}...`}
disabled={status === "streaming" || status === "submitted"}
className="min-h-[60px] resize-none border-0 focus:ring-0 transition-all duration-200 text-base px-4 py-3 bg-white rounded-xl"
/>
</PromptInputBody>
<PromptInputFooter className="mt-3 flex justify-between items-center">
<PromptInputTools>
<PromptInputActionMenu>
<PromptInputActionMenuTrigger>
<PaperclipIcon className="size-4" />
</PromptInputActionMenuTrigger>
<PromptInputActionMenuContent>
<PromptInputActionAddAttachments />
</PromptInputActionMenuContent>
</PromptInputActionMenu>
</PromptInputTools>
<PromptInputSubmit
disabled={
(!input.trim() && !status) ||
status === "streaming" ||
status === "submitted"
}
status={status}
className={`rounded-full px-6 py-2 font-medium transition-all duration-200 flex items-center gap-2 ${
(!input.trim() && !status) ||
status === "streaming" ||
status === "submitted"
? "bg-gray-300 cursor-not-allowed text-gray-500"
: "bg-blue-600 hover:bg-blue-700 text-white"
}`}
/>
</PromptInputFooter>
</PromptInput>
</div> </div>
</div> </div>
); );
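Review note: the global loader is suppressed while any tool call is still waiting for its output, because each tool part renders its own spinner. A minimal sketch of that check follows, assuming simplified stand-in types (`PartLike`, `MessageLike`) rather than the AI SDK's own message types; `hasPendingToolCall` is a hypothetical name and the sample messages are invented.

```typescript
// Simplified stand-ins for the chat message shape; the real types come
// from the AI SDK and carry more fields.
interface PartLike {
  type: string;
  state?: string;
}

interface MessageLike {
  parts: PartLike[];
}

// True when at least one tool part has received its input but has not
// yet produced an output or an error.
function hasPendingToolCall(messages: MessageLike[]): boolean {
  return messages.some((message) =>
    message.parts.some(
      (part) =>
        part.type.startsWith("tool-") && part.state === "input-available",
    ),
  );
}

// Invented sample conversation, for illustration only.
const sample: MessageLike[] = [
  { parts: [{ type: "text" }] },
  { parts: [{ type: "tool-build_audit_report", state: "input-available" }] },
];
console.log(hasPendingToolCall(sample));
```

Typing the parts explicitly avoids the `any` cast in the component's `useMemo` callback.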


@@ -1,32 +1,134 @@
import { FileText, Users, Database, Activity } from "lucide-react"; import { useState, useEffect } from "react";
import {
FileText,
Database,
Activity,
TrendingUp,
AlertCircle,
CheckCircle,
Loader2,
} from "lucide-react";
import { api } from "@/services/api";
interface DashboardTabProps { interface DashboardTabProps {
selectedTema: string | null; selectedTema: string | null;
} }
interface DataroomInfo {
name: string;
collection: string;
storage: string;
file_count: number;
total_size_bytes: number;
total_size_mb: number;
collection_exists: boolean;
vector_count: number | null;
collection_info: {
vectors_count: number;
indexed_vectors_count: number;
points_count: number;
segments_count: number;
status: string;
} | null;
file_types: Record<string, number>;
recent_files: Array<{
name: string;
size_mb: number;
last_modified: string;
}>;
}
export function DashboardTab({ selectedTema }: DashboardTabProps) { export function DashboardTab({ selectedTema }: DashboardTabProps) {
const [dataroomInfo, setDataroomInfo] = useState<DataroomInfo | null>(null);
const [loading, setLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
if (selectedTema) {
fetchDataroomInfo();
}
}, [selectedTema]);
const fetchDataroomInfo = async () => {
if (!selectedTema) return;
setLoading(true);
setError(null);
try {
const info = await api.getDataroomInfo(selectedTema);
setDataroomInfo(info);
} catch (err) {
const errorMessage = err instanceof Error ? err.message : "Unknown error";
setError(`Unable to load dataroom info: ${errorMessage}`);
console.error("Error fetching dataroom info:", err);
} finally {
setLoading(false);
}
};
const formatFileTypes = (fileTypes: Record<string, number>) => {
const entries = Object.entries(fileTypes);
if (entries.length === 0) return "No files";
return entries
.sort(([, a], [, b]) => b - a) // Sort by count descending
.slice(0, 3) // Take top 3
.map(([ext, count]) => `${ext.toUpperCase()}: ${count}`)
.join(", ");
};
const formatBytes = (bytes: number) => {
if (bytes === 0) return "0 MB";
const mb = bytes / (1024 * 1024);
if (mb < 1) return `${(bytes / 1024).toFixed(1)} KB`;
return `${mb.toFixed(1)} MB`;
};
if (!selectedTema) { if (!selectedTema) {
return ( return (
<div className="flex flex-col items-center justify-center h-64"> <div className="flex flex-col items-center justify-center h-64">
<Activity className="w-12 h-12 text-gray-400 mb-4" /> <Activity className="w-12 h-12 text-gray-400 mb-4" />
<p className="text-gray-500"> <p className="text-gray-500">Select a dataroom to view its metrics</p>
Selecciona un dataroom para ver las métricas </div>
</p> );
}
if (loading) {
return (
<div className="flex flex-col items-center justify-center h-64">
<Loader2 className="w-8 h-8 text-blue-600 animate-spin mb-4" />
<p className="text-gray-600">Loading metrics</p>
</div>
);
}
if (error) {
return (
<div className="p-6">
<div className="bg-red-50 border border-red-200 rounded-lg p-4 flex items-center gap-3">
<AlertCircle className="w-5 h-5 text-red-600 flex-shrink-0" />
<div>
<p className="text-sm font-medium text-red-800">Error</p>
<p className="text-sm text-red-600">{error}</p>
</div>
</div>
</div>
);
}
if (!dataroomInfo) {
return (
<div className="flex flex-col items-center justify-center h-64">
<AlertCircle className="w-12 h-12 text-gray-400 mb-4" />
<p className="text-gray-500">Unable to load dataroom information</p>
</div> </div>
); );
} }
return ( return (
<div className="p-6"> <div className="p-6">
<div className="mb-6"> <h4 className="text-md font-semibold text-gray-900 mb-4">Metrics</h4>
<h3 className="text-lg font-semibold text-gray-900 mb-2">
Métricas del Dataroom: {selectedTema}
</h3>
<p className="text-sm text-gray-600">
Vista general del estado y actividad del dataroom
</p>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6"> <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6">
{/* Files Count Card */} {/* Files Count Card */}
<div className="bg-white border border-gray-200 rounded-lg p-4"> <div className="bg-white border border-gray-200 rounded-lg p-4">
@@ -35,8 +137,13 @@ export function DashboardTab({ selectedTema }: DashboardTabProps) {
<FileText className="w-5 h-5 text-blue-600" /> <FileText className="w-5 h-5 text-blue-600" />
</div> </div>
<div> <div>
<p className="text-sm font-medium text-gray-600">Archivos</p> <p className="text-sm font-medium text-gray-600">Files</p>
<p className="text-2xl font-bold text-gray-900">--</p> <p className="text-2xl font-bold text-gray-900">
{dataroomInfo.file_count}
</p>
<p className="text-xs text-gray-500 mt-1">
{formatFileTypes(dataroomInfo.file_types)}
</p>
</div> </div>
</div> </div>
</div> </div>
@@ -48,8 +155,13 @@ export function DashboardTab({ selectedTema }: DashboardTabProps) {
<Database className="w-5 h-5 text-green-600" /> <Database className="w-5 h-5 text-green-600" />
</div> </div>
<div> <div>
<p className="text-sm font-medium text-gray-600">Almacenamiento</p> <p className="text-sm font-medium text-gray-600">Storage</p>
<p className="text-2xl font-bold text-gray-900">--</p> <p className="text-2xl font-bold text-gray-900">
{dataroomInfo.total_size_mb.toFixed(1)} MB
</p>
<p className="text-xs text-gray-500 mt-1">
{formatBytes(dataroomInfo.total_size_bytes)}
</p>
</div> </div>
</div> </div>
</div> </div>
@@ -61,38 +173,98 @@ export function DashboardTab({ selectedTema }: DashboardTabProps) {
<Activity className="w-5 h-5 text-purple-600" /> <Activity className="w-5 h-5 text-purple-600" />
</div> </div>
<div> <div>
<p className="text-sm font-medium text-gray-600">Vectores</p> <p className="text-sm font-medium text-gray-600">Vectors</p>
<p className="text-2xl font-bold text-gray-900">--</p> <p className="text-2xl font-bold text-gray-900">
</div> {dataroomInfo.vector_count ?? 0}
</div> </p>
</div> <p className="text-xs text-gray-500 mt-1">
{dataroomInfo.collection_exists
{/* Activity Card */} ? "Indexed vectors"
<div className="bg-white border border-gray-200 rounded-lg p-4"> : "No vectors"}
<div className="flex items-center gap-3">
<div className="p-2 bg-orange-100 rounded-lg">
<Users className="w-5 h-5 text-orange-600" />
</div>
<div>
<p className="text-sm font-medium text-gray-600">Actividad</p>
<p className="text-2xl font-bold text-gray-900">--</p>
</div>
</div>
</div>
</div>
{/* Coming Soon Message */}
<div className="mt-8 bg-gray-50 border border-gray-200 rounded-lg p-6">
<div className="text-center">
<Activity className="w-8 h-8 text-gray-400 mx-auto mb-3" />
<h4 className="text-sm font-medium text-gray-900 mb-2">
Panel de Métricas
</h4>
<p className="text-sm text-gray-500">
Este panel se llenará con métricas detalladas y gráficos interactivos próximamente.
</p> </p>
</div> </div>
</div> </div>
</div> </div>
{/* Collection Status Card */}
<div className="bg-white border border-gray-200 rounded-lg p-4">
<div className="flex items-center gap-3">
<div className="p-2 bg-orange-100 rounded-lg">
<TrendingUp className="w-5 h-5 text-orange-600" />
</div>
<div>
<p className="text-sm font-medium text-gray-600">Status</p>
<div className="flex items-center gap-2">
<p className="text-2xl font-bold text-gray-900">
{dataroomInfo.collection_exists ? "Active" : "Inactive"}
</p>
{dataroomInfo.collection_exists ? (
<CheckCircle className="w-6 h-6 text-green-600" />
) : (
<AlertCircle className="w-6 h-6 text-yellow-600" />
)}
</div>
{dataroomInfo.collection_info ? (
<p className="text-xs text-gray-500 mt-1">
{dataroomInfo.collection_info.indexed_vectors_count}/
{dataroomInfo.collection_info.vectors_count} indexed vectors
</p>
) : (
<p className="text-xs text-gray-500 mt-1">
{dataroomInfo.collection_exists
? "Collection has no data"
: "No collection"}
</p>
)}
</div>
</div>
</div>
</div>
{/* Recent Files Section */}
{dataroomInfo.recent_files.length > 0 && (
<div className="mt-8">
<h4 className="text-md font-semibold text-gray-900 mb-4">
Recent Files
</h4>
<div className="bg-white border border-gray-200 rounded-lg overflow-hidden">
<div className="divide-y divide-gray-200">
{dataroomInfo.recent_files.map((file, index) => (
<div
key={index}
className="p-4 flex items-center justify-between hover:bg-gray-50"
>
<div className="flex items-center gap-3">
<FileText className="w-4 h-4 text-gray-400" />
<div>
<p className="text-sm font-medium text-gray-900">
{file.name}
</p>
<p className="text-xs text-gray-500">
{new Date(file.last_modified).toLocaleDateString(
"en-US",
{
year: "numeric",
month: "short",
day: "numeric",
hour: "2-digit",
minute: "2-digit",
},
)}
</p>
</div>
</div>
<div className="text-right">
<p className="text-sm text-gray-600">
{file.size_mb.toFixed(2)} MB
</p>
</div>
</div>
))}
</div>
</div>
</div>
)}
</div>
); );
} }
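Review note: the two formatting helpers in this component are pure, so they can be exercised outside React. The sketch below copies their logic into standalone functions; the sample inputs are invented for illustration.

```typescript
// Summarises the three most common file extensions, most frequent first.
function formatFileTypes(fileTypes: Record<string, number>): string {
  const entries = Object.entries(fileTypes);
  if (entries.length === 0) return "No files";
  return entries
    .sort(([, a], [, b]) => b - a) // sort by count, descending
    .slice(0, 3) // keep the top three
    .map(([ext, count]) => `${ext.toUpperCase()}: ${count}`)
    .join(", ");
}

// Formats a byte count as KB below 1 MiB and as MB otherwise.
function formatBytes(bytes: number): string {
  if (bytes === 0) return "0 MB";
  const mb = bytes / (1024 * 1024);
  if (mb < 1) return `${(bytes / 1024).toFixed(1)} KB`;
  return `${mb.toFixed(1)} MB`;
}

console.log(formatFileTypes({ pdf: 3, docx: 1, txt: 2, csv: 5 })); // "CSV: 5, PDF: 3, TXT: 2"
console.log(formatBytes(512)); // "0.5 KB"
console.log(formatBytes(5 * 1024 * 1024)); // "5.0 MB"
```

Note that the Storage card shows both `total_size_mb` and `formatBytes(total_size_bytes)`, so for datarooms of 1 MB or more the two lines display the same value.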


@@ -1,158 +1,111 @@
import { useEffect, useState } from "react"; import { useState } from "react";
import { useFileStore } from "@/stores/fileStore"; import { useFileStore } from "@/stores/fileStore";
import { api } from "@/services/api";
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs"; import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
import {
Dialog,
DialogContent,
DialogHeader,
DialogTitle,
} from "@/components/ui/dialog";
import { Button } from "@/components/ui/button";
import { Expand, Minimize2 } from "lucide-react";
import { FilesTab } from "./FilesTab"; import { FilesTab } from "./FilesTab";
import { DashboardTab } from "./DashboardTab"; import { DashboardTab } from "./DashboardTab";
import { ChatTab } from "./ChatTab"; import { ChatTab } from "./ChatTab";
import {
CheckCircle2,
AlertCircle,
Loader2,
} from "lucide-react";
interface DataroomViewProps { interface DataroomViewProps {
onProcessingChange?: (isProcessing: boolean) => void; onProcessingChange?: (isProcessing: boolean) => void;
} }
export function DataroomView({ onProcessingChange }: DataroomViewProps = {}) { export function DataroomView({ onProcessingChange }: DataroomViewProps = {}) {
const { selectedTema, files } = useFileStore(); const { selectedTema } = useFileStore();
// Collection status states
const [isCheckingCollection, setIsCheckingCollection] = useState(false);
const [collectionExists, setCollectionExists] = useState<boolean | null>(
null,
);
const [collectionError, setCollectionError] = useState<string | null>(null);
const [processing, setProcessing] = useState(false); const [processing, setProcessing] = useState(false);
const [fullscreenTab, setFullscreenTab] = useState<string | null>(null);
// Check collection status when tema changes const [currentTab, setCurrentTab] = useState("overview");
useEffect(() => {
checkCollectionStatus();
}, [selectedTema]);
// Load files when tema changes
useEffect(() => {
loadFiles();
}, [selectedTema]);
const checkCollectionStatus = async () => {
if (!selectedTema) {
setCollectionExists(null);
return;
}
setIsCheckingCollection(true);
setCollectionError(null);
try {
const result = await api.checkCollectionExists(selectedTema);
setCollectionExists(result.exists);
} catch (err) {
console.error("Error checking collection:", err);
setCollectionError(
err instanceof Error ? err.message : "Error al verificar colección",
);
setCollectionExists(null);
} finally {
setIsCheckingCollection(false);
}
};
const handleCreateCollection = async () => {
if (!selectedTema) return;
setIsCheckingCollection(true);
setCollectionError(null);
try {
const result = await api.createCollection(selectedTema);
if (result.success) {
setCollectionExists(true);
console.log(`Collection "${selectedTema}" created successfully`);
}
} catch (err) {
console.error("Error creating collection:", err);
setCollectionError(
err instanceof Error ? err.message : "Error al crear colección",
);
} finally {
setIsCheckingCollection(false);
}
};
const loadFiles = async () => {
// This will be handled by FilesTab component
};
const handleProcessingChange = (isProcessing: boolean) => { const handleProcessingChange = (isProcessing: boolean) => {
setProcessing(isProcessing); setProcessing(isProcessing);
onProcessingChange?.(isProcessing); onProcessingChange?.(isProcessing);
}; };
const totalFiles = files.length; const openFullscreen = (tabValue: string) => {
setFullscreenTab(tabValue);
};
const closeFullscreen = () => {
setFullscreenTab(null);
};
const renderTabContent = (tabValue: string, isFullscreen = false) => {
const className = isFullscreen ? "h-[calc(100vh-8rem)] flex flex-col" : "";
switch (tabValue) {
case "overview":
return (
<div className={className}>
<DashboardTab selectedTema={selectedTema} />
</div>
);
case "files":
return (
<div className={className}>
<FilesTab
selectedTema={selectedTema}
processing={processing}
onProcessingChange={handleProcessingChange}
/>
</div>
);
case "chat":
return (
<div className={className}>
<ChatTab selectedTema={selectedTema} />
</div>
);
default:
return null;
}
};
const getTabTitle = (tabValue: string) => {
switch (tabValue) {
case "overview":
return "Overview";
case "files":
return "Files";
case "chat":
return "Chat";
default:
return "";
}
};
return ( return (
<div className="flex flex-col h-full bg-white"> <div className="flex flex-col h-full bg-white">
<div className="border-b border-gray-200 px-6 py-4"> <div className="border-b border-gray-200 px-6 py-4">
<div className="flex flex-wrap items-center justify-between gap-4"> <div className="flex flex-wrap items-center justify-between gap-4">
<div> <div>
<div className="flex items-center gap-3 mb-2"> <h2 className="text-2xl font-semibold text-gray-900 mb-2">
<h2 className="text-2xl font-semibold text-gray-900"> {selectedTema ? `Dataroom: ${selectedTema}` : "Select a dataroom"}
{selectedTema
? `Dataroom: ${selectedTema}`
: "Selecciona un dataroom"}
</h2> </h2>
{/* Collection Status Indicator */}
{selectedTema && (
<div className="flex items-center gap-2">
{isCheckingCollection ? (
<>
<Loader2 className="w-4 h-4 animate-spin text-gray-500" />
<span className="text-xs text-gray-500">
Verificando...
</span>
</>
) : collectionExists === true ? (
<>
<CheckCircle2 className="w-4 h-4 text-green-600" />
<span className="text-xs text-green-600">
Colección disponible
</span>
</>
) : collectionExists === false ? (
<>
<AlertCircle className="w-4 h-4 text-yellow-600" />
<button
onClick={handleCreateCollection}
className="text-xs text-yellow-600 hover:text-yellow-700 underline"
>
Crear colección
</button>
</>
) : collectionError ? (
<>
<AlertCircle className="w-4 h-4 text-red-600" />
<span className="text-xs text-red-600">
Error de conexión
</span>
</>
) : null}
</div>
)}
</div>
<p className="text-sm text-gray-600"> <p className="text-sm text-gray-600">
{selectedTema {selectedTema
? `${totalFiles} archivo${totalFiles !== 1 ? "s" : ""}` ? "Manage files, review metrics, and chat with AI about the content."
: "Selecciona un dataroom de la barra lateral para ver sus archivos"} : "Pick a dataroom from the sidebar to get started."}
</p> </p>
</div> </div>
</div> </div>
</div> </div>
<Tabs defaultValue="files" className="flex flex-col flex-1"> <Tabs
value={currentTab}
onValueChange={setCurrentTab}
className="flex flex-col flex-1"
>
<div className="border-b border-gray-200 px-6 py-2"> <div className="border-b border-gray-200 px-6 py-2">
<TabsList className="flex h-10 w-full items-center gap-2 bg-transparent p-0 justify-start"> <TabsList className="flex h-10 w-full items-center gap-2 bg-transparent p-0 justify-between">
<div className="flex items-center gap-2">
<TabsTrigger <TabsTrigger
value="overview" value="overview"
className="rounded-md px-4 py-2 text-sm font-medium text-gray-600 transition data-[state=active]:bg-gray-900 data-[state=active]:text-white data-[state=active]:shadow" className="rounded-md px-4 py-2 text-sm font-medium text-gray-600 transition data-[state=active]:bg-gray-900 data-[state=active]:text-white data-[state=active]:shadow"
@@ -171,25 +124,60 @@ export function DataroomView({ onProcessingChange }: DataroomViewProps = {}) {
> >
Chat Chat
</TabsTrigger> </TabsTrigger>
</div>
<Button
variant="outline"
size="sm"
onClick={() => openFullscreen(currentTab)}
className="ml-auto"
>
<Expand className="h-4 w-4" />
<span className="sr-only">Open fullscreen</span>
</Button>
</TabsList> </TabsList>
</div> </div>
<TabsContent value="overview" className="mt-0 flex-1"> <TabsContent value="overview" className="mt-0 flex-1">
<DashboardTab selectedTema={selectedTema} /> {renderTabContent("overview")}
</TabsContent> </TabsContent>
<TabsContent value="files" className="mt-0 flex flex-1 flex-col"> <TabsContent value="files" className="mt-0 flex flex-1 flex-col">
<FilesTab {renderTabContent("files")}
selectedTema={selectedTema}
processing={processing}
onProcessingChange={handleProcessingChange}
/>
</TabsContent> </TabsContent>
<TabsContent value="chat" className="mt-0 flex-1"> <TabsContent value="chat" className="mt-0 flex-1">
<ChatTab selectedTema={selectedTema} /> {renderTabContent("chat")}
</TabsContent> </TabsContent>
</Tabs> </Tabs>
<Dialog
open={fullscreenTab !== null}
onOpenChange={(open: boolean) => !open && closeFullscreen()}
>
<DialogContent className="max-w-[100vw] max-h-[100vh] w-[100vw] h-[100vh] m-0 rounded-none [&>button]:hidden">
<DialogHeader className="flex flex-row items-center justify-between space-y-0 pb-4">
<DialogTitle className="text-xl font-semibold">
{selectedTema
? `${getTabTitle(fullscreenTab || "")} - ${selectedTema}`
: getTabTitle(fullscreenTab || "")}
</DialogTitle>
<Button
variant="outline"
size="sm"
onClick={closeFullscreen}
className="h-8 w-8 p-0"
>
<Minimize2 className="h-4 w-4" />
<span className="sr-only">Exit fullscreen</span>
</Button>
</DialogHeader>
<div className="flex-1 overflow-hidden">
{fullscreenTab && renderTabContent(fullscreenTab, true)}
</div>
</DialogContent>
</Dialog>
</div> </div>
); );
} }


@@ -1,4 +1,4 @@
import { Button } from '@/components/ui/button' import { Button } from "@/components/ui/button";
import { import {
Dialog, Dialog,
DialogContent, DialogContent,
@@ -6,17 +6,17 @@ import {
DialogFooter, DialogFooter,
DialogHeader, DialogHeader,
DialogTitle, DialogTitle,
} from '@/components/ui/dialog' } from "@/components/ui/dialog";
import { Trash2, AlertTriangle } from 'lucide-react' import { Trash2, AlertTriangle } from "lucide-react";
interface DeleteConfirmDialogProps { interface DeleteConfirmDialogProps {
open: boolean open: boolean;
onOpenChange: (open: boolean) => void onOpenChange: (open: boolean) => void;
onConfirm: () => void onConfirm: () => void;
title: string title: string;
description: string description: string;
fileList?: string[] fileList?: string[];
loading?: boolean loading?: boolean;
} }
export function DeleteConfirmDialog({ export function DeleteConfirmDialog({
@@ -26,7 +26,7 @@ export function DeleteConfirmDialog({
title, title,
description, description,
fileList, fileList,
loading = false loading = false,
}: DeleteConfirmDialogProps) { }: DeleteConfirmDialogProps) {
return ( return (
<Dialog open={open} onOpenChange={onOpenChange}> <Dialog open={open} onOpenChange={onOpenChange}>
@@ -41,7 +41,7 @@ export function DeleteConfirmDialog({
{fileList && fileList.length > 0 && ( {fileList && fileList.length > 0 && (
<div className="max-h-40 overflow-y-auto bg-gray-50 rounded p-3"> <div className="max-h-40 overflow-y-auto bg-gray-50 rounded p-3">
<p className="text-sm font-medium mb-2">Archivos a eliminar:</p> <p className="text-sm font-medium mb-2">Files to delete:</p>
<ul className="text-sm space-y-1"> <ul className="text-sm space-y-1">
{fileList.map((filename, index) => ( {fileList.map((filename, index) => (
<li key={index} className="flex items-center gap-2"> <li key={index} className="flex items-center gap-2">
@@ -59,17 +59,13 @@ export function DeleteConfirmDialog({
onClick={() => onOpenChange(false)} onClick={() => onOpenChange(false)}
disabled={loading} disabled={loading}
> >
Cancelar Cancel
</Button> </Button>
<Button <Button variant="destructive" onClick={onConfirm} disabled={loading}>
variant="destructive" {loading ? "Deleting…" : "Delete"}
onClick={onConfirm}
disabled={loading}
>
{loading ? 'Eliminando...' : 'Eliminar'}
</Button> </Button>
</DialogFooter> </DialogFooter>
</DialogContent> </DialogContent>
</Dialog> </Dialog>
) );
} }


@@ -1,4 +1,4 @@
import { useState } from "react"; import { useState, useEffect } from "react";
import { useFileStore } from "@/stores/fileStore"; import { useFileStore } from "@/stores/fileStore";
import { api } from "@/services/api"; import { api } from "@/services/api";
import { Button } from "@/components/ui/button"; import { Button } from "@/components/ui/button";
@@ -60,7 +60,7 @@ export function FilesTab({
const [deleting, setDeleting] = useState(false); const [deleting, setDeleting] = useState(false);
const [downloading, setDownloading] = useState(false); const [downloading, setDownloading] = useState(false);
// Estados para el modal de preview de PDF // PDF preview modal state
const [previewModalOpen, setPreviewModalOpen] = useState(false); const [previewModalOpen, setPreviewModalOpen] = useState(false);
const [previewFileUrl, setPreviewFileUrl] = useState<string | null>(null); const [previewFileUrl, setPreviewFileUrl] = useState<string | null>(null);
const [previewFileName, setPreviewFileName] = useState(""); const [previewFileName, setPreviewFileName] = useState("");
@@ -69,17 +69,22 @@ export function FilesTab({
); );
const [loadingPreview, setLoadingPreview] = useState(false); const [loadingPreview, setLoadingPreview] = useState(false);
// Estados para el modal de chunks // Chunk viewer modal state
const [chunkViewerOpen, setChunkViewerOpen] = useState(false); const [chunkViewerOpen, setChunkViewerOpen] = useState(false);
const [chunkFileName, setChunkFileName] = useState(""); const [chunkFileName, setChunkFileName] = useState("");
const [chunkFileTema, setChunkFileTema] = useState(""); const [chunkFileTema, setChunkFileTema] = useState("");
// Estados para chunking // LandingAI chunking state
const [chunkingConfigOpen, setChunkingConfigOpen] = useState(false); const [chunkingConfigOpen, setChunkingConfigOpen] = useState(false);
const [chunkingFileName, setChunkingFileName] = useState(""); const [chunkingFileName, setChunkingFileName] = useState("");
const [chunkingFileTema, setChunkingFileTema] = useState(""); const [chunkingFileTema, setChunkingFileTema] = useState("");
const [chunkingCollectionName, setChunkingCollectionName] = useState(""); const [chunkingCollectionName, setChunkingCollectionName] = useState("");
// Load files when component mounts or selectedTema changes
useEffect(() => {
loadFiles();
}, [selectedTema]);
const loadFiles = async () => { const loadFiles = async () => {
// Don't load files if no dataroom is selected // Don't load files if no dataroom is selected
if (!selectedTema) { if (!selectedTema) {
@@ -118,10 +123,10 @@ export function FilesTab({
setDeleting(true); setDeleting(true);
if (fileToDelete) { if (fileToDelete) {
// Eliminar archivo individual // Delete single file
await api.deleteFile(fileToDelete, selectedTema || undefined); await api.deleteFile(fileToDelete, selectedTema || undefined);
} else { } else {
// Eliminar archivos seleccionados // Delete selected files
const filesToDelete = Array.from(selectedFiles); const filesToDelete = Array.from(selectedFiles);
await api.deleteFiles(filesToDelete, selectedTema || undefined); await api.deleteFiles(filesToDelete, selectedTema || undefined);
clearSelection(); clearSelection();
@@ -154,9 +159,7 @@ export function FilesTab({
try { try {
setDownloading(true); setDownloading(true);
const filesToDownload = Array.from(selectedFiles); const filesToDownload = Array.from(selectedFiles);
const zipName = selectedTema const zipName = selectedTema ? `${selectedTema}_files` : "selected_files";
? `${selectedTema}_archivos`
: "archivos_seleccionados";
await api.downloadMultipleFiles( await api.downloadMultipleFiles(
filesToDownload, filesToDownload,
selectedTema || undefined, selectedTema || undefined,
@@ -213,10 +216,10 @@ export function FilesTab({
tema: chunkingFileTema, tema: chunkingFileTema,
collection_name: chunkingCollectionName, collection_name: chunkingCollectionName,
mode: config.mode, mode: config.mode,
schema_id: config.schemaId, schema_id: config.schema_id,
include_chunk_types: config.includeChunkTypes, include_chunk_types: config.include_chunk_types,
max_tokens_per_chunk: config.maxTokensPerChunk, max_tokens_per_chunk: config.max_tokens_per_chunk,
merge_small_chunks: config.mergeSmallChunks, merge_small_chunks: config.merge_small_chunks,
}; };
await api.processWithLandingAI(processConfig); await api.processWithLandingAI(processConfig);
@@ -229,7 +232,7 @@ export function FilesTab({
} }
}; };
// Filtrar archivos por término de búsqueda // Filter files by search term
const filteredFiles = files.filter((file) => const filteredFiles = files.filter((file) =>
file.name.toLowerCase().includes(searchTerm.toLowerCase()), file.name.toLowerCase().includes(searchTerm.toLowerCase()),
); );
@@ -245,7 +248,7 @@ export function FilesTab({
}; };
const formatDate = (dateString: string): string => { const formatDate = (dateString: string): string => {
return new Date(dateString).toLocaleDateString("es-ES", { return new Date(dateString).toLocaleDateString("en-US", {
year: "numeric", year: "numeric",
month: "short", month: "short",
day: "numeric", day: "numeric",
@@ -257,15 +260,15 @@ export function FilesTab({
const getDeleteDialogProps = () => { const getDeleteDialogProps = () => {
if (fileToDelete) { if (fileToDelete) {
return { return {
title: "Eliminar archivo", title: "Delete file",
message: `¿Estás seguro de que deseas eliminar el archivo "${fileToDelete}"?`, message: `Are you sure you want to delete "${fileToDelete}"?`,
fileList: [fileToDelete], fileList: [fileToDelete],
}; };
} else { } else {
const filesToDelete = Array.from(selectedFiles); const filesToDelete = Array.from(selectedFiles);
return { return {
title: "Eliminar archivos seleccionados", title: "Delete selected files",
message: `¿Estás seguro de que deseas eliminar ${filesToDelete.length} archivo${filesToDelete.length > 1 ? "s" : ""}?`, message: `Are you sure you want to delete ${filesToDelete.length} file${filesToDelete.length > 1 ? "s" : ""}?`,
fileList: filesToDelete, fileList: filesToDelete,
}; };
} }
@@ -275,9 +278,7 @@ export function FilesTab({
return ( return (
<div className="flex flex-col items-center justify-center h-64"> <div className="flex flex-col items-center justify-center h-64">
<FileText className="w-12 h-12 text-gray-400 mb-4" /> <FileText className="w-12 h-12 text-gray-400 mb-4" />
<p className="text-gray-500"> <p className="text-gray-500">Select a dataroom to view its files</p>
Selecciona un dataroom para ver sus archivos
</p>
</div> </div>
); );
} }
@@ -289,7 +290,7 @@ export function FilesTab({
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
<div className="animate-spin rounded-full h-4 w-4 border-b-2 border-blue-600"></div> <div className="animate-spin rounded-full h-4 w-4 border-b-2 border-blue-600"></div>
<span className="text-sm text-blue-800"> <span className="text-sm text-blue-800">
Procesando archivos con LandingAI... Processing files with LandingAI
</span> </span>
</div> </div>
</div> </div>
@@ -300,7 +301,7 @@ export function FilesTab({
<div className="relative flex-1 max-w-md"> <div className="relative flex-1 max-w-md">
<Search className="absolute left-3 top-1/2 transform -translate-y-1/2 text-gray-400 w-4 h-4" /> <Search className="absolute left-3 top-1/2 transform -translate-y-1/2 text-gray-400 w-4 h-4" />
<Input <Input
placeholder="Buscar archivos..." placeholder="Search files..."
value={searchTerm} value={searchTerm}
onChange={(e) => setSearchTerm(e.target.value)} onChange={(e) => setSearchTerm(e.target.value)}
className="pl-10" className="pl-10"
@@ -319,7 +320,7 @@ export function FilesTab({
className="gap-2" className="gap-2"
> >
<Download className="w-4 h-4" /> <Download className="w-4 h-4" />
Descargar ({selectedFiles.size}) Download ({selectedFiles.size})
</Button> </Button>
<Button <Button
variant="outline" variant="outline"
@@ -329,7 +330,7 @@ export function FilesTab({
className="gap-2 text-red-600 hover:text-red-700" className="gap-2 text-red-600 hover:text-red-700"
> >
<Trash2 className="w-4 h-4" /> <Trash2 className="w-4 h-4" />
Eliminar ({selectedFiles.size}) Delete ({selectedFiles.size})
</Button> </Button>
</> </>
)} )}
@@ -340,7 +341,7 @@ export function FilesTab({
className="gap-2" className="gap-2"
> >
<Upload className="w-4 h-4" /> <Upload className="w-4 h-4" />
Subir archivo Upload files
</Button> </Button>
</div> </div>
</div> </div>
@@ -350,17 +351,17 @@ export function FilesTab({
<div className="p-6"> <div className="p-6">
{loading ? ( {loading ? (
<div className="flex items-center justify-center h-64"> <div className="flex items-center justify-center h-64">
<p className="text-gray-500">Cargando archivos...</p> <p className="text-gray-500">Loading files</p>
</div> </div>
) : filteredFiles.length === 0 ? ( ) : filteredFiles.length === 0 ? (
<div className="flex flex-col items-center justify-center h-64"> <div className="flex flex-col items-center justify-center h-64">
<FileText className="w-12 h-12 text-gray-400 mb-4" /> <FileText className="w-12 h-12 text-gray-400 mb-4" />
<p className="text-gray-500"> <p className="text-gray-500">
{!selectedTema {!selectedTema
? "Selecciona un dataroom para ver sus archivos" ? "Select a dataroom to view its files"
: searchTerm : searchTerm
? "No se encontraron archivos" ? "No files match your search"
: "No hay archivos en este dataroom"} : "This dataroom has no files yet"}
</p> </p>
</div> </div>
) : ( ) : (
@@ -382,10 +383,10 @@ export function FilesTab({
}} }}
/> />
</TableHead> </TableHead>
<TableHead>Archivo</TableHead> <TableHead>File</TableHead>
<TableHead>Tamaño</TableHead> <TableHead>Size</TableHead>
<TableHead>Modificado</TableHead> <TableHead>Modified</TableHead>
<TableHead className="text-right">Acciones</TableHead> <TableHead className="text-right">Actions</TableHead>
</TableRow> </TableRow>
</TableHeader> </TableHeader>
<TableBody> <TableBody>
@@ -413,7 +414,7 @@ export function FilesTab({
onClick={() => handlePreviewFile(file.name)} onClick={() => handlePreviewFile(file.name)}
disabled={loadingPreview} disabled={loadingPreview}
className="h-8 w-8 p-0" className="h-8 w-8 p-0"
title="Vista previa" title="Preview"
> >
<Eye className="w-4 h-4" /> <Eye className="w-4 h-4" />
</Button> </Button>
@@ -422,7 +423,7 @@ export function FilesTab({
size="sm" size="sm"
onClick={() => handleViewChunks(file.name)} onClick={() => handleViewChunks(file.name)}
className="h-8 w-8 p-0" className="h-8 w-8 p-0"
title="Ver chunks" title="View chunks"
> >
<MessageSquare className="w-4 h-4" /> <MessageSquare className="w-4 h-4" />
</Button> </Button>
@@ -431,7 +432,7 @@ export function FilesTab({
size="sm" size="sm"
onClick={() => handleStartChunking(file.name)} onClick={() => handleStartChunking(file.name)}
className="h-8 w-8 p-0" className="h-8 w-8 p-0"
title="Procesar con LandingAI" title="Process with LandingAI"
> >
<Scissors className="w-4 h-4" /> <Scissors className="w-4 h-4" />
</Button> </Button>
@@ -441,7 +442,7 @@ export function FilesTab({
onClick={() => handleDownloadFile(file.name)} onClick={() => handleDownloadFile(file.name)}
disabled={downloading} disabled={downloading}
className="h-8 w-8 p-0" className="h-8 w-8 p-0"
title="Descargar" title="Download"
> >
<Download className="w-4 h-4" /> <Download className="w-4 h-4" />
</Button> </Button>
@@ -451,7 +452,7 @@ export function FilesTab({
onClick={() => handleDeleteFile(file.name)} onClick={() => handleDeleteFile(file.name)}
disabled={deleting} disabled={deleting}
className="h-8 w-8 p-0 text-red-600 hover:text-red-700 hover:bg-red-50" className="h-8 w-8 p-0 text-red-600 hover:text-red-700 hover:bg-red-50"
title="Eliminar" title="Delete"
> >
<Trash2 className="w-4 h-4" /> <Trash2 className="w-4 h-4" />
</Button> </Button>
@@ -498,7 +499,7 @@ export function FilesTab({
tema={chunkFileTema} tema={chunkFileTema}
/> />
{/* Modal de configuración de chunking con LandingAI */} {/* LandingAI chunking config modal */}
<ChunkingConfigModalLandingAI <ChunkingConfigModalLandingAI
isOpen={chunkingConfigOpen} isOpen={chunkingConfigOpen}
onClose={() => setChunkingConfigOpen(false)} onClose={() => setChunkingConfigOpen(false)}


@@ -1,25 +1,20 @@
import { useState, useEffect } from 'react' import { useState, useEffect } from "react";
import { import {
Dialog, Dialog,
DialogContent, DialogContent,
DialogHeader, DialogHeader,
DialogTitle, DialogTitle,
DialogDescription DialogDescription,
} from '@/components/ui/dialog' } from "@/components/ui/dialog";
import { Button } from '@/components/ui/button' import { Button } from "@/components/ui/button";
import { import { Download, Loader2, FileText, ExternalLink } from "lucide-react";
Download,
Loader2,
FileText,
ExternalLink
} from 'lucide-react'
interface PDFPreviewModalProps { interface PDFPreviewModalProps {
open: boolean open: boolean;
onOpenChange: (open: boolean) => void onOpenChange: (open: boolean) => void;
fileUrl: string | null fileUrl: string | null;
fileName: string fileName: string;
onDownload?: () => void onDownload?: () => void;
} }
export function PDFPreviewModal({ export function PDFPreviewModal({
@@ -27,45 +22,40 @@ export function PDFPreviewModal({
onOpenChange, onOpenChange,
fileUrl, fileUrl,
fileName, fileName,
onDownload onDownload,
}: PDFPreviewModalProps) { }: PDFPreviewModalProps) {
// Estado para manejar el loading del iframe // Track iframe loading state
const [loading, setLoading] = useState(true) const [loading, setLoading] = useState(true);
// Efecto para manejar el timeout del loading // Hide loading if iframe never fires onLoad
useEffect(() => { useEffect(() => {
if (open && fileUrl) { if (open && fileUrl) {
setLoading(true) setLoading(true);
// Timeout para ocultar loading automáticamente después de 3 segundos
// Algunos iframes no disparan onLoad correctamente
const timeout = setTimeout(() => { const timeout = setTimeout(() => {
setLoading(false) setLoading(false);
}, 3000) }, 3000);
return () => clearTimeout(timeout) return () => clearTimeout(timeout);
} }
}, [open, fileUrl]) }, [open, fileUrl]);
// Manejar cuando el iframe termina de cargar
const handleIframeLoad = () => { const handleIframeLoad = () => {
setLoading(false) setLoading(false);
} };
// Abrir PDF en nueva pestaña
const openInNewTab = () => { const openInNewTab = () => {
if (fileUrl) { if (fileUrl) {
window.open(fileUrl, '_blank') window.open(fileUrl, "_blank");
}
} }
};
// Reiniciar loading cuando cambia el archivo
const handleOpenChange = (open: boolean) => { const handleOpenChange = (open: boolean) => {
if (open) { if (open) {
setLoading(true) setLoading(true);
}
onOpenChange(open)
} }
onOpenChange(open);
};
return ( return (
<Dialog open={open} onOpenChange={handleOpenChange}> <Dialog open={open} onOpenChange={handleOpenChange}>
@@ -75,81 +65,68 @@ export function PDFPreviewModal({
<FileText className="w-5 h-5" /> <FileText className="w-5 h-5" />
{fileName} {fileName}
</DialogTitle> </DialogTitle>
<DialogDescription> <DialogDescription>PDF preview</DialogDescription>
Vista previa del documento PDF
</DialogDescription>
</DialogHeader> </DialogHeader>
{/* Barra de controles */} {/* Controls */}
<div className="flex items-center justify-between gap-4 px-6 py-3 border-b bg-gray-50"> <div className="flex items-center justify-between gap-4 px-6 py-3 border-b bg-gray-50">
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
<Button <Button
variant="outline" variant="outline"
size="sm" size="sm"
onClick={openInNewTab} onClick={openInNewTab}
title="Abrir en nueva pestaña" title="Open in new tab"
> >
<ExternalLink className="w-4 h-4 mr-2" /> <ExternalLink className="w-4 h-4 mr-2" />
Abrir en pestaña nueva Open in new tab
</Button> </Button>
</div> </div>
{/* Botón de descarga */} {/* Download button */}
{onDownload && ( {onDownload && (
<Button <Button
variant="outline" variant="outline"
size="sm" size="sm"
onClick={onDownload} onClick={onDownload}
title="Descargar archivo" title="Download file"
> >
<Download className="w-4 h-4 mr-2" /> <Download className="w-4 h-4 mr-2" />
Descargar Download
</Button> </Button>
)} )}
</div> </div>
{/* Área de visualización del PDF con iframe */} {/* PDF iframe */}
<div className="flex-1 relative bg-gray-100"> <div className="flex-1 relative bg-gray-100 overflow-hidden min-h-0">
{!fileUrl ? ( {!fileUrl ? (
<div className="flex items-center justify-center h-full text-center text-gray-500 p-8"> <div className="flex items-center justify-center h-full text-center text-gray-500 p-8">
<div> <div>
<FileText className="w-16 h-16 mx-auto mb-4 text-gray-400" /> <FileText className="w-16 h-16 mx-auto mb-4 text-gray-400" />
<p>No se ha proporcionado un archivo para previsualizar</p> <p>No file available for preview</p>
</div> </div>
</div> </div>
) : ( ) : (
<> <>
{/* Indicador de carga */} {/* Loading state */}
{loading && ( {loading && (
<div className="absolute inset-0 flex items-center justify-center bg-white z-10"> <div className="absolute inset-0 flex items-center justify-center bg-white z-10">
<div className="text-center"> <div className="text-center">
<Loader2 className="w-12 h-12 animate-spin text-blue-500 mx-auto mb-4" /> <Loader2 className="w-12 h-12 animate-spin text-blue-500 mx-auto mb-4" />
<p className="text-gray-600">Cargando PDF...</p> <p className="text-gray-600">Loading PDF</p>
</div> </div>
</div> </div>
)} )}
{/*
Iframe para mostrar el PDF
El navegador maneja toda la visualización, zoom, scroll, etc.
Esto muestra el PDF exactamente como se vería si lo abrieras directamente
*/}
<iframe <iframe
src={fileUrl} src={fileUrl}
className="w-full h-full border-0" className="w-full h-full border-0"
title={`Vista previa de ${fileName}`} title={`Preview of ${fileName}`}
onLoad={handleIframeLoad} onLoad={handleIframeLoad}
style={{ minHeight: '600px' }}
/> />
</> </>
)} )}
</div> </div>
{/* Footer con información */}
<div className="px-6 py-3 border-t bg-gray-50 text-xs text-gray-500 text-center">
{fileName}
</div>
</DialogContent> </DialogContent>
</Dialog> </Dialog>
) );
} }


@@ -84,7 +84,7 @@ export function Sidebar({
const handleCreateDataroom = async () => { const handleCreateDataroom = async () => {
const trimmed = newDataroomName.trim(); const trimmed = newDataroomName.trim();
if (!trimmed) { if (!trimmed) {
setCreateError("El nombre es obligatorio"); setCreateError("Name is required");
return; return;
} }
@@ -108,7 +108,7 @@ export function Sidebar({
setCreateError( setCreateError(
error instanceof Error error instanceof Error
? error.message ? error.message
: "No se pudo crear el dataroom. Inténtalo nuevamente.", : "Could not create the dataroom. Please try again.",
); );
} finally { } finally {
setCreatingDataroom(false); setCreatingDataroom(false);
@@ -168,15 +168,15 @@ export function Sidebar({
tema: string, tema: string,
e: React.MouseEvent<HTMLButtonElement>, e: React.MouseEvent<HTMLButtonElement>,
) => { ) => {
e.stopPropagation(); // Evitar que se seleccione el tema al hacer clic en el icono e.stopPropagation(); // Prevent selecting the dataroom when clicking delete
const confirmed = window.confirm( const confirmed = window.confirm(
`¿Estás seguro de que deseas eliminar el dataroom "${tema}"?\n\n` + `Are you sure you want to delete the dataroom "${tema}"?\n\n` +
`Esto eliminará:\n` + `This will remove:\n` +
`El dataroom de la base de datos\n` + `The dataroom from the database\n` +
`Todos los archivos del tema en Azure Blob Storage\n` + `All files stored for this topic in Azure Blob Storage\n` +
`La colección "${tema}" en Qdrant (si existe)\n\n` + `The "${tema}" collection in Qdrant (if it exists)\n\n` +
`Esta acción no se puede deshacer.`, `This action cannot be undone.`,
); );
if (!confirmed) return; if (!confirmed) return;
@@ -191,10 +191,10 @@ export function Sidebar({
console.error(`Error deleting dataroom "${tema}":`, error); console.error(`Error deleting dataroom "${tema}":`, error);
// If dataroom deletion fails, fall back to legacy deletion // If dataroom deletion fails, fall back to legacy deletion
// Eliminar todos los archivos del tema en Azure Blob Storage // Delete all topic files in Azure Blob Storage
await api.deleteTema(tema); await api.deleteTema(tema);
// Intentar eliminar la colección en Qdrant (si existe) // Attempt to delete the Qdrant collection (if it exists)
try { try {
const collectionExists = await api.checkCollectionExists(tema); const collectionExists = await api.checkCollectionExists(tema);
if (collectionExists.exists) { if (collectionExists.exists) {
@@ -202,7 +202,7 @@ export function Sidebar({
} }
} catch (collectionError) { } catch (collectionError) {
console.warn( console.warn(
`No se pudo eliminar la colección "${tema}" de Qdrant:`, `Could not delete the "${tema}" collection from Qdrant:`,
collectionError, collectionError,
); );
} }
@@ -216,9 +216,9 @@ export function Sidebar({
setSelectedTema(null); setSelectedTema(null);
} }
} catch (error) { } catch (error) {
console.error(`Error eliminando dataroom "${tema}":`, error); console.error(`Error deleting dataroom "${tema}":`, error);
alert( alert(
`Error al eliminar el dataroom: ${error instanceof Error ? error.message : "Error desconocido"}`, `Unable to delete dataroom: ${error instanceof Error ? error.message : "Unknown error"}`,
); );
} finally { } finally {
setDeletingTema(null); setDeletingTema(null);
@@ -251,9 +251,7 @@ export function Sidebar({
className="text-slate-400 hover:text-slate-100" className="text-slate-400 hover:text-slate-100"
onClick={onToggleCollapse} onClick={onToggleCollapse}
disabled={disabled} disabled={disabled}
aria-label={ aria-label={collapsed ? "Expand sidebar" : "Collapse sidebar"}
collapsed ? "Expandir barra lateral" : "Contraer barra lateral"
}
> >
{collapsed ? ( {collapsed ? (
<ChevronRight className="h-4 w-4" /> <ChevronRight className="h-4 w-4" />
@@ -273,16 +271,13 @@ export function Sidebar({
collapsed ? "justify-center" : "justify-between", collapsed ? "justify-center" : "justify-between",
)} )}
> >
<h2 {!collapsed && (
className={cn( <h2 className="text-sm font-medium text-slate-300">
"text-sm font-medium text-slate-300", Datarooms
collapsed && "text-xs text-center",
)}
>
{collapsed ? "Rooms" : "Datarooms"}
</h2> </h2>
)}
{renderWithTooltip( {renderWithTooltip(
"Crear dataroom", "Create",
<Button <Button
variant="ghost" variant="ghost"
size="sm" size="sm"
@@ -296,15 +291,15 @@ export function Sidebar({
disabled={disabled || creatingDataroom} disabled={disabled || creatingDataroom}
> >
<Plus className="h-4 w-4" /> <Plus className="h-4 w-4" />
{!collapsed && <span>Crear dataroom</span>} {!collapsed && <span>Create</span>}
</Button>, </Button>,
)} )}
</div> </div>
{/* Lista de temas */} {/* Dataroom list */}
{loading ? ( {loading ? (
<div className="text-sm text-slate-400 px-3 py-2 text-center"> <div className="text-sm text-slate-400 px-3 py-2 text-center">
{collapsed ? "..." : "Cargando..."} {collapsed ? "..." : "Loading..."}
</div> </div>
) : Array.isArray(temas) && temas.length > 0 ? ( ) : Array.isArray(temas) && temas.length > 0 ? (
temas.map((tema) => ( temas.map((tema) => (
@@ -334,7 +329,7 @@ export function Sidebar({
onClick={(e) => handleDeleteTema(tema, e)} onClick={(e) => handleDeleteTema(tema, e)}
disabled={deletingTema === tema || disabled} disabled={deletingTema === tema || disabled}
className="absolute right-2 top-1/2 -translate-y-1/2 p-1.5 rounded hover:bg-red-500/20 opacity-0 group-hover:opacity-100 transition-opacity disabled:opacity-50" className="absolute right-2 top-1/2 -translate-y-1/2 p-1.5 rounded hover:bg-red-500/20 opacity-0 group-hover:opacity-100 transition-opacity disabled:opacity-50"
title="Eliminar dataroom y colección" title="Delete dataroom and collection"
> >
<Trash2 className="h-4 w-4 text-red-400" /> <Trash2 className="h-4 w-4 text-red-400" />
</button> </button>
@@ -344,8 +339,8 @@ export function Sidebar({
) : ( ) : (
<div className="text-sm text-slate-400 px-3 py-2 text-center"> <div className="text-sm text-slate-400 px-3 py-2 text-center">
{Array.isArray(temas) && temas.length === 0 {Array.isArray(temas) && temas.length === 0
? "No hay datarooms" ? "No datarooms found"
: "Cargando datarooms..."} : "Loading datarooms..."}
</div> </div>
)} )}
</div> </div>
@@ -360,7 +355,7 @@ export function Sidebar({
> >
{onNavigateToSchemas && {onNavigateToSchemas &&
renderWithTooltip( renderWithTooltip(
"Gestionar Schemas", "Manage schemas",
<Button <Button
variant="default" variant="default"
size="sm" size="sm"
@@ -373,12 +368,12 @@ export function Sidebar({
> >
<Database className={cn("h-4 w-4", !collapsed && "mr-2")} /> <Database className={cn("h-4 w-4", !collapsed && "mr-2")} />
<span className={cn(collapsed && "sr-only")}> <span className={cn(collapsed && "sr-only")}>
Gestionar Schemas Manage Schemas
</span> </span>
</Button>, </Button>,
)} )}
{renderWithTooltip( {renderWithTooltip(
"Actualizar datarooms", "Refresh datarooms",
<Button <Button
variant="ghost" variant="ghost"
size="sm" size="sm"
@@ -391,7 +386,7 @@ export function Sidebar({
> >
<RefreshCcw className={cn("mr-2 h-4 w-4", collapsed && "mr-0")} /> <RefreshCcw className={cn("mr-2 h-4 w-4", collapsed && "mr-0")} />
<span className={cn(collapsed && "sr-only")}> <span className={cn(collapsed && "sr-only")}>
Actualizar datarooms Refresh datarooms
</span> </span>
</Button>, </Button>,
)} )}
@@ -406,14 +401,14 @@ export function Sidebar({
aria-describedby="create-dataroom-description" aria-describedby="create-dataroom-description"
> >
<DialogHeader> <DialogHeader>
<DialogTitle>Crear dataroom</DialogTitle> <DialogTitle>Create dataroom</DialogTitle>
<DialogDescription id="create-dataroom-description"> <DialogDescription id="create-dataroom-description">
Define un nombre único para organizar tus archivos. Choose a unique name to organize your files.
</DialogDescription> </DialogDescription>
</DialogHeader> </DialogHeader>
<div className="space-y-3"> <div className="space-y-3">
<div className="space-y-2"> <div className="space-y-2">
<Label htmlFor="dataroom-name">Nombre del dataroom</Label> <Label htmlFor="dataroom-name">Dataroom name</Label>
<Input <Input
id="dataroom-name" id="dataroom-name"
value={newDataroomName} value={newDataroomName}
@@ -423,7 +418,7 @@ export function Sidebar({
setCreateError(null); setCreateError(null);
} }
}} }}
placeholder="Ej: normativa, contratos, fiscal..." placeholder="e.g., policies, contracts, finance..."
autoFocus autoFocus
/> />
{createError && ( {createError && (
@@ -437,13 +432,13 @@ export function Sidebar({
onClick={() => handleCreateDialogOpenChange(false)} onClick={() => handleCreateDialogOpenChange(false)}
disabled={creatingDataroom} disabled={creatingDataroom}
> >
Cancelar Cancel
</Button> </Button>
<Button <Button
onClick={handleCreateDataroom} onClick={handleCreateDataroom}
disabled={creatingDataroom || newDataroomName.trim() === ""} disabled={creatingDataroom || newDataroomName.trim() === ""}
> >
{creatingDataroom ? "Creando..." : "Crear dataroom"} {creatingDataroom ? "Creating…" : "Create"}
</Button> </Button>
</DialogFooter> </DialogFooter>
</DialogContent> </DialogContent>

@@ -0,0 +1,206 @@
import React, { useState } from "react";
import {
Globe,
ExternalLink,
Search,
ChevronDown,
ChevronRight,
Info,
Star,
} from "lucide-react";
import { cn } from "@/lib/utils";
interface SearchResult {
title: string;
url: string;
content: string;
score?: number;
}
interface WebSearchData {
query: string;
results: SearchResult[];
summary: string;
total_results: number;
}
interface WebSearchResultsProps {
data: WebSearchData;
}
const getScoreColor = (score?: number) => {
if (!score) return "text-gray-500";
if (score >= 0.8) return "text-green-600";
if (score >= 0.6) return "text-yellow-600";
return "text-gray-500";
};
const getScoreStars = (score?: number) => {
if (!score) return 0;
return Math.round(score * 5);
};
const truncateContent = (content: string, maxLength: number = 200) => {
if (content.length <= maxLength) return content;
return content.slice(0, maxLength) + "...";
};
export const WebSearchResults: React.FC<WebSearchResultsProps> = ({ data }) => {
const [expandedResults, setExpandedResults] = useState<Set<number>>(new Set());
const [showAllResults, setShowAllResults] = useState(false);
const { query, results, summary, total_results } = data;
const toggleResult = (index: number) => {
const newExpanded = new Set(expandedResults);
if (newExpanded.has(index)) {
newExpanded.delete(index);
} else {
newExpanded.add(index);
}
setExpandedResults(newExpanded);
};
const visibleResults = showAllResults ? results : results.slice(0, 3);
return (
<div className="w-full bg-white border border-gray-200 rounded-lg shadow-sm overflow-hidden">
{/* Header */}
<div className="bg-gradient-to-r from-green-50 to-emerald-50 p-4">
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<Globe className="w-6 h-6 text-green-600" />
<div>
<h3 className="font-semibold text-gray-900">Web Search Results</h3>
<div className="flex items-center gap-2 text-xs text-gray-600">
<Search className="w-3 h-3" />
<span>"{query}"</span>
</div>
</div>
</div>
<div className="text-right">
<div className="text-sm font-medium text-gray-900">{results.length}</div>
<div className="text-xs text-gray-600">
of {total_results} results
</div>
</div>
</div>
</div>
<div className="p-4 space-y-4">
{/* Summary */}
{summary && (
<div className="bg-blue-50 border border-blue-200 rounded-lg p-3">
<div className="flex items-start gap-2">
<Info className="w-4 h-4 text-blue-600 mt-0.5 flex-shrink-0" />
<div>
<h4 className="font-medium text-blue-900 text-sm mb-1">
Summary
</h4>
<p className="text-sm text-blue-800">{summary}</p>
</div>
</div>
</div>
)}
{/* Search Results */}
<div className="space-y-3">
{visibleResults.map((result, index) => {
const isExpanded = expandedResults.has(index);
const stars = getScoreStars(result.score);
return (
<div key={index} className="border border-gray-200 rounded-lg overflow-hidden">
<div className="p-3">
<div className="flex items-start justify-between mb-2">
<div className="flex-1 min-w-0">
<h4 className="font-medium text-gray-900 text-sm mb-1 line-clamp-2">
{result.title}
</h4>
<div className="flex items-center gap-2">
<a
href={result.url}
target="_blank"
rel="noopener noreferrer"
className="text-xs text-blue-600 hover:text-blue-800 flex items-center gap-1 truncate"
>
<ExternalLink className="w-3 h-3 flex-shrink-0" />
{new URL(result.url).hostname}
</a>
{typeof result.score === "number" && (
<div className="flex items-center gap-1">
<div className="flex">
{[...Array(5)].map((_, i) => (
<Star
key={i}
className={cn(
"w-3 h-3",
i < stars
? "text-yellow-400 fill-current"
: "text-gray-300"
)}
/>
))}
</div>
<span className={cn("text-xs font-medium", getScoreColor(result.score))}>
{Math.round((result.score || 0) * 100)}%
</span>
</div>
)}
</div>
</div>
</div>
<div className="text-sm text-gray-700">
{isExpanded ? result.content : truncateContent(result.content)}
</div>
{result.content.length > 200 && (
<button
onClick={() => toggleResult(index)}
className="mt-2 flex items-center gap-1 text-xs text-blue-600 hover:text-blue-800 font-medium"
>
{isExpanded ? (
<>
<ChevronDown className="w-3 h-3" />
Show less
</>
) : (
<>
<ChevronRight className="w-3 h-3" />
Read more
</>
)}
</button>
)}
</div>
</div>
);
})}
</div>
{/* Show More/Less Button */}
{results.length > 3 && (
<div className="text-center">
<button
onClick={() => setShowAllResults(!showAllResults)}
className="text-sm text-blue-600 hover:text-blue-800 font-medium px-4 py-2 rounded-lg hover:bg-blue-50 transition-colors"
>
{showAllResults
? "Show fewer results"
: `Show ${results.length - 3} more results`}
</button>
</div>
)}
{/* No Results */}
{results.length === 0 && (
<div className="text-center py-6">
<Search className="w-8 h-8 text-gray-400 mx-auto mb-2" />
<p className="text-sm text-gray-600">No results found for "{query}"</p>
</div>
)}
</div>
</div>
);
};
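The component above calls `new URL(result.url).hostname` during render, which throws a `TypeError` when a result carries a malformed or scheme-less URL. A minimal sketch of a guard follows, assuming a hypothetical helper named `safeHostname` that is not part of this commit. The star mapping is restated under the hypothetical name `scoreToStars` so its rounding can be checked in isolation.

```typescript
// Hypothetical helper (not in the commit): returns the hostname of a URL,
// or the raw string unchanged when the URL cannot be parsed.
function safeHostname(rawUrl: string): string {
  try {
    return new URL(rawUrl).hostname;
  } catch (_error) {
    return rawUrl;
  }
}

// Same mapping as getScoreStars in the component: a 0..1 score becomes
// 0..5 stars, and a missing (or zero) score yields no stars.
function scoreToStars(score?: number): number {
  if (!score) return 0;
  return Math.round(score * 5);
}
```

With a helper like this, the link label could fall back to the raw URL text instead of breaking the whole results list on one bad entry.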

@@ -39,6 +39,30 @@ interface DataroomsResponse {
}>; }>;
} }
interface DataroomInfo {
name: string;
collection: string;
storage: string;
file_count: number;
total_size_bytes: number;
total_size_mb: number;
collection_exists: boolean;
vector_count: number | null;
collection_info: {
vectors_count: number;
indexed_vectors_count: number;
points_count: number;
segments_count: number;
status: string;
} | null;
file_types: Record<string, number>;
recent_files: Array<{
name: string;
size_mb: number;
last_modified: string;
}>;
}
interface CreateDataroomRequest { interface CreateDataroomRequest {
name: string; name: string;
collection?: string; collection?: string;
@@ -100,6 +124,15 @@ export const api = {
return response.json(); return response.json();
}, },
// Get detailed information for a dataroom
getDataroomInfo: async (dataroomName: string): Promise<DataroomInfo> => {
const response = await fetch(
`${API_BASE_URL}/dataroom/${encodeURIComponent(dataroomName)}/info`,
);
if (!response.ok) throw new Error("Error fetching dataroom info");
return response.json();
},
// Get files (all, or filtered by tema) // Get files (all, or filtered by tema)
getFiles: async (tema?: string): Promise<FileListResponse> => { getFiles: async (tema?: string): Promise<FileListResponse> => {
const url = tema const url = tema
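The `DataroomInfo` shape added in this hunk allows `vector_count` and `collection_info` to be `null`, so callers need to handle a dataroom whose collection does not exist. A minimal sketch of one way to consume the response follows, assuming a hypothetical formatter named `describeDataroom` and a reduced copy of the interface (named `DataroomInfoSummary` here). Neither name appears in the commit.

```typescript
// Reduced copy of the DataroomInfo interface from the diff (assumption:
// only these fields are needed for a one-line summary).
interface DataroomInfoSummary {
  name: string;
  file_count: number;
  total_size_mb: number;
  collection_exists: boolean;
  vector_count: number | null;
}

// Hypothetical helper: builds a one-line description and handles the
// null vector_count case explicitly.
function describeDataroom(info: DataroomInfoSummary): string {
  const vectors =
    info.collection_exists && info.vector_count !== null
      ? `${info.vector_count} vectors`
      : "no collection";
  return `${info.name}: ${info.file_count} files, ${info.total_size_mb.toFixed(2)} MB, ${vectors}`;
}
```

A caller would presumably pass the result of `api.getDataroomInfo(name)` to such a function and wrap the call in `try`/`catch`, since the API method throws a generic `Error` on any non-OK response.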

Binary file not shown.

@@ -0,0 +1,791 @@
*2
$6
SELECT
$1
0
*24
$14
FT._CREATEIFNX
$23
extracted_doc:doc:index
$2
ON
$4
HASH
$6
PREFIX
$1
1
$18
extracted_doc:doc:
$6
SCHEMA
$2
pk
$3
TAG
$9
SEPARATOR
$1
|
$9
file_name
$3
TAG
$9
SEPARATOR
$1
|
$4
tema
$3
TAG
$9
SEPARATOR
$1
|
$15
collection_name
$3
TAG
$9
SEPARATOR
$1
|
*3
$3
SET
$28
extracted_doc:doc:index:hash
$40
9de4cb60a645142de3d0f914909eb21259ea256c
*12
$14
FT._CREATEIFNX
$35
:app.models.dataroom.DataRoom:index
$2
ON
$4
HASH
$6
PREFIX
$1
1
$30
:app.models.dataroom.DataRoom:
$6
SCHEMA
$2
pk
$3
TAG
$9
SEPARATOR
$1
|
*3
$3
SET
$40
:app.models.dataroom.DataRoom:index:hash
$40
5ba27839f9c1b369df2b0734904dd56c02d2cea5
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9J296J6FHT5AKG92D2VX9VW
$2
pk
$26
01K9J296J6FHT5AKG92D2VX9VW
$4
name
$10
ABBEY C.U.
$10
collection
$0
$7
storage
$0
*2
$3
DEL
$56
:app.models.dataroom.DataRoom:01K9J296J6FHT5AKG92D2VX9VW
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9J2J0A0HPWJQEY0PXFCVPW8
$2
pk
$26
01K9J2J0A0HPWJQEY0PXFCVPW8
$4
name
$4
ABBY
$10
collection
$0
$7
storage
$0
*2
$3
DEL
$56
:app.models.dataroom.DataRoom:01K9J2J0A0HPWJQEY0PXFCVPW8
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9J2R5ZGS96G60P0G80W248Z
$2
pk
$26
01K9J2R5ZGS96G60P0G80W248Z
$4
name
$4
ABBY
$10
collection
$4
abby
$7
storage
$4
abby
*2
$6
SELECT
$1
0
*14
$4
HSET
$44
extracted_doc:doc:01K9J44CG40W28WKNC6RD5QSGM
$2
pk
$26
01K9J44CG40W28WKNC6RD5QSGM
$9
file_name
$21
tax_year_2022_990.pdf
$4
tema
$4
ABBY
$15
collection_name
$4
ABBY
$19
extracted_data_json
$5684
{
"ein": "31-0329725",
"legal_name": "0220 ABBEY CREDIT UNION INC",
"phone_number": "(397) 898-7800",
"website_url": "www.abbeycu.com",
"return_type": "990",
"amended_return": "",
"group_exemption_number": "",
"subsection_code": "501(c)(14)",
"ruling_date": "",
"accounting_method": "Accrual",
"organization_type": "Corporation",
"year_of_formation": "1937",
"incorporation_state": "OH",
"total_revenue": 6146738,
"contributions_gifts_grants": 0,
"program_service_revenue": 5656278,
"membership_dues": 0,
"investment_income": 490460,
"gains_losses_sales_assets": 0,
"rental_income": 0,
"related_organizations_revenue": 0,
"gaming_revenue": 0,
"other_revenue": 0,
"government_grants": 0,
"foreign_contributions": 0,
"total_expenses": 5526970,
"program_services_expenses": 386454,
"management_general_expenses": 5140516,
"fundraising_expenses": 0,
"grants_us_organizations": 0,
"grants_us_individuals": 0,
"grants_foreign_organizations": 0,
"grants_foreign_individuals": 0,
"compensation_officers": 629393,
"compensation_other_staff": 1329887,
"payroll_taxes_benefits": 452511,
"professional_fees": 96471,
"office_occupancy_costs": 292244,
"information_technology_costs": 287125,
"travel_conference_expenses": 26662,
"depreciation_amortization": 236836,
"insurance": 39570,
"officers_list": [
"Lisa Burk, Chief Experience Officer, 40.00 hrs/wk, Officer, $73,286 compensation, $8,931 other compensation",
"Eric Stetzel, VP of Business Services, 40.00 hrs/wk, Key employee, $111,064 compensation, $15,979 other compensation",
"Dean Pielemeier, CEO, 40.00 hrs/wk, Officer, $212,994 compensation, $20,202 other compensation",
"Blanca Criner, Chief Marketing and Business Development Officer, 40.00 hrs/wk, Officer, $121,036 compensation, $4,540 other compensation",
"Teri Puthoff, CFO, 40.00 hrs/wk, Officer, $120,962 compensation, $15,754 other compensation",
"Michael Thein, Chairman, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Latham Farley, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Steve Wilmoth, Treasurer, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Julie Trick, Secretary, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Michele Blake, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Cheryl Saunders, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Heather Scaggs-Richardson, Vice Chairman, 1.00 hrs/wk, Individual trustee or director, $0 compensation"
],
"governing_body_size": 7,
"independent_members": 7,
"financial_statements_reviewed": "No",
"form_990_provided_to_governing_body": "Yes",
"conflict_of_interest_policy": "Yes",
"whistleblower_policy": "Yes",
"document_retention_policy": "Yes",
"ceo_compensation_review_process": "Compensation committee, independent compensation consultant, approval by the board or compensation committee",
"public_disclosure_practices": "Audited financial statements are handed out each year at the annual meeting. Other documents are not made available to the public unless requested.",
"program_accomplishments_list": [
"Our mission is to help our members improve their economic well being and quality of life by being competitive convenient and cutting edge.",
"Abbey provides checking savings accounts money market accounts certificates of deposit and IRAs. Abbey also offers free atm debit card services wire transfer services mobile banking and insurance related products.",
"Abbey provides personal vehicle mortgage and credit card loans to its members."
],
"total_fundraising_event_revenue": 0,
"total_fundraising_event_expenses": 0,
"professional_fundraiser_fees": 0,
"number_of_employees": 50,
"number_of_volunteers": 7,
"occupancy_costs": 132399,
"fundraising_method_descriptions": "",
"joint_ventures_disregarded_entities": "",
"base_compensation": 546779,
"bonus": 34415,
"incentive": 59692,
"other_compensation": 24691,
"non_fixed_compensation": "",
"first_class_travel": "No",
"housing_allowance": "No",
"expense_account_usage": "No",
"supplemental_retirement": "No",
"lobbying_expenditures_direct": 0,
"lobbying_expenditures_grassroots": 0,
"election_501h_status": "",
"political_campaign_expenditures": 0,
"related_organizations_affiliates": "",
"investment_types": "Publicly traded securities",
"donor_restricted_endowment_values": 0,
"net_appreciation_depreciation": -2190939,
"related_organization_transactions": "",
"loans_to_from_related_parties": "Lynn Cook (Retired CEO) Life Insurance loan, $1,400,000; Dean Pielemeier (CEO) Life Insurance loan, $1,239,997",
"penalties_excise_taxes_reported": "No",
"unrelated_business_income_disclosure": "Yes",
"foreign_bank_account_reporting": "No",
"schedule_o_narrative_explanations": "Membership to Abbey is open to those who live work worship or attend school in Montgomery Miami Shelby Darke or Greene counties. Abbey CU is owned by the people who open a savings account at Abbey CU. Each member has one vote at the annual election of the Board of Directors. Decisions that require approval: Ohio Division of Financial Institutions Pursuant to regulation. CFO prepares the form and CEO reviews before submitting. Approval of Compensation Committee and Board of Directors for CEO compensation. CEO establishes compensation for Key Employees. Audited financial statements are handed out each year at the annual meeting. Other documents are not made available to the public unless requested."
}
$20
extraction_timestamp
$26
2025-11-08T16:17:31.137820
*2
$6
SELECT
$1
0
*14
$4
HSET
$44
extracted_doc:doc:01K9MH40SHHNWXNGAP4MDX6HZF
$2
pk
$26
01K9MH40SHHNWXNGAP4MDX6HZF
$9
file_name
$21
tax_year_2019_990.pdf
$4
tema
$4
ABBY
$15
collection_name
$4
ABBY
$19
extracted_data_json
$4954
{
"ein": "31-0329725",
"legal_name": "Abbey Credit Union Inc",
"phone_number": "937-898-7800",
"website_url": "www.abbeycu.com",
"return_type": "990",
"amended_return": "",
"group_exemption_number": "",
"subsection_code": "501(c)(14)",
"ruling_date": "",
"accounting_method": "Accrual",
"organization_type": "Corporation",
"year_of_formation": "1937",
"incorporation_state": "OH",
"total_revenue": 4659611,
"contributions_gifts_grants": 0,
"program_service_revenue": 4114069,
"membership_dues": 0,
"investment_income": 545542,
"gains_losses_sales_assets": 0,
"rental_income": 6725,
"related_organizations_revenue": 0,
"gaming_revenue": 0,
"other_revenue": 0,
"government_grants": 0,
"foreign_contributions": 0,
"total_expenses": 4159254,
"program_services_expenses": 0,
"management_general_expenses": 0,
"fundraising_expenses": 0,
"grants_us_organizations": 0,
"grants_us_individuals": 0,
"grants_foreign_organizations": 0,
"grants_foreign_individuals": 0,
"compensation_officers": 338208,
"compensation_other_staff": 1018956,
"payroll_taxes_benefits": 309032,
"professional_fees": 34366,
"office_occupancy_costs": 0,
"information_technology_costs": 0,
"travel_conference_expenses": 0,
"depreciation_amortization": 0,
"insurance": 0,
"officers_list": [
"Michael Thein, Chairman",
"Nancy Wood, Vice Chairman",
"Steve Wilmoth, Treasurer",
"Julie Trick, Secretary",
"Michele Blake, Board Member",
"Cheryl Saunders, Board Member",
"Heather Scaggs-Richardson, Board Member",
"Dean Pielemeier, CEO",
"Teri Puthoff, VP of Finance",
"Blanca Ortiz, VP of Business Development"
],
"governing_body_size": 7,
"independent_members": 7,
"financial_statements_reviewed": "Yes",
"form_990_provided_to_governing_body": "Yes",
"conflict_of_interest_policy": "Yes",
"whistleblower_policy": "Yes",
"document_retention_policy": "Yes",
"ceo_compensation_review_process": "Compensation for Dean Pielemeier, CEO is reviewed by the Personnel Committee of the board of directors and is acted upon by the board of directors per committee recommendation Compensation for officers is determined by Dean Pielemeier, CEO",
"public_disclosure_practices": "Documents are not made available to the public",
"program_accomplishments_list": [
"Our mission is to help our members improve their economic well-being and quality of life by being competitive, convenient, and cutting edge",
"LENDING - ABBEY PROVIDES PERSONAL, VEHICLE, MORTGAGE, AND CREDIT CARD LOANS TO ITS MEMBERS.",
"FINANCIAL SERVICES - ABBEY PROVIDES CHECKING ACCOUNTS, SAVINGS ACCOUNTS, MONEY MARKET ACCOUNTS"
],
"total_fundraising_event_revenue": 0,
"total_fundraising_event_expenses": 0,
"professional_fundraiser_fees": 0,
"number_of_employees": 36,
"number_of_volunteers": 7,
"occupancy_costs": 0,
"fundraising_method_descriptions": "",
"joint_ventures_disregarded_entities": "",
"base_compensation": 154796,
"bonus": 6409,
"incentive": 0,
"other_compensation": 4642,
"non_fixed_compensation": "",
"first_class_travel": "",
"housing_allowance": "",
"expense_account_usage": "",
"supplemental_retirement": "Yes",
"lobbying_expenditures_direct": 0,
"lobbying_expenditures_grassroots": 0,
"election_501h_status": "",
"political_campaign_expenditures": 0,
"related_organizations_affiliates": "",
"investment_types": "",
"donor_restricted_endowment_values": 0,
"net_appreciation_depreciation": 281072,
"related_organization_transactions": "",
"loans_to_from_related_parties": "LYNN COOK, PRIOR CEO, LIFE INSURAN, Loan from organization, $140,000 balance due",
"penalties_excise_taxes_reported": "",
"unrelated_business_income_disclosure": "Yes",
"foreign_bank_account_reporting": "No",
"schedule_o_narrative_explanations": "Our mission is to help our members improve their economic well-being and quality of life by being competitive, convenient, and cutting edge. Membership to Abbey is open to those who live, work, or worship, or attend school in Montgomery, Miami, Shelby, Darke, or Greene counties. Abbey Credit Union is owned by these people that open an account at Abbey CU. Yes - Each member gets one vote at the annual election. Yes - Pursuant to the regulation of the Ohio Division of Credit Unions. The VP of Finance prepared the Form 990 based on financial records Financial statements are audited annually The CEO reviews the returns prior to filing. The Credit Union has a written conflict of interest policy that states that board members are responsible for disclosing possible conflicts of interest as they arise This is reviewed annually. Compensation for Dean Pielemeier, CEO is reviewed by the Personnel Committee of the board of directors and is acted upon by the board of directors per committee recommendation Compensation for officers is determined by Dean Pielemeier, CEO."
}
$20
extraction_timestamp
$26
2025-11-09T14:42:59.494840
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9MNBVRY6KVF4XCK70JF70T0
$2
pk
$26
01K9MNBVRY6KVF4XCK70JF70T0
$4
name
$5
ABBEY
$10
collection
$5
abbey
$7
storage
$5
abbey
*14
$4
HSET
$44
extracted_doc:doc:01K9MNMNYH1MYP0TQ5SWZRS5BG
$2
pk
$26
01K9MNMNYH1MYP0TQ5SWZRS5BG
$9
file_name
$21
tax_year_2022_990.pdf
$4
tema
$5
ABBEY
$15
collection_name
$5
ABBEY
$19
extracted_data_json
$5525
{
"ein": "31-0329725",
"calendar_year": 2022,
"legal_name": "0220 ABBEY CREDIT UNION INC",
"phone_number": "(397) 898-7800",
"website_url": "www.abbeycu.com",
"return_type": "990",
"amended_return": "",
"group_exemption_number": "",
"subsection_code": "501(c)(14)",
"ruling_date": "",
"accounting_method": "Accrual",
"organization_type": "Corporation",
"year_of_formation": "1937",
"incorporation_state": "OH",
"total_revenue": 6146738,
"contributions_gifts_grants": 0,
"program_service_revenue": 5656278,
"membership_dues": 0,
"investment_income": 490460,
"gains_losses_sales_assets": 0,
"rental_income": 0,
"related_organizations_revenue": 0,
"gaming_revenue": 0,
"other_revenue": 0,
"government_grants": 0,
"foreign_contributions": 0,
"total_expenses": 5526970,
"program_services_expenses": 386454,
"management_general_expenses": 5140516,
"fundraising_expenses": 0,
"grants_us_organizations": 0,
"grants_us_individuals": 0,
"grants_foreign_organizations": 0,
"grants_foreign_individuals": 0,
"compensation_officers": 629393,
"compensation_other_staff": 1329887,
"payroll_taxes_benefits": 452511,
"professional_fees": 96462,
"office_occupancy_costs": 292244,
"information_technology_costs": 287125,
"travel_conference_expenses": 26662,
"depreciation_amortization": 236836,
"insurance": 39570,
"officers_list": [
"Lisa Burk, Chief Experience Officer, 40.00 hrs/wk, Officer, $73,286 compensation, $8,931 other compensation",
"Eric Stetzel, VP of Business Services, 40.00 hrs/wk, Key employee, $111,064 compensation, $15,979 other compensation",
"Dean Pielemeier, CEO, 40.00 hrs/wk, Officer, $212,994 compensation, $20,202 other compensation",
"Blanca Criner, Chief Marketing and Business Development Officer, 40.00 hrs/wk, Officer, $121,036 compensation, $4,540 other compensation",
"Teri Puthoff, CFO, 40.00 hrs/wk, Officer, $120,962 compensation, $15,754 other compensation",
"Michael Thein, Chairman, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Latham Farley, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Steve Wilmoth, Treasurer, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Julie Trick, Secretary, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Michele Blake, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Cheryl Saunders, Board Member, 1.00 hrs/wk, Individual trustee or director, $0 compensation",
"Heather Scaggs-Richardson, Vice Chairman, 1.00 hrs/wk, Individual trustee or director, $0 compensation"
],
"governing_body_size": 7,
"independent_members": 7,
"financial_statements_reviewed": "No",
"form_990_provided_to_governing_body": "Yes",
"conflict_of_interest_policy": "Yes",
"whistleblower_policy": "Yes",
"document_retention_policy": "Yes",
"ceo_compensation_review_process": "Approval of Compensation Committee and Board of Directors",
"public_disclosure_practices": "Audited financial statements are handed out each year at the annual meeting. Other documents are not made available to the public unless requested.",
"program_accomplishments_list": [
"Abbey provides checking savings accounts money market accounts certificates of deposit and IRAs. Abbey also offers free atm debit card services wire transfer services mobile banking and insurance related products.",
"Abbey provides personal vehicle mortgage and credit card loans to its members."
],
"total_fundraising_event_revenue": 0,
"total_fundraising_event_expenses": 0,
"professional_fundraiser_fees": 0,
"number_of_employees": 50,
"number_of_volunteers": 7,
"occupancy_costs": 132399,
"fundraising_method_descriptions": "",
"joint_ventures_disregarded_entities": "",
"base_compensation": 544779,
"bonus": 34315,
"incentive": 59692,
"other_compensation": 0,
"non_fixed_compensation": "",
"first_class_travel": "No",
"housing_allowance": "No",
"expense_account_usage": "No",
"supplemental_retirement": "No",
"lobbying_expenditures_direct": 0,
"lobbying_expenditures_grassroots": 0,
"election_501h_status": "",
"political_campaign_expenditures": 0,
"related_organizations_affiliates": "",
"investment_types": "Publicly traded securities",
"donor_restricted_endowment_values": 0,
"net_appreciation_depreciation": -2190939,
"related_organization_transactions": "",
"loans_to_from_related_parties": "Lynn Cook (Retired CEO) and Dean Pielemeier (CEO) received life insurance loans from the organization, each with a balance due of $1,400,000 and $1,239,997 respectively.",
"penalties_excise_taxes_reported": "No",
"unrelated_business_income_disclosure": "Yes",
"foreign_bank_account_reporting": "No",
"schedule_o_narrative_explanations": "Membership to Abbey is open to those who live work worship or attend school in Montgomery Miami Shelby Darke or Greene counties. Abbey CU is owned by the people who open a savings account at Abbey CU. Each member has one vote at the annual election of the Board of Directors. Decisions that require their approval: Ohio Division of Financial Institutions Pursuant to regulation. CFO prepares the form and CEO reviews before submitting. Approval of Compensation Committee and Board of Directors for CEO compensation. Audited financial statements are handed out each year at the annual meeting. Other documents are not made available to the public unless requested."
}
$20
extraction_timestamp
$26
2025-11-09T16:01:59.759953
*14
$4
HSET
$44
extracted_doc:doc:01K9MP8DBNWMRH9A966RRJ4E7V
$2
pk
$26
01K9MP8DBNWMRH9A966RRJ4E7V
$9
file_name
$21
tax_year_2019_990.pdf
$4
tema
$5
ABBEY
$15
collection_name
$5
ABBEY
$19
extracted_data_json
$4959
{
"ein": "31-0329725",
"calendar_year": 2019,
"legal_name": "Abbey Credit Union Inc",
"phone_number": "937-898-7800",
"website_url": "www.abbeycu.com",
"return_type": "990",
"amended_return": "",
"group_exemption_number": "",
"subsection_code": "501(c)(14)",
"ruling_date": "",
"accounting_method": "Accrual",
"organization_type": "Corporation",
"year_of_formation": "1937",
"incorporation_state": "OH",
"total_revenue": 4659611,
"contributions_gifts_grants": 0,
"program_service_revenue": 4114069,
"membership_dues": 0,
"investment_income": 545542,
"gains_losses_sales_assets": 0,
"rental_income": 6725,
"related_organizations_revenue": 0,
"gaming_revenue": 0,
"other_revenue": 0,
"government_grants": 0,
"foreign_contributions": 0,
"total_expenses": 4159254,
"program_services_expenses": 0,
"management_general_expenses": 0,
"fundraising_expenses": 0,
"grants_us_organizations": 0,
"grants_us_individuals": 0,
"grants_foreign_organizations": 0,
"grants_foreign_individuals": 0,
"compensation_officers": 338208,
"compensation_other_staff": 1018956,
"payroll_taxes_benefits": 273057,
"professional_fees": 34366,
"office_occupancy_costs": 200152,
"information_technology_costs": 0,
"travel_conference_expenses": 46563,
"depreciation_amortization": 0,
"insurance": 0,
"officers_list": {
"value": [
"Michael Thein, Chairman",
"Nancy Wood, Vice Chairman",
"Steve Wilmoth, Treasurer",
"Julie Trick, Secretary",
"Michele Blake, Board Member",
"Cheryl Saunders, Board Member",
"Heather Scaggs-Richardson, Board Member",
"Dean Pielemeier, CEO",
"Teri Puthoff, VP of Finance",
"Blanca Ortiz, VP of Business Development"
],
"chunk_references": []
},
"governing_body_size": 7,
"independent_members": 7,
"financial_statements_reviewed": "Yes",
"form_990_provided_to_governing_body": "Yes",
"conflict_of_interest_policy": "Yes",
"whistleblower_policy": "Yes",
"document_retention_policy": "Yes",
"ceo_compensation_review_process": "Compensation for Dean Pielemeier, CEO is reviewed by the Personnel Committee of the board of directors and is acted upon by the board of directors per committee recommendation.",
"public_disclosure_practices": "Documents are not made available to the public",
"program_accomplishments_list": {
"value": [
"LENDING - ABBEY PROVIDES PERSONAL, VEHICLE, MORTGAGE, AND CREDIT CARD LOANS TO ITS MEMBERS.",
"FINANCIAL SERVICES - ABBEY PROVIDES CHECKING ACCOUNTS, SAVINGS ACCOUNTS, MONEY MARKET ACCOUNTS, ..."
],
"chunk_references": []
},
"total_fundraising_event_revenue": 0,
"total_fundraising_event_expenses": 0,
"professional_fundraiser_fees": 0,
"number_of_employees": 36,
"number_of_volunteers": 7,
"occupancy_costs": 87912,
"fundraising_method_descriptions": "",
"joint_ventures_disregarded_entities": "",
"base_compensation": 154796,
"bonus": 6409,
"incentive": 0,
"other_compensation": 4642,
"non_fixed_compensation": "",
"first_class_travel": "",
"housing_allowance": "",
"expense_account_usage": "",
"supplemental_retirement": "Yes",
"lobbying_expenditures_direct": 0,
"lobbying_expenditures_grassroots": 0,
"election_501h_status": "",
"political_campaign_expenditures": 0,
"related_organizations_affiliates": "",
"investment_types": "",
"donor_restricted_endowment_values": 0,
"net_appreciation_depreciation": 281072,
"related_organization_transactions": "",
"loans_to_from_related_parties": "Loans to prior CEO Lynn Cook for life insurance, $1,400,000 total outstanding.",
"penalties_excise_taxes_reported": "",
"unrelated_business_income_disclosure": "Yes",
"foreign_bank_account_reporting": "No",
"schedule_o_narrative_explanations": "Our mission is to help our members improve their economic well-being and quality of life by being competitive, convenient, and cutting edge. Membership to Abbey is open to those who live, work, or worship, or attend school in Montgomery, Miami, Shelby, Darke, or Greene counties. Abbey Credit Union is owned by these people that open an account at Abbey CU. Yes - Each member gets one vote at the annual election. Yes - Pursuant to the regulation of the Ohio Division of Credit Unions. The VP of Finance prepared the Form 990 based on financial records. Financial statements are audited annually. The CEO reviews the returns prior to filing. The Credit Union has a written conflict of interest policy that states that board members are responsible for disclosing possible conflicts of interest as they arise. This is reviewed annually. Compensation for Dean Pielemeier, CEO is reviewed by the Personnel Committee of the board of directors and is acted upon by the board of directors per committee recommendation. Compensation for officers is determined by Dean Pielemeier, CEO. Documents are not made available to the public."
}
$20
extraction_timestamp
$26
2025-11-09T16:12:46.318554
*2
$3
DEL
$56
:app.models.dataroom.DataRoom:01K9J2R5ZGS96G60P0G80W248Z
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9MSH70T90HY2VB6DPJVQZFQ
$2
pk
$26
01K9MSH70T90HY2VB6DPJVQZFQ
$4
name
$9
NEW HEART
$10
collection
$9
new_heart
$7
storage
$9
new_heart
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9MSJ57MS3DZFBG5TQXBDD6W
$2
pk
$26
01K9MSJ57MS3DZFBG5TQXBDD6W
$4
name
$5
OHANA
$10
collection
$5
ohana
$7
storage
$5
ohana
*10
$4
HSET
$56
:app.models.dataroom.DataRoom:01K9MSJJTH48BR7KQ27PXB2C3S
$2
pk
$26
01K9MSJJTH48BR7KQ27PXB2C3S
$4
name
$5
SOSFS
$10
collection
$5
sosfs
$7
storage
$5
sosfs
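The file above appears to be a Redis command log in RESP framing: each command is an array header `*N` followed by N bulk strings, each written as `$<byte length>` and then the payload. A `$0` entry is an empty string, whose blank payload line is not visible in this listing. A minimal sketch of a reader for that framing follows, assuming a hypothetical function named `parseRespCommands`, CRLF line endings as in the raw protocol, and ASCII-only payloads so that byte lengths equal character counts.

```typescript
// Hypothetical parser: reads consecutive RESP arrays of bulk strings.
// Assumption: ASCII input, so "$<n>" byte lengths match string lengths.
function parseRespCommands(raw: string): string[][] {
  const commands: string[][] = [];
  let pos = 0;

  // Reads one CRLF-terminated header line starting at the current offset.
  const readLine = (): string => {
    const end = raw.indexOf("\r\n", pos);
    if (end === -1) throw new Error("unterminated line at offset " + pos);
    const line = raw.slice(pos, end);
    pos = end + 2;
    return line;
  };

  while (pos < raw.length) {
    const header = readLine();
    if (header.charAt(0) !== "*") throw new Error("expected array header, got " + header);
    const argCount = parseInt(header.slice(1), 10);
    const args: string[] = [];
    for (let i = 0; i < argCount; i++) {
      const lengthLine = readLine();
      if (lengthLine.charAt(0) !== "$") throw new Error("expected bulk header, got " + lengthLine);
      const length = parseInt(lengthLine.slice(1), 10);
      args.push(raw.slice(pos, pos + length));
      pos += length + 2; // skip the payload and its trailing CRLF
    }
    commands.push(args);
  }
  return commands;
}
```

Reading by declared length rather than by line is what lets the multi-line `extracted_data_json` payloads above survive intact. A production reader would operate on bytes instead of strings to handle non-ASCII values.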

BIN
redis_data/dump.rdb Normal file

Binary file not shown.

0
redis_data/temp-9.rdb Normal file