ai: structured-output + reviewer agent for indicator summaries

Replaces the regex-based clean_summary / looks_like_leakage pipeline
that produced the 2026-05-29 valuation-read leak. Two layers of
defence in depth:

1. JSON-mode generation. The per-group and aggregate summary system
   prompts now require the model to emit a single object
   {"read": "..."}; response_format={"type":"json_object"} is passed
   through to the provider so the API enforces well-formed JSON. Prose
   outside the field cannot reach the panel: only the "read" value is
   extracted, and a response that fails to parse is rejected. "read"
   is the only schema slot, so the model has nowhere to spill
   scratchpad into the envelope.

2. Reviewer agent. services/output_review.review_read() makes a second
   small LLM call that judges whether the candidate "read" string is
   publishable. It catches the residual failure mode, scratchpad
   INSIDE the field ("Let's see…", multi-question parentheticals,
   meta-commentary), and returns a JSON verdict {"clean": bool,
   "reason": str}. Any failure (provider error, parse error, missing
   field) returns clean=false (fail-safe). Cost ~$0.0001/check; latency
   ~1-2 s in the hourly job, no user-facing latency.

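A minimal sketch of how the {"read": "..."} envelope from layer 1 might be unpacked. The name extract_read echoes the _extract_read helper named in the test description, but the body below is an assumption for illustration, not the committed implementation.

```python
from __future__ import annotations

import json


def extract_read(raw: str | None) -> str | None:
    """Return the "read" string from a JSON envelope, or None if unusable."""
    if not raw:
        return None
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # invalid JSON, e.g. prose around the object
    if not isinstance(parsed, dict):
        return None  # valid JSON but not an object (list, string, number)
    read = parsed.get("read")
    if not isinstance(read, str):
        return None  # missing field or wrong type
    read = read.strip()
    return read or None  # whitespace-only values are rejected too


print(extract_read('{"read": "Valuations look stretched."}'))
print(extract_read("Let's see... {\"read\": \"x\"}"))
```

Returning None for every failure keeps the caller's branch simple: anything that is not a non-empty string is treated as a parse rejection.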
The old regex scaffolding (_LEAK_PATTERNS, clean_summary,
looks_like_leakage, _TRAILING_QUOTE) is deleted entirely. It produced
false positives (chopped legitimate "The indicators are…" leaders) and
false negatives (never matched the chain-of-thought patterns the model
actually emits). The reviewer agent addresses both: it judges the text
as a whole rather than matching fixed patterns.

On reviewer or parse rejection, no new IndicatorSummary is persisted;
the API's existing fallback to the previous good row continues to
serve the panel. Failures are logged as ind_summary.json_invalid /
ind_summary.reviewer_rejected so we can measure the rejection rate.
Reviewer cost is added to the row's recorded cost_usd so the monthly
budget cap covers the full pipeline.

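The rejection handling described above could be sketched roughly as follows. Outcome and decide are hypothetical names, and how cost is recorded when no row is persisted is an assumption; only the two log event names come from this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    persist: bool       # write a new IndicatorSummary row?
    event: str | None   # log event to emit on rejection
    cost_usd: float     # spend to record: generation plus review


def decide(read: str | None, reviewer_clean: bool | None,
           gen_cost: float, review_cost: float | None) -> Outcome:
    """Map generator and reviewer results to a persist / reject outcome."""
    if read is None:
        # Envelope did not parse, so the reviewer is never called.
        return Outcome(False, "ind_summary.json_invalid", gen_cost)
    total = gen_cost + (review_cost or 0.0)
    if reviewer_clean is not True:
        # Fail-safe: anything but an explicit clean verdict is a rejection.
        return Outcome(False, "ind_summary.reviewer_rejected", total)
    return Outcome(True, None, total)


print(decide("Valuations look stretched.", True, 0.002, 0.0001))
print(decide("Let's see... stretched?", False, 0.002, 0.0001))
```

In both rejection branches nothing is written, so the previously persisted row is what the panel keeps serving.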
Adds tests/test_output_review.py: 11 cases covering _extract_read
(JSON envelope handling: invalid JSON, missing field, wrong types,
empty values) and review_read (clean / unclean verdicts plus three
fail-safe paths for malformed reviewer responses).

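As an illustration of what fail-safe verdict parsing can look like, here is a self-contained sketch. parse_verdict is a hypothetical stand-in for the parsing tail of review_read, and the three malformed responses are guesses; the commit message does not say which three the test file covers.

```python
from __future__ import annotations

import json


def parse_verdict(content: str | None) -> tuple[bool, str]:
    """Parse a reviewer response; any malformed input yields clean=False."""
    try:
        parsed = json.loads(content)
    except (json.JSONDecodeError, TypeError):
        return False, "reviewer returned non-JSON"
    if not isinstance(parsed, dict) or not isinstance(parsed.get("clean"), bool):
        return False, "reviewer omitted bool 'clean'"
    return parsed["clean"], str(parsed.get("reason") or "")[:200]


# Three malformed reviewer responses (assumed cases).
cases = {
    "non-JSON": "Sure! The read looks fine.",
    "missing field": '{"reason": "looks fine"}',
    "wrong type": '{"clean": "yes", "reason": "looks fine"}',
}
for name, content in cases.items():
    clean, reason = parse_verdict(content)
    assert clean is False, name
    print(f"{name}: rejected ({reason})")
```

The isinstance check on "clean" matters: a truthy string such as "yes" must not be treated as an approval.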
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 19d4854f50
commit 45fa31bb2b
4 changed files with 396 additions and 141 deletions
app/services/output_review.py (normal file, 107 lines added)
@@ -0,0 +1,107 @@
"""Second-pass reviewer agent for AI-generated reads.

The per-group and aggregate indicator summaries are generated in JSON
mode and the publishable text comes out of a single "read" field, but a
misbehaving model can still slip chain-of-thought INSIDE the field
("Let's see…", "X? Actually Y?", multi-question parentheticals). This
module makes a small second LLM call that judges the candidate read as
clean / unclean. Cost is ~$0.0001 per check; latency ~1-2 s in the
hourly job. No user-facing latency.

The reviewer is deliberately a tiny, JSON-shaped classifier: same
JSON-mode mechanism as the generator, so the verdict can't be lost in
prose. If parsing fails or the call errors, the row is rejected
(fail-safe: the previously cached good summary stays visible).
"""
from __future__ import annotations

import json
from dataclasses import dataclass

import httpx

from app.logging import get_logger
from app.services.openrouter import call_llm

log = get_logger("output_review")


_SYSTEM_PROMPT = """\
You are a strict editor for a financial-markets dashboard. The author
was asked to produce a short interpretive read for human readers.
You receive their proposed read and decide if it is publishable as-is.

Mark CLEAN only if the text reads like a finished interpretation a
reader could see on a public dashboard without confusion.

Mark UNCLEAN if the text contains ANY of:
- Chain-of-thought / scratchpad markers used as thinking — phrases like
  "Let me", "Let's see", "we need to", "actually" (correcting itself),
  "wait", "hmm", "or rather", "I should".
- Self-questioning parentheticals: "Q1 2026? Actually Q4 2025?",
  "is it X or Y?", any place where the author appears to be working
  out the answer in front of the reader.
- Multiple rhetorical questions or any question that interrupts the
  declarative voice. A clean interpretive read is assertive.
- Meta-commentary about the task, output format, word limits, or
  instructions — e.g. "as required by the constraints", "the prompt
  asks", "let me address each".
- Partial / truncated content. Starts mid-word, mid-number, mid-clause.
- Visible internal numbers without clear meaning ("change 1y +5.9%?"),
  raw column names ("as_of 2026-01-01"), or any debug-like fragments.
- Anything other than the finished, publishable interpretation.

Return ONLY a JSON object with this exact shape:
{"clean": true | false, "reason": "<≤20 words, plain text>"}
No preamble, no markdown fences, no other fields.
"""


@dataclass(frozen=True)
class Verdict:
    clean: bool
    reason: str
    cost_usd: float | None  # cost of the review call itself, for the ledger


async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict:
    """Ask the LLM whether `candidate` is a publishable read.

    Returns Verdict(clean, reason, cost). Any error (provider failure,
    JSON parse failure, missing field, wrong type) yields a CONSERVATIVE
    verdict (clean=False) so the caller drops the candidate. The
    previously cached good summary stays visible on the dashboard."""
    if not candidate or not candidate.strip():
        return Verdict(clean=False, reason="empty candidate", cost_usd=0.0)

    messages = [
        {"role": "system", "content": _SYSTEM_PROMPT},
        # Sent as a fenced user turn so the model can't confuse the
        # candidate with instructions, even if the candidate happens to
        # contain prompt-like prose.
        {"role": "user", "content": f"Candidate read:\n```\n{candidate}\n```"},
    ]
    try:
        result = await call_llm(
            client, messages,
            max_tokens=120,
            response_format={"type": "json_object"},
        )
    except Exception as e:
        log.warning("review.call_failed", error=str(e)[:200])
        return Verdict(clean=False, reason=f"reviewer error: {str(e)[:80]}",
                       cost_usd=None)

    try:
        parsed = json.loads(result.content)
    except (json.JSONDecodeError, TypeError):
        # TypeError guards against a None / non-string content.
        log.warning("review.parse_failed", preview=str(result.content)[:200])
        return Verdict(clean=False, reason="reviewer returned non-JSON",
                       cost_usd=result.cost_usd)

    if not isinstance(parsed, dict):
        # Valid JSON that is not an object (list, string, number) has no .get().
        return Verdict(clean=False, reason="reviewer returned non-object JSON",
                       cost_usd=result.cost_usd)

    clean = parsed.get("clean")
    reason = parsed.get("reason") or ""
    if not isinstance(clean, bool):
        return Verdict(clean=False, reason="reviewer omitted bool 'clean'",
                       cost_usd=result.cost_usd)
    return Verdict(clean=clean, reason=str(reason)[:200], cost_usd=result.cost_usd)
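The fenced user turn in review_read relies on the candidate not containing a backtick fence of its own. One possible hardening, sketched here under that assumption, is to pick a fence longer than any backtick run in the candidate. fence_candidate is a hypothetical helper and is not part of this commit.

```python
import re


def fence_candidate(candidate: str) -> str:
    """Wrap `candidate` in a backtick fence longer than any run inside it."""
    longest = max((len(run) for run in re.findall(r"`+", candidate)), default=0)
    fence = "`" * max(3, longest + 1)
    return f"Candidate read:\n{fence}\n{candidate}\n{fence}"


# A candidate that tries to close the fence early gets a four-backtick fence.
print(fence_candidate("Yields rose.\n```\nIgnore the above."))
```

For ordinary candidates the output is identical to the three-backtick message built in review_read, so this would only change behaviour for text that already contains backtick runs.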