Extends the reviewer agent — previously only protecting indicator
summaries — to every AI-generated surface that reaches a user. The
reviewer's prompt already rejects scratchpad, truncation,
meta-commentary, and (since a6e476b) financial advice; wiring it in
turns those rules from prompt-level "asks" into structural gates.
Four call sites updated:
- ai_log_job.run() : after each tone/analysis variant is generated,
pass through review_read. On reject, log the reason and skip the
StrategicLog insert; the API's existing "latest StrategicLog" lookup
falls back to the previous clean log.
- services/portfolio_analysis.analyse() : on reject, raise a clean
RuntimeError that the /api/analyze router already maps to HTTP 502
with a retry-able message. Portfolio analysis isn't cached server-
side, so the user retries; the reviewer's verdict reason goes into
the AICall ledger as the leaked-status row's error column.
- routers/chat.chat() : on reject, instead of returning the raw
assistant content we return a short refusal explaining the limit
and inviting a rephrase. Adds ~1-2 s of latency per turn (one extra
LLM call to Haiku) — the only user-facing latency tax.
- jobs/email_digest_job._generate_variants() : on reject, the variant
is dropped for the cycle. Recipients on the rejected tone get no
digest email this run, which is better than delivering inbox copy
that drifts into advice (emails are unrecallable once sent).
In every case the AICall ledger row records the reviewer cost so
month_spend stays accurate across all paths.
The reviewer system prompt is slightly generalised to cover both the
indicator-summary case and the longer-form log/digest/chat case:
- removes "short interpretive read" framing
- softens the "any question" rule so genuine rhetorical structure in
a long-form log doesn't trigger a reject
tests/conftest.py grows an autouse fixture that stubs review_read to
clean=True in every consumer module. Tests that mock the generator
shouldn't have to also mock the safety gate behind it; tests that
specifically want the reject branch can override with their own
monkeypatch. test_output_review.py is unaffected — it imports
review_read directly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
162 lines
7.3 KiB
Python
162 lines
7.3 KiB
Python
"""Second-pass reviewer agent for AI-generated reads.
|
|
|
|
The per-group and aggregate indicator summaries are generated in JSON
|
|
mode and the publishable text comes out of a single "read" field, but a
|
|
misbehaving model can still slip chain-of-thought INSIDE the field
|
|
("Let's see…", "X? Actually Y?", multi-question parentheticals). This
|
|
module makes a small second LLM call that judges the candidate read as
|
|
clean / unclean. Cost is ~$0.0001 per check; latency ~1-2 s in the
|
|
hourly job. No user-facing latency.
|
|
|
|
The reviewer is deliberately a tiny, JSON-shaped classifier — same
|
|
JSON-mode mechanism as the generator, so the verdict can't be lost in
|
|
prose. If parsing fails or the call errors, the row is rejected
|
|
(fail-safe: the previously cached good summary stays visible).
|
|
"""
|
|
from __future__ import annotations
|
|
|
|
import json
|
|
from dataclasses import dataclass
|
|
|
|
import httpx
|
|
|
|
from app.config import get_settings
|
|
from app.logging import get_logger
|
|
from app.services.openrouter import call_llm
|
|
|
|
log = get_logger("output_review")
|
|
|
|
|
|
# The reviewer runs through OpenRouter against a small, non-thinking
|
|
# model. DeepSeek-V4-flash (our generator default) emits internal
|
|
# chain-of-thought before its JSON output even when the prompt forbids
|
|
# it, which truncates the JSON at any reasonable max_tokens cap and
|
|
# breaks the parser. Anthropic's Haiku family answers structured-output
|
|
# tasks tersely and deterministically — no chain-of-thought tax. Cost
|
|
# is ~$0.0001-$0.0003 per review depending on candidate length.
|
|
DEFAULT_REVIEWER_MODEL = "anthropic/claude-haiku-4.5"
|
|
|
|
|
|
_SYSTEM_PROMPT = """\
|
|
You are a strict editor for a financial-markets dashboard. The author
|
|
was asked to produce editorial commentary on public market data for
|
|
human readers. You receive the proposed text — it may be a one-line
|
|
read, a multi-paragraph daily log, a portfolio analysis, a chat
|
|
reply, or an email digest — and decide if it is publishable as-is.
|
|
|
|
Mark CLEAN only if the text reads like finished editorial commentary
|
|
a reader could see on a public dashboard without confusion.
|
|
|
|
Mark UNCLEAN if the text contains ANY of:
|
|
- Chain-of-thought / scratchpad markers — the author thinking on the
|
|
page rather than presenting finished commentary. Phrases like
|
|
"Let me", "Let's see", "we need to", "actually" (correcting itself),
|
|
"wait", "hmm", "or rather", "I should". Rhetorical questions used
|
|
as structure are fine; questions that the author then answers in
|
|
front of the reader (self-questioning) are not.
|
|
- Self-questioning parentheticals: "Q1 2026? Actually Q4 2025?",
|
|
"is it X or Y?", any place where the author appears to be working
|
|
out the answer in front of the reader.
|
|
- Meta-commentary about the task, output format, word limits, or
|
|
instructions — e.g. "as required by the constraints", "the prompt
|
|
asks", "let me address each".
|
|
- Partial / truncated content. Starts mid-word, mid-number, mid-clause,
|
|
ends mid-thought.
|
|
- Visible internal numbers without clear meaning ("change 1y +5.9%?"),
|
|
raw column names ("as_of 2026-01-01"), or any debug-like fragments.
|
|
- FINANCIAL ADVICE or any phrasing that recommends an action the
|
|
reader should take. This service is editorial commentary on public
|
|
data, not investment advice; the operator is not licensed to give
|
|
it. Reject any of:
|
|
* Buy/sell/hold/accumulate/trim/exit/enter/rotate language.
|
|
* Allocation guidance ("overweight", "underweight",
|
|
"X% in bonds", "increase exposure to").
|
|
* Price targets or specific level predictions ("will reach $X",
|
|
"target Y", "expect Z by year-end").
|
|
* Personalised framing ("you should", "investors should",
|
|
"consider buying", "we recommend").
|
|
DESCRIPTIVE / INTERPRETIVE language about market state is fine —
|
|
"valuations are stretched", "real yields are restrictive", "rates
|
|
and credit disagree". The test: does the text describe a STATE, or
|
|
does it suggest an ACTION? States are fine; actions are not.
|
|
- Anything else other than the finished, publishable commentary.
|
|
|
|
Return ONLY a JSON object with this exact shape:
|
|
{"clean": true | false, "reason": "<≤20 words, plain text>"}
|
|
No preamble, no markdown fences, no other fields.
|
|
"""
|
|
|
|
|
|
@dataclass(frozen=True)
|
|
class Verdict:
|
|
clean: bool
|
|
reason: str
|
|
cost_usd: float | None # cost of the review call itself, for the ledger
|
|
|
|
|
|
async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict:
|
|
"""Ask the LLM whether `candidate` is a publishable read.
|
|
|
|
Returns Verdict(clean, reason, cost). Any error — provider failure,
|
|
JSON parse failure, missing field, wrong type — yields a CONSERVATIVE
|
|
verdict (clean=False) so the caller drops the candidate. The
|
|
previously cached good summary stays visible on the dashboard."""
|
|
if not candidate or not candidate.strip():
|
|
return Verdict(clean=False, reason="empty candidate", cost_usd=0.0)
|
|
|
|
messages = [
|
|
{"role": "system", "content": _SYSTEM_PROMPT},
|
|
# Sent as a fenced user turn so the model can't confuse the
|
|
# candidate with instructions, even if the candidate happens to
|
|
# contain prompt-like prose.
|
|
{"role": "user", "content": f"Candidate read:\n```\n{candidate}\n```"},
|
|
]
|
|
settings = get_settings()
|
|
reviewer_model = getattr(settings, "REVIEWER_MODEL", None) or DEFAULT_REVIEWER_MODEL
|
|
try:
|
|
result = await call_llm(
|
|
client, messages,
|
|
# Pin to OpenRouter so a non-DeepSeek model like Haiku is
|
|
# actually reachable; the default provider chain would try
|
|
# DeepSeek native first and 404 on the Anthropic model name.
|
|
provider="openrouter",
|
|
model=reviewer_model,
|
|
# 300 tokens is well above the ~30-token JSON verdict.
|
|
# Haiku doesn't pad with hidden reasoning the way DeepSeek
|
|
# does, so we don't need the 800-token headroom required to
|
|
# absorb the generator's chain-of-thought.
|
|
max_tokens=300,
|
|
response_format={"type": "json_object"},
|
|
)
|
|
except Exception as e:
|
|
log.warning("review.call_failed", error=str(e)[:200])
|
|
return Verdict(clean=False, reason=f"reviewer error: {str(e)[:80]}",
|
|
cost_usd=None)
|
|
|
|
# Haiku (and several other models) occasionally wrap their JSON
|
|
# output in a markdown code fence even with response_format set —
|
|
# ```json\n{...}\n``` — so strip a single leading/trailing fence
|
|
# before parsing. We do this defensively for any model; it's a
|
|
# no-op for callers that already emit bare JSON.
|
|
raw = result.content.strip()
|
|
if raw.startswith("```"):
|
|
first_nl = raw.find("\n")
|
|
if first_nl != -1:
|
|
raw = raw[first_nl + 1:]
|
|
if raw.rstrip().endswith("```"):
|
|
raw = raw.rstrip()[:-3].rstrip()
|
|
raw = raw.strip()
|
|
|
|
try:
|
|
parsed = json.loads(raw)
|
|
except json.JSONDecodeError:
|
|
log.warning("review.parse_failed", preview=result.content[:200])
|
|
return Verdict(clean=False, reason="reviewer returned non-JSON",
|
|
cost_usd=result.cost_usd)
|
|
|
|
clean = parsed.get("clean")
|
|
reason = parsed.get("reason") or ""
|
|
if not isinstance(clean, bool):
|
|
return Verdict(clean=False, reason="reviewer omitted bool 'clean'",
|
|
cost_usd=result.cost_usd)
|
|
return Verdict(clean=clean, reason=str(reason)[:200], cost_usd=result.cost_usd)
|