review: gate strategic-log, portfolio, chat, and digest on reviewer

Extends the reviewer agent — previously only protecting indicator summaries — to every AI-generated surface that reaches a user. The reviewer's prompt already rejects scratchpad, truncation, meta-commentary, and (since a6e476b) financial advice; wiring it in turns those rules from prompt-level "asks" into structural gates. Four call sites updated: - ai_log_job.run() : after each tone/analysis variant is generated, pass through review_read. On reject, log the reason and skip the StrategicLog insert; the API's existing "latest StrategicLog" lookup falls back to the previous clean log. - services/portfolio_analysis.analyse() : on reject, raise a clean RuntimeError that the /api/analyze router already maps to HTTP 502 with a retry-able message. Portfolio analysis isn't cached server- side, so the user retries; the reviewer's verdict reason goes into the AICall ledger as the leaked-status row's error column. - routers/chat.chat() : on reject, instead of returning the raw assistant content we return a short refusal explaining the limit and inviting a rephrase. Adds ~1-2 s of latency per turn (one extra LLM call to Haiku) — the only user-facing latency tax. - jobs/email_digest_job._generate_variants() : on reject, the variant is dropped for the cycle. Recipients on the rejected tone get no digest email this run, which is better than delivering inbox copy that drifts into advice (emails are unrecallable once sent). In every case the AICall ledger row records the reviewer cost so month_spend stays accurate across all paths. The reviewer system prompt is slightly generalised to cover both the indicator-summary case and the longer-form log/digest/chat case: - removes "short interpretive read" framing - softens the "any question" rule so genuine rhetorical structure in a long-form log doesn't trigger a reject tests/conftest.py grows an autouse fixture that stubs review_read to clean=True in every consumer module. Tests that mock the generator shouldn't have to also mock the safety gate behind it; tests that specifically want the reject branch can override with their own monkeypatch. test_output_review.py is unaffected — it imports review_read directly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:40:04 +02:00 · 2026-05-29 14:40:04 +02:00 · f9534f7ad6
commit f9534f7ad6
parent a6e476b851
6 changed files with 161 additions and 19 deletions
--- a/app/services/output_review.py
+++ b/app/services/output_review.py
@ -39,25 +39,29 @@ DEFAULT_REVIEWER_MODEL = "anthropic/claude-haiku-4.5"

 _SYSTEM_PROMPT = """\
 You are a strict editor for a financial-markets dashboard. The author
-was asked to produce a short interpretive read for human readers.
-You receive their proposed read and decide if it is publishable as-is.
+was asked to produce editorial commentary on public market data for
+human readers. You receive the proposed text — it may be a one-line
+read, a multi-paragraph daily log, a portfolio analysis, a chat
+reply, or an email digest — and decide if it is publishable as-is.

-Mark CLEAN only if the text reads like a finished interpretation a
-reader could see on a public dashboard without confusion.
+Mark CLEAN only if the text reads like finished editorial commentary
+a reader could see on a public dashboard without confusion.

 Mark UNCLEAN if the text contains ANY of:
- Chain-of-thought / scratchpad markers used as thinking — phrases like
+- Chain-of-thought / scratchpad markers — the author thinking on the
+  page rather than presenting finished commentary. Phrases like
  "Let me", "Let's see", "we need to", "actually" (correcting itself),
-  "wait", "hmm", "or rather", "I should".
+  "wait", "hmm", "or rather", "I should". Rhetorical questions used
+  as structure are fine; questions that the author then answers in
+  front of the reader (self-questioning) are not.
 - Self-questioning parentheticals: "Q1 2026? Actually Q4 2025?",
  "is it X or Y?", any place where the author appears to be working
  out the answer in front of the reader.
- Multiple rhetorical questions or any question that interrupts the
-  declarative voice. A clean interpretive read is assertive.
 - Meta-commentary about the task, output format, word limits, or
  instructions — e.g. "as required by the constraints", "the prompt
  asks", "let me address each".
- Partial / truncated content. Starts mid-word, mid-number, mid-clause.
+- Partial / truncated content. Starts mid-word, mid-number, mid-clause,
+  ends mid-thought.
 - Visible internal numbers without clear meaning ("change 1y +5.9%?"),
  raw column names ("as_of 2026-01-01"), or any debug-like fragments.
 - FINANCIAL ADVICE or any phrasing that recommends an action the
@ -75,7 +79,7 @@ Mark UNCLEAN if the text contains ANY of:
  "valuations are stretched", "real yields are restrictive", "rates
  and credit disagree". The test: does the text describe a STATE, or
  does it suggest an ACTION? States are fine; actions are not.
- Anything else other than the finished, publishable interpretation.
+- Anything else other than the finished, publishable commentary.

 Return ONLY a JSON object with this exact shape:
 {"clean": true | false, "reason": "<≤20 words, plain text>"}