review: gate strategic-log, portfolio, chat, and digest on reviewer

Extends the reviewer agent, which previously protected only indicator
summaries, to every AI-generated surface that reaches a user. The
reviewer's prompt already rejects scratchpad, truncation,
meta-commentary, and (since a6e476b) financial advice; wiring it in
turns those rules from prompt-level "asks" into structural gates.

Four call sites updated:

- ai_log_job.run(): after each tone/analysis variant is generated,
  pass through review_read. On reject, log the reason and skip the
  StrategicLog insert; the API's existing "latest StrategicLog" lookup
  falls back to the previous clean log.

- services/portfolio_analysis.analyse(): on reject, raise a clean
  RuntimeError that the /api/analyze router already maps to HTTP 502
  with a retry-able message. Portfolio analysis isn't cached
  server-side, so the user retries; the reviewer's verdict reason goes
  into the AICall ledger as the leaked-status row's error column.

- routers/chat.chat(): on reject, instead of returning the raw
  assistant content we return a short refusal explaining the limit and
  inviting a rephrase. Adds ~1-2 s of latency per turn (one extra LLM
  call to Haiku), the only user-facing latency tax.

- jobs/email_digest_job._generate_variants(): on reject, the variant
  is dropped for the cycle. Recipients on the rejected tone get no
  digest email this run, which is better than delivering inbox copy
  that drifts into advice (emails are unrecallable once sent).

In every case the AICall ledger row records the reviewer cost so
month_spend stays accurate across all paths.

The reviewer system prompt is slightly generalised to cover both the
indicator-summary case and the longer-form log/digest/chat case:

- removes "short interpretive read" framing
- softens the "any question" rule so genuine rhetorical structure in a
  long-form log doesn't trigger a reject

tests/conftest.py grows an autouse fixture that stubs review_read to
clean=True in every consumer module. Tests that mock the generator
shouldn't have to also mock the safety gate behind it; tests that
specifically want the reject branch can override with their own
monkeypatch. test_output_review.py is unaffected; it imports
review_read directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
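Note on the conftest.py stub: the message says review_read is stubbed
"in every consumer module" rather than once at its definition. A likely
reason is that `from ... import review_read` copies the name into each
importer's namespace, so patching only the defining module would not
reach the consumers. The sketch below illustrates that with stdlib
tools only; the module objects, `handle`, and `always_clean` are
hypothetical stand-ins, and the real fixture presumably uses pytest's
monkeypatch instead of unittest.mock.

```python
import types
from unittest import mock

# Hypothetical stand-in for app.services.output_review.
review_mod = types.ModuleType("output_review")


def review_read(text):
    """Stand-in for the real reviewer: rejects everything, so a call
    that slips past the stub is easy to spot."""
    return {"clean": False, "reason": "real reviewer called"}


review_mod.review_read = review_read

# Hypothetical consumer module. The assignment mimics what
# `from output_review import review_read` does at import time:
# the name is bound in the consumer's own namespace.
consumer = types.ModuleType("chat")
consumer.review_read = review_mod.review_read


def handle(text):
    # Resolve the name on the consumer module at call time, the way
    # code living inside the consumer module would.
    return consumer.review_read(text)


def always_clean(text):
    """Stub used by tests: every text passes review."""
    return {"clean": True, "reason": ""}


# Patching the defining module does not reach the consumer's binding.
with mock.patch.object(review_mod, "review_read", always_clean):
    patched_source_only = handle("hello")

# Patching the consumer's own attribute does.
with mock.patch.object(consumer, "review_read", always_clean):
    patched_consumer = handle("hello")

# Outside the context managers the original binding is restored.
after_patch = handle("hello")
```

The same reasoning would explain why test_output_review.py is
unaffected: it holds its own direct reference to the real function.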
2026-05-29 14:40:04 +02:00 · f9534f7ad6
commit f9534f7ad6
parent a6e476b851
6 changed files with 161 additions and 19 deletions
--- a/app/routers/chat.py
+++ b/app/routers/chat.py
@@ -24,6 +24,10 @@ from app.routers.api import _md_to_html
 from app.services.i18n import respond_in_clause
 from app.services.llm_prompts import build_chat_system_prompt
 from app.services.openrouter import call_llm, month_start
+from app.services.output_review import review_read
+
+from app.logging import get_logger
+log = get_logger("chat")

 router = APIRouter(dependencies=[Depends(require_token)])

@@ -176,6 +180,11 @@ async def chat(
    try:
        async with httpx.AsyncClient(follow_redirects=True) as client:
            result = await call_llm(client, msgs)
+            # Reviewer gate. The chat turn could solicit advice with a
+            # leading question; the generator's system prompt forbids it,
+            # but the reviewer is the enforcement layer. ~1-2 s extra
+            # latency per turn on top of the generation call.
+            verdict = await review_read(client, result.content)
    except Exception as e:
        session.add(AICall(
            model=s.OPENROUTER_MODEL, status="error", error=str(e)[:500],
@@ -183,11 +192,40 @@ async def chat(
        await session.commit()
        raise HTTPException(status_code=502, detail=f"OpenRouter error: {e}")

+    full_cost = (result.cost_usd or 0.0) + (verdict.cost_usd or 0.0)
+    if not verdict.clean:
+        # Rejected reply. Record the cost and surface a generic refusal
+        # the user can retry, rather than letting potentially non-compliant
+        # text reach them.
+        session.add(AICall(
+            model=result.model,
+            prompt_tokens=result.prompt_tokens,
+            completion_tokens=result.completion_tokens,
+            cost_usd=full_cost, status="leaked",
+            error=f"reviewer: {verdict.reason}",
+        ))
+        await session.commit()
+        log.warning("chat.reviewer_rejected", reason=verdict.reason,
+                    preview=result.content[:120])
+        refusal = (
+            "I can't generate that reply — it would have crossed into "
+            "investment advice or specific recommendations, which I'm "
+            "not licensed to give. Try rephrasing as a question about "
+            "what the data means rather than what to do."
+        )
+        return {
+            "role": "assistant",
+            "content": refusal,
+            "content_html": _md_to_html(refusal),
+            "prompt_tokens": result.prompt_tokens,
+            "completion_tokens": result.completion_tokens,
+        }
+
    session.add(AICall(
        model=result.model,
        prompt_tokens=result.prompt_tokens,
        completion_tokens=result.completion_tokens,
-        cost_usd=result.cost_usd,
+        cost_usd=full_cost,
        status="ok",
    ))
    await session.commit()
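
Reviewer note: the control flow in the chat hunk above (generate,
review, sum both costs, then either pass the text through or substitute
a refusal) can be sketched in isolation as follows. This is a minimal
illustration under stated assumptions, not the project's code:
`LLMResult`, `Verdict`, `gated_reply`, the fake generator/reviewer, and
the list-based ledger are all hypothetical stand-ins for call_llm,
review_read, and the AICall table.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMResult:
    """Hypothetical shape of the generator's return value."""
    content: str
    cost_usd: Optional[float]


@dataclass
class Verdict:
    """Hypothetical shape of the reviewer's return value."""
    clean: bool
    reason: str
    cost_usd: Optional[float]


REFUSAL = "I can't return that reply. Try rephrasing the question."


async def gated_reply(generate, review, ledger):
    """Generate, review, then pass the text through or refuse."""
    result = await generate()
    verdict = await review(result.content)
    # Both calls are billed whatever the verdict, so the ledger row
    # carries the combined cost on either branch.
    full_cost = (result.cost_usd or 0.0) + (verdict.cost_usd or 0.0)
    if not verdict.clean:
        ledger.append({"status": "leaked", "cost_usd": full_cost,
                       "error": f"reviewer: {verdict.reason}"})
        return REFUSAL
    ledger.append({"status": "ok", "cost_usd": full_cost, "error": None})
    return result.content


async def fake_generate():
    # Canned output that the toy reviewer below will reject.
    return LLMResult(content="You should buy XYZ now.", cost_usd=0.002)


async def fake_review(text):
    # Toy substring rule for the demo only; the real reviewer is an
    # LLM call.
    if "should buy" in text:
        return Verdict(clean=False, reason="financial advice",
                       cost_usd=0.0005)
    return Verdict(clean=True, reason="", cost_usd=0.0005)


ledger = []
reply = asyncio.run(gated_reply(fake_generate, fake_review, ledger))
```

One thing the sketch makes visible: the refusal text is fixed, while the
verdict reason varies, so the specific reason is only available in the
ledger row and the log line, not in what the user sees.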