2 commits

Author SHA1 Message Date
19d4854f50 llm: support JSON-mode + stop publishing the reasoning field
Two changes to the LLM call path that together close the
chain-of-thought leakage surface:

1. _call_provider accepts an optional `response_format` (forwarded to
   the OpenAI-shaped API — DeepSeek and OpenRouter both honour
   {"type": "json_object"}). Threaded through call_llm so callers can
   force structured output without monkey-patching the body. The
   indicator-summary job will use this next: it will require the model
   to emit {"read": "..."} and publish only the parsed field, so prose
   outside the JSON object cannot reach publication.

2. Empty `content` no longer falls back to the `reasoning` field.
   `reasoning` is the model's internal scratchpad — "Let's see...",
   half-formed math, planning notes. We had a fallback that surfaced
   it when content was null, but the field is intended for debugging
   the model, not for publication. After the scratchpad of the
   2026-05-29 valuation read leaked into production, the fallback is
   gone: empty content now raises, so the caller retries or skips and
   the previous good row remains visible. Test updated to assert this
   safer behaviour.
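
A minimal sketch of the two changes above, not the actual code in this
commit. Only `response_format`, `content`, `reasoning` and the
{"read": "..."} shape come from the message; `build_body`,
`extract_content`, `parse_read` and `EmptyContentError` are hypothetical
names, and the response layout assumes the usual OpenAI-shaped
`choices[0].message` structure.

```python
import json
from typing import Any, Optional


class EmptyContentError(RuntimeError):
    """Hypothetical error: the provider returned no publishable content."""


def build_body(model: str, messages: list,
               response_format: Optional[dict] = None) -> dict:
    """Build an OpenAI-shaped request body (hypothetical helper)."""
    body: dict = {"model": model, "messages": messages}
    # Only forward response_format when the caller asked for it.
    if response_format is not None:
        body["response_format"] = response_format
    return body


def extract_content(response: dict) -> str:
    """Return the message content, never falling back to `reasoning`."""
    message = response["choices"][0]["message"]
    content = message.get("content")
    if not content or not content.strip():
        # Assumed behaviour: raise so the caller can retry or skip,
        # leaving the previously published row untouched.
        raise EmptyContentError("provider returned empty content")
    return content


def parse_read(content: str) -> str:
    """Parse {"read": "..."} and return only that field."""
    payload: Any = json.loads(content)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    read = payload.get("read")
    if not isinstance(read, str) or not read.strip():
        raise ValueError("missing or empty 'read' field")
    return read.strip()
```

Under these assumptions a caller would pass
`response_format={"type": "json_object"}`, then publish only
`parse_read(extract_content(response))`. Any text outside the JSON
object fails in `json.loads` instead of being published.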

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:02:36 +02:00
f9f4f25ef7 tests: backfill coverage for openrouter transport, auth sessions, cadence
Three new test files covering modules the audit flagged as having zero
direct coverage:

- test_openrouter_transport.py (18 tests): provider chain selection,
  endpoint resolution, _call_provider parse path (including the
  reasoning-field fallback and token-based cost estimation), and
  call_llm's cross-provider failover. Uses httpx.MockTransport so the
  tests make no network calls. Patches _call_provider in the failover
  tests to bypass tenacity's retry delays.

- test_auth_session.py (7 tests): sign/verify round-trip, tampered
  cookie rejection, expired cookie rejection (via TTL monkeypatch),
  garbage input handling, salt isolation between session and pending
  serializers, and rejection of cookies signed with a different secret.

- test_cadence_policy.py (16 tests): is_active_window weekday/weekend
  + half-open interval boundaries, min_gap_hours across bands,
  should_run gating for first-run / active / off-hours / weekend
  / naive-datetime cases, and the NEWS_POLICY 20-minute / 3-hour
  variations.
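
To illustrate the kind of behaviour the cadence tests pin down, here is
a rough sketch under stated assumptions. The names `is_active_window`,
`min_gap_hours` and `should_run` appear above, but their signatures, the
08:00-18:00 window, the 1-hour and 3-hour gaps, and the rejection of
naive datetimes are all assumptions made for this example rather than
the real policy.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Optional

# Hypothetical active window: weekdays, 08:00 inclusive to 18:00 exclusive.
ACTIVE_START = time(8, 0)
ACTIVE_END = time(18, 0)


def is_active_window(now: datetime) -> bool:
    """True on weekdays inside the half-open interval [start, end)."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return ACTIVE_START <= now.time() < ACTIVE_END


def min_gap_hours(now: datetime) -> float:
    """Assumed bands: 1 hour in the active window, 3 hours otherwise."""
    return 1.0 if is_active_window(now) else 3.0


def should_run(last_run: Optional[datetime], now: datetime) -> bool:
    """Gate a job on the time elapsed since its last run."""
    if now.tzinfo is None:
        # Assumption: naive datetimes are rejected rather than guessed at.
        raise ValueError("now must be timezone-aware")
    if last_run is None:
        return True  # first run always goes ahead
    return now - last_run >= timedelta(hours=min_gap_hours(now))


# Boundary checks of the sort a test file like this might make.
thursday_open = datetime(2026, 5, 28, 8, 0, tzinfo=timezone.utc)
assert is_active_window(thursday_open)                       # start is inclusive
assert not is_active_window(thursday_open.replace(hour=18))  # end is exclusive
```

The half-open interval means adjacent windows never overlap at their
shared boundary, which is why both edges are worth asserting separately.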

Suite goes from 291 to 336 passing.
2026-05-28 13:58:28 +02:00