read.markets

Giorgio Gilestro 45fa31bb2b ai: structured-output + reviewer agent for indicator summaries	2026-05-29 13:10:52 +02:00

Replaces the regex-based clean_summary / looks_like_leakage pipeline that produced the 2026-05-29 valuation-read leak. Two layers of defence in depth:

1. JSON-mode generation. The per-group and aggregate summary system prompts now require the model to emit a single object {"read": "..."}; response_format={"type":"json_object"} is passed through to the provider so the API enforces well-formed JSON. Prose outside the field is physically impossible. The "read" field is the only schema slot, so the model has nowhere to spill scratchpad into the envelope.

2. Reviewer agent. services/output_review.review_read() makes a second small LLM call that judges whether the candidate "read" string is publishable. It catches the residual failure mode, scratchpad INSIDE the field ("Let's see…", multi-question parentheticals, meta-commentary), and returns a JSON verdict {"clean": bool, "reason": str}. Any failure (provider error, parse error, missing field) returns clean=false (fail-safe). Cost ~$0.0001/check; latency ~1-2 s in the hourly job, no user-facing latency.

The old regex scaffolding (_LEAK_PATTERNS, clean_summary, looks_like_leakage, _TRAILING_QUOTE) is deleted entirely. It produced false positives (chopped legitimate "The indicators are…" leaders) and false negatives (never matched the chain-of-thought patterns the model actually emits). The reviewer agent is strictly better on both.

On reviewer/parse rejection: don't persist a new IndicatorSummary; the API's existing fallback to the previous good row continues to serve the panel. Failures are logged as ind_summary.json_invalid / ind_summary.reviewer_rejected so we can measure the rejection rate. Reviewer cost is added to the row's recorded cost_usd so the monthly budget cap covers the full pipeline.

Adds tests/test_output_review.py: 11 cases covering _extract_read (JSON envelope handling: invalid JSON, missing field, wrong types, empty values) and review_read (clean / unclean verdicts plus three fail-safe paths for malformed reviewer responses).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
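
The fail-safe parsing described in the commit message can be illustrated roughly as follows. This is a minimal sketch, not the repository's code: the function names `extract_read_sketch` and `parse_verdict_sketch` are hypothetical stand-ins, the LLM calls themselves are left out, and treating a missing or non-string "reason" as an empty string is an assumption.

```python
import json
from typing import Optional


def extract_read_sketch(raw: str) -> Optional[str]:
    """Return the "read" string from a {"read": "..."} envelope, or None.

    Hypothetical helper: None stands for "do not persist a new summary".
    """
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None  # invalid JSON (or a non-string input such as None)
    if not isinstance(payload, dict):
        return None  # valid JSON, but not an object
    read = payload.get("read")
    if not isinstance(read, str) or not read.strip():
        return None  # missing field, wrong type, or empty value
    return read.strip()


def parse_verdict_sketch(raw: str) -> dict:
    """Parse a reviewer verdict {"clean": bool, "reason": str}.

    Anything malformed is treated as a rejection (clean=False).
    """
    rejected = {"clean": False, "reason": "unparseable reviewer response"}
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return rejected
    if not isinstance(payload, dict):
        return rejected
    clean = payload.get("clean")
    if not isinstance(clean, bool):
        return rejected  # missing field or a non-boolean such as "yes"
    reason = payload.get("reason")
    # Assumption: a missing or non-string reason becomes an empty string.
    return {"clean": clean, "reason": reason if isinstance(reason, str) else ""}
```

Under these assumptions a caller would persist a new summary only when `extract_read_sketch` returns a string and the reviewer verdict has `clean` set to true; every other path falls through to the previously stored row.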
..
fixtures	tests: add fabricated IBKR fixture for LLM parser	2026-05-27 12:06:47 +02:00
conftest.py	tests: extract _build_session_factory to a shared conftest fixture	2026-05-27 20:50:09 +02:00
test_access.py	phase D milestones 1+2: referral system + paid-access gate	2026-05-21 23:25:35 +01:00
test_api_helpers.py	initial commit — cassandra v0.1	2026-05-15 21:56:10 +01:00
test_auth_session.py	tests: backfill coverage for openrouter transport, auth sessions, cadence	2026-05-28 13:58:28 +02:00
test_branding_consistency.py	css: split cassandra.css into per-section files	2026-05-28 12:31:29 +02:00
test_cadence_policy.py	tests: backfill coverage for openrouter transport, auth sessions, cadence	2026-05-28 13:58:28 +02:00
test_chat_and_log_gates.py	routers: extract chat + ops from api.py	2026-05-27 21:43:17 +02:00
test_cli.py	phase D milestones 1+2: referral system + paid-access gate	2026-05-21 23:25:35 +01:00
test_config_loading.py	test: drop stale "pie" assertion from test_default_groups_present	2026-05-26 00:20:01 +02:00
test_csv_import.py	sync: encrypted cloud backup for portfolios + settings UX rework	2026-05-23 16:15:54 +02:00
test_digest_prompts.py	openrouter: split into llm_prompts (prompt engineering) + transport	2026-05-27 21:27:23 +02:00
test_email_digest_job.py	test+fix: make the suite run cleanly in the test container	2026-05-26 00:11:18 +02:00
test_email_render.py	email: split digest renderer to digest_email.py	2026-05-27 21:33:06 +02:00
test_email_service.py	brand: rename product to "Read the Markets" (read.markets)	2026-05-22 19:39:38 +01:00
test_email_unsubscribe.py	email: tighten unsubscribe — test isolation, accurate comments, tighter assertion	2026-05-25 23:10:29 +02:00
test_glossary.py	phase G: data minimisation + passwordless auth + DeepSeek-first LLM	2026-05-18 14:16:57 +01:00
test_i18n.py	cleanup: drop redundant @pytest.mark.asyncio + fix log_id type	2026-05-27 19:32:38 +02:00
test_instrument_map.py	phase B (1/4): CSV parser + InstrumentMap (T212 shortcode → Yahoo ticker)	2026-05-16 10:53:08 +01:00
test_llm_csv_parser.py	models: align translation column naming + add token counts	2026-05-27 21:18:29 +02:00
test_localization_integration.py	models: align translation column naming + add token counts	2026-05-27 21:18:29 +02:00
test_market_parsing.py	initial commit — cassandra v0.1	2026-05-15 21:56:10 +01:00
test_news_parsing.py	initial commit — cassandra v0.1	2026-05-15 21:56:10 +01:00
test_news_tagging.py	news: auto-tag headlines + market-aware cadence + filter UI	2026-05-21 23:25:03 +01:00
test_news_window.py	test+fix: make the suite run cleanly in the test container	2026-05-26 00:11:18 +02:00
test_openrouter_prompt.py	openrouter: split into llm_prompts (prompt engineering) + transport	2026-05-27 21:27:23 +02:00
test_openrouter_transport.py	llm: support JSON-mode + stop publishing the reasoning field	2026-05-29 13:02:36 +02:00
test_otp_service.py	phase G: data minimisation + passwordless auth + DeepSeek-first LLM	2026-05-18 14:16:57 +01:00
test_output_review.py	ai: structured-output + reviewer agent for indicator summaries	2026-05-29 13:10:52 +02:00
test_pending_cookie.py	phase D milestones 1+2: referral system + paid-access gate	2026-05-21 23:25:35 +01:00
test_polar_webhook.py	polar: build /api/polar/webhook handler	2026-05-26 17:42:41 +02:00
test_portfolio_analysis.py	phase G: data minimisation + passwordless auth + DeepSeek-first LLM	2026-05-18 14:16:57 +01:00
test_portfolio_sync_api.py	sync: detect orphaned blobs (pepper rotation) + fix AESGCM arg order	2026-05-25 12:49:11 +02:00
test_portfolio_sync_service.py	sync: encrypted cloud backup for portfolios + settings UX rework	2026-05-23 16:15:54 +02:00
test_referral.py	phase D milestones 1+2: referral system + paid-access gate	2026-05-21 23:25:35 +01:00
test_referral_conversion.py	tests: extract _build_session_factory to a shared conftest fixture	2026-05-27 20:50:09 +02:00
test_settings_digest_api.py	settings: digest opt-in + tone (PATCH /api/settings/digest + UI)	2026-05-25 23:23:03 +02:00
test_stripe_billing.py	stripe: detect buyer currency at checkout (GBP/USD/EUR)	2026-05-28 12:42:40 +02:00
test_ticker_validate.py	tests: extract _build_session_factory to a shared conftest fixture	2026-05-27 20:50:09 +02:00
test_universe_unlinkability.py	phase G: data minimisation + passwordless auth + DeepSeek-first LLM	2026-05-18 14:16:57 +01:00
test_verify_subscribe.py	ui: collapsible settings sections + welcome-email + larger auth inputs	2026-05-26 22:32:59 +02:00