Replaces the regex-based clean_summary / looks_like_leakage pipeline
that produced the 2026-05-29 valuation-read leak. Two layers of defence
in depth:
1. JSON-mode generation. The per-group and aggregate summary system
prompts now require the model to emit a single object
{"read": "..."}; response_format={"type":"json_object"} is passed
through to the provider so the API enforces well-formed JSON. A
well-formed response therefore cannot carry prose outside the object,
and "read" is the only schema slot, so the model has nowhere to spill
scratchpad into the envelope.
2. Reviewer agent. services/output_review.review_read() makes a second
small LLM call that judges whether the candidate "read" string is
publishable. It catches the residual failure mode — scratchpad
INSIDE the field ("Let's see…", multi-question parentheticals,
meta-commentary) — and returns a JSON verdict {"clean": bool,
"reason": str}. Any failure (provider error, parse error, missing
field) returns clean=false (fail-safe). Cost ~$0.0001/check; latency
~1-2 s in the hourly job, no user-facing latency.
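The fail-safe behaviour of the reviewer verdict can be illustrated with a
minimal sketch. It is not the code in services/output_review; parse_verdict
is a hypothetical helper, and it assumes that only the "clean" field is
mandatory while a missing "reason" is tolerated.

```python
import json
from typing import Tuple


def parse_verdict(raw: str) -> Tuple[bool, str]:
    """Parse a reviewer response; anything unexpected counts as not clean."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Covers non-string input and malformed JSON alike.
        return False, "reviewer response was not valid JSON"
    if not isinstance(data, dict):
        return False, "reviewer response was not a JSON object"
    clean = data.get("clean")
    if not isinstance(clean, bool):
        # A missing field or a truthy string such as "yes" is rejected.
        return False, "missing or non-boolean 'clean' field"
    reason = data.get("reason")
    # Assumption: a missing reason is tolerated and reported as empty.
    return clean, reason if isinstance(reason, str) else ""
```

Requiring a real boolean, rather than any truthy value, is what keeps a
malformed verdict from being read as approval.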
The old regex scaffolding (_LEAK_PATTERNS, clean_summary,
looks_like_leakage, _TRAILING_QUOTE) is deleted entirely. It produced
false positives (chopped legitimate "The indicators are…" leaders) and
false negatives (never matched the chain-of-thought patterns the model
actually emits). The reviewer agent is strictly better on both.
On reviewer/parse rejection: don't persist a new IndicatorSummary; the
API's existing fallback to the previous good row continues to serve
the panel. Failures are logged as ind_summary.json_invalid /
ind_summary.reviewer_rejected so we can measure the rejection rate.
Reviewer cost is added to the row's recorded cost_usd so the monthly
budget cap covers the full pipeline.
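A rough sketch of the rejection flow described above, under stated
assumptions: decide, its parameters, and the dict it returns are
hypothetical stand-ins for the real job code, and only the two event names
come from this commit. Returning None stands for "persist nothing and let
the API keep serving the previous good row".

```python
from typing import Callable, List, Optional, Tuple

# Event names are taken from the commit message; the rest is hypothetical.
JSON_INVALID = "ind_summary.json_invalid"
REVIEWER_REJECTED = "ind_summary.reviewer_rejected"


def decide(
    read: Optional[str],
    generation_cost_usd: float,
    review: Callable[[str], Tuple[bool, float]],
    events: List[str],
) -> Optional[dict]:
    """Return the row to persist, or None to keep the previous good row."""
    if read is None:
        # The JSON envelope could not be parsed into a usable "read".
        events.append(JSON_INVALID)
        return None
    clean, review_cost_usd = review(read)
    if not clean:
        events.append(REVIEWER_REJECTED)
        return None
    # The reviewer cost is folded into the recorded cost of the row.
    return {"read": read, "cost_usd": generation_cost_usd + review_cost_usd}
```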
Adds tests/test_output_review.py: 11 cases covering _extract_read
(JSON envelope handling — invalid JSON, missing field, wrong types,
empty values) and review_read (clean / unclean verdicts plus three
fail-safe paths for malformed reviewer responses).
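The envelope cases listed above can be illustrated with a short sketch.
extract_read here is a hypothetical stand-in for _extract_read, whose real
signature and return type are not shown in this commit; it assumes that
every bad input maps to None.

```python
import json
from typing import Optional


def extract_read(raw: str) -> Optional[str]:
    """Return the stripped "read" string, or None for any bad envelope."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return None  # invalid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object
    read = data.get("read")
    if not isinstance(read, str):
        return None  # missing field or wrong type
    read = read.strip()
    return read or None  # empty or whitespace-only value
```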
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Read the Markets
Containerised macro-strategy dashboard — hourly market data, RSS news, Trading 212 portfolio, and an AI-generated strategic log written by Cassandra, the in-product seer. Read-only by design.
Production:
- Landing: https://read.markets
- App: https://app.read.markets
The Python package is still named cassandra and several internal identifiers (cookie names, advisory-lock keys, CASSANDRA_TOKEN env var, CSS filename) keep the legacy name on purpose — renaming them would invalidate live sessions / locks / configs for no user benefit. See app/branding.py for the brand single-source-of-truth.
Quick start (local dev)
cp .env.example .env # fill in API keys; set CASSANDRA_TOKEN if exposing
docker compose up --build # db + app + scheduler + daily backup sidecar
open http://localhost:8000/ # or whichever CASSANDRA_PORT you set
docker-compose.override.yml is auto-loaded and adds the host port
binding so the app is reachable on localhost.
Production (VPS, NPM-fronted)
Always invoke with explicit -f flags — that way the dev override is
skipped and the prod overlay (no host port, joins the external
intranet Docker network, uvicorn on port 80) is applied:
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
Point Nginx Proxy Manager at upstream readmarkets-app-1:80.
Architecture
- app (FastAPI + Jinja2 + HTMX): web dashboard on port 8000
- scheduler (APScheduler): hourly ingestion jobs (market, news, portfolio, AI log)
- db (MariaDB 11): quotes, headlines, portfolio snapshots, strategic logs, job runs
- backup (sidecar): daily mariadb-dump to ./backup/
See /home/gg/.claude/plans/ok-i-think-this-tidy-lake.md for the design plan.
Config
| File | Purpose |
|---|---|
| config/default.toml | Universal data tables: indicator groups, RSS feeds, keyword presets |
| config/portfolio.toml | User-specific portfolios (overrides default.toml) |
| .env | Secrets and runtime knobs, mounted read-only into containers |
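How portfolio.toml overrides default.toml is not spelled out here. One
plausible reading is a recursive merge in which the override wins key by
key, sketched below on plain dicts such as those returned by a TOML parser.
merge_config is a hypothetical helper, and the real loader may merge
differently (for example by replacing whole tables).

```python
from typing import Any, Dict


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where override wins, merging nested tables."""
    merged = dict(base)  # shallow copy so the caller's base stays untouched
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge them instead of replacing.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```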
Endpoints
- GET /: dashboard
- GET /portfolio/{name}: portfolio detail
- GET /news: news feed
- GET /log: strategic-log archive
- GET /api/health: job status (last success / failure per job)
All authenticated routes require Authorization: Bearer $CASSANDRA_TOKEN when that environment variable is set; if it is unset, the app is open (LAN-only mode).
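The token rule above can be sketched as a small, framework-free check. This
is an illustration only: is_authorized is a hypothetical helper and not the
app's actual FastAPI dependency, and the constant-time comparison is an
assumption about good practice rather than something this README states.

```python
import hmac
from typing import Optional


def is_authorized(auth_header: Optional[str], token: Optional[str]) -> bool:
    """Apply the bearer-token rule; an unset token means open mode."""
    if not token:
        return True  # CASSANDRA_TOKEN unset: LAN-only open mode
    if not auth_header:
        return False
    scheme, _, presented = auth_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.strip().encode(), token.encode())
```

In the app the second argument would come from the CASSANDRA_TOKEN
environment variable, for example via os.environ.get("CASSANDRA_TOKEN").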