From 1ecc5271187a08393bd7e08773506078e2197a8f Mon Sep 17 00:00:00 2001
From: Giorgio Gilestro
Date: Wed, 27 May 2026 15:39:03 +0200
Subject: [PATCH 01/69] cleanup: drop dead upload.html + soften broker-only marketing copy
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Delete app/templates/upload.html. The /upload route redirected to
  /settings#import (302) and never rendered this template; the file
  was carrying stale Trading-212-only copy.
- Landing + pricing pages: replace "Trading 212 today, more brokers
  planned" with "Trading 212 natively, other formats auto-detected"
  to reflect the LLM-fallback parser that's been live for a few days.

The /upload redirect route in app/routers/pages.py stays — it remains
a useful bookmark-forwarder for users with old links.

Co-Authored-By: Claude Opus 4.7
---
 app/templates/landing.html |   8 +--
 app/templates/pricing.html |   2 +-
 app/templates/upload.html  | 104 -------------------------------------
 3 files changed, 5 insertions(+), 109 deletions(-)
 delete mode 100644 app/templates/upload.html

diff --git a/app/templates/landing.html b/app/templates/landing.html
index 2732a68..8d3d170 100644
--- a/app/templates/landing.html
+++ b/app/templates/landing.html
@@ -116,10 +116,10 @@

 Paid users can also drop a portfolio CSV from their broker
-    (Trading 212 today, more brokers planned) for an AI sense-check on
-    concentration, regime fit, and currency exposure. Holdings stay in
-    your browser by default; opt in to encrypted cloud sync to restore
-    on another device.
+    — Trading 212 natively, other formats auto-detected —
+    for an AI sense-check on concentration, regime fit, and currency
+    exposure. Holdings stay in your browser by default; opt in to
+    encrypted cloud sync to restore on another device.

diff --git a/app/templates/pricing.html b/app/templates/pricing.html
index 93f1562..c32fb26 100644
--- a/app/templates/pricing.html
+++ b/app/templates/pricing.html
@@ -62,7 +62,7 @@
  • Strategic log refreshed every hour instead of every six — track intraday moves as they unfold
  • Follow-up chat on any past log — ask the model a question against the day’s full context
  • Daily email digest (Mon–Sat) — ~600-word read of the session ahead, on top of the Sunday recap
-  • Portfolio import from a broker CSV (Trading 212 supported today; more brokers planned)
+  • Portfolio import from any broker CSV — Trading 212 natively, other formats auto-detected
  • AI portfolio read — diversification, sector and currency concentration, macro-regime fit on your holdings
  • Optional encrypted cloud sync — PIN-derived encryption in your browser, second-layer wrap on the server, no plaintext holdings server-side
diff --git a/app/templates/upload.html b/app/templates/upload.html
deleted file mode 100644
index 0c397e8..0000000
--- a/app/templates/upload.html
+++ /dev/null
@@ -1,104 +0,0 @@
-{% extends "base.html" %}
-{% block title %}{{ BRAND_NAME }} · Import Portfolio{% endblock %}
-
-{% block main %}
-
    -
-    Import portfolio (Trading 212 CSV)
-    held locally · optional encrypted cloud sync (paid)
-
    - -
    -

-    Export your pie from the T212 web app
-    (Trading 212 → Investing → Your Pie → ⋯ → Export)
-    and drop the CSV here. Each Slice is resolved to its Yahoo ticker;
-    the parsed pie is kept in this browser's localStorage.
-    The server learns only which tickers exist (anonymously) so it can
-    fetch their prices. If you have cloud sync
-    enabled, an encrypted copy is also pushed to the
-    server — only your PIN can decrypt it.
-

    - -
    -
    - -
-    Drop a T212 pie CSV here
-    or browse · max 1 MB
    -
    -
    - - -
    - - -
    -
-
-
-
-
-{% endblock %}
From 76f81648e5ee1bb1c341b8a747443801dab1c1e6 Mon Sep 17 00:00:00 2001
From: Giorgio Gilestro
Date: Wed, 27 May 2026 15:50:10 +0200
Subject: [PATCH 02/69] docs: spec for Italian localization (ES/FR/DE as WIP)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hybrid model: per-user surfaces (analyse, digest, chat) generated
directly in the target language via a "Respond in Italian" clause
appended to the system prompt. Shared content (strategic log) generated
in English as today, then post-translated and cached per language in a
new strategic_log_translations table. Translation calls fan out in
parallel with asyncio.gather so total job latency stays bounded by
max(single call).

No separate translation-model setting — DeepSeek-4-flash at $0.28/M
output is cheap enough that the routine cost is noise (~$0.005/day with
Italian only at 24 logs/day).

Users.lang VARCHAR(8) DEFAULT 'en'. Settings dropdown lists all four
options but ES/FR/DE are disabled UI-side and rejected server-side
against an ACTIVE_LANGUAGES allowlist — flipping them on later is a
one-line constant change.

Co-Authored-By: Claude Opus 4.7
---
 .../2026-05-27-localization-italian-design.md | 375 ++++++++++++++++++
 1 file changed, 375 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-05-27-localization-italian-design.md

diff --git a/docs/superpowers/specs/2026-05-27-localization-italian-design.md b/docs/superpowers/specs/2026-05-27-localization-italian-design.md
new file mode 100644
index 0000000..1e98fa6
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-27-localization-italian-design.md
@@ -0,0 +1,375 @@
+# Localization (Italian active, ES/FR/DE WIP) — Design Spec
+
+**Date:** 2026-05-27
+**Status:** Draft — pending implementation plan
+
+## Context
+
+All AI-generated content (strategic log, daily email digest, portfolio
+analysis, follow-up chat) is English-only today.
+The operator wants to
+add Italian translation as the first localization, with Spanish,
+French, and German listed as "coming soon" in the settings UI but not
+yet functional. Italian must work end-to-end from settings dropdown to
+rendered output; the other three exist as commitments and design
+placeholders so adding them later is a flag flip.
+
+This is foundational plumbing: it touches every LLM call site we ship
+today and shapes how every future AI feature handles language. Doing it
+first means later features (qty/cost edit narratives, P/L summaries,
+alert text, etc.) inherit the i18n wiring for free instead of needing a
+retrofit.
+
+## Goals
+
+- A user can pick `Italiano` from a settings dropdown and immediately
+  see every AI-generated surface in Italian.
+- Adding `es`, `fr`, or `de` later is a one-line change to a constant
+  plus optionally validating the dropdown's enabled set.
+- Translation cost stays in the "noise" range — we use the same
+  DeepSeek-4-flash model the rest of the system uses (~$0.28/M output
+  tokens). No separate "cheap translation" plumbing.
+- Strategic-log reads stay instant for non-English users — no
+  read-time translation latency.
+
+## Non-goals
+
+- UI label translation. The dashboard buttons, settings labels,
+  headings, and other chrome remain English. Only the AI's own output
+  is localized.
+- Translation of indicator summaries. The same pattern will apply when
+  those become user-facing prose, but they aren't surfaced today.
+- Backfilling translations for historical strategic logs. Translation
+  only happens going forward, at the moment a new English log is written.
+- Activation of Spanish/French/German. They appear in the dropdown as
+  "coming soon" with disabled options; the value-validation layer in
+  the settings POST refuses them.
+
+## Two distinct translation paths
+
+The system has two categories of AI-generated content, with different
+generation patterns:
+
+### Per-user content (analyse, digest, chat)
+
+Each call already produces output for exactly one user. The fix is
+trivial: the user's `lang` threads into the prompt assembly, and the
+system prompt gains a `"Respond in Italian."` clause when `lang != 'en'`.
+One LLM call, no extra cost, no extra latency.
+
+### Shared content (strategic log)
+
+The hourly `ai_log_job` writes a single English log row used by every
+user. To serve non-English users, we generate the English log as today,
+then translate it to each active non-English language via a separate
+LLM call and store the result in a new `strategic_log_translations`
+table. Translations are fanned out in parallel with `asyncio.gather` so
+total translation time is max(single call), not sum. The `/log`
+endpoint serves the translation matching the requester's `lang`,
+falling back to English if none exists.
+
+Why translate-after rather than generate-N-times: the strategic log
+includes live market data, headlines, and references that are
+expensive to assemble. Re-running the full generation in each language
+duplicates that work; translating the rendered output preserves a
+single source of truth (the English original) and only spends LLM
+tokens on the actual prose conversion.
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│  User has user.lang preference                                  │
+│  Values: 'en' (default) | 'it' (active) | 'es'/'fr'/'de' (WIP)  │
+└─────────────────────────────────────────────────────────────────┘
+         │
+         ├─ Per-user surfaces (portfolio analyse, daily digest, chat)
+         │     └─ prompt assembly threads user.lang to
+         │        respond_in_clause() → appended to system prompt
+         │        when lang != 'en'. Single call_llm, no extra cost.
+         │
+         └─ Shared surfaces (strategic log)
+               ├─ ai_log_job writes the English row as today
+               ├─ Then SELECTs distinct users.lang where lang != 'en'
+               │  AND user has active paid access
+               ├─ asyncio.gather of one translate() call per language
+               └─ Each result → INSERT into strategic_log_translations
+                  keyed by (log_id, lang) UNIQUE
+```
+
+## Data model
+
+### `users.lang` (new column)
+
+```sql
+ALTER TABLE users
+  ADD COLUMN lang VARCHAR(8) NOT NULL DEFAULT 'en';
+```
+
+Existing rows pick up the `en` default. Application-level validation
+restricts writes to the `ACTIVE_LANGUAGES` set; the database column
+accepts anything in `VARCHAR(8)` (no CHECK constraint — we want to
+add new languages without a migration).
+
+### `strategic_log_translations` (new table)
+
+```sql
+CREATE TABLE strategic_log_translations (
+  id            BIGINT PRIMARY KEY AUTO_INCREMENT,
+  log_id        BIGINT NOT NULL,
+  lang          VARCHAR(8) NOT NULL,
+  content_md    TEXT NOT NULL,
+  generated_at  DATETIME(6) NOT NULL,
+  llm_model     VARCHAR(64),
+  llm_cost_usd  FLOAT,
+  CONSTRAINT fk_slt_log
+    FOREIGN KEY (log_id) REFERENCES strategic_logs(id) ON DELETE CASCADE,
+  CONSTRAINT uq_slt_log_lang UNIQUE (log_id, lang)
+);
+```
+
+ON DELETE CASCADE means evicting an old strategic log row also drops
+its translations. The UNIQUE constraint prevents duplicate translations
+for the same log/lang combo.
+
+## Components
+
+### `app/services/i18n.py` (new)
+
+```python
+LANGUAGES = {
+    "en": "English",
+    "it": "Italian",
+    "es": "Spanish",
+    "fr": "French",
+    "de": "German",
+}
+
+# Set of language codes that users can actually pick from the settings
+# dropdown. ES/FR/DE remain in LANGUAGES so their labels render, but
+# the settings POST validator and the strategic-log translation fan-out
+# both consult this set.
+ACTIVE_LANGUAGES = {"en", "it"}
+
+
+def respond_in_clause(lang: str) -> str:
+    """Suffix appended to per-user LLM system prompts.
+
+    Returns an empty string for 'en' (the default everywhere already).
+    Otherwise returns "\n\nRespond in <name>." so the model knows
+    to write its output in the user's language.
+    """
+    if not lang or lang == "en":
+        return ""
+    name = LANGUAGES.get(lang, "English")
+    return f"\n\nRespond in {name}."
+```
+
+### `app/services/translation.py` (new)
+
+```python
+async def translate(
+    client: httpx.AsyncClient,
+    text: str,
+    target_lang: str,
+) -> tuple[str, LogResult]:
+    """Translate ``text`` (markdown) to ``target_lang``.
+
+    Uses the default ``call_llm`` provider chain — DeepSeek-4-flash via
+    the OG API is already cheap enough ($0.28/M output) that a separate
+    'translation model' setting would be over-engineering.
+
+    Returns ``(translated_markdown, LogResult)`` so the caller can
+    persist provenance (model + cost) alongside the translation.
+    Raises on provider failure; caller decides whether to surface or
+    swallow.
+    """
+```
+
+System prompt: *"Translate the following markdown to {language}. Preserve all formatting (headings, lists, links, emphasis). Do NOT translate ticker symbols, company names, numbers, percentages, or dates. Output ONLY the translated markdown — no preamble, no commentary."*
+
+### `app/models.py` (modified)
+
+- `User`: add `lang: Mapped[str] = mapped_column(String(8), nullable=False, default="en", server_default="en")`
+- New class `StrategicLogTranslation` matching the table above
+
+### `app/jobs/ai_log_job.py` (modified)
+
+After the existing English log row is persisted, add a translation
+fan-out:
+
+```python
+# Select distinct active non-English languages.
+async with session_factory() as session:
+    rows = (await session.execute(
+        select(User.lang).distinct()
+        .where(User.lang.in_(ACTIVE_LANGUAGES - {"en"}))
+    )).scalars().all()
+active_langs = list(rows)
+
+if active_langs:
+    async with httpx.AsyncClient(...) as client:
+        results = await asyncio.gather(*[
+            translate(client, log_row.content_md, lang)
+            for lang in active_langs
+        ], return_exceptions=True)
+    # The session used for the SELECT above is already closed, so the
+    # writes open a fresh one.
+    async with session_factory() as session:
+        for lang, result in zip(active_langs, results):
+            if isinstance(result, Exception):
+                log.warning("log.translate.failed", lang=lang, error=str(result)[:200])
+                continue
+            translated_md, llm_log = result
+            session.add(StrategicLogTranslation(
+                log_id=log_row.id, lang=lang,
+                content_md=translated_md,
+                generated_at=utcnow(),
+                llm_model=llm_log.model,
+                llm_cost_usd=llm_log.cost_usd,
+            ))
+        await session.commit()
+```
+
+Errors in individual language translations are logged but do not fail
+the job. Missing translations get rendered as the English fallback at
+read time.
+
+### `app/jobs/email_digest_job.py` (modified)
+
+The digest is already per-user and assembles its own prompt. Thread
+`user.lang` through:
+
+- `_generate_variants(...)` accepts a `target_lang` param
+- The system prompt assembly appends `respond_in_clause(target_lang)`
+- Subject-line generation runs in the same call, so it's localized too
+
+### `app/services/portfolio_analysis.py` (modified)
+
+- `AnalysisRequest` gains a `lang: str = "en"` field, populated by the
+  route from `principal.user.lang`
+- `analyse(...)` appends `respond_in_clause(req.lang)` to its system prompt
+
+### `app/routers/universe.py` (modified — the `/api/analyze` route)
+
+Read the current user's `lang` and put it in the payload before calling
+`analyse(...)`. (The current route gets the principal via Depends.)
+
+### `app/routers/pages.py` / the `/log` resolution (modified)
+
+When rendering `/log` (and the `/log/{day}` historical variant), look
+up the user's `lang`. If `lang != 'en'`, attempt to fetch the matching
+`StrategicLogTranslation`; if present, render that. If absent, fall
+back to the English `StrategicLog.content_md`. No silent error — the
+fallback is the intended graceful path.
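
The `/log` fallback rule described above can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the route's actual code: `pick_log_content` and the in-memory `translations` dict are hypothetical names standing in for the `StrategicLogTranslation` lookup, which the real route would presumably do with a database query.

```python
from __future__ import annotations


def pick_log_content(
    english_md: str,
    translations: dict[str, str],
    lang: str | None,
) -> str:
    """Return the markdown to render for a reader whose language is ``lang``.

    ``translations`` is a hypothetical mapping of language code to the
    cached translation of one strategic log.
    """
    # English readers (and missing or empty codes) always get the original.
    if not lang or lang == "en":
        return english_md
    # Prefer the cached translation; otherwise fall back to English.
    translated = translations.get(lang)
    return translated if translated else english_md


english = "# Open\n\nThe market is down 0.4%."
cached = {"it": "# Apertura\n\nIl mercato è in calo dello 0,4%."}

print(pick_log_content(english, cached, "it"))  # cached Italian row
print(pick_log_content(english, cached, "de"))  # no German row: English
```

Keeping the choice in one pure function would make the fallback easy to unit-test without a database; an empty cached translation is treated the same as a missing one.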
+
+### Settings UI (`app/templates/settings.html` modified)
+
+New section under existing user preferences (alongside the digest-tone
+toggle):
+
+```html
+
    + Language +

+    The language the AI uses for the strategic log, your daily digest,
+    and portfolio commentary. UI labels stay in English for now.
+

    +
    + + +
    +
+```
+
+### Settings POST endpoint (new)
+
+```python
+@router.post("/settings/language")
+async def set_language(
+    lang: str = Form(...),
+    cu: CurrentUser = Depends(require_auth),
+    session: AsyncSession = Depends(get_session),
+):
+    if lang not in ACTIVE_LANGUAGES:
+        raise HTTPException(status_code=400, detail="unsupported language")
+    if cu.user is None:
+        raise HTTPException(status_code=403, detail="user required")
+    cu.user.lang = lang
+    await session.commit()
+    return RedirectResponse(url="/settings#language", status_code=303)
+```
+
+Server-side validation against `ACTIVE_LANGUAGES` is the gate that
+keeps ES/FR/DE non-functional even if someone POSTs them by hand.
+
+## Error handling
+
+| Case | Behaviour |
+|---|---|
+| Translation provider down at ai_log_job time | English row still written. Translation row missing for that hour and language. Next hour retries. No retroactive backfill in v1. |
+| Translation returns malformed markdown | Stored anyway (we trust DeepSeek output enough that this is rare). Operator can delete a bad row by hand. |
+| User has `lang=it` but no IT translation for the latest log | Fall back to English silently. Better than an empty pane. |
+| User saves an unsupported lang (`es`/`fr`/`de`/`xx`) via raw POST | 400 — validated against `ACTIVE_LANGUAGES`. |
+| Migrating an existing user with no `lang` column | The `DEFAULT 'en'` clause on the migration handles it; no application code change needed. |
+| User picks Italian, then a log change reaches them mid-hour | The next ai_log_job tick generates and translates a fresh log; users see the IT version on the next refresh. |
+
+## Tests
+
+Backend (`tests/test_i18n.py`, `tests/test_translation.py`,
+`tests/test_localization_integration.py`):
+
+- `respond_in_clause('en')` returns empty string
+- `respond_in_clause('it')` includes the word "Italian"
+- `respond_in_clause('xx')` falls back to "English" (defensive)
+- `translate()` mocked happy path returns the translated text + LogResult
+- `translate()` provider failure raises
+- ai_log_job: with no non-en users, no translation calls happen (mock asserts call_count=0)
+- ai_log_job: with one user at `lang='it'`, one translation row written with the right `lang` and `log_id`
+- ai_log_job: translation failure on one lang doesn't fail the job; the other lang's row still writes
+- `/log` serves IT row when `user.lang='it'` and an IT translation exists
+- `/log` falls back to English when `user.lang='it'` but no IT translation exists
+- `/settings/language` POST: accepts `en`/`it`, rejects `es`/`fr`/`de`/`xx` with 400
+- `analyse()` system prompt contains `"Respond in Italian."` when `lang='it'` (assert on the messages list passed to call_llm)
+- digest job system prompt likewise contains the clause when the user is Italian
+
+## Verification
+
+End-to-end manual check after deploy:
+
+1. **Switch a paid test user to Italian via the settings dropdown.** Confirm `users.lang='it'` in the DB.
+2. **Wait for the next hourly log generation** (or trigger manually via cron/admin). Confirm a new `strategic_log_translations` row exists with `lang='it'` and `content_md` clearly Italian.
+3. **Open the dashboard as that user.** Strategic log renders in Italian.
+4. **Trigger the daily digest send for that user** (CLI: `python -m app.cli send-test-digest user@x daily`). Confirm the received email is in Italian.
+5. **Click "Analyse my portfolio"** on the dashboard. Confirm the AI commentary is in Italian.
+6. **Switch the same user back to English.** Confirm the next dashboard refresh shows the English log. The IT translation row stays in the DB (other IT users still benefit).
+7. **Inspect the dropdown.** Verify ES/FR/DE appear with "(coming soon)" suffix and the option is disabled.
+8. **Attempt `curl -X POST /settings/language -d lang=es`** with a valid session cookie. Expect 400.
+
+## Migration / rollout
+
+- Alembic migration `0022_localization` adds `users.lang` and creates
+  `strategic_log_translations`. Existing rows pick up `en` default.
+- App restart picks up the new code paths. Pre-existing English logs
+  stay as-is. The first ai_log_job tick after deploy generates the
+  first Italian translation for whatever active IT users exist (likely
+  zero on day one until someone opts in).
+- Removing localization later (if needed) is harmless: setting any
+  user's `lang` back to `en` makes their experience identical to the
+  pre-localization state.
+
+## Out-of-scope clarifications
+
+- We do not translate UI labels. Italian users see English buttons,
+  headings, and tooltips. Future scope.
+- We do not translate user-generated content (chat questions the user
+  types). Only the AI's output is localized; user-supplied input flows
+  through unchanged.
+- We do not translate the email subject line independently. The same
+  per-user LLM call that generates the digest body also generates the
+  subject in the target language.
+- We do not surface translation cost in any user-visible UI. Cost is
+  recorded in `strategic_log_translations.llm_cost_usd` and the existing
+  `ai_calls` ledger picks up per-user calls as today.
From e6308260a59d8dc6e4e1cf87c5065169cabae11d Mon Sep 17 00:00:00 2001
From: Giorgio Gilestro
Date: Wed, 27 May 2026 16:08:28 +0200
Subject: [PATCH 03/69] =?UTF-8?q?docs:=20localization=20spec=20=E2=80=94?=
 =?UTF-8?q?=20explicit=20no-tier-gating=20decision?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Translate for any user with lang='it' regardless of paid/free status.
Italian + UK are the first markets, so IT availability is part of the
public-facing experience — a free-tier visitor needs to see the AI in
Italian to convert. At ~$0.005/day total cost the gating isn't worth
the savings.

Co-Authored-By: Claude Opus 4.7
---
 .../specs/2026-05-27-localization-italian-design.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/superpowers/specs/2026-05-27-localization-italian-design.md b/docs/superpowers/specs/2026-05-27-localization-italian-design.md
index 1e98fa6..1c03633 100644
--- a/docs/superpowers/specs/2026-05-27-localization-italian-design.md
+++ b/docs/superpowers/specs/2026-05-27-localization-italian-design.md
@@ -373,3 +373,10 @@ End-to-end manual check after deploy:
 - We do not surface translation cost in any user-visible UI. Cost is
   recorded in `strategic_log_translations.llm_cost_usd` and the existing
   `ai_calls` ledger picks up per-user calls as today.
+- We do **not** gate strategic-log translation on user tier. Any user
+  with `lang='it'` triggers Italian translation for that hour's log,
+  regardless of whether they are paid, on credit, or free. Rationale:
+  Italian + UK are the first markets the operator is targeting, so
+  Italian availability is part of the public-facing experience — a
+  free-tier visitor needs to see the AI in Italian to convert. At
+  ~$0.005/day total cost the gating overhead is not worth the savings.
From 8af1da12dddf9fce8a21cc6e47150ec618476600 Mon Sep 17 00:00:00 2001
From: Giorgio Gilestro
Date: Wed, 27 May 2026 16:13:29 +0200
Subject: [PATCH 04/69] docs: implementation plan for Italian localization
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

11 TDD-style tasks: i18n service, translation helper, model +
migration, ai_log_job translation fan-out, per-user surfaces (analyse,
digest), localized /log endpoint, PATCH /api/settings/language,
dropdown UI, and final regression + manual smoke.

Per-user surfaces append "Respond in Italian."
to the system prompt (one extra line, no extra LLM call). The strategic
log is generated in English, then fanned out to translate() per active
non-en language in parallel via asyncio.gather. The /log endpoint
serves the matching translation row when present, English fallback
otherwise.

Translation uses the default call_llm provider chain — no separate
cheap-model carve-out needed at DeepSeek's $0.28/M output pricing.

Co-Authored-By: Claude Opus 4.7
---
 .../plans/2026-05-27-localization-italian.md | 1589 +++++++++++++++++
 1 file changed, 1589 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-27-localization-italian.md

diff --git a/docs/superpowers/plans/2026-05-27-localization-italian.md b/docs/superpowers/plans/2026-05-27-localization-italian.md
new file mode 100644
index 0000000..363689b
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-27-localization-italian.md
@@ -0,0 +1,1589 @@
+# Localization (Italian active, ES/FR/DE WIP) — Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Make every AI-generated user-facing surface render in Italian when the user picks `Italiano` in settings; lay the wiring so adding ES/FR/DE later is a one-line constant change.
+
+**Architecture:** Per-user surfaces (`portfolio_analysis`, `email_digest_job`, follow-up chat if present) thread the user's `lang` into the LLM prompt via a `respond_in_clause()` helper — one extra line on the system prompt, no extra call. The hourly `ai_log_job` writes the English `StrategicLog` row as today, then fans out parallel `translate()` calls (`asyncio.gather`) — one per active non-en language with at least one user — and persists each result in a new `strategic_log_translations` table.
+The `/log` endpoint serves the matching translation when present and falls back to English otherwise.
+
+**Tech Stack:** FastAPI · SQLAlchemy 2.0 async · Alembic · MariaDB (prod) / aiosqlite (tests) · existing `openrouter.call_llm` (DeepSeek-4-flash primary, OpenRouter fallback) · Jinja2 templates
+
+**Spec:** `docs/superpowers/specs/2026-05-27-localization-italian-design.md`
+
+---
+
+## File Structure
+
+**Create:**
+- `app/services/i18n.py` — `LANGUAGES`, `ACTIVE_LANGUAGES`, `respond_in_clause()`
+- `app/services/translation.py` — `translate(client, text, target_lang)` wrapping `call_llm`
+- `alembic/versions/0022_localization.py` — adds `users.lang`, creates `strategic_log_translations`
+- `tests/test_i18n.py` — unit tests for the two new services
+- `tests/test_localization_integration.py` — wiring/fan-out + route-level integration
+
+**Modify:**
+- `app/models.py` — add `User.lang`, new `StrategicLogTranslation` model
+- `app/jobs/ai_log_job.py` — translation fan-out after English row is committed
+- `app/services/portfolio_analysis.py` — accept + thread `lang` field
+- `app/routers/universe.py` — pass `cu.user.lang` (or `"en"` for admin) into `parse_request`
+- `app/jobs/email_digest_job.py` — thread `user.lang` into the per-user prompt
+- `app/routers/api.py` — add `PATCH /api/settings/language` endpoint
+- `app/routers/pages.py` — `log_page` / `log_page_day` serve translated content when available
+- `app/templates/settings.html` — language dropdown + small JS handler
+
+**Reuse without modification:**
+- `app/services/openrouter.call_llm`, `LogResult`
+- `app/auth.require_auth` / `require_token` / `CurrentUser`
+- `app/db.Base`, `utcnow`, `get_session`
+- Test session-factory pattern from `tests/test_referral_conversion.py::_build_session_factory`
+
+---
+
+## Test Conventions
+
+All tests runnable in the project-isolated container:
+
+```bash
+docker compose -f docker-compose.test.yml run --rm test pytest tests/test_i18n.py tests/test_localization_integration.py -v
+```
+
+DB-touching tests use the per-test `_build_session_factory(tmp_path)` pattern from `tests/test_referral_conversion.py`. LLM calls are mocked via `monkeypatch.setattr(<module>, "call_llm", AsyncMock(...))`. Real Yahoo/network calls are forbidden.
+
+---
+
+### Task 1: i18n service — LANGUAGES, ACTIVE_LANGUAGES, respond_in_clause
+
+**Files:**
+- Create: `app/services/i18n.py`
+- Test: `tests/test_i18n.py`
+
+- [ ] **Step 1: Write failing tests**
+
+Create `tests/test_i18n.py`:
+
+```python
+"""Unit tests for app.services.i18n."""
+from __future__ import annotations
+
+import pytest
+
+
+def test_languages_contains_all_four_plus_english():
+    from app.services.i18n import LANGUAGES
+    assert set(LANGUAGES.keys()) == {"en", "it", "es", "fr", "de"}
+    assert LANGUAGES["en"] == "English"
+    assert LANGUAGES["it"] == "Italian"
+    assert LANGUAGES["es"] == "Spanish"
+    assert LANGUAGES["fr"] == "French"
+    assert LANGUAGES["de"] == "German"
+
+
+def test_active_languages_is_en_and_it_only():
+    from app.services.i18n import ACTIVE_LANGUAGES
+    assert ACTIVE_LANGUAGES == {"en", "it"}
+
+
+def test_respond_in_clause_empty_for_english():
+    from app.services.i18n import respond_in_clause
+    assert respond_in_clause("en") == ""
+
+
+def test_respond_in_clause_empty_for_none_or_empty():
+    from app.services.i18n import respond_in_clause
+    assert respond_in_clause("") == ""
+    assert respond_in_clause(None) == ""
+
+
+def test_respond_in_clause_italian():
+    from app.services.i18n import respond_in_clause
+    result = respond_in_clause("it")
+    assert "Italian" in result
+    assert result.startswith("\n\n")
+
+
+def test_respond_in_clause_unknown_lang_falls_back_to_english():
+    """Defensive: a raw POST or stale lang code should not crash the
+    prompt assembly.
+    Unknown codes map to no-suffix (English default)."""
+    from app.services.i18n import respond_in_clause
+    assert respond_in_clause("xx") == ""
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+```bash
+docker compose -f docker-compose.test.yml run --rm test pytest tests/test_i18n.py -v
+```
+
+Expected: 6 FAIL with `ImportError`.
+
+- [ ] **Step 3: Implement `app/services/i18n.py`**
+
+```python
+"""Language registry + prompt helpers for localized AI output.
+
+Two surfaces consume this module:
+- Per-user LLM call sites (portfolio analysis, digest, chat) call
+  ``respond_in_clause(user.lang)`` and append the result to their
+  system prompt.
+- The settings dropdown + its PATCH endpoint consult ``ACTIVE_LANGUAGES``
+  to decide which options are selectable.
+
+Adding Spanish/French/German support later is a one-line constant
+change: extend ``ACTIVE_LANGUAGES`` to include the new code. No other
+code change is required — the rest of the system already treats them
+as first-class via ``LANGUAGES``.
+"""
+from __future__ import annotations
+
+
+# Display labels for every language the system knows about. ES/FR/DE
+# are kept here so labels still render in the dropdown (as disabled
+# options) without requiring code changes to enable them later.
+LANGUAGES: dict[str, str] = {
+    "en": "English",
+    "it": "Italian",
+    "es": "Spanish",
+    "fr": "French",
+    "de": "German",
+}
+
+
+# Languages users can actually select. Settings POST validates against
+# this; the strategic-log translation fan-out only considers these.
+ACTIVE_LANGUAGES: set[str] = {"en", "it"}
+
+
+def respond_in_clause(lang: str | None) -> str:
+    """Suffix appended to per-user LLM system prompts.
+
+    Returns an empty string for ``en`` (no nudge needed), an unknown
+    code, or ``None``/empty input — those callers want the default
+    English path. Otherwise returns ``"\\n\\nRespond in <name>."``
+    keyed off ``LANGUAGES``.
+ """ + if not lang or lang == "en" or lang not in LANGUAGES: + return "" + return f"\n\nRespond in {LANGUAGES[lang]}." +``` + +- [ ] **Step 4: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_i18n.py -v +``` + +Expected: 6 PASS. + +- [ ] **Step 5: Commit** + +```bash +git add app/services/i18n.py tests/test_i18n.py +git commit -m "i18n: add LANGUAGES, ACTIVE_LANGUAGES, respond_in_clause helper" +``` + +## Context + +- Working directory: `/home/gg/mydocker_images/products/read.markets`. Branch `main`. Commit directly. +- Test runner: ONLY `docker compose -f docker-compose.test.yml ...`. NEVER plain `docker compose ...` against the prod stack. +- If `git commit` is blocked by the auto-mode classifier, leave the tree dirty and report — the controller will commit. + +--- + +### Task 2: translation service + +**Files:** +- Create: `app/services/translation.py` +- Test: `tests/test_i18n.py` (append) + +- [ ] **Step 1: Write failing tests** + +Append to `tests/test_i18n.py`: + +```python +@pytest.mark.asyncio +async def test_translate_happy_path(monkeypatch): + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + monkeypatch.setattr(mod, "call_llm", AsyncMock(return_value=LogResult( + content="# Apertura\n\nIl mercato è in calo dello 0,4%.", + model="deepseek/deepseek-v4-flash", + prompt_tokens=300, completion_tokens=80, cost_usd=0.00002, + ))) + + client = MagicMock() + translated, llm_log = await mod.translate( + client, "# Open\n\nThe market is down 0.4%.", "it", + ) + assert "Apertura" in translated + assert llm_log.model == "deepseek/deepseek-v4-flash" + assert llm_log.cost_usd == pytest.approx(0.00002) + + +@pytest.mark.asyncio +async def test_translate_strips_code_fences(monkeypatch): + """If the LLM wraps the output in ```markdown ... 
```, strip it.""" + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + fenced = "```markdown\n# Titolo\n\nCorpo.\n```" + monkeypatch.setattr(mod, "call_llm", AsyncMock(return_value=LogResult( + content=fenced, model="m", prompt_tokens=10, completion_tokens=20, cost_usd=0.0, + ))) + + client = MagicMock() + translated, _ = await mod.translate(client, "# Title\n\nBody.", "it") + assert "```" not in translated + assert translated.startswith("# Titolo") + + +@pytest.mark.asyncio +async def test_translate_provider_failure_propagates(monkeypatch): + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + + monkeypatch.setattr(mod, "call_llm", AsyncMock(side_effect=RuntimeError("upstream down"))) + + client = MagicMock() + with pytest.raises(RuntimeError, match="upstream down"): + await mod.translate(client, "# Title\n\nBody.", "it") + + +@pytest.mark.asyncio +async def test_translate_unknown_lang_returns_source_unchanged(monkeypatch): + """Defensive: an unknown lang code (or 'en') short-circuits without + calling the LLM. Callers shouldn't have to gate the call themselves.""" + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + call_mock = AsyncMock(return_value=LogResult( + content="should not be returned", + model="m", prompt_tokens=0, completion_tokens=0, cost_usd=0.0, + )) + monkeypatch.setattr(mod, "call_llm", call_mock) + + client = MagicMock() + out, _ = await mod.translate(client, "Hello world.", "en") + assert out == "Hello world." + call_mock.assert_not_awaited() +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_i18n.py -k translate -v +``` + +Expected: 4 FAIL with `ImportError`. 
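
The fence-stripping behaviour that `test_translate_strips_code_fences` expects can be sketched in isolation. This is a minimal sketch, not the project's code: `strip_code_fence` and `FENCE` are hypothetical names introduced here for illustration, and the three-backtick marker is built indirectly only so the example nests cleanly inside a fenced block.

```python
FENCE = "`" * 3  # three backticks; hypothetical constant for this sketch


def strip_code_fence(content: str) -> str:
    """Remove one wrapping Markdown code fence, if the text has one."""
    content = (content or "").strip()
    if content.startswith(FENCE):
        # Drop the opening fence line, including an optional language tag.
        first_nl = content.find("\n")
        if first_nl != -1:
            content = content[first_nl + 1:]
        # Drop the closing fence if the remaining text ends with one.
        if content.rstrip().endswith(FENCE):
            content = content.rstrip()[:-3].rstrip()
    return content.strip()


# Same fixture shape as the test: a translation wrapped in a markdown fence.
fenced = FENCE + "markdown\n# Titolo\n\nCorpo.\n" + FENCE
print(strip_code_fence(fenced))
```

Text that carries no fence passes through unchanged apart from surrounding whitespace, so the helper is safe to apply to every model response.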
+ +- [ ] **Step 3: Implement `app/services/translation.py`** + +```python +"""Markdown translation via the existing LLM provider chain. + +DeepSeek-4-flash at ~$0.28/M output tokens is cheap enough that we +don't bother with a separate translation-only model. ``call_llm``'s +provider chain (DeepSeek primary, OpenRouter fallback) handles this +path identically to any other LLM call. + +The translator is content-aware in one important way: it instructs the +model to preserve markdown structure, ticker symbols, numbers, dates, +and percentages verbatim. This keeps generated artefacts (tables of +quotes, embedded percentages, dated references) intact across the +translation boundary. +""" +from __future__ import annotations + +import httpx + +from app.services.i18n import LANGUAGES +from app.services.openrouter import LogResult, call_llm + + +_SYSTEM_PROMPT_TMPL = """\ +You are an expert translator working on financial-markets commentary. +Translate the following markdown text to {language}. + +Strict rules: +- Preserve ALL markdown formatting (headings, lists, emphasis, links, + tables, code spans). +- Do NOT translate ticker symbols (AAPL, MSFT, VOD.L, ASML.AS, etc.), + company legal names, percentages, dates, ISO currency codes, or any + numbers. +- Do NOT add commentary, preambles, or apologies. Output ONLY the + translated markdown. +""" + + +async def translate( + client: httpx.AsyncClient, + text: str, + target_lang: str, +) -> tuple[str, LogResult]: + """Translate markdown ``text`` to ``target_lang``. + + Returns ``(translated_markdown, LogResult)``. Caller persists the + cost/model provenance from LogResult next to the cached row. + + Short-circuits without calling the LLM when ``target_lang`` is + ``'en'``, unknown, or empty — returns the source unchanged with a + zero-cost stub LogResult. This lets fan-out callers iterate over + all languages without per-call gating. + + Raises on provider failure (HTTP error, all chain providers down). 
+ Callers in fan-out paths should catch and log per-language. + """ + if not target_lang or target_lang == "en" or target_lang not in LANGUAGES: + # No-op fast path. Returning a fake LogResult keeps the call + # signature stable for callers who unpack the tuple. + return text, LogResult( + content=text, model="noop", + prompt_tokens=0, completion_tokens=0, cost_usd=0.0, + ) + + system_prompt = _SYSTEM_PROMPT_TMPL.format(language=LANGUAGES[target_lang]) + messages = [ + {"role": "system", "content": system_prompt}, + {"role": "user", "content": text}, + ] + result = await call_llm(client, messages) + + content = (result.content or "").strip() + # Strip code fences if the model wrapped its output despite the system rule. + if content.startswith("```"): + # Drop the opening fence (with optional language tag). + first_nl = content.find("\n") + if first_nl != -1: + content = content[first_nl + 1:] + # Drop the closing fence. + if content.rstrip().endswith("```"): + content = content.rstrip()[:-3].rstrip() + content = content.strip() + + return content, result +``` + +- [ ] **Step 4: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_i18n.py -v +``` + +Expected: 10 tests pass total (6 from Task 1 + 4 new). + +- [ ] **Step 5: Commit** + +```bash +git add app/services/translation.py tests/test_i18n.py +git commit -m "i18n: add translate() helper backed by call_llm" +``` + +--- + +### Task 3: User.lang column + StrategicLogTranslation model + +**Files:** +- Modify: `app/models.py` +- Test: `tests/test_localization_integration.py` + +- [ ] **Step 1: Write failing tests** + +Create `tests/test_localization_integration.py`: + +```python +"""Integration tests: model surface, ai_log_job translation fan-out, +route-level localized fetch, settings PATCH validation.""" +from __future__ import annotations + +import pytest + + +def _build_session_factory(tmp_path): + """Per-test sqlite engine + factory. 
Mirrors test_referral_conversion.py.""" + from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine + + from app import db as db_mod + from app.db import Base + import app.models # noqa: F401 — registers models on Base.metadata + + engine = create_async_engine(f"sqlite+aiosqlite:///{tmp_path}/loc.db") + factory = async_sessionmaker(engine, expire_on_commit=False) + db_mod._engine = engine + db_mod._session_factory = factory + + async def _setup(): + async with engine.begin() as conn: + await conn.run_sync(Base.metadata.create_all) + + return engine, factory, _setup + + +def test_user_has_lang_column_with_default_en(): + from sqlalchemy import inspect + from app.models import User + + cols = {c.name: c for c in inspect(User).columns} + assert "lang" in cols + assert cols["lang"].nullable is False + # SQLAlchemy default may be a callable or a literal — check both. + default = cols["lang"].default + assert default is not None + if hasattr(default, "arg"): + assert default.arg == "en" + + +def test_strategic_log_translation_model_columns(): + from sqlalchemy import inspect + from app.models import StrategicLogTranslation + + cols = {c.name: c for c in inspect(StrategicLogTranslation).columns} + assert "log_id" in cols + assert "lang" in cols + assert "content_md" in cols + assert "generated_at" in cols + assert "llm_model" in cols + assert "llm_cost_usd" in cols + assert cols["log_id"].nullable is False + assert cols["lang"].nullable is False + assert cols["content_md"].nullable is False +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: 2 FAIL — first on `User.lang` missing, second on `StrategicLogTranslation` import error. + +- [ ] **Step 3: Add `User.lang` column** + +In `app/models.py`, find the `User` class. 
Find a sensible place near other user-preference columns (next to `tone` or `digest_tone`) and add: + +```python + # Preferred language for AI-generated content (strategic log, + # digest emails, portfolio commentary). Default 'en'. The settings + # PATCH endpoint validates against ACTIVE_LANGUAGES in + # app/services/i18n.py before writing. + lang: Mapped[str] = mapped_column( + String(8), nullable=False, default="en", server_default="en", + ) +``` + +- [ ] **Step 4: Add `StrategicLogTranslation` model** + +In `app/models.py`, append after the existing `StrategicLog` class (around line 108-122): + +```python +class StrategicLogTranslation(Base): + """Cached translation of a single StrategicLog row. + + Populated by ai_log_job after the English row is committed: one + row per (log_id, lang) combination. The /log endpoint serves the + matching row when available and falls back to the English source + when no row exists yet (e.g. translation failed or the language + was added after the log was generated). + + No user attribution — the cache is shared. Setting `lang` on a + user just selects which (already-translated) variant they see. 
+ """ + __tablename__ = "strategic_log_translations" + + id: Mapped[int] = mapped_column(_PK, primary_key=True, autoincrement=True) + log_id: Mapped[int] = mapped_column( + _PK, ForeignKey("strategic_logs.id", ondelete="CASCADE"), nullable=False, + ) + lang: Mapped[str] = mapped_column(String(8), nullable=False) + content_md: Mapped[str] = mapped_column(Text, nullable=False) + generated_at: Mapped[datetime] = mapped_column( + DateTime(timezone=True), nullable=False, default=utcnow, + ) + llm_model: Mapped[str | None] = mapped_column(String(64)) + llm_cost_usd: Mapped[float | None] = mapped_column(Float) + + __table_args__ = ( + UniqueConstraint("log_id", "lang", name="uq_slt_log_lang"), + ) +``` + +- [ ] **Step 5: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: 2 PASS. + +- [ ] **Step 6: Commit** + +```bash +git add app/models.py tests/test_localization_integration.py +git commit -m "models: add User.lang + StrategicLogTranslation" +``` + +## Context for this task + +- The existing `User` class already imports `String`, `mapped_column`, `Mapped`. No new imports needed for `User.lang`. +- For `StrategicLogTranslation`, `_PK`, `String`, `Text`, `DateTime`, `Float`, `ForeignKey`, `UniqueConstraint`, `Mapped`, `mapped_column`, `Base`, `utcnow`, `datetime` are all already imported at the top of `app/models.py` — no new imports needed. + +--- + +### Task 4: Alembic migration 0022 + +**Files:** +- Create: `alembic/versions/0022_localization.py` + +- [ ] **Step 1: Write the migration** + +```python +"""localization: users.lang + strategic_log_translations. 
+ +Revision ID: 0022 +Revises: 0021 +Create Date: 2026-05-27 +""" +from typing import Sequence, Union + +import sqlalchemy as sa +from alembic import op + + +revision: str = "0022" +down_revision: Union[str, None] = "0021" +branch_labels: Union[str, Sequence[str], None] = None +depends_on: Union[str, Sequence[str], None] = None + + +def upgrade() -> None: + op.add_column( + "users", + sa.Column( + "lang", sa.String(length=8), nullable=False, + server_default="en", + ), + ) + op.create_table( + "strategic_log_translations", + sa.Column("id", sa.BigInteger(), primary_key=True, autoincrement=True), + sa.Column("log_id", sa.BigInteger(), nullable=False), + sa.Column("lang", sa.String(length=8), nullable=False), + sa.Column("content_md", sa.Text(), nullable=False), + sa.Column("generated_at", sa.DateTime(timezone=True), nullable=False), + sa.Column("llm_model", sa.String(length=64), nullable=True), + sa.Column("llm_cost_usd", sa.Float(), nullable=True), + sa.ForeignKeyConstraint( + ["log_id"], ["strategic_logs.id"], + ondelete="CASCADE", name="fk_slt_log", + ), + sa.UniqueConstraint("log_id", "lang", name="uq_slt_log_lang"), + ) + + +def downgrade() -> None: + op.drop_table("strategic_log_translations") + op.drop_column("users", "lang") +``` + +- [ ] **Step 2: Verify migration chain integrity** + +```bash +docker compose -f docker-compose.test.yml run --rm test python -c " +from alembic.config import Config +from alembic.script import ScriptDirectory +sd = ScriptDirectory.from_config(Config('alembic.ini')) +heads = sd.get_heads() +assert heads == ('0022',), heads +rev = sd.get_revision('0022') +assert rev.down_revision == '0021' +print('OK') +" +``` + +Expected: prints `OK`. + +- [ ] **Step 3: Commit** + +```bash +git add alembic/versions/0022_localization.py +git commit -m "alembic: add 0022 localization (users.lang + strategic_log_translations)" +``` + +## Context + +- Previous migration is `0021_csv_format_template.py`. 
The new file follows the same hand-rolled style. +- Codebase convention for integer server_defaults is `sa.text("0")` — but here we have no integer defaults to set. String default `"en"` uses the bare string per existing migrations (e.g. `0011_drop_portfolio_tables.py` uses `server_default="GBP"`). + +--- + +### Task 5: ai_log_job translation fan-out + +**Files:** +- Modify: `app/jobs/ai_log_job.py` +- Test: `tests/test_localization_integration.py` (append) + +- [ ] **Step 1: Inspect the existing log-writing path** + +```bash +grep -n "def \|StrategicLog\|session.commit\|session.add" app/jobs/ai_log_job.py | head -20 +``` + +Locate the function and the line where the new `StrategicLog` row is committed. The fan-out runs **after** that commit. + +- [ ] **Step 2: Write failing tests** + +Append to `tests/test_localization_integration.py`: + +```python +@pytest.mark.asyncio +async def test_log_translation_fanout_no_active_non_en_users(tmp_path, monkeypatch): + """When no users have an active non-en lang, the fan-out makes no + translation calls and no rows are inserted.""" + from unittest.mock import AsyncMock + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + fake_translate = AsyncMock() + monkeypatch.setattr(ai_log_job, "translate", fake_translate) + + # Seed an English user (no non-en users). 
+ async with factory() as session: + session.add(User(id=1, email="en@x", tier="paid", lang="en")) + slog = StrategicLog( + generated_at=utcnow(), content_md="# Open\n\nDown 0.4%.", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + fake_translate.assert_not_awaited() + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert rows == [] + + +@pytest.mark.asyncio +async def test_log_translation_fanout_italian_user(tmp_path, monkeypatch): + """One user at lang=it triggers one translation; the row lands with + the right lang and log_id.""" + from unittest.mock import AsyncMock + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.services.openrouter import LogResult + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async def _fake_translate(client, text, target_lang): + assert target_lang == "it" + return "# Apertura\n\nIn calo 0,4%.", LogResult( + content="# Apertura\n\nIn calo 0,4%.", + model="deepseek/deepseek-v4-flash", + prompt_tokens=300, completion_tokens=80, cost_usd=0.00002, + ) + monkeypatch.setattr(ai_log_job, "translate", _fake_translate) + + async with factory() as session: + session.add(User(id=2, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content_md="# Open\n\nDown 0.4%.", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert len(rows) == 1 + 
row = rows[0] + assert row.log_id == log_id + assert row.lang == "it" + assert row.content_md.startswith("# Apertura") + assert row.llm_model == "deepseek/deepseek-v4-flash" + assert row.llm_cost_usd == pytest.approx(0.00002) + + +@pytest.mark.asyncio +async def test_log_translation_fanout_per_language_failure_isolated(tmp_path, monkeypatch): + """If one language's translation fails, the others (if any) still land + and the job does not raise.""" + from unittest.mock import AsyncMock + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async def _fake_translate(client, text, target_lang): + raise RuntimeError("upstream down") + monkeypatch.setattr(ai_log_job, "translate", _fake_translate) + + async with factory() as session: + session.add(User(id=3, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content_md="# Open", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + # Must NOT raise. + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert rows == [] +``` + +- [ ] **Step 3: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k log_translation_fanout -v +``` + +Expected: 3 FAIL with `AttributeError: module 'app.jobs.ai_log_job' has no attribute 'translate_log_for_active_languages'`. 
+ +- [ ] **Step 4: Add the fan-out function** + +In `app/jobs/ai_log_job.py`, add (at module scope, alongside other helpers; use existing imports + add what's missing — `httpx`, `asyncio`, `select`, the i18n + translation modules, and the model imports): + +```python +import asyncio +import httpx + +from sqlalchemy import select + +from app.db import utcnow +from app.models import User, StrategicLogTranslation +from app.services.i18n import ACTIVE_LANGUAGES +from app.services.translation import translate +``` + +(Add only the lines not already present.) + +Then add the function: + +```python +async def translate_log_for_active_languages(session, log_id: int) -> None: + """Fan out per-language translations for the strategic log identified + by ``log_id``. + + Reads ``users.lang`` (deduplicated, restricted to ACTIVE_LANGUAGES + minus English), one translation call per language in parallel via + ``asyncio.gather``, persists each successful result as a + ``StrategicLogTranslation`` row. Per-language failures are logged + but never raise — the strategic log itself is already committed at + this point and translation is a best-effort enhancement. + + The job orchestrator calls this AFTER the English ``StrategicLog`` + row is committed; pass the row's ``id`` in. 
+ """ + from app.models import StrategicLog # local import: avoid widening top-level imports + target_langs = sorted({l for l in ACTIVE_LANGUAGES if l != "en"}) + if not target_langs: + return + + active_langs = (await session.execute( + select(User.lang).distinct().where(User.lang.in_(target_langs)) + )).scalars().all() + if not active_langs: + return + + log_row = await session.get(StrategicLog, log_id) + if log_row is None: + log.warning("log.translate.missing_log", log_id=log_id) + return + + async with httpx.AsyncClient(follow_redirects=True, timeout=60) as client: + results = await asyncio.gather(*[ + translate(client, log_row.content_md, lang) + for lang in active_langs + ], return_exceptions=True) + + for lang, result in zip(active_langs, results): + if isinstance(result, Exception): + log.warning("log.translate.failed", lang=lang, log_id=log_id, + error=str(result)[:200]) + continue + translated_md, llm_log = result + session.add(StrategicLogTranslation( + log_id=log_id, lang=lang, + content_md=translated_md, + generated_at=utcnow(), + llm_model=llm_log.model, + llm_cost_usd=llm_log.cost_usd, + )) + await session.commit() +``` + +- [ ] **Step 5: Wire the fan-out into the existing log-write path** + +Find the function in `ai_log_job.py` that writes a `StrategicLog` row and calls `session.commit()`. After that commit, capture the row's `id` and call: + +```python +await translate_log_for_active_languages(session, slog.id) +``` + +(Use whatever the actual local variable name for the row is.) + +- [ ] **Step 6: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: 5 tests pass (2 from Task 3 + 3 new). 
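
The per-language isolation in the fan-out rests on `asyncio.gather(..., return_exceptions=True)`, which returns raised exceptions as ordinary results instead of propagating the first one. The sketch below shows that pattern on its own, under stated assumptions: `fake_translate` and `fan_out` are hypothetical stand-ins, there is no database or HTTP client, and `"fr"` is used only to force a failure.

```python
import asyncio


async def fake_translate(text: str, lang: str) -> str:
    """Hypothetical stand-in for translate(); fails for one language."""
    if lang == "fr":
        raise RuntimeError("upstream down")
    return f"[{lang}] {text}"


async def fan_out(text: str, langs: list[str]) -> tuple[dict[str, str], dict[str, str]]:
    """Translate ``text`` into every language without letting one failure abort the rest."""
    results = await asyncio.gather(
        *(fake_translate(text, lang) for lang in langs),
        return_exceptions=True,
    )
    done: dict[str, str] = {}
    failed: dict[str, str] = {}
    # gather() returns results in input order, so zip() pairs each
    # result (or exception) with the language that produced it.
    for lang, result in zip(langs, results):
        if isinstance(result, Exception):
            failed[lang] = str(result)
        else:
            done[lang] = result
    return done, failed


done, failed = asyncio.run(fan_out("# Open", ["it", "fr"]))
print(done, failed)
```

In the real job the `failed` branch corresponds to the logged warning and the `done` branch to the inserted `StrategicLogTranslation` rows.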
+ +- [ ] **Step 7: Commit** + +```bash +git add app/jobs/ai_log_job.py tests/test_localization_integration.py +git commit -m "ai-log-job: translate strategic log for active non-en languages" +``` + +--- + +### Task 6: portfolio_analysis localization + +**Files:** +- Modify: `app/services/portfolio_analysis.py` +- Modify: `app/routers/universe.py` — the `/api/analyze` route +- Test: `tests/test_localization_integration.py` (append) + +- [ ] **Step 1: Write failing tests** + +Append to `tests/test_localization_integration.py`: + +```python +@pytest.mark.asyncio +async def test_analyse_threads_lang_into_system_prompt(monkeypatch): + """When lang='it', the system prompt sent to call_llm contains + 'Respond in Italian.' — the LLM does the rest.""" + from unittest.mock import AsyncMock + from app.services import portfolio_analysis as pa + from app.services.openrouter import LogResult + + captured = {} + + async def _fake_call_llm(client, messages, **kw): + captured["messages"] = messages + return LogResult( + content="Analisi del portafoglio in italiano.", + model="m", prompt_tokens=400, completion_tokens=100, cost_usd=0.0001, + ) + monkeypatch.setattr(pa, "call_llm", _fake_call_llm) + + payload = { + "positions": [{"yahoo_ticker": "AAPL", "qty": 10, "avg_cost": 150.0, + "currency": "USD", "name": "Apple Inc"}], + "prices": {"AAPL": {"p": 172.4, "c": "USD"}}, + "fx": {"USD": 1.0}, + "base_currency": "USD", + "tone": "INTERMEDIATE", + "analysis": "NORMAL", + "lang": "it", + } + req = pa.parse_request(payload) + assert req.lang == "it" + + # Direct call into analyse() to inspect the captured prompt. + # Use None session — analyse should not touch the DB in this code path + # because we mocked call_llm before any AICall ledger write. + # If analyse insists on a session, wrap with the test factory. 
+ result = await pa.analyse(None, req) # noqa: ARG — session ignored by mock + system = next(m["content"] for m in captured["messages"] if m["role"] == "system") + assert "Respond in Italian" in system + + +@pytest.mark.asyncio +async def test_analyse_no_clause_when_lang_is_en(monkeypatch): + from unittest.mock import AsyncMock + from app.services import portfolio_analysis as pa + from app.services.openrouter import LogResult + + captured = {} + + async def _fake_call_llm(client, messages, **kw): + captured["messages"] = messages + return LogResult( + content="Portfolio analysis in English.", + model="m", prompt_tokens=400, completion_tokens=100, cost_usd=0.0001, + ) + monkeypatch.setattr(pa, "call_llm", _fake_call_llm) + + payload = { + "positions": [{"yahoo_ticker": "AAPL", "qty": 10, "avg_cost": 150.0, + "currency": "USD", "name": "Apple Inc"}], + "prices": {"AAPL": {"p": 172.4, "c": "USD"}}, + "fx": {"USD": 1.0}, + "base_currency": "USD", + "tone": "INTERMEDIATE", + "analysis": "NORMAL", + "lang": "en", + } + req = pa.parse_request(payload) + await pa.analyse(None, req) + system = next(m["content"] for m in captured["messages"] if m["role"] == "system") + assert "Respond in" not in system +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k analyse -v +``` + +Expected: 2 FAIL — either `lang` field on `AnalysisRequest` missing, or "Respond in Italian" not in the prompt. + +- [ ] **Step 3: Add `lang` to AnalysisRequest and thread through** + +In `app/services/portfolio_analysis.py`: + +1. Locate the `AnalysisRequest` dataclass (or pydantic model). Add a `lang: str = "en"` field next to `tone` / `analysis`. +2. Locate `parse_request`. Read `payload.get("lang", "en")` and pass it to the request constructor. 
Validate against `LANGUAGES`: + +```python +from app.services.i18n import LANGUAGES, respond_in_clause + +# Inside parse_request, alongside the existing tone/analysis parsing: +lang = (payload.get("lang") or "en").strip().lower() +if lang not in LANGUAGES: + lang = "en" +``` + +Then build the request with `lang=lang`. + +3. Locate `analyse` (the async function that builds messages and calls `call_llm`). After the system prompt is composed, append the i18n clause: + +```python +system_prompt = system_prompt + respond_in_clause(req.lang) +``` + +(Use whatever the local variable name for the system prompt is.) + +- [ ] **Step 4: Pass the user's lang from the route** + +In `app/routers/universe.py`, find `analyze_portfolio` (the `/api/analyze` route handler). Add the user's lang to the payload before calling `parse_request`: + +```python +# Just before parse_request: +user_lang = ( + principal.user.lang if (principal.user and principal.user.lang) else "en" +) +payload["lang"] = user_lang +``` + +(The handler receives `principal` via Depends. Confirm by reading the handler's signature; if the principal isn't already wired in, add `principal: CurrentUser = Depends(require_paid)` matching the existing dep.) + +- [ ] **Step 5: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: 7 tests pass (5 from Tasks 3+5 + 2 new). 
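
How the language validation and the prompt suffix compose can be checked without the rest of the service. The sketch below copies `LANGUAGES` and `respond_in_clause` from the Task 1 module so it runs standalone; `normalise_lang` and `BASE_PROMPT` are hypothetical names standing in for the `parse_request` logic and the real system prompt.

```python
from __future__ import annotations

# Copied from the Task 1 i18n module so the sketch is self-contained.
LANGUAGES: dict[str, str] = {
    "en": "English", "it": "Italian", "es": "Spanish",
    "fr": "French", "de": "German",
}


def respond_in_clause(lang: str | None) -> str:
    """Empty for English, unknown, or missing codes; otherwise the suffix."""
    if not lang or lang == "en" or lang not in LANGUAGES:
        return ""
    return f"\n\nRespond in {LANGUAGES[lang]}."


def normalise_lang(raw: str | None) -> str:
    """Hypothetical helper mirroring the validation added to parse_request."""
    lang = (raw or "en").strip().lower()
    return lang if lang in LANGUAGES else "en"


BASE_PROMPT = "You are a portfolio analyst."  # placeholder, not the real prompt

# Build the final system prompt for a few raw payload values.
prompts = {
    repr(raw): BASE_PROMPT + respond_in_clause(normalise_lang(raw))
    for raw in ("it", " IT ", "en", None, "xx")
}
for raw, prompt in prompts.items():
    print(raw, "->", repr(prompt))
```

Because unknown or missing codes collapse to `"en"` before the clause is built, the English prompt stays byte-identical to what it was before localization.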
+ +- [ ] **Step 6: Commit** + +```bash +git add app/services/portfolio_analysis.py app/routers/universe.py tests/test_localization_integration.py +git commit -m "analyse: thread user.lang into the system prompt" +``` + +--- + +### Task 7: email_digest_job localization + +**Files:** +- Modify: `app/jobs/email_digest_job.py` +- Test: `tests/test_localization_integration.py` (append) + +- [ ] **Step 1: Write failing test** + +Append to `tests/test_localization_integration.py`: + +```python +@pytest.mark.asyncio +async def test_digest_threads_lang_into_system_prompt(monkeypatch): + """The per-user digest generation appends 'Respond in Italian.' to + the system prompt when the user is Italian.""" + from unittest.mock import AsyncMock + from app.jobs import email_digest_job as ed + from app.services.openrouter import LogResult + + captured = [] + + async def _fake_call_llm(client, messages, **kw): + captured.append(messages) + return LogResult( + content="**Apertura.** Il mercato è in calo.", + model="m", prompt_tokens=300, completion_tokens=400, cost_usd=0.0001, + ) + monkeypatch.setattr(ed, "call_llm", _fake_call_llm) + + # _generate_variants is the helper that runs one LLM call per tone. + # It takes a context dict and a 'kind' (daily/weekly). The exact + # signature is in app/jobs/email_digest_job.py — inspect before + # calling. The test below assumes it accepts a `target_lang` kwarg. + from datetime import datetime, timezone + + ctx = { + "today": datetime.now(timezone.utc), + "quotes_by_group": {}, + "headlines_by_bucket": {}, + "reference_line": None, + } + + # `_generate_variants` should iterate tones internally; we just need + # to assert at least one captured system prompt has the IT clause. 
+ import httpx + async with httpx.AsyncClient() as client: + await ed._generate_variants(None, client, "daily", ctx, target_lang="it") + + assert captured, "no LLM call was made" + italian_found = any( + any( + m["role"] == "system" and "Respond in Italian" in m["content"] + for m in messages + ) + for messages in captured + ) + assert italian_found, "no system prompt contained 'Respond in Italian'" +``` + +- [ ] **Step 2: Run test to verify it fails** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k digest_threads_lang -v +``` + +Expected: FAIL — either `_generate_variants` doesn't accept `target_lang`, or the IT clause isn't in the prompt. + +- [ ] **Step 3: Thread `target_lang` through `_generate_variants` and the per-user driver** + +In `app/jobs/email_digest_job.py`: + +1. Import the helper: + ```python + from app.services.i18n import respond_in_clause + ``` + +2. Find `_generate_variants`. Add `target_lang: str = "en"` to its signature. Where it composes each variant's system prompt, append: + ```python + system_prompt = system_prompt + respond_in_clause(target_lang) + ``` + +3. Find the per-user send path (the function that actually iterates users — likely `_send_for_user` or similar, called from the job's main loop). Where it calls `_generate_variants`, pass `target_lang=user.lang`: + ```python + variants = await _generate_variants( + session, client, kind, ctx, target_lang=user.lang, + ) + ``` + + If the existing call site is in the main job loop and constructs `variants` once for all users, that breaks the "per-user language" contract. In that case the variants must be generated PER USER, not globally. Look for the caller; if it caches `variants` across users, restructure to call `_generate_variants` inside the per-user loop. **Important:** if this requires more than a few lines of change, stop and report a concern — the existing assumption may be wrong and we want explicit guidance. 
+ +- [ ] **Step 4: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: all tests pass (8 total now). + +- [ ] **Step 5: Commit** + +```bash +git add app/jobs/email_digest_job.py tests/test_localization_integration.py +git commit -m "digest: thread user.lang into per-user generation" +``` + +--- + +### Task 8: /log endpoint localized fetch + +**Files:** +- Modify: `app/routers/pages.py` — `log_page` and `log_page_day` +- Test: `tests/test_localization_integration.py` (append) + +- [ ] **Step 1: Inspect the existing log endpoints** + +```bash +grep -n "def log_page\|StrategicLog\|content_md\|generated_at" app/routers/pages.py | head -20 +``` + +Locate the function(s) that fetch the strategic log and pass `content_md` to the template. + +- [ ] **Step 2: Write a failing test** + +Append to `tests/test_localization_integration.py`: + +```python +@pytest.mark.asyncio +async def test_log_endpoint_serves_italian_when_user_is_italian(tmp_path): + """When a user with lang='it' opens /log, the served content_md is + the Italian translation, not the English original.""" + from datetime import datetime, timezone + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=10, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content_md="# Open\n\nDown 0.4%.", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + session.add(StrategicLogTranslation( + log_id=slog.id, lang="it", + content_md="# Apertura\n\nIn calo 0,4%.", + generated_at=utcnow(), llm_model="m", llm_cost_usd=0.0, + )) + await session.commit() + log_id = slog.id + + # We test the resolver function directly rather than spinning up the + # FastAPI TestClient — 
the resolver shape returns the rendered MD. + from app.routers.pages import _resolve_log_content + async with factory() as session: + user = await session.get(User, 10) + content = await _resolve_log_content(session, log_id, user.lang) + assert "Apertura" in content + assert "Open" not in content + + +@pytest.mark.asyncio +async def test_log_endpoint_falls_back_to_english_when_no_translation(tmp_path): + """User lang='it' but no IT translation exists → English fallback.""" + from app.db import utcnow + from app.models import StrategicLog, User + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=11, email="it2@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content_md="# Open\n\nDown 0.4%.", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + from app.routers.pages import _resolve_log_content + async with factory() as session: + user = await session.get(User, 11) + content = await _resolve_log_content(session, log_id, user.lang) + assert "Open" in content +``` + +- [ ] **Step 3: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k log_endpoint -v +``` + +Expected: 2 FAIL — `_resolve_log_content` doesn't exist yet. + +- [ ] **Step 4: Add the resolver and wire it into `log_page` / `log_page_day`** + +In `app/routers/pages.py`, add the resolver as a module-level async function: + +```python +async def _resolve_log_content( + session: AsyncSession, log_id: int, lang: str, +) -> str: + """Return the markdown content of strategic log ``log_id`` in the + user's preferred language. + + If ``lang`` is ``en`` or no translation exists for the requested + language, returns the English original. 
The fallback is silent — + a missing translation is the expected case for hours where + translation hasn't yet run.""" + from app.models import StrategicLog, StrategicLogTranslation + + if lang and lang != "en": + row = (await session.execute( + select(StrategicLogTranslation) + .where(StrategicLogTranslation.log_id == log_id) + .where(StrategicLogTranslation.lang == lang) + )).scalar_one_or_none() + if row is not None: + return row.content_md + log_row = await session.get(StrategicLog, log_id) + return log_row.content_md if log_row is not None else "" +``` + +Then in `log_page` (and `log_page_day` if present), replace the line that pulls `content_md` directly from the StrategicLog row with a call to `_resolve_log_content(session, log.id, cu.user.lang if cu.user else "en")`. + +- [ ] **Step 5: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: all 10 tests pass. + +- [ ] **Step 6: Commit** + +```bash +git add app/routers/pages.py tests/test_localization_integration.py +git commit -m "log: serve translated content when available; English fallback" +``` + +--- + +### Task 9: PATCH /api/settings/language endpoint + +**Files:** +- Modify: `app/routers/api.py` +- Test: `tests/test_localization_integration.py` (append) + +- [ ] **Step 1: Write failing tests** + +Append to `tests/test_localization_integration.py`: + +```python +@pytest.mark.asyncio +async def test_patch_language_accepts_active(tmp_path): + """PATCH /api/settings/language accepts 'en' and 'it' and persists.""" + from app.models import User + from app.routers.api import patch_language_prefs, LanguagePrefsIn + from app.auth import CurrentUser + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=20, email="u@x", tier="paid", lang="en")) + await session.commit() + + class _P: + is_admin = False + def __init__(self, u): 
self.user = u + + async with factory() as session: + user = await session.get(User, 20) + result = await patch_language_prefs( + payload=LanguagePrefsIn(lang="it"), + principal=_P(user), + session=session, + ) + assert result.lang == "it" + + async with factory() as session: + user = await session.get(User, 20) + assert user.lang == "it" + + +@pytest.mark.asyncio +async def test_patch_language_rejects_wip(tmp_path): + """PATCH rejects 'es'/'fr'/'de'/'xx' with 400 — ACTIVE_LANGUAGES gate.""" + from fastapi import HTTPException + from app.models import User + from app.routers.api import patch_language_prefs, LanguagePrefsIn + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=21, email="u2@x", tier="paid", lang="en")) + await session.commit() + + class _P: + is_admin = False + def __init__(self, u): self.user = u + + for bad in ("es", "fr", "de", "xx"): + async with factory() as session: + user = await session.get(User, 21) + with pytest.raises(HTTPException) as exc: + await patch_language_prefs( + payload=LanguagePrefsIn(lang=bad), + principal=_P(user), + session=session, + ) + assert exc.value.status_code == 400 +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k patch_language -v +``` + +Expected: 2 FAIL with `ImportError` for `patch_language_prefs` / `LanguagePrefsIn`. 
+ +- [ ] **Step 3: Add the endpoint** + +In `app/routers/api.py`, near the existing `patch_digest_prefs` (around lines 868-897), add: + +```python +from app.services.i18n import ACTIVE_LANGUAGES + + +# --------------------------------------------------------------------------- +# Settings — language preference +# --------------------------------------------------------------------------- + + +class LanguagePrefsIn(BaseModel): + lang: str + + +class LanguagePrefsOut(BaseModel): + lang: str + + +@router.patch("/settings/language", response_model=LanguagePrefsOut) +async def patch_language_prefs( + payload: LanguagePrefsIn, + principal: CurrentUser = Depends(require_token), + session: AsyncSession = Depends(get_session), +) -> LanguagePrefsOut: + if principal.user is None: + raise HTTPException(status_code=400, detail="no_user_context") + lang = (payload.lang or "").strip().lower() + if lang not in ACTIVE_LANGUAGES: + raise HTTPException( + status_code=400, + detail=f"unsupported language: {payload.lang!r}", + ) + user = await session.get(User, principal.user.id) + if user is None: + raise HTTPException(status_code=404, detail="user_not_found") + user.lang = lang + await session.commit() + return LanguagePrefsOut(lang=lang) +``` + +(`User` and `BaseModel` are already imported at the top of `app/routers/api.py`. If `ACTIVE_LANGUAGES` import collides with anything else, alias it.) + +- [ ] **Step 4: Run tests to verify they pass** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v +``` + +Expected: all 12 tests pass. 
+ +- [ ] **Step 5: Commit** + +```bash +git add app/routers/api.py tests/test_localization_integration.py +git commit -m "settings: PATCH /api/settings/language with ACTIVE_LANGUAGES gate" +``` + +--- + +### Task 10: Settings UI dropdown + +**Files:** +- Modify: `app/templates/settings.html` +- Modify: `app/routers/pages.py::settings_page` — pass `user.lang` to the template context + +- [ ] **Step 1: Add the language section to settings.html** + +In `app/templates/settings.html`, find the existing settings sections (`
    ` blocks). Add a new section next to the email-digest preferences: + +```html +
    + Language +

    + Language the AI uses for the strategic log, your daily digest, and + portfolio commentary. The interface itself stays in English for now. +

    +
    + + +
    + +
    +``` + +- [ ] **Step 2: Confirm `settings_page` passes `user.lang`** + +In `app/routers/pages.py::settings_page`, the template context already includes `user`. The template reads `user.lang` directly from that object. No code change required — the Jinja2 expression `{% if (user.lang or 'en') == 'it' %}` handles old rows whose `lang` field hasn't been populated yet (defensive, post-migration). + +If the existing template context does NOT pass the `user` object (it should, based on the digest-prefs section), add it. + +- [ ] **Step 3: Manual smoke verification step** + +Smoke is deferred to Task 11. No test step here — pure markup + small inline JS. + +- [ ] **Step 4: Commit** + +```bash +git add app/templates/settings.html +git commit -m "settings: add language dropdown (IT active, ES/FR/DE WIP)" +``` + +--- + +### Task 11: Final regression + deploy + manual smoke + +**Files:** +- (no code changes — verification only) + +- [ ] **Step 1: Full test suite** + +```bash +docker compose -f docker-compose.test.yml run --rm test pytest tests/ 2>&1 | tail -5 +``` + +Expected: every previous test plus the new `tests/test_i18n.py` and `tests/test_localization_integration.py` pass. Total should now be ~280 passing. + +- [ ] **Step 2: Apply migration to prod DB (requires explicit user approval)** + +```bash +docker compose exec app alembic upgrade head +``` + +Expected: `Running upgrade 0021 -> 0022, localization`. + +- [ ] **Step 3: Restart prod app (requires explicit user approval)** + +```bash +docker compose restart app +docker compose logs app --tail 30 | grep -E "(Uvicorn|startup complete|ERROR|Traceback)" +``` + +Expected: `Application startup complete.` cleanly; no tracebacks. + +- [ ] **Step 4: Manual smoke — switch a paid test user to Italian** + +In a paid-tier browser session, open `/settings`. Confirm the Language dropdown appears with all five options, the English option selected, ES/FR/DE disabled and labelled "coming soon". 
Pick `Italiano`, confirm the inline `✓ saved` status appears. Refresh — Italian remains selected. + +- [ ] **Step 5: Manual smoke — strategic log translation** + +Wait for the next hourly `ai_log_job` tick (or trigger via the scheduler/admin). Confirm a row appears in `strategic_log_translations` with `lang='it'`. NOTE: this requires a prod DB read; only run with explicit user approval: + +```bash +docker compose exec app python -c " +import asyncio +from sqlalchemy import select +from app.db import get_session_factory +from app.models import StrategicLogTranslation + +async def main(): + factory = get_session_factory() + async with factory() as s: + rows = (await s.execute( + select(StrategicLogTranslation).order_by(StrategicLogTranslation.id.desc()).limit(3) + )).scalars().all() + for r in rows: + print(r.id, r.log_id, r.lang, r.llm_model, r.llm_cost_usd, r.content_md[:80]) +asyncio.run(main()) +" +``` + +Expected: at least one row with `lang='it'`, `llm_model` containing `deepseek`, `llm_cost_usd` a small positive number. + +- [ ] **Step 6: Manual smoke — portfolio analysis** + +On the dashboard as the Italian user, click "Analyse" (or whatever triggers `/api/analyze`). Confirm the rendered AI commentary is in Italian. + +- [ ] **Step 7: Manual smoke — email digest** + +```bash +docker compose exec app python -m app.cli send-test-digest daily +``` + +Expected: digest email lands in Italian, including the subject line. + +- [ ] **Step 8: Manual smoke — Edge cases** + +- Direct `curl -X PATCH /api/settings/language` with `{"lang": "es"}` → 400. +- Switch user back to English, refresh dashboard — log renders in English again. 
+ +--- + +## Self-Review + +**Spec coverage walkthrough:** + +- **`users.lang` column with default 'en'** → Task 3 model + Task 4 migration +- **`strategic_log_translations` table** → Task 3 model + Task 4 migration +- **`i18n.LANGUAGES` + `ACTIVE_LANGUAGES`** → Task 1 +- **`respond_in_clause()`** → Task 1 +- **`translate()` helper** → Task 2 (no-op fast path for `en`/unknown; code-fence stripping; raises on provider failure) +- **`ai_log_job` translation fan-out** → Task 5 (parallel via `asyncio.gather`; per-language failure isolated) +- **Portfolio analysis `lang`-aware system prompt** → Task 6 +- **Email digest `lang`-aware per-user generation** → Task 7 +- **`/log` localized fetch with English fallback** → Task 8 +- **`PATCH /api/settings/language` with ACTIVE_LANGUAGES gate** → Task 9 +- **Settings dropdown with IT active + ES/FR/DE disabled** → Task 10 +- **No tier gating on translation** → Task 5 query selects on `User.lang` only, no `tier` filter +- **No retroactive backfill** → not built; only forward-going translations +- **No UI label translation** → out of scope, Task 10 surfaces this in the section copy + +**Type / signature consistency:** + +- `respond_in_clause(lang: str | None) -> str` — used in Tasks 6, 7. Consistent. +- `translate(client, text, target_lang) -> tuple[str, LogResult]` — used in Tasks 2, 5. Consistent. +- `ACTIVE_LANGUAGES: set[str]` — used in Tasks 5, 9. Consistent. +- `LanguagePrefsIn { lang: str }` / `LanguagePrefsOut { lang: str }` — used in Task 9 only. +- `_resolve_log_content(session, log_id, lang) -> str` — used in Task 8 only. +- `translate_log_for_active_languages(session, log_id) -> None` — used in Task 5 only. + +**Note on Task 7:** if `_generate_variants` is currently called ONCE for all users in the digest job (variants shared), the localization plan requires it to be called per-user. The plan flags this and asks the engineer to surface a concern rather than silently restructuring. 
If the structure differs from expectation, the engineer should escalate before proceeding. From 2ecf250d539e0c840e15461409771df1959b1b21 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:22:41 +0200 Subject: [PATCH 05/69] localization: digest is shared, not per-user (corrected design) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The user pointed out that the only genuinely per-user AI surface is portfolio analysis. The strategic log AND the email digest are both shared cycles — generated once per cycle, consumed by many users. For the digest, this means: - _generate_variants still produces one English variant per tone (as today, unchanged) - A new helper translates each variant once per active non-en lang in parallel via asyncio.gather, producing a {(tone, lang): content} table for the duration of the job run - The per-user send loop selects (user.digest_tone, user.lang), falling back to the English variant of the same tone on miss Translation count per run = tones × non-en active langs = 3 today. 100 Italian users no longer mean 100 translation calls. 
Co-Authored-By: Claude Opus 4.7 --- .../plans/2026-05-27-localization-italian.md | 269 +++++++++++++----- .../2026-05-27-localization-italian-design.md | 121 +++++--- 2 files changed, 279 insertions(+), 111 deletions(-) diff --git a/docs/superpowers/plans/2026-05-27-localization-italian.md b/docs/superpowers/plans/2026-05-27-localization-italian.md index 363689b..eaef3e2 100644 --- a/docs/superpowers/plans/2026-05-27-localization-italian.md +++ b/docs/superpowers/plans/2026-05-27-localization-italian.md @@ -1020,111 +1020,250 @@ git commit -m "analyse: thread user.lang into the system prompt" --- -### Task 7: email_digest_job localization +### Task 7: email_digest_job — translate variants once, route by (tone, lang) **Files:** - Modify: `app/jobs/email_digest_job.py` - Test: `tests/test_localization_integration.py` (append) -- [ ] **Step 1: Write failing test** +**Design recap:** The digest job already produces one English variant per +tone (NOVICE / INTERMEDIATE / PRO) once per job run. After those English +variants are built, the job translates each one to every active +non-English language in parallel and builds an in-memory lookup +`{(tone, lang): content_md}`. The per-user send step picks the cell +matching `(user.digest_tone, user.lang)`, falling back to `(tone, 'en')` +when a translation is missing or failed. No per-user LLM call. + +- [ ] **Step 1: Inspect the existing digest flow** + +```bash +grep -n "_generate_variants\|_send_one\|active_users\|for .* in .*users" app/jobs/email_digest_job.py | head -20 +``` + +Identify: +1. Where the English variants are built (one call per tone). +2. The shape of the returned object (likely `dict[str, str]` keyed by tone like `"NOVICE"`). +3. The per-user send loop and where it picks a variant for the recipient. 
+ +- [ ] **Step 2: Write a failing test** Append to `tests/test_localization_integration.py`: ```python @pytest.mark.asyncio -async def test_digest_threads_lang_into_system_prompt(monkeypatch): - """The per-user digest generation appends 'Respond in Italian.' to - the system prompt when the user is Italian.""" - from unittest.mock import AsyncMock +async def test_digest_translates_variants_per_active_lang(monkeypatch): + """After English variants are built, the job translates each to every + active non-en lang. The result is an in-memory mapping the send loop + consults.""" + from unittest.mock import AsyncMock, MagicMock from app.jobs import email_digest_job as ed from app.services.openrouter import LogResult - captured = [] - - async def _fake_call_llm(client, messages, **kw): - captured.append(messages) - return LogResult( - content="**Apertura.** Il mercato è in calo.", - model="m", prompt_tokens=300, completion_tokens=400, cost_usd=0.0001, - ) - monkeypatch.setattr(ed, "call_llm", _fake_call_llm) - - # _generate_variants is the helper that runs one LLM call per tone. - # It takes a context dict and a 'kind' (daily/weekly). The exact - # signature is in app/jobs/email_digest_job.py — inspect before - # calling. The test below assumes it accepts a `target_lang` kwarg. - from datetime import datetime, timezone - - ctx = { - "today": datetime.now(timezone.utc), - "quotes_by_group": {}, - "headlines_by_bucket": {}, - "reference_line": None, + # Stub the English variant builder so we control the input set. + english_variants = { + "NOVICE": "**Today.** Markets calmer.", + "INTERMEDIATE": "**Today.** Indices slightly down.", + "PRO": "**Today.** Risk-off rotation, breadth weak.", } - # `_generate_variants` should iterate tones internally; we just need - # to assert at least one captured system prompt has the IT clause. 
- import httpx - async with httpx.AsyncClient() as client: - await ed._generate_variants(None, client, "daily", ctx, target_lang="it") + # Track every translate() call so we can assert fan-out shape. + translate_calls: list[tuple[str, str]] = [] - assert captured, "no LLM call was made" - italian_found = any( - any( - m["role"] == "system" and "Respond in Italian" in m["content"] - for m in messages + async def _fake_translate(client, text, target_lang): + translate_calls.append((text, target_lang)) + return f"[IT] {text}", LogResult( + content=f"[IT] {text}", model="m", + prompt_tokens=10, completion_tokens=10, cost_usd=0.0, ) - for messages in captured + + monkeypatch.setattr(ed, "translate", _fake_translate) + + # The helper under test takes the English variants dict + a list of + # active non-en languages, returns the {(tone, lang): content} table. + client = MagicMock() + table = await ed._translate_variants_for_active_langs( + client, english_variants, ["it"], ) - assert italian_found, "no system prompt contained 'Respond in Italian'" + + # Three tones × one non-en lang = three translation calls. + assert len(translate_calls) == 3 + assert {lang for _, lang in translate_calls} == {"it"} + + # English entries are present unchanged. + assert table[("NOVICE", "en")] == english_variants["NOVICE"] + assert table[("PRO", "en")] == english_variants["PRO"] + # Italian entries are populated. 
+ assert table[("INTERMEDIATE", "it")].startswith("[IT] ") + + +@pytest.mark.asyncio +async def test_digest_translation_failure_falls_back_to_english(monkeypatch): + """When translate() fails for a (tone, lang) cell, the table entry + for that cell is the English variant of the same tone — the user + still gets a digest, just in English that day.""" + from app.jobs import email_digest_job as ed + + english_variants = {"INTERMEDIATE": "**Today.** Indices down."} + + async def _fake_translate(client, text, target_lang): + raise RuntimeError("upstream down") + monkeypatch.setattr(ed, "translate", _fake_translate) + + from unittest.mock import MagicMock + client = MagicMock() + table = await ed._translate_variants_for_active_langs( + client, english_variants, ["it"], + ) + + assert table[("INTERMEDIATE", "it")] == english_variants["INTERMEDIATE"] + + +def test_digest_pick_variant_uses_user_lang(): + """The variant-picker helper consults user.digest_tone + user.lang.""" + from app.jobs import email_digest_job as ed + + table = { + ("NOVICE", "en"): "novice en", + ("NOVICE", "it"): "novice it", + ("INTERMEDIATE", "en"): "intermediate en", + ("INTERMEDIATE", "it"): "intermediate it", + } + assert ed._pick_variant(table, tone="NOVICE", lang="it") == "novice it" + assert ed._pick_variant(table, tone="INTERMEDIATE", lang="en") == "intermediate en" + # Missing lang → fallback to English variant of the same tone. + assert ed._pick_variant(table, tone="NOVICE", lang="de") == "novice en" + # Missing tone → fallback to INTERMEDIATE/en (the safe default). 
+ assert ed._pick_variant(table, tone="UNKNOWN", lang="en") == "intermediate en" ``` -- [ ] **Step 2: Run test to verify it fails** +- [ ] **Step 3: Run tests to verify they fail** ```bash -docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k digest_threads_lang -v +docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -k "digest_translates or digest_translation_failure or digest_pick" -v ``` -Expected: FAIL — either `_generate_variants` doesn't accept `target_lang`, or the IT clause isn't in the prompt. +Expected: 3 FAIL with `AttributeError` for `_translate_variants_for_active_langs` and `_pick_variant`. -- [ ] **Step 3: Thread `target_lang` through `_generate_variants` and the per-user driver** +- [ ] **Step 4: Implement the two helpers + wire them into the job** -In `app/jobs/email_digest_job.py`: +In `app/jobs/email_digest_job.py`, add the necessary imports at the top (skip any that are already present): -1. Import the helper: - ```python - from app.services.i18n import respond_in_clause - ``` +```python +import asyncio -2. Find `_generate_variants`. Add `target_lang: str = "en"` to its signature. Where it composes each variant's system prompt, append: - ```python - system_prompt = system_prompt + respond_in_clause(target_lang) - ``` +from app.services.i18n import ACTIVE_LANGUAGES +from app.services.translation import translate +``` -3. Find the per-user send path (the function that actually iterates users — likely `_send_for_user` or similar, called from the job's main loop). Where it calls `_generate_variants`, pass `target_lang=user.lang`: - ```python - variants = await _generate_variants( - session, client, kind, ctx, target_lang=user.lang, - ) - ``` +Add the two helpers as module-level functions: - If the existing call site is in the main job loop and constructs `variants` once for all users, that breaks the "per-user language" contract. 
In that case the variants must be generated PER USER, not globally. Look for the caller; if it caches `variants` across users, restructure to call `_generate_variants` inside the per-user loop. **Important:** if this requires more than a few lines of change, stop and report a concern — the existing assumption may be wrong and we want explicit guidance. +```python +async def _translate_variants_for_active_langs( + client, + english_variants: dict[str, str], + target_langs: list[str], +) -> dict[tuple[str, str], str]: + """Build a {(tone, lang): content_md} table. -- [ ] **Step 4: Run tests to verify they pass** + Starts with the English variants as the canonical cells. For each + (tone, target_lang) pair where target_lang != 'en', calls translate() + in parallel; on failure the cell falls back to the English variant + of the same tone so the digest still goes out, just untranslated. + """ + table: dict[tuple[str, str], str] = { + (tone, "en"): content for tone, content in english_variants.items() + } + pairs = [ + (tone, lang) + for tone in english_variants + for lang in target_langs + if lang != "en" + ] + if not pairs: + return table + + results = await asyncio.gather(*[ + translate(client, english_variants[tone], lang) for tone, lang in pairs + ], return_exceptions=True) + for (tone, lang), result in zip(pairs, results): + if isinstance(result, Exception): + log.warning("digest.translate.failed", + tone=tone, lang=lang, error=str(result)[:200]) + table[(tone, lang)] = english_variants[tone] + continue + translated_md, _llm_log = result + table[(tone, lang)] = translated_md + return table + + +def _pick_variant( + table: dict[tuple[str, str], str], tone: str, lang: str, +) -> str: + """Return the digest content for a recipient. + + Lookup order: exact (tone, lang) → (tone, 'en') → ('INTERMEDIATE', + 'en') → first table value. 
The last fallbacks are defensive; the table + always contains at least one English entry when the job is sending.""" + if (tone, lang) in table: + return table[(tone, lang)] + if (tone, "en") in table: + return table[(tone, "en")] + if ("INTERMEDIATE", "en") in table: + return table[("INTERMEDIATE", "en")] + return next(iter(table.values())) +``` + +Now find the point in the job loop after the English variants are +generated (`_generate_variants` has returned its tone-keyed dict) and +before the per-user send loop. Insert: + +```python +# Build the per-language translation table once per job run. Targets +# are every non-English entry in ACTIVE_LANGUAGES, whether or not any +# user has selected that language yet. +active_non_en = sorted({lang for lang in ACTIVE_LANGUAGES if lang != "en"}) +# Optional further filter: only languages with at least one user. +# (See task notes; defer if optimization isn't worth it yet.) +variant_table = await _translate_variants_for_active_langs( + client, variants, active_non_en, +) +``` + +And in the per-user send step, replace the direct variant lookup +(e.g. `content = variants[user.digest_tone]`) with: + +```python +content = _pick_variant( + variant_table, + tone=(user.digest_tone or "INTERMEDIATE").upper(), + lang=(user.lang or "en"), +) +``` + +- [ ] **Step 5: Run tests to verify they pass** + +```bash docker compose -f docker-compose.test.yml run --rm test pytest tests/test_localization_integration.py -v ``` -Expected: all tests pass (8 total now). +Expected: all tests pass (≥10 total now). -- [ ] **Step 5: Commit** +- [ ] **Step 6: Commit** ```bash git add app/jobs/email_digest_job.py tests/test_localization_integration.py -git commit -m "digest: thread user.lang into per-user generation" +git commit -m "digest: translate variants once per active non-en language" ``` +## Context + +- Translation count per job run is `tones × non-en active languages`. + Today that's `3 × 1 = 3` translation calls per digest run. Negligible cost. 
+- A failed translation degrades gracefully — the cell falls back to the + English variant of the same tone. The recipient receives a digest in + English instead of getting no email at all. This matches the spec's + "translation is best-effort" intent. + --- ### Task 8: /log endpoint localized fetch @@ -1569,7 +1708,7 @@ Expected: digest email lands in Italian, including the subject line. - **`translate()` helper** → Task 2 (no-op fast path for `en`/unknown; code-fence stripping; raises on provider failure) - **`ai_log_job` translation fan-out** → Task 5 (parallel via `asyncio.gather`; per-language failure isolated) - **Portfolio analysis `lang`-aware system prompt** → Task 6 -- **Email digest `lang`-aware per-user generation** → Task 7 +- **Email digest: shared variant generation, post-translation, (tone, lang) routing** → Task 7 - **`/log` localized fetch with English fallback** → Task 8 - **`PATCH /api/settings/language` with ACTIVE_LANGUAGES gate** → Task 9 - **Settings dropdown with IT active + ES/FR/DE disabled** → Task 10 @@ -1586,4 +1725,4 @@ Expected: digest email lands in Italian, including the subject line. - `_resolve_log_content(session, log_id, lang) -> str` — used in Task 8 only. - `translate_log_for_active_languages(session, log_id) -> None` — used in Task 5 only. -**Note on Task 7:** if `_generate_variants` is currently called ONCE for all users in the digest job (variants shared), the localization plan requires it to be called per-user. The plan flags this and asks the engineer to surface a concern rather than silently restructuring. If the structure differs from expectation, the engineer should escalate before proceeding. +**Note on Task 7:** the digest job is treated as shared content. `_generate_variants` keeps its existing per-tone behaviour unchanged; localization is layered on top via two new module-level helpers (`_translate_variants_for_active_langs`, `_pick_variant`) and a routing change in the per-user send loop. 
No restructuring of the existing tone-generation path is needed. Translation count per run is `tones × non-en active langs` (today: 3 calls/run) — negligible. diff --git a/docs/superpowers/specs/2026-05-27-localization-italian-design.md b/docs/superpowers/specs/2026-05-27-localization-italian-design.md index 1c03633..c704494 100644 --- a/docs/superpowers/specs/2026-05-27-localization-italian-design.md +++ b/docs/superpowers/specs/2026-05-27-localization-italian-design.md @@ -6,10 +6,10 @@ ## Context All AI-generated content (strategic log, daily email digest, portfolio -analysis, follow-up chat) is English-only today. The operator wants to -add Italian translation as the first localization, with Spanish, -French, and German listed as "coming soon" in the settings UI but not -yet functional. Italian must work end-to-end from settings dropdown to +analysis) is English-only today. The operator wants to add Italian +translation as the first localization, with Spanish, French, and +German listed as "coming soon" in the settings UI but not yet +functional. Italian must work end-to-end from settings dropdown to rendered output; the other three exist as commitments and design placeholders so adding them later is a flag flip. @@ -49,30 +49,44 @@ retrofit. The system has two categories of AI-generated content, with different generation patterns: -### Per-user content (analyse, digest, chat) +### Per-user content (portfolio analysis only) -Each call already produces output for exactly one user. The fix is -trivial: the user's `lang` threads into the prompt assembly, and the -system prompt gains a `"Respond in Italian."` clause when `lang != 'en'`. -One LLM call, no extra cost, no extra latency. +Portfolio analysis is the only AI-generated surface whose *content* is +genuinely per-user — each call's input is the user's own pie. Here we +add the `"Respond in Italian."` clause to the system prompt when +`user.lang != 'en'`. One LLM call, no extra cost, no extra latency. 
-### Shared content (strategic log) +### Shared content (strategic log, email digest) -The hourly `ai_log_job` writes a single English log row used by every -user. To serve non-English users, we generate the English log as today, -then translate it to each active non-English language via a separate -LLM call and store the result in a new `strategic_log_translations` -table. Translations are fanned out in parallel with `asyncio.gather` so -total translation time is max(single call), not sum. The `/log` -endpoint serves the translation matching the requester's `lang`, -falling back to English if none exists. +Strategic log and email digest are generated once per cycle (hourly, +daily) and consumed by many users. We do NOT generate them per-user +per-language. Instead: -Why translate-after rather than generate-N-times: the strategic log -includes live market data, headlines, and references that are -expensive to assemble. Re-running the full generation in each language -duplicates that work; translating the rendered output preserves a -single source of truth (the English original) and only spends LLM -tokens on the actual prose conversion. +- **Strategic log**: `ai_log_job` writes the English row as today, + then translates it to each active non-English language and persists + in `strategic_log_translations` (one row per `(log_id, lang)`). + `/log` serves the translation matching the user's `lang`, falling + back to English. + +- **Email digest**: the digest job already generates one English + variant per tone (NOVICE / INTERMEDIATE / PRO). We extend the same + cycle so that for each tone variant, the job ALSO produces a + translation for each active non-English language. The translations + live alongside the English variants in memory for the duration of + the job run; the per-user send step selects the matching + `(tone, lang)` cell. No new persistence — variants exist only for + the lifetime of the job. 
+ +Why translate-after rather than generate-N-times: the shared content +involves expensive context assembly (live market data, headlines, log +history). Re-running the full generation in each language duplicates +that work; translating the rendered output preserves a single source +of truth and only spends LLM tokens on the actual prose conversion. + +Why no per-user LLM call for the digest: 100 Italian users would +otherwise mean 100 translation calls per day. With the shared cycle +we make 3 translations per day (one per tone) regardless of how many +Italian users receive that variant. ## Architecture @@ -82,18 +96,27 @@ tokens on the actual prose conversion. │ Values: 'en' (default) | 'it' (active) | 'es'/'fr'/'de' (WIP) │ └─────────────────────────────────────────────────────────────────┘ │ - ├─ Per-user surfaces (portfolio analyse, daily digest, chat) + ├─ Per-user surface (portfolio analysis only) │ └─ prompt assembly threads user.lang to │ respond_in_clause() → appended to system prompt │ when lang != 'en'. Single call_llm, no extra cost. 
│ - └─ Shared surfaces (strategic log) - ├─ ai_log_job writes the English row as today - ├─ Then SELECTs distinct users.lang where lang != 'en' - │ AND user has active paid access - ├─ asyncio.gather of one translate() call per language - └─ Each result → INSERT into strategic_log_translations - keyed by (log_id, lang) UNIQUE + ├─ Shared surface — strategic log + │ ├─ ai_log_job writes the English row as today + │ ├─ SELECTs distinct users.lang where lang != 'en' + │ │ (no tier gating) + │ ├─ asyncio.gather of one translate() call per language + │ └─ Each result → INSERT into strategic_log_translations + │ keyed by (log_id, lang) UNIQUE + │ + └─ Shared surface — email digest + ├─ Job builds one English variant per tone (existing + │ _generate_variants behaviour, unchanged) + ├─ For each (variant, active non-en lang), translate + │ via asyncio.gather; results live in memory + └─ Per-user send loop looks up (user.digest_tone, + user.lang) in the in-memory dictionary; falls back + to the English variant of the same tone on miss ``` ## Data model @@ -233,12 +256,18 @@ read time. ### `app/jobs/email_digest_job.py` (modified) -The digest is already per-user and assembles its own prompt. Thread -`user.lang` through: +The job already builds one English variant per tone in +`_generate_variants(...)`. After that returns, the job translates each +variant into every active non-English language (parallel via +`asyncio.gather`), and exposes a `(tone, lang) -> content` lookup that +`_send_one(...)` consults using the recipient's `user.lang`. -- `_generate_variants(...)` accepts a `target_lang` param -- The system prompt assembly appends `respond_in_clause(target_lang)` -- Subject-line generation runs in the same call, so it's localized too +- Variants live only in memory for the duration of the job run. +- A failed translation for `(tone, lang)` is logged and that cell + falls back to the English variant of the same tone. 
The send + proceeds — the user still gets a digest, just in English that day. +- The subject line is part of each variant's content, so it gets + translated as part of the same call. ### `app/services/portfolio_analysis.py` (modified) @@ -364,15 +393,15 @@ End-to-end manual check after deploy: - We do not translate UI labels. Italian users see English buttons, headings, and tooltips. Future scope. -- We do not translate user-generated content (chat questions the user - types). Only the AI's output is localized; user-supplied input flows - through unchanged. -- We do not translate the email subject line independently. The same - per-user LLM call that generates the digest body also generates the - subject in the target language. -- We do not surface translation cost in any user-visible UI. Cost is - recorded in `strategic_log_translations.llm_cost_usd` and the existing - `ai_calls` ledger picks up per-user calls as today. +- We do not translate user-supplied input (e.g. portfolio names, any + free-text fields). Only AI-generated output is localized. +- The email subject line is part of each variant's content, so it + gets translated alongside the body in the same `translate()` call + per (tone, lang) cell — no separate subject-translation path. +- We do not surface translation cost in any user-visible UI. Strategic + log translation cost lands in `strategic_log_translations.llm_cost_usd`; + digest translation cost is captured in the existing `ai_calls` ledger + via the underlying `call_llm` calls. - We do **not** gate strategic-log translation on user tier. Any user with `lang='it'` triggers Italian translation for that hour's log, regardless of whether they are paid, on credit, or free. 
Rationale: From 5730aad73cf3a0a3ff13447f45fdc25bc219b25a Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:46:32 +0200 Subject: [PATCH 06/69] i18n: add LANGUAGES, ACTIVE_LANGUAGES, respond_in_clause helper Co-Authored-By: Claude Opus 4.7 --- app/services/i18n.py | 48 ++++++++++++++++++++++++++++++++++++++++++++ tests/test_i18n.py | 44 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 92 insertions(+) create mode 100644 app/services/i18n.py create mode 100644 tests/test_i18n.py diff --git a/app/services/i18n.py b/app/services/i18n.py new file mode 100644 index 0000000..742373d --- /dev/null +++ b/app/services/i18n.py @@ -0,0 +1,48 @@ +"""Language registry + prompt helpers for localized AI output. + +Two surfaces consume this module: +- Per-user LLM call sites (portfolio analysis only at this stage) call + ``respond_in_clause(user.lang)`` and append the result to their + system prompt. +- The settings dropdown + its PATCH endpoint consult ``ACTIVE_LANGUAGES`` + to decide which options are selectable. The strategic-log and digest + translation fan-outs also consult it to decide which languages to + spend tokens on. + +Adding Spanish/French/German support later is a one-line constant +change: extend ``ACTIVE_LANGUAGES`` to include the new code. No other +code change is required — the rest of the system already treats them +as first-class via ``LANGUAGES``. +""" +from __future__ import annotations + + +# Display labels for every language the system knows about. ES/FR/DE +# are kept here so labels still render in the dropdown (as disabled +# options) without requiring code changes to enable them later. +LANGUAGES: dict[str, str] = { + "en": "English", + "it": "Italian", + "es": "Spanish", + "fr": "French", + "de": "German", +} + + +# Languages users can actually select. Settings POST validates against +# this; the strategic-log + digest translation fan-outs only consider +# these. 
+ACTIVE_LANGUAGES: set[str] = {"en", "it"} + + +def respond_in_clause(lang: str | None) -> str: + """Suffix appended to per-user LLM system prompts. + + Returns an empty string for ``en`` (no nudge needed), an unknown + code, or ``None``/empty input — those callers want the default + English path. Otherwise returns ``"\\n\\nRespond in <Language>."`` + keyed off ``LANGUAGES``. + """ + if not lang or lang == "en" or lang not in LANGUAGES: + return "" + return f"\n\nRespond in {LANGUAGES[lang]}." diff --git a/tests/test_i18n.py b/tests/test_i18n.py new file mode 100644 index 0000000..f0edce0 --- /dev/null +++ b/tests/test_i18n.py @@ -0,0 +1,44 @@ +"""Unit tests for app.services.i18n.""" +from __future__ import annotations + +import pytest + + +def test_languages_contains_all_four_plus_english(): + from app.services.i18n import LANGUAGES + assert set(LANGUAGES.keys()) == {"en", "it", "es", "fr", "de"} + assert LANGUAGES["en"] == "English" + assert LANGUAGES["it"] == "Italian" + assert LANGUAGES["es"] == "Spanish" + assert LANGUAGES["fr"] == "French" + assert LANGUAGES["de"] == "German" + + +def test_active_languages_is_en_and_it_only(): + from app.services.i18n import ACTIVE_LANGUAGES + assert ACTIVE_LANGUAGES == {"en", "it"} + + +def test_respond_in_clause_empty_for_english(): + from app.services.i18n import respond_in_clause + assert respond_in_clause("en") == "" + + +def test_respond_in_clause_empty_for_none_or_empty(): + from app.services.i18n import respond_in_clause + assert respond_in_clause("") == "" + assert respond_in_clause(None) == "" + + +def test_respond_in_clause_italian(): + from app.services.i18n import respond_in_clause + result = respond_in_clause("it") + assert "Italian" in result + assert result.startswith("\n\n") + + +def test_respond_in_clause_unknown_lang_falls_back_to_english(): + """Defensive: a raw POST or stale lang code should not crash the + prompt assembly. 
Unknown codes map to no-suffix (English default).""" + from app.services.i18n import respond_in_clause + assert respond_in_clause("xx") == "" From 7683f828202c89de7348f8e09211fab9833cc28e Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:48:32 +0200 Subject: [PATCH 07/69] i18n: add translate() helper backed by call_llm Co-Authored-By: Claude Opus 4.7 --- app/services/translation.py | 82 +++++++++++++++++++++++++++++++++++++ tests/test_i18n.py | 75 +++++++++++++++++++++++++++++++++ 2 files changed, 157 insertions(+) create mode 100644 app/services/translation.py diff --git a/app/services/translation.py b/app/services/translation.py new file mode 100644 index 0000000..46dbabb --- /dev/null +++ b/app/services/translation.py @@ -0,0 +1,82 @@ +"""Markdown translation via the existing LLM provider chain. + +DeepSeek-4-flash at ~$0.28/M output tokens is cheap enough that we +don't bother with a separate translation-only model. ``call_llm``'s +provider chain (DeepSeek primary, OpenRouter fallback) handles this +path identically to any other LLM call. + +The translator is content-aware in one important way: it instructs the +model to preserve markdown structure, ticker symbols, numbers, dates, +and percentages verbatim. This keeps generated artefacts (tables of +quotes, embedded percentages, dated references) intact across the +translation boundary. +""" +from __future__ import annotations + +import httpx + +from app.services.i18n import LANGUAGES +from app.services.openrouter import LogResult, call_llm + + +_SYSTEM_PROMPT_TMPL = """\ +You are an expert translator working on financial-markets commentary. +Translate the following markdown text to {language}. + +Strict rules: +- Preserve ALL markdown formatting (headings, lists, emphasis, links, + tables, code spans). +- Do NOT translate ticker symbols (AAPL, MSFT, VOD.L, ASML.AS, etc.), + company legal names, percentages, dates, ISO currency codes, or any + numbers. 
+- Do NOT add commentary, preambles, or apologies. Output ONLY the + translated markdown. +""" + + +async def translate( + client: httpx.AsyncClient, + text: str, + target_lang: str, +) -> tuple[str, LogResult]: + """Translate markdown ``text`` to ``target_lang``. + + Returns ``(translated_markdown, LogResult)``. Caller persists the + cost/model provenance from LogResult next to the cached row. + + Short-circuits without calling the LLM when ``target_lang`` is + ``'en'``, unknown, or empty — returns the source unchanged with a + zero-cost stub LogResult. This lets fan-out callers iterate over + all languages without per-call gating. + + Raises on provider failure (HTTP error, all chain providers down). + Callers in fan-out paths should catch and log per-language. + """ + if not target_lang or target_lang == "en" or target_lang not in LANGUAGES: + # No-op fast path. Returning a fake LogResult keeps the call + # signature stable for callers who unpack the tuple. + return text, LogResult( + content=text, model="noop", + prompt_tokens=0, completion_tokens=0, cost_usd=0.0, + ) + + system_prompt = _SYSTEM_PROMPT_TMPL.format(language=LANGUAGES[target_lang]) + messages = [ + {"role": "system", "content": system_prompt}, + {"role": "user", "content": text}, + ] + result = await call_llm(client, messages) + + content = (result.content or "").strip() + # Strip code fences if the model wrapped its output despite the system rule. + if content.startswith("```"): + # Drop the opening fence (with optional language tag). + first_nl = content.find("\n") + if first_nl != -1: + content = content[first_nl + 1:] + # Drop the closing fence. 
+ if content.rstrip().endswith("```"): + content = content.rstrip()[:-3].rstrip() + content = content.strip() + + return content, result diff --git a/tests/test_i18n.py b/tests/test_i18n.py index f0edce0..6fc207e 100644 --- a/tests/test_i18n.py +++ b/tests/test_i18n.py @@ -42,3 +42,78 @@ def test_respond_in_clause_unknown_lang_falls_back_to_english(): prompt assembly. Unknown codes map to no-suffix (English default).""" from app.services.i18n import respond_in_clause assert respond_in_clause("xx") == "" + + +@pytest.mark.asyncio +async def test_translate_happy_path(monkeypatch): + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + monkeypatch.setattr(mod, "call_llm", AsyncMock(return_value=LogResult( + content="# Apertura\n\nIl mercato è in calo dello 0,4%.", + model="deepseek/deepseek-v4-flash", + prompt_tokens=300, completion_tokens=80, cost_usd=0.00002, + ))) + + client = MagicMock() + translated, llm_log = await mod.translate( + client, "# Open\n\nThe market is down 0.4%.", "it", + ) + assert "Apertura" in translated + assert llm_log.model == "deepseek/deepseek-v4-flash" + assert llm_log.cost_usd == pytest.approx(0.00002) + + +@pytest.mark.asyncio +async def test_translate_strips_code_fences(monkeypatch): + """If the LLM wraps the output in ```markdown ... 
```, strip it.""" + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + fenced = "```markdown\n# Titolo\n\nCorpo.\n```" + monkeypatch.setattr(mod, "call_llm", AsyncMock(return_value=LogResult( + content=fenced, model="m", prompt_tokens=10, completion_tokens=20, cost_usd=0.0, + ))) + + client = MagicMock() + translated, _ = await mod.translate(client, "# Title\n\nBody.", "it") + assert "```" not in translated + assert translated.startswith("# Titolo") + + +@pytest.mark.asyncio +async def test_translate_provider_failure_propagates(monkeypatch): + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + + monkeypatch.setattr(mod, "call_llm", AsyncMock(side_effect=RuntimeError("upstream down"))) + + client = MagicMock() + with pytest.raises(RuntimeError, match="upstream down"): + await mod.translate(client, "# Title\n\nBody.", "it") + + +@pytest.mark.asyncio +async def test_translate_unknown_lang_returns_source_unchanged(monkeypatch): + """Defensive: an unknown lang code (or 'en') short-circuits without + calling the LLM. Callers shouldn't have to gate the call themselves.""" + from unittest.mock import AsyncMock, MagicMock + + from app.services import translation as mod + from app.services.openrouter import LogResult + + call_mock = AsyncMock(return_value=LogResult( + content="should not be returned", + model="m", prompt_tokens=0, completion_tokens=0, cost_usd=0.0, + )) + monkeypatch.setattr(mod, "call_llm", call_mock) + + client = MagicMock() + out, _ = await mod.translate(client, "Hello world.", "en") + assert out == "Hello world." 
+ call_mock.assert_not_awaited() From 9423fa81b7efa73384a4815564f2ff79edf07d92 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:52:04 +0200 Subject: [PATCH 08/69] models: add User.lang + StrategicLogTranslation Co-Authored-By: Claude Opus 4.7 --- app/models.py | 38 ++++++++++++++++++ tests/test_localization_integration.py | 55 ++++++++++++++++++++++++++ 2 files changed, 93 insertions(+) create mode 100644 tests/test_localization_integration.py diff --git a/app/models.py b/app/models.py index 665a8cd..13ba1ab 100644 --- a/app/models.py +++ b/app/models.py @@ -120,6 +120,37 @@ class StrategicLog(Base): cost_usd: Mapped[float | None] = mapped_column(Float) +class StrategicLogTranslation(Base): + """Cached translation of a single StrategicLog row. + + Populated by ai_log_job after the English row is committed: one + row per (log_id, lang) combination. The /log endpoint serves the + matching row when available and falls back to the English source + when no row exists yet (e.g. translation failed or the language + was added after the log was generated). + + No user attribution — the cache is shared. Setting `lang` on a + user just selects which (already-translated) variant they see. 
+ """ + __tablename__ = "strategic_log_translations" + + id: Mapped[int] = mapped_column(_PK, primary_key=True, autoincrement=True) + log_id: Mapped[int] = mapped_column( + _PK, ForeignKey("strategic_logs.id", ondelete="CASCADE"), nullable=False, + ) + lang: Mapped[str] = mapped_column(String(8), nullable=False) + content_md: Mapped[str] = mapped_column(Text, nullable=False) + generated_at: Mapped[datetime] = mapped_column( + DateTime(timezone=True), nullable=False, default=utcnow, + ) + llm_model: Mapped[str | None] = mapped_column(String(64)) + llm_cost_usd: Mapped[float | None] = mapped_column(Float) + + __table_args__ = ( + UniqueConstraint("log_id", "lang", name="uq_slt_log_lang"), + ) + + class IndicatorSummary(Base): """Short AI-generated read for one indicator group, regenerated hourly. The latest row per group_name is what the dashboard renders.""" @@ -189,6 +220,13 @@ class User(Base): # NULL = use INTERMEDIATE at render time. Server-side mirror of the # dashboard tone, decoupled because the dashboard pref is localStorage. digest_tone: Mapped[str | None] = mapped_column(String(16)) + # Preferred language for AI-generated content (strategic log, + # digest emails, portfolio commentary). Default 'en'. The settings + # PATCH endpoint validates against ACTIVE_LANGUAGES in + # app/services/i18n.py before writing. + lang: Mapped[str] = mapped_column( + String(8), nullable=False, default="en", server_default="en", + ) # Polar (MoR) linkage — populated by the polar_webhook handler the # first time we see a subscription/order event for the user. 
The # customer id is the stable join key; the subscription id is what diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py new file mode 100644 index 0000000..1079d66 --- /dev/null +++ b/tests/test_localization_integration.py @@ -0,0 +1,55 @@ +"""Integration tests: model surface, ai_log_job translation fan-out, +route-level localized fetch, settings PATCH validation.""" +from __future__ import annotations + +import pytest + + +def _build_session_factory(tmp_path): + """Per-test sqlite engine + factory. Mirrors test_referral_conversion.py.""" + from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine + + from app import db as db_mod + from app.db import Base + import app.models # noqa: F401 — registers models on Base.metadata + + engine = create_async_engine(f"sqlite+aiosqlite:///{tmp_path}/loc.db") + factory = async_sessionmaker(engine, expire_on_commit=False) + db_mod._engine = engine + db_mod._session_factory = factory + + async def _setup(): + async with engine.begin() as conn: + await conn.run_sync(Base.metadata.create_all) + + return engine, factory, _setup + + +def test_user_has_lang_column_with_default_en(): + from sqlalchemy import inspect + from app.models import User + + cols = {c.name: c for c in inspect(User).columns} + assert "lang" in cols + assert cols["lang"].nullable is False + # SQLAlchemy default may be a callable or a literal — check both. 
+ default = cols["lang"].default + assert default is not None + if hasattr(default, "arg"): + assert default.arg == "en" + + +def test_strategic_log_translation_model_columns(): + from sqlalchemy import inspect + from app.models import StrategicLogTranslation + + cols = {c.name: c for c in inspect(StrategicLogTranslation).columns} + assert "log_id" in cols + assert "lang" in cols + assert "content_md" in cols + assert "generated_at" in cols + assert "llm_model" in cols + assert "llm_cost_usd" in cols + assert cols["log_id"].nullable is False + assert cols["lang"].nullable is False + assert cols["content_md"].nullable is False From e190d0e35b5150f7838a827c397c87e67ae0bc20 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:53:27 +0200 Subject: [PATCH 09/69] alembic: add 0022 localization (users.lang + strategic_log_translations) --- alembic/versions/0022_localization.py | 46 +++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) create mode 100644 alembic/versions/0022_localization.py diff --git a/alembic/versions/0022_localization.py b/alembic/versions/0022_localization.py new file mode 100644 index 0000000..30f6814 --- /dev/null +++ b/alembic/versions/0022_localization.py @@ -0,0 +1,46 @@ +"""localization: users.lang + strategic_log_translations. 
+ +Revision ID: 0022 +Revises: 0021 +Create Date: 2026-05-27 +""" +from typing import Sequence, Union + +import sqlalchemy as sa +from alembic import op + + +revision: str = "0022" +down_revision: Union[str, None] = "0021" +branch_labels: Union[str, Sequence[str], None] = None +depends_on: Union[str, Sequence[str], None] = None + + +def upgrade() -> None: + op.add_column( + "users", + sa.Column( + "lang", sa.String(length=8), nullable=False, + server_default="en", + ), + ) + op.create_table( + "strategic_log_translations", + sa.Column("id", sa.BigInteger(), primary_key=True, autoincrement=True), + sa.Column("log_id", sa.BigInteger(), nullable=False), + sa.Column("lang", sa.String(length=8), nullable=False), + sa.Column("content_md", sa.Text(), nullable=False), + sa.Column("generated_at", sa.DateTime(timezone=True), nullable=False), + sa.Column("llm_model", sa.String(length=64), nullable=True), + sa.Column("llm_cost_usd", sa.Float(), nullable=True), + sa.ForeignKeyConstraint( + ["log_id"], ["strategic_logs.id"], + ondelete="CASCADE", name="fk_slt_log", + ), + sa.UniqueConstraint("log_id", "lang", name="uq_slt_log_lang"), + ) + + +def downgrade() -> None: + op.drop_table("strategic_log_translations") + op.drop_column("users", "lang") From e4982cdc047db9ac78d1c23a5d9c2844509fcfe8 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 16:57:06 +0200 Subject: [PATCH 10/69] ai-log-job: translate strategic log for active non-en languages Co-Authored-By: Claude Opus 4.7 --- app/jobs/ai_log_job.py | 61 +++++++++++- tests/test_localization_integration.py | 123 +++++++++++++++++++++++++ 2 files changed, 181 insertions(+), 3 deletions(-) diff --git a/app/jobs/ai_log_job.py b/app/jobs/ai_log_job.py index bc8b488..c0635a7 100644 --- a/app/jobs/ai_log_job.py +++ b/app/jobs/ai_log_job.py @@ -17,8 +17,9 @@ from app.jobs._market_context import ( month_spend, recent_headlines_by_bucket, ) -from app.models import AICall, JobRun, StrategicLog +from app.models import 
AICall, JobRun, StrategicLog, StrategicLogTranslation, User from app.services.cadence import DEFAULT_POLICY +from app.services.i18n import ACTIVE_LANGUAGES from app.services.openrouter import ( PROMPT_VERSION, active_model, @@ -27,6 +28,58 @@ from app.services.openrouter import ( call_llm, llm_configured, ) +from app.services.translation import translate + + +async def translate_log_for_active_languages(session, log_id: int) -> None: + """Fan out per-language translations for the strategic log identified + by ``log_id``. + + Reads ``users.lang`` (deduplicated, restricted to ACTIVE_LANGUAGES + minus English), one translation call per language in parallel via + ``asyncio.gather``, persists each successful result as a + ``StrategicLogTranslation`` row. Per-language failures are logged + but never raise — the strategic log itself is already committed at + this point and translation is a best-effort enhancement. + + The job orchestrator calls this AFTER the English ``StrategicLog`` + row is committed; pass the row's ``id`` in. 
+ """ + target_langs = sorted({l for l in ACTIVE_LANGUAGES if l != "en"}) + if not target_langs: + return + + active_langs = (await session.execute( + select(User.lang).distinct().where(User.lang.in_(target_langs)) + )).scalars().all() + if not active_langs: + return + + log_row = await session.get(StrategicLog, log_id) + if log_row is None: + log.warning("log.translate.missing_log", log_id=log_id) + return + + async with httpx.AsyncClient(follow_redirects=True, timeout=60) as client: + results = await asyncio.gather(*[ + translate(client, log_row.content, lang) + for lang in active_langs + ], return_exceptions=True) + + for lang, result in zip(active_langs, results): + if isinstance(result, Exception): + log.warning("log.translate.failed", lang=lang, log_id=log_id, + error=str(result)[:200]) + continue + translated_md, llm_result = result + session.add(StrategicLogTranslation( + log_id=log_id, lang=lang, + content_md=translated_md, + generated_at=utcnow(), + llm_model=llm_result.model, + llm_cost_usd=llm_result.cost_usd, + )) + await session.commit() async def run() -> None: @@ -126,7 +179,7 @@ async def run() -> None: tone=tone, analysis=analysis, error=str(e)[:200]) continue - session.add(StrategicLog( + slog = StrategicLog( generated_at=utcnow(), model=result.model, anchor_date=anchor, @@ -137,7 +190,8 @@ async def run() -> None: prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, cost_usd=result.cost_usd, - )) + ) + session.add(slog) session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, @@ -146,6 +200,7 @@ async def run() -> None: status="ok", )) await session.commit() + await translate_log_for_active_languages(session, slog.id) written += 1 log.info("ai_log.variant_done", tone=tone, analysis=analysis, diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index 1079d66..6aac9f9 100644 --- a/tests/test_localization_integration.py +++ b/tests/test_localization_integration.py 
@@ -53,3 +53,126 @@ def test_strategic_log_translation_model_columns(): assert cols["log_id"].nullable is False assert cols["lang"].nullable is False assert cols["content_md"].nullable is False + + +@pytest.mark.asyncio +async def test_log_translation_fanout_no_active_non_en_users(tmp_path, monkeypatch): + """When no users have an active non-en lang, the fan-out makes no + translation calls and no rows are inserted.""" + from unittest.mock import AsyncMock + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + fake_translate = AsyncMock() + monkeypatch.setattr(ai_log_job, "translate", fake_translate) + + # Seed an English user (no non-en users). + async with factory() as session: + session.add(User(id=1, email="en@x", tier="paid", lang="en")) + slog = StrategicLog( + generated_at=utcnow(), content="# Open\n\nDown 0.4%.", + model="test-model", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + fake_translate.assert_not_awaited() + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert rows == [] + + +@pytest.mark.asyncio +async def test_log_translation_fanout_italian_user(tmp_path, monkeypatch): + """One user at lang=it triggers one translation; the row lands with + the right lang and log_id.""" + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.services.openrouter import LogResult + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async def _fake_translate(client, text, target_lang): + assert 
target_lang == "it" + return "# Apertura\n\nIn calo 0,4%.", LogResult( + content="# Apertura\n\nIn calo 0,4%.", + model="deepseek/deepseek-v4-flash", + prompt_tokens=300, completion_tokens=80, cost_usd=0.00002, + ) + monkeypatch.setattr(ai_log_job, "translate", _fake_translate) + + async with factory() as session: + session.add(User(id=2, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content="# Open\n\nDown 0.4%.", + model="test-model", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert len(rows) == 1 + row = rows[0] + assert row.log_id == log_id + assert row.lang == "it" + assert row.content_md.startswith("# Apertura") + assert row.llm_model == "deepseek/deepseek-v4-flash" + assert row.llm_cost_usd == pytest.approx(0.00002) + + +@pytest.mark.asyncio +async def test_log_translation_fanout_per_language_failure_isolated(tmp_path, monkeypatch): + """If one language's translation fails, the others (if any) still land + and the job does not raise.""" + from sqlalchemy import select + + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + from app.jobs import ai_log_job + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async def _fake_translate(client, text, target_lang): + raise RuntimeError("upstream down") + monkeypatch.setattr(ai_log_job, "translate", _fake_translate) + + async with factory() as session: + session.add(User(id=3, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content="# Open", + model="test-model", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + # 
Must NOT raise. + async with factory() as session: + await ai_log_job.translate_log_for_active_languages(session, log_id) + + async with factory() as session: + rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() + assert rows == [] From d318039ad56c0824f74bb8253fe1d202868cf680 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 17:01:00 +0200 Subject: [PATCH 11/69] analyse: thread user.lang into the system prompt Co-Authored-By: Claude Opus 4.7 --- app/routers/universe.py | 10 +++- app/services/portfolio_analysis.py | 9 +++- tests/test_localization_integration.py | 74 ++++++++++++++++++++++++++ 3 files changed, 89 insertions(+), 4 deletions(-) diff --git a/app/routers/universe.py b/app/routers/universe.py index a77585f..3c2609c 100644 --- a/app/routers/universe.py +++ b/app/routers/universe.py @@ -36,7 +36,7 @@ from fastapi.responses import JSONResponse from sqlalchemy import and_, func, select from sqlalchemy.ext.asyncio import AsyncSession -from app.auth import require_auth +from app.auth import CurrentUser, require_auth from app.config import get_settings from app.db import get_session, utcnow from app.logging import get_logger @@ -341,10 +341,11 @@ async def parse_portfolio( # --------------------------------------------------------------------------- -@router.post("/analyze", dependencies=[Depends(require_paid)]) +@router.post("/analyze") async def analyze_portfolio( request: Request, session: AsyncSession = Depends(get_session), + principal: CurrentUser = Depends(require_paid), ) -> dict: """Generate AI commentary for the supplied pie. 
The pie is held in memory only for the duration of the LLM call; nothing about holdings @@ -364,6 +365,11 @@ async def analyze_portfolio( except Exception: raise HTTPException(status_code=400, detail="malformed JSON body") + user_lang = ( + principal.user.lang if (principal.user and principal.user.lang) else "en" + ) + payload["lang"] = user_lang + try: req = portfolio_analysis.parse_request(payload) except ValueError as e: diff --git a/app/services/portfolio_analysis.py b/app/services/portfolio_analysis.py index eb8a349..0aef3cd 100644 --- a/app/services/portfolio_analysis.py +++ b/app/services/portfolio_analysis.py @@ -31,6 +31,7 @@ from app.config import get_settings from app.db import utcnow from app.logging import get_logger from app.models import AICall +from app.services.i18n import LANGUAGES, respond_in_clause from app.services.openrouter import ( LogResult, active_model, @@ -74,6 +75,7 @@ class AnalysisRequest: anchor: str | None = None tone: str = "INTERMEDIATE" # NOVICE | INTERMEDIATE | PRO analysis: str = "SPECULATIVE" # DRY | SPECULATIVE + lang: str = "en" @dataclass @@ -163,10 +165,13 @@ def parse_request(payload: dict) -> AnalysisRequest: anchor = _sanitise_text(payload.get("anchor") or "", 32) or None tone = _sanitise_text(payload.get("tone", "INTERMEDIATE"), 16) or "INTERMEDIATE" analysis = _sanitise_text(payload.get("analysis", "SPECULATIVE"), 16) or "SPECULATIVE" + lang = (payload.get("lang") or "en").strip().lower() + if lang not in LANGUAGES: + lang = "en" return AnalysisRequest( positions=positions, prices=prices, base_currency=base_currency, - anchor=anchor, tone=tone, analysis=analysis, + anchor=anchor, tone=tone, analysis=analysis, lang=lang, ) @@ -276,7 +281,7 @@ def build_prompt(req: AnalysisRequest) -> tuple[str, str]: head = enriched[:MAX_POSITIONS_INLINED] tail_count = max(0, len(enriched) - MAX_POSITIONS_INLINED) - system = build_system_prompt(req.tone, req.analysis) + "\n\n" + _SYSTEM_OVERRIDES + system = 
build_system_prompt(req.tone, req.analysis) + "\n\n" + _SYSTEM_OVERRIDES + respond_in_clause(req.lang) user_parts = [ f"# Portfolio commentary request — {utcnow().strftime('%Y-%m-%d')}", diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index 6aac9f9..680fe25 100644 --- a/tests/test_localization_integration.py +++ b/tests/test_localization_integration.py @@ -176,3 +176,77 @@ async def test_log_translation_fanout_per_language_failure_isolated(tmp_path, mo async with factory() as session: rows = (await session.execute(select(StrategicLogTranslation))).scalars().all() assert rows == [] + + +@pytest.mark.asyncio +async def test_analyse_threads_lang_into_system_prompt(tmp_path, monkeypatch): + """When lang='it', the system prompt sent to call_llm contains + 'Respond in Italian.' — the LLM does the rest.""" + from app.services import portfolio_analysis as pa + from app.services.openrouter import LogResult + + captured = {} + + async def _fake_call_llm(client, messages, **kw): + captured["messages"] = messages + return LogResult( + content="Analisi del portafoglio in italiano.", + model="m", prompt_tokens=400, completion_tokens=100, cost_usd=0.0001, + ) + monkeypatch.setattr(pa, "call_llm", _fake_call_llm) + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + payload = { + "positions": [{"yahoo_ticker": "AAPL", "qty": 10, "avg_cost": 150.0, + "currency": "USD", "name": "Apple Inc"}], + "prices": {"AAPL": {"p": 172.4, "c": "USD"}}, + "fx": {"USD": 1.0}, + "base_currency": "USD", + "tone": "INTERMEDIATE", + "analysis": "NORMAL", + "lang": "it", + } + req = pa.parse_request(payload) + assert req.lang == "it" + + async with factory() as session: + await pa.analyse(session, req) + system = next(m["content"] for m in captured["messages"] if m["role"] == "system") + assert "Respond in Italian" in system + + +@pytest.mark.asyncio +async def test_analyse_no_clause_when_lang_is_en(tmp_path, monkeypatch): + from 
app.services import portfolio_analysis as pa + from app.services.openrouter import LogResult + + captured = {} + + async def _fake_call_llm(client, messages, **kw): + captured["messages"] = messages + return LogResult( + content="Portfolio analysis in English.", + model="m", prompt_tokens=400, completion_tokens=100, cost_usd=0.0001, + ) + monkeypatch.setattr(pa, "call_llm", _fake_call_llm) + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + payload = { + "positions": [{"yahoo_ticker": "AAPL", "qty": 10, "avg_cost": 150.0, + "currency": "USD", "name": "Apple Inc"}], + "prices": {"AAPL": {"p": 172.4, "c": "USD"}}, + "fx": {"USD": 1.0}, + "base_currency": "USD", + "tone": "INTERMEDIATE", + "analysis": "NORMAL", + "lang": "en", + } + req = pa.parse_request(payload) + async with factory() as session: + await pa.analyse(session, req) + system = next(m["content"] for m in captured["messages"] if m["role"] == "system") + assert "Respond in" not in system From 924f37548b7c676b6c169029ebf81cfaa53b7d27 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 17:07:18 +0200 Subject: [PATCH 12/69] digest: translate variants once per active non-en language Co-Authored-By: Claude Opus 4.7 --- app/jobs/email_digest_job.py | 78 +++++++++++++++++++++--- tests/test_localization_integration.py | 82 ++++++++++++++++++++++++++ 2 files changed, 152 insertions(+), 8 deletions(-) diff --git a/app/jobs/email_digest_job.py b/app/jobs/email_digest_job.py index 1f38777..5ff25c6 100644 --- a/app/jobs/email_digest_job.py +++ b/app/jobs/email_digest_job.py @@ -30,6 +30,7 @@ from app.models import EmailSend, User from app.routers.email import sign_unsubscribe_token from app.services.access import paid_status from app.services.email_service import render_digest_email, send_email +from app.services.i18n import ACTIVE_LANGUAGES from app.services.openrouter import ( PROMPT_VERSION, build_daily_digest_prompt, @@ -37,6 +38,7 @@ from app.services.openrouter import ( 
call_llm, llm_configured, ) +from app.services.translation import translate def _now() -> datetime: @@ -116,6 +118,62 @@ def _kind_for_today(today: datetime) -> str: return "weekly" if today.weekday() == 6 else "daily" +async def _translate_variants_for_active_langs( + client, + english_variants: dict[str, str], + target_langs: list[str], +) -> dict[tuple[str, str], str]: + """Build a {(tone, lang): content_md} table. + + Starts with the English variants as the canonical cells. For each + (tone, target_lang) pair where target_lang != 'en', calls translate() + in parallel; on failure the cell falls back to the English variant + of the same tone so the digest still goes out, just untranslated. + """ + table: dict[tuple[str, str], str] = { + (tone, "en"): content for tone, content in english_variants.items() + } + pairs = [ + (tone, lang) + for tone in english_variants + for lang in target_langs + if lang != "en" + ] + if not pairs: + return table + + results = await asyncio.gather(*[ + translate(client, english_variants[tone], lang) for tone, lang in pairs + ], return_exceptions=True) + for (tone, lang), result in zip(pairs, results): + if isinstance(result, Exception): + log.warning("digest.translate.failed", + tone=tone, lang=lang, error=str(result)[:200]) + table[(tone, lang)] = english_variants[tone] + continue + translated_md, _llm_log = result + table[(tone, lang)] = translated_md + return table + + +def _pick_variant( + table: dict[tuple[str, str], str], tone: str, lang: str, +) -> str: + """Return the digest content for a recipient. + + Lookup order: exact (tone, lang) → (tone, 'en') → ('INTERMEDIATE', + 'en') → first table value. The last fallbacks are defensive; the table + always contains at least one English entry when the job is sending. 
+ """ + if (tone, lang) in table: + return table[(tone, lang)] + if (tone, "en") in table: + return table[(tone, "en")] + if ("INTERMEDIATE", "en") in table: + return table[("INTERMEDIATE", "en")] + return next(iter(table.values())) + + async def _send_one(user: User, kind: str, content_html: str, date_str: str, session) -> None: settings_url = f"{branding.SITE_URL}/settings" @@ -200,17 +258,21 @@ async def run() -> None: jr.error = "all variants failed" return + # Build the per-language translation table once per job run. + active_non_en = sorted({l for l in ACTIVE_LANGUAGES if l != "en"}) + async with httpx.AsyncClient(follow_redirects=True) as client: + variant_table = await _translate_variants_for_active_langs( + client, variants, active_non_en, + ) + written = 0 for u in fresh: tone = (u.digest_tone or "INTERMEDIATE").upper() - # Fall back to INTERMEDIATE first (the more common tone) and then - # to whatever variant succeeded, so an asymmetric LLM failure - # doesn't silently skip the user. - content = (variants.get(tone) - or variants.get("INTERMEDIATE") - or next(iter(variants.values()), None)) - if content is None: - continue + content = _pick_variant( + variant_table, + tone=tone, + lang=(u.lang or "en"), + ) await _send_one(u, kind, content, date_str, session) await asyncio.sleep(0.1) written += 1 diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index 680fe25..ebe8a21 100644 --- a/tests/test_localization_integration.py +++ b/tests/test_localization_integration.py @@ -250,3 +250,85 @@ async def test_analyse_no_clause_when_lang_is_en(tmp_path, monkeypatch): await pa.analyse(session, req) system = next(m["content"] for m in captured["messages"] if m["role"] == "system") assert "Respond in" not in system + + +@pytest.mark.asyncio +async def test_digest_translates_variants_per_active_lang(monkeypatch): + """After English variants are built, the job translates each to every + active non-en lang. 
The result is an in-memory mapping the send loop + consults.""" + from unittest.mock import MagicMock + from app.jobs import email_digest_job as ed + from app.services.openrouter import LogResult + + english_variants = { + "NOVICE": "**Today.** Markets calmer.", + "INTERMEDIATE": "**Today.** Indices slightly down.", + "PRO": "**Today.** Risk-off rotation, breadth weak.", + } + + translate_calls: list[tuple[str, str]] = [] + + async def _fake_translate(client, text, target_lang): + translate_calls.append((text, target_lang)) + return f"[IT] {text}", LogResult( + content=f"[IT] {text}", model="m", + prompt_tokens=10, completion_tokens=10, cost_usd=0.0, + ) + + monkeypatch.setattr(ed, "translate", _fake_translate) + + client = MagicMock() + table = await ed._translate_variants_for_active_langs( + client, english_variants, ["it"], + ) + + # Three tones × one non-en lang = three translation calls. + assert len(translate_calls) == 3 + assert {lang for _, lang in translate_calls} == {"it"} + + # English entries are present unchanged. + assert table[("NOVICE", "en")] == english_variants["NOVICE"] + assert table[("PRO", "en")] == english_variants["PRO"] + # Italian entries are populated. 
+ assert table[("INTERMEDIATE", "it")].startswith("[IT] ") + + +@pytest.mark.asyncio +async def test_digest_translation_failure_falls_back_to_english(monkeypatch): + """When translate() fails for a (tone, lang) cell, the table entry + for that cell is the English variant of the same tone — the user + still gets a digest, just in English that day.""" + from unittest.mock import MagicMock + from app.jobs import email_digest_job as ed + + english_variants = {"INTERMEDIATE": "**Today.** Indices down."} + + async def _fake_translate(client, text, target_lang): + raise RuntimeError("upstream down") + monkeypatch.setattr(ed, "translate", _fake_translate) + + client = MagicMock() + table = await ed._translate_variants_for_active_langs( + client, english_variants, ["it"], + ) + + assert table[("INTERMEDIATE", "it")] == english_variants["INTERMEDIATE"] + + +def test_digest_pick_variant_uses_user_lang(): + """The variant-picker helper consults user.digest_tone + user.lang.""" + from app.jobs import email_digest_job as ed + + table = { + ("NOVICE", "en"): "novice en", + ("NOVICE", "it"): "novice it", + ("INTERMEDIATE", "en"): "intermediate en", + ("INTERMEDIATE", "it"): "intermediate it", + } + assert ed._pick_variant(table, tone="NOVICE", lang="it") == "novice it" + assert ed._pick_variant(table, tone="INTERMEDIATE", lang="en") == "intermediate en" + # Missing lang → fallback to English variant of the same tone. + assert ed._pick_variant(table, tone="NOVICE", lang="de") == "novice en" + # Missing tone → fallback to INTERMEDIATE/en (the safe default). 
+ assert ed._pick_variant(table, tone="UNKNOWN", lang="en") == "intermediate en" From 1ea71bc16055d53ebbbbcbf9df975ba4925e3fed Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 17:13:57 +0200 Subject: [PATCH 13/69] log: serve translated content when available; English fallback Adds module-level _resolve_log_content(session, log_id, lang) helper to app/routers/pages.py: looks up StrategicLogTranslation by (log_id, lang) when lang != 'en'; falls back silently to the English original when no translation row exists yet (the expected case for the first hour after a new language activates, or when translation fails for a specific log). log_page / log_page_day pull cu.user.lang and thread it through _log_page_context so the template renders the right variant. Two tests cover both branches. Co-Authored-By: Claude Opus 4.7 --- app/routers/pages.py | 33 ++++++++++++-- tests/test_localization_integration.py | 63 ++++++++++++++++++++++++++ 2 files changed, 92 insertions(+), 4 deletions(-) diff --git a/app/routers/pages.py b/app/routers/pages.py index f7ef42b..f80176a 100644 --- a/app/routers/pages.py +++ b/app/routers/pages.py @@ -11,7 +11,7 @@ from sqlalchemy.ext.asyncio import AsyncSession from app.auth import CurrentUser, maybe_current_user, require_auth, require_token from app.config import get_settings, load_groups from app.db import get_session -from app.models import EmailSend, Referral, StrategicLog, User +from app.models import EmailSend, Referral, StrategicLog, StrategicLogTranslation, User from app.services.access import is_paid_active, paid_status from app.services.referral_service import assign_code_if_missing from app.templates_env import templates @@ -75,7 +75,29 @@ async def _resolve_log_date(session: AsyncSession, day: str | None) -> date: return datetime.now(timezone.utc).date() -def _log_page_context(target: date, paid: bool) -> dict: +async def _resolve_log_content( + session: AsyncSession, log_id: int, lang: str | None, +) -> str: + 
"""Return the strategic log content in the user's preferred language. + + If ``lang`` is 'en'/None or no translation exists for the requested + language, returns the English original from StrategicLog.content. + A missing translation is the expected case for hours where + translation hasn't yet run; the fallback is silent. + """ + if lang and lang != "en": + row = (await session.execute( + select(StrategicLogTranslation) + .where(StrategicLogTranslation.log_id == log_id) + .where(StrategicLogTranslation.lang == lang) + )).scalar_one_or_none() + if row is not None: + return row.content_md + log_row = await session.get(StrategicLog, log_id) + return log_row.content if log_row is not None else "" + + +def _log_page_context(target: date, paid: bool, user_lang: str = "en") -> dict: s = get_settings() return { "selected_iso": target.isoformat(), @@ -83,6 +105,7 @@ def _log_page_context(target: date, paid: bool) -> dict: "current_tone": s.CASSANDRA_TONE.upper(), "current_analysis": s.CASSANDRA_ANALYSIS.upper(), "paid": paid, + "user_lang": user_lang, } @@ -93,8 +116,9 @@ async def log_page( cu: CurrentUser = Depends(require_auth), ): target = await _resolve_log_date(session, None) + user_lang = cu.user.lang if cu.user else "en" return templates.TemplateResponse( - request, "log.html", _log_page_context(target, is_paid_active(cu)), + request, "log.html", _log_page_context(target, is_paid_active(cu), user_lang), ) @@ -106,8 +130,9 @@ async def log_page_day( cu: CurrentUser = Depends(require_auth), ): target = await _resolve_log_date(session, day) + user_lang = cu.user.lang if cu.user else "en" return templates.TemplateResponse( - request, "log.html", _log_page_context(target, is_paid_active(cu)), + request, "log.html", _log_page_context(target, is_paid_active(cu), user_lang), ) diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index ebe8a21..ae74a53 100644 --- a/tests/test_localization_integration.py +++ 
b/tests/test_localization_integration.py @@ -332,3 +332,66 @@ def test_digest_pick_variant_uses_user_lang(): assert ed._pick_variant(table, tone="NOVICE", lang="de") == "novice en" # Missing tone → fallback to INTERMEDIATE/en (the safe default). assert ed._pick_variant(table, tone="UNKNOWN", lang="en") == "intermediate en" + + +@pytest.mark.asyncio +async def test_log_endpoint_serves_italian_when_user_is_italian(tmp_path): + """When a user with lang='it' opens /log, the served content is the + Italian translation, not the English original.""" + from app.db import utcnow + from app.models import StrategicLog, StrategicLogTranslation, User + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=10, email="it@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content="# Open\n\nDown 0.4%.", + model="test-model", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + session.add(StrategicLogTranslation( + log_id=slog.id, lang="it", + content_md="# Apertura\n\nIn calo 0,4%.", + generated_at=utcnow(), llm_model="m", llm_cost_usd=0.0, + )) + await session.commit() + log_id = slog.id + + # Test the resolver directly. 
+ from app.routers.pages import _resolve_log_content + async with factory() as session: + user = await session.get(User, 10) + content = await _resolve_log_content(session, log_id, user.lang) + assert "Apertura" in content + assert "Open" not in content + + +@pytest.mark.asyncio +async def test_log_endpoint_falls_back_to_english_when_no_translation(tmp_path): + """User lang='it' but no IT translation exists → English fallback.""" + from app.db import utcnow + from app.models import StrategicLog, User + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=11, email="it2@x", tier="paid", lang="it")) + slog = StrategicLog( + generated_at=utcnow(), content="# Open\n\nDown 0.4%.", + model="test-model", + tone="INTERMEDIATE", analysis="NORMAL", + ) + session.add(slog) + await session.commit() + log_id = slog.id + + from app.routers.pages import _resolve_log_content + async with factory() as session: + user = await session.get(User, 11) + content = await _resolve_log_content(session, log_id, user.lang) + assert "Open" in content From f4025e3cbb3814819e6bfbfb0909f9f122108ff8 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 17:16:17 +0200 Subject: [PATCH 14/69] settings: PATCH /api/settings/language with ACTIVE_LANGUAGES gate Co-Authored-By: Claude Opus 4.7 --- app/routers/api.py | 36 +++++++++++++++ tests/test_localization_integration.py | 61 ++++++++++++++++++++++++++ 2 files changed, 97 insertions(+) diff --git a/app/routers/api.py b/app/routers/api.py index 5e06090..0b23d57 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -21,6 +21,7 @@ import httpx from pydantic import BaseModel, Field from app.auth import require_token, maybe_current_user, CurrentUser +from app.services.i18n import ACTIVE_LANGUAGES from app.config import get_settings from app.db import get_session, utcnow from app.services.openrouter import ( @@ -895,3 +896,38 @@ async def patch_digest_prefs( 
user.digest_tone = payload.tone await session.commit() return DigestPrefsOut(opt_in=payload.opt_in, tone=payload.tone) + + +# --------------------------------------------------------------------------- +# Settings — language preference +# --------------------------------------------------------------------------- + + +class LanguagePrefsIn(BaseModel): + lang: str + + +class LanguagePrefsOut(BaseModel): + lang: str + + +@router.patch("/settings/language", response_model=LanguagePrefsOut) +async def patch_language_prefs( + payload: LanguagePrefsIn, + principal: CurrentUser = Depends(require_token), + session: AsyncSession = Depends(get_session), +) -> LanguagePrefsOut: + if principal.user is None: + raise HTTPException(status_code=400, detail="no_user_context") + lang = (payload.lang or "").strip().lower() + if lang not in ACTIVE_LANGUAGES: + raise HTTPException( + status_code=400, + detail=f"unsupported language: {payload.lang!r}", + ) + user = await session.get(User, principal.user.id) + if user is None: + raise HTTPException(status_code=404, detail="user_not_found") + user.lang = lang + await session.commit() + return LanguagePrefsOut(lang=lang) diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index ae74a53..c5cf92c 100644 --- a/tests/test_localization_integration.py +++ b/tests/test_localization_integration.py @@ -395,3 +395,64 @@ async def test_log_endpoint_falls_back_to_english_when_no_translation(tmp_path): user = await session.get(User, 11) content = await _resolve_log_content(session, log_id, user.lang) assert "Open" in content + + +@pytest.mark.asyncio +async def test_patch_language_accepts_active(tmp_path): + """PATCH /api/settings/language accepts 'en' and 'it' and persists.""" + from app.models import User + from app.routers.api import patch_language_prefs, LanguagePrefsIn + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=20, 
email="u@x", tier="paid", lang="en")) + await session.commit() + + class _P: + is_admin = False + def __init__(self, u): self.user = u + + async with factory() as session: + user = await session.get(User, 20) + result = await patch_language_prefs( + payload=LanguagePrefsIn(lang="it"), + principal=_P(user), + session=session, + ) + assert result.lang == "it" + + async with factory() as session: + user = await session.get(User, 20) + assert user.lang == "it" + + +@pytest.mark.asyncio +async def test_patch_language_rejects_wip(tmp_path): + """PATCH rejects 'es'/'fr'/'de'/'xx' with 400 — ACTIVE_LANGUAGES gate.""" + from fastapi import HTTPException + from app.models import User + from app.routers.api import patch_language_prefs, LanguagePrefsIn + + _, factory, setup = _build_session_factory(tmp_path) + await setup() + + async with factory() as session: + session.add(User(id=21, email="u2@x", tier="paid", lang="en")) + await session.commit() + + class _P: + is_admin = False + def __init__(self, u): self.user = u + + for bad in ("es", "fr", "de", "xx"): + async with factory() as session: + user = await session.get(User, 21) + with pytest.raises(HTTPException) as exc: + await patch_language_prefs( + payload=LanguagePrefsIn(lang=bad), + principal=_P(user), + session=session, + ) + assert exc.value.status_code == 400 From 50ac6b9366e873e3956abf07cd1252c9aae2b34f Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 17:17:18 +0200 Subject: [PATCH 15/69] settings: add language dropdown (IT active, ES/FR/DE WIP) --- app/templates/settings.html | 41 +++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/app/templates/settings.html b/app/templates/settings.html index 20dfa57..a5c59f0 100644 --- a/app/templates/settings.html +++ b/app/templates/settings.html @@ -224,6 +224,47 @@ })(); + {# --- Language block ------------------------------------------------ #} +
    + Language +

    + Language the AI uses for the strategic log, your daily digest, and + portfolio commentary. The interface itself stays in English for now. +

    +
    + + +
    + +
    + {# --- Cloud sync block --------------------------------------------- #}
    Cloud sync (encrypted) From fb71854238c89c594fb908e9cbaca6e5d9f83211 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 18:14:23 +0200 Subject: [PATCH 16/69] i18n: style the settings select + add a topbar lang toggle MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two issues addressed: 1. The /settings language used in the Settings page. Native + * browser chrome stripped; we render a small chevron via crossed + * linear-gradients so the control matches the rest of the panel. */ +.settings-select { + appearance: none; + -webkit-appearance: none; + -moz-appearance: none; + background: transparent; + border: 1px solid var(--border); + color: var(--text); + padding: 4px 28px 4px 8px; + font-family: var(--font-mono); + font-size: 12px; + border-radius: 2px; + cursor: pointer; + background-image: + linear-gradient(45deg, transparent 50%, var(--dim) 50%), + linear-gradient(-45deg, transparent 50%, var(--dim) 50%); + background-position: calc(100% - 13px) 50%, calc(100% - 9px) 50%; + background-size: 5px 5px, 5px 5px; + background-repeat: no-repeat; + transition: border-color 120ms ease-out, color 120ms ease-out; +} +.settings-select:hover, +.settings-select:focus { + outline: none; + border-color: var(--accent); + color: var(--text); +} +.settings-select option { color: var(--text); background: var(--surface); } +.settings-select option:disabled { color: var(--dim); } + +.settings-status { + font-family: var(--font-mono); + font-size: 11px; + color: var(--muted); + letter-spacing: 0.04em; +} +.settings-status:empty { display: none; } + /* Sections are
    elements — collapsed by default to keep the settings page scannable. Click the summary to expand. */ .settings-section { diff --git a/app/templates/base.html b/app/templates/base.html index 9fdb0d1..29c8290 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -134,6 +134,34 @@ if (el && window.htmx) window.htmx.trigger(el, 'tone-changed'); }); }; + + window.cassandraSetLang = async function (newLang) { + var pill = document.getElementById('lang-toggle'); + if (!pill) return; + var prev = pill.dataset.lang; + if (prev === newLang) return; + // Optimistic update — flip the pill immediately so the click feels + // responsive. Revert on PATCH failure. + pill.dataset.lang = newLang; + try { + var r = await fetch('/api/settings/language', { + method: 'PATCH', + headers: {'Content-Type': 'application/json'}, + credentials: 'same-origin', + body: JSON.stringify({lang: newLang}), + }); + if (!r.ok) throw new Error('HTTP ' + r.status); + // Reload localized panels so the user immediately sees content + // in the new language (strategic log, dashboard header, etc.). 
+ if (window.location.pathname === '/log' || + window.location.pathname.startsWith('/log/')) { + window.location.reload(); + } + } catch (e) { + pill.dataset.lang = prev; + console.warn('language switch failed:', e); + } + }; - + + {% endif %} - + {% endif %} {% endblock %} From e4dc6d007184fffb39a7392915a0c8b819d24fc4 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 21:02:03 +0200 Subject: [PATCH 29/69] i18n: instant lang switch via HTMX trigger + refresh paid-plans terms --- app/templates/base.html | 14 ++++++++------ app/templates/dashboard.html | 6 +++--- app/templates/log.html | 2 +- app/templates/terms.html | 25 ++++++++++--------------- 4 files changed, 22 insertions(+), 25 deletions(-) diff --git a/app/templates/base.html b/app/templates/base.html index d972043..fbf52e0 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -158,12 +158,14 @@ body: JSON.stringify({lang: newLang}), }); if (!r.ok) throw new Error('HTTP ' + r.status); - // Reload localized panels so the user immediately sees content - // in the new language (strategic log, dashboard header, etc.). - if (window.location.pathname === '/log' || - window.location.pathname.startsWith('/log/')) { - window.location.reload(); - } + // Trigger HTMX-driven panels to re-fetch in the new language. + // Same shape as cassandraSetTone — every panel that listens to + // tone-changed also listens to lang-changed. + ['#dash-header-container', '#log-panel .panel-body', + '#indicators-body', '#log-content'].forEach(function (sel) { + var el = document.querySelector(sel); + if (el && window.htmx) window.htmx.trigger(el, 'lang-changed'); + }); } catch (e) { pill.dataset.lang = prev; console.warn('language switch failed:', e); diff --git a/app/templates/dashboard.html b/app/templates/dashboard.html index fda7358..ee7e7df 100644 --- a/app/templates/dashboard.html +++ b/app/templates/dashboard.html @@ -5,7 +5,7 @@
    loading aggregate read…
    @@ -29,7 +29,7 @@
    loading…
    @@ -115,7 +115,7 @@
    awaiting first log…
    diff --git a/app/templates/log.html b/app/templates/log.html index 8abee4c..1ae7ea3 100644 --- a/app/templates/log.html +++ b/app/templates/log.html @@ -25,7 +25,7 @@
    loading log…
    diff --git a/app/templates/terms.html b/app/templates/terms.html index cd3530b..8e33b23 100644 --- a/app/templates/terms.html +++ b/app/templates/terms.html @@ -77,21 +77,16 @@

    5. Paid plans

    - If and when paid plans become available, you will be told the - applicable fees at point of sale. Paid features remain active for as - long as the subscription is current or any time-bounded credit - granted to your account is still valid. You can cancel a paid - subscription at any time; cancellation takes effect at the end of - the current billing period unless otherwise stated. -

    -

    - Where the law gives you a 14-day right to cancel a subscription - (Consumer Contracts (Information, Cancellation and Additional - Charges) Regulations 2013, UK), that right applies. By starting to - use a paid feature immediately on purchase you agree we may begin - supplying the service within the cancellation period, and you - acknowledge that you lose the right to cancel in respect of any - digital content already delivered. + Paid plans are available at £7/month or £70/year (terms + and current prices on the pricing page). New + annual subscriptions begin with a 14-day free trial; monthly + subscriptions begin immediately on payment. Paid features remain + active for as long as the subscription is current or any + time-bounded credit granted to your account is still valid. You + can cancel a paid subscription at any time; cancellation takes + effect at the end of the current billing period unless otherwise + stated. Detailed refund and cancellation rights are set out in + section 6 below.

    From a6d686324cc741d63f8344d7747e806d6defaa3b Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 21:18:29 +0200 Subject: [PATCH 30/69] models: align translation column naming + add token counts Three recently-added tables (strategic_log_translations, indicator_summary_translations, csv_format_templates) drifted from the codebase's existing naming convention: - llm_model -> model - llm_cost_usd -> cost_usd - content_md -> content (on the two translation tables; csv_format doesn't have a content field) Also added prompt_tokens and completion_tokens to the three tables; they were silently dropped at write time despite LogResult exposing them. All writer call sites (ai_log_job, indicator_summary_job, llm_csv_parser) and reader call sites (api.py localized helpers) updated to match. Tests realigned. Migration 0025 uses batch_alter_table for SQLite compatibility. Co-Authored-By: Claude Opus 4.7 --- .../0025_align_translation_columns.py | 79 +++++++++++++++++++ app/jobs/ai_log_job.py | 8 +- app/jobs/indicator_summary_job.py | 8 +- app/models.py | 22 ++++-- app/routers/api.py | 6 +- app/services/llm_csv_parser.py | 6 +- tests/test_llm_csv_parser.py | 14 ++-- tests/test_localization_integration.py | 14 ++-- 8 files changed, 125 insertions(+), 32 deletions(-) create mode 100644 alembic/versions/0025_align_translation_columns.py diff --git a/alembic/versions/0025_align_translation_columns.py b/alembic/versions/0025_align_translation_columns.py new file mode 100644 index 0000000..dbee1d7 --- /dev/null +++ b/alembic/versions/0025_align_translation_columns.py @@ -0,0 +1,79 @@ +"""align translation column naming + add token counts. 
+ +Revision ID: 0025 +Revises: 0024 +Create Date: 2026-05-27 +""" +from typing import Sequence, Union + +import sqlalchemy as sa +from alembic import op + + +revision: str = "0025" +down_revision: Union[str, None] = "0024" +branch_labels: Union[str, Sequence[str], None] = None +depends_on: Union[str, Sequence[str], None] = None + + +def upgrade() -> None: + # strategic_log_translations + with op.batch_alter_table("strategic_log_translations") as bop: + bop.alter_column("llm_model", new_column_name="model", + existing_type=sa.String(length=64), existing_nullable=True) + bop.alter_column("llm_cost_usd", new_column_name="cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.alter_column("content_md", new_column_name="content", + existing_type=sa.Text(), existing_nullable=False) + bop.add_column(sa.Column("prompt_tokens", sa.Integer(), nullable=True)) + bop.add_column(sa.Column("completion_tokens", sa.Integer(), nullable=True)) + + # indicator_summary_translations + with op.batch_alter_table("indicator_summary_translations") as bop: + bop.alter_column("llm_model", new_column_name="model", + existing_type=sa.String(length=64), existing_nullable=True) + bop.alter_column("llm_cost_usd", new_column_name="cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.alter_column("content_md", new_column_name="content", + existing_type=sa.Text(), existing_nullable=False) + bop.add_column(sa.Column("prompt_tokens", sa.Integer(), nullable=True)) + bop.add_column(sa.Column("completion_tokens", sa.Integer(), nullable=True)) + + # csv_format_templates + with op.batch_alter_table("csv_format_templates") as bop: + bop.alter_column("llm_model", new_column_name="model", + existing_type=sa.String(length=64), existing_nullable=True) + bop.alter_column("llm_cost_usd", new_column_name="cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.add_column(sa.Column("prompt_tokens", sa.Integer(), nullable=True)) + 
bop.add_column(sa.Column("completion_tokens", sa.Integer(), nullable=True)) + + +def downgrade() -> None: + with op.batch_alter_table("csv_format_templates") as bop: + bop.drop_column("completion_tokens") + bop.drop_column("prompt_tokens") + bop.alter_column("cost_usd", new_column_name="llm_cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.alter_column("model", new_column_name="llm_model", + existing_type=sa.String(length=64), existing_nullable=True) + + with op.batch_alter_table("indicator_summary_translations") as bop: + bop.drop_column("completion_tokens") + bop.drop_column("prompt_tokens") + bop.alter_column("content", new_column_name="content_md", + existing_type=sa.Text(), existing_nullable=False) + bop.alter_column("cost_usd", new_column_name="llm_cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.alter_column("model", new_column_name="llm_model", + existing_type=sa.String(length=64), existing_nullable=True) + + with op.batch_alter_table("strategic_log_translations") as bop: + bop.drop_column("completion_tokens") + bop.drop_column("prompt_tokens") + bop.alter_column("content", new_column_name="content_md", + existing_type=sa.Text(), existing_nullable=False) + bop.alter_column("cost_usd", new_column_name="llm_cost_usd", + existing_type=sa.Float(), existing_nullable=True) + bop.alter_column("model", new_column_name="llm_model", + existing_type=sa.String(length=64), existing_nullable=True) diff --git a/app/jobs/ai_log_job.py b/app/jobs/ai_log_job.py index c0635a7..59da09d 100644 --- a/app/jobs/ai_log_job.py +++ b/app/jobs/ai_log_job.py @@ -74,10 +74,12 @@ async def translate_log_for_active_languages(session, log_id: int) -> None: translated_md, llm_result = result session.add(StrategicLogTranslation( log_id=log_id, lang=lang, - content_md=translated_md, + content=translated_md, generated_at=utcnow(), - llm_model=llm_result.model, - llm_cost_usd=llm_result.cost_usd, + model=llm_result.model, + 
prompt_tokens=llm_result.prompt_tokens, + completion_tokens=llm_result.completion_tokens, + cost_usd=llm_result.cost_usd, )) await session.commit() diff --git a/app/jobs/indicator_summary_job.py b/app/jobs/indicator_summary_job.py index 829077b..5f47221 100644 --- a/app/jobs/indicator_summary_job.py +++ b/app/jobs/indicator_summary_job.py @@ -77,10 +77,12 @@ async def translate_summary_for_active_languages(session, summary_id: int) -> No translated_md, llm_result = result session.add(IndicatorSummaryTranslation( summary_id=summary_id, lang=lang, - content_md=translated_md, + content=translated_md, generated_at=utcnow(), - llm_model=llm_result.model, - llm_cost_usd=llm_result.cost_usd, + model=llm_result.model, + prompt_tokens=llm_result.prompt_tokens, + completion_tokens=llm_result.completion_tokens, + cost_usd=llm_result.cost_usd, )) await session.commit() diff --git a/app/models.py b/app/models.py index 4416501..57c9f19 100644 --- a/app/models.py +++ b/app/models.py @@ -141,12 +141,14 @@ class StrategicLogTranslation(Base): nullable=False, ) lang: Mapped[str] = mapped_column(String(8), nullable=False) - content_md: Mapped[str] = mapped_column(Text, nullable=False) + content: Mapped[str] = mapped_column(Text, nullable=False) generated_at: Mapped[datetime] = mapped_column( DateTime(timezone=True), nullable=False, default=utcnow, ) - llm_model: Mapped[str | None] = mapped_column(String(64)) - llm_cost_usd: Mapped[float | None] = mapped_column(Float) + model: Mapped[str | None] = mapped_column(String(64)) + prompt_tokens: Mapped[int | None] = mapped_column(Integer) + completion_tokens: Mapped[int | None] = mapped_column(Integer) + cost_usd: Mapped[float | None] = mapped_column(Float) __table_args__ = ( UniqueConstraint("log_id", "lang", name="uq_slt_log_lang"), @@ -191,12 +193,14 @@ class IndicatorSummaryTranslation(Base): nullable=False, ) lang: Mapped[str] = mapped_column(String(8), nullable=False) - content_md: Mapped[str] = mapped_column(Text, nullable=False) + 
content: Mapped[str] = mapped_column(Text, nullable=False) generated_at: Mapped[datetime] = mapped_column( DateTime(timezone=True), nullable=False, default=utcnow, ) - llm_model: Mapped[str | None] = mapped_column(String(64)) - llm_cost_usd: Mapped[float | None] = mapped_column(Float) + model: Mapped[str | None] = mapped_column(String(64)) + prompt_tokens: Mapped[int | None] = mapped_column(Integer) + completion_tokens: Mapped[int | None] = mapped_column(Integer) + cost_usd: Mapped[float | None] = mapped_column(Float) __table_args__ = ( UniqueConstraint("summary_id", "lang", name="uq_ist_summary_lang"), @@ -535,5 +539,7 @@ class CsvFormatTemplate(Base): last_used_at: Mapped[datetime] = mapped_column( DateTime(timezone=True), nullable=False, default=utcnow, ) - llm_model: Mapped[str | None] = mapped_column(String(64)) - llm_cost_usd: Mapped[float | None] = mapped_column(Float) + model: Mapped[str | None] = mapped_column(String(64)) + prompt_tokens: Mapped[int | None] = mapped_column(Integer) + completion_tokens: Mapped[int | None] = mapped_column(Integer) + cost_usd: Mapped[float | None] = mapped_column(Float) diff --git a/app/routers/api.py b/app/routers/api.py index 30c1c62..10a9f5a 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -326,7 +326,7 @@ async def _localized_content( row: StrategicLog | None, principal: CurrentUser | None, ) -> str | None: - """Return the translated content_md for ``row`` when the principal has + """Return the translated content for ``row`` when the principal has a non-English lang preference and a matching translation row exists. 
Returns None to signal 'use row.content as-is' (the default English path).""" @@ -340,7 +340,7 @@ async def _localized_content( .where(StrategicLogTranslation.log_id == row.id) .where(StrategicLogTranslation.lang == lang) )).scalar_one_or_none() - return t.content_md if t is not None else None + return t.content if t is not None else None async def _apply_localized_summary( @@ -364,7 +364,7 @@ async def _apply_localized_summary( .where(IndicatorSummaryTranslation.lang == lang) )).scalar_one_or_none() if t is not None: - row.content = t.content_md + row.content = t.content def _resolve_tone_param(tone: str | None) -> str: diff --git a/app/services/llm_csv_parser.py b/app/services/llm_csv_parser.py index 7bb84af..7c7c7a5 100644 --- a/app/services/llm_csv_parser.py +++ b/app/services/llm_csv_parser.py @@ -424,8 +424,10 @@ async def parse_with_llm(raw: bytes, session: AsyncSession) -> ParsedPie: first_seen_at=now, last_used_at=now, use_count=1, - llm_model=llm_log.model, - llm_cost_usd=llm_log.cost_usd, + model=llm_log.model, + prompt_tokens=llm_log.prompt_tokens, + completion_tokens=llm_log.completion_tokens, + cost_usd=llm_log.cost_usd, )) await session.commit() return pie diff --git a/tests/test_llm_csv_parser.py b/tests/test_llm_csv_parser.py index 15765b3..8d5d42f 100644 --- a/tests/test_llm_csv_parser.py +++ b/tests/test_llm_csv_parser.py @@ -22,8 +22,10 @@ def test_csv_format_template_model_columns(): assert "first_seen_at" in cols assert "use_count" in cols assert "last_used_at" in cols - assert "llm_model" in cols - assert "llm_cost_usd" in cols + assert "model" in cols + assert "cost_usd" in cols + assert "prompt_tokens" in cols + assert "completion_tokens" in cols # Crucially, no user attribution. 
assert "user_id" not in cols assert "first_seen_user_id" not in cols @@ -330,7 +332,7 @@ async def test_parse_with_llm_cache_miss_inserts_template(db_factory): assert tmpl.mapping["ticker_col"] == "Symbol" assert tmpl.broker_label == "Generic broker" assert tmpl.use_count == 1 - assert tmpl.llm_cost_usd == pytest.approx(0.0002) + assert tmpl.cost_usd == pytest.approx(0.0002) # The crucial PII guarantee: assert not hasattr(tmpl, "user_id"), "sample row must not be linked to a user" @@ -365,8 +367,8 @@ async def test_parse_with_llm_cache_hit_skips_llm(db_factory): first_seen_at=utcnow(), last_used_at=utcnow(), use_count=1, - llm_model="seed", - llm_cost_usd=0.0, + model="seed", + cost_usd=0.0, )) await session.commit() @@ -410,7 +412,7 @@ async def test_parse_with_llm_stale_mapping_raises_but_does_not_evict(db_factory mapping={"ticker_col": "Symbol", "qty_col": "Symbol"}, preamble_rows=0, delimiter=",", broker_label=None, first_seen_at=utcnow(), last_used_at=utcnow(), use_count=1, - llm_model="seed", llm_cost_usd=0.0, + model="seed", cost_usd=0.0, )) await session.commit() diff --git a/tests/test_localization_integration.py b/tests/test_localization_integration.py index 6a1ea08..f527d5b 100644 --- a/tests/test_localization_integration.py +++ b/tests/test_localization_integration.py @@ -27,13 +27,13 @@ def test_strategic_log_translation_model_columns(): cols = {c.name: c for c in inspect(StrategicLogTranslation).columns} assert "log_id" in cols assert "lang" in cols - assert "content_md" in cols + assert "content" in cols assert "generated_at" in cols - assert "llm_model" in cols - assert "llm_cost_usd" in cols + assert "model" in cols + assert "cost_usd" in cols assert cols["log_id"].nullable is False assert cols["lang"].nullable is False - assert cols["content_md"].nullable is False + assert cols["content"].nullable is False async def test_log_translation_fanout_no_active_non_en_users(db_factory, monkeypatch): @@ -113,9 +113,9 @@ async def 
test_log_translation_fanout_italian_user(db_factory, monkeypatch): row = rows[0] assert row.log_id == log_id assert row.lang == "it" - assert row.content_md.startswith("# Apertura") - assert row.llm_model == "deepseek/deepseek-v4-flash" - assert row.llm_cost_usd == pytest.approx(0.00002) + assert row.content.startswith("# Apertura") + assert row.model == "deepseek/deepseek-v4-flash" + assert row.cost_usd == pytest.approx(0.00002) async def test_log_translation_fanout_per_language_failure_isolated(db_factory, monkeypatch): From 4adc8dfe8299deca7a149801dfa600135bba2018 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 21:27:23 +0200 Subject: [PATCH 31/69] openrouter: split into llm_prompts (prompt engineering) + transport MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit openrouter.py was 790 lines mixing two orthogonal concerns: - Prompt engineering (build_system_prompt, build_summary_*, build_chat_*, build_daily_digest_*, etc.) — ~400 lines, changes weekly as PROMPT_VERSION bumps - LLM transport (call_llm, _provider_chain, _call_provider, retry + fallback machinery) — ~250 lines, rarely changes Extracted the prompt-engineering surface to app/services/llm_prompts.py. Transport stays in openrouter.py (consistent with the filename — the OpenRouter URL is the transport's anchor). All import sites (jobs, routers, services, tests) split their multi-import lines into two: prompt-things from llm_prompts, transport from openrouter. PROMPT_VERSION constant, _TONE_ALIASES, _resolve_tone, and SYSTEM_PROMPT moved with the prompt functions. No behaviour change — pure relocation. Function signatures, body, and naming all preserved. 
Co-Authored-By: Claude Opus 4.7 --- app/jobs/ai_log_job.py | 6 +- app/jobs/email_digest_job.py | 4 +- app/jobs/indicator_summary_job.py | 6 +- app/routers/api.py | 4 +- app/services/llm_prompts.py | 597 +++++++++++++++++++++++++++++ app/services/openrouter.py | 589 +--------------------------- app/services/portfolio_analysis.py | 2 +- tests/test_digest_prompts.py | 2 +- tests/test_openrouter_prompt.py | 4 +- 9 files changed, 619 insertions(+), 595 deletions(-) create mode 100644 app/services/llm_prompts.py diff --git a/app/jobs/ai_log_job.py b/app/jobs/ai_log_job.py index 59da09d..2c0277e 100644 --- a/app/jobs/ai_log_job.py +++ b/app/jobs/ai_log_job.py @@ -20,11 +20,13 @@ from app.jobs._market_context import ( from app.models import AICall, JobRun, StrategicLog, StrategicLogTranslation, User from app.services.cadence import DEFAULT_POLICY from app.services.i18n import ACTIVE_LANGUAGES -from app.services.openrouter import ( +from app.services.llm_prompts import ( PROMPT_VERSION, - active_model, build_system_prompt, build_user_prompt, +) +from app.services.openrouter import ( + active_model, call_llm, llm_configured, ) diff --git a/app/jobs/email_digest_job.py b/app/jobs/email_digest_job.py index 5ff25c6..0bad288 100644 --- a/app/jobs/email_digest_job.py +++ b/app/jobs/email_digest_job.py @@ -31,10 +31,12 @@ from app.routers.email import sign_unsubscribe_token from app.services.access import paid_status from app.services.email_service import render_digest_email, send_email from app.services.i18n import ACTIVE_LANGUAGES -from app.services.openrouter import ( +from app.services.llm_prompts import ( PROMPT_VERSION, build_daily_digest_prompt, build_weekly_digest_prompt, +) +from app.services.openrouter import ( call_llm, llm_configured, ) diff --git a/app/jobs/indicator_summary_job.py b/app/jobs/indicator_summary_job.py index 5f47221..fb21f24 100644 --- a/app/jobs/indicator_summary_job.py +++ b/app/jobs/indicator_summary_job.py @@ -22,13 +22,15 @@ from app.models import 
( ) from app.services.cadence import DEFAULT_POLICY from app.services.i18n import ACTIVE_LANGUAGES -from app.services.openrouter import ( +from app.services.llm_prompts import ( PROMPT_VERSION, - active_model, build_aggregate_summary_system_prompt, build_aggregate_summary_user_prompt, build_summary_system_prompt, build_summary_user_prompt, +) +from app.services.openrouter import ( + active_model, call_llm, llm_configured, month_start, diff --git a/app/routers/api.py b/app/routers/api.py index 10a9f5a..893d08f 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -25,9 +25,11 @@ from app.services.i18n import ACTIVE_LANGUAGES from app.config import get_settings from app.db import get_session, utcnow from app.jobs._market_context import REFERENCE_LINE -from app.services.openrouter import ( +from app.services.llm_prompts import ( PROMPT_VERSION, build_chat_system_prompt, +) +from app.services.openrouter import ( call_llm, month_start, ) diff --git a/app/services/llm_prompts.py b/app/services/llm_prompts.py new file mode 100644 index 0000000..9840ec2 --- /dev/null +++ b/app/services/llm_prompts.py @@ -0,0 +1,597 @@ +"""Prompt-engineering surface for AI surfaces. + +This module assembles the system + user prompts the LLM ingests. It +has no I/O — pure string-building from typed inputs. Pair with +``app.services.openrouter`` (the transport layer) which actually +calls the model. + +The two halves of LLM work — what to ask vs how to ask — change at +very different cadences. Prompt-version bumps (see PROMPT_VERSION +below) happen ~weekly; transport changes are rare. +""" +from __future__ import annotations + +import json +from datetime import datetime + + +# Bump when the composed prompt changes meaningfully. Stored on every +# StrategicLog row so historical logs can be linked to the prompt that produced +# them. +# +# v6 (2026-05-17): TONE shrinks to NOVICE | INTERMEDIATE (PRO dropped). 
New +# educational stance baked into _CORE — explicit anti-TA, anti-gambling-mindset +# framing aimed at young investors entering the trading world. NOVICE retuned +# to be pedagogical (defining terms, anti-pattern teach-backs); INTERMEDIATE +# kept terse but with light-touch educational nudges. See tasks/todo.md. +# v7 (2026-05-18): Forbid "(Updated HH:MM UTC)" clauses in the date header — +# the model was hallucinating future times. The user prompt now carries the +# actual current UTC time so the model has accurate temporal context. +# v9 (2026-05-25): Adds daily + weekly digest prompt builders for email. +PROMPT_VERSION = 9 + + +# --- Core: invariant across tone/analysis settings ---------------------------- + +_CORE = """You are Cassandra, writing a single daily strategic markets log \ +for one specific investor. Synthesis, not exposition. + +# Lens +- Geopolitics → markets is the primary causal chain. For each sector move, \ +ask: geopolitical, cyclical, or idiosyncratic. Label it. +- Divergences and contradictions are where the information is. Hunt for them. +- Absence of expected moves is signal. If the thesis predicted a reaction \ +that didn't happen, that's more interesting than the reactions that did. +- Compare live readings against any reference snapshots provided. + +# Multi-source news +- When state-aligned outlets (Xinhua, China Daily, RT) and Western outlets \ +cover the same event, read the gap in framing — that's the data. +- News matters only insofar as it changes a market read. Color without \ +implications is filler. + +# Structure +- One-line date header containing ONLY the date (e.g. `2026-05-18`) and \ +optional anchor framing on the same line (e.g. "Week 11 since Hormuz"). \ +**Never include a time-of-day clause like "(Updated 21:30 UTC)"** — \ +generation time is recorded as metadata elsewhere. Inventing a future or \ +arbitrary time in the header confuses readers. 
+- Immediately after the date header — with **nothing** in between — write a \ +TL;DR. Format it as: + + ## TL;DR + + One concise paragraph of 2-3 sentences, **≤60 words total**, naming the \ +single most important read or divergence of the day with concrete numbers. \ +This is what a reader who only has 10 seconds sees. Don't waste it on the \ +weather or generic context. + +- Then 4-6 paragraphs, each anchored on a sleeve, sector, or theme. Concrete \ +numbers in every paragraph. No section over ~150 words. +- One paragraph synthesising the news flow into a market read. +- End with a watch list: 3-5 specific items to track in the next week, \ +each one sentence. + +# Time-horizon discipline +- This is a STRATEGIC log, not a day-trader's read. Treat 1-day moves under \ +2% as background noise; mention them only when they break or confirm a \ +multi-week trend or are extreme outliers. +- Anchor every claim to multi-week (1m), multi-month (since-anchor), or \ +multi-year (1y) changes — not 1d. If the only thing happening is a 1d move, \ +omit the paragraph. +- The watch list is for "structural tripwires over the next 1-3 months", not \ +"things to watch tomorrow". Each watch item should name a level/threshold \ +whose breach would change the regime, not a calendar-date event. + +# Rational vs irrational framing (MANDATORY in every paragraph) +The reader's primary goal is to disconnect rational decisions from market \ +irrationality. This is the single most important lens of the log — it MUST \ +appear in every sector or theme paragraph, not just where it feels natural. \ +For each paragraph, before writing it, ask yourself the two questions and \ +then make both answers visible in the prose: +- The RATIONAL drivers — what the underlying factors justify: earnings, \ +real-economy data, monetary policy, structural geopolitical shifts, \ +valuation vs fundamentals. 
+- The IRRATIONAL drivers — what the crowd is doing regardless of fundamentals: \ +positioning, narrative momentum, sentiment extremes, concentration, \ +flow-driven moves, options gamma, credit complacency. +Then state the GAP: is price moving with the rational read, ahead of it, \ +or against it? If they agree, say so briefly and move on. If they diverge \ +— price moving on irrational drivers while fundamentals say otherwise, or \ +vice versa — name the divergence explicitly. Those gaps are where the next \ +regime change starts and are the whole point of this log. +A paragraph that names only price action or only fundamentals, without \ +both lenses, is incomplete and must be rewritten. + +# Discipline +- No emojis, no marketing language, no "concerning" or "unprecedented" \ +without a specific number behind it. +- Concrete > vague. "AMD +113% since the anchor" beats "AI stocks up sharply". +- Distinguish "the thesis predicted X and X happened" from "the thesis \ +predicted X and X did not happen". Both are useful; conflating them is not. +- Don't repeat the same point in different words across paragraphs. +- No buy/sell recommendations. Triggers are pre-set elsewhere; your job is \ +to report whether reality is confirming, modifying, or refuting the thesis. + +# Stance (educational, anti-TA, anti-gambling) +The target reader is most likely young, new to investing, and at risk of \ +treating markets like a horse race they need to "read" via chart patterns. \ +Cassandra is the corrective. +- **No technical analysis.** Head-and-shoulders, RSI thresholds, Fibonacci \ +levels, Elliott waves, "support/resistance" — these are descriptions of past \ +crowd behaviour, not predictions. Don't use them; don't legitimise them. If \ +you mention a price level, frame it as a positioning fact (e.g. "the level \ +where the latest tranche of buyers entered"), not a signal. +- **No gambling framing.** Markets are not a coin flip and not a horse race. 
\ +Never present a position as a single decisive moment, a "now or never", or a \ +bet to be won. Every read should follow the shape: *regime → implication → \ +what would change the regime*. +- **Macro causality, every time.** Price moves get explained through \ +fundamentals, geopolitics, monetary policy, and structural shifts — not \ +chart shapes. Even short paragraphs need the cause, not just the effect. + +# System temperature (closing line, mandatory) +Close the log with a single sentence on a line of its own, formatted exactly: + + System temperature: [cool|neutral|elevated|hot|extreme] — [one clause naming the 2-3 specific divergences or readings that justify the label] + +This is the line a reader who only sees the watch list scrolls down to. Make \ +it earn its place: cite real signals (HY OAS, breadth, VIX, valuation, real \ +yields), not vibes. + +# Update mode (when an earlier log from today is provided) +If the user message includes a section labelled "Earlier log from today \ +(generated HH:MM UTC)", treat that as YOUR OWN earlier draft. You are \ +UPDATING it for the current data, not starting from scratch. +- Don't restate context that hasn't changed. Anchor on what's moved SINCE \ +that timestamp: confirmations, refutations, new emergent patterns. +- The TL;DR should lead with the move since the earlier read when there \ +was a meaningful intra-day change ("Since this morning's read, …") — \ +otherwise stay regime-level. +- The watch list should evolve: drop items that triggered or settled, add \ +items that emerged. Keep items still load-bearing. +- Preserve any insights from the earlier draft that remain valid; sharpen \ +or revise the ones that don't. 
Avoid contradicting yourself silently — if \ +you change a stance, name it briefly ("Earlier I read X; with Y now, the \ +read shifts to Z").""" + + +# --- Tone: audience-shaping block -------------------------------------------- + +_TONE: dict[str, str] = { + "NOVICE": """# Audience: novice — likely a young investor new to markets +This reader probably arrived from social media, treats charts as predictions, \ +and is one bad week away from quitting. Your job is to **educate them out of \ +the gambling mindset** without ever being preachy. Calm, patient, slightly \ +teacherly. Never condescending. + +- **Define jargon the first time it appears.** A short clause in parentheses \ +is fine: "yield curve (the chart of borrowing costs across different \ +maturities)", "ERP (equity risk premium — the extra return investors demand \ +for owning stocks instead of safe bonds)", "basis point (one hundredth of a \ +percent — 25bp = 0.25%)". +- **Avoid ticker shorthand without context.** Use "Apple (AAPL)" on first \ +mention, then "Apple" or the ticker after. +- **Everyday phrasing over jargon** where the meaning survives: "the price \ +of US government debt fell, pushing yields up" rather than "the long end \ +backed up"; "investors are paying more for the same earnings" rather than \ +"multiple expansion". +- **One analogy per concept, used sparingly.** Use them to bridge to \ +something concrete the reader already understands — not to entertain. + +# Educational teach-backs (NOVICE-specific, when warranted) +When the day's data makes a common misconception concrete, drop in ONE \ +teach-back of one to two sentences. Don't force it. Don't moralise. Examples \ +of moments to do this: + +- Anyone treating chart patterns as predictions: \ +"Patterns like head-and-shoulders describe what crowds did, not what they \ +will do — they're stories told after the fact, not edges." 
+- Anyone fixated on day-to-day moves: \ +"A 1% one-day move in a stock is roughly what you'd expect by chance. The \ +multi-week trend is where the information lives." +- Anyone treating one ticker as a coin flip: \ +"A single name's monthly move is mostly noise. The regime — what bonds, the \ +dollar, and credit are doing together — tells you whether ANY stock is \ +likely to drift up or down." +- Anyone trying to "time the bottom" or "buy the dip": \ +"Catching the bottom is a different game from owning the next cycle. The \ +first needs you to be right within days; the second needs you to be roughly \ +right within years." + +Limit yourself to one teach-back per log. Skip them entirely if the day's \ +data doesn't naturally invite one. + +# Length +Target ~700 words. Slightly more than INTERMEDIATE because explanations \ +need breathing room.""", + + "INTERMEDIATE": """# Audience: intermediate — reads the news, learning to \ +connect macro to markets +Assume the reader knows market basics (yield curves, breakevens, HY OAS, \ +sector ETFs, the difference between cyclical and defensive, what a basis \ +point is). Use common terms without defining them, but stay clear of deep \ +institutional shorthand ("the belly", "duration trade", "carry pickup", \ +"the RV book", "off-the-run"). + +Light-touch educational nudges are welcome when the day's data warrants — \ +e.g. "with rates this volatile, technical levels in equities are mostly \ +distraction" — but keep them to a passing clause, not a paragraph. Don't \ +moralise. + +# Length +Target ~600 words. Lean and clear, no padding.""", +} + + +# Legacy values map to the closest current value. Logs a warning so we can +# notice if some caller's config didn't get updated. +_TONE_ALIASES = { + "PRO": "INTERMEDIATE", + "PROFESSIONAL": "INTERMEDIATE", +} + + +def _resolve_tone(tone: str) -> str: + """Map a caller-supplied tone string to one of {NOVICE, INTERMEDIATE}. + + Unknown tones fall back to INTERMEDIATE. 
The legacy PRO value is mapped + to INTERMEDIATE (audience pivot, see PROMPT_VERSION v6 notes).""" + upper = (tone or "").upper().strip() + if upper in _TONE: + return upper + if upper in _TONE_ALIASES: + return _TONE_ALIASES[upper] + return "INTERMEDIATE" + + +# --- Analysis: forward-vs-backward focus ------------------------------------- + +_ANALYSIS: dict[str, str] = { + "DRY": """# Analysis style: dry +Report what happened. Identify divergences and contradictions. Compare to \ +references. Do not speculate on what comes next. Forward-looking statements \ +are limited to "what would invalidate the read" — never "we expect X to \ +happen". The watch list contains items to monitor, not predictions.""", + + "SPECULATIVE": """# Analysis style: speculative +Report what happened, then explicitly explore forward scenarios. For each \ +significant sector or theme, sketch a 1-4 week scenario set: the base case \ +(what the data suggests), a contrarian case (what would invalidate it), and \ +what tape signal would tip you from one to the other. Be explicit about \ +uncertainty — say "the base case is" not "X will happen". The watch list is \ +the trip-wires that decide between scenarios.""", +} + + +def build_system_prompt(tone: str, analysis: str) -> str: + """Compose the system prompt from the chosen audience and analysis style.""" + tone_block = _TONE[_resolve_tone(tone)] + analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) + return "\n\n".join([_CORE, tone_block, analysis_block]) + + +# Backwards-compat: a default-composed SYSTEM_PROMPT for tests / callers that +# don't yet pass tone/analysis. New callers should call build_system_prompt(). +SYSTEM_PROMPT = build_system_prompt("INTERMEDIATE", "SPECULATIVE") + + +# --- Chat-mode overrides (sidebar on /log) ----------------------------------- + +_CHAT_OVERRIDES = """# Chat mode (overrides the log-structure rules above) +You are NOT writing a daily log right now. 
The user is asking a specific +question via the chat sidebar. +- Forget the date header, TL;DR, sectional structure, and watch list. Just answer. +- Typical response: 200-400 words. Longer only if the question genuinely + warrants it. +- Cite specific numbers and named headlines from the reference materials + below whenever relevant. If a number isn't in the context, don't invent it. +- If a question is outside the provided context (e.g. asking about a stock or + event not in the data), say so plainly rather than speculating from prior + knowledge. +- No buy/sell recommendations. If asked, redirect to thesis and scenarios. +- Keep the same audience and analysis discipline established above.""" + + +def build_summary_system_prompt(tone: str, analysis: str) -> str: + """A lean, focused system prompt for the per-indicator-group hourly + summary. INTERPRETATION not description — the reader has the table + next to this paragraph; they don't need numbers recited at them.""" + tone_block = _TONE[_resolve_tone(tone)] + analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) + return f"""You write a TINY interpretation (≤60 words, 2-3 sentences) \ +of ONE indicator group for a strategic markets dashboard. + +# What this is for +The reader is looking at the table of numbers right next to your text. \ +They can see the values. They CANNOT see the meaning. Your job is to \ +**explain what the data means**, not to recite it. Each sentence should be \ +a regime-level interpretation, a fundamental driver identification, or a \ +cross-indicator implication — not a description of moves. + +# Rational vs irrational lens (required at this length too) +Even at 2-3 sentences, contrast what the underlying factors justify \ +(rational: fundamentals, policy, valuation) with what the crowd is doing \ +(irrational: positioning, narrative, flows) whenever the two diverge. If \ +they don't diverge, say so in one clause. 
Never just describe the move \ +without placing it on this axis. + +# Hard constraints +- Plain prose, ONE paragraph. No markdown, no headers, no lists, no labels. +- Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ +"We need to", "We are asked", "Here's", "Let me", "Let's", "Sure", "Looking \ +at", "Based on", "Summary:", "The data shows", "First", "To address". No \ +meta-commentary at all. +- Cite at most 2-3 specific numbers and ONLY when they anchor an \ +interpretation. Don't list moves; explain them. +- Multi-week / multi-month horizon. 1-day moves under 2% are noise — skip. +- No buy/sell language. No predictions. No watch list. No TL;DR. No date \ +header. No "system temperature" line — that belongs to the full daily log. +- Output the read directly. Do NOT include phrases like "Example", "Good \ +example", "Bad example", "Reference", or any meta-framing of your output. + +{tone_block} + +{analysis_block} +""" + + +def build_summary_user_prompt(group_name: str, quotes: list[dict]) -> str: + parts = [ + f"# Group: {group_name}", + "Indicators (latest reading + 1d/1m/1y/since-anchor change):", + "```json", + json.dumps(quotes, indent=2, default=str)[:12000], + "```", + "\nWrite the 2-3 sentence read for this group now.", + ] + return "\n".join(parts) + + +def build_aggregate_summary_system_prompt(tone: str, analysis: str) -> str: + """System prompt for the cross-group aggregate read shown on the dashboard. + Wider lens than a per-group summary — synthesise across all groups.""" + tone_block = _TONE[_resolve_tone(tone)] + analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) + return f"""You write a single SHORT cross-asset INTERPRETATION (≤80 \ +words, 2-4 sentences) for the dashboard header. The reader is glancing — \ +give them the meaning of the whole tape, not a recap. + +# What this is for +The reader can see every indicator on the dashboard below this paragraph. \ +Your job is NOT to summarise the moves. 
It is to explain what the moves, \ +**taken together as a system**, mean: which regime is being signalled, \ +which divergences are load-bearing, what fundamental story the cross-asset \ +behaviour tells. + +# Rational vs irrational lens (required at this length too) +The cross-asset tape's value is in the gap between what the underlying \ +factors justify (rational: fundamentals, policy, valuation) and what the \ +crowd is actually doing (irrational: positioning, narrative momentum, \ +flows). At least one of the 2-4 sentences must name this gap or, if the \ +two cohere, explicitly say so. + +# Hard constraints +- Plain prose, ONE paragraph. No markdown, headers, lists, or labels. +- Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ +"We need to", "Here's", "Let me", "Looking at", "Based on", "Sure", "Summary:", \ +"The data shows", "Across the board". No meta-commentary. +- Identify the single most important **cross-asset implication**: e.g. \ +"rates and credit disagree", "equities outrun fundamentals", "geopolitical \ +risk premium is in commodities but not vol". Cite no more than 3 specific \ +numbers, and only as anchors for the interpretation. +- Multi-week / multi-month horizon. 1-day moves under 2% are noise. +- No buy/sell language. No predictions of specific levels. +- Output the read directly. Do NOT include phrases like "Example", "Good \ +example", "Bad example", "Reference", or any meta-framing of your output. 
+ +{tone_block} + +{analysis_block} +""" + + +def build_aggregate_summary_user_prompt(quotes_by_group: dict[str, list[dict]]) -> str: + parts = [ + "# All indicator groups (latest readings + change windows)", + "```json", + json.dumps(quotes_by_group, indent=2, default=str)[:20000], + "```", + "\nWrite the cross-asset aggregate read now.", + ] + return "\n".join(parts) + + +def build_chat_system_prompt( + tone: str, + analysis: str, + *, + log_content: str | None, + log_generated_at: datetime | None, + quotes_by_group: dict[str, list[dict]], + headlines: list[dict], + reference_line: str | None = None, +) -> str: + """Composed system prompt for the /log chat sidebar. Carries the user's + chosen tone + analysis style and inlines the latest log + market data + + headlines as reference material the model can cite from.""" + parts = [build_system_prompt(tone, analysis), "", _CHAT_OVERRIDES, ""] + if reference_line: + parts.append(f"# Doc reference snapshot\n{reference_line}\n") + if log_content: + ts = log_generated_at.strftime("%Y-%m-%d %H:%M UTC") if log_generated_at else "n/a" + parts.append(f"# Latest strategic log (generated {ts})\n\n{log_content}\n") + parts.append("# Live market data") + parts.append( + "```json\n" + json.dumps(quotes_by_group, indent=2, default=str)[:25000] + "\n```" + ) + parts.append("# Recent headlines (last 24h, thesis-filtered top 50)") + for h in headlines[:50]: + parts.append(f"- [{h['source']}] {h['title']}") + return "\n".join(parts) + + +def build_user_prompt( + *, + today: datetime, + anchor: str | None, + quotes_by_group: dict[str, list[dict]], + headlines_by_bucket: dict[str, list[dict]], + reference_line: str | None = None, + previous_log: object | None = None, +) -> str: + """Assemble the user message from already-fetched-and-persisted data. 
+ If `previous_log` is a StrategicLog from earlier today, it's included + as 'Update mode' context — the model will revise rather than restart.""" + parts = [ + f"# Strategic log request — {today.strftime('%Y-%m-%d')}", + # Explicit current time so the model doesn't hallucinate one. The + # date header it writes MUST stay date-only (per system prompt). + f"Current time: {today.strftime('%Y-%m-%d %H:%M UTC')}", + ] + if anchor: + parts.append(f"Anchor reference date: {anchor}") + if reference_line: + parts.append( + "\n## Reference snapshot (when the macro thesis was authored)" + f"\n{reference_line}\nCompare live readings against it." + ) + + if previous_log is not None: + gen = getattr(previous_log, "generated_at", None) + ts = gen.strftime("%H:%M UTC") if gen else "earlier today" + parts.append( + f"\n## Earlier log from today (generated {ts})\n" + "Treat this as YOUR OWN earlier draft for today. Update it for\n" + "the current data — don't restate unchanged context. See the\n" + "'Update mode' section of the system prompt for how to handle it.\n" + "```markdown\n" + f"{previous_log.content}\n" + "```" + ) + + parts.append("\n## Live market data (per group)") + parts.append("```json\n" + json.dumps(quotes_by_group, indent=2, default=str) + "\n```") + parts.append("\n## News flow (last 24h, filtered by bucket)") + for label, items in headlines_by_bucket.items(): + if not items: + continue + parts.append(f"\n### {label.upper()}") + for h in items[:30]: + parts.append(f"- [{h['when'][:16].replace('T',' ')}] [{h['source']}] {h['title']}") + + task_line = ( + "\n## Task\nWrite the daily strategic log in ~800 words, following " + "the discipline in the system prompt. No preamble; begin directly " + "with the date header." + ) + if previous_log is not None: + task_line = ( + "\n## Task\nUpdate the earlier log above for the current data. 
" + "Keep the same structure (date header, TL;DR, sections, watch " + "list, system temperature) but anchor on what has CHANGED since " + "the earlier draft's timestamp. ~800 words. No preamble." + ) + parts.append(task_line) + return "\n".join(parts) + + +def _digest_tone_clause(tone: str) -> str: + if tone.upper() == "NOVICE": + return "Use plain English. Define any jargon on first use." + return "Write for a reader who already speaks markets fluently." + + +def build_daily_digest_prompt( + *, + tone: str, + today, + quotes_by_group: dict, + headlines_by_bucket: dict, + reference_line: str, +) -> tuple[str, str]: + """System + user prompt for the once-a-day editorial digest. + + Different from the hourly log: the daily digest reflects on the past + 24h and looks forward to the upcoming session. Longer, less + 'live-blogging,' more contextual. Target ~600 words.""" + system = ( + "You write the daily editorial digest for Read the Markets. " + f"Audience tone: {tone.upper()}. {_digest_tone_clause(tone)} " + "Cover: (1) what mattered yesterday, (2) what to watch in today's " + "EU and US sessions, (3) one cross-asset thread connecting them. " + "No predictions of price level, no buy/sell language. Target ~600 " + "words. Output HTML using only

    ,

    ,
      ,
    • , , " + " — no , , or wrapper, no inline styles." + ) + user = _digest_user_prompt( + today=today, quotes_by_group=quotes_by_group, + headlines_by_bucket=headlines_by_bucket, reference_line=reference_line, + ) + return system, user + + +def build_weekly_digest_prompt( + *, + tone: str, + today, + quotes_by_group: dict, + headlines_by_bucket: dict, + reference_line: str, +) -> tuple[str, str]: + """System + user prompt for the Sunday weekly recap + look-ahead. + + Sent to ALL opt-in users (free and paid). Target ~900 words.""" + system = ( + "You write the Sunday weekly digest for Read the Markets. " + f"Audience tone: {tone.upper()}. {_digest_tone_clause(tone)} " + "Cover: (1) the week behind — what moved and why, " + "(2) the week ahead — releases, earnings, central-bank meetings, " + "(3) the cross-asset story to keep in mind. " + "No predictions of price level, no buy/sell language. Target ~900 " + "words. Output HTML using only

      ,

      ,
        ,
      • , , " + " — no , , or wrapper, no inline styles." + ) + user = _digest_user_prompt( + today=today, quotes_by_group=quotes_by_group, + headlines_by_bucket=headlines_by_bucket, reference_line=reference_line, + ) + return system, user + + +def _digest_user_prompt( + *, + today, + quotes_by_group: dict, + headlines_by_bucket: dict, + reference_line: str, +) -> str: + """Shared user-message body used by both digest prompts. Same data + shape as the hourly user prompt; reformatted for the digest context.""" + today_str = today.strftime("%A %d %B %Y") if hasattr(today, "strftime") else str(today) + lines = [f"TODAY (UTC): {today_str}", "", f"REFERENCE: {reference_line}", ""] + + if headlines_by_bucket: + lines.append("HEADLINES BY CATEGORY") + for cat, items in headlines_by_bucket.items(): + lines.append(f" [{cat}]") + for h in items[:30]: + when = h.get("when", "") + src = h.get("source", "") + title = h.get("title", "") + lines.append(f" {when} · {src} · {title}") + lines.append("") + + if quotes_by_group: + lines.append("LATEST QUOTES BY GROUP") + for grp, items in quotes_by_group.items(): + lines.append(f" [{grp}]") + for q in items[:30]: + sym = q.get("symbol", "") + price = q.get("price", "") + lbl = q.get("label", "") + ccy = q.get("currency", "") + lines.append(f" {sym} ({lbl}) — {price} {ccy}") + lines.append("") + + return "\n".join(lines) diff --git a/app/services/openrouter.py b/app/services/openrouter.py index ff3215e..c1ddb4f 100644 --- a/app/services/openrouter.py +++ b/app/services/openrouter.py @@ -1,8 +1,8 @@ -"""Strategic-log generator — DB-fed, OpenRouter-backed. +"""LLM transport layer — OpenRouter / DeepSeek API calls. -Ported from /home/gg/ownCloud/Family/Finances/Wealth/strategic_log.py. The -system prompt is preserved verbatim (the voice we converged on). The user -prompt is now built from DB rows, not from subprocess JSON dumps. +Handles provider selection, retry + fallback machinery, and the monthly +budget-cap helpers. 
Prompt engineering lives in ``app.services.llm_prompts``; +this module only cares about *how* to reach the model, not *what to ask*. """ from __future__ import annotations @@ -18,420 +18,6 @@ from app.config import get_settings OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions" -# Bump when the composed prompt changes meaningfully. Stored on every -# StrategicLog row so historical logs can be linked to the prompt that produced -# them. -# -# v6 (2026-05-17): TONE shrinks to NOVICE | INTERMEDIATE (PRO dropped). New -# educational stance baked into _CORE — explicit anti-TA, anti-gambling-mindset -# framing aimed at young investors entering the trading world. NOVICE retuned -# to be pedagogical (defining terms, anti-pattern teach-backs); INTERMEDIATE -# kept terse but with light-touch educational nudges. See tasks/todo.md. -# v7 (2026-05-18): Forbid "(Updated HH:MM UTC)" clauses in the date header — -# the model was hallucinating future times. The user prompt now carries the -# actual current UTC time so the model has accurate temporal context. -# v9 (2026-05-25): Adds daily + weekly digest prompt builders for email. -PROMPT_VERSION = 9 - - -# --- Core: invariant across tone/analysis settings ---------------------------- - -_CORE = """You are Cassandra, writing a single daily strategic markets log \ -for one specific investor. Synthesis, not exposition. - -# Lens -- Geopolitics → markets is the primary causal chain. For each sector move, \ -ask: geopolitical, cyclical, or idiosyncratic. Label it. -- Divergences and contradictions are where the information is. Hunt for them. -- Absence of expected moves is signal. If the thesis predicted a reaction \ -that didn't happen, that's more interesting than the reactions that did. -- Compare live readings against any reference snapshots provided. - -# Multi-source news -- When state-aligned outlets (Xinhua, China Daily, RT) and Western outlets \ -cover the same event, read the gap in framing — that's the data. 
-- News matters only insofar as it changes a market read. Color without \ -implications is filler. - -# Structure -- One-line date header containing ONLY the date (e.g. `2026-05-18`) and \ -optional anchor framing on the same line (e.g. "Week 11 since Hormuz"). \ -**Never include a time-of-day clause like "(Updated 21:30 UTC)"** — \ -generation time is recorded as metadata elsewhere. Inventing a future or \ -arbitrary time in the header confuses readers. -- Immediately after the date header — with **nothing** in between — write a \ -TL;DR. Format it as: - - ## TL;DR - - One concise paragraph of 2-3 sentences, **≤60 words total**, naming the \ -single most important read or divergence of the day with concrete numbers. \ -This is what a reader who only has 10 seconds sees. Don't waste it on the \ -weather or generic context. - -- Then 4-6 paragraphs, each anchored on a sleeve, sector, or theme. Concrete \ -numbers in every paragraph. No section over ~150 words. -- One paragraph synthesising the news flow into a market read. -- End with a watch list: 3-5 specific items to track in the next week, \ -each one sentence. - -# Time-horizon discipline -- This is a STRATEGIC log, not a day-trader's read. Treat 1-day moves under \ -2% as background noise; mention them only when they break or confirm a \ -multi-week trend or are extreme outliers. -- Anchor every claim to multi-week (1m), multi-month (since-anchor), or \ -multi-year (1y) changes — not 1d. If the only thing happening is a 1d move, \ -omit the paragraph. -- The watch list is for "structural tripwires over the next 1-3 months", not \ -"things to watch tomorrow". Each watch item should name a level/threshold \ -whose breach would change the regime, not a calendar-date event. - -# Rational vs irrational framing (MANDATORY in every paragraph) -The reader's primary goal is to disconnect rational decisions from market \ -irrationality. 
This is the single most important lens of the log — it MUST \ -appear in every sector or theme paragraph, not just where it feels natural. \ -For each paragraph, before writing it, ask yourself the two questions and \ -then make both answers visible in the prose: -- The RATIONAL drivers — what the underlying factors justify: earnings, \ -real-economy data, monetary policy, structural geopolitical shifts, \ -valuation vs fundamentals. -- The IRRATIONAL drivers — what the crowd is doing regardless of fundamentals: \ -positioning, narrative momentum, sentiment extremes, concentration, \ -flow-driven moves, options gamma, credit complacency. -Then state the GAP: is price moving with the rational read, ahead of it, \ -or against it? If they agree, say so briefly and move on. If they diverge \ -— price moving on irrational drivers while fundamentals say otherwise, or \ -vice versa — name the divergence explicitly. Those gaps are where the next \ -regime change starts and are the whole point of this log. -A paragraph that names only price action or only fundamentals, without \ -both lenses, is incomplete and must be rewritten. - -# Discipline -- No emojis, no marketing language, no "concerning" or "unprecedented" \ -without a specific number behind it. -- Concrete > vague. "AMD +113% since the anchor" beats "AI stocks up sharply". -- Distinguish "the thesis predicted X and X happened" from "the thesis \ -predicted X and X did not happen". Both are useful; conflating them is not. -- Don't repeat the same point in different words across paragraphs. -- No buy/sell recommendations. Triggers are pre-set elsewhere; your job is \ -to report whether reality is confirming, modifying, or refuting the thesis. - -# Stance (educational, anti-TA, anti-gambling) -The target reader is most likely young, new to investing, and at risk of \ -treating markets like a horse race they need to "read" via chart patterns. \ -Cassandra is the corrective. 
-- **No technical analysis.** Head-and-shoulders, RSI thresholds, Fibonacci \ -levels, Elliott waves, "support/resistance" — these are descriptions of past \ -crowd behaviour, not predictions. Don't use them; don't legitimise them. If \ -you mention a price level, frame it as a positioning fact (e.g. "the level \ -where the latest tranche of buyers entered"), not a signal. -- **No gambling framing.** Markets are not a coin flip and not a horse race. \ -Never present a position as a single decisive moment, a "now or never", or a \ -bet to be won. Every read should follow the shape: *regime → implication → \ -what would change the regime*. -- **Macro causality, every time.** Price moves get explained through \ -fundamentals, geopolitics, monetary policy, and structural shifts — not \ -chart shapes. Even short paragraphs need the cause, not just the effect. - -# System temperature (closing line, mandatory) -Close the log with a single sentence on a line of its own, formatted exactly: - - System temperature: [cool|neutral|elevated|hot|extreme] — [one clause naming the 2-3 specific divergences or readings that justify the label] - -This is the line a reader who only sees the watch list scrolls down to. Make \ -it earn its place: cite real signals (HY OAS, breadth, VIX, valuation, real \ -yields), not vibes. - -# Update mode (when an earlier log from today is provided) -If the user message includes a section labelled "Earlier log from today \ -(generated HH:MM UTC)", treat that as YOUR OWN earlier draft. You are \ -UPDATING it for the current data, not starting from scratch. -- Don't restate context that hasn't changed. Anchor on what's moved SINCE \ -that timestamp: confirmations, refutations, new emergent patterns. -- The TL;DR should lead with the move since the earlier read when there \ -was a meaningful intra-day change ("Since this morning's read, …") — \ -otherwise stay regime-level. 
-- The watch list should evolve: drop items that triggered or settled, add \ -items that emerged. Keep items still load-bearing. -- Preserve any insights from the earlier draft that remain valid; sharpen \ -or revise the ones that don't. Avoid contradicting yourself silently — if \ -you change a stance, name it briefly ("Earlier I read X; with Y now, the \ -read shifts to Z").""" - - -# --- Tone: audience-shaping block -------------------------------------------- - -_TONE: dict[str, str] = { - "NOVICE": """# Audience: novice — likely a young investor new to markets -This reader probably arrived from social media, treats charts as predictions, \ -and is one bad week away from quitting. Your job is to **educate them out of \ -the gambling mindset** without ever being preachy. Calm, patient, slightly \ -teacherly. Never condescending. - -- **Define jargon the first time it appears.** A short clause in parentheses \ -is fine: "yield curve (the chart of borrowing costs across different \ -maturities)", "ERP (equity risk premium — the extra return investors demand \ -for owning stocks instead of safe bonds)", "basis point (one hundredth of a \ -percent — 25bp = 0.25%)". -- **Avoid ticker shorthand without context.** Use "Apple (AAPL)" on first \ -mention, then "Apple" or the ticker after. -- **Everyday phrasing over jargon** where the meaning survives: "the price \ -of US government debt fell, pushing yields up" rather than "the long end \ -backed up"; "investors are paying more for the same earnings" rather than \ -"multiple expansion". -- **One analogy per concept, used sparingly.** Use them to bridge to \ -something concrete the reader already understands — not to entertain. - -# Educational teach-backs (NOVICE-specific, when warranted) -When the day's data makes a common misconception concrete, drop in ONE \ -teach-back of one to two sentences. Don't force it. Don't moralise. 
Examples \ -of moments to do this: - -- Anyone treating chart patterns as predictions: \ -"Patterns like head-and-shoulders describe what crowds did, not what they \ -will do — they're stories told after the fact, not edges." -- Anyone fixated on day-to-day moves: \ -"A 1% one-day move in a stock is roughly what you'd expect by chance. The \ -multi-week trend is where the information lives." -- Anyone treating one ticker as a coin flip: \ -"A single name's monthly move is mostly noise. The regime — what bonds, the \ -dollar, and credit are doing together — tells you whether ANY stock is \ -likely to drift up or down." -- Anyone trying to "time the bottom" or "buy the dip": \ -"Catching the bottom is a different game from owning the next cycle. The \ -first needs you to be right within days; the second needs you to be roughly \ -right within years." - -Limit yourself to one teach-back per log. Skip them entirely if the day's \ -data doesn't naturally invite one. - -# Length -Target ~700 words. Slightly more than INTERMEDIATE because explanations \ -need breathing room.""", - - "INTERMEDIATE": """# Audience: intermediate — reads the news, learning to \ -connect macro to markets -Assume the reader knows market basics (yield curves, breakevens, HY OAS, \ -sector ETFs, the difference between cyclical and defensive, what a basis \ -point is). Use common terms without defining them, but stay clear of deep \ -institutional shorthand ("the belly", "duration trade", "carry pickup", \ -"the RV book", "off-the-run"). - -Light-touch educational nudges are welcome when the day's data warrants — \ -e.g. "with rates this volatile, technical levels in equities are mostly \ -distraction" — but keep them to a passing clause, not a paragraph. Don't \ -moralise. - -# Length -Target ~600 words. Lean and clear, no padding.""", -} - - -# Legacy values map to the closest current value. Logs a warning so we can -# notice if some caller's config didn't get updated. 
-_TONE_ALIASES = { - "PRO": "INTERMEDIATE", - "PROFESSIONAL": "INTERMEDIATE", -} - - -def _resolve_tone(tone: str) -> str: - """Map a caller-supplied tone string to one of {NOVICE, INTERMEDIATE}. - - Unknown tones fall back to INTERMEDIATE. The legacy PRO value is mapped - to INTERMEDIATE (audience pivot, see PROMPT_VERSION v6 notes).""" - upper = (tone or "").upper().strip() - if upper in _TONE: - return upper - if upper in _TONE_ALIASES: - return _TONE_ALIASES[upper] - return "INTERMEDIATE" - - -# --- Analysis: forward-vs-backward focus ------------------------------------- - -_ANALYSIS: dict[str, str] = { - "DRY": """# Analysis style: dry -Report what happened. Identify divergences and contradictions. Compare to \ -references. Do not speculate on what comes next. Forward-looking statements \ -are limited to "what would invalidate the read" — never "we expect X to \ -happen". The watch list contains items to monitor, not predictions.""", - - "SPECULATIVE": """# Analysis style: speculative -Report what happened, then explicitly explore forward scenarios. For each \ -significant sector or theme, sketch a 1-4 week scenario set: the base case \ -(what the data suggests), a contrarian case (what would invalidate it), and \ -what tape signal would tip you from one to the other. Be explicit about \ -uncertainty — say "the base case is" not "X will happen". The watch list is \ -the trip-wires that decide between scenarios.""", -} - - -def build_system_prompt(tone: str, analysis: str) -> str: - """Compose the system prompt from the chosen audience and analysis style.""" - tone_block = _TONE[_resolve_tone(tone)] - analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) - return "\n\n".join([_CORE, tone_block, analysis_block]) - - -# Backwards-compat: a default-composed SYSTEM_PROMPT for tests / callers that -# don't yet pass tone/analysis. New callers should call build_system_prompt(). 
-SYSTEM_PROMPT = build_system_prompt("INTERMEDIATE", "SPECULATIVE") - - -# --- Chat-mode overrides (sidebar on /log) ----------------------------------- - -_CHAT_OVERRIDES = """# Chat mode (overrides the log-structure rules above) -You are NOT writing a daily log right now. The user is asking a specific -question via the chat sidebar. -- Forget the date header, TL;DR, sectional structure, and watch list. Just answer. -- Typical response: 200-400 words. Longer only if the question genuinely - warrants it. -- Cite specific numbers and named headlines from the reference materials - below whenever relevant. If a number isn't in the context, don't invent it. -- If a question is outside the provided context (e.g. asking about a stock or - event not in the data), say so plainly rather than speculating from prior - knowledge. -- No buy/sell recommendations. If asked, redirect to thesis and scenarios. -- Keep the same audience and analysis discipline established above.""" - - -def build_summary_system_prompt(tone: str, analysis: str) -> str: - """A lean, focused system prompt for the per-indicator-group hourly - summary. INTERPRETATION not description — the reader has the table - next to this paragraph; they don't need numbers recited at them.""" - tone_block = _TONE[_resolve_tone(tone)] - analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) - return f"""You write a TINY interpretation (≤60 words, 2-3 sentences) \ -of ONE indicator group for a strategic markets dashboard. - -# What this is for -The reader is looking at the table of numbers right next to your text. \ -They can see the values. They CANNOT see the meaning. Your job is to \ -**explain what the data means**, not to recite it. Each sentence should be \ -a regime-level interpretation, a fundamental driver identification, or a \ -cross-indicator implication — not a description of moves. 
- -# Rational vs irrational lens (required at this length too) -Even at 2-3 sentences, contrast what the underlying factors justify \ -(rational: fundamentals, policy, valuation) with what the crowd is doing \ -(irrational: positioning, narrative, flows) whenever the two diverge. If \ -they don't diverge, say so in one clause. Never just describe the move \ -without placing it on this axis. - -# Hard constraints -- Plain prose, ONE paragraph. No markdown, no headers, no lists, no labels. -- Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ -"We need to", "We are asked", "Here's", "Let me", "Let's", "Sure", "Looking \ -at", "Based on", "Summary:", "The data shows", "First", "To address". No \ -meta-commentary at all. -- Cite at most 2-3 specific numbers and ONLY when they anchor an \ -interpretation. Don't list moves; explain them. -- Multi-week / multi-month horizon. 1-day moves under 2% are noise — skip. -- No buy/sell language. No predictions. No watch list. No TL;DR. No date \ -header. No "system temperature" line — that belongs to the full daily log. -- Output the read directly. Do NOT include phrases like "Example", "Good \ -example", "Bad example", "Reference", or any meta-framing of your output. - -{tone_block} - -{analysis_block} -""" - - -def build_summary_user_prompt(group_name: str, quotes: list[dict]) -> str: - parts = [ - f"# Group: {group_name}", - "Indicators (latest reading + 1d/1m/1y/since-anchor change):", - "```json", - json.dumps(quotes, indent=2, default=str)[:12000], - "```", - "\nWrite the 2-3 sentence read for this group now.", - ] - return "\n".join(parts) - - -def build_aggregate_summary_system_prompt(tone: str, analysis: str) -> str: - """System prompt for the cross-group aggregate read shown on the dashboard. 
- Wider lens than a per-group summary — synthesise across all groups.""" - tone_block = _TONE[_resolve_tone(tone)] - analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) - return f"""You write a single SHORT cross-asset INTERPRETATION (≤80 \ -words, 2-4 sentences) for the dashboard header. The reader is glancing — \ -give them the meaning of the whole tape, not a recap. - -# What this is for -The reader can see every indicator on the dashboard below this paragraph. \ -Your job is NOT to summarise the moves. It is to explain what the moves, \ -**taken together as a system**, mean: which regime is being signalled, \ -which divergences are load-bearing, what fundamental story the cross-asset \ -behaviour tells. - -# Rational vs irrational lens (required at this length too) -The cross-asset tape's value is in the gap between what the underlying \ -factors justify (rational: fundamentals, policy, valuation) and what the \ -crowd is actually doing (irrational: positioning, narrative momentum, \ -flows). At least one of the 2-4 sentences must name this gap or, if the \ -two cohere, explicitly say so. - -# Hard constraints -- Plain prose, ONE paragraph. No markdown, headers, lists, or labels. -- Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ -"We need to", "Here's", "Let me", "Looking at", "Based on", "Sure", "Summary:", \ -"The data shows", "Across the board". No meta-commentary. -- Identify the single most important **cross-asset implication**: e.g. \ -"rates and credit disagree", "equities outrun fundamentals", "geopolitical \ -risk premium is in commodities but not vol". Cite no more than 3 specific \ -numbers, and only as anchors for the interpretation. -- Multi-week / multi-month horizon. 1-day moves under 2% are noise. -- No buy/sell language. No predictions of specific levels. -- Output the read directly. 
Do NOT include phrases like "Example", "Good \ -example", "Bad example", "Reference", or any meta-framing of your output. - -{tone_block} - -{analysis_block} -""" - - -def build_aggregate_summary_user_prompt(quotes_by_group: dict[str, list[dict]]) -> str: - parts = [ - "# All indicator groups (latest readings + change windows)", - "```json", - json.dumps(quotes_by_group, indent=2, default=str)[:20000], - "```", - "\nWrite the cross-asset aggregate read now.", - ] - return "\n".join(parts) - - -def build_chat_system_prompt( - tone: str, - analysis: str, - *, - log_content: str | None, - log_generated_at: datetime | None, - quotes_by_group: dict[str, list[dict]], - headlines: list[dict], - reference_line: str | None = None, -) -> str: - """Composed system prompt for the /log chat sidebar. Carries the user's - chosen tone + analysis style and inlines the latest log + market data + - headlines as reference material the model can cite from.""" - parts = [build_system_prompt(tone, analysis), "", _CHAT_OVERRIDES, ""] - if reference_line: - parts.append(f"# Doc reference snapshot\n{reference_line}\n") - if log_content: - ts = log_generated_at.strftime("%Y-%m-%d %H:%M UTC") if log_generated_at else "n/a" - parts.append(f"# Latest strategic log (generated {ts})\n\n{log_content}\n") - parts.append("# Live market data") - parts.append( - "```json\n" + json.dumps(quotes_by_group, indent=2, default=str)[:25000] + "\n```" - ) - parts.append("# Recent headlines (last 24h, thesis-filtered top 50)") - for h in headlines[:50]: - parts.append(f"- [{h['source']}] {h['title']}") - return "\n".join(parts) @dataclass @@ -443,172 +29,6 @@ class LogResult: cost_usd: float | None -def build_user_prompt( - *, - today: datetime, - anchor: str | None, - quotes_by_group: dict[str, list[dict]], - headlines_by_bucket: dict[str, list[dict]], - reference_line: str | None = None, - previous_log: object | None = None, -) -> str: - """Assemble the user message from already-fetched-and-persisted data. 
- If `previous_log` is a StrategicLog from earlier today, it's included - as 'Update mode' context — the model will revise rather than restart.""" - parts = [ - f"# Strategic log request — {today.strftime('%Y-%m-%d')}", - # Explicit current time so the model doesn't hallucinate one. The - # date header it writes MUST stay date-only (per system prompt). - f"Current time: {today.strftime('%Y-%m-%d %H:%M UTC')}", - ] - if anchor: - parts.append(f"Anchor reference date: {anchor}") - if reference_line: - parts.append( - "\n## Reference snapshot (when the macro thesis was authored)" - f"\n{reference_line}\nCompare live readings against it." - ) - - if previous_log is not None: - gen = getattr(previous_log, "generated_at", None) - ts = gen.strftime("%H:%M UTC") if gen else "earlier today" - parts.append( - f"\n## Earlier log from today (generated {ts})\n" - "Treat this as YOUR OWN earlier draft for today. Update it for\n" - "the current data — don't restate unchanged context. See the\n" - "'Update mode' section of the system prompt for how to handle it.\n" - "```markdown\n" - f"{previous_log.content}\n" - "```" - ) - - parts.append("\n## Live market data (per group)") - parts.append("```json\n" + json.dumps(quotes_by_group, indent=2, default=str) + "\n```") - parts.append("\n## News flow (last 24h, filtered by bucket)") - for label, items in headlines_by_bucket.items(): - if not items: - continue - parts.append(f"\n### {label.upper()}") - for h in items[:30]: - parts.append(f"- [{h['when'][:16].replace('T',' ')}] [{h['source']}] {h['title']}") - - task_line = ( - "\n## Task\nWrite the daily strategic log in ~800 words, following " - "the discipline in the system prompt. No preamble; begin directly " - "with the date header." - ) - if previous_log is not None: - task_line = ( - "\n## Task\nUpdate the earlier log above for the current data. 
" - "Keep the same structure (date header, TL;DR, sections, watch " - "list, system temperature) but anchor on what has CHANGED since " - "the earlier draft's timestamp. ~800 words. No preamble." - ) - parts.append(task_line) - return "\n".join(parts) - - -def _digest_tone_clause(tone: str) -> str: - if tone.upper() == "NOVICE": - return "Use plain English. Define any jargon on first use." - return "Write for a reader who already speaks markets fluently." - - -def build_daily_digest_prompt( - *, - tone: str, - today, - quotes_by_group: dict, - headlines_by_bucket: dict, - reference_line: str, -) -> tuple[str, str]: - """System + user prompt for the once-a-day editorial digest. - - Different from the hourly log: the daily digest reflects on the past - 24h and looks forward to the upcoming session. Longer, less - 'live-blogging,' more contextual. Target ~600 words.""" - system = ( - "You write the daily editorial digest for Read the Markets. " - f"Audience tone: {tone.upper()}. {_digest_tone_clause(tone)} " - "Cover: (1) what mattered yesterday, (2) what to watch in today's " - "EU and US sessions, (3) one cross-asset thread connecting them. " - "No predictions of price level, no buy/sell language. Target ~600 " - "words. Output HTML using only

        ,

        ,
          ,
        • , , " - " — no , , or wrapper, no inline styles." - ) - user = _digest_user_prompt( - today=today, quotes_by_group=quotes_by_group, - headlines_by_bucket=headlines_by_bucket, reference_line=reference_line, - ) - return system, user - - -def build_weekly_digest_prompt( - *, - tone: str, - today, - quotes_by_group: dict, - headlines_by_bucket: dict, - reference_line: str, -) -> tuple[str, str]: - """System + user prompt for the Sunday weekly recap + look-ahead. - - Sent to ALL opt-in users (free and paid). Target ~900 words.""" - system = ( - "You write the Sunday weekly digest for Read the Markets. " - f"Audience tone: {tone.upper()}. {_digest_tone_clause(tone)} " - "Cover: (1) the week behind — what moved and why, " - "(2) the week ahead — releases, earnings, central-bank meetings, " - "(3) the cross-asset story to keep in mind. " - "No predictions of price level, no buy/sell language. Target ~900 " - "words. Output HTML using only

          ,

          ,
            ,
          • , , " - " — no , , or wrapper, no inline styles." - ) - user = _digest_user_prompt( - today=today, quotes_by_group=quotes_by_group, - headlines_by_bucket=headlines_by_bucket, reference_line=reference_line, - ) - return system, user - - -def _digest_user_prompt( - *, - today, - quotes_by_group: dict, - headlines_by_bucket: dict, - reference_line: str, -) -> str: - """Shared user-message body used by both digest prompts. Same data - shape as the hourly user prompt; reformatted for the digest context.""" - today_str = today.strftime("%A %d %B %Y") if hasattr(today, "strftime") else str(today) - lines = [f"TODAY (UTC): {today_str}", "", f"REFERENCE: {reference_line}", ""] - - if headlines_by_bucket: - lines.append("HEADLINES BY CATEGORY") - for cat, items in headlines_by_bucket.items(): - lines.append(f" [{cat}]") - for h in items[:30]: - when = h.get("when", "") - src = h.get("source", "") - title = h.get("title", "") - lines.append(f" {when} · {src} · {title}") - lines.append("") - - if quotes_by_group: - lines.append("LATEST QUOTES BY GROUP") - for grp, items in quotes_by_group.items(): - lines.append(f" [{grp}]") - for q in items[:30]: - sym = q.get("symbol", "") - price = q.get("price", "") - lbl = q.get("label", "") - ccy = q.get("currency", "") - lines.append(f" {sym} ({lbl}) — {price} {ccy}") - lines.append("") - - return "\n".join(lines) - - def _provider_chain() -> list[str]: """Ordered list of providers to try: primary, then fallback (unless the fallback is unset, the same as primary, or has no API key).""" @@ -775,7 +195,6 @@ async def call_llm( raise last_exc - def month_window() -> tuple[datetime, datetime]: """[start, now] in UTC for the current calendar month.""" now = datetime.now(timezone.utc) diff --git a/app/services/portfolio_analysis.py b/app/services/portfolio_analysis.py index 0aef3cd..450f948 100644 --- a/app/services/portfolio_analysis.py +++ b/app/services/portfolio_analysis.py @@ -32,10 +32,10 @@ from app.db import utcnow from 
app.logging import get_logger from app.models import AICall from app.services.i18n import LANGUAGES, respond_in_clause +from app.services.llm_prompts import build_system_prompt from app.services.openrouter import ( LogResult, active_model, - build_system_prompt, call_llm, ) diff --git a/tests/test_digest_prompts.py b/tests/test_digest_prompts.py index 97a9755..8b01629 100644 --- a/tests/test_digest_prompts.py +++ b/tests/test_digest_prompts.py @@ -3,7 +3,7 @@ from __future__ import annotations from datetime import datetime, timezone -from app.services.openrouter import ( +from app.services.llm_prompts import ( build_daily_digest_prompt, build_weekly_digest_prompt, ) diff --git a/tests/test_openrouter_prompt.py b/tests/test_openrouter_prompt.py index 51f52a1..f21edc2 100644 --- a/tests/test_openrouter_prompt.py +++ b/tests/test_openrouter_prompt.py @@ -9,7 +9,7 @@ pytest.importorskip("pydantic_settings") from datetime import datetime, timezone -from app.services.openrouter import SYSTEM_PROMPT, build_user_prompt +from app.services.llm_prompts import SYSTEM_PROMPT, build_user_prompt def test_system_prompt_has_voice_anchors(): @@ -35,7 +35,7 @@ def test_pro_tone_falls_back_to_intermediate(): """PRO was removed in PROMPT_VERSION 6 (audience pivot to young investors). 
Legacy callers that still pass PRO should get the INTERMEDIATE prompt rather than a KeyError.""" - from app.services.openrouter import build_system_prompt + from app.services.llm_prompts import build_system_prompt pro = build_system_prompt("PRO", "SPECULATIVE") inter = build_system_prompt("INTERMEDIATE", "SPECULATIVE") assert pro == inter From b055eea1c2f394f86f583ccfeb6f02aa0dbee084 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 21:33:06 +0200 Subject: [PATCH 32/69] email: split digest renderer to digest_email.py MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit email_service.py was 428 lines covering three different concerns: SMTP transport, OTP/welcome rendering (tightly coupled — same brand template + theme), and digest rendering (a totally different shape of email, different layout, different copy cadence). The two halves changed at different cadences and made the file noisy to navigate. Extracted render_digest_email + _DIGEST_HTML_TEMPLATE + _strip_html_to_text to app/services/digest_email.py. SMTP transport and the OTP/welcome surface stay in email_service.py. Import sites updated: email_digest_job and test_email_render now import render_digest_email from digest_email. The OTP/welcome import sites (auth router, branding tests, test_email_service) are untouched. No behaviour change — pure relocation. Templates byte-identical. 
Co-Authored-By: Claude Opus 4.7 --- app/jobs/email_digest_job.py | 3 +- app/services/digest_email.py | 116 ++++++++++++++++++++++++++++++++++ app/services/email_service.py | 105 ------------------------------ tests/test_email_render.py | 2 +- 4 files changed, 119 insertions(+), 107 deletions(-) create mode 100644 app/services/digest_email.py diff --git a/app/jobs/email_digest_job.py b/app/jobs/email_digest_job.py index 0bad288..dc89e5b 100644 --- a/app/jobs/email_digest_job.py +++ b/app/jobs/email_digest_job.py @@ -29,7 +29,8 @@ from app.jobs._market_context import ( from app.models import EmailSend, User from app.routers.email import sign_unsubscribe_token from app.services.access import paid_status -from app.services.email_service import render_digest_email, send_email +from app.services.digest_email import render_digest_email +from app.services.email_service import send_email from app.services.i18n import ACTIVE_LANGUAGES from app.services.llm_prompts import ( PROMPT_VERSION, diff --git a/app/services/digest_email.py b/app/services/digest_email.py new file mode 100644 index 0000000..3d416f6 --- /dev/null +++ b/app/services/digest_email.py @@ -0,0 +1,116 @@ +"""Daily/weekly digest email rendering. + +Pure prose → HTML/text rendering. SMTP transport stays in +``email_service.send_email``; this module only assembles the message +body, subject, and a text-only fallback for clients without HTML +rendering. + +Split from email_service.py during the Tier 2 cleanup pass — the +SMTP/OTP/welcome surface and the digest renderer changed at very +different cadences and made the file noisy to navigate. +""" +from __future__ import annotations + +import html as _html_lib +import re as _re + +from app import branding + + +_DIGEST_HTML_TEMPLATE = """\ + + + + + + + {brand} — {label} + + + + + +
            +
            + ▰ {brand_upper} · {label_upper} +
            +
             
            +
            + {content_html} +
            +
             
            +
            +
             
            + +
            + + +""" + + +def _strip_html_to_text(html_body: str) -> str: + """Best-effort HTML → plain text for the multipart fallback. We don't + need perfection — just readable prose for clients that won't render + HTML.""" + text = _re.sub(r"(?i)<(/(p|h[1-6]|li|ul|ol)|br\s*/?)>", "\n", html_body) + text = _re.sub(r"<[^>]+>", "", text) + text = _html_lib.unescape(text) + text = _re.sub(r"\n{3,}", "\n\n", text) + return text.strip() + + +def render_digest_email( + *, + kind: str, + date_str: str, + content_html: str, + unsubscribe_url: str, + settings_url: str, +) -> tuple[str, str, str]: + """Returns (subject, text_body, html_body) for a digest email. + + `kind` is "daily" or "weekly". Anything else raises ValueError.""" + if kind == "daily": + label = "Daily" + subject = f"{branding.BRAND_NAME} · Daily — {date_str}" + elif kind == "weekly": + label = "Weekly recap" + subject = f"{branding.BRAND_NAME} · Weekly recap — {date_str}" + else: + raise ValueError(f"unknown digest kind: {kind!r}") + + html_body = _DIGEST_HTML_TEMPLATE.format( + brand=branding.BRAND_NAME, + brand_upper=branding.BRAND_NAME.upper(), + label=label, + label_upper=label.upper(), + FONT_MONO=branding.FONT_MONO, + content_html=content_html, + unsubscribe_url=unsubscribe_url, + settings_url=settings_url, + **{f"L_{k.replace('-', '_')}": v for k, v in branding.LIGHT.items()}, + **{f"D_{k.replace('-', '_')}": v for k, v in branding.DARK.items()}, + ) + + text_lines = [ + f"{branding.BRAND_NAME} — {label}", + date_str, + "", + _strip_html_to_text(content_html), + "", + f"Unsubscribe: {unsubscribe_url}", + f"Manage preferences: {settings_url}", + ] + text_body = "\n".join(text_lines) + return subject, text_body, html_body diff --git a/app/services/email_service.py b/app/services/email_service.py index d3ed9f7..8180ca6 100644 --- a/app/services/email_service.py +++ b/app/services/email_service.py @@ -18,8 +18,6 @@ convenient for local dev that doesn't want a mail server configured. 
""" from __future__ import annotations -import html as _html_lib -import re as _re from email.message import EmailMessage import aiosmtplib @@ -323,106 +321,3 @@ async def send_welcome_email(to: str) -> None: subject, text, html = render_welcome_email() await send_email(to, subject, text, html_body=html) - -# --------------------------------------------------------------------------- -# Digest email rendering -# --------------------------------------------------------------------------- - - -_DIGEST_HTML_TEMPLATE = """\ - - - - - - - {brand} — {label} - - - - - -
            -
            - ▰ {brand_upper} · {label_upper} -
            -
             
            -
            - {content_html} -
            -
             
            -
            -
             
            - -
            - - -""" - - -def _strip_html_to_text(html_body: str) -> str: - """Best-effort HTML → plain text for the multipart fallback. We don't - need perfection — just readable prose for clients that won't render - HTML.""" - text = _re.sub(r"(?i)<(/(p|h[1-6]|li|ul|ol)|br\s*/?)>", "\n", html_body) - text = _re.sub(r"<[^>]+>", "", text) - text = _html_lib.unescape(text) - text = _re.sub(r"\n{3,}", "\n\n", text) - return text.strip() - - -def render_digest_email( - *, - kind: str, - date_str: str, - content_html: str, - unsubscribe_url: str, - settings_url: str, -) -> tuple[str, str, str]: - """Returns (subject, text_body, html_body) for a digest email. - - `kind` is "daily" or "weekly". Anything else raises ValueError.""" - if kind == "daily": - label = "Daily" - subject = f"{branding.BRAND_NAME} · Daily — {date_str}" - elif kind == "weekly": - label = "Weekly recap" - subject = f"{branding.BRAND_NAME} · Weekly recap — {date_str}" - else: - raise ValueError(f"unknown digest kind: {kind!r}") - - html_body = _DIGEST_HTML_TEMPLATE.format( - brand=branding.BRAND_NAME, - brand_upper=branding.BRAND_NAME.upper(), - label=label, - label_upper=label.upper(), - FONT_MONO=branding.FONT_MONO, - content_html=content_html, - unsubscribe_url=unsubscribe_url, - settings_url=settings_url, - **{f"L_{k.replace('-', '_')}": v for k, v in branding.LIGHT.items()}, - **{f"D_{k.replace('-', '_')}": v for k, v in branding.DARK.items()}, - ) - - text_lines = [ - f"{branding.BRAND_NAME} — {label}", - date_str, - "", - _strip_html_to_text(content_html), - "", - f"Unsubscribe: {unsubscribe_url}", - f"Manage preferences: {settings_url}", - ] - text_body = "\n".join(text_lines) - return subject, text_body, html_body diff --git a/tests/test_email_render.py b/tests/test_email_render.py index f955066..35a130d 100644 --- a/tests/test_email_render.py +++ b/tests/test_email_render.py @@ -1,7 +1,7 @@ """Unit tests for render_digest_email.""" from __future__ import annotations -from 
app.services.email_service import render_digest_email +from app.services.digest_email import render_digest_email def test_daily_subject_and_bodies(): From 833d1775abc90b0834366ecd9689bb43e9d259bc Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 21:43:17 +0200 Subject: [PATCH 33/69] routers: extract chat + ops from api.py MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit api.py was 933 lines mixing four distinct concerns: indicators + news + strategic log (the JSON/HTMX API proper), the chat endpoint + its three private helpers (~200 lines), and the two HTML-only ops endpoints /markets-bar + /health (~150 lines). Extracted: - app/routers/chat.py — POST /api/chat + _latest_quotes_by_group_chat, _thesis_headlines_for_chat, _month_spend - app/routers/ops.py — GET /api/markets-bar + GET /api/health + _fmt_price helper Both new routers use the same dependencies=[Depends(require_token)] as api.py and are mounted at the /api prefix in app/main.py. URL surface is byte-identical with no externally-visible change. api.py shrinks to ~620 lines focused on indicators+news+log+settings. Helpers shared with the original api.py (_md_to_html, _resolve_tone_param) are imported from app.routers.api where needed in chat.py to avoid duplication. Also updated tests/test_chat_and_log_gates.py to mount chat_router in its local test app, since /api/chat now lives there. 
Co-Authored-By: Claude Opus 4.7 --- app/main.py | 4 + app/routers/api.py | 325 +------------------------------ app/routers/chat.py | 193 ++++++++++++++++++ app/routers/ops.py | 162 +++++++++++++++ tests/test_chat_and_log_gates.py | 2 + 5 files changed, 364 insertions(+), 322 deletions(-) create mode 100644 app/routers/chat.py create mode 100644 app/routers/ops.py diff --git a/app/main.py b/app/main.py index fe987f5..7f1729f 100644 --- a/app/main.py +++ b/app/main.py @@ -19,7 +19,9 @@ from app.db import get_session_factory from app.logging import configure_logging, get_logger from app.routers import api as api_router from app.routers import auth as auth_router +from app.routers import chat as chat_router from app.routers import email as email_router +from app.routers import ops as ops_router from app.routers import pages as pages_router from app.routers import polar_webhook as polar_webhook_router from app.routers import public as public_router @@ -89,6 +91,8 @@ app.mount( app.include_router(auth_router.router, tags=["auth"]) app.include_router(email_router.router, tags=["email"]) app.include_router(api_router.router, prefix="/api", tags=["api"]) +app.include_router(chat_router.router, prefix="/api", tags=["chat"]) +app.include_router(ops_router.router, prefix="/api", tags=["ops"]) app.include_router(universe_router.router, prefix="/api", tags=["universe"]) app.include_router(ticker_validate_router.router, prefix="/api", tags=["ticker-validate"]) app.include_router(sync_router.router, tags=["portfolio-sync"]) diff --git a/app/routers/api.py b/app/routers/api.py index 893d08f..5075654 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -10,45 +10,29 @@ import re from datetime import date, datetime, timedelta, timezone from typing import Literal -from fastapi import APIRouter, Depends, File, Form, HTTPException, Query, Request, UploadFile -from fastapi.responses import HTMLResponse, JSONResponse +from fastapi import APIRouter, Depends, HTTPException, Query, 
Request +from fastapi.responses import JSONResponse from sqlalchemy import desc, func, select from sqlalchemy.ext.asyncio import AsyncSession -from collections import defaultdict - -import httpx -from pydantic import BaseModel, Field +from pydantic import BaseModel from app.auth import require_token, maybe_current_user, CurrentUser from app.services.i18n import ACTIVE_LANGUAGES from app.config import get_settings from app.db import get_session, utcnow -from app.jobs._market_context import REFERENCE_LINE -from app.services.llm_prompts import ( - PROMPT_VERSION, - build_chat_system_prompt, -) -from app.services.openrouter import ( - call_llm, - month_start, -) from app.templates_env import templates from app.models import ( - AICall, Headline, IndicatorSummary, IndicatorSummaryTranslation, - JobRun, Quote, StrategicLog, StrategicLogTranslation, User, ) from app.schemas import ( - HealthOut, HeadlineOut, - JobStatus, QuoteOut, StrategicLogOut, ) @@ -56,11 +40,6 @@ from app.schemas import ( router = APIRouter(dependencies=[Depends(require_token)]) -JOB_NAMES = ("market_job", "news_job", "ai_log_job", "rollup_job", - "indicator_summary_job", "universe_flush_job", - "email_digest_job") -JOB_STALE_HOURS = 2.0 # job is "warn" if its last success was >2h ago - # Per-group expected freshness — bonds and intraday tape want daily data, # macro/economy/valuation are monthly/quarterly by nature. Older than this # many days from today → row gets a "stale" badge. @@ -565,10 +544,6 @@ async def log_days( return templates.TemplateResponse(request, "partials/calendar.html", payload) - -# --- Health / ops footer ----------------------------------------------------- - - # --- Aggregate summary + market status (dashboard header) ------------------- @@ -621,300 +596,6 @@ async def aggregate_summary( } -# Market → headline index mapping for the sticky bottom bar. Symbols must -# be present in config/default.toml so market_job populates `quotes`. 
-_MARKET_INDEX = { - "NYSE": ("^GSPC", "S&P 500"), - "LSE": ("^FTSE", "FTSE 100"), - # XETRA → Euro Stoxx 50 rather than ^GDAXI: Yahoo's DAX ticker is - # patchy via the chart endpoint, and ^STOXX50E is already tracked in - # config/default.toml's equity group. - "XETRA": ("^STOXX50E", "STOXX 50"), - "JPX": ("^N225", "Nikkei 225"), - "HKEX": ("^HSI", "Hang Seng"), - "SSE": ("000300.SS", "CSI 300"), -} - - -def _fmt_price(p: float | None) -> str: - if p is None: - return "—" - if abs(p) >= 1000: - return f"{p:,.0f}" - if abs(p) >= 100: - return f"{p:,.1f}" - return f"{p:,.2f}" - - -@router.get("/markets-bar", response_class=HTMLResponse, include_in_schema=False) -async def markets_bar( - request: Request, - session: AsyncSession = Depends(get_session), - as_: str | None = Query(default=None, alias="as"), -): - """The sticky bottom-bar payload: per-market open/close status with the - market's headline index price + 1d change. Refreshed by HTMX every 60s. - """ - from app.services.markets import all_statuses - - statuses = all_statuses() - # Latest quote per headline-index symbol in one query. 
- wanted_syms = [sym for sym, _ in _MARKET_INDEX.values()] - sub = ( - select(Quote.symbol, func.max(Quote.fetched_at).label("mx")) - .where(Quote.symbol.in_(wanted_syms)) - .group_by(Quote.symbol) - .subquery() - ) - rows = (await session.execute( - select(Quote).join( - sub, - (Quote.symbol == sub.c.symbol) & (Quote.fetched_at == sub.c.mx), - ) - )).scalars().all() - by_sym = {q.symbol: q for q in rows} - - markets: list[dict] = [] - for st in statuses: - sym, label = _MARKET_INDEX.get(st["code"], (None, None)) - q = by_sym.get(sym) if sym else None - idx = None - if q is not None and q.price is not None: - idx = { - "symbol": q.symbol, - "label": label, - "price_fmt": _fmt_price(q.price), - "change_1d_pct": (q.changes or {}).get("1d"), - } - markets.append({ - "code": st["code"], - "label": st["label"], - "open": st["open"], - "until_iso": st["until"].isoformat(), - "until_hhmm": st["until"].strftime("%H:%M"), - "index": idx, - }) - - return templates.TemplateResponse( - request, "partials/markets_bar.html", - {"markets": markets}, - ) - - -@router.get("/health", response_class=HTMLResponse, include_in_schema=False) -async def health_html( - request: Request, - session: AsyncSession = Depends(get_session), - as_: str | None = Query(default=None, alias="as"), -): - """Returns an HTML fragment by default (the ops footer); ?as=json returns the - structured object. 
The default is HTML because that's how the dashboard - consumes it; CLI/curl users will pass ?as=json.""" - try: - await session.execute(select(func.now())) - db_ok = True - except Exception: - db_ok = False - - now = utcnow() - jobs: list[dict] = [] - structured: list[JobStatus] = [] - for name in JOB_NAMES: - row = (await session.execute( - select(JobRun).where(JobRun.name == name) - .order_by(desc(JobRun.started_at)).limit(1) - )).scalar_one_or_none() - if row is None: - jobs.append({"name": name, "led": "idle", "age": "—", - "last_finished": None}) - structured.append(JobStatus(name=name)) - continue - if row.status == "success": - secs = _age_seconds(now, row.finished_at or row.started_at) or 0 - led = "ok" if secs < JOB_STALE_HOURS * 3600 else "warn" - elif row.status == "skipped": - led = "warn" - elif row.status == "running": - led = "warn" - else: - led = "err" - jobs.append({ - "name": name, "led": led, - "age": _fmt_age(now, row.finished_at or row.started_at), - "last_finished": row.finished_at, - }) - structured.append(JobStatus( - name=name, last_started=row.started_at, - last_finished=row.finished_at, status=row.status, - error=row.error, items_written=row.items_written, - )) - - if as_ == "json": - return JSONResponse( - HealthOut(db="ok" if db_ok else "down", jobs=structured).model_dump(mode="json") - ) - return templates.TemplateResponse( - request, "partials/ops_footer.html", - {"db_ok": db_ok, "jobs": jobs}, - ) - - -# --- Chat ------------------------------------------------------------------- - - -class ChatMessage(BaseModel): - role: str = Field(pattern="^(user|assistant)$") - content: str - - -class ChatRequest(BaseModel): - messages: list[ChatMessage] - - - -THESIS_KEYWORDS_FALLBACK = [ - "hormuz", "iran", "opec", "brent", "wti", "crude", "oil", - "china", "taiwan", "yuan", "fed", "inflation", "cpi", "yield", - "gold", "dollar", "yen", "saudi", "russia", "ukraine", "israel", - "nato", "defence", "defense", -] - - -async def 
_latest_quotes_by_group_chat(session: AsyncSession) -> dict[str, list[dict]]: - sub = ( - select(Quote.group_name, Quote.symbol, - func.max(Quote.fetched_at).label("mx")) - .group_by(Quote.group_name, Quote.symbol) - .subquery() - ) - rows = (await session.execute( - select(Quote).join( - sub, - (Quote.group_name == sub.c.group_name) - & (Quote.symbol == sub.c.symbol) - & (Quote.fetched_at == sub.c.mx), - ).order_by(Quote.group_name, Quote.symbol) - )).scalars().all() - by_group: dict[str, list[dict]] = defaultdict(list) - for q in rows: - by_group[q.group_name].append({ - "symbol": q.symbol, "label": q.label, - "price": q.price, "currency": q.currency, - "as_of": q.as_of, "changes": q.changes, - }) - return by_group - - -async def _thesis_headlines_for_chat(session: AsyncSession, limit: int = 50) -> list[dict]: - cutoff = utcnow() - timedelta(hours=24) - rows = (await session.execute( - select(Headline) - .where(Headline.published_at >= cutoff) - .order_by(desc(Headline.published_at)) - .limit(300) - )).scalars().all() - out = [] - for h in rows: - if any(kw in h.title.lower() for kw in THESIS_KEYWORDS_FALLBACK): - out.append({"source": h.source, "title": h.title}) - if len(out) >= limit: - break - return out - - -async def _month_spend(session: AsyncSession) -> float: - total = (await session.execute( - select(func.coalesce(func.sum(AICall.cost_usd), 0.0)) - .where(AICall.called_at >= month_start()) - )).scalar() - return float(total or 0.0) - - -@router.post("/chat") -async def chat( - body: ChatRequest, - session: AsyncSession = Depends(get_session), - principal: CurrentUser | None = Depends(maybe_current_user), -): - """Answer one user turn given the conversation so far. Grounded on the - latest strategic log + market data + thesis-filtered headlines. - Ephemeral — the conversation lives entirely in the client; the endpoint - just records each call's cost in `ai_calls`.""" - # Paid-only feature. 
Free users get the static log but not the - # interactive chat (see /pricing). - from app.services.access import is_paid_active - if not is_paid_active(principal): - raise HTTPException( - status_code=402, - detail={"code": "paid_required", - "message": "Follow-up chat is a paid-tier feature."}, - ) - - s = get_settings() - if not s.OPENROUTER_API_KEY: - raise HTTPException(status_code=503, detail="OPENROUTER_API_KEY not set") - - # Monthly cost cap — same one the log job respects. - spent = await _month_spend(session) - if spent >= s.OPENROUTER_MONTHLY_CAP_USD: - raise HTTPException( - status_code=429, - detail=f"Monthly OpenRouter cap reached (${spent:.2f})", - ) - - # Trim runaway conversations: keep last 20 turns. - history = body.messages[-20:] - if not history or history[-1].role != "user": - raise HTTPException(status_code=400, detail="Last message must be user") - - # Gather grounding context. - log_row = (await session.execute( - select(StrategicLog).order_by(desc(StrategicLog.generated_at)).limit(1) - )).scalar_one_or_none() - quotes = await _latest_quotes_by_group_chat(session) - headlines = await _thesis_headlines_for_chat(session) - - system_prompt = build_chat_system_prompt( - s.CASSANDRA_TONE, s.CASSANDRA_ANALYSIS, - log_content=log_row.content if log_row else None, - log_generated_at=log_row.generated_at if log_row else None, - quotes_by_group=quotes, - headlines=headlines, - reference_line=REFERENCE_LINE, - ) - - msgs = [{"role": "system", "content": system_prompt}] - for m in history: - msgs.append({"role": m.role, "content": m.content}) - - try: - async with httpx.AsyncClient(follow_redirects=True) as client: - result = await call_llm(client, msgs) - except Exception as e: - session.add(AICall( - model=s.OPENROUTER_MODEL, status="error", error=str(e)[:500], - )) - await session.commit() - raise HTTPException(status_code=502, detail=f"OpenRouter error: {e}") - - session.add(AICall( - model=result.model, - prompt_tokens=result.prompt_tokens, - 
completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, - status="ok", - )) - await session.commit() - - return { - "role": "assistant", - "content": result.content, - "content_html": _md_to_html(result.content), - "prompt_tokens": result.prompt_tokens, - "completion_tokens": result.completion_tokens, - } - - # --------------------------------------------------------------------------- # Settings — digest preferences # --------------------------------------------------------------------------- diff --git a/app/routers/chat.py b/app/routers/chat.py new file mode 100644 index 0000000..f4198ba --- /dev/null +++ b/app/routers/chat.py @@ -0,0 +1,193 @@ +"""Chat endpoint — POST /api/chat. + +Grounded on the latest strategic log, current market quotes, and +thesis-filtered headlines. Ephemeral: the conversation lives in the +client; this endpoint just records each call's cost in `ai_calls`. +""" +from __future__ import annotations + +from collections import defaultdict +from datetime import timedelta + +import httpx +from fastapi import APIRouter, Depends, HTTPException +from pydantic import BaseModel, Field +from sqlalchemy import desc, func, select +from sqlalchemy.ext.asyncio import AsyncSession + +from app.auth import require_token, maybe_current_user, CurrentUser +from app.config import get_settings +from app.db import get_session, utcnow +from app.jobs._market_context import REFERENCE_LINE +from app.models import AICall, Headline, Quote, StrategicLog +from app.routers.api import _md_to_html +from app.services.llm_prompts import build_chat_system_prompt +from app.services.openrouter import call_llm, month_start + +router = APIRouter(dependencies=[Depends(require_token)]) + + +# --------------------------------------------------------------------------- +# Pydantic models +# --------------------------------------------------------------------------- + + +class ChatMessage(BaseModel): + role: str = Field(pattern="^(user|assistant)$") + content: str + + 
+class ChatRequest(BaseModel): + messages: list[ChatMessage] + + +# --------------------------------------------------------------------------- +# Private helpers +# --------------------------------------------------------------------------- + +THESIS_KEYWORDS_FALLBACK = [ + "hormuz", "iran", "opec", "brent", "wti", "crude", "oil", + "china", "taiwan", "yuan", "fed", "inflation", "cpi", "yield", + "gold", "dollar", "yen", "saudi", "russia", "ukraine", "israel", + "nato", "defence", "defense", +] + + +async def _latest_quotes_by_group_chat(session: AsyncSession) -> dict[str, list[dict]]: + sub = ( + select(Quote.group_name, Quote.symbol, + func.max(Quote.fetched_at).label("mx")) + .group_by(Quote.group_name, Quote.symbol) + .subquery() + ) + rows = (await session.execute( + select(Quote).join( + sub, + (Quote.group_name == sub.c.group_name) + & (Quote.symbol == sub.c.symbol) + & (Quote.fetched_at == sub.c.mx), + ).order_by(Quote.group_name, Quote.symbol) + )).scalars().all() + by_group: dict[str, list[dict]] = defaultdict(list) + for q in rows: + by_group[q.group_name].append({ + "symbol": q.symbol, "label": q.label, + "price": q.price, "currency": q.currency, + "as_of": q.as_of, "changes": q.changes, + }) + return by_group + + +async def _thesis_headlines_for_chat(session: AsyncSession, limit: int = 50) -> list[dict]: + cutoff = utcnow() - timedelta(hours=24) + rows = (await session.execute( + select(Headline) + .where(Headline.published_at >= cutoff) + .order_by(desc(Headline.published_at)) + .limit(300) + )).scalars().all() + out = [] + for h in rows: + if any(kw in h.title.lower() for kw in THESIS_KEYWORDS_FALLBACK): + out.append({"source": h.source, "title": h.title}) + if len(out) >= limit: + break + return out + + +async def _month_spend(session: AsyncSession) -> float: + total = (await session.execute( + select(func.coalesce(func.sum(AICall.cost_usd), 0.0)) + .where(AICall.called_at >= month_start()) + )).scalar() + return float(total or 0.0) + + +# 
--------------------------------------------------------------------------- +# Route +# --------------------------------------------------------------------------- + + +@router.post("/chat") +async def chat( + body: ChatRequest, + session: AsyncSession = Depends(get_session), + principal: CurrentUser | None = Depends(maybe_current_user), +): + """Answer one user turn given the conversation so far. Grounded on the + latest strategic log + market data + thesis-filtered headlines. + Ephemeral — the conversation lives entirely in the client; the endpoint + just records each call's cost in `ai_calls`.""" + # Paid-only feature. Free users get the static log but not the + # interactive chat (see /pricing). + from app.services.access import is_paid_active + if not is_paid_active(principal): + raise HTTPException( + status_code=402, + detail={"code": "paid_required", + "message": "Follow-up chat is a paid-tier feature."}, + ) + + s = get_settings() + if not s.OPENROUTER_API_KEY: + raise HTTPException(status_code=503, detail="OPENROUTER_API_KEY not set") + + # Monthly cost cap — same one the log job respects. + spent = await _month_spend(session) + if spent >= s.OPENROUTER_MONTHLY_CAP_USD: + raise HTTPException( + status_code=429, + detail=f"Monthly OpenRouter cap reached (${spent:.2f})", + ) + + # Trim runaway conversations: keep last 20 turns. + history = body.messages[-20:] + if not history or history[-1].role != "user": + raise HTTPException(status_code=400, detail="Last message must be user") + + # Gather grounding context. 
+ log_row = (await session.execute( + select(StrategicLog).order_by(desc(StrategicLog.generated_at)).limit(1) + )).scalar_one_or_none() + quotes = await _latest_quotes_by_group_chat(session) + headlines = await _thesis_headlines_for_chat(session) + + system_prompt = build_chat_system_prompt( + s.CASSANDRA_TONE, s.CASSANDRA_ANALYSIS, + log_content=log_row.content if log_row else None, + log_generated_at=log_row.generated_at if log_row else None, + quotes_by_group=quotes, + headlines=headlines, + reference_line=REFERENCE_LINE, + ) + + msgs = [{"role": "system", "content": system_prompt}] + for m in history: + msgs.append({"role": m.role, "content": m.content}) + + try: + async with httpx.AsyncClient(follow_redirects=True) as client: + result = await call_llm(client, msgs) + except Exception as e: + session.add(AICall( + model=s.OPENROUTER_MODEL, status="error", error=str(e)[:500], + )) + await session.commit() + raise HTTPException(status_code=502, detail=f"OpenRouter error: {e}") + + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=result.cost_usd, + status="ok", + )) + await session.commit() + + return { + "role": "assistant", + "content": result.content, + "content_html": _md_to_html(result.content), + "prompt_tokens": result.prompt_tokens, + "completion_tokens": result.completion_tokens, + } diff --git a/app/routers/ops.py b/app/routers/ops.py new file mode 100644 index 0000000..289f803 --- /dev/null +++ b/app/routers/ops.py @@ -0,0 +1,162 @@ +"""HTML-only ops endpoints — /api/markets-bar and /api/health. + +These are HTMX partials consumed by the dashboard. They return HTML by +default (not JSON) and are not included in the OpenAPI schema. 
+""" +from __future__ import annotations + +from fastapi import APIRouter, Depends, Query, Request +from fastapi.responses import HTMLResponse, JSONResponse +from sqlalchemy import desc, func, select +from sqlalchemy.ext.asyncio import AsyncSession + +from app.auth import require_token +from app.db import get_session, utcnow +from app.models import JobRun, Quote +from app.routers.api import _age_seconds, _fmt_age +from app.schemas import HealthOut, JobStatus +from app.templates_env import templates + +router = APIRouter(dependencies=[Depends(require_token)]) + +JOB_NAMES = ("market_job", "news_job", "ai_log_job", "rollup_job", + "indicator_summary_job", "universe_flush_job", + "email_digest_job") +JOB_STALE_HOURS = 2.0 # job is "warn" if its last success was >2h ago + +# Market → headline index mapping for the sticky bottom bar. Symbols must +# be present in config/default.toml so market_job populates `quotes`. +_MARKET_INDEX = { + "NYSE": ("^GSPC", "S&P 500"), + "LSE": ("^FTSE", "FTSE 100"), + # XETRA → Euro Stoxx 50 rather than ^GDAXI: Yahoo's DAX ticker is + # patchy via the chart endpoint, and ^STOXX50E is already tracked in + # config/default.toml's equity group. + "XETRA": ("^STOXX50E", "STOXX 50"), + "JPX": ("^N225", "Nikkei 225"), + "HKEX": ("^HSI", "Hang Seng"), + "SSE": ("000300.SS", "CSI 300"), +} + + +def _fmt_price(p: float | None) -> str: + if p is None: + return "—" + if abs(p) >= 1000: + return f"{p:,.0f}" + if abs(p) >= 100: + return f"{p:,.1f}" + return f"{p:,.2f}" + + +@router.get("/markets-bar", response_class=HTMLResponse, include_in_schema=False) +async def markets_bar( + request: Request, + session: AsyncSession = Depends(get_session), + as_: str | None = Query(default=None, alias="as"), +): + """The sticky bottom-bar payload: per-market open/close status with the + market's headline index price + 1d change. Refreshed by HTMX every 60s. 
+ """ + from app.services.markets import all_statuses + + statuses = all_statuses() + # Latest quote per headline-index symbol in one query. + wanted_syms = [sym for sym, _ in _MARKET_INDEX.values()] + sub = ( + select(Quote.symbol, func.max(Quote.fetched_at).label("mx")) + .where(Quote.symbol.in_(wanted_syms)) + .group_by(Quote.symbol) + .subquery() + ) + rows = (await session.execute( + select(Quote).join( + sub, + (Quote.symbol == sub.c.symbol) & (Quote.fetched_at == sub.c.mx), + ) + )).scalars().all() + by_sym = {q.symbol: q for q in rows} + + markets: list[dict] = [] + for st in statuses: + sym, label = _MARKET_INDEX.get(st["code"], (None, None)) + q = by_sym.get(sym) if sym else None + idx = None + if q is not None and q.price is not None: + idx = { + "symbol": q.symbol, + "label": label, + "price_fmt": _fmt_price(q.price), + "change_1d_pct": (q.changes or {}).get("1d"), + } + markets.append({ + "code": st["code"], + "label": st["label"], + "open": st["open"], + "until_iso": st["until"].isoformat(), + "until_hhmm": st["until"].strftime("%H:%M"), + "index": idx, + }) + + return templates.TemplateResponse( + request, "partials/markets_bar.html", + {"markets": markets}, + ) + + +@router.get("/health", response_class=HTMLResponse, include_in_schema=False) +async def health_html( + request: Request, + session: AsyncSession = Depends(get_session), + as_: str | None = Query(default=None, alias="as"), +): + """Returns an HTML fragment by default (the ops footer); ?as=json returns the + structured object. 
The default is HTML because that's how the dashboard + consumes it; CLI/curl users will pass ?as=json.""" + try: + await session.execute(select(func.now())) + db_ok = True + except Exception: + db_ok = False + + now = utcnow() + jobs: list[dict] = [] + structured: list[JobStatus] = [] + for name in JOB_NAMES: + row = (await session.execute( + select(JobRun).where(JobRun.name == name) + .order_by(desc(JobRun.started_at)).limit(1) + )).scalar_one_or_none() + if row is None: + jobs.append({"name": name, "led": "idle", "age": "—", + "last_finished": None}) + structured.append(JobStatus(name=name)) + continue + if row.status == "success": + secs = _age_seconds(now, row.finished_at or row.started_at) or 0 + led = "ok" if secs < JOB_STALE_HOURS * 3600 else "warn" + elif row.status == "skipped": + led = "warn" + elif row.status == "running": + led = "warn" + else: + led = "err" + jobs.append({ + "name": name, "led": led, + "age": _fmt_age(now, row.finished_at or row.started_at), + "last_finished": row.finished_at, + }) + structured.append(JobStatus( + name=name, last_started=row.started_at, + last_finished=row.finished_at, status=row.status, + error=row.error, items_written=row.items_written, + )) + + if as_ == "json": + return JSONResponse( + HealthOut(db="ok" if db_ok else "down", jobs=structured).model_dump(mode="json") + ) + return templates.TemplateResponse( + request, "partials/ops_footer.html", + {"db_ok": db_ok, "jobs": jobs}, + ) diff --git a/tests/test_chat_and_log_gates.py b/tests/test_chat_and_log_gates.py index bff5997..e050cab 100644 --- a/tests/test_chat_and_log_gates.py +++ b/tests/test_chat_and_log_gates.py @@ -23,6 +23,7 @@ def _build_app(tmp_path): from app.db import Base from app.models import StrategicLog, User from app.routers import api as api_router + from app.routers import chat as chat_router engine = create_async_engine(f"sqlite+aiosqlite:///{tmp_path}/gates.db") factory = async_sessionmaker(engine, expire_on_commit=False) @@ -56,6 +57,7 @@ def 
_build_app(tmp_path): app = FastAPI() app.include_router(api_router.router, prefix="/api") + app.include_router(chat_router.router, prefix="/api") client = TestClient(app) return client, sign_session(1), sign_session(2) From 74b61a59edb2946e3a1e85a9c2d858d8b9d92722 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 23:37:11 +0200 Subject: [PATCH 34/69] i18n: add diagnostic logging to localizer + lang-toggle click path --- app/routers/api.py | 22 ++++++++++++++++++++++ app/templates/base.html | 14 ++++++-------- 2 files changed, 28 insertions(+), 8 deletions(-) diff --git a/app/routers/api.py b/app/routers/api.py index 5075654..87a2aee 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -21,6 +21,10 @@ from app.auth import require_token, maybe_current_user, CurrentUser from app.services.i18n import ACTIVE_LANGUAGES from app.config import get_settings from app.db import get_session, utcnow +from app.logging import get_logger + + +log = get_logger("api_router") from app.templates_env import templates from app.models import ( Headline, @@ -312,8 +316,14 @@ async def _localized_content( Returns None to signal 'use row.content as-is' (the default English path).""" if row is None or principal is None or principal.user is None: + log.info("i18n.log.skip", reason="no_row_or_principal", + row_id=getattr(row, "id", None), + has_principal=principal is not None, + has_user=(principal.user is not None) if principal else False) return None lang = (principal.user.lang or "en") + log.info("i18n.log.lookup", row_id=row.id, lang=lang, + user_id=principal.user.id) if lang == "en": return None t = (await session.execute( @@ -321,6 +331,9 @@ async def _localized_content( .where(StrategicLogTranslation.log_id == row.id) .where(StrategicLogTranslation.lang == lang) )).scalar_one_or_none() + log.info("i18n.log.result", row_id=row.id, lang=lang, + found=(t is not None), + content_preview=(t.content[:60] if t is not None else None)) return t.content if t is not 
None else None @@ -335,8 +348,14 @@ async def _apply_localized_summary( for the lifetime of this GET request. """ if row is None or principal is None or principal.user is None: + log.info("i18n.summary.skip", reason="no_row_or_principal", + row_id=getattr(row, "id", None), + has_principal=principal is not None, + has_user=(principal.user is not None) if principal else False) return lang = (principal.user.lang or "en") + log.info("i18n.summary.lookup", row_id=row.id, lang=lang, + user_id=principal.user.id) if lang == "en": return t = (await session.execute( @@ -344,6 +363,9 @@ async def _apply_localized_summary( .where(IndicatorSummaryTranslation.summary_id == row.id) .where(IndicatorSummaryTranslation.lang == lang) )).scalar_one_or_none() + log.info("i18n.summary.result", row_id=row.id, lang=lang, + found=(t is not None), + content_preview=(t.content[:60] if t is not None else None)) if t is not None: row.content = t.content diff --git a/app/templates/base.html b/app/templates/base.html index fbf52e0..5a40399 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -143,12 +143,11 @@ }; window.cassandraSetLang = async function (newLang) { + console.log('[lang] click', newLang); var pill = document.getElementById('lang-toggle'); - if (!pill) return; + if (!pill) { console.warn('[lang] no pill element'); return; } var prev = pill.dataset.lang; - if (prev === newLang) return; - // Optimistic update — flip the pill immediately so the click feels - // responsive. Revert on PATCH failure. + if (prev === newLang) { console.log('[lang] already', newLang); return; } pill.dataset.lang = newLang; try { var r = await fetch('/api/settings/language', { @@ -157,18 +156,17 @@ credentials: 'same-origin', body: JSON.stringify({lang: newLang}), }); + console.log('[lang] PATCH', r.status); if (!r.ok) throw new Error('HTTP ' + r.status); - // Trigger HTMX-driven panels to re-fetch in the new language. 
- // Same shape as cassandraSetTone — every panel that listens to - // tone-changed also listens to lang-changed. ['#dash-header-container', '#log-panel .panel-body', '#indicators-body', '#log-content'].forEach(function (sel) { var el = document.querySelector(sel); + console.log('[lang] trigger', sel, 'found:', !!el, 'htmx:', !!window.htmx); if (el && window.htmx) window.htmx.trigger(el, 'lang-changed'); }); } catch (e) { pill.dataset.lang = prev; - console.warn('language switch failed:', e); + console.warn('[lang] switch failed:', e); } }; From f9d448d57b6e59642db7edec4a835740889ab149 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Wed, 27 May 2026 23:55:59 +0200 Subject: [PATCH 35/69] Revert "i18n: add diagnostic logging to localizer + lang-toggle click path" This reverts commit 74b61a59edb2946e3a1e85a9c2d858d8b9d92722. --- app/routers/api.py | 22 ---------------------- app/templates/base.html | 14 ++++++++------ 2 files changed, 8 insertions(+), 28 deletions(-) diff --git a/app/routers/api.py b/app/routers/api.py index 87a2aee..5075654 100644 --- a/app/routers/api.py +++ b/app/routers/api.py @@ -21,10 +21,6 @@ from app.auth import require_token, maybe_current_user, CurrentUser from app.services.i18n import ACTIVE_LANGUAGES from app.config import get_settings from app.db import get_session, utcnow -from app.logging import get_logger - - -log = get_logger("api_router") from app.templates_env import templates from app.models import ( Headline, @@ -316,14 +312,8 @@ async def _localized_content( Returns None to signal 'use row.content as-is' (the default English path).""" if row is None or principal is None or principal.user is None: - log.info("i18n.log.skip", reason="no_row_or_principal", - row_id=getattr(row, "id", None), - has_principal=principal is not None, - has_user=(principal.user is not None) if principal else False) return None lang = (principal.user.lang or "en") - log.info("i18n.log.lookup", row_id=row.id, lang=lang, - user_id=principal.user.id) if 
lang == "en": return None t = (await session.execute( @@ -331,9 +321,6 @@ async def _localized_content( .where(StrategicLogTranslation.log_id == row.id) .where(StrategicLogTranslation.lang == lang) )).scalar_one_or_none() - log.info("i18n.log.result", row_id=row.id, lang=lang, - found=(t is not None), - content_preview=(t.content[:60] if t is not None else None)) return t.content if t is not None else None @@ -348,14 +335,8 @@ async def _apply_localized_summary( for the lifetime of this GET request. """ if row is None or principal is None or principal.user is None: - log.info("i18n.summary.skip", reason="no_row_or_principal", - row_id=getattr(row, "id", None), - has_principal=principal is not None, - has_user=(principal.user is not None) if principal else False) return lang = (principal.user.lang or "en") - log.info("i18n.summary.lookup", row_id=row.id, lang=lang, - user_id=principal.user.id) if lang == "en": return t = (await session.execute( @@ -363,9 +344,6 @@ async def _apply_localized_summary( .where(IndicatorSummaryTranslation.summary_id == row.id) .where(IndicatorSummaryTranslation.lang == lang) )).scalar_one_or_none() - log.info("i18n.summary.result", row_id=row.id, lang=lang, - found=(t is not None), - content_preview=(t.content[:60] if t is not None else None)) if t is not None: row.content = t.content diff --git a/app/templates/base.html b/app/templates/base.html index 5a40399..fbf52e0 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -143,11 +143,12 @@ }; window.cassandraSetLang = async function (newLang) { - console.log('[lang] click', newLang); var pill = document.getElementById('lang-toggle'); - if (!pill) { console.warn('[lang] no pill element'); return; } + if (!pill) return; var prev = pill.dataset.lang; - if (prev === newLang) { console.log('[lang] already', newLang); return; } + if (prev === newLang) return; + // Optimistic update — flip the pill immediately so the click feels + // responsive. Revert on PATCH failure. 
pill.dataset.lang = newLang; try { var r = await fetch('/api/settings/language', { @@ -156,17 +157,18 @@ credentials: 'same-origin', body: JSON.stringify({lang: newLang}), }); - console.log('[lang] PATCH', r.status); if (!r.ok) throw new Error('HTTP ' + r.status); + // Trigger HTMX-driven panels to re-fetch in the new language. + // Same shape as cassandraSetTone — every panel that listens to + // tone-changed also listens to lang-changed. ['#dash-header-container', '#log-panel .panel-body', '#indicators-body', '#log-content'].forEach(function (sel) { var el = document.querySelector(sel); - console.log('[lang] trigger', sel, 'found:', !!el, 'htmx:', !!window.htmx); if (el && window.htmx) window.htmx.trigger(el, 'lang-changed'); }); } catch (e) { pill.dataset.lang = prev; - console.warn('[lang] switch failed:', e); + console.warn('language switch failed:', e); } }; From 2b9cd875b49f3a2f5f1c8f16c3731bca5fa0021f Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 00:07:38 +0200 Subject: [PATCH 36/69] deps: add requirements.lock for reproducible builds MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit pyproject.toml uses range pins (>=) for all dependencies; without a lockfile, a fresh `pip install .` on a different day could pull materially different versions of fastapi, sqlalchemy, httpx, etc. For a production-shaped service that's a reproducibility risk — especially since we don't run a CI pipeline that would catch "works on yesterday's container, fails on today's." requirements.lock pins every transitive dep (60 packages) to the exact versions running in the test container today. Dockerfile is updated so both stages install from the lockfile first, then install the project itself with --no-deps: pip install -r requirements.lock pip install --no-deps . That way pyproject.toml's range pins (>=) document the lower bounds we are compatible with, but the lockfile is what actually gets installed on every build. 
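One way to confirm that an environment really matches the lockfile is to compare each `name==version` line against the installed distributions. The snippet below is only a sketch under stated assumptions: `parse_lock` and `find_drift` are hypothetical helper names that are not part of this patch, only plain `name==version` lines are handled, and names and versions are compared as written, without normalisation.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def parse_lock(text: str) -> dict[str, str]:
    """Parse `name==version` lines (the `pip freeze` format) into a dict."""
    pins: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, ver = line.partition("==")
        pins[name.strip()] = ver.strip()
    return pins


def find_drift(pins: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Return {name: (pinned, installed)} for every package that differs.

    `installed` is None when the distribution is not installed at all.
    """
    drift: dict[str, tuple[str, str | None]] = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            drift[name] = (pinned, installed)
    return drift
```

Called as `find_drift(parse_lock(open("requirements.lock").read()))` inside the container, an empty dict would suggest the venv and the lockfile agree.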
To bump deps later: bump pyproject.toml ranges, rebuild a fresh venv, `pip freeze` it, save back to requirements.lock. Co-Authored-By: Claude Opus 4.7 --- Dockerfile | 18 ++++++++++---- requirements.lock | 60 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+), 4 deletions(-) create mode 100644 requirements.lock diff --git a/Dockerfile b/Dockerfile index 1123177..09c6443 100644 --- a/Dockerfile +++ b/Dockerfile @@ -6,11 +6,17 @@ ENV PIP_DISABLE_PIP_VERSION_CHECK=1 \ PYTHONDONTWRITEBYTECODE=1 WORKDIR /build -COPY pyproject.toml ./ +COPY pyproject.toml requirements.lock ./ COPY app ./app +# requirements.lock pins every transitive dependency to the known-good +# versions captured by `pip freeze` against a clean install. Install +# from it first, then add the project itself with --no-deps so the +# lockfile is the single source of truth and pyproject's range pins +# (>=) can't drift on rebuild. RUN python -m venv /opt/venv \ && /opt/venv/bin/pip install --upgrade pip \ - && /opt/venv/bin/pip install . + && /opt/venv/bin/pip install -r requirements.lock \ + && /opt/venv/bin/pip install --no-deps . FROM python:3.13-slim AS runtime @@ -49,7 +55,7 @@ ENV PYTHONUNBUFFERED=1 \ COPY --from=builder /opt/venv /opt/venv WORKDIR /app -COPY pyproject.toml ./ +COPY pyproject.toml requirements.lock ./ COPY app ./app COPY alembic ./alembic COPY alembic.ini ./ @@ -57,6 +63,10 @@ COPY alembic.ini ./ # a shipped image). docker-compose.test.yml bind-mounts ./tests:/app/tests # at run time, so the suite is always available without baking it in. -RUN /opt/venv/bin/pip install ".[dev]" +# The lockfile already contains the dev extras (pytest, ruff, aiosqlite, +# ...) because it was generated against a test-stage install. Same +# install pattern as the builder stage: lockfile first, project --no-deps. +RUN /opt/venv/bin/pip install -r requirements.lock \ + && /opt/venv/bin/pip install --no-deps . 
CMD ["pytest", "tests/", "-v"] diff --git a/requirements.lock b/requirements.lock new file mode 100644 index 0000000..035b270 --- /dev/null +++ b/requirements.lock @@ -0,0 +1,60 @@ +aiomysql==0.3.2 +aiosmtplib==5.1.0 +aiosqlite==0.22.1 +alembic==1.18.4 +annotated-doc==0.0.4 +annotated-types==0.7.0 +anyio==4.13.0 +APScheduler==3.11.2 +argon2-cffi==25.1.0 +argon2-cffi-bindings==25.1.0 +certifi==2026.5.20 +cffi==2.0.0 +charset-normalizer==3.4.7 +click==8.4.1 +cryptography==48.0.0 +dnspython==2.8.0 +email-validator==2.3.0 +fastapi==0.136.3 +greenlet==3.5.1 +h11==0.16.0 +hiredis==3.3.1 +httpcore==1.0.9 +httptools==0.8.0 +httpx==0.28.1 +idna==3.16 +iniconfig==2.3.0 +itsdangerous==2.2.0 +Jinja2==3.1.6 +Mako==1.3.12 +MarkupSafe==3.0.3 +packaging==26.2 +pluggy==1.6.0 +pycparser==3.0 +pydantic==2.13.4 +pydantic-settings==2.14.1 +pydantic_core==2.46.4 +Pygments==2.20.0 +PyMySQL==1.2.0 +pytest==9.0.3 +pytest-asyncio==1.4.0 +pytest-httpx==0.36.2 +python-dotenv==1.2.2 +python-multipart==0.0.29 +PyYAML==6.0.3 +redis==7.4.0 +requests==2.34.2 +ruff==0.15.14 +SQLAlchemy==2.0.50 +starlette==1.1.0 +stripe==15.1.0 +structlog==25.5.0 +tenacity==9.1.4 +typing-inspection==0.4.2 +typing_extensions==4.15.0 +tzlocal==5.3.1 +urllib3==2.7.0 +uvicorn==0.48.0 +uvloop==0.22.1 +watchfiles==1.2.0 +websockets==16.0 From 78ce8c8b0de3946be10b6ee0fbb88d80feade1e8 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 00:16:09 +0200 Subject: [PATCH 37/69] alembic: make migration chain SQLite-compatible (fresh upgrade) Five existing migrations used op.alter_column / op.create_unique_constraint / op.drop_constraint / op.create_foreign_key directly on the users + quotes + quotes_daily tables. SQLite has no native support for those operations and requires Alembic's batch_alter_table copy-and-rename workaround. This wasn't noticed until now because the test suite uses Base.metadata.create_all to materialise schema, not the migration chain itself; and prod is MariaDB. 
But running `alembic upgrade head` against a fresh SQLite database (developer onboarding, CI smoke tests, the test container's own bootstrap) would fail at 0005. Fixes: - alembic/env.py: set render_as_batch=True when the dialect is SQLite. This auto-wraps any future autogenerated migration but doesn't retroactively rewrite existing op.* calls. - 0005 (widen quotes.symbol), 0013 (referrals), 0018 (polar webhook), 0019 (stripe), 0023 (users.lang index + qd_symbol widen) explicitly wrap their problematic ops in `with op.batch_alter_table(...) as bop`. Now `alembic upgrade head` + `alembic downgrade base` round-trip cleanly on a fresh SQLite database. MariaDB prod behaviour unchanged. Co-Authored-By: Claude Opus 4.7 --- alembic/env.py | 7 ++++ alembic/versions/0005_widen_quote_symbol.py | 31 ++++++++------ alembic/versions/0013_referrals.py | 41 +++++++++---------- alembic/versions/0018_polar_webhook.py | 22 ++++------ alembic/versions/0019_stripe.py | 24 +++++------ .../0023_lang_index_and_qd_symbol_widen.py | 28 ++++++------- 6 files changed, 79 insertions(+), 74 deletions(-) diff --git a/alembic/env.py b/alembic/env.py index a652b05..0d40d2c 100644 --- a/alembic/env.py +++ b/alembic/env.py @@ -44,10 +44,17 @@ def run_migrations_offline() -> None: def do_run_migrations(connection: Connection) -> None: + # render_as_batch is required for SQLite, which doesn't support + # most ALTER COLUMN / ADD CONSTRAINT operations natively. With + # batch mode enabled, Alembic emits a copy-and-rename dance under + # SQLite while still producing plain ALTER on MariaDB / Postgres, + # so prod migrations are unchanged. Detect via the dialect name. 
+ render_as_batch = connection.dialect.name == "sqlite" context.configure( connection=connection, target_metadata=target_metadata, compare_type=True, + render_as_batch=render_as_batch, ) with context.begin_transaction(): context.run_migrations() diff --git a/alembic/versions/0005_widen_quote_symbol.py b/alembic/versions/0005_widen_quote_symbol.py index 22b2bce..b7f8d33 100644 --- a/alembic/versions/0005_widen_quote_symbol.py +++ b/alembic/versions/0005_widen_quote_symbol.py @@ -17,18 +17,25 @@ depends_on: Union[str, Sequence[str], None] = None def upgrade() -> None: - op.alter_column( - "quotes", "symbol", - existing_type=sa.String(64), - type_=sa.String(128), - existing_nullable=False, - ) + # batch_alter_table wraps the ALTER in a copy-and-rename dance for + # SQLite (which doesn't support ALTER COLUMN TYPE) while remaining a + # plain ALTER on MariaDB / Postgres. Required for `alembic upgrade + # head` to work against a fresh SQLite database during local tooling + # or test bootstrap. 
+ with op.batch_alter_table("quotes") as bop: + bop.alter_column( + "symbol", + existing_type=sa.String(64), + type_=sa.String(128), + existing_nullable=False, + ) def downgrade() -> None: - op.alter_column( - "quotes", "symbol", - existing_type=sa.String(128), - type_=sa.String(64), - existing_nullable=False, - ) + with op.batch_alter_table("quotes") as bop: + bop.alter_column( + "symbol", + existing_type=sa.String(128), + type_=sa.String(64), + existing_nullable=False, + ) diff --git a/alembic/versions/0013_referrals.py b/alembic/versions/0013_referrals.py index 6eeae26..89b32f2 100644 --- a/alembic/versions/0013_referrals.py +++ b/alembic/versions/0013_referrals.py @@ -30,23 +30,21 @@ depends_on: Union[str, Sequence[str], None] = None def upgrade() -> None: - op.add_column( - "users", - sa.Column("referral_code", sa.String(16), nullable=True), - ) - op.create_unique_constraint( - "uq_users_referral_code", "users", ["referral_code"], - ) - op.add_column( - "users", - sa.Column("referred_by_user_id", sa.Integer, nullable=True), - ) - op.create_foreign_key( - "fk_users_referred_by", - "users", "users", - ["referred_by_user_id"], ["id"], - ondelete="SET NULL", - ) + # batch_alter_table wraps ADD CONSTRAINT in a copy-and-rename for + # SQLite (no native ALTER constraints support); on MariaDB/Postgres + # it falls through to plain ALTER statements. 
+ with op.batch_alter_table("users") as bop: + bop.add_column(sa.Column("referral_code", sa.String(16), nullable=True)) + bop.create_unique_constraint( + "uq_users_referral_code", ["referral_code"], + ) + bop.add_column(sa.Column("referred_by_user_id", sa.Integer, nullable=True)) + bop.create_foreign_key( + "fk_users_referred_by", + "users", + ["referred_by_user_id"], ["id"], + ondelete="SET NULL", + ) op.create_table( "referrals", @@ -71,7 +69,8 @@ def upgrade() -> None: def downgrade() -> None: op.drop_index("ix_referrals_referrer", table_name="referrals") op.drop_table("referrals") - op.drop_constraint("fk_users_referred_by", "users", type_="foreignkey") - op.drop_column("users", "referred_by_user_id") - op.drop_constraint("uq_users_referral_code", "users", type_="unique") - op.drop_column("users", "referral_code") + with op.batch_alter_table("users") as bop: + bop.drop_constraint("fk_users_referred_by", type_="foreignkey") + bop.drop_column("referred_by_user_id") + bop.drop_constraint("uq_users_referral_code", type_="unique") + bop.drop_column("referral_code") diff --git a/alembic/versions/0018_polar_webhook.py b/alembic/versions/0018_polar_webhook.py index bc085a7..5d3f31c 100644 --- a/alembic/versions/0018_polar_webhook.py +++ b/alembic/versions/0018_polar_webhook.py @@ -17,17 +17,12 @@ depends_on: Union[str, Sequence[str], None] = None def upgrade() -> None: - op.add_column( - "users", - sa.Column("polar_customer_id", sa.String(length=64), nullable=True), - ) - op.add_column( - "users", - sa.Column("polar_subscription_id", sa.String(length=64), nullable=True), - ) - op.create_unique_constraint( - "uq_users_polar_customer", "users", ["polar_customer_id"], - ) + with op.batch_alter_table("users") as bop: + bop.add_column(sa.Column("polar_customer_id", sa.String(length=64), nullable=True)) + bop.add_column(sa.Column("polar_subscription_id", sa.String(length=64), nullable=True)) + bop.create_unique_constraint( + "uq_users_polar_customer", ["polar_customer_id"], 
+ ) op.create_table( "polar_events", @@ -50,6 +45,7 @@ def upgrade() -> None: def downgrade() -> None: op.drop_index("ix_polar_events_type_received", table_name="polar_events") op.drop_table("polar_events") - op.drop_constraint("uq_users_polar_customer", "users", type_="unique") - op.drop_column("users", "polar_subscription_id") + with op.batch_alter_table("users") as bop: + bop.drop_constraint("uq_users_polar_customer", type_="unique") + bop.drop_column("polar_subscription_id") op.drop_column("users", "polar_customer_id") diff --git a/alembic/versions/0019_stripe.py b/alembic/versions/0019_stripe.py index 3ea4018..acd516d 100644 --- a/alembic/versions/0019_stripe.py +++ b/alembic/versions/0019_stripe.py @@ -18,17 +18,12 @@ depends_on: Union[str, Sequence[str], None] = None def upgrade() -> None: - op.add_column( - "users", - sa.Column("stripe_customer_id", sa.String(length=64), nullable=True), - ) - op.add_column( - "users", - sa.Column("stripe_subscription_id", sa.String(length=64), nullable=True), - ) - op.create_unique_constraint( - "uq_users_stripe_customer", "users", ["stripe_customer_id"], - ) + with op.batch_alter_table("users") as bop: + bop.add_column(sa.Column("stripe_customer_id", sa.String(length=64), nullable=True)) + bop.add_column(sa.Column("stripe_subscription_id", sa.String(length=64), nullable=True)) + bop.create_unique_constraint( + "uq_users_stripe_customer", ["stripe_customer_id"], + ) op.create_table( "stripe_events", @@ -51,6 +46,7 @@ def upgrade() -> None: def downgrade() -> None: op.drop_index("ix_stripe_events_type_received", table_name="stripe_events") op.drop_table("stripe_events") - op.drop_constraint("uq_users_stripe_customer", "users", type_="unique") - op.drop_column("users", "stripe_subscription_id") - op.drop_column("users", "stripe_customer_id") + with op.batch_alter_table("users") as bop: + bop.drop_constraint("uq_users_stripe_customer", type_="unique") + bop.drop_column("stripe_subscription_id") + 
bop.drop_column("stripe_customer_id") diff --git a/alembic/versions/0023_lang_index_and_qd_symbol_widen.py b/alembic/versions/0023_lang_index_and_qd_symbol_widen.py index 42bcb63..31a6eeb 100644 --- a/alembic/versions/0023_lang_index_and_qd_symbol_widen.py +++ b/alembic/versions/0023_lang_index_and_qd_symbol_widen.py @@ -18,21 +18,21 @@ depends_on: Union[str, Sequence[str], None] = None def upgrade() -> None: op.create_index("ix_users_lang", "users", ["lang"]) - op.alter_column( - "quotes_daily", - "symbol", - existing_type=sa.String(length=64), - type_=sa.String(length=128), - existing_nullable=False, - ) + with op.batch_alter_table("quotes_daily") as bop: + bop.alter_column( + "symbol", + existing_type=sa.String(length=64), + type_=sa.String(length=128), + existing_nullable=False, + ) def downgrade() -> None: - op.alter_column( - "quotes_daily", - "symbol", - existing_type=sa.String(length=128), - type_=sa.String(length=64), - existing_nullable=False, - ) + with op.batch_alter_table("quotes_daily") as bop: + bop.alter_column( + "symbol", + existing_type=sa.String(length=128), + type_=sa.String(length=64), + existing_nullable=False, + ) op.drop_index("ix_users_lang", table_name="users") From 355593c4f758bb732521f217e709563308d2cc8c Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 12:31:29 +0200 Subject: [PATCH 38/69] css: split cassandra.css into per-section files Splits the 2571-line cassandra.css into ten focused stylesheets: tokens (palette + fonts), layout (chrome), panels, dashboard, portfolio, log-chat, auth, settings, news, public. base.html and public_base.html load only what they need; auth pages (login, verify, unsubscribe confirm) load tokens + layout + auth. Brand drift-detection test repointed at tokens.css (where the palette now lives). 291 tests still pass. 
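For readers unfamiliar with the drift-detection test mentioned above, the idea can be sketched roughly as follows. This is not the code in `tests/test_branding_consistency.py`: `extract_css_vars`, `find_mismatches`, and the palette passed in are hypothetical, and the regex assumes only that the stylesheet declares custom properties as `--name: value;`. If the same variable is declared in more than one block (for example a light and a dark theme), this simplified version keeps the last declaration.

```python
from __future__ import annotations

import re

# Matches CSS custom-property declarations such as `--accent: #c8102e;`.
_CSS_VAR = re.compile(r"--([\w-]+)\s*:\s*([^;]+);")


def extract_css_vars(css: str) -> dict[str, str]:
    """Collect custom properties from a stylesheet (last declaration wins)."""
    return {name: value.strip().lower() for name, value in _CSS_VAR.findall(css)}


def find_mismatches(
    palette: dict[str, str], css: str
) -> dict[str, tuple[str, str | None]]:
    """Return {name: (expected, found)} for palette entries the CSS disagrees with.

    `found` is None when the stylesheet does not declare the variable.
    """
    css_vars = extract_css_vars(css)
    return {
        name: (expected, css_vars.get(name))
        for name, expected in palette.items()
        if css_vars.get(name) != expected.lower()
    }
```

A test built on this would read `tokens.css`, call `find_mismatches` with the Python-side palette, and assert that the result is empty.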
--- app/branding.py | 8 +- app/routers/email.py | 4 +- app/services/glossary.py | 4 +- app/static/css/auth.css | 132 ++ app/static/css/cassandra.css | 2571 ---------------------------- app/static/css/dashboard.css | 228 +++ app/static/css/layout.css | 185 ++ app/static/css/log-chat.css | 282 +++ app/static/css/news.css | 86 + app/static/css/panels.css | 92 + app/static/css/portfolio.css | 376 ++++ app/static/css/public.css | 717 ++++++++ app/static/css/settings.css | 381 +++++ app/static/css/tokens.css | 44 + app/templates/base.html | 10 +- app/templates/login.html | 4 +- app/templates/public_base.html | 7 +- app/templates/verify.html | 4 +- tests/test_branding_consistency.py | 6 +- 19 files changed, 2556 insertions(+), 2585 deletions(-) create mode 100644 app/static/css/auth.css delete mode 100644 app/static/css/cassandra.css create mode 100644 app/static/css/dashboard.css create mode 100644 app/static/css/layout.css create mode 100644 app/static/css/log-chat.css create mode 100644 app/static/css/news.css create mode 100644 app/static/css/panels.css create mode 100644 app/static/css/portfolio.css create mode 100644 app/static/css/public.css create mode 100644 app/static/css/settings.css create mode 100644 app/static/css/tokens.css diff --git a/app/branding.py b/app/branding.py index 1bd8f48..dd7370c 100644 --- a/app/branding.py +++ b/app/branding.py @@ -7,13 +7,13 @@ into user-visible chrome (page titles, email headers, OpenRouter referer) must read `BRAND_NAME` from here; do not hard-code the string. Internal identifiers (`cassandra_session` cookie, pyproject package name, -SQLAlchemy GET_LOCK keys, file `cassandra.css`, env var `CASSANDRA_TOKEN`) -keep the legacy name on purpose — renaming them would invalidate live -sessions / advisory locks / configs for zero brand benefit. +SQLAlchemy GET_LOCK keys, env var `CASSANDRA_TOKEN`) keep the legacy +name on purpose — renaming them would invalidate live sessions / +advisory locks / configs for zero brand benefit. 
The colour palette below is hand-authored in CSS as well; a drift- detection test (`tests/test_branding_consistency.py`) parses -`cassandra.css` and asserts every variable matches. Update both or +`tokens.css` and asserts every variable matches. Update both or neither. The light theme is the *default* everywhere — dashboard `:root` block, diff --git a/app/routers/email.py b/app/routers/email.py index 429101b..b7df411 100644 --- a/app/routers/email.py +++ b/app/routers/email.py @@ -63,7 +63,9 @@ _CONFIRM_PAGE = """\ Unsubscribed — {brand} - + + +
            diff --git a/app/services/glossary.py b/app/services/glossary.py index c994995..40aa938 100644 --- a/app/services/glossary.py +++ b/app/services/glossary.py @@ -10,8 +10,8 @@ The wrap markup is: VIX `title` gives a native fallback on touch devices that don't fire :hover. -The CSS tooltip (see `.glossary:hover::after` in cassandra.css) uses -`data-def` for richer formatting. Wrapping happens at most once per term +The CSS tooltip (see `.glossary` / `#glossary-tooltip` in dashboard.css) +uses `data-def` for richer formatting. Wrapping happens at most once per term per HTML fragment — repeated occurrences stay plain. """ from __future__ import annotations diff --git a/app/static/css/auth.css b/app/static/css/auth.css new file mode 100644 index 0000000..70da6cd --- /dev/null +++ b/app/static/css/auth.css @@ -0,0 +1,132 @@ +/* Cassandra — auth pages: login, sign-up, OTP verify (standalone, no app chrome). */ + +/* --- Auth pages (login / signup, standalone — no app chrome) -------- */ + +.auth-shell { + min-height: 100vh; + display: flex; + align-items: center; + justify-content: center; + background: var(--bg); + padding: 20px; +} +.auth-card { + width: 360px; + max-width: 100%; + background: var(--surface); + border: 1px solid var(--border); + padding: 28px 26px; +} +.auth-card__brand { + font-family: var(--font-mono); + color: var(--accent); + font-size: 18px; + letter-spacing: 0.12em; + text-transform: uppercase; + font-weight: 700; +} +.auth-card__brand::before { content: "▰ "; opacity: 0.6; } +.auth-card__hint { + font-family: var(--font-mono); + color: var(--muted); + font-size: 10px; + text-transform: uppercase; + letter-spacing: 0.08em; + margin: 2px 0 18px; +} +.auth-card form { display: flex; flex-direction: column; gap: 12px; } +.auth-card label { + display: flex; + flex-direction: column; + font-family: var(--font-mono); + color: var(--muted); + font-size: 10px; + text-transform: uppercase; + letter-spacing: 0.06em; + gap: 4px; +} +.auth-card 
input[type="email"], +.auth-card input[type="password"], +.auth-card input[type="text"] { + background: var(--bg); + border: 1px solid var(--border); + color: var(--text); + font-family: var(--font-mono); + font-size: 16px; + padding: 12px 14px; + outline: none; + border-radius: 3px; +} +/* The 6-digit OTP input wants to be visually loud — it's the only + thing the user is doing on that page. Bigger, more spacing, taller. */ +.auth-card input[name="code"] { + font-size: 24px; + padding: 16px 14px; + letter-spacing: 0.5em; + text-align: center; +} +.auth-card input:focus { border-color: var(--accent); } +.auth-card button { + margin-top: 8px; + background: transparent; + border: 1px solid var(--accent); + color: var(--accent); + font-family: var(--font-mono); + font-size: 11px; + padding: 9px 12px; + text-transform: uppercase; + letter-spacing: 0.1em; + cursor: pointer; +} +.auth-card button:hover { background: var(--accent); color: var(--bg); } +.auth-card__alt { + margin-top: 18px; + font-size: 12px; + color: var(--muted); + text-align: center; +} +.auth-error { + border-left: 3px solid var(--negative); + background: color-mix(in srgb, var(--negative) 6%, transparent); + color: var(--negative); + padding: 8px 10px; + font-size: 12px; + margin-bottom: 14px; + font-family: var(--font-mono); +} +.auth-info { + border-left: 3px solid var(--accent); + background: color-mix(in srgb, var(--accent) 6%, transparent); + color: var(--accent); + padding: 8px 10px; + font-size: 12px; + margin-bottom: 14px; + font-family: var(--font-mono); +} +.auth-info--invited { + /* Slightly warmer / friendlier shading for the referral banner. 
*/ + border-left-color: var(--positive); + background: color-mix(in srgb, var(--positive) 7%, transparent); + color: var(--text); + font-family: var(--font-sans); + font-size: 13px; + line-height: 1.5; +} +.auth-info--invited strong { color: var(--positive); font-weight: 600; } +.auth-card__lede { + font-size: 12.5px; + color: var(--muted); + margin: 0 0 16px; + line-height: 1.5; +} +.auth-card__lede strong { color: var(--text); font-weight: normal; } +.auth-card__resend { + background: transparent !important; + color: var(--muted) !important; + border: 1px dashed var(--border) !important; + font-size: 11px !important; +} +.auth-card__resend:hover { + color: var(--accent) !important; + border-color: var(--accent) !important; +} diff --git a/app/static/css/cassandra.css b/app/static/css/cassandra.css deleted file mode 100644 index fdc8729..0000000 --- a/app/static/css/cassandra.css +++ /dev/null @@ -1,2571 +0,0 @@ -/* Cassandra — geopolitical-terminal aesthetic with two themes. - * Mono for data, headers, terminal feel; sans for prose surfaces (log + chat). */ - -:root { - /* Light theme (default) */ - --bg: #f5f3ec; /* warm off-white, easier on the eyes than pure white */ - --surface: #ffffff; - --surface-2: #efece3; - --border: #d6d3cb; - --text: #1c1f25; - --muted: #545b69; - --dim: #8a8f9a; - --accent: #0e7490; /* deep teal — still terminal-feel on light */ - --positive: #166534; - --negative: #b91c1c; - --alert: #c2410c; - --warning: #a16207; - --user-bubble-bg: rgba(14, 116, 144, 0.07); -} - -[data-theme="dark"] { - --bg: #0a0e14; - --surface: #11151c; - --surface-2: #161b25; - --border: #2a3142; - --text: #d4dae8; /* lifted from #c0caf5 for readability */ - --muted: #8189a1; /* lifted from #565f89 — was unreadably dim */ - --dim: #565f89; - --accent: #00d9ff; - --positive: #50fa7b; - --negative: #ff5b5b; - --alert: #ff8a4a; - --warning: #f1fa8c; - --user-bubble-bg: rgba(0, 217, 255, 0.08); -} - -/* Font stacks. Mono for terminal feel; sans for reading. 
*/ -:root { - --font-mono: 'JetBrains Mono', 'IBM Plex Mono', 'Fira Code', ui-monospace, monospace; - --font-sans: -apple-system, BlinkMacSystemFont, 'Inter', 'Segoe UI', Roboto, - 'Helvetica Neue', system-ui, sans-serif; -} - -* { box-sizing: border-box; } - -html, body { - margin: 0; - padding: 0; - background: var(--bg); - color: var(--text); - font-family: var(--font-mono); - font-size: 13px; - line-height: 1.5; - font-variant-numeric: tabular-nums; -} - -a { color: var(--accent); text-decoration: none; } -a:hover { text-decoration: underline; } - -/* --- Layout ---------------------------------------------------------- */ - -.app { - display: grid; - grid-template-columns: 1fr; - grid-template-rows: auto 1fr auto; - min-height: 100vh; -} - -.app-header { - display: flex; - align-items: center; - justify-content: space-between; - border-bottom: 1px solid var(--border); - padding: 10px 18px; - background: var(--surface); - letter-spacing: 0.08em; - text-transform: uppercase; - position: sticky; - top: 0; - z-index: 50; -} -.app-header .brand { - color: var(--accent); - font-weight: 700; - text-decoration: none; -} -.app-header .brand:hover { color: var(--text); } -.app-header .brand::before { content: "▰ "; opacity: 0.6; } -.app-header nav a { - margin-left: 18px; - color: var(--muted); -} -.app-header nav a.active { color: var(--text); } -.app-header .meta { color: var(--muted); font-size: 11px; } - -.app-header .header-right { display: flex; align-items: center; gap: 14px; } -.theme-toggle { - background: transparent; - border: 1px solid var(--border); - color: var(--muted); - padding: 3px 8px; - font-family: var(--font-mono); - font-size: 10px; - letter-spacing: 0.08em; - cursor: pointer; - text-transform: lowercase; -} -.theme-toggle:hover { color: var(--accent); border-color: var(--accent); } -.theme-toggle__label::before { content: "◐ light"; } -[data-theme="dark"] .theme-toggle__label::before { content: "◐ dark"; } - -/* Tone toggle (segmented control: 
Novice | Intermediate) */ -.tone-toggle { - display: inline-flex; - border: 1px solid var(--border); - font-family: var(--font-mono); - font-size: 10.5px; - letter-spacing: 0.06em; - text-transform: uppercase; -} -.tone-toggle button { - background: transparent; - color: var(--muted); - border: 0; - padding: 4px 10px; - cursor: pointer; - font: inherit; - letter-spacing: inherit; - text-transform: inherit; -} -.tone-toggle button + button { border-left: 1px solid var(--border); } -.tone-toggle button:hover { color: var(--accent); } -.tone-toggle[data-tone="NOVICE"] button[data-value="NOVICE"], -.tone-toggle[data-tone="INTERMEDIATE"] button[data-value="INTERMEDIATE"] { - background: var(--accent); - color: var(--bg); -} - -/* Language toggle in the topbar — same visual rhythm as the tone - * toggle so the two controls read as a pair. Only EN and IT are - * visible here; the WIP languages (ES/FR/DE) live in /settings. */ -.lang-toggle { - display: inline-flex; - border: 1px solid var(--border); - font-family: var(--font-mono); - font-size: 10.5px; - letter-spacing: 0.06em; - text-transform: uppercase; -} -.lang-toggle button { - background: transparent; - color: var(--muted); - border: 0; - padding: 4px 8px; - cursor: pointer; - font: inherit; - letter-spacing: inherit; - text-transform: inherit; -} -.lang-toggle button + button { border-left: 1px solid var(--border); } -.lang-toggle button:hover { color: var(--accent); } -.lang-toggle[data-lang="en"] button[data-value="en"], -.lang-toggle[data-lang="it"] button[data-value="it"] { - background: var(--accent); - color: var(--bg); -} - -.app-main { - padding: 14px; - display: grid; - grid-template-columns: minmax(0, 2fr) minmax(0, 1fr); - grid-template-rows: auto auto auto auto; - grid-template-areas: - "header header" - "indicators log" - "portfolio log" - "news news"; - gap: 14px; -} -@media (max-width: 1100px) { - .app-main { - grid-template-columns: 1fr; - grid-template-areas: "header" "indicators" "portfolio" 
"log" "news"; - } -} - -#dash-header-container { grid-area: header; } -#indicators-panel { grid-area: indicators; } -#portfolio-panel { grid-area: portfolio; } -#log-panel { - grid-area: log; - /* Don't stretch to fill both grid rows; if the log is shorter than - the portfolio next to it, the surplus below would render as a big - empty white box. Aligning to the start makes the panel shrink to - its content and the dashboard background fills any gap. */ - align-self: start; -} -#news-panel { grid-area: news; } - - -/* Sticky bottom markets bar — uses the same .mkt chip styling as the - old dashboard header, extended with each market's headline index. */ -.markets-bar { - position: sticky; - bottom: 0; - z-index: 50; - background: var(--surface); - border-top: 1px solid var(--border); -} -.markets-bar__inner { - display: grid; - grid-template-columns: repeat(auto-fit, minmax(220px, 1fr)); - gap: 1px; - background: var(--border); - border: 0; -} -.markets-bar .mkt { - border: 0; - border-radius: 0; -} - -/* --- Panels ----------------------------------------------------------- */ - -.panel { - background: var(--surface); - border: 1px solid var(--border); - position: relative; -} -.panel-header { - border-bottom: 1px solid var(--border); - padding: 8px 12px; - display: flex; - align-items: center; - justify-content: space-between; - text-transform: uppercase; - letter-spacing: 0.1em; - color: var(--muted); - font-size: 11px; - background: linear-gradient(180deg, var(--surface-2), var(--surface)); -} -.panel-header .title { color: var(--text); font-weight: 700; } -.panel-header .title::before { content: "■ "; color: var(--accent); } -.panel-header .meta { color: var(--dim); } -.panel-body { padding: 6px 0; } -.panel-body--scroll { max-height: 70vh; overflow-y: auto; } - -/* --- Tables ----------------------------------------------------------- */ - -table.dense { - width: 100%; - border-collapse: collapse; -} -table.dense th, table.dense td { - padding: 4px 12px; - 
font-size: 12px; - border-bottom: 1px solid var(--surface-2); - white-space: nowrap; -} -table.dense th { - text-align: left; - color: var(--muted); - font-weight: 400; - text-transform: uppercase; - letter-spacing: 0.06em; - font-size: 10px; - background: var(--surface-2); -} -table.dense th.num, -table.dense td.num { text-align: right; } -table.dense td.label { color: var(--text); } -table.dense td.label.has-tip, -table.dense td[title] { - cursor: help; - border-bottom: 1px dotted color-mix(in srgb, var(--accent) 40%, transparent); - border-bottom-width: 1px; -} -.pf-name.has-tip { - cursor: help; - border-bottom: 1px dotted color-mix(in srgb, var(--accent) 50%, transparent); -} -table.dense tr:hover td { background: color-mix(in srgb, var(--accent) 5%, transparent); } - -.pos { color: var(--positive); } -.neg { color: var(--negative); } -.neu { color: var(--muted); } -.note { color: var(--dim); font-size: 11px; } - -/* Stale indicator rows — last observation > 90 days old */ -table.dense tr.row-stale td { color: var(--dim); } -.stale-tag { - display: inline-block; - font-size: 8.5px; - letter-spacing: 0.08em; - color: var(--alert); - border: 1px solid var(--alert); - padding: 0 4px; - margin-left: 4px; - vertical-align: middle; - text-transform: uppercase; - cursor: help; -} - -/* --- Status LEDs ------------------------------------------------------ */ - -.led { display: inline-block; width: 8px; height: 8px; border-radius: 50%; margin-right: 4px; vertical-align: middle; } -.led.ok { background: var(--positive); box-shadow: 0 0 6px var(--positive); } -.led.warn { background: var(--warning); box-shadow: 0 0 6px var(--warning); } -.led.err { background: var(--negative); box-shadow: 0 0 6px var(--negative); } -.led.idle { background: var(--dim); } - -/* --- Dashboard top header (markets + aggregate read) ----------------- */ - -.dash-header { - display: grid; - grid-template-columns: 1fr; - gap: 12px; - margin-bottom: 0; -} -.dash-header__markets { - display: 
grid; - grid-template-columns: repeat(auto-fit, minmax(150px, 1fr)); - gap: 1px; - background: var(--border); - border: 1px solid var(--border); -} -.mkt { - background: var(--surface); - padding: 6px 10px; - font-family: var(--font-mono); - font-size: 11px; - display: grid; - grid-template-columns: auto 1fr auto; - grid-template-rows: auto auto; - align-items: center; - gap: 2px 6px; -} -.mkt__dot { - width: 8px; height: 8px; border-radius: 50%; - grid-row: 1 / span 2; grid-column: 1; - align-self: center; -} -.mkt--open .mkt__dot { background: var(--positive); box-shadow: 0 0 6px var(--positive); } -.mkt--closed .mkt__dot { background: var(--dim); } -.mkt__name { - grid-row: 1; grid-column: 2; - color: var(--text); font-weight: 700; - text-transform: uppercase; letter-spacing: 0.08em; -} -.mkt__state { - grid-row: 1; grid-column: 3; - font-size: 9.5px; letter-spacing: 0.08em; - text-transform: lowercase; -} -.mkt--open .mkt__state { color: var(--positive); } -.mkt--closed .mkt__state { color: var(--dim); } -.mkt__index { - grid-row: 2; grid-column: 2; - font-size: 10.5px; - font-variant-numeric: tabular-nums; - display: inline-flex; - align-items: baseline; - gap: 5px; - white-space: nowrap; -} -.mkt__index-label { color: var(--dim); } -.mkt__index-price { color: var(--text); } -.mkt__index-change.pos { color: var(--positive); } -.mkt__index-change.neg { color: var(--negative); } -.mkt__index-change.neu { color: var(--muted); } -.mkt__index--empty { color: var(--dim); font-size: 10px; } -.mkt__when { - grid-row: 2; grid-column: 3; - color: var(--muted); font-size: 10px; - font-variant-numeric: tabular-nums; - text-align: right; -} -.mkt__when-label { color: var(--dim); } - -.dash-header__read { - border: 1px solid var(--border); - border-left: 3px solid var(--accent); - background: color-mix(in srgb, var(--accent) 4%, transparent); - padding: 10px 14px; -} -.dash-header__read-meta { - display: flex; - justify-content: space-between; - align-items: baseline; - 
margin-bottom: 4px; -} -.dash-header__read-body { - margin: 0; - font-family: var(--font-sans); - font-size: 14px; - line-height: 1.55; - color: var(--text); -} -.dash-header__read--pending { color: var(--dim); font-style: italic; } -.dash-header__read--pending .dash-header__read-body { color: var(--dim); font-size: 12px; } - -/* --- Indicator group summary (above the table) ----------------------- */ - -.ind-summary { - font-family: var(--font-sans); - padding: 10px 16px; - border-bottom: 1px solid var(--surface-2); - border-left: 3px solid var(--accent); - background: color-mix(in srgb, var(--accent) 4%, transparent); -} -.ind-summary__head { - display: flex; - align-items: baseline; - justify-content: space-between; - margin-bottom: 4px; -} -.ind-summary__label { - font-family: var(--font-mono); - font-size: 10px; - color: var(--accent); - text-transform: uppercase; - letter-spacing: 0.1em; - font-weight: 700; -} -.ind-summary__label::before { content: "▸ "; } -.ind-summary__when { - font-family: var(--font-mono); - font-size: 10px; - color: var(--dim); - font-variant-numeric: tabular-nums; -} -.ind-summary__body { - margin: 0; - font-size: 13.5px; - line-height: 1.55; - color: var(--text); -} -.ind-summary--pending { color: var(--dim); font-style: italic; } -.ind-summary--pending .ind-summary__body { color: var(--dim); font-size: 12px; } - -/* --- Glossary tooltips (Novice mode) --------------------------------- */ -/* The term gets a dotted underline. The actual tooltip is a single shared - element (#glossary-tooltip) positioned by JS so it can flip on viewport - edges and never clip behind sticky bars (which sit at z-index 50). */ - -.glossary { - border-bottom: 1px dotted var(--accent); - cursor: help; - /* Same colour as surrounding text — only the underline signals "tooltip - available", keeping the paragraph visually quiet. 
*/ -} -.glossary:focus { outline: 1px dotted var(--accent); outline-offset: 2px; } - -#glossary-tooltip { - position: fixed; - z-index: 200; /* Above sticky bars (z-index 50). */ - max-width: 300px; - padding: 9px 12px; - background: var(--surface); - color: var(--text); - border: 1px solid var(--accent); - font-family: var(--font-sans); - font-size: 12.5px; - line-height: 1.5; - letter-spacing: 0; - text-transform: none; - font-weight: normal; - box-shadow: 0 6px 18px rgba(0,0,0,0.35); - pointer-events: none; - opacity: 0; - transition: opacity 90ms ease; -} -#glossary-tooltip[data-visible="1"] { opacity: 1; } -#glossary-tooltip[hidden] { display: none; } - -/* --- Group tabs ------------------------------------------------------- */ - -.group-tabs { - display: flex; - border-bottom: 1px solid var(--border); - overflow-x: auto; -} -.group-tabs button { - background: transparent; - border: 0; - border-right: 1px solid var(--border); - color: var(--muted); - font-family: inherit; - font-size: 11px; - padding: 6px 12px; - text-transform: uppercase; - letter-spacing: 0.06em; - cursor: pointer; -} -.group-tabs button:hover { color: var(--text); } -.group-tabs button.active { - color: var(--accent); - background: var(--bg); - box-shadow: inset 0 -2px 0 var(--accent); -} - -/* --- Portfolio overall ----------------------------------------------- */ - -.pf-overall { - border-bottom: 1px solid var(--border); - padding: 10px 14px 12px; - background: linear-gradient(180deg, var(--surface-2), var(--surface)); -} -.pf-overall__head { - display: flex; - justify-content: space-between; - align-items: baseline; - margin-bottom: 8px; -} -.pf-name { - color: var(--accent); - text-transform: uppercase; - letter-spacing: 0.1em; - font-weight: 700; - font-size: 11px; -} -.pf-name::before { content: "◆ "; opacity: 0.6; } -.pf-as-of { color: var(--dim); font-size: 11px; } -.pf-overall__grid { - display: grid; - grid-template-columns: repeat(3, 1fr); - gap: 6px 24px; -} -@media 
(max-width: 640px) { - .pf-overall__grid { grid-template-columns: repeat(2, 1fr); } -} -.pf-stat-label { - font-size: 10px; - color: var(--muted); - text-transform: uppercase; - letter-spacing: 0.08em; -} -.pf-stat-value { - font-size: 16px; - color: var(--text); - font-variant-numeric: tabular-nums; - margin-top: 2px; -} -.pf-stat-value.pos { color: var(--positive); } -.pf-stat-value.neg { color: var(--negative); } -.pf-stat-value.neu { color: var(--muted); } -.pf-ccy { color: var(--dim); font-size: 11px; margin-left: 2px; } -.pf-pct { color: var(--dim); font-size: 11px; margin-left: 4px; } -.pf-pills { display: flex; flex-wrap: wrap; gap: 4px; margin-top: 4px; } -.pf-pill { - font-size: 10.5px; - font-family: var(--font-mono); - color: var(--muted); - background: var(--surface-2); - border: 1px solid var(--border); - padding: 2px 6px; - letter-spacing: 0.04em; -} -.pf-warn { - border-left: 3px solid var(--alert); - background: color-mix(in srgb, var(--alert) 6%, transparent); - color: var(--alert); - padding: 8px 10px; - font-size: 12px; - margin: 10px 0; -} -.pf-actions { - display: flex; - gap: 8px; - margin-top: 12px; -} -.pf-actions button { - font-family: var(--font-mono); - font-size: 11px; - letter-spacing: 0.06em; - text-transform: uppercase; - background: var(--surface-2); - color: var(--accent); - border: 1px solid var(--border); - padding: 7px 14px; - cursor: pointer; -} -.pf-actions button:hover { border-color: var(--accent); } -.pf-actions button:disabled { opacity: 0.5; cursor: not-allowed; } -.pf-secondary { color: var(--muted); } -.pf-secondary:hover { color: var(--negative); border-color: var(--negative); } - -/* Settings-page action button — same visual language as .pf-actions - button so buttons across /settings (Manage subscription, future - actions) read as one family. Standalone class (not nested under a - parent) so it can be dropped onto any button anywhere on the page. 
*/ -.settings-btn { - font-family: var(--font-mono); - font-size: 11px; - letter-spacing: 0.06em; - text-transform: uppercase; - background: var(--surface-2); - color: var(--accent); - border: 1px solid var(--border); - padding: 7px 14px; - cursor: pointer; - border-radius: 2px; - text-decoration: none; - display: inline-block; -} -.settings-btn:hover { border-color: var(--accent); } -.settings-btn:disabled { opacity: 0.5; cursor: not-allowed; } - -/* Icon-button variant for inline row actions (e.g. Manage subscription - gear in the Tier row). Square hit area, accent on hover, tooltip via - title attribute. */ -.settings-icon-btn { - background: transparent; - border: 1px solid transparent; - color: var(--muted); - width: 32px; - height: 32px; - padding: 0; - display: inline-flex; - align-items: center; - justify-content: center; - cursor: pointer; - border-radius: 3px; - flex-shrink: 0; - transition: color 80ms linear, border-color 80ms linear, background 80ms linear; -} -.settings-icon-btn:hover { - color: var(--accent); - border-color: var(--border); - background: var(--surface-2); -} -.settings-icon-btn:disabled { opacity: 0.5; cursor: not-allowed; } -.settings-icon-btn svg { display: block; } -.pf-analysis { - margin-top: 14px; - background: var(--surface-2); - border: 1px solid var(--border); -} -.pf-analysis__details { padding: 0; } -.pf-analysis__head { - display: flex; - justify-content: space-between; - align-items: center; - font-size: 11px; - color: var(--muted); - letter-spacing: 0.06em; - text-transform: uppercase; - padding: 10px 16px; - cursor: pointer; - user-select: none; - list-style: none; /* hide native marker in Firefox */ -} -.pf-analysis__head::-webkit-details-marker { display: none; } -.pf-analysis__head-left::before { - content: "▸ "; - display: inline-block; - width: 1em; - color: var(--accent); - transition: transform 120ms ease; -} -details[open] .pf-analysis__head-left::before { content: "▾ "; } -.pf-analysis__head:hover { color: 
var(--accent); } -.pf-analysis__head:hover .pf-analysis__head-left::before { color: var(--accent); } -.pf-analysis__details[open] .pf-analysis__head { - border-bottom: 1px solid var(--border); -} -.pf-analysis__body { - font-family: var(--font-sans); - font-size: 14px; - line-height: 1.65; - color: var(--text); - white-space: pre-wrap; - margin: 0; - padding: 14px 16px 16px; -} - -/* --- Log panel -------------------------------------------------------- */ - -.log-content { - font-family: var(--font-sans); - padding: 28px clamp(20px, 4vw, 56px) 32px; - font-size: 15.5px; - line-height: 1.72; - color: var(--text); - max-width: 76ch; - margin: 0 auto; - max-height: calc(100vh - 240px); - overflow-y: auto; -} -.log-content p { margin: 0 0 1.1em; } -.log-content h1, .log-content h2, .log-content h3, .log-content h4 { - font-family: var(--font-mono); - color: var(--accent); - text-transform: uppercase; - letter-spacing: 0.08em; - font-size: 12px; - margin-top: 1.8em; - margin-bottom: 0.5em; - font-weight: 700; -} -.log-content h1:first-child, -.log-content h2:first-child, -.log-content h3:first-child { margin-top: 0; } - -/* TL;DR callout — model is instructed to put it first, so style the first - * heading + paragraph block as a callout. 
*/ -.log-content h3:first-of-type { - font-size: 11px; - color: var(--accent); - border-left: 3px solid var(--accent); - padding-left: 10px; - margin-bottom: 0; -} -.log-content h3:first-of-type + p { - font-size: 16.5px; - line-height: 1.6; - color: var(--text); - border-left: 3px solid var(--accent); - padding: 4px 14px 12px; - margin: 0 0 1.8em; - background: color-mix(in srgb, var(--accent) 5%, transparent); - font-weight: 500; -} -.log-content strong { color: var(--text); font-weight: 700; } -.log-content em { color: var(--muted); font-style: italic; } -.log-content ul, .log-content ol { padding-left: 1.4em; margin: 0 0 1.1em; } -.log-content li { margin-bottom: 0.4em; } -.log-content hr { - border: 0; - border-top: 1px solid var(--border); - margin: 1.6em 0; -} - -/* --- Log page (calendar + log + chat sidebar) ------------------------- */ - -.log-page__body { - display: grid; - grid-template-columns: 220px 1fr 320px; - gap: 1px; - background: var(--border); -} -@media (max-width: 1100px) { - .log-page__body { grid-template-columns: 1fr; } -} -.log-page__cal, .log-page__content, .log-page__chat { background: var(--surface); } -.log-page__cal { padding: 10px; } -.log-page__content { min-height: 60vh; } -.log-page__chat { padding: 8px; min-height: 60vh; display: flex; flex-direction: column; } -.log-page__chat--locked { opacity: 0.92; } -.chat-locked { - flex: 1; - display: flex; - flex-direction: column; - justify-content: center; - align-items: center; - text-align: center; - gap: 16px; - padding: 24px 18px; - color: var(--muted); - font-size: 13px; - line-height: 1.55; - border: 1px dashed var(--border); - border-radius: 4px; - margin: 8px 4px; -} -.chat-locked p { margin: 0; max-width: 280px; } -.chat-locked strong { color: var(--text); display: block; margin-bottom: 6px; } - -/* --- Calendar widget --------------------------------------------------- */ - -.cal__nav { - display: flex; - justify-content: space-between; - align-items: center; - margin-bottom: 
8px; - font-size: 11px; - letter-spacing: 0.08em; - text-transform: uppercase; -} -.cal__title { color: var(--accent); font-weight: 700; } -.cal__btn { - background: transparent; - color: var(--muted); - border: 1px solid var(--border); - padding: 2px 8px; - cursor: pointer; - font-family: inherit; - font-size: 13px; -} -.cal__btn:hover { color: var(--accent); border-color: var(--accent); } -.cal__grid { - display: grid; - grid-template-columns: repeat(7, 1fr); - gap: 1px; - background: var(--border); - border: 1px solid var(--border); -} -.cal__h { - text-align: center; - font-size: 9px; - color: var(--dim); - background: var(--surface-2); - padding: 3px 0; - text-transform: uppercase; -} -.cal__d { - background: var(--surface); - border: 0; - color: var(--muted); - font-family: inherit; - font-size: 11px; - padding: 6px 0; - text-align: center; - cursor: not-allowed; -} -.cal__d--empty { background: var(--bg); cursor: default; } -.cal__d--has-log { - color: var(--text); - cursor: pointer; - position: relative; -} -.cal__d--has-log::after { - content: ""; - position: absolute; - bottom: 3px; - left: 50%; - transform: translateX(-50%); - width: 3px; height: 3px; - border-radius: 50%; - background: var(--accent); -} -.cal__d--has-log:hover { background: color-mix(in srgb, var(--accent) 10%, transparent); } -.cal__d--today { color: var(--warning); } -.cal__d--selected { - background: var(--accent); - color: var(--bg); - font-weight: 700; -} -.cal__d--selected::after { background: var(--bg); } - -/* --- Badges (tone / analysis indicators) ------------------------------ */ - -.badge { - display: inline-block; - font-family: var(--font-mono); - font-size: 9.5px; - letter-spacing: 0.06em; - text-transform: uppercase; - padding: 1px 6px; - border: 1px solid currentColor; - margin-right: 4px; - background: transparent; - vertical-align: middle; -} -/* Tone axis — green→accent→amber as audience density rises */ -.badge--tone-novice { color: var(--positive); } 
-.badge--tone-intermediate { color: var(--accent); } -.badge--tone-pro { color: var(--alert); } - -/* Analysis axis — dry is muted, speculative is accent */ -.badge--analysis-dry { color: var(--muted); } -.badge--analysis-speculative { color: var(--accent); } - -.badge--ver { color: var(--dim); } -.badge--ok { color: var(--positive); border-color: var(--positive); } - -.meta__hint { color: var(--dim); font-size: 10px; margin-right: 4px; } - -/* --- Log metadata footer ---------------------------------------------- */ - -.log-meta { - padding: 4px clamp(20px, 4vw, 56px) 6px; - max-width: 76ch; - margin: 0 auto; - border-top: 1px dashed var(--border); - color: var(--dim); - font-size: 10.5px; - font-family: var(--font-mono); - letter-spacing: 0.04em; -} - -/* --- Auth pages (login / signup, standalone — no app chrome) -------- */ - -.auth-shell { - min-height: 100vh; - display: flex; - align-items: center; - justify-content: center; - background: var(--bg); - padding: 20px; -} -.auth-card { - width: 360px; - max-width: 100%; - background: var(--surface); - border: 1px solid var(--border); - padding: 28px 26px; -} -.auth-card__brand { - font-family: var(--font-mono); - color: var(--accent); - font-size: 18px; - letter-spacing: 0.12em; - text-transform: uppercase; - font-weight: 700; -} -.auth-card__brand::before { content: "▰ "; opacity: 0.6; } -.auth-card__hint { - font-family: var(--font-mono); - color: var(--muted); - font-size: 10px; - text-transform: uppercase; - letter-spacing: 0.08em; - margin: 2px 0 18px; -} -.auth-card form { display: flex; flex-direction: column; gap: 12px; } -.auth-card label { - display: flex; - flex-direction: column; - font-family: var(--font-mono); - color: var(--muted); - font-size: 10px; - text-transform: uppercase; - letter-spacing: 0.06em; - gap: 4px; -} -.auth-card input[type="email"], -.auth-card input[type="password"], -.auth-card input[type="text"] { - background: var(--bg); - border: 1px solid var(--border); - color: 
var(--text); - font-family: var(--font-mono); - font-size: 16px; - padding: 12px 14px; - outline: none; - border-radius: 3px; -} -/* The 6-digit OTP input wants to be visually loud — it's the only - thing the user is doing on that page. Bigger, more spacing, taller. */ -.auth-card input[name="code"] { - font-size: 24px; - padding: 16px 14px; - letter-spacing: 0.5em; - text-align: center; -} -.auth-card input:focus { border-color: var(--accent); } - -/* --- Modal text inputs (cloud-sync PIN modal, etc.) ---------------- */ -/* Same visual treatment as auth-card so prompts read as a coherent - family. Replaces the inline `style="padding:8px"` that left these - inputs feeling cramped. */ -.modal-input { - width: 100%; - background: var(--bg); - border: 1px solid var(--border); - color: var(--text); - font-family: var(--font-mono); - font-size: 16px; - padding: 12px 14px; - margin-bottom: 12px; - outline: none; - border-radius: 3px; - box-sizing: border-box; -} -.modal-input:focus { border-color: var(--accent); } -.auth-card button { - margin-top: 8px; - background: transparent; - border: 1px solid var(--accent); - color: var(--accent); - font-family: var(--font-mono); - font-size: 11px; - padding: 9px 12px; - text-transform: uppercase; - letter-spacing: 0.1em; - cursor: pointer; -} -.auth-card button:hover { background: var(--accent); color: var(--bg); } -.auth-card__alt { - margin-top: 18px; - font-size: 12px; - color: var(--muted); - text-align: center; -} -.auth-error { - border-left: 3px solid var(--negative); - background: color-mix(in srgb, var(--negative) 6%, transparent); - color: var(--negative); - padding: 8px 10px; - font-size: 12px; - margin-bottom: 14px; - font-family: var(--font-mono); -} -.auth-info { - border-left: 3px solid var(--accent); - background: color-mix(in srgb, var(--accent) 6%, transparent); - color: var(--accent); - padding: 8px 10px; - font-size: 12px; - margin-bottom: 14px; - font-family: var(--font-mono); -} -.auth-info--invited { - /* 
Slightly warmer / friendlier shading for the referral banner. */ - border-left-color: var(--positive); - background: color-mix(in srgb, var(--positive) 7%, transparent); - color: var(--text); - font-family: var(--font-sans); - font-size: 13px; - line-height: 1.5; -} -.auth-info--invited strong { color: var(--positive); font-weight: 600; } - -/* --- Settings page --------------------------------------------------- */ - -.settings-row { - display: flex; - align-items: baseline; - gap: 14px; - padding: 8px 0; - border-bottom: 1px solid var(--surface-2); - font-size: 13px; -} -.settings-row__label { - width: 110px; - flex-shrink: 0; - color: var(--muted); - text-transform: uppercase; - letter-spacing: 0.06em; - font-size: 10.5px; - font-family: var(--font-mono); -} -.settings-row__value { color: var(--text); } -.settings-row__hint { - color: var(--dim); - font-size: 11px; - margin-left: 8px; -} - -/* Terminal-aesthetic used in the Settings page. Native + * browser chrome stripped; we render a small chevron via crossed + * linear-gradients so the control matches the rest of the panel. 
*/ +.settings-select { + appearance: none; + -webkit-appearance: none; + -moz-appearance: none; + background: transparent; + border: 1px solid var(--border); + color: var(--text); + padding: 4px 28px 4px 8px; + font-family: var(--font-mono); + font-size: 12px; + border-radius: 2px; + cursor: pointer; + background-image: + linear-gradient(45deg, transparent 50%, var(--dim) 50%), + linear-gradient(-45deg, transparent 50%, var(--dim) 50%); + background-position: calc(100% - 13px) 50%, calc(100% - 9px) 50%; + background-size: 5px 5px, 5px 5px; + background-repeat: no-repeat; + transition: border-color 120ms ease-out, color 120ms ease-out; +} +.settings-select:hover, +.settings-select:focus { + outline: none; + border-color: var(--accent); + color: var(--text); +} +.settings-select option { color: var(--text); background: var(--surface); } +.settings-select option:disabled { color: var(--dim); } + +.settings-status { + font-family: var(--font-mono); + font-size: 11px; + color: var(--muted); + letter-spacing: 0.04em; +} +.settings-status:empty { display: none; } + +/* Sections are
<details> elements, collapsed by default to keep the + settings page scannable. Click the summary to expand. */ +.settings-section { + margin-top: 14px; + border-top: 1px solid var(--surface-2); + padding-top: 14px; +} +.settings-section__head { + font-family: var(--font-mono); + font-size: 11px; + letter-spacing: 0.08em; + text-transform: uppercase; + color: var(--accent); + margin-bottom: 6px; + cursor: pointer; + list-style: none; + user-select: none; + display: flex; + align-items: center; + gap: 8px; + padding: 4px 0; +} +/* Suppress the native disclosure marker (Webkit + Firefox). */ +.settings-section__head::-webkit-details-marker { display: none; } +.settings-section__head::marker { content: ""; } +.settings-section__head::before { + content: "▸"; + color: var(--accent); + display: inline-block; + transition: transform 120ms ease-out; + font-size: 10px; +} +.settings-section[open] > .settings-section__head::before { + transform: rotate(90deg); +} +.settings-section[open] > .settings-section__head { margin-bottom: 10px; } +.settings-section__head:hover { color: var(--text); } +.settings-section__head:hover::before { color: var(--text); } +.settings-section__lede { + color: var(--muted); + font-size: 12.5px; + line-height: 1.55; + margin: 0 0 14px; +} +.settings-section__lede strong { color: var(--positive); font-weight: 600; } + +.invite-block { + background: var(--surface-2); + border: 1px solid var(--border); + padding: 14px 16px; +} +.invite-block__label { + display: block; + font-family: var(--font-mono); + font-size: 10px; + letter-spacing: 0.08em; + text-transform: uppercase; + color: var(--muted); + margin-bottom: 4px; +} +.invite-block__label:not(:first-child) { margin-top: 12px; } +.invite-block__code { + font-family: var(--font-mono); + font-size: 22px; + letter-spacing: 0.32em; + color: var(--accent); + background: var(--surface); + padding: 10px 14px; + border: 1px solid var(--accent); + text-align: center; + user-select: all; +} 
+.invite-block__link { + display: flex; + gap: 6px; +} +.invite-block__link input { + flex: 1; + background: var(--surface); + color: var(--text); + border: 1px solid var(--border); + padding: 7px 10px; + font-family: var(--font-mono); + font-size: 12px; +} +.invite-block__link button { + background: var(--accent); + color: var(--bg); + border: 0; + padding: 0 14px; + font-family: var(--font-mono); + font-size: 11px; + letter-spacing: 0.06em; + text-transform: uppercase; + cursor: pointer; +} +.invite-block__link button:hover { opacity: 0.85; } + +.invite-stats { + display: grid; + grid-template-columns: repeat(3, 1fr); + gap: 1px; + background: var(--border); + border: 1px solid var(--border); + margin-top: 16px; +} +.invite-stats > div { + background: var(--surface); + padding: 10px 14px; +} +.invite-stats__label { + font-family: var(--font-mono); + font-size: 10px; + letter-spacing: 0.08em; + text-transform: uppercase; + color: var(--muted); +} +.invite-stats__value { + font-family: var(--font-mono); + font-size: 18px; + color: var(--text); + font-variant-numeric: tabular-nums; + margin-top: 4px; +} + +/* Import preview action row — two stacked buttons with an explainer. */ +.import-actions { + display: flex; + flex-wrap: wrap; + gap: 12px; + margin-top: 14px; +} +.import-choice { flex: 1 1 240px; min-width: 220px; } +.import-choice button { width: 100%; } +.import-choice .settings-row__hint { + display: block; + margin-top: 6px; + line-height: 1.5; +} + +/* User chip in header — now a button that toggles a dropdown menu. 
*/ +.user-menu { position: relative; margin-left: 8px; } +.user-chip { + font-family: var(--font-mono); + font-size: 10.5px; + color: var(--muted); + letter-spacing: 0.04em; + background: none; + border: 0; + padding: 0; + cursor: pointer; +} +.user-chip:hover { color: var(--accent); } +.user-menu__caret { margin-left: 4px; opacity: 0.6; } +.user-menu__panel { + position: absolute; + top: calc(100% + 6px); + right: 0; + min-width: 160px; + background: var(--surface); + border: 1px solid var(--border); + border-radius: 6px; + box-shadow: 0 6px 18px rgba(0, 0, 0, 0.18); + z-index: 200; + padding: 4px 0; +} +.user-menu__item { + display: block; + padding: 8px 14px; + color: var(--text); + text-decoration: none; + font-size: 12px; +} +.user-menu__item:hover { background: var(--surface-2); color: var(--accent); } + +/* --- Upload / import drag-drop zone (settings page) ------------------ */ + +.dz { + border: 2px dashed var(--border); + background: var(--surface-2); + padding: 36px 20px; + text-align: center; + cursor: pointer; + transition: border-color 0.15s, background 0.15s; +} +.dz:hover, .dz--over { + border-color: var(--accent); + background: color-mix(in srgb, var(--accent) 6%, var(--surface-2)); +} +.dz__icon { + font-family: var(--font-mono); + font-size: 28px; + color: var(--accent); + letter-spacing: -2px; + margin-bottom: 6px; +} +.dz__label { + font-family: var(--font-mono); + font-size: 13px; + color: var(--text); + text-transform: uppercase; + letter-spacing: 0.08em; +} +.dz__hint { color: var(--muted); font-size: 11.5px; margin-top: 4px; } +.dz__hint a { color: var(--accent); } +.dz__filename { margin-top: 10px; color: var(--accent); font-size: 12px; font-family: var(--font-mono); min-height: 1em; } + + +.result { + margin-top: 20px; + padding: 14px; + border: 1px solid var(--border); + border-left: 3px solid var(--accent); + background: color-mix(in srgb, var(--accent) 4%, transparent); + font-family: var(--font-sans); + font-size: 13px; +} 
+.result--err { border-left-color: var(--negative); background: color-mix(in srgb, var(--negative) 5%, transparent); } +.result__head { + font-family: var(--font-mono); + font-size: 11px; + text-transform: uppercase; + letter-spacing: 0.08em; + color: var(--accent); + margin-bottom: 10px; +} +.result--err .result__head { color: var(--negative); } +.result__grid { + display: grid; + grid-template-columns: repeat(4, 1fr); + gap: 10px 18px; + margin-bottom: 10px; +} +.result__grid .k { + font-family: var(--font-mono); + font-size: 9.5px; + color: var(--muted); + text-transform: uppercase; + letter-spacing: 0.08em; +} +.result__grid .v { font-size: 17px; color: var(--text); font-variant-numeric: tabular-nums; margin-top: 2px; } +.result__grid .v.pos { color: var(--positive); } +.result__grid .v.neg { color: var(--negative); } +.result__row { color: var(--muted); font-size: 12px; margin-top: 6px; } +.result__warn { color: var(--alert); font-size: 12px; margin-top: 4px; } +.result__warn code { background: rgba(0,0,0,0.15); padding: 1px 4px; font-family: var(--font-mono); } + +/* --- Modal text inputs (cloud-sync PIN modal, etc.) ---------------- */ +/* Same visual treatment as auth-card so prompts read as a coherent + family. Replaces the inline `style="padding:8px"` that left these + inputs feeling cramped. */ +.modal-input { + width: 100%; + background: var(--bg); + border: 1px solid var(--border); + color: var(--text); + font-family: var(--font-mono); + font-size: 16px; + padding: 12px 14px; + margin-bottom: 12px; + outline: none; + border-radius: 3px; + box-sizing: border-box; +} +.modal-input:focus { border-color: var(--accent); } diff --git a/app/static/css/tokens.css b/app/static/css/tokens.css new file mode 100644 index 0000000..a3551b0 --- /dev/null +++ b/app/static/css/tokens.css @@ -0,0 +1,44 @@ +/* Cassandra — design tokens: palette, dark-theme overrides, font stacks. + * Must load first so all other files can var(--foo). 
*/ + +:root { + /* Light theme (default) */ + --bg: #f5f3ec; /* warm off-white, easier on the eyes than pure white */ + --surface: #ffffff; + --surface-2: #efece3; + --border: #d6d3cb; + --text: #1c1f25; + --muted: #545b69; + --dim: #8a8f9a; + --accent: #0e7490; /* deep teal — still terminal-feel on light */ + --positive: #166534; + --negative: #b91c1c; + --alert: #c2410c; + --warning: #a16207; + --user-bubble-bg: rgba(14, 116, 144, 0.07); +} + +[data-theme="dark"] { + --bg: #0a0e14; + --surface: #11151c; + --surface-2: #161b25; + --border: #2a3142; + --text: #d4dae8; /* lifted from #c0caf5 for readability */ + --muted: #8189a1; /* lifted from #565f89 — was unreadably dim */ + --dim: #565f89; + --accent: #00d9ff; + --positive: #50fa7b; + --negative: #ff5b5b; + --alert: #ff8a4a; + --warning: #f1fa8c; + --user-bubble-bg: rgba(0, 217, 255, 0.08); +} + +/* Font stacks. Mono for terminal feel; sans for reading. */ +:root { + --font-mono: 'JetBrains Mono', 'IBM Plex Mono', 'Fira Code', ui-monospace, monospace; + --font-sans: -apple-system, BlinkMacSystemFont, 'Inter', 'Segoe UI', Roboto, + 'Helvetica Neue', system-ui, sans-serif; +} + +* { box-sizing: border-box; } diff --git a/app/templates/base.html b/app/templates/base.html index fbf52e0..fd15361 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -36,7 +36,15 @@ } catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - + + + + + + + + + - + + +
            diff --git a/app/templates/public_base.html b/app/templates/public_base.html index 77e4186..427fc36 100644 --- a/app/templates/public_base.html +++ b/app/templates/public_base.html @@ -14,7 +14,12 @@ } catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - + + + + + +
            diff --git a/app/templates/verify.html b/app/templates/verify.html index 1399fe5..43637d4 100644 --- a/app/templates/verify.html +++ b/app/templates/verify.html @@ -10,7 +10,9 @@ catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - + + +
            diff --git a/tests/test_branding_consistency.py b/tests/test_branding_consistency.py index a10c3a4..e51fd2e 100644 --- a/tests/test_branding_consistency.py +++ b/tests/test_branding_consistency.py @@ -1,6 +1,6 @@ """Drift-detection: brand palette in `app/branding.py` must match the CSS. -Both the website (cassandra.css) and the email templates use the same +Both the website (tokens.css) and the email templates use the same palette. The CSS hand-authors the values in :root and [data-theme="light"] blocks; this test parses those blocks and asserts every variable matches its counterpart in branding.py. If a colour changes, both must change. @@ -15,7 +15,7 @@ import pytest from app import branding -CSS_PATH = Path(__file__).resolve().parent.parent / "app" / "static" / "css" / "cassandra.css" +CSS_PATH = Path(__file__).resolve().parent.parent / "app" / "static" / "css" / "tokens.css" def _extract_vars(css: str, selector: str) -> dict[str, str]: @@ -23,7 +23,7 @@ def _extract_vars(css: str, selector: str) -> dict[str, str]: selector block. Strips whitespace; lowercases hex values.""" # Match the selector followed by its block. Non-greedy on the body to # stop at the first closing brace at the same depth (these blocks - # don't nest in cassandra.css). + # don't nest in tokens.css). pattern = re.escape(selector) + r"\s*\{([^}]*)\}" m = re.search(pattern, css) if not m: From 7348055d7298c4090053679dcd97d43ae4f83644 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 12:36:55 +0200 Subject: [PATCH 39/69] llm: estimate cost from tokens when provider omits it MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit DeepSeek's native API returns prompt_tokens/completion_tokens but not `usage.cost`. OpenRouter returns both. 
Result: with DeepSeek-direct as primary (current default), every LogResult.cost_usd was None — and every downstream cost ledger row (AICall, StrategicLog, IndicatorSummary, translation tables) stored None instead of the real spend. Added a per-model rate table and fallback computation in _call_provider: when the upstream omits cost, multiply tokens by the table rates. If the upstream DOES return cost, keep it (authoritative). Falls back to None if both the upstream and the table miss. deepseek-v4-flash rates: \$0.07/M input, \$0.28/M output (per DeepSeek). --- app/services/openrouter.py | 39 +++++++++++++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/app/services/openrouter.py b/app/services/openrouter.py index c1ddb4f..ca31f2f 100644 --- a/app/services/openrouter.py +++ b/app/services/openrouter.py @@ -20,6 +20,31 @@ from app.config import get_settings OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions" +# Per-model USD rates: (input_per_million, output_per_million). +# OpenRouter returns `usage.cost` directly; DeepSeek's native API does not. +# Used as a fallback when the upstream omits the cost field. +_MODEL_PRICING_USD_PER_MILLION: dict[str, tuple[float, float]] = { + "deepseek-v4-flash": (0.07, 0.28), + "deepseek/deepseek-v4-flash": (0.07, 0.28), + "deepseek-chat": (0.27, 1.10), + "deepseek-reasoner": (0.55, 2.19), +} + + +def _estimate_cost_usd(model: str, prompt_tokens, completion_tokens) -> float | None: + """Compute cost from token counts when the upstream didn't return one. + + Returns None if either token count is missing or the model isn't in + the pricing table — caller falls back to whatever value the upstream + did (or didn't) return. 
+ """ + rates = _MODEL_PRICING_USD_PER_MILLION.get(model) + if rates is None or prompt_tokens is None or completion_tokens is None: + return None + in_rate, out_rate = rates + return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000.0 + + @dataclass class LogResult: content: str @@ -141,13 +166,21 @@ async def _call_provider( f"provider={provider}, model={used_model}, max_tokens={max_tokens})" ) usage = data.get("usage") or {} + prompt_tokens = usage.get("prompt_tokens") + completion_tokens = usage.get("completion_tokens") + # OpenRouter populates `usage.cost`; DeepSeek's native API doesn't — + # estimate from tokens × per-model rates so the cost ledger stays + # populated regardless of which provider answered. + cost_usd = usage.get("cost") or usage.get("total_cost") + if cost_usd is None: + cost_usd = _estimate_cost_usd(used_model, prompt_tokens, completion_tokens) return LogResult( content=content, # Record provider+model so admin can see which path produced this row. model=f"{provider}/{used_model}", - prompt_tokens=usage.get("prompt_tokens"), - completion_tokens=usage.get("completion_tokens"), - cost_usd=usage.get("cost") or usage.get("total_cost"), + prompt_tokens=prompt_tokens, + completion_tokens=completion_tokens, + cost_usd=cost_usd, ) From c5fb4525f395830b23402d3e9238ae5bff07337c Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 12:37:06 +0200 Subject: [PATCH 40/69] jobs: per-row savepoint + aggregate logging in translation fan-out MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously translate_log_for_active_languages and translate_summary_for_active_languages added every successful translation to the session and called session.commit() once at the end. A single bad row (DB error, constraint violation, encoding mismatch) rolled back the whole batch — losing all the languages that had succeeded. 
Wrap each row in session.begin_nested() so a per-row failure only loses that one row. Track succeeded/failed counts and log them at the end — escalating to error if zero succeeded out of N attempted, so total failure surfaces in monitoring instead of just N warning lines. --- app/jobs/ai_log_job.py | 43 +++++++++++++++++++++---------- app/jobs/indicator_summary_job.py | 41 ++++++++++++++++++++--------- 2 files changed, 59 insertions(+), 25 deletions(-) diff --git a/app/jobs/ai_log_job.py b/app/jobs/ai_log_job.py index 2c0277e..9b5683e 100644 --- a/app/jobs/ai_log_job.py +++ b/app/jobs/ai_log_job.py @@ -40,9 +40,9 @@ async def translate_log_for_active_languages(session, log_id: int) -> None: Reads ``users.lang`` (deduplicated, restricted to ACTIVE_LANGUAGES minus English), one translation call per language in parallel via ``asyncio.gather``, persists each successful result as a - ``StrategicLogTranslation`` row. Per-language failures are logged - but never raise — the strategic log itself is already committed at - this point and translation is a best-effort enhancement. + ``StrategicLogTranslation`` row. Each row is committed in its own + savepoint so a per-language LLM error or DB error doesn't roll back + the languages that already succeeded. The job orchestrator calls this AFTER the English ``StrategicLog`` row is committed; pass the row's ``id`` in. 
@@ -68,22 +68,39 @@ async def translate_log_for_active_languages(session, log_id: int) -> None: for lang in active_langs ], return_exceptions=True) + succeeded = 0 + failed = 0 for lang, result in zip(active_langs, results): if isinstance(result, Exception): log.warning("log.translate.failed", lang=lang, log_id=log_id, error=str(result)[:200]) + failed += 1 continue translated_md, llm_result = result - session.add(StrategicLogTranslation( - log_id=log_id, lang=lang, - content=translated_md, - generated_at=utcnow(), - model=llm_result.model, - prompt_tokens=llm_result.prompt_tokens, - completion_tokens=llm_result.completion_tokens, - cost_usd=llm_result.cost_usd, - )) - await session.commit() + try: + async with session.begin_nested(): + session.add(StrategicLogTranslation( + log_id=log_id, lang=lang, + content=translated_md, + generated_at=utcnow(), + model=llm_result.model, + prompt_tokens=llm_result.prompt_tokens, + completion_tokens=llm_result.completion_tokens, + cost_usd=llm_result.cost_usd, + )) + await session.commit() + succeeded += 1 + except Exception as exc: + log.warning("log.translate.persist_failed", + lang=lang, log_id=log_id, error=str(exc)[:200]) + failed += 1 + + if failed and succeeded == 0: + log.error("log.translate.all_failed", + log_id=log_id, attempted=len(active_langs)) + else: + log.info("log.translate.done", + log_id=log_id, succeeded=succeeded, failed=failed) async def run() -> None: diff --git a/app/jobs/indicator_summary_job.py b/app/jobs/indicator_summary_job.py index fb21f24..97c5f80 100644 --- a/app/jobs/indicator_summary_job.py +++ b/app/jobs/indicator_summary_job.py @@ -47,8 +47,8 @@ async def translate_summary_for_active_languages(session, summary_id: int) -> No Mirrors ``ai_log_job.translate_log_for_active_languages``: reads the distinct non-en ``users.lang`` set, translates the English content once per active language in parallel via ``asyncio.gather``, and - persists each result as an ``IndicatorSummaryTranslation`` row. 
- Per-language failures are logged but never raise. + persists each result as an ``IndicatorSummaryTranslation`` row in + its own savepoint so one bad row doesn't lose the rest. """ target_langs = sorted({l for l in ACTIVE_LANGUAGES if l != "en"}) if not target_langs: @@ -70,23 +70,40 @@ async def translate_summary_for_active_languages(session, summary_id: int) -> No for lang in active_langs ], return_exceptions=True) + succeeded = 0 + failed = 0 for lang, result in zip(active_langs, results): if isinstance(result, Exception): log.warning("ind_summary.translate.failed", lang=lang, summary_id=summary_id, error=str(result)[:200]) + failed += 1 continue translated_md, llm_result = result - session.add(IndicatorSummaryTranslation( - summary_id=summary_id, lang=lang, - content=translated_md, - generated_at=utcnow(), - model=llm_result.model, - prompt_tokens=llm_result.prompt_tokens, - completion_tokens=llm_result.completion_tokens, - cost_usd=llm_result.cost_usd, - )) - await session.commit() + try: + async with session.begin_nested(): + session.add(IndicatorSummaryTranslation( + summary_id=summary_id, lang=lang, + content=translated_md, + generated_at=utcnow(), + model=llm_result.model, + prompt_tokens=llm_result.prompt_tokens, + completion_tokens=llm_result.completion_tokens, + cost_usd=llm_result.cost_usd, + )) + await session.commit() + succeeded += 1 + except Exception as exc: + log.warning("ind_summary.translate.persist_failed", + lang=lang, summary_id=summary_id, error=str(exc)[:200]) + failed += 1 + + if failed and succeeded == 0: + log.error("ind_summary.translate.all_failed", + summary_id=summary_id, attempted=len(active_langs)) + else: + log.info("ind_summary.translate.done", + summary_id=summary_id, succeeded=succeeded, failed=failed) # Strip known meta-commentary openers the model sometimes leaks despite the From 83995e96c8dfa6e7c2b543ad537076f07f1255c9 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 12:42:40 +0200 Subject: [PATCH 
41/69] stripe: detect buyer currency at checkout (GBP/USD/EUR) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pass `currency` to Stripe checkout for first-time buyers so Stripe picks the matching `currency_options` rate configured on the Price in the Dashboard (multi-currency Prices: one Price, per-currency unit_amount). Operator configures the rates on existing Prices prod_UaZ0xCpCboUGCN/price_*; this commit is the application-side signal. Currency precedence: explicit request body > Cloudflare cf-ipcountry header > Accept-Language locale > GBP fallback. Only honoured when the user has no stripe_customer_id yet — Stripe locks currency to the customer record at first checkout, so existing customers keep their original currency (they can switch via the portal). Adds 4 tests: sniffed currency on new customer, body override beats sniff, currency omitted for existing customer, and unit-tests for the sniffing fallback chain. --- app/routers/stripe_billing.py | 61 ++++++++++++++++++++++- tests/test_stripe_billing.py | 94 +++++++++++++++++++++++++++++++++++ 2 files changed, 154 insertions(+), 1 deletion(-) diff --git a/app/routers/stripe_billing.py b/app/routers/stripe_billing.py index 60bc7f7..bfdeed0 100644 --- a/app/routers/stripe_billing.py +++ b/app/routers/stripe_billing.py @@ -19,7 +19,7 @@ from __future__ import annotations import asyncio import json -from typing import Any, Literal +from typing import Any, Literal, Optional import stripe from fastapi import APIRouter, Body, Depends, HTTPException, Request @@ -69,6 +69,53 @@ def _price_for(cadence: str) -> str: raise HTTPException(status_code=400, detail="cadence must be 'monthly' or 'annual'") +# Rough country → currency mapping. Covers the markets we have a stated +# rate for; everything else falls back to GBP (the home currency) and +# Stripe handles the FX at checkout. 
Configure the per-currency +# unit_amount on each Price's `currency_options` in the Stripe Dashboard +# — we just signal which option to use here. +_COUNTRY_CURRENCY: dict[str, str] = { + "US": "usd", "CA": "usd", + "GB": "gbp", "IM": "gbp", "JE": "gbp", "GG": "gbp", + **dict.fromkeys(( + "DE", "FR", "IT", "ES", "PT", "NL", "BE", "IE", "AT", "FI", + "GR", "LU", "MT", "CY", "EE", "LV", "LT", "SI", "SK", "HR", + ), "eur"), +} + +# Accept-Language locale → currency, used when CF-IPCountry is absent. +# Ambiguous locales (e.g. plain "fr" without region) get EUR because +# that's the majority outcome. +_LOCALE_CURRENCY: dict[str, str] = { + "en-gb": "gbp", "en": "gbp", + "en-us": "usd", "en-ca": "usd", + "fr": "eur", "de": "eur", "it": "eur", "es": "eur", + "pt": "eur", "nl": "eur", +} + + +def _sniff_currency(request: Request) -> str: + """Best-effort currency detection for new-customer checkouts. + + Order: explicit Cloudflare country header, then Accept-Language + (exact match then language-only). GBP as the final fallback. Only + consulted when the user has no Stripe customer record yet — Stripe + locks currency at customer creation, so an existing customer's + currency wins regardless of the request locale. 
+ """ + cc = (request.headers.get("cf-ipcountry") or "").upper() + if cc in _COUNTRY_CURRENCY: + return _COUNTRY_CURRENCY[cc] + al = (request.headers.get("accept-language") or "").lower() + first = al.split(",", 1)[0].split(";", 1)[0].strip() + if first in _LOCALE_CURRENCY: + return _LOCALE_CURRENCY[first] + short = first.split("-", 1)[0] + if short in _LOCALE_CURRENCY: + return _LOCALE_CURRENCY[short] + return "gbp" + + def _stripe_client() -> stripe.StripeClient: """Per-call client so we read the secret at request time (lets us rotate the key by editing .env + reloading without rebuilding any @@ -83,6 +130,10 @@ def _stripe_client() -> stripe.StripeClient: class CheckoutRequest(BaseModel): cadence: Literal["monthly", "annual"] + # Optional override; when omitted we sniff from request headers. + # Honoured only for first-time checkouts (Stripe locks currency + # to the customer at creation). + currency: Optional[Literal["gbp", "usd", "eur"]] = None class CheckoutResponse(BaseModel): @@ -92,6 +143,7 @@ class CheckoutResponse(BaseModel): @router.post("/api/stripe/checkout", response_model=CheckoutResponse) async def create_checkout( body: CheckoutRequest, + request: Request, session: AsyncSession = Depends(get_session), cu: CurrentUser = Depends(require_auth), ) -> CheckoutResponse: @@ -120,6 +172,13 @@ async def create_checkout( # referral redemption flow ships. "allow_promotion_codes": True, } + # Multi-currency: for first-time buyers (no stripe_customer_id yet) + # we pass the detected/requested currency. Stripe picks the matching + # `currency_options` rate configured on the Price in the Dashboard, + # then locks that currency to the new customer record. Existing + # customers keep their original currency regardless. + if not user.stripe_customer_id: + create_kwargs["currency"] = body.currency or _sniff_currency(request) # Per-cadence cooling-off treatment: # # - Annual gets a 14-day free trial. 
No money moves during the diff --git a/tests/test_stripe_billing.py b/tests/test_stripe_billing.py index f00e72d..d231cd2 100644 --- a/tests/test_stripe_billing.py +++ b/tests/test_stripe_billing.py @@ -463,3 +463,97 @@ def test_checkout_endpoint_requires_login(tmp_path): r = client.post("/api/stripe/checkout", json={"cadence": "monthly"}) # No session cookie → require_auth bounces with 401. assert r.status_code == 401, r.text + + +def test_checkout_passes_sniffed_currency_for_new_customer(tmp_path): + """First-time buyer (no stripe_customer_id yet) gets the currency + sniffed from the request. CF-IPCountry=US → 'usd', and Stripe will + look up the USD currency_option on the Price.""" + client, _, session_cookie = _build_app(tmp_path) + + def asserter(params): + assert params["currency"] == "usd" + + with patch("app.routers.stripe_billing._stripe_client", + return_value=_fake_checkout_client(asserter)): + r = client.post( + "/api/stripe/checkout", + json={"cadence": "monthly"}, + cookies={"cassandra_session": session_cookie}, + headers={"cf-ipcountry": "US"}, + ) + assert r.status_code == 200, r.text + + +def test_checkout_body_currency_overrides_sniff(tmp_path): + """Explicit `currency` in the request body beats header sniffing — + lets a UK-based buyer choose EUR if they want to.""" + client, _, session_cookie = _build_app(tmp_path) + + def asserter(params): + assert params["currency"] == "eur" + + with patch("app.routers.stripe_billing._stripe_client", + return_value=_fake_checkout_client(asserter)): + r = client.post( + "/api/stripe/checkout", + json={"cadence": "monthly", "currency": "eur"}, + cookies={"cassandra_session": session_cookie}, + headers={"cf-ipcountry": "GB"}, + ) + assert r.status_code == 200, r.text + + +def test_checkout_omits_currency_for_existing_customer(tmp_path): + """Existing customer: Stripe locked their currency at first + checkout, so passing `currency` again would error. 
Verify we omit + it (and also use the existing `customer` ref instead of + customer_email).""" + import asyncio + + from app.models import User + + client, factory, session_cookie = _build_app(tmp_path) + + async def _link(): + async with factory() as s: + u = await s.get(User, 1) + u.stripe_customer_id = "cus_existing_xxxxxxxxxxxxxx" + await s.commit() + + asyncio.run(_link()) + + def asserter(params): + assert "currency" not in params, ( + "currency must not be passed once a customer exists — " + "Stripe rejects mismatches against the locked customer currency" + ) + assert params["customer"] == "cus_existing_xxxxxxxxxxxxxx" + + with patch("app.routers.stripe_billing._stripe_client", + return_value=_fake_checkout_client(asserter)): + r = client.post( + "/api/stripe/checkout", + json={"cadence": "monthly", "currency": "usd"}, + cookies={"cassandra_session": session_cookie}, + headers={"cf-ipcountry": "US"}, + ) + assert r.status_code == 200, r.text + + +def test_sniff_currency_fallback_chain(): + """Unit-test the header-sniffing helper: CF country wins, then + Accept-Language exact, then language-only, then GBP default.""" + from types import SimpleNamespace + + from app.routers.stripe_billing import _sniff_currency + + def _req(headers): + return SimpleNamespace(headers=headers) + + assert _sniff_currency(_req({"cf-ipcountry": "DE"})) == "eur" + assert _sniff_currency(_req({"cf-ipcountry": "us"})) == "usd" # case-insensitive + assert _sniff_currency(_req({"accept-language": "fr-FR,fr;q=0.9"})) == "eur" + assert _sniff_currency(_req({"accept-language": "en-US,en;q=0.5"})) == "usd" + assert _sniff_currency(_req({"accept-language": "ja,ja-JP;q=0.5"})) == "gbp" + assert _sniff_currency(_req({})) == "gbp" From f9f4f25ef7e008766f0b81a333a6e7a7e2d92638 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 13:58:28 +0200 Subject: [PATCH 42/69] tests: backfill coverage for openrouter transport, auth sessions, cadence Three new test files covering modules 
the audit flagged as having zero direct coverage: - test_openrouter_transport.py (18 tests): provider chain selection, endpoint resolution, _call_provider parse path (including the reasoning-field fallback and token-based cost estimation), and call_llm's cross-provider failover. Uses httpx.MockTransport so no network. Patches _call_provider for failover tests to bypass tenacity's retry delays. - test_auth_session.py (7 tests): sign/verify round-trip, tampered cookie rejection, expired cookie rejection (via TTL monkeypatch), garbage input handling, salt isolation between session and pending serializers, and rejection of cookies signed with a different secret. - test_cadence_policy.py (16 tests): is_active_window weekday/weekend + half-open interval boundaries, min_gap_hours across bands, should_run gating for first-run / active / off-hours / weekend / naive-datetime cases, and the NEWS_POLICY 20-minute / 3-hour variations. Suite goes from 291 to 336 passing. --- tests/test_auth_session.py | 81 +++++++++ tests/test_cadence_policy.py | 163 ++++++++++++++++++ tests/test_openrouter_transport.py | 256 +++++++++++++++++++++++++++++ 3 files changed, 500 insertions(+) create mode 100644 tests/test_auth_session.py create mode 100644 tests/test_cadence_policy.py create mode 100644 tests/test_openrouter_transport.py diff --git a/tests/test_auth_session.py b/tests/test_auth_session.py new file mode 100644 index 0000000..b8add6e --- /dev/null +++ b/tests/test_auth_session.py @@ -0,0 +1,81 @@ +"""Session cookie sign/verify — security-critical edges that the +existing test suite uses as a fixture (``sign_session(1)`` for cookies) +but doesn't actually probe. 
+ +Covers: +- Round-trip: sign(user_id) → verify → user_id +- Tampered cookie → None (not raised) +- Expired cookie → None (via itsdangerous max_age) +- Garbage / non-serializer-format input → None +- Wrong-salt isolation: a pending cookie can't be unlocked by the + session verifier (and vice versa) +""" +from __future__ import annotations + +from itsdangerous import URLSafeTimedSerializer + +from app import auth + + +def test_session_signed_token_round_trips(): + cookie = auth.sign_session(42) + assert auth.verify_session(cookie) == 42 + + +def test_session_token_is_opaque_url_safe(): + """Sanity: the serializer produces a URL-safe string with at least + two dot-separated segments (payload.timestamp.signature). Not a + semantic test, but catches a future swap to an un-encoded format.""" + cookie = auth.sign_session(7) + assert "." in cookie + assert " " not in cookie + + +def test_tampered_session_cookie_returns_none(): + """Flip a single character in the signature segment and verify + the cookie no longer authenticates — without exceptions leaking.""" + cookie = auth.sign_session(99) + # Flip the last character (signature segment). + tampered = cookie[:-1] + ("a" if cookie[-1] != "a" else "b") + assert auth.verify_session(tampered) is None + + +def test_garbage_session_cookie_returns_none(): + assert auth.verify_session("not-a-real-cookie") is None + assert auth.verify_session("") is None + assert auth.verify_session("a.b.c") is None + + +def test_expired_session_cookie_returns_none(monkeypatch): + """Confirm the TTL check rejects an expired cookie. Rather than + forging an ancient timestamp by hand, the test shrinks the TTL so a + freshly signed cookie is already past its max_age.""" + s = auth._serializer() + # itsdangerous stores the issued-at timestamp in a base64 segment. + # Easier than hand-building: monkeypatch the SESSION_TTL_SECONDS + # to a negative value so any freshly-signed cookie is "expired" + # the moment we verify it. 
+ cookie = auth.sign_session(123) + monkeypatch.setattr(auth, "SESSION_TTL_SECONDS", -1) + assert auth.verify_session(cookie) is None + + +def test_session_serializer_isolated_from_pending_serializer(): + """A pending-verify cookie must not authenticate as a session + (different salts), and vice versa — otherwise the half-finished + OTP flow becomes a free login.""" + pending = auth.sign_pending("u@x", 5) + session = auth.sign_session(5) + assert auth.verify_session(pending) is None + assert auth.verify_pending(session) is None + + +def test_session_cookie_signed_with_different_secret_rejected(monkeypatch): + """Defence-in-depth: signing with a different secret produces a + cookie that the live verifier (using the configured secret) + rejects. Confirms we're actually checking the HMAC, not just the + payload format.""" + rogue = URLSafeTimedSerializer("totally-different-secret", + salt="cassandra-session-v1") + rogue_cookie = rogue.dumps({"uid": 1}) + assert auth.verify_session(rogue_cookie) is None diff --git a/tests/test_cadence_policy.py b/tests/test_cadence_policy.py new file mode 100644 index 0000000..19b9990 --- /dev/null +++ b/tests/test_cadence_policy.py @@ -0,0 +1,163 @@ +"""Cadence policy — the gate that ai_log_job and indicator_summary_job +use to throttle OpenRouter spend outside active market hours. + +Pure-function module, so tests just construct timestamps and assert on +the (should_run, reason) tuple. Uses the default policy (active window +07:00-21:00 UTC weekdays, no off-hours runs without 4+ hours since +last success, weekends 12+ hours). +""" +from __future__ import annotations + +from datetime import datetime, timedelta, timezone + +import pytest + +from app.services.cadence import DEFAULT_POLICY, NEWS_POLICY, CadencePolicy + + +def _utc(year, month, day, hour, minute=0): + return datetime(year, month, day, hour, minute, tzinfo=timezone.utc) + + +# Pick reference timestamps used across tests. 
Wednesday 12:00 UTC is +# squarely inside the active window; Wednesday 03:00 is off-hours; +# Saturday 12:00 is weekend. +_WED_NOON = _utc(2026, 5, 27, 12) # Wednesday 12:00 +_WED_PRE_DAWN = _utc(2026, 5, 27, 3) # Wednesday 03:00 +_SAT_NOON = _utc(2026, 5, 30, 12) # Saturday 12:00 + + +# --------------------------------------------------------------------------- +# is_active_window +# --------------------------------------------------------------------------- + + +def test_active_window_weekday_noon_is_active(): + assert DEFAULT_POLICY.is_active_window(_WED_NOON) is True + + +def test_active_window_weekday_predawn_is_off_hours(): + assert DEFAULT_POLICY.is_active_window(_WED_PRE_DAWN) is False + + +def test_active_window_weekend_always_off_hours(): + """Weekends bypass the hour check — even Saturday noon is throttled.""" + assert DEFAULT_POLICY.is_active_window(_SAT_NOON) is False + + +def test_active_window_boundary_inclusive_start_exclusive_end(): + """07:00 UTC is the first active hour; 21:00 is the first off-hour. 
+ Locks the half-open interval semantics in place.""" + assert DEFAULT_POLICY.is_active_window(_utc(2026, 5, 27, 7)) is True + assert DEFAULT_POLICY.is_active_window(_utc(2026, 5, 27, 21)) is False + + +# --------------------------------------------------------------------------- +# min_gap_hours +# --------------------------------------------------------------------------- + + +def test_min_gap_uses_zero_during_active_window(): + assert DEFAULT_POLICY.min_gap_hours(_WED_NOON) == 0.0 + + +def test_min_gap_uses_off_hours_value_at_night(): + assert DEFAULT_POLICY.min_gap_hours(_WED_PRE_DAWN) == 4.0 + + +def test_min_gap_uses_weekend_value_on_saturday(): + assert DEFAULT_POLICY.min_gap_hours(_SAT_NOON) == 12.0 + + +# --------------------------------------------------------------------------- +# should_run — the function jobs call +# --------------------------------------------------------------------------- + + +def test_should_run_first_ever_call_always_proceeds(): + ok, reason = DEFAULT_POLICY.should_run(None, now=_WED_NOON) + assert ok is True + assert "no prior" in reason.lower() + + +def test_should_run_during_active_window_always_proceeds(): + """Default policy has active_gap_h=0, so even a run from 1 minute ago + is allowed when we're in the active window.""" + last = _WED_NOON - timedelta(minutes=1) + ok, reason = DEFAULT_POLICY.should_run(last, now=_WED_NOON) + assert ok is True + assert "active" in reason + + +def test_should_run_off_hours_too_soon_is_throttled(): + """Off-hours requires 4+ hours since last success. 
1 hour ago → no.""" + last = _WED_PRE_DAWN - timedelta(hours=1) + ok, reason = DEFAULT_POLICY.should_run(last, now=_WED_PRE_DAWN) + assert ok is False + assert "throttled" in reason + assert "off-hours" in reason + + +def test_should_run_off_hours_after_gap_proceeds(): + last = _WED_PRE_DAWN - timedelta(hours=5) + ok, reason = DEFAULT_POLICY.should_run(last, now=_WED_PRE_DAWN) + assert ok is True + assert "off-hours" in reason + + +def test_should_run_weekend_requires_12h_gap(): + """Weekend gap is 12h. 6h is too soon; 13h is enough.""" + ok6, _ = DEFAULT_POLICY.should_run( + _SAT_NOON - timedelta(hours=6), now=_SAT_NOON, + ) + ok13, _ = DEFAULT_POLICY.should_run( + _SAT_NOON - timedelta(hours=13), now=_SAT_NOON, + ) + assert ok6 is False + assert ok13 is True + + +def test_should_run_naive_datetime_treated_as_utc(): + """The DB column comes back as a naive datetime in some test paths; + the policy must coerce it to UTC rather than crash on tz subtraction.""" + naive_last = _WED_PRE_DAWN.replace(tzinfo=None) - timedelta(hours=5) + ok, _ = DEFAULT_POLICY.should_run(naive_last, now=_WED_PRE_DAWN) + assert ok is True + + +# --------------------------------------------------------------------------- +# NEWS_POLICY — tighter gaps so 3 runs/hour during the active window. +# --------------------------------------------------------------------------- + + +def test_news_policy_active_gap_is_twenty_minutes(): + # 20 minutes = 1/3 hour. Verify a 15-min-ago run is throttled but + # a 21-min-ago one is allowed. 
+ last_15 = _WED_NOON - timedelta(minutes=15) + last_21 = _WED_NOON - timedelta(minutes=21) + assert NEWS_POLICY.should_run(last_15, now=_WED_NOON)[0] is False + assert NEWS_POLICY.should_run(last_21, now=_WED_NOON)[0] is True + + +def test_news_policy_off_hours_gap_is_three_hours(): + last_2h = _WED_PRE_DAWN - timedelta(hours=2) + last_4h = _WED_PRE_DAWN - timedelta(hours=4) + assert NEWS_POLICY.should_run(last_2h, now=_WED_PRE_DAWN)[0] is False + assert NEWS_POLICY.should_run(last_4h, now=_WED_PRE_DAWN)[0] is True + + +# --------------------------------------------------------------------------- +# Bespoke policy — confirms the dataclass is reconfigurable for callers +# (the audit flagged this as risky to over-fit to defaults). +# --------------------------------------------------------------------------- + + +def test_custom_policy_with_active_gap_throttles_within_window(): + """active_gap_h=0.5 means even during the active window a run from + 20 minutes ago is throttled — verifies the gate isn't hardcoded to + 'always run during active'.""" + p = CadencePolicy(active_gap_h=0.5) + last = _WED_NOON - timedelta(minutes=20) + ok, reason = p.should_run(last, now=_WED_NOON) + assert ok is False + assert "throttled" in reason diff --git a/tests/test_openrouter_transport.py b/tests/test_openrouter_transport.py new file mode 100644 index 0000000..dfc14b0 --- /dev/null +++ b/tests/test_openrouter_transport.py @@ -0,0 +1,256 @@ +"""Transport-layer tests for app.services.openrouter. + +The companion file `test_openrouter_prompt.py` covers prompt building; +this one covers the HTTP plumbing: provider chain selection, endpoint +resolution, the per-call retry/parse path in `_call_provider`, and +fallback behaviour in `call_llm`. Network requests are intercepted with +``httpx.MockTransport`` so nothing hits the wire. 
+""" +from __future__ import annotations + +import json +from unittest.mock import patch + +import httpx +import pytest + +from app.config import get_settings +from app.services import openrouter as ot + + +# --------------------------------------------------------------------------- +# _estimate_cost_usd +# --------------------------------------------------------------------------- + + +def test_estimate_cost_known_model_uses_table_rates(): + # deepseek-v4-flash table: 0.07/M input, 0.28/M output. + # 1000 in + 2000 out = 0.000_07 + 0.000_56 = 0.000_63. + cost = ot._estimate_cost_usd("deepseek-v4-flash", 1000, 2000) + assert cost == pytest.approx(0.00063, rel=1e-9) + + +def test_estimate_cost_handles_provider_prefixed_model_name(): + # OpenRouter-style model strings use the slash-prefixed form. + cost = ot._estimate_cost_usd("deepseek/deepseek-v4-flash", 1000, 2000) + assert cost == pytest.approx(0.00063, rel=1e-9) + + +def test_estimate_cost_unknown_model_returns_none(): + assert ot._estimate_cost_usd("never-heard-of-this-model", 100, 200) is None + + +def test_estimate_cost_missing_tokens_returns_none(): + assert ot._estimate_cost_usd("deepseek-v4-flash", None, 200) is None + assert ot._estimate_cost_usd("deepseek-v4-flash", 100, None) is None + assert ot._estimate_cost_usd("deepseek-v4-flash", None, None) is None + + +# --------------------------------------------------------------------------- +# _provider_chain / llm_configured / active_model +# --------------------------------------------------------------------------- + + +def _configure(monkeypatch, **overrides): + """Apply a small bundle of LLM settings for one test.""" + s = get_settings() + defaults = { + "LLM_PROVIDER": "deepseek", + "LLM_FALLBACK": "openrouter", + "DEEPSEEK_API_KEY": "", + "OPENROUTER_API_KEY": "", + "DEEPSEEK_MODEL": "deepseek-v4-flash", + "OPENROUTER_MODEL": "deepseek/deepseek-v4-flash", + "DEEPSEEK_URL": "https://api.deepseek.com/chat/completions", + } + defaults.update(overrides) 
+ for k, v in defaults.items(): + monkeypatch.setattr(s, k, v, raising=False) + + +def test_provider_chain_drops_providers_without_keys(monkeypatch): + _configure(monkeypatch, DEEPSEEK_API_KEY="sk-deepseek") # openrouter key missing + assert ot._provider_chain() == ["deepseek"] + assert ot.llm_configured() is True + + +def test_provider_chain_lists_primary_then_fallback(monkeypatch): + _configure(monkeypatch, + DEEPSEEK_API_KEY="sk-deepseek", OPENROUTER_API_KEY="sk-openrouter") + assert ot._provider_chain() == ["deepseek", "openrouter"] + + +def test_provider_chain_skips_duplicate_when_primary_equals_fallback(monkeypatch): + _configure(monkeypatch, LLM_FALLBACK="deepseek", DEEPSEEK_API_KEY="sk") + assert ot._provider_chain() == ["deepseek"] + + +def test_llm_configured_false_when_no_keys(monkeypatch): + _configure(monkeypatch) # both keys empty + assert ot.llm_configured() is False + assert ot._provider_chain() == [] + assert ot.active_model() == "unknown" + + +def test_active_model_reflects_primary(monkeypatch): + _configure(monkeypatch, + LLM_PROVIDER="openrouter", OPENROUTER_API_KEY="sk-or", + DEEPSEEK_API_KEY="") + assert ot.active_model() == "deepseek/deepseek-v4-flash" # OPENROUTER_MODEL + + +# --------------------------------------------------------------------------- +# _endpoint_for +# --------------------------------------------------------------------------- + + +def test_endpoint_for_unknown_provider_raises(monkeypatch): + _configure(monkeypatch, DEEPSEEK_API_KEY="sk") + with pytest.raises(RuntimeError, match="Unknown LLM provider"): + ot._endpoint_for("anthropic") + + +def test_endpoint_for_provider_without_key_raises(monkeypatch): + _configure(monkeypatch) # both keys empty + with pytest.raises(RuntimeError, match="DEEPSEEK_API_KEY not set"): + ot._endpoint_for("deepseek") + with pytest.raises(RuntimeError, match="OPENROUTER_API_KEY not set"): + ot._endpoint_for("openrouter") + + +def 
test_endpoint_for_openrouter_includes_attribution_and_no_train_headers(monkeypatch): + _configure(monkeypatch, OPENROUTER_API_KEY="sk-or") + url, key, model, headers = ot._endpoint_for("openrouter") + assert url.endswith("/chat/completions") + assert key == "sk-or" + assert headers["X-OR-Allow-Training"] == "false" + assert "HTTP-Referer" in headers and "X-Title" in headers + + +# --------------------------------------------------------------------------- +# _call_provider (through call_llm so retry doesn't fire — happy paths only) +# --------------------------------------------------------------------------- + + +def _mock_post(callback): + """Wrap a callback into an httpx.MockTransport. Callback receives the + request and returns either an httpx.Response or raises.""" + return httpx.MockTransport(callback) + + +@pytest.mark.asyncio +async def test_call_llm_returns_parsed_log_result(monkeypatch): + _configure(monkeypatch, DEEPSEEK_API_KEY="sk-deepseek", LLM_FALLBACK="") + + def handler(request: httpx.Request) -> httpx.Response: + body = json.loads(request.content.decode()) + assert body["model"] == "deepseek-v4-flash" + return httpx.Response(200, json={ + "choices": [{"message": {"content": "hello"}, "finish_reason": "stop"}], + "usage": {"prompt_tokens": 100, "completion_tokens": 200}, + }) + + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + result = await ot.call_llm(client, [{"role": "user", "content": "hi"}]) + + assert result.content == "hello" + # Model is prefixed with the answering provider for ledger traceability. + assert result.model == "deepseek/deepseek-v4-flash" + assert result.prompt_tokens == 100 + assert result.completion_tokens == 200 + # DeepSeek doesn't return cost — estimated from tokens. + # 100 * 0.07 + 200 * 0.28 = 7 + 56 = 63 → 0.000063. 
+ assert result.cost_usd == pytest.approx(0.000063, rel=1e-9) + + +@pytest.mark.asyncio +async def test_call_llm_uses_upstream_cost_when_provided(monkeypatch): + """When the upstream supplies usage.cost (OpenRouter), we trust it + and skip the per-model table estimate.""" + _configure(monkeypatch, LLM_PROVIDER="openrouter", + OPENROUTER_API_KEY="sk-or", LLM_FALLBACK="") + + def handler(request: httpx.Request) -> httpx.Response: + return httpx.Response(200, json={ + "choices": [{"message": {"content": "ok"}, "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 50, "cost": 0.0042}, + }) + + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + result = await ot.call_llm(client, [{"role": "user", "content": "hi"}]) + + assert result.cost_usd == 0.0042 + + +@pytest.mark.asyncio +async def test_call_llm_falls_back_to_reasoning_field_when_content_null(monkeypatch): + """Thinking models sometimes return null `content` plus a populated + `reasoning` block — we surface the reasoning so the caller still gets + something usable rather than treating the row as empty.""" + _configure(monkeypatch, DEEPSEEK_API_KEY="sk-d", LLM_FALLBACK="") + + def handler(request: httpx.Request) -> httpx.Response: + return httpx.Response(200, json={ + "choices": [{ + "message": {"content": None, "reasoning": "deep thought"}, + "finish_reason": "stop", + }], + "usage": {"prompt_tokens": 10, "completion_tokens": 20}, + }) + + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + result = await ot.call_llm(client, [{"role": "user", "content": "hi"}]) + assert result.content == "deep thought" + + +@pytest.mark.asyncio +async def test_call_llm_raises_when_no_provider_configured(monkeypatch): + _configure(monkeypatch) # both keys empty + async with httpx.AsyncClient() as client: + with pytest.raises(RuntimeError, match="No LLM provider configured"): + await ot.call_llm(client, [{"role": "user", "content": "hi"}]) + + +# 
--------------------------------------------------------------------------- +# call_llm fallback chain — patch _call_provider to bypass the retry/sleep +# decorator and exercise the cross-provider failover logic directly. +# --------------------------------------------------------------------------- + + +@pytest.mark.asyncio +async def test_call_llm_falls_back_to_secondary_when_primary_raises(monkeypatch): + _configure(monkeypatch, + DEEPSEEK_API_KEY="sk-d", OPENROUTER_API_KEY="sk-or") + + calls = [] + success = ot.LogResult( + content="from-fallback", model="openrouter/deepseek/deepseek-v4-flash", + prompt_tokens=1, completion_tokens=2, cost_usd=0.0, + ) + + async def fake(_client, provider, _messages, _model, _max_tokens): + calls.append(provider) + if provider == "deepseek": + raise RuntimeError("primary down") + return success + + with patch.object(ot, "_call_provider", fake): + async with httpx.AsyncClient() as client: + result = await ot.call_llm(client, [{"role": "user", "content": "hi"}]) + + assert calls == ["deepseek", "openrouter"] + assert result.content == "from-fallback" + + +@pytest.mark.asyncio +async def test_call_llm_raises_last_exception_when_chain_exhausted(monkeypatch): + _configure(monkeypatch, + DEEPSEEK_API_KEY="sk-d", OPENROUTER_API_KEY="sk-or") + + async def fake(_client, provider, _messages, _model, _max_tokens): + raise RuntimeError(f"{provider} broken") + + with patch.object(ot, "_call_provider", fake): + async with httpx.AsyncClient() as client: + with pytest.raises(RuntimeError, match="openrouter broken"): + await ot.call_llm(client, [{"role": "user", "content": "hi"}]) From 4c1793e4e9d49bb9f4cfac57aca6fa39badae747 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 18:30:42 +0200 Subject: [PATCH 43/69] docs: mobile responsiveness design spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Captures the decisions from the brainstorm: phones-only (≤480px), all views in scope, 
right-side hamburger drawer, per-file @media blocks, hide secondary indicator columns. User opted to iterate on the coded product rather than running through writing-plans; spec exists so the rationale survives the session. --- .gitignore | 1 + ...2026-05-28-mobile-responsiveness-design.md | 88 +++++++++++++++++++ 2 files changed, 89 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-28-mobile-responsiveness-design.md diff --git a/.gitignore b/.gitignore index 0c0c5ae..168165b 100644 --- a/.gitignore +++ b/.gitignore @@ -16,3 +16,4 @@ build/ dist/ .coverage .mypy_cache/ +.superpowers/ diff --git a/docs/superpowers/specs/2026-05-28-mobile-responsiveness-design.md b/docs/superpowers/specs/2026-05-28-mobile-responsiveness-design.md new file mode 100644 index 0000000..769287b --- /dev/null +++ b/docs/superpowers/specs/2026-05-28-mobile-responsiveness-design.md @@ -0,0 +1,88 @@ +# Mobile responsiveness — design + +**Status:** approved 2026-05-28 (user opted to skip the implementation-plan +ceremony and iterate on the coded product instead). +**Scope:** all views, single ≤480px breakpoint, incremental media-query approach. + +## Decisions captured + +1. **Target device:** phones only. Single `@media (max-width: 480px)` breakpoint. + Tablets and small laptops keep the existing desktop layout. +2. **Scope:** every template (auth, public, app, dashboard, log, news, settings). +3. **Mobile topbar pattern:** hamburger drawer, **side-slide from the right**. +4. **Indicator table:** hide secondary columns on phones (`ccy`, `1y`, `anchor`, + `as_of`); keep symbol, price, 1d, 1m. +5. **CSS organisation:** per-file `@media` block at the bottom of each CSS file — + extends the pattern already in `layout.css`, `log-chat.css`, `news.css`, + `portfolio.css`, `public.css`. No central `mobile.css`. 
+ +## Architecture + +``` +tokens.css — no mobile rules +layout.css — drawer geometry + topbar mobile layout +panels.css — header padding tightens +dashboard.css — group tabs scroll, indicator table column-hiding +portfolio.css — overall grid 2-col, composer textarea full-width, action wrap +log-chat.css — body padding, bubble width +auth.css — card padding +settings.css — form rows stack +news.css — pill wrap, source under headline +public.css — tighten existing 520/560 rules, hero typography clamp +``` + +Two small additions to `base.html`: +- A hamburger button in `.app-header` (hidden on desktop via `display: none`, + shown at ≤480px). +- ~20 lines of vanilla JS to toggle `body.drawer-open` plus a backdrop element. + Tap-backdrop, ESC, and swipe-right-on-drawer all close. + +## Hamburger drawer (right-side) + +- `position: fixed; top: 0; right: 0; height: 100vh; width: min(82vw, 320px)` +- Transform animation: `translateX(100%) → translateX(0)`, `180ms ease-out` +- Backdrop: `rgba(0,0,0,0.4)`, fades in over `120ms` +- Existing nav + `.header-right` widgets get wrapped in `.mobile-drawer` which + is `display: contents` on desktop (zero layout effect) and the fixed slide-out + panel on mobile. +- The existing `.user-menu` dropdown chip hides on mobile; its links surface + flat inside the drawer. + +## Per-view rules + +**Dashboard.** Group-tabs `overflow-x: auto`, no-wrap. Indicator table hides +`Ccy / 1y / anchor / as_of` columns. Aggregate-read summary header tightens. + +**Portfolio.** `.pf-overall__grid` collapses to 2 columns. Composer textarea +becomes full-width. `.pf-actions` buttons wrap to two rows instead of squishing. + +**Log + chat.** Body padding `16px → 10px`. Chat bubbles `max-width: 100%`, +user bubble loses right margin so it reaches the screen edge. + +**News.** Tag pills flex-wrap. Source + timestamp move under headline. +Shift-click hint hides (touch users get long-press equivalent). + +**Settings.** Form rows stack — label above input. 
Two-column import picker +becomes single column. Digest preferences keep their layout. + +**Auth.** Card padding `28px 26px → 20px 18px`. Width already fluid. + +**Public pages.** Audit existing 520/560 breakpoints; tighten hero typography +with `clamp()` so it scales down for small phones. + +## Testing + +- No Python tests affected — this is pure CSS + a single template tweak. + Existing 336-pass suite stays green. +- Manual verification on the user's phone post-deploy. (User cannot reach + localhost on the dev host, so visual companion was abandoned mid-brainstorm + in favour of ASCII previews; same constraint means no local browser smoke + test from the assistant side either — user iterates on the deployed site.) + +## Out of scope + +- Tablet / small-laptop breakpoints. Single ≤480 only. +- Touch gestures beyond drawer swipe-right-to-close. +- Mobile-specific reordering of dashboard panels (existing collapse order + is preserved). +- Visual companion server work (host unreachable from user's browser). From 2b3ea33884651929e1ab3d66e7e1576560c7ea5b Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 18:36:37 +0200 Subject: [PATCH 44/69] mobile: hamburger drawer (right-side slide-out) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ≤480px gets a hamburger button in the topbar and a fixed slide-out panel from the right edge (width min(82vw, 320px)). The topbar keeps only brand + tone toggle + hamburger visible; nav and the header-right widgets (theme, lang, user menu, version meta) move into the drawer. Markup change: nav and .header-right are now wrapped in .mobile-drawer, which is display:contents on desktop (no layout effect) and a fixed translateX panel on mobile. The user-menu dropdown chip hides on mobile and its links surface flat inside the drawer. JS: ~50 lines of vanilla. Tap hamburger / backdrop / ESC / swipe- right-on-drawer all close. 
Clicking a nav link inside the drawer closes it after the navigation kicks off so the panel doesn't linger on the next page. CSS: per-file @media block at the bottom of layout.css per the agreed-upon organisation. --- app/static/css/layout.css | 176 ++++++++++++++++++++++++++++++++++++++ app/templates/base.html | 161 ++++++++++++++++++++++++---------- 2 files changed, 292 insertions(+), 45 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index c4293b6..2a66ccc 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -51,7 +51,46 @@ a:hover { text-decoration: underline; } .app-header nav a.active { color: var(--text); } .app-header .meta { color: var(--muted); font-size: 11px; } +/* On desktop the mobile-drawer wrapper has no layout effect — its + * children (nav, header-right) flow as if it weren't there. On mobile + * the @media block at the bottom converts it to a fixed slide-out. */ +.mobile-drawer { display: contents; } + .app-header .header-right { display: flex; align-items: center; gap: 14px; } + +/* Hamburger button — only visible at ≤480px (rule in the mobile block). + * Three thin bars; uses the same border/muted treatment as the other + * header buttons so the visual rhythm matches. 
*/ +.drawer-toggle { + display: none; + background: transparent; + border: 1px solid var(--border); + cursor: pointer; + padding: 6px 8px; + width: 36px; + height: 32px; + flex-direction: column; + justify-content: space-between; + align-items: stretch; +} +.drawer-toggle:hover { border-color: var(--accent); } +.drawer-toggle__bar { + display: block; + height: 2px; + background: var(--muted); + width: 100%; +} +.drawer-toggle:hover .drawer-toggle__bar { background: var(--accent); } + +.drawer-backdrop { + position: fixed; + inset: 0; + background: rgba(0, 0, 0, 0.4); + z-index: 90; + opacity: 0; + transition: opacity 120ms ease-out; +} +body.drawer-open .drawer-backdrop { opacity: 1; } .theme-toggle { background: transparent; border: 1px solid var(--border); @@ -183,3 +222,140 @@ a:hover { text-decoration: underline; } ::-webkit-scrollbar-track { background: var(--bg); } ::-webkit-scrollbar-thumb { background: var(--dim); border-radius: 0; } ::-webkit-scrollbar-thumb:hover { background: var(--muted); } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* Tighten the topbar so the brand, tone toggle and hamburger all fit + on a 360px phone. Drop the letter-spacing because at 11px tracking + eats horizontal space the brand cannot spare. */ + .app-header { + padding: 8px 12px; + gap: 8px; + letter-spacing: 0.04em; + } + .app-header .brand { + font-size: 12px; + /* Shrink the leading glyph but don't remove it — keeps brand identity. */ + } + .beta-chip { display: none; } + + /* Show the hamburger; the rest of the header widgets collapse into + the drawer (the .mobile-drawer block below). */ + .drawer-toggle { display: flex; margin-left: auto; } + + /* Keep the tone toggle visible but trim it: just N / I letters so it + fits next to brand + hamburger. 
*/ + .tone-toggle--header { + font-size: 9.5px; + } + .tone-toggle--header button { + padding: 4px 7px; + } + + /* The drawer wrapper: full-height slide-out from the right. The + content inside (nav + header-right) becomes a vertical stack + with comfortable touch targets. */ + .mobile-drawer { + display: flex; + flex-direction: column; + gap: 0; + position: fixed; + top: 0; + right: 0; + bottom: 0; + width: min(82vw, 320px); + background: var(--surface); + border-left: 1px solid var(--border); + box-shadow: -2px 0 12px rgba(0, 0, 0, 0.18); + transform: translateX(100%); + transition: transform 180ms ease-out; + z-index: 100; + overflow-y: auto; + padding: 56px 18px 24px; + text-transform: none; + letter-spacing: 0.02em; + } + body.drawer-open .mobile-drawer { transform: translateX(0); } + + /* Vertical nav inside the drawer — links become big-tap rows, no + leading margin like the desktop horizontal nav. */ + .mobile-drawer nav { display: flex; flex-direction: column; } + .mobile-drawer nav a { + margin-left: 0; + padding: 12px 4px; + border-bottom: 1px solid var(--border); + font-size: 14px; + text-transform: uppercase; + letter-spacing: 0.06em; + } + .mobile-drawer nav a.active { + color: var(--accent); + border-left: 2px solid var(--accent); + padding-left: 10px; + } + + /* header-right widgets vertically stacked inside the drawer. */ + .mobile-drawer .header-right { + flex-direction: column; + align-items: stretch; + gap: 14px; + margin-top: 20px; + } + .mobile-drawer .theme-toggle, + .mobile-drawer .lang-toggle { + width: 100%; + justify-content: center; + } + .mobile-drawer .lang-toggle { display: inline-flex; } + .mobile-drawer .lang-toggle button, + .mobile-drawer .theme-toggle { padding: 10px; font-size: 11.5px; } + + /* The user-menu's dropdown becomes redundant inside the drawer — + surface its links flat as a list, and hide the chip button. 
*/ + .mobile-drawer .user-menu { width: 100%; } + .mobile-drawer .user-chip { display: none; } + .mobile-drawer .user-menu__panel { + display: block !important; /* override the hidden attribute */ + position: static; + border: 0; + padding: 0; + margin-top: 4px; + } + .mobile-drawer .user-menu__panel[hidden] { display: block !important; } + .mobile-drawer .user-menu__item { + display: block; + padding: 10px 4px; + border-bottom: 1px solid var(--border); + font-size: 13px; + text-transform: uppercase; + letter-spacing: 0.06em; + } + .mobile-drawer .meta { + margin-top: auto; + padding-top: 18px; + text-align: center; + opacity: 0.7; + } + + /* The drawer container itself sits above the topbar in z-stacking; + we still want the close button accessible while it's open, so push + a close target into the top-right corner of the drawer via a + repurposed pseudo-element. (Simpler than adding new markup.) */ + .mobile-drawer::before { + content: "✕"; + position: absolute; + top: 14px; + right: 18px; + font-size: 18px; + color: var(--muted); + cursor: pointer; + pointer-events: none; /* tap handled by the backdrop / hamburger */ + } + + /* Body-level layout: tighten main padding too — saves another 16px + of horizontal real estate which the indicator table and chat + bubbles all benefit from. */ + .app-main { padding: 10px 8px; gap: 10px; } +} diff --git a/app/templates/base.html b/app/templates/base.html index fd15361..ccbcb08 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -207,57 +207,78 @@
            {{ BRAND_NAME }} {% if BETA_MODE %}BETA{% endif %} - -
            -
            - - -
            - - {% set cu = request.state.current_user if request.state and request.state.current_user is defined else None %} - {% if cu and cu.user %} -
            - - -
            - {% endif %} - {% if cu and (cu.user or cu.is_admin) %} -
            - + +
            + + {# Mobile hamburger — shown only at ≤480px via CSS. #} + + + {# Wrapper: display:contents on desktop (zero layout effect), fixed + slide-out panel on mobile. Holds nav + header-right widgets. #} +
            + +
            + - - {% endif %} - v0.1 · UTC
            + {# Drawer backdrop. Hidden by default; CSS shows it when + body.drawer-open is set. Click closes the drawer. #} + + + +
            {% block main %}{% endblock %}
            From b6da1983d33f23c64b0f5741c76a3f4ba3536d42 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 18:43:36 +0200 Subject: [PATCH 45/69] =?UTF-8?q?mobile:=20per-view=20=E2=89=A4480px=20rul?= =?UTF-8?q?es=20across=20the=20CSS=20bundle?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds the @media (max-width: 480px) blocks specified in the design: - dashboard.css: indicator table hides the 'mobile-hide'-tagged columns (Label, Ccy, 1y, anchor, as-of), keeping Symbol / Price / 1d / 1m. Cell padding + font shrink. Group-tab buttons get a bigger touch target. - panels.css: header padding tightens, scroll-body max-height drops to 60vh so log/news stay above the fold in the stacked layout. - portfolio.css: overall grid keeps 2 cols (already at 640px) with tighter gap; action buttons wrap; composer input goes full-width. - log-chat.css: chat bubbles edge-to-edge, input row stacks, font- size:14px on form fields to avoid iOS Safari zoom-on-focus. - news.css: row collapses to age | (title / source) — source moves under the title. Tag filter strip wraps. - settings.css: form rows stack (label above input). Import picker becomes single-column. Buttons full-width. - auth.css: card padding tightens to free up vertical space when the iOS keyboard is up. font-size:14px on inputs. - public.css: hero headline clamp() lower bound drops to 22px; CTAs stack full-width; pricing tier-grid stacks. indicators.html: tagged the secondary cells with .mobile-hide rather than relying on positional nth-child — the anchor column is conditional and would have shifted positions. 336 tests still pass. 
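A minimal sketch of why the class tag is the safer hook, assuming the nine-column order shown in the table header (Symbol, Label, Price, Ccy, 1d, 1m, 1y, anchor, as-of). The nth-child rule is hypothetical and shown only for contrast; the .mobile-hide rule mirrors the one this patch adds to dashboard.css.

```css
/* Fragile (hypothetical): positional selectors assume a fixed column order.
   With the anchor column rendered, as-of is the 9th cell and is hidden.
   Without it, as-of becomes the 8th cell, so this rule no longer matches
   and the column stays visible on a phone. */
@media (max-width: 480px) {
  .dense th:nth-child(9),
  .dense td:nth-child(9) { display: none; }
}

/* Robust: the template marks each secondary cell itself, so the rule
   follows the cell rather than its position in the row. */
@media (max-width: 480px) {
  .dense .mobile-hide { display: none; }
}
```

The trade-off is that the hiding decision is now split between template and stylesheet, which is why the class name is kept generic enough to reuse on other tables.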
--- app/static/css/auth.css | 17 ++++++++++ app/static/css/dashboard.css | 39 +++++++++++++++++++++++ app/static/css/log-chat.css | 43 ++++++++++++++++++++++++++ app/static/css/news.css | 37 ++++++++++++++++++++++ app/static/css/panels.css | 16 ++++++++++ app/static/css/portfolio.css | 30 ++++++++++++++++++ app/static/css/public.css | 26 ++++++++++++++++ app/static/css/settings.css | 35 +++++++++++++++++++++ app/templates/partials/indicators.html | 20 ++++++------ 9 files changed, 253 insertions(+), 10 deletions(-) diff --git a/app/static/css/auth.css b/app/static/css/auth.css index 70da6cd..31081bb 100644 --- a/app/static/css/auth.css +++ b/app/static/css/auth.css @@ -130,3 +130,20 @@ color: var(--accent) !important; border-color: var(--accent) !important; } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* The card is already width:360px;max-width:100% so it fills the + screen — just tighten internal padding to free up vertical space + for the keyboard on iOS Safari (which eats half the viewport). */ + .auth-card { padding: 20px 18px; } + .auth-card__brand { font-size: 14px; } + .auth-card__lede { font-size: 12px; } + .auth-card input, + .auth-card button[type="submit"] { + font-size: 14px; /* avoids iOS Safari zoom-on-focus */ + padding: 10px 12px; + } +} diff --git a/app/static/css/dashboard.css b/app/static/css/dashboard.css index 956157d..aae9b6a 100644 --- a/app/static/css/dashboard.css +++ b/app/static/css/dashboard.css @@ -226,3 +226,42 @@ vertical-align: middle; user-select: none; } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* Hide secondary indicator-table columns: Label, Ccy, 1y, anchor, + as-of. The cells are tagged with .mobile-hide in indicators.html; + this rule keeps display intent in CSS while letting the template + handle the conditional anchor column. 
Symbol / Price / 1d / 1m + remain — the four numbers a phone user actually wants. */ + .dense .mobile-hide { display: none; } + + /* Tighter cell padding so the four remaining columns fit + comfortably on a 360px viewport. */ + .dense th, .dense td { + padding: 4px 6px; + font-size: 11px; + } + /* Symbol column gets a touch more breathing room — it's the + identifying anchor. */ + .dense td.label { font-weight: 600; } + + /* Group-tabs strip already has overflow-x:auto; widen the tap + targets so swipe-scrolling on a touchscreen feels natural. */ + .group-tabs button { + padding: 8px 14px; + font-size: 11.5px; + white-space: nowrap; + } + + /* Aggregate-read summary header tightens — stack the label above + the timestamp to avoid wrapping at awkward points. */ + .ind-summary__head { + flex-direction: column; + align-items: flex-start; + gap: 2px; + } + .ind-summary__body { font-size: 12px; } +} diff --git a/app/static/css/log-chat.css b/app/static/css/log-chat.css index 6e847ec..895953b 100644 --- a/app/static/css/log-chat.css +++ b/app/static/css/log-chat.css @@ -280,3 +280,46 @@ } .chat-form button:hover:not(:disabled) { background: var(--accent); color: var(--bg); } .chat-form button:disabled { opacity: 0.4; cursor: not-allowed; } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* Trim horizontal padding so the markdown column uses the screen + width. The existing 1100px rule already capped the column at + 76ch; we just shave the surrounding gutter. */ + .log-content { padding: 0 4px; font-size: 13.5px; } + .log-content h2 { font-size: 16px; } + .log-content h3 { font-size: 14px; } + + /* Chat bubbles edge-to-edge so the conversation reads like a + mobile messenger thread. */ + .chat-msg { + max-width: 100%; + padding: 8px 10px; + font-size: 13px; + } + .chat-msg--user { margin-right: 0; } + .chat-msg--assistant { margin-left: 0; } + + /* Chat input row stacks: textarea full-width, button below. 
*/ + .chat-form { + flex-direction: column; + gap: 6px; + padding: 8px; + } + .chat-form textarea { + width: 100%; + min-height: 56px; + font-size: 14px; /* avoids iOS Safari zoom-on-focus */ + } + .chat-form button { + width: 100%; + padding: 10px; + font-size: 12px; + } + + .chat-header { padding: 8px 10px; } + .chat-title { font-size: 12px; } + .chat-hint { font-size: 10px; } +} diff --git a/app/static/css/news.css b/app/static/css/news.css index 8827339..3fb4bc6 100644 --- a/app/static/css/news.css +++ b/app/static/css/news.css @@ -84,3 +84,40 @@ } .news-tag--clear { color: var(--dim); border-style: dashed; } .news-tag--clear:hover { color: var(--negative); border-color: var(--negative); } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* The 720px rule already collapsed to age | source | title and + hid the right-side tag chips. At ≤480 we drop the source column + too and let the title flow under the age, with source as a small + line below the title — saves another ~100px of horizontal room. */ + .news-row { + grid-template-columns: 50px minmax(0, 1fr); + gap: 8px; + padding: 6px 10px; + } + .news-row .source { + grid-column: 2; + grid-row: 2; + font-size: 10.5px; + } + .news-row .title { + grid-column: 2; + grid-row: 1; + font-size: 12.5px; + line-height: 1.35; + } + + /* Tag filter strip wraps onto multiple rows on a phone. 
*/ + .news-tags { + flex-wrap: wrap; + gap: 6px; + padding: 6px 8px; + } + .news-tag { + padding: 4px 8px; + font-size: 11px; + } +} diff --git a/app/static/css/panels.css b/app/static/css/panels.css index 18293a5..4d2d9a3 100644 --- a/app/static/css/panels.css +++ b/app/static/css/panels.css @@ -90,3 +90,19 @@ table.dense tr:hover td { background: color-mix(in srgb, var(--accent) 5%, trans transition: opacity 0.2s; } .htmx-request .htmx-indicator { opacity: 1; } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + .panel-header { + padding: 8px 10px; + gap: 8px; + } + .panel-header .title { font-size: 12px; } + .panel-header .meta { font-size: 10px; } + .panel-body { padding: 4px 6px; } + /* Scroll panels lose some vertical room on small screens so the + stacked layout doesn't push log/news off the fold. */ + .panel-body--scroll { max-height: 60vh; } +} diff --git a/app/static/css/portfolio.css b/app/static/css/portfolio.css index 89ffb32..cdf0417 100644 --- a/app/static/css/portfolio.css +++ b/app/static/css/portfolio.css @@ -374,3 +374,33 @@ details[open] .pf-analysis__head-left::before { content: "▾ "; } color: var(--muted); background: color-mix(in srgb, var(--accent) 4%, transparent); } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* The existing 640px breakpoint already moves the overall grid to + 2 cols. At ≤480 we keep 2 cols but tighten gap so the stat + values don't crowd the labels next to them. */ + .pf-overall__grid { gap: 4px 12px; } + .pf-stat-value { font-size: 14px; } + + /* Action buttons wrap to multiple rows instead of squishing onto + one. flex-wrap was already set above; ensure each button has a + comfortable tap target. 
*/ + .pf-actions { flex-wrap: wrap; gap: 6px; } + .pf-actions button { + flex: 1 1 auto; + padding: 8px 12px; + font-size: 11.5px; + } + + /* Pill row stays wrapped; just give pills a small min-width so + two-character tags (USD, EUR) don't hug each other awkwardly. */ + .pf-pill { padding: 3px 7px; } + + /* The inline composer's input gets the full width — the desktop's + intrinsic-width sizing leaves it tiny on a phone. */ + .pf-add__line { flex-wrap: wrap; gap: 6px; } + .pf-add__line input, .pf-add__line textarea { width: 100%; } +} diff --git a/app/static/css/public.css b/app/static/css/public.css index 0e0e361..9b1f753 100644 --- a/app/static/css/public.css +++ b/app/static/css/public.css @@ -715,3 +715,29 @@ a.btn-secondary:hover { color: var(--accent); border-color: var(--accent); } color: var(--accent); outline: none; } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* Hero headline already uses clamp(); shrink its lower bound so a + two-line headline doesn't push the CTAs below the fold on a + 360px screen. */ + .hero__headline { font-size: clamp(22px, 6vw, 32px); } + .hero__subhead { font-size: 14px; } + + /* CTAs stack full-width on phones — easier tap targets. */ + .hero__ctas { flex-direction: column; align-items: stretch; } + .btn-primary, .btn-secondary { + width: 100%; + text-align: center; + padding: 12px 18px; + } + + /* Tier cards (pricing page) stack on phones. */ + .tier-grid { grid-template-columns: 1fr; gap: 16px; } + .tier-card { padding: 18px; } + + /* Tighten public-page outer padding. 
*/ + .public-shell { padding: 16px 12px; } +} diff --git a/app/static/css/settings.css b/app/static/css/settings.css index 1d6c0e5..819d69b 100644 --- a/app/static/css/settings.css +++ b/app/static/css/settings.css @@ -379,3 +379,38 @@ box-sizing: border-box; } .modal-input:focus { border-color: var(--accent); } + + +/* --- Mobile (≤480px) -------------------------------------------------- */ + +@media (max-width: 480px) { + /* Form rows stack: label above value instead of side-by-side. The + desktop layout uses a fixed 110px label column that pinches the + value column unbearably on a phone. */ + .settings-row { + flex-direction: column; + align-items: stretch; + gap: 4px; + padding: 10px 0; + } + .settings-row__label { + width: auto; + font-size: 11px; + text-transform: uppercase; + letter-spacing: 0.06em; + } + .settings-select { + width: 100%; + font-size: 14px; /* avoids iOS Safari zoom-on-focus */ + } + + /* The two-column import picker becomes single column. */ + .import-choice { + flex: 1 1 100%; + min-width: 0; + } + + /* Buttons get a full-width tap target. 
*/ + .settings-btn { width: 100%; padding: 10px; } + .settings-icon-btn { width: 100%; justify-content: center; } +} diff --git a/app/templates/partials/indicators.html b/app/templates/partials/indicators.html index 0ae1e1f..70aaab0 100644 --- a/app/templates/partials/indicators.html +++ b/app/templates/partials/indicators.html @@ -20,11 +20,11 @@ - - - - {% if has_anchor %}{% endif %} - + + + + {% if has_anchor %}{% endif %} + @@ -46,22 +46,22 @@ - + - + {% for k in ["1d","1m","1y"] %} {% set v = q.changes.get(k) if q.changes else None %} - {% endfor %} {% if has_anchor %} {% set va = q.changes.get('anchor') if q.changes else None %} - {% endif %} - + {% endif %} {% endfor %} From 5ceee961352b020e6754dbfb4c212e56d5335a68 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 18:55:04 +0200 Subject: [PATCH 46/69] mobile: fix drawer stacking + horizontal page overflow MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two related bugs reported on phone: 1. Drawer was unclickable — backdrop covered it. Root cause: the .app-header (position:sticky, z-index:50) creates a stacking context, so the drawer inside it had its z-index:100 clamped to "above other things inside the header" but NOT above siblings of the header. The backdrop at root-level z:90 then sat over the drawer subtree. Fix: when body.drawer-open, raise .app-header z-index to 110 so its entire descendant tree (drawer included) draws above the z:90 backdrop. The page body under the header stays dimmed. 2. Horizontal scrolling on the dashboard. Root cause: the bottom markets bar used `grid-template-columns: repeat(auto-fit, minmax(220px, 1fr))`, which at 4+ markets blows out to 880px+ and forces the page wider than the viewport. Fix: on ≤480px the markets bar becomes a horizontally scrolling flex strip with min-width:160px per chip — page stays narrow, user swipes the bar to see more markets. 
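The stacking problem in item 1 can be reduced to a few rules. This is a sketch, not the full stylesheet: the z-index values are the ones quoted above, while the backdrop's position and inset declarations are assumptions made for illustration.

```css
/* The header is positioned with a z-index, so it forms a stacking
   context. Everything inside it is ranked at level 50 from the point
   of view of the rest of the page. */
.app-header { position: sticky; top: 0; z-index: 50; }

/* Lives inside the header: 100 only orders it among the header's own
   descendants, it does not lift it above elements outside the header. */
.mobile-drawer { position: fixed; z-index: 100; }

/* Assumed to be a root-level sibling of the header. 90 > 50, so the
   backdrop paints over the header and the drawer inside it. */
.drawer-backdrop { position: fixed; inset: 0; z-index: 90; }

/* Fix: while the drawer is open, rank the whole header (and therefore
   the drawer) above the backdrop. */
body.drawer-open .app-header { z-index: 110; }
```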
Also added overflow-x:hidden to html/body as a defensive net against the fixed off-screen drawer creating overflow on Safari iOS. --- app/static/css/layout.css | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 2a66ccc..2d6fe80 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -10,6 +10,10 @@ html, body { font-size: 13px; line-height: 1.5; font-variant-numeric: tabular-nums; + /* Prevents the off-screen fixed mobile drawer (translateX(100%)) + from forcing horizontal scroll on Safari iOS, and provides a + safety net for any cell/grid that would otherwise overflow. */ + overflow-x: hidden; } a { color: var(--accent); text-decoration: none; } @@ -235,6 +239,13 @@ body.drawer-open .drawer-backdrop { opacity: 1; } gap: 8px; letter-spacing: 0.04em; } + /* When the drawer is open the header (which contains the drawer) + needs to draw above the backdrop. The header is a sticky element + with its own stacking context at z-index 50, so the drawer's + local z-index 100 is clamped to z-50 in the root context — the + backdrop at z-90 then sits OVER it. Raise the whole header above + the backdrop while the drawer is open. */ + body.drawer-open .app-header { z-index: 110; } .app-header .brand { font-size: 12px; /* Shrink the leading glyph but don't remove it — keeps brand identity. */ @@ -358,4 +369,20 @@ body.drawer-open .drawer-backdrop { opacity: 1; } of horizontal real estate which the indicator table and chat bubbles all benefit from. */ .app-main { padding: 10px 8px; gap: 10px; } + + /* Markets bar: the desktop grid uses minmax(220px, 1fr) per market, + which at 4+ markets blows out to 880px+ and forces page-wide + horizontal scroll. On phones, let the bar itself scroll + horizontally so the page can stay narrow. Each chip gets a + reasonable min-width so the values inside don't wrap. 
*/ + .markets-bar__inner { + grid-template-columns: none; + display: flex; + overflow-x: auto; + -webkit-overflow-scrolling: touch; + } + .markets-bar .mkt { + flex: 0 0 auto; + min-width: 160px; + } } From 8ec4ea1c723687ad72e51d09eee12883a55fb9e1 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 19:02:30 +0200 Subject: [PATCH 47/69] mobile: clamp grid items + table cells to viewport width MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User reported the page rendering at ~3x viewport width on Android Chrome with overflow-x:hidden clipping off most of the content. Root cause: CSS grid items default to min-width:min-content, and the indicator table inside the indicators panel has white-space:nowrap cells. A long Symbol/Label value forces the table wider than its panel; the panel propagates that minimum width up the grid; the grid expands the .app-main; .app-main pushes the page wider than the viewport. overflow-x:hidden then just chops the right portion off. Fix has three parts: 1. .app and .app-main get min-width:0 and max-width:100vw so the shell can't be wider than the viewport regardless of descendants. 2. Every direct child of .app-main (each panel) gets min-width:0 on mobile so individual panels can shrink past their min-content. 3. table.dense drops white-space:nowrap on text cells at ≤480px — long symbols wrap to two lines instead of forcing the table wide. Numeric cells keep nowrap (negative percentages reading as "−12\n.34%" would be unreadable). Also adds an overflow-x:auto fallback on .panel-body pre/code so any code block in AI output scrolls within the panel instead of blowing the page out. 
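The width blow-out described above can be reproduced with a very small sketch. The class names .grid and .cell are hypothetical stand-ins for .app-main and its panels; only the min-width and white-space behaviour is taken from this patch.

```css
/* Hypothetical minimal layout: one grid column holding one panel. */
.grid { display: grid; grid-template-columns: 1fr; }

/* A table with nowrap cells has a min-content width equal to its
   longest row. A grid item's automatic minimum size will not let the
   item shrink below that, so the 1fr track grows past the viewport. */
.cell table td { white-space: nowrap; }

/* Fix, part 1: allow the grid item to shrink below its min-content
   width so the track can stay inside the container. */
.grid > .cell { min-width: 0; }

/* Fix, part 2: let text cells wrap on phones so the table itself gets
   narrower instead of overflowing the shrunken cell. */
@media (max-width: 480px) {
  .cell table td { white-space: normal; }
}
```

Part 1 alone stops the page from widening but leaves the table overflowing its panel; part 2 is what actually makes the content fit.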
--- app/static/css/layout.css | 20 ++++++++++++++++++-- app/static/css/panels.css | 23 +++++++++++++++++++++++ 2 files changed, 41 insertions(+), 2 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 2d6fe80..1d2db25 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -26,6 +26,13 @@ a:hover { text-decoration: underline; } grid-template-columns: 1fr; grid-template-rows: auto 1fr auto; min-height: 100vh; + /* Grid items default to min-content min-width which can blow past + the viewport when a descendant table or flex row is wide. min-width:0 + lets the cell shrink below intrinsic min-content, and max-width:100vw + caps the whole shell against the viewport so we never need to rely on + overflow:hidden clipping. */ + min-width: 0; + max-width: 100vw; } .app-header { @@ -367,8 +374,17 @@ body.drawer-open .drawer-backdrop { opacity: 1; } /* Body-level layout: tighten main padding too — saves another 16px of horizontal real estate which the indicator table and chat - bubbles all benefit from. */ - .app-main { padding: 10px 8px; gap: 10px; } + bubbles all benefit from. Also force min-width:0 on the grid + container and every grid item, otherwise a wide table inside + a panel forces the whole grid (and the page) wider than the + viewport. This is the single most important mobile fix. 
*/ + .app-main { + padding: 10px 8px; + gap: 10px; + min-width: 0; + max-width: 100vw; + } + .app-main > * { min-width: 0; } /* Markets bar: the desktop grid uses minmax(220px, 1fr) per market, which at 4+ markets blows out to 880px+ and forces page-wide diff --git a/app/static/css/panels.css b/app/static/css/panels.css index 4d2d9a3..7a704ae 100644 --- a/app/static/css/panels.css +++ b/app/static/css/panels.css @@ -95,6 +95,14 @@ table.dense tr:hover td { background: color-mix(in srgb, var(--accent) 5%, trans /* --- Mobile (≤480px) -------------------------------------------------- */ @media (max-width: 480px) { + /* Force panels and their bodies to honour the parent grid cell + width, even when descendants (tables, code blocks, long URLs) + have intrinsic widths that exceed the viewport. min-width:0 is + the magic that lets flex/grid items shrink past min-content; + max-width:100% caps the box itself. */ + .panel { min-width: 0; max-width: 100%; } + .panel-body { min-width: 0; max-width: 100%; } + .panel-header { padding: 8px 10px; gap: 8px; @@ -105,4 +113,19 @@ table.dense tr:hover td { background: color-mix(in srgb, var(--accent) 5%, trans /* Scroll panels lose some vertical room on small screens so the stacked layout doesn't push log/news off the fold. */ .panel-body--scroll { max-height: 60vh; } + + /* Tables: dropping white-space:nowrap lets long Symbol / Label cells + wrap to a second line instead of forcing the table wider than the + panel. Numeric cells stay nowrap since "−12.34%" wrapping would be + unreadable. */ + table.dense { table-layout: auto; } + table.dense th, table.dense td { white-space: normal; } + table.dense .num { white-space: nowrap; } + + /* Final safety net: if a descendant still insists on being wider + than the panel (e.g. a wide pre/code block in the AI output), + scroll it horizontally inside the panel rather than blowing the + whole page out. 
*/ + .panel-body pre, + .panel-body code { max-width: 100%; overflow-x: auto; } } From 6459e8c43d5827ac0bee9397b77242e65cc184fe Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 19:10:58 +0200 Subject: [PATCH 48/69] mobile: wrap tabs, trim portfolio + markets bar columns MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three pieces of phone-side feedback: 1. Indicator group tabs wrap onto multiple rows instead of horizontal-scrolling — every group is visible at a glance. Each button keeps its own bottom border so wrapped rows stay visually delimited; the container's bottom border is removed. 2. Portfolio holdings table hides Qty and Avg columns on mobile via the mobile-hide class (same mechanism as the indicator table). Remaining columns are the actionable ones: Ticker, Name, Last, P/L, %. 3. Markets bar at the bottom compacts to one row per chip — dot + code + change% only. The state word ("open" / "closed") is implied by the dot colour; the index label, price, and until-time are dropped on mobile. Grid columns drop their 220px floor so the full set fits the viewport without horizontal scroll (previously the bar scrolled within itself). --- app/static/css/dashboard.css | 16 ++++++--- app/static/css/layout.css | 48 +++++++++++++++++++++------ app/templates/partials/portfolio.html | 8 ++--- 3 files changed, 53 insertions(+), 19 deletions(-) diff --git a/app/static/css/dashboard.css b/app/static/css/dashboard.css index aae9b6a..9cf1e98 100644 --- a/app/static/css/dashboard.css +++ b/app/static/css/dashboard.css @@ -248,11 +248,19 @@ identifying anchor. */ .dense td.label { font-weight: 600; } - /* Group-tabs strip already has overflow-x:auto; widen the tap - targets so swipe-scrolling on a touchscreen feels natural. */ + /* Group tabs: wrap onto multiple rows instead of horizontal + scrolling so the user can see every group at a glance. 
The + border-bottom moves to each row so wrapped rows are still + visually delimited. */ + .group-tabs { + flex-wrap: wrap; + overflow-x: visible; + border-bottom: 0; + } .group-tabs button { - padding: 8px 14px; - font-size: 11.5px; + padding: 6px 10px; + font-size: 11px; + border-bottom: 1px solid var(--border); white-space: nowrap; } diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 1d2db25..748b8db 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -386,19 +386,45 @@ body.drawer-open .drawer-backdrop { opacity: 1; } } .app-main > * { min-width: 0; } - /* Markets bar: the desktop grid uses minmax(220px, 1fr) per market, - which at 4+ markets blows out to 880px+ and forces page-wide - horizontal scroll. On phones, let the bar itself scroll - horizontally so the page can stay narrow. Each chip gets a - reasonable min-width so the values inside don't wrap. */ + /* Markets bar: compact each chip so the full set fits the viewport + without horizontal scrolling. We drop: + - state word ("open" / "closed") — the dot already conveys that + - index label (e.g. "SPX") — implied by the market code + - index price — keep the change% which is the actionable number + - until-time — too detailed for a glance + Remaining: dot + market code + change%. The grid keeps auto-fit + but the minimum drops from 220px to 0 so it always fits. */ .markets-bar__inner { - grid-template-columns: none; - display: flex; - overflow-x: auto; - -webkit-overflow-scrolling: touch; + grid-template-columns: repeat(auto-fit, minmax(0, 1fr)); + gap: 0; } .markets-bar .mkt { - flex: 0 0 auto; - min-width: 160px; + grid-template-columns: auto 1fr auto; + grid-template-rows: auto; + padding: 5px 6px; + gap: 4px; + font-size: 10px; } + /* Re-flow the chip's grid so it's a single row of three: dot, + code, change. The 2-row layout (which had state/when on row 2) + is dropped along with the elements that lived there. 
*/ + .markets-bar .mkt .mkt__dot { + grid-row: 1; grid-column: 1; + width: 6px; height: 6px; + } + .markets-bar .mkt .mkt__name { + grid-row: 1; grid-column: 2; + font-size: 10px; + letter-spacing: 0.04em; + } + .markets-bar .mkt .mkt__index { + grid-row: 1; grid-column: 3; + font-size: 10px; + } + /* Strip the now-redundant content. The elements still render but + occupy no space so the chip stays narrow. */ + .markets-bar .mkt__state, + .markets-bar .mkt__when, + .markets-bar .mkt__index-label, + .markets-bar .mkt__index-price { display: none; } } diff --git a/app/templates/partials/portfolio.html b/app/templates/partials/portfolio.html index 18061f9..99d4dc0 100644 --- a/app/templates/partials/portfolio.html +++ b/app/templates/partials/portfolio.html @@ -51,8 +51,8 @@ - - + + @@ -63,8 +63,8 @@ - - + + ' + '' + '' + - '' + - '' + + '' + + '' + '' + '' + '' + @@ -365,7 +365,7 @@ '
            SymbolLabelPriceCcy1d1m1yanchoras-ofSymbolLabelPriceCcy1d1m1yanchoras-of
            {{ short_sym }} {{ q.label or "" }}{{ q.label or "" }} {{ q.price | price }}{{ q.currency or "" }}{{ q.currency or "" }} + {% if v is none %}—{% else %}{{ "%+.2f"|format(v) }}%{% endif %} + {% if va is none %}—{% else %}{{ "%+.2f"|format(va) }}%{% endif %} {{ q.as_of or "" }}{{ q.as_of or "" }}
            Ticker NameQtyAvgQtyAvg Last P/L %
            {{ pos.ticker }} {{ pos.name or "" }}{{ pos.quantity | price }}{{ pos.average_price | price }}{{ pos.quantity | price }}{{ pos.average_price | price }} {{ pos.current_price | price }} {{ pos.ppl | signed }} From 1a20f0a15baa37ac0e611fcecb3ed7bfb818423c Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 19:13:52 +0200 Subject: [PATCH 49/69] mobile: tag Qty/Avg cells in JS-rendered portfolio table MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The portfolio table is rendered client-side in portfolio.js (not by the partials/portfolio.html Jinja template, which is unused for this view). The previous commit's mobile-hide class made it into the template but never reached the actual DOM. Adding the class to the JS-emitted and strings so .dense .mobile-hide { display: none } actually picks them up at ≤480px. --- app/static/js/portfolio.js | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/app/static/js/portfolio.js b/app/static/js/portfolio.js index ac2c175..b7f820f 100644 --- a/app/static/js/portfolio.js +++ b/app/static/js/portfolio.js @@ -303,8 +303,8 @@ return '
            ' + esc(p.yahoo_ticker) + '' + esc(p.name || '') + '' + fmt(p.qty, { maximumFractionDigits: 6 }) + '' + fmt(p.avg_cost) + '' + fmt(p.qty, { maximumFractionDigits: 6 }) + '' + fmt(p.avg_cost) + '' + lastDisplay + fxBadge + '' + signed(p._ppl) + '' + pct(p._ppl_pct) + '
            ' + '' + '' + - '' + + '' + '' + '' + '' + From daa3f79a525f135160a4f991c96026bd9b92c8ae Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Thu, 28 May 2026 19:20:49 +0200 Subject: [PATCH 50/69] mobile: cache-bust static assets so browser picks up CSS/JS edits User reported phone still showing old behaviour (Qty/Avg portfolio columns visible) even though the server-side JS had been updated. Root cause: every / - - - - - - - - - - + + + + + + + + + + - - + + +
            diff --git a/app/templates/landing.html b/app/templates/landing.html index 8d3d170..0726226 100644 --- a/app/templates/landing.html +++ b/app/templates/landing.html @@ -27,10 +27,10 @@
            @@ -48,10 +48,10 @@ off-hours stay quiet.

            @@ -66,10 +66,10 @@ in earnings, policy, valuation — not chart patterns.

            @@ -87,10 +87,10 @@ not a forecast and not advice on any investment decision.

            @@ -100,10 +100,10 @@

            More views

            -{% if paid %}{% endif %} +{% if paid %}{% endif %} {% endblock %} diff --git a/app/templates/login.html b/app/templates/login.html index 448c883..2cfc899 100644 --- a/app/templates/login.html +++ b/app/templates/login.html @@ -10,9 +10,9 @@ catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - - - + + +
            diff --git a/app/templates/public_base.html b/app/templates/public_base.html index 427fc36..b1cef24 100644 --- a/app/templates/public_base.html +++ b/app/templates/public_base.html @@ -14,12 +14,12 @@ } catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - - - - - - + + + + + +
            diff --git a/app/templates/settings.html b/app/templates/settings.html index 95edef8..7ce054c 100644 --- a/app/templates/settings.html +++ b/app/templates/settings.html @@ -357,7 +357,7 @@ {% if user %} {# Import widget wiring — auto-parse on drop, preview, then commit. #} - - + + {% endif %} {% endblock %} diff --git a/app/templates/verify.html b/app/templates/verify.html index 43637d4..d28c989 100644 --- a/app/templates/verify.html +++ b/app/templates/verify.html @@ -10,9 +10,9 @@ catch (e) { document.documentElement.dataset.theme = 'light'; } })(); - - - + + +
            diff --git a/app/templates_env.py b/app/templates_env.py index 22cfdf6..7240d39 100644 --- a/app/templates_env.py +++ b/app/templates_env.py @@ -3,6 +3,7 @@ Imported by both routers/pages.py and routers/api.py so the filters are registered exactly once.""" from __future__ import annotations +import time from pathlib import Path from fastapi.templating import Jinja2Templates @@ -13,6 +14,13 @@ from app.config import get_settings from app.services.glossary import wrap_glossary +# Cache-busting token for static assets. Computed once at import time +# (i.e. process startup), so every container restart yields a fresh +# value and browsers refetch CSS/JS instead of serving stale cache. +# Templates append `?v={{ ASSET_VERSION }}` to every static URL. +ASSET_VERSION = str(int(time.time())) + + TEMPLATE_DIR = Path(__file__).resolve().parent / "templates" @@ -77,3 +85,4 @@ templates.env.globals["LEGAL_OPERATOR"] = branding.LEGAL_OPERATOR templates.env.globals["OPERATOR_EMAIL"] = branding.OPERATOR_EMAIL templates.env.globals["OPERATOR_JURISDICTION"] = branding.OPERATOR_JURISDICTION templates.env.globals["BETA_MODE"] = get_settings().BETA_MODE +templates.env.globals["ASSET_VERSION"] = ASSET_VERSION From 31a8efc27d2b70fe7ecdee188a1b641cf1d40705 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 11:00:11 +0200 Subject: [PATCH 51/69] ui: regroup topbar + unify the three header toggles MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Header layout was visibly broken on desktop after the mobile-drawer change: flex space-between distributed brand, BETA, tone-toggle, nav and header-right across the bar, so BETA drifted away from the brand wordmark and the tone-toggle landed in the middle of the row. Markup: brand + BETA are now wrapped in .header-left so they ride together. The tone-toggle moves back inside .header-right next to theme + lang where it logically belongs. 
CSS: the header switches to grid (1fr auto 1fr) on desktop, which truly centres the nav regardless of side-group widths. The mobile @media block reverts to flex so the hamburger + slide-out drawer still work. Toggle redesign (tone, theme, language): - The single-button theme widget becomes a Light | Dark segmented control matching the other two so all three read as one cluster. cassandraToggleTheme is replaced by cassandraSetTheme(theme), the toggle's data-theme attribute is synced on page load. - All three share one CSS rule set: same padding, font, border, and a min-width so the active-only width matches the expanded width (no layout jump on hover). - On hover-capable devices each toggle collapses to just the active option; hovering (or keyboard focus-within) reveals both. Touch devices keep both visible — the @media (hover: hover) gate handles that and the mobile-drawer block overrides it explicitly so the drawer-stacked controls remain full-width with both options shown. Co-Authored-By: Claude Opus 4.7 --- app/static/css/layout.css | 154 ++++++++++++++++++++++---------------- app/templates/base.html | 50 +++++++------ 2 files changed, 119 insertions(+), 85 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 748b8db..5fae678 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -36,9 +36,15 @@ a:hover { text-decoration: underline; } } .app-header { - display: flex; + /* Three-column grid: brand+BETA pinned left, nav truly centered in + the middle column regardless of side widths, header-right pinned + right. The mobile-drawer wrapper is display:contents on desktop so + its children (nav, .header-right) become direct grid items and + land in columns 2 and 3 by source order. 
*/ + display: grid; + grid-template-columns: 1fr auto 1fr; align-items: center; - justify-content: space-between; + gap: 14px; border-bottom: 1px solid var(--border); padding: 10px 18px; background: var(--surface); @@ -48,6 +54,13 @@ a:hover { text-decoration: underline; } top: 0; z-index: 50; } +.app-header .header-left { + display: inline-flex; + align-items: center; + gap: 10px; + justify-self: start; +} +.app-header nav { justify-self: center; } .app-header .brand { color: var(--accent); font-weight: 700; @@ -59,6 +72,7 @@ a:hover { text-decoration: underline; } margin-left: 18px; color: var(--muted); } +.app-header nav a:first-child { margin-left: 0; } .app-header nav a.active { color: var(--text); } .app-header .meta { color: var(--muted); font-size: 11px; } @@ -67,7 +81,12 @@ a:hover { text-decoration: underline; } * the @media block at the bottom converts it to a fixed slide-out. */ .mobile-drawer { display: contents; } -.app-header .header-right { display: flex; align-items: center; gap: 14px; } +.app-header .header-right { + display: flex; + align-items: center; + gap: 14px; + justify-self: end; +} /* Hamburger button — only visible at ≤480px (rule in the mobile block). 
* Three thin bars; uses the same border/muted treatment as the other @@ -102,51 +121,14 @@ a:hover { text-decoration: underline; } transition: opacity 120ms ease-out; } body.drawer-open .drawer-backdrop { opacity: 1; } -.theme-toggle { - background: transparent; - border: 1px solid var(--border); - color: var(--muted); - padding: 3px 8px; - font-family: var(--font-mono); - font-size: 10px; - letter-spacing: 0.08em; - cursor: pointer; - text-transform: lowercase; -} -.theme-toggle:hover { color: var(--accent); border-color: var(--accent); } -.theme-toggle__label::before { content: "◐ light"; } -[data-theme="dark"] .theme-toggle__label::before { content: "◐ dark"; } - -/* Tone toggle (segmented control: Novice | Intermediate) */ -.tone-toggle { - display: inline-flex; - border: 1px solid var(--border); - font-family: var(--font-mono); - font-size: 10.5px; - letter-spacing: 0.06em; - text-transform: uppercase; -} -.tone-toggle button { - background: transparent; - color: var(--muted); - border: 0; - padding: 4px 10px; - cursor: pointer; - font: inherit; - letter-spacing: inherit; - text-transform: inherit; -} -.tone-toggle button + button { border-left: 1px solid var(--border); } -.tone-toggle button:hover { color: var(--accent); } -.tone-toggle[data-tone="NOVICE"] button[data-value="NOVICE"], -.tone-toggle[data-tone="INTERMEDIATE"] button[data-value="INTERMEDIATE"] { - background: var(--accent); - color: var(--bg); -} - -/* Language toggle in the topbar — same visual rhythm as the tone - * toggle so the two controls read as a pair. Only EN and IT are - * visible here; the WIP languages (ES/FR/DE) live in /settings. */ +/* Segmented toggles — tone (Novice | Intermediate), theme (Light | Dark) + * and language (EN | IT) share one visual rhythm so the three controls + * read as a single cluster in the header. By default only the currently + * active option is rendered; hover or keyboard focus reveals both so the + * user can pick the other. 
Touch devices (which can't hover) show both + * options at all times; the @media (hover: hover) gate handles that. */ +.tone-toggle, +.theme-toggle, .lang-toggle { display: inline-flex; border: 1px solid var(--border); @@ -155,24 +137,65 @@ body.drawer-open .drawer-backdrop { opacity: 1; } letter-spacing: 0.06em; text-transform: uppercase; } +.tone-toggle button, +.theme-toggle button, .lang-toggle button { background: transparent; color: var(--muted); border: 0; - padding: 4px 8px; + padding: 4px 10px; cursor: pointer; font: inherit; letter-spacing: inherit; text-transform: inherit; + /* Fixed min-width so the active-only width matches the expanded width + of a single button — prevents the layout jumping as the user + mouses over and the second option appears. */ + min-width: 5.5em; + text-align: center; } +.tone-toggle button + button, +.theme-toggle button + button, .lang-toggle button + button { border-left: 1px solid var(--border); } +.tone-toggle button:hover, +.theme-toggle button:hover, .lang-toggle button:hover { color: var(--accent); } + +/* Active-option highlighting (data-* attribute on the container is + * authored by JS on load and on every change). */ +.tone-toggle[data-tone="NOVICE"] button[data-value="NOVICE"], +.tone-toggle[data-tone="INTERMEDIATE"] button[data-value="INTERMEDIATE"], +.theme-toggle[data-theme="light"] button[data-value="light"], +.theme-toggle[data-theme="dark"] button[data-value="dark"], .lang-toggle[data-lang="en"] button[data-value="en"], .lang-toggle[data-lang="it"] button[data-value="it"] { background: var(--accent); color: var(--bg); } +/* Collapse-when-idle behaviour: on hover-capable devices, hide the + * non-active option until the user hovers (or keyboard-focuses) the + * toggle. The mobile drawer overrides this further down. 
*/ +@media (hover: hover) { + .tone-toggle button, + .theme-toggle button, + .lang-toggle button { display: none; } + .tone-toggle[data-tone="NOVICE"] button[data-value="NOVICE"], + .tone-toggle[data-tone="INTERMEDIATE"] button[data-value="INTERMEDIATE"], + .theme-toggle[data-theme="light"] button[data-value="light"], + .theme-toggle[data-theme="dark"] button[data-value="dark"], + .lang-toggle[data-lang="en"] button[data-value="en"], + .lang-toggle[data-lang="it"] button[data-value="it"] { + display: inline-block; + } + .tone-toggle:hover button, + .tone-toggle:focus-within button, + .theme-toggle:hover button, + .theme-toggle:focus-within button, + .lang-toggle:hover button, + .lang-toggle:focus-within button { display: inline-block; } +} + .app-main { padding: 14px; display: grid; @@ -238,10 +261,11 @@ body.drawer-open .drawer-backdrop { opacity: 1; } /* --- Mobile (≤480px) -------------------------------------------------- */ @media (max-width: 480px) { - /* Tighten the topbar so the brand, tone toggle and hamburger all fit - on a 360px phone. Drop the letter-spacing because at 11px tracking - eats horizontal space the brand cannot spare. */ + /* Revert to flex on mobile so the drawer-toggle can pin to the right + via margin-left:auto and the off-screen drawer doesn't try to claim + a grid column. */ .app-header { + display: flex; padding: 8px 12px; gap: 8px; letter-spacing: 0.04em; @@ -263,15 +287,6 @@ body.drawer-open .drawer-backdrop { opacity: 1; } the drawer (the .mobile-drawer block below). */ .drawer-toggle { display: flex; margin-left: auto; } - /* Keep the tone toggle visible but trim it: just N / I letters so it - fits next to brand + hamburger. */ - .tone-toggle--header { - font-size: 9.5px; - } - .tone-toggle--header button { - padding: 4px 7px; - } - /* The drawer wrapper: full-height slide-out from the right. The content inside (nav + header-right) becomes a vertical stack with comfortable touch targets. 
*/ @@ -321,14 +336,25 @@ body.drawer-open .drawer-backdrop { opacity: 1; } gap: 14px; margin-top: 20px; } + .mobile-drawer .tone-toggle, .mobile-drawer .theme-toggle, .mobile-drawer .lang-toggle { + display: inline-flex; width: 100%; justify-content: center; } - .mobile-drawer .lang-toggle { display: inline-flex; } - .mobile-drawer .lang-toggle button, - .mobile-drawer .theme-toggle { padding: 10px; font-size: 11.5px; } + /* Inside the drawer all options stay visible — undoes the + hover-collapse from the @media (hover: hover) block above. Also + splits the row evenly and bumps the button padding for thumb taps. */ + .mobile-drawer .tone-toggle button, + .mobile-drawer .theme-toggle button, + .mobile-drawer .lang-toggle button { + display: inline-block; + flex: 1; + padding: 10px; + font-size: 11.5px; + min-width: 0; + } /* The user-menu's dropdown becomes redundant inside the drawer — surface its links flat as a list, and hide the chip button. */ diff --git a/app/templates/base.html b/app/templates/base.html index 2368970..f4218a5 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -126,6 +126,10 @@ // Reflect the saved value in the toggle on load. var pill = document.getElementById('tone-toggle'); if (pill) pill.dataset.tone = currentTone(); + // Same for the theme toggle — pull the current theme that the + // top-of-page inline script already wrote to . + var themePill = document.getElementById('theme-toggle'); + if (themePill) themePill.dataset.theme = document.documentElement.dataset.theme || 'light'; }); window.cassandraSetTone = function (newTone) { @@ -143,11 +147,11 @@ }); }; - window.cassandraToggleTheme = function () { - var d = document.documentElement; - var t = d.dataset.theme === 'light' ? 
'dark' : 'light'; - d.dataset.theme = t; - try { localStorage.setItem('cassandra.theme', t); } catch (e) {} + window.cassandraSetTheme = function (newTheme) { + document.documentElement.dataset.theme = newTheme; + var pill = document.getElementById('theme-toggle'); + if (pill) pill.dataset.theme = newTheme; + try { localStorage.setItem('cassandra.theme', newTheme); } catch (e) {} }; window.cassandraSetLang = async function (newLang) { @@ -205,18 +209,12 @@
            - {{ BRAND_NAME }} - {% if BETA_MODE %}BETA{% endif %} - - {# Tone toggle is the one widget we keep visible in the mobile - header even when the drawer is closed — it directly affects the - readability of the content right next to it. #} -
            - - + {# Left group keeps brand + BETA chip pinned together as a single + layout cell so the chip can't drift away from the wordmark when + the header grows or shrinks. #} +
            + {{ BRAND_NAME }} + {% if BETA_MODE %}BETA{% endif %}
            {# Mobile hamburger — shown only at ≤480px via CSS. #} @@ -236,10 +234,20 @@ Log
            - +
            + + +
            +
            + + +
            {% set cu = request.state.current_user if request.state and request.state.current_user is defined else None %} {% if cu and cu.user %}
            Date: Fri, 29 May 2026 11:11:46 +0200 Subject: [PATCH 52/69] ui: header toggles expand downward, not sideways MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Hovering a toggle (tone, theme, language) previously revealed the non-active option inline next to the active one, which widened the toggle and pushed its neighbours sideways. Now the non-active option appears as a popup ABSOLUTELY POSITIONED below the active one — the toggle's in-flow footprint stays exactly one button wide and tall, so the other two toggles next to it never move when the user mouses over one of them. Mechanism: inside @media (hover: hover) the container becomes position:relative and every button defaults to display:none. The :hover/:focus-within rule renders all options as position:absolute under the container. Specificity (.X[data=Y] btn[data=Y]) on the active-button rule then pins the active option back into the static flow at the top, so only the non-active end(s) up absolute — popup grows downward only. margin-top:-1px makes the popup's top border overlap the container's bottom border for a single shared edge. z-index:60 sits above the markets bar (z-50). Touch devices keep both options side-by-side (the @media gate); the mobile drawer keeps both visible too. Co-Authored-By: Claude Opus 4.7 --- app/static/css/layout.css | 53 +++++++++++++++++++++++++++++++-------- 1 file changed, 43 insertions(+), 10 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 5fae678..83f03f5 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -173,27 +173,60 @@ body.drawer-open .drawer-backdrop { opacity: 1; } color: var(--bg); } -/* Collapse-when-idle behaviour: on hover-capable devices, hide the - * non-active option until the user hovers (or keyboard-focuses) the - * toggle. The mobile drawer overrides this further down. 
*/ +/* Collapse-when-idle behaviour: on hover-capable devices each toggle + * shows only its active option. Hover or keyboard focus reveals the + * other option STACKED ABSOLUTELY BELOW so the toggle's in-flow size + * never changes — neighbouring controls don't shift when the user + * mouses over one of them. */ @media (hover: hover) { + .tone-toggle, + .theme-toggle, + .lang-toggle { + position: relative; + } + + /* Hide every option by default. The active option's higher-specificity + rule below puts it back into the static flow. */ .tone-toggle button, .theme-toggle button, .lang-toggle button { display: none; } + + /* Hover / focus: render every option as an absolutely-positioned + button immediately under the container. The active-button rule + immediately below wins on specificity and pins it back into the + static flow at the top — only the non-active option(s) actually + end up absolutely-positioned, so the popup grows downward only. */ + .tone-toggle:hover button, + .tone-toggle:focus-within button, + .theme-toggle:hover button, + .theme-toggle:focus-within button, + .lang-toggle:hover button, + .lang-toggle:focus-within button { + display: block; + position: absolute; + top: 100%; + left: 0; + right: 0; + margin-top: -1px; /* share the container's bottom border */ + background: var(--surface); + border: 1px solid var(--border); + z-index: 60; /* above the markets bar (z-50) */ + } + + /* Active option stays in static flow at the top of the container + even while hovered. Two-attribute specificity (.X[data=Y] btn[data=Y]) + beats the .X:hover button rule above. 
*/ .tone-toggle[data-tone="NOVICE"] button[data-value="NOVICE"], .tone-toggle[data-tone="INTERMEDIATE"] button[data-value="INTERMEDIATE"], .theme-toggle[data-theme="light"] button[data-value="light"], .theme-toggle[data-theme="dark"] button[data-value="dark"], .lang-toggle[data-lang="en"] button[data-value="en"], .lang-toggle[data-lang="it"] button[data-value="it"] { - display: inline-block; + display: block; + position: static; + margin-top: 0; + border: 0; } - .tone-toggle:hover button, - .tone-toggle:focus-within button, - .theme-toggle:hover button, - .theme-toggle:focus-within button, - .lang-toggle:hover button, - .lang-toggle:focus-within button { display: inline-block; } } .app-main { From 71155a67be1dceb0b3a0bd5605f0c5e7617f687c Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 11:17:43 +0200 Subject: [PATCH 53/69] =?UTF-8?q?ui:=20rename=20tone=20"Novice"=20?= =?UTF-8?q?=E2=86=92=20"Pro";=20fit=20tone-toggle=20to=20longest=20option?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User-visible relabel only. Backend tone value stays NOVICE — no API contract change, no migration on stored user.digest_tone, the glossary/plain-prose depth of analysis is unchanged. The marketing intent is that "Pro" reads better than "Novice" on the dashboard header; landing/pricing/privacy copy still uses the word "Novice" in flowing prose, so leaving those alone keeps the existing explanations coherent until they get a copy pass. Toggle width: the popup expansion (positioned left:0/right:0) is sized by the container, which previously sized to the active button. When "Pro" was active the popup was too narrow to fit "Intermediate". Bumped .tone-toggle button min-width to 10em so both buttons reserve enough room for the longest label regardless of which one is active. 
Co-Authored-By: Claude Opus 4.7 --- app/static/css/layout.css | 5 +++++ app/templates/base.html | 6 +++++- app/templates/settings.html | 2 +- 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 83f03f5..fc94c65 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -157,6 +157,11 @@ body.drawer-open .drawer-backdrop { opacity: 1; } .tone-toggle button + button, .theme-toggle button + button, .lang-toggle button + button { border-left: 1px solid var(--border); } +/* The tone-toggle's longer option ("Intermediate", 12 chars) needs more + room than the shared 5.5em min-width. We size both buttons to fit the + longest one so the popup width (set by container width via left/right:0) + doesn't get clipped when only the short "Pro" label is active. */ +.tone-toggle button { min-width: 10em; } .tone-toggle button:hover, .theme-toggle button:hover, .lang-toggle button:hover { color: var(--accent); } diff --git a/app/templates/base.html b/app/templates/base.html index f4218a5..7344b76 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -234,10 +234,14 @@ Log
            + {# The "Pro" label maps to the NOVICE tone server-side — kept that + way to avoid touching every stored user preference and API + contract. The mode itself (glossary tooltips + plainer + framing) is unchanged; only the display label changes. #}
            + onclick="cassandraSetTone('NOVICE')">Pro
            diff --git a/app/templates/settings.html b/app/templates/settings.html index 7ce054c..37e88f7 100644 --- a/app/templates/settings.html +++ b/app/templates/settings.html @@ -185,7 +185,7 @@
            + {% if (user.digest_tone or 'INTERMEDIATE') == 'NOVICE' %}checked{% endif %}> Pro
            From 3e1a14f3348ba04ccac8e2a0bf12865e44972b7e Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 11:23:52 +0200 Subject: [PATCH 54/69] =?UTF-8?q?ui:=20flip=20tone=20relabel=20=E2=80=94?= =?UTF-8?q?=20"Pro"=20now=20maps=20to=20INTERMEDIATE,=20not=20NOVICE?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reverses the polarity of 71155a6 to match the actual semantics: - "Novice" stays labelled "Novice" → glossary tooltips, plainer prose. - "Intermediate" is relabelled "Pro" → terse, assumes fluency, no hand-holding. This is the mode an expert reader wants, so the "Pro" badge actually fits. Backend tone values (NOVICE, INTERMEDIATE) are unchanged — no API, prompt, or stored-preference impact. Only the display strings flip. Also drops the .tone-toggle button min-width: 10em override added in 71155a6. With "Intermediate" gone from the visible label, the longest remaining label is "Novice" (6 chars), which fits the shared 5.5em just like the theme and language toggles. Co-Authored-By: Claude Opus 4.7 --- app/static/css/layout.css | 5 ----- app/templates/base.html | 13 +++++++------ app/templates/settings.html | 4 ++-- 3 files changed, 9 insertions(+), 13 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index fc94c65..83f03f5 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -157,11 +157,6 @@ body.drawer-open .drawer-backdrop { opacity: 1; } .tone-toggle button + button, .theme-toggle button + button, .lang-toggle button + button { border-left: 1px solid var(--border); } -/* The tone-toggle's longer option ("Intermediate", 12 chars) needs more - room than the shared 5.5em min-width. We size both buttons to fit the - longest one so the popup width (set by container width via left/right:0) - doesn't get clipped when only the short "Pro" label is active. 
*/ -.tone-toggle button { min-width: 10em; } .tone-toggle button:hover, .theme-toggle button:hover, .lang-toggle button:hover { color: var(--accent); } diff --git a/app/templates/base.html b/app/templates/base.html index 7344b76..97028ab 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -234,16 +234,17 @@ Log
            - {# The "Pro" label maps to the NOVICE tone server-side — kept that - way to avoid touching every stored user preference and API - contract. The mode itself (glossary tooltips + plainer - framing) is unchanged; only the display label changes. #} + {# The "Pro" label maps to the INTERMEDIATE tone server-side — + kept that way to avoid touching every stored user preference + and API contract. The mode itself (terse, no glossary + tooltips, assumes fluency) is unchanged; only the display + label changes. #}
            + onclick="cassandraSetTone('NOVICE')">Novice + onclick="cassandraSetTone('INTERMEDIATE')">Pro
            diff --git a/app/templates/settings.html b/app/templates/settings.html index 37e88f7..ac0107d 100644 --- a/app/templates/settings.html +++ b/app/templates/settings.html @@ -185,9 +185,9 @@
            + {% if (user.digest_tone or 'INTERMEDIATE') == 'NOVICE' %}checked{% endif %}> Novice + {% if (user.digest_tone or 'INTERMEDIATE') == 'INTERMEDIATE' %}checked{% endif %}> Pro
From 48f022b71b456c6d77189838e86f6039313e1edd Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 11:44:41 +0200 Subject: [PATCH 55/69] i18n: stop truncating IT translations + localise the chat sidebar MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three connected fixes after the user spotted the 2026-05-28 IT log cutting off mid-sentence: 1. translation: bump max_tokens 4000 → 8000. call_llm()'s default cap was 4000, which is what the English log generator itself uses as its ceiling. Italian expands roughly 15-25 % over English in tokens, so any near-cap English source produced an IT translation that hit finish_reason=length and returned a truncated body — silently, because _call_provider() only raises when content is fully empty. The strategic_log_translations table has dozens of rows where completion_tokens landed at exactly 4000 with content well under half the source length. 8000 gives ample headroom for any of the five LANGUAGES we ship (en/it/es/fr/de). 2. log.html: localise the chat sidebar strings. user_lang was already passed into the template by pages.py, so an inline {% if user_lang == 'it' %} keeps it simple. Covers the "Ask Cassandra" title, the "grounded on…" hint, the helper lede, the textarea placeholder, and the Send button label. 3. chat endpoint: append respond_in_clause(user.lang) to the system prompt. The chat conversation can now happen in IT — the model's first reply lands in the right language even when the user's first turn is short. scripts/backfill_truncated_translations.py: one-off cleanup utility. Scans strategic_log_translations for rows whose translated content is < 70 % of the English source (the truncation signal — IT *expands* beyond English, so a shorter translation is always suspect), deletes them, and re-translates via the translation service with its raised 8000-token cap. Supports --date, --since, --all and --dry-run. 
The 2026-05-28 fan-out has already been re-translated (13/13 rows). Other historical dates still hold older truncations; the user can decide whether to backfill those (the script is idempotent). Co-Authored-By: Claude Opus 4.7 --- app/routers/chat.py | 8 ++ app/services/translation.py | 8 +- app/templates/log.html | 15 +- scripts/backfill_truncated_translations.py | 154 +++++++++++++++++++++ 4 files changed, 180 insertions(+), 5 deletions(-) create mode 100644 scripts/backfill_truncated_translations.py diff --git a/app/routers/chat.py b/app/routers/chat.py index f4198ba..f213637 100644 --- a/app/routers/chat.py +++ b/app/routers/chat.py @@ -21,6 +21,7 @@ from app.db import get_session, utcnow from app.jobs._market_context import REFERENCE_LINE from app.models import AICall, Headline, Quote, StrategicLog from app.routers.api import _md_to_html +from app.services.i18n import respond_in_clause from app.services.llm_prompts import build_chat_system_prompt from app.services.openrouter import call_llm, month_start @@ -160,6 +161,13 @@ async def chat( headlines=headlines, reference_line=REFERENCE_LINE, ) + # Respect the user's interface language preference: append a single + # localized "respond in" nudge so the assistant answers in IT when + # the user has lang=it. The prompt + history (which includes the + # user's own question, often in their language) are usually enough, + # but the nudge guarantees the first reply lands correctly. 
+ user_lang = principal.user.lang if principal and principal.user else "en" + system_prompt = system_prompt + respond_in_clause(user_lang) msgs = [{"role": "system", "content": system_prompt}] for m in history: diff --git a/app/services/translation.py b/app/services/translation.py index 46dbabb..96f99ed 100644 --- a/app/services/translation.py +++ b/app/services/translation.py @@ -65,7 +65,13 @@ async def translate( {"role": "system", "content": system_prompt}, {"role": "user", "content": text}, ] - result = await call_llm(client, messages) + # Italian / Spanish / French / German typically expand the token count + # 15-25 % over English (longer words, more sub-word splits). Our + # strategic-log generator runs up to its own 4000-token cap, so a 4000 + # cap here would silently truncate any near-cap source. 8000 gives + # ample headroom for every language we currently support and costs + # nothing extra unless the model actually emits more tokens. + result = await call_llm(client, messages, max_tokens=8000) content = (result.content or "").strip() # Strip code fences if the model wrapped its output despite the system rule. diff --git a/app/templates/log.html b/app/templates/log.html index c91cb7a..8370050 100644 --- a/app/templates/log.html +++ b/app/templates/log.html @@ -33,21 +33,28 @@ {% if paid %} {% else %} diff --git a/scripts/backfill_truncated_translations.py b/scripts/backfill_truncated_translations.py new file mode 100644 index 0000000..8fd7a72 --- /dev/null +++ b/scripts/backfill_truncated_translations.py @@ -0,0 +1,154 @@ +"""One-off backfill: re-translate StrategicLog rows whose Italian (or +other-language) translation was truncated by the old 4000-token cap in +services/translation.py. 
+ +Selection criterion for a "truncated" row: +- the translated content is shorter than 70 % of the English source + (SHORTNESS_RATIO below); completion_tokens is printed but not used to select + +Usage inside the app container: + docker compose exec app python -m scripts.backfill_truncated_translations \ + --date 2026-05-28 # restrict to one day, repeatable + docker compose exec app python -m scripts.backfill_truncated_translations \ + --since 2026-04-01 # everything from a date onward + docker compose exec app python -m scripts.backfill_truncated_translations \ + --all # entire history (slow / costs $$) + docker compose exec app python -m scripts.backfill_truncated_translations \ + --date 2026-05-28 --dry-run # just print what would be touched + +Idempotent: each affected row is deleted then re-inserted in its own +transaction, so a re-run only re-translates rows that are STILL flagged +truncated after the previous pass. +""" +from __future__ import annotations + +import argparse +import asyncio +import sys +from datetime import date, datetime + +import httpx +from sqlalchemy import and_, delete, func, or_, select + +from app.db import get_session_factory +from app.logging import get_logger +from app.models import StrategicLog, StrategicLogTranslation +from app.services.translation import translate + +log = get_logger("backfill.translations") + +# Italian (and the other expansive Romance / Germanic targets we support) +# typically produce 15-25 % MORE characters than the English source, so +# a translation shorter than the source — let alone much shorter — is a +# truncation signal even if completion_tokens didn't land exactly at the +# old 4000-token cap. We tolerate down to 70 % of source length to avoid +# touching the occasional legitimately-compressed translation. 
+SHORTNESS_RATIO = 0.7 + + +def _is_truncated(en_chars: int, tr_chars: int, tr_completion: int | None) -> bool: + if en_chars <= 0: + return False + return tr_chars < en_chars * SHORTNESS_RATIO + + +async def _find_targets(session, day: date | None, since: date | None, all_: bool): + q = ( + select( + StrategicLog.id.label("log_id"), + StrategicLog.generated_at, + func.char_length(StrategicLog.content).label("en_chars"), + StrategicLogTranslation.id.label("tr_id"), + StrategicLogTranslation.lang, + StrategicLogTranslation.completion_tokens.label("tr_tok"), + func.char_length(StrategicLogTranslation.content).label("tr_chars"), + ) + .join(StrategicLogTranslation, + StrategicLogTranslation.log_id == StrategicLog.id) + ) + if day is not None: + q = q.where(func.date(StrategicLog.generated_at) == day) + elif since is not None: + q = q.where(StrategicLog.generated_at >= since) + # all_ → no date filter + q = q.order_by(StrategicLog.generated_at, StrategicLogTranslation.lang) + rows = (await session.execute(q)).all() + return [r for r in rows if _is_truncated(r.en_chars, r.tr_chars, r.tr_tok)] + + +async def _retranslate_one(session, client: httpx.AsyncClient, log_id: int, lang: str): + """Delete the existing (log_id, lang) translation row and write a fresh + one via the (now uncapped) translation service. 
Each row commits + independently so a per-row failure doesn't roll back the rest.""" + src_row = (await session.execute( + select(StrategicLog).where(StrategicLog.id == log_id) + )).scalar_one_or_none() + if src_row is None: + log.warning("backfill.missing_source", log_id=log_id) + return False + + await session.execute( + delete(StrategicLogTranslation) + .where(StrategicLogTranslation.log_id == log_id) + .where(StrategicLogTranslation.lang == lang) + ) + await session.commit() + + try: + translated_md, llm_result = await translate(client, src_row.content, lang) + except Exception as exc: + log.warning("backfill.translate_failed", + log_id=log_id, lang=lang, error=str(exc)[:200]) + return False + + session.add(StrategicLogTranslation( + log_id=log_id, + lang=lang, + content=translated_md, + model=llm_result.model, + prompt_tokens=llm_result.prompt_tokens, + completion_tokens=llm_result.completion_tokens, + cost_usd=llm_result.cost_usd, + )) + await session.commit() + return True + + +async def main(args): + day = datetime.strptime(args.date, "%Y-%m-%d").date() if args.date else None + since = datetime.strptime(args.since, "%Y-%m-%d").date() if args.since else None + if not (day or since or args.all): + print("Specify --date, --since, or --all", file=sys.stderr) + sys.exit(2) + + session_factory = get_session_factory() + async with session_factory() as session: + targets = await _find_targets(session, day, since, args.all) + print(f"Found {len(targets)} truncated translation row(s):") + for r in targets: + print(f" log_id={r.log_id} lang={r.lang} " + f"en={r.en_chars}c tr={r.tr_chars}c " + f"tok={r.tr_tok} at {r.generated_at}") + if args.dry_run or not targets: + return + + ok = 0 + async with httpx.AsyncClient(follow_redirects=True) as client: + for r in targets: + print(f" re-translating log_id={r.log_id} lang={r.lang}…", end=" ") + done = await _retranslate_one(session, client, r.log_id, r.lang) + print("OK" if done else "FAILED") + if done: + ok += 1 + 
+ print(f"\nRe-translated {ok}/{len(targets)} row(s).") + + +if __name__ == "__main__": + p = argparse.ArgumentParser() + grp = p.add_mutually_exclusive_group() + grp.add_argument("--date", help="single day YYYY-MM-DD") + grp.add_argument("--since", help="from YYYY-MM-DD onward") + grp.add_argument("--all", action="store_true", help="entire history") + p.add_argument("--dry-run", action="store_true", + help="list affected rows without rewriting") + asyncio.run(main(p.parse_args())) From fca05aef7ab3b52110d5966697e3e73360eb04a7 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 12:01:28 +0200 Subject: [PATCH 56/69] i18n: live-swap chat sidebar labels on language toggle MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The strategic log content already refreshes via HTMX on lang-changed (server-side translation lookup), but the chat sidebar's static labels — title, hint, helper lede, textarea placeholder, Send button — were baked into the HTML by Jinja at page render and only updated after a full reload. Add a tiny client-side i18n dictionary (CASSANDRA_I18N) plus cassandraApplyI18n(lang) in base.html. cassandraSetLang() now calls cassandraApplyI18n(newLang) right after the language PATCH succeeds and before firing the HTMX triggers, so labels swap in step with the AI content. Convention: an element carrying data-i18n="key" gets its textContent set; one carrying data-i18n-placeholder="key" gets its .placeholder set. Initial render still goes through the existing {% if user_lang == 'it' %} Jinja blocks so there's no flash of English on page load for IT users — cassandraApplyI18n is not invoked until the toggle is clicked. Only the chat sidebar has bindings today. Adding more labels later is a matter of dropping a key into the dict and tagging the element. 
Co-Authored-By: Claude Opus 4.7 --- app/templates/base.html | 36 ++++++++++++++++++++++++++++++++++++ app/templates/log.html | 20 +++++--------------- 2 files changed, 41 insertions(+), 15 deletions(-) diff --git a/app/templates/base.html b/app/templates/base.html index 97028ab..9bbb46f 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -154,6 +154,40 @@ try { localStorage.setItem('cassandra.theme', newTheme); } catch (e) {} }; + // Static-label i18n dictionary. AI-generated content is re-fetched via + // HTMX (server-side translation), but plain UI labels are baked into + // the HTML at render time. This dict + applyI18n() below let the + // language toggle swap labels live without a page refresh. + // Convention: … (sets textContent), + // (sets .placeholder). + // First-render correctness is handled by the template's + // {% if user_lang == 'it' %} block — applyI18n only kicks in on + // subsequent toggle events. + window.CASSANDRA_I18N = { + 'chat.title': { en: 'Ask Cassandra', + it: 'Chiedi a Cassandra' }, + 'chat.hint': { en: 'grounded on the latest log + live data', + it: "basato sull'ultimo log + dati in tempo reale" }, + 'chat.lede': { en: "Ask about today's analysis. The model sees the latest strategic log, live market readings across all groups, and the last 24h of thesis-filtered headlines. Refresh wipes this conversation.", + it: "Fai domande sull'analisi di oggi. Il modello vede l'ultimo log strategico, le quotazioni di mercato in tempo reale per tutti i gruppi e le ultime 24h di titoli filtrati per tesi. Un refresh della pagina cancella questa conversazione." }, + 'chat.placeholder': { en: 'e.g. why is the defence sleeve flat through Hormuz?', + it: 'es. perché il comparto difesa è piatto nonostante Hormuz?' 
}, + 'chat.send': { en: 'Send', + it: 'Invia' }, + }; + window.cassandraApplyI18n = function (lang) { + document.querySelectorAll('[data-i18n]').forEach(function (el) { + var key = el.getAttribute('data-i18n'); + var entry = window.CASSANDRA_I18N[key]; + if (entry && entry[lang] != null) el.textContent = entry[lang]; + }); + document.querySelectorAll('[data-i18n-placeholder]').forEach(function (el) { + var key = el.getAttribute('data-i18n-placeholder'); + var entry = window.CASSANDRA_I18N[key]; + if (entry && entry[lang] != null) el.placeholder = entry[lang]; + }); + }; + window.cassandraSetLang = async function (newLang) { var pill = document.getElementById('lang-toggle'); if (!pill) return; @@ -170,6 +204,8 @@ body: JSON.stringify({lang: newLang}), }); if (!r.ok) throw new Error('HTTP ' + r.status); + // Swap any static UI labels that have i18n bindings. + window.cassandraApplyI18n(newLang); // Trigger HTMX-driven panels to re-fetch in the new language. // Same shape as cassandraSetTone — every panel that listens to // tone-changed also listens to lang-changed. diff --git a/app/templates/log.html b/app/templates/log.html index 8370050..cadc3f1 100644 --- a/app/templates/log.html +++ b/app/templates/log.html @@ -33,28 +33,18 @@ {% if paid %} {% else %} From 259146ecdc0c4755ef277aa9d058d274de143cad Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 12:03:44 +0200 Subject: [PATCH 57/69] fix: don't put literal Jinja syntax inside JS comments in base.html MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous commit's i18n explanatory comment included the snippet {% if user_lang == 'it' %} as illustration — but Jinja parses the whole template, including content inside JS // comments, so that literal got picked up as a real (unclosed) tag and every page rendered with a TemplateSyntaxError. Rewrite the comment without the literal Jinja syntax. 
Co-Authored-By: Claude Opus 4.7 --- app/templates/base.html | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/app/templates/base.html b/app/templates/base.html index 9bbb46f..97d1750 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -158,11 +158,10 @@ // HTMX (server-side translation), but plain UI labels are baked into // the HTML at render time. This dict + applyI18n() below let the // language toggle swap labels live without a page refresh. - // Convention: … (sets textContent), - // (sets .placeholder). - // First-render correctness is handled by the template's - // {% if user_lang == 'it' %} block — applyI18n only kicks in on - // subsequent toggle events. + // Convention: data-i18n="key" sets textContent; + // data-i18n-placeholder="key" sets .placeholder. + // First-render correctness is handled by the template's user_lang + // conditional, so applyI18n only kicks in on subsequent toggles. window.CASSANDRA_I18N = { 'chat.title': { en: 'Ask Cassandra', it: 'Chiedi a Cassandra' }, From 6c4c7118308e7cc076a2af40779b470dc16e35ec Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 12:17:49 +0200 Subject: [PATCH 58/69] ui: log page tone badge follows the toggle (novice / pro) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Strategic Log Archive panel header used to show two engineery badges sourced from server config: new logs use: tone intermediate analysis speculative Both were misleading: - The tone badge described the SERVER's generator setting, not the user's reading preference — confusingly disconnected from the Novice | Pro toggle in the topbar that actually controls what AI panels render. - The analysis flag is always SPECULATIVE in production, so the badge carried no information. Drop the "new logs use:" prefix and the analysis badge. 
The tone badge now mirrors the user's toggle: NOVICE → "novice", INTERMEDIATE → "pro" (same data values; just the display label flips, matching the header relabel from 3e1a14f). Wiring lives in base.html: a new cassandraSyncToneBadge(tone) helper updates the #tone-badge element when present. Called from DOMContentLoaded (so the initial badge picks up the localStorage tone) and from cassandraSetTone (so toggling the header updates the badge live, without a page refresh). current_tone / current_analysis are removed from _log_page_context — log.html was the only consumer and neither key is referenced now. Co-Authored-By: Claude Opus 4.7 --- app/routers/pages.py | 3 --- app/templates/base.html | 18 ++++++++++++++++++ app/templates/log.html | 8 +++++--- 3 files changed, 23 insertions(+), 6 deletions(-) diff --git a/app/routers/pages.py b/app/routers/pages.py index d1327a3..1801f93 100644 --- a/app/routers/pages.py +++ b/app/routers/pages.py @@ -77,12 +77,9 @@ async def _resolve_log_date(session: AsyncSession, day: str | None) -> date: def _log_page_context(target: date, paid: bool, user_lang: str = "en") -> dict: - s = get_settings() return { "selected_iso": target.isoformat(), "selected_month": target.strftime("%Y-%m"), - "current_tone": s.CASSANDRA_TONE.upper(), - "current_analysis": s.CASSANDRA_ANALYSIS.upper(), "paid": paid, "user_lang": user_lang, } diff --git a/app/templates/base.html b/app/templates/base.html index 97d1750..850fbca 100644 --- a/app/templates/base.html +++ b/app/templates/base.html @@ -130,12 +130,30 @@ // top-of-page inline script already wrote to . var themePill = document.getElementById('theme-toggle'); if (themePill) themePill.dataset.theme = document.documentElement.dataset.theme || 'light'; + // Sync the /log page's tone badge to the saved tone — server-side + // first render defaults to "pro", but a returning NOVICE user + // should see "novice" before any toggle interaction. 
+ window.cassandraSyncToneBadge(currentTone()); }); + // Sync the optional #tone-badge (currently used on the /log page) to + // the supplied tone. NOVICE renders as "novice"; INTERMEDIATE renders + // as "pro" — matches the header toggle's display labels. Safe to call + // on pages that don't render the badge. + window.cassandraSyncToneBadge = function (tone) { + var badge = document.getElementById('tone-badge'); + if (!badge) return; + var label = (tone === 'NOVICE') ? 'novice' : 'pro'; + badge.className = 'badge badge--tone-' + label; + var span = badge.querySelector('[data-tone-label]'); + if (span) span.textContent = label; + }; + window.cassandraSetTone = function (newTone) { try { localStorage.setItem('cassandra.tone', newTone); } catch (e) {} var pill = document.getElementById('tone-toggle'); if (pill) pill.dataset.tone = newTone; + window.cassandraSyncToneBadge(newTone); // Trigger a re-fetch of every AI-driven HTMX target on the page. // Easiest: dispatch a custom event that the relevant elements // listen to. Simpler still: fire htmx.trigger on the well-known diff --git a/app/templates/log.html b/app/templates/log.html index cadc3f1..51d090e 100644 --- a/app/templates/log.html +++ b/app/templates/log.html @@ -8,9 +8,11 @@ selected {{ selected_iso }}  ·  - new logs use: - tone {{ current_tone | lower }} - analysis {{ current_analysis | lower }} + {# Tone badge mirrors the header toggle. base.html's DOMContentLoaded + hook and cassandraSetTone() both update this element so the label + stays in step with the user's choice — no need to re-render the + page when the toggle flips. #} + tone pro
            From a55168d20a306b92c221b11c63fac68cdcd35712 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 12:35:10 +0200 Subject: [PATCH 59/69] ui: log panel stretches to portfolio bottom; AI analysis stays expanded MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two small fixes to the dashboard right column based on user feedback: 1. layout.css — drop align-self:start from #log-panel. The panel previously shrank to its content, leaving the right-hand column visually shorter than the indicators+portfolio stack on the left. Removing the override lets the grid stretch the panel to the full row span so the two columns now bottom-align. The log content still sits at the top of the panel; any extra height is empty padding inside the box. 2. portfolio.js — re-hydrate AI analysis expanded. The 60s auto-refresh rebuilds the portfolio mount and re-attaches the previously-generated analysis from localStorage, but the
            element was re-attached with open:false — collapsing it under the user's cursor every minute. Users reasonably perceived that as "the analysis disappeared". Hydrate as open:true so the body stays visible; the user can still click the summary to collapse manually within a refresh window. Co-Authored-By: Claude Opus 4.7 --- app/static/css/layout.css | 9 ++++----- app/static/js/portfolio.js | 8 +++++--- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/app/static/css/layout.css b/app/static/css/layout.css index 83f03f5..6acaf6c 100644 --- a/app/static/css/layout.css +++ b/app/static/css/layout.css @@ -253,11 +253,10 @@ body.drawer-open .drawer-backdrop { opacity: 1; } #portfolio-panel { grid-area: portfolio; } #log-panel { grid-area: log; - /* Don't stretch to fill both grid rows; if the log is shorter than - the portfolio next to it, the surplus below would render as a big - empty white box. Aligning to the start makes the panel shrink to - its content and the dashboard background fills any gap. */ - align-self: start; + /* Stretch (default align-self) so the log panel's border reaches the + bottom of the portfolio next to it — the two right-hand panels + align cleanly. The log body itself sits at the top of the panel; + any height beyond its content is empty padding inside the box. */ } #news-panel { grid-area: news; } diff --git a/app/static/js/portfolio.js b/app/static/js/portfolio.js index b7f820f..0f3ecb4 100644 --- a/app/static/js/portfolio.js +++ b/app/static/js/portfolio.js @@ -387,10 +387,12 @@ }); // Re-hydrate any cached AI analysis so the 60s auto-refresh doesn't - // wipe it. Collapsed by default on hydration so the panel stays - // compact — click the header to expand. + // wipe it. Rendered expanded so the user keeps seeing the body they + // just generated — collapsing it under their cursor every minute + // reads as "the analysis disappeared". 
They can still click the + // header to collapse manually within a single refresh window. if (pie.analysis && pie.analysis.content) { - showAnalysis(pie.analysis, { open: false }); + showAnalysis(pie.analysis, { open: true }); } } From 8347c90235b5b8c28cec6930413561c7222cbf08 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 12:58:06 +0200 Subject: [PATCH 60/69] ui: drop log-content's fixed-viewport scroll cap The dashboard's log panel now stretches in the grid to bottom-align with the portfolio (a55168d), but .log-content still carried max-height: calc(100vh - 240px) + overflow-y: auto from an older layout. That produced an inner scrollbar inside the panel AND left visible dead space below the scrolling region. Removing the cap lets the panel grid handle the height and the page scroll handle very long logs; no more nested scroll region. Co-Authored-By: Claude Opus 4.7 --- app/static/css/log-chat.css | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/app/static/css/log-chat.css b/app/static/css/log-chat.css index 895953b..0ca7954 100644 --- a/app/static/css/log-chat.css +++ b/app/static/css/log-chat.css @@ -10,8 +10,13 @@ color: var(--text); max-width: 76ch; margin: 0 auto; - max-height: calc(100vh - 240px); - overflow-y: auto; + /* No max-height cap here — the dashboard's log panel now stretches in + the grid to match the left column's bottom (see #log-panel in + layout.css). A constrained max-height was producing an awkward + inner scrollbar AND leaving dead space below it inside the panel. + With the cap gone the content sits at the panel's top, the panel + grows or shrinks with the grid, and the regular page scroll + handles very long logs. 
*/ } .log-content p { margin: 0 0 1.1em; } .log-content h1, .log-content h2, .log-content h3, .log-content h4 { From 19d4854f50fc2b2fd8a66242d2e79ff72182356e Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:02:36 +0200 Subject: [PATCH 61/69] llm: support JSON-mode + stop publishing the reasoning field MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two changes to the LLM call path that together close the chain-of-thought leakage surface: 1. _call_provider accepts an optional `response_format` (forwarded to the OpenAI-shaped API — DeepSeek and OpenRouter both honour {"type": "json_object"}). Threaded through call_llm so callers can force structured output without monkey-patching the body. The indicator-summary job will use this next: it'll require the model to emit {"read": "..."} and parse the field, making prose outside the JSON object physically impossible to publish. 2. Empty `content` no longer falls back to the `reasoning` field. `reasoning` is the model's internal scratchpad — "Let's see...", half-formed math, planning notes. We had a fallback that surfaced it when content was null, but the field is intended for debugging the model, not for publication. After the 2026-05-29 valuation read leaked into production, the fallback is gone: an empty content row now raises so the caller retries or skips, and the previous good row remains visible. Test updated to assert this safer behaviour. 
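As a rough illustration of change 2, the content-extraction rule could be sketched as below. The helper name `extract_content` and the two sample payloads are assumptions made for this example, not the code in app/services/openrouter.py; only the `content`, `reasoning` and `finish_reason` keys and the "LLM returned empty content" error text are taken from this patch.

```python
def extract_content(data: dict) -> str:
    """Return the user-facing answer from an OpenAI-shaped chat response.

    Hypothetical helper: an empty `content` raises instead of falling
    back to the `reasoning` scratchpad.
    """
    choice = data["choices"][0]
    content = choice["message"].get("content")
    if not content:
        finish = choice.get("finish_reason")
        raise RuntimeError(f"LLM returned empty content (finish_reason={finish})")
    return content


# Illustrative payloads (invented for this sketch).
ok = {"choices": [{"message": {"content": "Valuations look stretched."},
                   "finish_reason": "stop"}]}
leaky = {"choices": [{"message": {"content": None, "reasoning": "Let's see..."},
                      "finish_reason": "length"}]}

print(extract_content(ok))
try:
    extract_content(leaky)
except RuntimeError as exc:
    # The scratchpad is never surfaced; the caller decides to retry or skip.
    print("rejected:", exc)
```

The point of raising rather than returning an empty string is that the caller cannot accidentally persist a blank or leaked row; the previous good row simply stays in place.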
Co-Authored-By: Claude Opus 4.7 --- app/services/openrouter.py | 37 +++++++++++++++++++++--------- tests/test_openrouter_transport.py | 18 ++++++++------- 2 files changed, 36 insertions(+), 19 deletions(-) diff --git a/app/services/openrouter.py b/app/services/openrouter.py index ca31f2f..598150c 100644 --- a/app/services/openrouter.py +++ b/app/services/openrouter.py @@ -136,10 +136,15 @@ async def _call_provider( messages: list[dict], model: str | None, max_tokens: int, + response_format: dict | None = None, ) -> LogResult: """One provider call with tenacity retries on transport/HTTP errors. Lives inside the retry decorator so retries happen within a provider, - not across the fallback chain.""" + not across the fallback chain. + + `response_format` is forwarded to the provider verbatim — DeepSeek and + OpenRouter both accept the OpenAI-shaped {"type": "json_object"} for + JSON-mode generation. None means free-form text.""" url, api_key, default_model, extra_headers = _endpoint_for(provider) used_model = model or default_model headers = { @@ -147,18 +152,22 @@ async def _call_provider( "Content-Type": "application/json", **extra_headers, } - r = await client.post( - url, - headers=headers, - json={"model": used_model, "messages": messages, "max_tokens": max_tokens}, - timeout=180, - ) + body: dict = {"model": used_model, "messages": messages, "max_tokens": max_tokens} + if response_format is not None: + body["response_format"] = response_format + r = await client.post(url, headers=headers, json=body, timeout=180) r.raise_for_status() data = r.json() msg = data["choices"][0]["message"] - # Some providers return null content + populated `reasoning` for thinking - # models, or null content when finish_reason=length cut off the response. - content = msg.get("content") or msg.get("reasoning") + # The `content` field is the model's user-facing answer. 
The optional + # `reasoning` field is the model's internal chain-of-thought — never + # safe to publish; it contains raw scratchpad ("Let's see…", + # mid-sentence question marks, planning notes). If `content` is empty + # (provider issue, finish_reason=length cutoff, or the model spent + # its budget on thinking), treat that as a generation failure and + # raise so the caller can retry or skip the row. Do NOT fall back to + # reasoning — see the 2026-05-29 valuation-read leak. + content = msg.get("content") if not content: finish = data["choices"][0].get("finish_reason") raise RuntimeError( @@ -189,6 +198,7 @@ async def call_llm( messages: list[dict], model: str | None = None, max_tokens: int = 4000, + response_format: dict | None = None, ) -> LogResult: """Provider-aware chat completion with fallback. Tries primary (LLM_PROVIDER) first; if it raises after retries, falls through to @@ -197,7 +207,11 @@ async def call_llm( The returned LogResult.model is prefixed with the provider that actually answered (e.g. ``deepseek/deepseek-v4-flash`` or ``openrouter/deepseek/deepseek-v4-flash``) — useful admin metadata - even though we hide it from the user-facing UI.""" + even though we hide it from the user-facing UI. 
+ + Pass response_format={"type": "json_object"} to force JSON-mode + output (the model still needs to be instructed in the system prompt + to emit valid JSON — this flag enforces, not asks).""" chain = _provider_chain() if not chain: raise RuntimeError("No LLM provider configured (no API key set)") @@ -207,6 +221,7 @@ async def call_llm( try: result = await _call_provider( client, provider, messages, model, max_tokens, + response_format=response_format, ) if i > 0: from app.logging import get_logger diff --git a/tests/test_openrouter_transport.py b/tests/test_openrouter_transport.py index dfc14b0..e836044 100644 --- a/tests/test_openrouter_transport.py +++ b/tests/test_openrouter_transport.py @@ -183,10 +183,12 @@ async def test_call_llm_uses_upstream_cost_when_provided(monkeypatch): @pytest.mark.asyncio -async def test_call_llm_falls_back_to_reasoning_field_when_content_null(monkeypatch): - """Thinking models sometimes return null `content` plus a populated - `reasoning` block — we surface the reasoning so the caller still gets - something usable rather than treating the row as empty.""" +async def test_call_llm_does_not_publish_reasoning_when_content_null(monkeypatch): + """The `reasoning` field is the model's internal chain-of-thought + (scratchpad: "Let's see…", planning notes, half-formed math). It is + never safe to surface as the user-facing answer — see the + 2026-05-29 valuation-read leak. 
If `content` is null we treat the + row as a generation failure and raise; the caller can retry or skip.""" _configure(monkeypatch, DEEPSEEK_API_KEY="sk-d", LLM_FALLBACK="") def handler(request: httpx.Request) -> httpx.Response: @@ -199,8 +201,8 @@ async def test_call_llm_falls_back_to_reasoning_field_when_content_null(monkeypa }) async with httpx.AsyncClient(transport=_mock_post(handler)) as client: - result = await ot.call_llm(client, [{"role": "user", "content": "hi"}]) - assert result.content == "deep thought" + with pytest.raises(RuntimeError, match="LLM returned empty content"): + await ot.call_llm(client, [{"role": "user", "content": "hi"}]) @pytest.mark.asyncio @@ -228,7 +230,7 @@ async def test_call_llm_falls_back_to_secondary_when_primary_raises(monkeypatch) prompt_tokens=1, completion_tokens=2, cost_usd=0.0, ) - async def fake(_client, provider, _messages, _model, _max_tokens): + async def fake(_client, provider, _messages, _model, _max_tokens, response_format=None): calls.append(provider) if provider == "deepseek": raise RuntimeError("primary down") @@ -247,7 +249,7 @@ async def test_call_llm_raises_last_exception_when_chain_exhausted(monkeypatch): _configure(monkeypatch, DEEPSEEK_API_KEY="sk-d", OPENROUTER_API_KEY="sk-or") - async def fake(_client, provider, _messages, _model, _max_tokens): + async def fake(_client, provider, _messages, _model, _max_tokens, response_format=None): raise RuntimeError(f"{provider} broken") with patch.object(ot, "_call_provider", fake): From 45fa31bb2bfa38670341eafe191b5c88cb0ff700 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:10:52 +0200 Subject: [PATCH 62/69] ai: structured-output + reviewer agent for indicator summaries MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the regex-based clean_summary / looks_like_leakage pipeline that produced the 2026-05-29 valuation-read leak. Two layers of defence in depth: 1. JSON-mode generation. 
The per-group and aggregate summary system prompts now require the model to emit a single object {"read": "..."}; response_format={"type":"json_object"} is passed through to the provider so the API enforces well-formed JSON. Prose outside the field is physically impossible. The "read" field is the only schema slot, so the model has nowhere to spill scratchpad into the envelope. 2. Reviewer agent. services/output_review.review_read() makes a second small LLM call that judges whether the candidate "read" string is publishable. It catches the residual failure mode — scratchpad INSIDE the field ("Let's see…", multi-question parentheticals, meta-commentary) — and returns a JSON verdict {"clean": bool, "reason": str}. Any failure (provider error, parse error, missing field) returns clean=false (fail-safe). Cost ~$0.0001/check; latency ~1-2 s in the hourly job, no user-facing latency. The old regex scaffolding (_LEAK_PATTERNS, clean_summary, looks_like_leakage, _TRAILING_QUOTE) is deleted entirely. It produced false positives (chopped legitimate "The indicators are…" leaders) and false negatives (never matched the chain-of-thought patterns the model actually emits). The reviewer agent is strictly better on both. On reviewer/parse rejection: don't persist a new IndicatorSummary; the API's existing fallback to the previous good row continues to serve the panel. Failures are logged as ind_summary.json_invalid / ind_summary.reviewer_rejected so we can measure the rejection rate. Reviewer cost is added to the row's recorded cost_usd so the monthly budget cap covers the full pipeline. Adds tests/test_output_review.py: 11 cases covering _extract_read (JSON envelope handling — invalid JSON, missing field, wrong types, empty values) and review_read (clean / unclean verdicts plus three fail-safe paths for malformed reviewer responses). 
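The two layers described above can be sketched end to end as follows. This is a simplified, synchronous illustration under stated assumptions: `publishable_read`, `Verdict` and `stub_reviewer` are hypothetical names, and the stub stands in for the second LLM call that the real `review_read()` makes. The 40-character minimum and the fail-safe behaviour mirror what this patch describes.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    clean: bool
    reason: str


def extract_read(raw: str) -> Optional[str]:
    """Return the "read" field of the JSON envelope, or None on any deviation."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    read = parsed.get("read")
    if not isinstance(read, str):
        return None
    return read.strip() or None


def publishable_read(
    raw: str,
    reviewer: Callable[[str], Verdict],
    min_len: int = 40,
) -> Optional[str]:
    """Layer 1: parse the envelope. Layer 2: ask the reviewer. Fail safe."""
    candidate = extract_read(raw)
    if candidate is None or len(candidate) < min_len:
        return None
    try:
        verdict = reviewer(candidate)
    except Exception:
        # Any reviewer failure counts as "not clean".
        return None
    return candidate if verdict.clean else None


def stub_reviewer(text: str) -> Verdict:
    # Stand-in for the LLM reviewer: flags one obvious scratchpad opener.
    if text.lower().startswith("let's see"):
        return Verdict(False, "scratchpad")
    return Verdict(True, "")


good = json.dumps({"read": "Credit spreads are widening while equities hold, "
                           "a divergence worth watching."})
leaky = json.dumps({"read": "Let's see... spreads up? Actually equities flat? "
                            "Hmm, maybe mention both."})
prose = "Here is the read: spreads are widening."

for raw in (good, leaky, prose):
    print(publishable_read(raw, stub_reviewer))
```

Returning None at every failure point keeps the decision in one place: the job persists a new row only when it gets a string back, and otherwise leaves the previous good summary visible.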
Co-Authored-By: Claude Opus 4.7 --- app/jobs/indicator_summary_job.py | 245 ++++++++++++++---------------- app/services/llm_prompts.py | 39 ++++- app/services/output_review.py | 107 +++++++++++++ tests/test_output_review.py | 146 ++++++++++++++++++ 4 files changed, 396 insertions(+), 141 deletions(-) create mode 100644 app/services/output_review.py create mode 100644 tests/test_output_review.py diff --git a/app/jobs/indicator_summary_job.py b/app/jobs/indicator_summary_job.py index 97c5f80..422c49c 100644 --- a/app/jobs/indicator_summary_job.py +++ b/app/jobs/indicator_summary_job.py @@ -4,7 +4,7 @@ hourly stays comfortably under the monthly cap.""" from __future__ import annotations import asyncio -import re +import json import httpx from sqlalchemy import desc, func, select @@ -35,6 +35,7 @@ from app.services.openrouter import ( llm_configured, month_start, ) +from app.services.output_review import review_read from app.services.translation import translate @@ -106,109 +107,41 @@ async def translate_summary_for_active_languages(session, summary_id: int) -> No summary_id=summary_id, succeeded=succeeded, failed=failed) -# Strip known meta-commentary openers the model sometimes leaks despite the -# prompt's hard constraints. Each pattern matches one leading sentence. -_LEAK_PATTERNS = [ - re.compile(p, re.IGNORECASE | re.DOTALL) - for p in ( - # First-person meta — "I need to / I'll / I have to / I'm going to ..." - r"^i\s+(?:need|have|must|should|am going|'ll|will|shall|can|am)[^.]*\.\s*", - # "We need / we're / we are asked / we will ..." 
- r"^we\s+(?:need|are|'re|will|shall|can|should|must|have)[^.]*\.\s*", - r"^let\s+(?:me|us|'?s)[^.]*\.\s*", - r"^here['’]s[^.]*\.\s*", - r"^sure[,!]?\s[^.]*\.\s*", - r"^looking at[^.]*\.\s*", - r"^based on[^.]*\.\s*", - r"^to (?:address|answer|write|summarise|summarize)[^.]*\.\s*", - r"^first[,]?\s[^.]*\.\s*", - r"^the (?:user|data shows|reader|task|request|reader sees|instructions?)[^.]*\.\s*", - r"^summary[:.]\s*", - r"^key\s*[:\-—]\s*", - r"^must\s+(?:be|cite|explain|avoid|give|stay|provide)[^.]*\.\s*", - r"^should\s+(?:be|give|cite|explain|avoid|provide)[^.]*\.\s*", - r"^avoid[^.]*\.\s*", - r"^cite\s+at\s+most[^.]*\.\s*", - r"^be\s+(?:speculative|specific|concise|brief)[^.]*\.\s*", - r"^stay\s+on[^.]*\.\s*", - r"^okay[,]?\s+", - r"^alright[,]?\s+", - r"^thinking[^.]*\.\s*", - # Prompt-leak prefixes — the model echoes example framing or rule - # headers from the system prompt. - r"^(?:good|bad|positive|negative)\s+example\s*[:\-—]\s*", - r"^example\s+(?:good|bad)\s*[:\-—]\s*", - r"^example\s*[:\-—]\s*", - r"^reference\s+style\s*[:\-—]\s*", - # Prompt label echoes (markdown-style or plain-text) - r"^(?:hard\s+)?constraints?\s*[:\-—][^.\n]*[.\n]\s*", - r"^key\s+observations?\s*[:\-—]\s*", - r"^observations?\s*[:\-—]\s*", - r"^focus\s+on[^.]*\.\s*", - r"^output\s+the\s+read[^.]*\.\s*", - r"^plain\s+prose[^.]*\.\s*", - r"^the\s+indicators?[^.]*\.\s*", # "The indicators include..." / "The indicators are..." - r"^indicators?\s*[:\-—]\s*", - r"^data\s*[:\-—]\s*", - r"^analysis\s*[:\-—]\s*", - r"^interpretation\s*[:\-—]\s*", - r"^read\s*[:\-—]\s*", - r"^note\s*[:\-—]\s*", - # Sometimes the response gets wrapped in literal quotes - r"^[\"“'`]+", - ) -] +# Defence-in-depth: read generation goes through JSON mode + a reviewer. +# +# 1. The system prompt instructs the model to emit {"read": "..."} only; +# response_format={"type":"json_object"} forces well-formed JSON at +# the API layer, so prose outside the field is impossible. +# 2. 
We extract `read`, then ask a second LLM call (services/output_review) +# whether the candidate text is publishable. Scratchpad INSIDE the +# field — "Let's see…", "X? Actually Y?" — is caught here. +# 3. Any failure at either stage (parse, missing field, reviewer veto, +# reviewer error) drops the candidate. The previous good +# IndicatorSummary stays visible. +# +# The old _LEAK_PATTERNS / clean_summary / looks_like_leakage regex +# scaffolding lived here previously. It produced false positives (e.g. +# chopping off a legitimate leading sentence like "The indicators are +# pricing…") and false negatives (it never caught the chain-of-thought +# patterns the model actually emits). The reviewer agent replaces it. -_TRAILING_QUOTE = re.compile(r"[\"”'`]+\s*$") - -# Tell-tale phrases that mean the model regurgitated the prompt as its -# "answer" — we'd rather show nothing than show this. -_LEAKAGE_FLAGS = ( - "≤60 words", "60 words", "must be under", "must cite", "must explain", - "no meta-commentary", "no buy/sell", "horizon. ", "1-day moves", - "the instructions are", "instructions:", "constraints:", "hard constraints", - "good example", "bad example", "reference style", -) - - -def looks_like_leakage(text: str) -> bool: - """Heuristic: after cleaning, if these phrases still appear, the output - is contaminated prompt-regurgitation and shouldn't be shown.""" - low = text.lower() - return any(flag in low for flag in _LEAKAGE_FLAGS) - - -def clean_summary(text: str) -> str: - """Strip leading meta-commentary. If cleaning removes nearly everything - (suggesting the model emitted reasoning then ran out of tokens), fall - back to the last non-empty paragraph of the raw output — that's usually - where the actual answer ended up.""" - raw = text.strip() - out = raw - # Up to 6 passes: handles compound leakage like - # "Constraints: <...>. The indicators are: <...>. 
" - for _ in range(6): - before = out - for pat in _LEAK_PATTERNS: - out = pat.sub("", out, count=1).lstrip() - if out == before: - break - if len(out) < 60 and len(raw) > 120: - # Cleaning ate too much; take the last non-empty paragraph of raw. - paragraphs = [p.strip() for p in re.split(r"\n\s*\n", raw) if p.strip()] - if paragraphs: - out = paragraphs[-1] - # Re-strip leaders from the recovered paragraph too. - for _ in range(2): - before = out - for pat in _LEAK_PATTERNS: - out = pat.sub("", out, count=1).lstrip() - if out == before: - break - # Trim any orphan closing quote/backtick from the wrap-strip above. - out = _TRAILING_QUOTE.sub("", out).rstrip() - return out +def _extract_read(raw: str) -> str | None: + """Parse the model's JSON envelope and return the "read" field, or + None if the body isn't valid JSON / the field is missing / the field + isn't a string. Conservative: on any deviation from the schema we + drop the candidate rather than try to salvage it.""" + try: + parsed = json.loads(raw) + except json.JSONDecodeError: + return None + if not isinstance(parsed, dict): + return None + read = parsed.get("read") + if not isinstance(read, str): + return None + read = read.strip() + return read or None @@ -228,19 +161,20 @@ async def _generate_one( [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}], max_tokens=800, # DeepSeek sometimes spends 300+ on internal reasoning + response_format={"type": "json_object"}, ) except Exception as e: session.add(AICall(model=active_model(), status="error", error=str(e)[:500])) log.warning("ind_summary.failed", group=group, error=str(e)[:120]) return None - cleaned = clean_summary(result.content) - if looks_like_leakage(cleaned) or len(cleaned) < 40: - # Model regurgitated the prompt or produced nothing usable. - # Don't persist — keep the last good summary visible. Log it so - # we can see the rate of failures over time. 
- log.warning("ind_summary.leakage_detected", - group=group, preview=cleaned[:120]) + candidate = _extract_read(result.content) + if candidate is None or len(candidate) < 40: + # JSON envelope malformed, "read" field missing/wrong type, or + # the candidate is too short to be a real read. Don't persist; + # the last good summary stays visible. + log.warning("ind_summary.json_invalid", + group=group, preview=result.content[:160]) session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, @@ -250,6 +184,23 @@ async def _generate_one( )) return None + verdict = await review_read(client, candidate) + if not verdict.clean: + # Reviewer caught scratchpad / meta-commentary / partial text + # INSIDE the read field. Drop the candidate; the previous good + # summary continues to serve. + log.warning("ind_summary.reviewer_rejected", + group=group, reason=verdict.reason, + preview=candidate[:120]) + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=(result.cost_usd or 0.0) + (verdict.cost_usd or 0.0), + status="leaked", + )) + return None + summary = IndicatorSummary( group_name=group, generated_at=utcnow(), @@ -257,17 +208,19 @@ async def _generate_one( tone=tone, analysis=analysis, prompt_version=PROMPT_VERSION, - content=cleaned, + content=candidate, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + # Include the reviewer's cost in the row's recorded spend so the + # monthly budget tracking covers the full pipeline cost. 
+ cost_usd=(result.cost_usd or 0.0) + (verdict.cost_usd or 0.0), ) session.add(summary) session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + cost_usd=(result.cost_usd or 0.0) + (verdict.cost_usd or 0.0), status="ok", )) return summary @@ -338,6 +291,7 @@ async def run() -> None: await translate_summary_for_active_languages(session, summary.id) # One aggregate read across all groups, stored under __all__. + # Same JSON-mode + reviewer-agent path as per-group reads. agg_system = build_aggregate_summary_system_prompt(tone, analysis) agg_user = build_aggregate_summary_user_prompt(groups) agg_summary: IndicatorSummary | None = None @@ -346,28 +300,53 @@ async def run() -> None: client, [{"role": "system", "content": agg_system}, {"role": "user", "content": agg_user}], - max_tokens=1500, # room for reasoning + 80-word output + max_tokens=1500, + response_format={"type": "json_object"}, ) - agg_summary = IndicatorSummary( - group_name=AGGREGATE_GROUP_NAME, - generated_at=utcnow(), - model=result.model, - tone=tone, - analysis=analysis, - prompt_version=PROMPT_VERSION, - content=clean_summary(result.content), - prompt_tokens=result.prompt_tokens, - completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, - ) - session.add(agg_summary) - session.add(AICall( - model=result.model, - prompt_tokens=result.prompt_tokens, - completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, status="ok", - )) - written += 1 + candidate = _extract_read(result.content) + if candidate is None or len(candidate) < 40: + log.warning("ind_summary.agg_json_invalid", + tone=tone, preview=result.content[:160]) + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=result.cost_usd, status="leaked", + )) + else: + verdict = await review_read(client, candidate) + full_cost = (result.cost_usd or 
0.0) + (verdict.cost_usd or 0.0) + if not verdict.clean: + log.warning("ind_summary.agg_reviewer_rejected", + tone=tone, reason=verdict.reason, + preview=candidate[:120]) + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, status="leaked", + )) + else: + agg_summary = IndicatorSummary( + group_name=AGGREGATE_GROUP_NAME, + generated_at=utcnow(), + model=result.model, + tone=tone, + analysis=analysis, + prompt_version=PROMPT_VERSION, + content=candidate, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, + ) + session.add(agg_summary) + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, status="ok", + )) + written += 1 except Exception as e: session.add(AICall( model=active_model(), status="error", diff --git a/app/services/llm_prompts.py b/app/services/llm_prompts.py index 9840ec2..726b60a 100644 --- a/app/services/llm_prompts.py +++ b/app/services/llm_prompts.py @@ -296,12 +296,25 @@ question via the chat sidebar. def build_summary_system_prompt(tone: str, analysis: str) -> str: """A lean, focused system prompt for the per-indicator-group hourly summary. INTERPRETATION not description — the reader has the table - next to this paragraph; they don't need numbers recited at them.""" + next to this paragraph; they don't need numbers recited at them. + + Output is JSON-mode: the model must emit a single object + {"read": "..."}. The wrapper makes scratchpad outside the field + physically impossible — the API enforces well-formed JSON, and the + only schema slot is the publishable read. 
Scratchpad inside the + field is caught by the reviewer agent (services/output_review).""" tone_block = _TONE[_resolve_tone(tone)] analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) return f"""You write a TINY interpretation (≤60 words, 2-3 sentences) \ of ONE indicator group for a strategic markets dashboard. +# Output format (strict) +Return ONLY a single JSON object with exactly one field: +{{"read": ""}} +Nothing outside that JSON object. No preamble. No markdown fences. \ +No additional fields. The "read" string is what the user sees verbatim, \ +so it must already be the finished, publishable text — never your thinking. + # What this is for The reader is looking at the table of numbers right next to your text. \ They can see the values. They CANNOT see the meaning. Your job is to \ @@ -316,19 +329,20 @@ Even at 2-3 sentences, contrast what the underlying factors justify \ they don't diverge, say so in one clause. Never just describe the move \ without placing it on this axis. -# Hard constraints +# Hard constraints on the "read" string - Plain prose, ONE paragraph. No markdown, no headers, no lists, no labels. - Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ "We need to", "We are asked", "Here's", "Let me", "Let's", "Sure", "Looking \ at", "Based on", "Summary:", "The data shows", "First", "To address". No \ meta-commentary at all. +- No rhetorical questions, no "X? Actually Y?" self-corrections, no \ +parenthetical asides that question your own numbers. The text is the \ +finished read, not the thinking. - Cite at most 2-3 specific numbers and ONLY when they anchor an \ interpretation. Don't list moves; explain them. - Multi-week / multi-month horizon. 1-day moves under 2% are noise — skip. - No buy/sell language. No predictions. No watch list. No TL;DR. No date \ header. No "system temperature" line — that belongs to the full daily log. -- Output the read directly. 
Do NOT include phrases like "Example", "Good \ -example", "Bad example", "Reference", or any meta-framing of your output. {tone_block} @@ -350,13 +364,22 @@ def build_summary_user_prompt(group_name: str, quotes: list[dict]) -> str: def build_aggregate_summary_system_prompt(tone: str, analysis: str) -> str: """System prompt for the cross-group aggregate read shown on the dashboard. - Wider lens than a per-group summary — synthesise across all groups.""" + Wider lens than a per-group summary — synthesise across all groups. + + Same JSON-mode contract as build_summary_system_prompt: output is + {"read": "..."} only; the field is the publishable text verbatim.""" tone_block = _TONE[_resolve_tone(tone)] analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"]) return f"""You write a single SHORT cross-asset INTERPRETATION (≤80 \ words, 2-4 sentences) for the dashboard header. The reader is glancing — \ give them the meaning of the whole tape, not a recap. +# Output format (strict) +Return ONLY a single JSON object with exactly one field: +{{"read": ""}} +Nothing outside that JSON object. No preamble. No markdown fences. \ +No additional fields. The "read" string is what the user sees verbatim. + # What this is for The reader can see every indicator on the dashboard below this paragraph. \ Your job is NOT to summarise the moves. It is to explain what the moves, \ @@ -371,19 +394,19 @@ crowd is actually doing (irrational: positioning, narrative momentum, \ flows). At least one of the 2-4 sentences must name this gap or, if the \ two cohere, explicitly say so. -# Hard constraints +# Hard constraints on the "read" string - Plain prose, ONE paragraph. No markdown, headers, lists, or labels. - Open IMMEDIATELY with substance. NEVER start with: "I need to", "I'll", \ "We need to", "Here's", "Let me", "Looking at", "Based on", "Sure", "Summary:", \ "The data shows", "Across the board". No meta-commentary. +- No rhetorical questions, no "X? Actually Y?" 
self-corrections, no \ +parenthetical asides that question your own numbers. - Identify the single most important **cross-asset implication**: e.g. \ "rates and credit disagree", "equities outrun fundamentals", "geopolitical \ risk premium is in commodities but not vol". Cite no more than 3 specific \ numbers, and only as anchors for the interpretation. - Multi-week / multi-month horizon. 1-day moves under 2% are noise. - No buy/sell language. No predictions of specific levels. -- Output the read directly. Do NOT include phrases like "Example", "Good \ -example", "Bad example", "Reference", or any meta-framing of your output. {tone_block} diff --git a/app/services/output_review.py b/app/services/output_review.py new file mode 100644 index 0000000..3af2a7a --- /dev/null +++ b/app/services/output_review.py @@ -0,0 +1,107 @@ +"""Second-pass reviewer agent for AI-generated reads. + +The per-group and aggregate indicator summaries are generated in JSON +mode and the publishable text comes out of a single "read" field, but a +misbehaving model can still slip chain-of-thought INSIDE the field +("Let's see…", "X? Actually Y?", multi-question parentheticals). This +module makes a small second LLM call that judges the candidate read as +clean / unclean. Cost is ~$0.0001 per check; latency ~1-2 s in the +hourly job. No user-facing latency. + +The reviewer is deliberately a tiny, JSON-shaped classifier — same +JSON-mode mechanism as the generator, so the verdict can't be lost in +prose. If parsing fails or the call errors, the row is rejected +(fail-safe: the previously cached good summary stays visible). +""" +from __future__ import annotations + +import json +from dataclasses import dataclass + +import httpx + +from app.logging import get_logger +from app.services.openrouter import call_llm + +log = get_logger("output_review") + + +_SYSTEM_PROMPT = """\ +You are a strict editor for a financial-markets dashboard. 
The author +was asked to produce a short interpretive read for human readers. +You receive their proposed read and decide if it is publishable as-is. + +Mark CLEAN only if the text reads like a finished interpretation a +reader could see on a public dashboard without confusion. + +Mark UNCLEAN if the text contains ANY of: +- Chain-of-thought / scratchpad markers used as thinking — phrases like + "Let me", "Let's see", "we need to", "actually" (correcting itself), + "wait", "hmm", "or rather", "I should". +- Self-questioning parentheticals: "Q1 2026? Actually Q4 2025?", + "is it X or Y?", any place where the author appears to be working + out the answer in front of the reader. +- Multiple rhetorical questions or any question that interrupts the + declarative voice. A clean interpretive read is assertive. +- Meta-commentary about the task, output format, word limits, or + instructions — e.g. "as required by the constraints", "the prompt + asks", "let me address each". +- Partial / truncated content. Starts mid-word, mid-number, mid-clause. +- Visible internal numbers without clear meaning ("change 1y +5.9%?"), + raw column names ("as_of 2026-01-01"), or any debug-like fragments. +- Anything other than the finished, publishable interpretation. + +Return ONLY a JSON object with this exact shape: +{"clean": true | false, "reason": "<≤20 words, plain text>"} +No preamble, no markdown fences, no other fields. +""" + + +@dataclass(frozen=True) +class Verdict: + clean: bool + reason: str + cost_usd: float | None # cost of the review call itself, for the ledger + + +async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict: + """Ask the LLM whether `candidate` is a publishable read. + + Returns Verdict(clean, reason, cost). Any error — provider failure, + JSON parse failure, missing field, wrong type — yields a CONSERVATIVE + verdict (clean=False) so the caller drops the candidate. 
The + previously cached good summary stays visible on the dashboard.""" + if not candidate or not candidate.strip(): + return Verdict(clean=False, reason="empty candidate", cost_usd=0.0) + + messages = [ + {"role": "system", "content": _SYSTEM_PROMPT}, + # Sent as a fenced user turn so the model can't confuse the + # candidate with instructions, even if the candidate happens to + # contain prompt-like prose. + {"role": "user", "content": f"Candidate read:\n```\n{candidate}\n```"}, + ] + try: + result = await call_llm( + client, messages, + max_tokens=120, + response_format={"type": "json_object"}, + ) + except Exception as e: + log.warning("review.call_failed", error=str(e)[:200]) + return Verdict(clean=False, reason=f"reviewer error: {str(e)[:80]}", + cost_usd=None) + + try: + parsed = json.loads(result.content) + except json.JSONDecodeError: + log.warning("review.parse_failed", preview=result.content[:200]) + return Verdict(clean=False, reason="reviewer returned non-JSON", + cost_usd=result.cost_usd) + + clean = parsed.get("clean") + reason = parsed.get("reason") or "" + if not isinstance(clean, bool): + return Verdict(clean=False, reason="reviewer omitted bool 'clean'", + cost_usd=result.cost_usd) + return Verdict(clean=clean, reason=str(reason)[:200], cost_usd=result.cost_usd) diff --git a/tests/test_output_review.py b/tests/test_output_review.py new file mode 100644 index 0000000..53f0b34 --- /dev/null +++ b/tests/test_output_review.py @@ -0,0 +1,146 @@ +"""Tests for the JSON-envelope extractor and the reviewer agent. + +The two together replaced the regex `clean_summary` + `looks_like_leakage` +scaffolding that used to live in indicator_summary_job. 
The extractor is +pure-function so it's covered exhaustively; the reviewer makes an LLM +call and is exercised via the httpx MockTransport that the other +openrouter tests use.""" +from __future__ import annotations + +import httpx +import pytest + +from app.jobs.indicator_summary_job import _extract_read +from app.services import openrouter as ot +from app.services.output_review import review_read + + +# --------------------------------------------------------------------------- +# _extract_read — JSON envelope handling +# --------------------------------------------------------------------------- + + +def test_extract_read_returns_trimmed_field(): + raw = '{"read": " The market is pricing growth. "}' + assert _extract_read(raw) == "The market is pricing growth." + + +def test_extract_read_returns_none_on_invalid_json(): + assert _extract_read("not json") is None + assert _extract_read("{bad}") is None + assert _extract_read("") is None + + +def test_extract_read_returns_none_when_field_missing(): + assert _extract_read('{"other": "x"}') is None + + +def test_extract_read_returns_none_when_field_not_string(): + assert _extract_read('{"read": 42}') is None + assert _extract_read('{"read": null}') is None + assert _extract_read('{"read": ["a","b"]}') is None + + +def test_extract_read_returns_none_when_field_empty(): + assert _extract_read('{"read": ""}') is None + assert _extract_read('{"read": " "}') is None + + +def test_extract_read_returns_none_when_envelope_not_object(): + # A bare string or array is valid JSON but not the expected shape. 
+ assert _extract_read('"just a string"') is None + assert _extract_read('["a", "b"]') is None + + +# --------------------------------------------------------------------------- +# review_read — judges candidate read via a second LLM call +# --------------------------------------------------------------------------- + + +def _mock_post(handler): + return httpx.MockTransport(handler) + + +def _configure(monkeypatch): + """Minimal env so call_llm believes a provider is configured.""" + monkeypatch.setattr(ot, "get_settings", lambda: type("S", (), { + "LLM_PROVIDER": "deepseek", "LLM_FALLBACK": "", + "DEEPSEEK_API_KEY": "sk-d", "OPENROUTER_API_KEY": "", + "DEEPSEEK_URL": "https://x/deepseek", "DEEPSEEK_MODEL": "deepseek-v4-flash", + "OPENROUTER_URL": "https://x/or", "OPENROUTER_MODEL": "deepseek/deepseek-v4-flash", + })()) + + +@pytest.mark.asyncio +async def test_review_clean_verdict(monkeypatch): + _configure(monkeypatch) + def handler(_req): + return httpx.Response(200, json={ + "choices": [{"message": {"content": '{"clean": true, "reason": "ok"}'}, + "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 12, "cost": 0.00007}, + }) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, "Markets are pricing tighter policy.") + assert v.clean is True + assert v.cost_usd == 0.00007 + + +@pytest.mark.asyncio +async def test_review_unclean_verdict(monkeypatch): + _configure(monkeypatch) + def handler(_req): + return httpx.Response(200, json={ + "choices": [{"message": {"content": + '{"clean": false, "reason": "chain of thought"}'}, + "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 14, "cost": 0.00009}, + }) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, "Let's see, is it X? 
Actually Y?") + assert v.clean is False + assert "chain of thought" in v.reason + + +@pytest.mark.asyncio +async def test_review_failsafe_on_malformed_json(monkeypatch): + """Reviewer returned prose instead of JSON → conservative reject.""" + _configure(monkeypatch) + def handler(_req): + return httpx.Response(200, json={ + "choices": [{"message": {"content": "yes it looks clean"}, + "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 6}, + }) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, "Some candidate.") + assert v.clean is False + assert "non-JSON" in v.reason + + +@pytest.mark.asyncio +async def test_review_failsafe_on_missing_clean_field(monkeypatch): + _configure(monkeypatch) + def handler(_req): + return httpx.Response(200, json={ + "choices": [{"message": {"content": '{"reason": "no field"}'}, + "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 6}, + }) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, "Some candidate.") + assert v.clean is False + + +@pytest.mark.asyncio +async def test_review_failsafe_on_empty_candidate(monkeypatch): + """No LLM call should fire if the candidate is empty.""" + _configure(monkeypatch) + calls = [] + def handler(_req): + calls.append(1) + return httpx.Response(500, json={"error": "should not be called"}) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, " ") + assert v.clean is False + assert calls == [] From 0550063316d14a5811cf9a9d30837b90f318adaf Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:15:42 +0200 Subject: [PATCH 63/69] =?UTF-8?q?ai:=20bump=20reviewer=20max=5Ftokens=2012?= =?UTF-8?q?0=20=E2=86=92=20300?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A live sanity-check on 50 recent IndicatorSummary rows found 6 of 10 
reviewer rejections were the reviewer hitting its own max_tokens cap mid-verdict ('{"clean": false, "reason": "Truncated sent…'). The parser then dropped the candidate as malformed JSON, producing a false-negative verdict that would have purged legitimately clean rows. 300 tokens is well above the ~30-token verdict the prompt asks for; the extra headroom removes the artefact at ~$0.00015 per call. Co-Authored-By: Claude Opus 4.7 --- app/services/output_review.py | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/app/services/output_review.py b/app/services/output_review.py index 3af2a7a..cdf545d 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -84,7 +84,15 @@ async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict: try: result = await call_llm( client, messages, - max_tokens=120, + # 300 tokens is comfortably above the 30-token JSON verdict + # the prompt asks for. An earlier 120-token cap was producing + # frequent finish_reason=length cutoffs that left the JSON + # half-written ('{"clean": false, "reason": "Text…'), which + # the parser then rejected as malformed — a false-negative + # in the verdict. The extra headroom costs ~$0.00015 per + # call (DeepSeek output rates) and removes that whole class + # of artefact. + max_tokens=300, response_format={"type": "json_object"}, ) except Exception as e: From 8b9d3c9c3e5e42a45584fe05cd842113c3bbc8c6 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:16:57 +0200 Subject: [PATCH 64/69] =?UTF-8?q?ai:=20bump=20reviewer=20max=5Ftokens=2030?= =?UTF-8?q?0=20=E2=86=92=20800?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Live re-check on 50 recent IndicatorSummary rows after the previous 120 → 300 bump still produced 4 'reviewer returned non-JSON' verdicts out of 12 rejections. 
DeepSeek-V4-flash sometimes prefixes its JSON output with a short stretch of thinking even though response_format is enforced, which truncates the JSON at the back end of the 300-token cap. 800 tokens is comfortably above any realistic verdict + preamble at ~$0.00022/call (DeepSeek output rates). Negligible cost given the hourly call volume. Co-Authored-By: Claude Opus 4.7 --- app/services/output_review.py | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/app/services/output_review.py b/app/services/output_review.py index cdf545d..f228a74 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -84,15 +84,14 @@ async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict: try: result = await call_llm( client, messages, - # 300 tokens is comfortably above the 30-token JSON verdict - # the prompt asks for. An earlier 120-token cap was producing - # frequent finish_reason=length cutoffs that left the JSON - # half-written ('{"clean": false, "reason": "Text…'), which - # the parser then rejected as malformed — a false-negative - # in the verdict. The extra headroom costs ~$0.00015 per - # call (DeepSeek output rates) and removes that whole class - # of artefact. - max_tokens=300, + # 800 tokens is well above the ~30-token JSON verdict the + # prompt asks for. The reviewer model (DeepSeek-V4-flash) + # occasionally pads with its own thinking before the JSON + # even though response_format is enforced; smaller caps + # (120, 300) produced finish_reason=length cutoffs that + # left the JSON half-written and broke the parser. 800 + # removes the artefact entirely at ~$0.00022 per call. 
+ max_tokens=800, response_format={"type": "json_object"}, ) except Exception as e: From 788563a81fcca79b266118d3a1ca7e1c8e19b3fa Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:21:26 +0200 Subject: [PATCH 65/69] ai: route reviewer through OpenRouter + Claude Haiku 4.5 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The DeepSeek-V4-flash reviewer was unreliable in production: it pads its JSON verdicts with internal chain-of-thought even when the prompt forbids it, so the verdict gets truncated at any reasonable max_tokens cap and the parser drops it as malformed (a false-negative verdict that would purge clean rows). A live run on 50 rows reproduced the failure on 8 of 12 rejections, even at 800 tokens. Fix: pin the reviewer call to OpenRouter with anthropic/claude-haiku-4.5. Haiku answers structured-output classification tersely (no scratchpad preamble), which means a 300-token cap is comfortably above the ~30-token JSON verdict. Cost is roughly the same (~$0.0001-$0.0003 per review) and the latency tax is smaller. To enable the pinned-provider call without disrupting other callers, call_llm grows an optional `provider` parameter: when set, only that provider is used (no fallback chain). All existing call sites default to provider=None and keep the chain behaviour. REVIEWER_MODEL is read from settings via getattr-with-fallback so an env override can swap models without code changes — useful if we want to A/B test against e.g. gemini-2.5-flash later. 
Co-Authored-By: Claude Opus 4.7 --- app/services/openrouter.py | 13 +++++++++++-- app/services/output_review.py | 31 +++++++++++++++++++++++-------- tests/test_output_review.py | 15 +++++++++++---- 3 files changed, 45 insertions(+), 14 deletions(-) diff --git a/app/services/openrouter.py b/app/services/openrouter.py index 598150c..50e7f7e 100644 --- a/app/services/openrouter.py +++ b/app/services/openrouter.py @@ -199,6 +199,7 @@ async def call_llm( model: str | None = None, max_tokens: int = 4000, response_format: dict | None = None, + provider: str | None = None, ) -> LogResult: """Provider-aware chat completion with fallback. Tries primary (LLM_PROVIDER) first; if it raises after retries, falls through to @@ -211,8 +212,16 @@ async def call_llm( Pass response_format={"type": "json_object"} to force JSON-mode output (the model still needs to be instructed in the system prompt - to emit valid JSON — this flag enforces, not asks).""" - chain = _provider_chain() + to emit valid JSON — this flag enforces, not asks). + + Pass `provider` (e.g. "openrouter") to skip the configured chain + and pin the call to a specific provider. Used by the reviewer agent + to force routing through OpenRouter so it can address a non-DeepSeek + model that doesn't pre-think before emitting JSON.""" + if provider is not None: + chain = [provider] + else: + chain = _provider_chain() if not chain: raise RuntimeError("No LLM provider configured (no API key set)") diff --git a/app/services/output_review.py b/app/services/output_review.py index f228a74..fe22e6d 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -20,12 +20,23 @@ from dataclasses import dataclass import httpx +from app.config import get_settings from app.logging import get_logger from app.services.openrouter import call_llm log = get_logger("output_review") +# The reviewer runs through OpenRouter against a small, non-thinking +# model. 
DeepSeek-V4-flash (our generator default) emits internal +# chain-of-thought before its JSON output even when the prompt forbids +# it, which truncates the JSON at any reasonable max_tokens cap and +# breaks the parser. Anthropic's Haiku family answers structured-output +# tasks tersely and deterministically — no chain-of-thought tax. Cost +# is ~$0.0001-$0.0003 per review depending on candidate length. +DEFAULT_REVIEWER_MODEL = "anthropic/claude-haiku-4.5" + + _SYSTEM_PROMPT = """\ You are a strict editor for a financial-markets dashboard. The author was asked to produce a short interpretive read for human readers. @@ -81,17 +92,21 @@ async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict: # contain prompt-like prose. {"role": "user", "content": f"Candidate read:\n```\n{candidate}\n```"}, ] + settings = get_settings() + reviewer_model = getattr(settings, "REVIEWER_MODEL", None) or DEFAULT_REVIEWER_MODEL try: result = await call_llm( client, messages, - # 800 tokens is well above the ~30-token JSON verdict the - # prompt asks for. The reviewer model (DeepSeek-V4-flash) - # occasionally pads with its own thinking before the JSON - # even though response_format is enforced; smaller caps - # (120, 300) produced finish_reason=length cutoffs that - # left the JSON half-written and broke the parser. 800 - # removes the artefact entirely at ~$0.00022 per call. - max_tokens=800, + # Pin to OpenRouter so a non-DeepSeek model like Haiku is + # actually reachable; the default provider chain would try + # DeepSeek native first and 404 on the Anthropic model name. + provider="openrouter", + model=reviewer_model, + # 300 tokens is well above the ~30-token JSON verdict. + # Haiku doesn't pad with hidden reasoning the way DeepSeek + # does, so we don't need the 800-token headroom required to + # absorb the generator's chain-of-thought. 
+ max_tokens=300, response_format={"type": "json_object"}, ) except Exception as e: diff --git a/tests/test_output_review.py b/tests/test_output_review.py index 53f0b34..4e6fa4b 100644 --- a/tests/test_output_review.py +++ b/tests/test_output_review.py @@ -62,13 +62,20 @@ def _mock_post(handler): def _configure(monkeypatch): - """Minimal env so call_llm believes a provider is configured.""" - monkeypatch.setattr(ot, "get_settings", lambda: type("S", (), { + """Minimal env so call_llm believes a provider is configured. + Both review_read (which pins to OpenRouter for a non-thinking model) + and the openrouter module itself read get_settings, so we patch + both module-level references.""" + import app.services.output_review as orr + settings = type("S", (), { "LLM_PROVIDER": "deepseek", "LLM_FALLBACK": "", - "DEEPSEEK_API_KEY": "sk-d", "OPENROUTER_API_KEY": "", + "DEEPSEEK_API_KEY": "sk-d", "OPENROUTER_API_KEY": "sk-or", "DEEPSEEK_URL": "https://x/deepseek", "DEEPSEEK_MODEL": "deepseek-v4-flash", "OPENROUTER_URL": "https://x/or", "OPENROUTER_MODEL": "deepseek/deepseek-v4-flash", - })()) + "REVIEWER_MODEL": "anthropic/claude-haiku-4.5", + })() + monkeypatch.setattr(ot, "get_settings", lambda: settings) + monkeypatch.setattr(orr, "get_settings", lambda: settings) @pytest.mark.asyncio From 385c5fdc600b2e7eb1f15c82afe1707a0cc321fd Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:27:37 +0200 Subject: [PATCH 66/69] review: strip markdown code-fences from JSON verdicts MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Haiku 4.5 occasionally wraps its JSON response in a markdown code fence even with response_format={"type":"json_object"} enforced: ```json {"clean": true, "reason": "polished read"} ``` Live testing the new reviewer caught this — every verdict was being dropped as "reviewer returned non-JSON". Strip a single leading/trailing fence before json.loads. 
Defensive for any model that does the same (Claude variants commonly fence JSON even when told not to). Adds a unit test covering fenced output. --- app/services/output_review.py | 16 +++++++++++++++- tests/test_output_review.py | 19 +++++++++++++++++++ 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/app/services/output_review.py b/app/services/output_review.py index fe22e6d..401096d 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -114,8 +114,22 @@ async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict: return Verdict(clean=False, reason=f"reviewer error: {str(e)[:80]}", cost_usd=None) + # Haiku (and several other models) occasionally wrap their JSON + # output in a markdown code fence even with response_format set — + # ```json\n{...}\n``` — so strip a single leading/trailing fence + # before parsing. We do this defensively for any model; it's a + # no-op for callers that already emit bare JSON. + raw = result.content.strip() + if raw.startswith("```"): + first_nl = raw.find("\n") + if first_nl != -1: + raw = raw[first_nl + 1:] + if raw.rstrip().endswith("```"): + raw = raw.rstrip()[:-3].rstrip() + raw = raw.strip() + try: - parsed = json.loads(result.content) + parsed = json.loads(raw) except json.JSONDecodeError: log.warning("review.parse_failed", preview=result.content[:200]) return Verdict(clean=False, reason="reviewer returned non-JSON", diff --git a/tests/test_output_review.py b/tests/test_output_review.py index 4e6fa4b..c437678 100644 --- a/tests/test_output_review.py +++ b/tests/test_output_review.py @@ -109,6 +109,25 @@ async def test_review_unclean_verdict(monkeypatch): assert "chain of thought" in v.reason +@pytest.mark.asyncio +async def test_review_strips_markdown_fence_around_json(monkeypatch): + """Haiku (and friends) sometimes wrap JSON in ```json ... ``` even + when response_format is set. 
The parser needs to peel that off + before json.loads or it'll reject otherwise-valid verdicts.""" + _configure(monkeypatch) + fenced = '```json\n{"clean": true, "reason": "polished read"}\n```' + def handler(_req): + return httpx.Response(200, json={ + "choices": [{"message": {"content": fenced}, + "finish_reason": "stop"}], + "usage": {"prompt_tokens": 50, "completion_tokens": 18, "cost": 0.0006}, + }) + async with httpx.AsyncClient(transport=_mock_post(handler)) as client: + v = await review_read(client, "Markets are pricing tighter policy.") + assert v.clean is True + assert v.reason == "polished read" + + @pytest.mark.asyncio async def test_review_failsafe_on_malformed_json(monkeypatch): """Reviewer returned prose instead of JSON → conservative reject.""" From cd485fe6464a9c3d30aef0068177c143bf77a584 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 13:56:47 +0200 Subject: [PATCH 67/69] scripts: one-off purge of unclean IndicatorSummary rows MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Iterates every IndicatorSummary in the DB and asks the reviewer agent (services/output_review.review_read) whether each row's content is publishable. Rows the reviewer flags as unclean are deleted along with their translation rows. The API's existing fallback path — serve the latest IndicatorSummary by (group, tone) — picks up the previous clean row automatically. Concurrency defaults to 8 reviewer calls in flight; on the 3245-row prod archive that completes in ~10 minutes for ~$1 of Haiku cost. Idempotent: a second run only re-evaluates whatever's still in the table. --dry-run skips the deletion stage. After the live pipeline fix landed (JSON-mode + reviewer at write time) this script should not find anything on subsequent invocations. 
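The bounded fan-out described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script itself: `fake_review` is a hypothetical stand-in for `review_read` (it makes no LLM call and simply flags text containing "Let's see"), and rows are plain strings rather than `IndicatorSummary` objects.

```python
import asyncio


async def fake_review(text):
    """Hypothetical stand-in for the reviewer: returns True when the
    text is considered clean. No network call is made."""
    await asyncio.sleep(0)  # yield control, as a real awaitable call would
    return "Let's see" not in text


async def judge_all(rows, concurrency=8):
    """Review every row, with at most `concurrency` reviews in flight."""
    sem = asyncio.Semaphore(concurrency)

    async def judge(row):
        async with sem:  # blocks once `concurrency` reviews are running
            return row, await fake_review(row)

    # gather returns results in the same order as the input rows.
    return await asyncio.gather(*(judge(r) for r in rows))


def flagged(rows, concurrency=8):
    """Return the rows the stand-in reviewer marks as unclean."""
    results = asyncio.run(judge_all(rows, concurrency))
    return [row for row, clean in results if not clean]
```

A dry run would stop after computing `flagged(...)`; only the non-dry path would go on to delete the flagged ids.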
--- scripts/purge_unclean_summaries.py | 76 ++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 scripts/purge_unclean_summaries.py diff --git a/scripts/purge_unclean_summaries.py b/scripts/purge_unclean_summaries.py new file mode 100644 index 0000000..44c8ef8 --- /dev/null +++ b/scripts/purge_unclean_summaries.py @@ -0,0 +1,76 @@ +"""One-off purge: ask the reviewer agent to judge every IndicatorSummary +row already in the DB, delete the ones it flags as unclean. + +Same reviewer the live pipeline uses (services/output_review.review_read), +so post-purge rows are exactly what would survive a fresh generation. +Per-row cost ~$0.0001; total run on ~3000 rows ~$0.30. + +Usage inside the app container: + docker compose exec app python /tmp/purge.py --dry-run + docker compose exec app python /tmp/purge.py # actually delete + +The script processes rows concurrently up to a small fan-out (default 8) +to keep wall-clock down without hammering the provider. +""" +from __future__ import annotations + +import argparse +import asyncio + +import httpx +from sqlalchemy import delete, select + +from app.db import get_session_factory +from app.models import IndicatorSummary, IndicatorSummaryTranslation +from app.services.output_review import review_read + + +async def _judge(client, sem, row): + async with sem: + v = await review_read(client, row.content or "") + return row, v + + +async def main(args): + session_factory = get_session_factory() + async with session_factory() as session: + rows = (await session.execute( + select(IndicatorSummary).order_by(IndicatorSummary.id) + )).scalars().all() + print(f"Reviewing {len(rows)} IndicatorSummary rows…") + + sem = asyncio.Semaphore(args.concurrency) + async with httpx.AsyncClient(follow_redirects=True) as client: + results = await asyncio.gather(*(_judge(client, sem, r) for r in rows)) + + unclean = [(r, v) for r, v in results if not v.clean] + print(f"\nFlagged {len(unclean)} of {len(rows)} as 
unclean.") + for r, v in unclean: + head = (r.content or "")[:100].replace("\n", " ") + print(f" id={r.id} group={r.group_name} tone={r.tone} " + f"at {r.generated_at} reason={v.reason!r}") + print(f" preview: {head!r}") + + if args.dry_run or not unclean: + return + + ids = [r.id for r, _ in unclean] + await session.execute( + delete(IndicatorSummaryTranslation) + .where(IndicatorSummaryTranslation.summary_id.in_(ids)) + ) + await session.execute( + delete(IndicatorSummary).where(IndicatorSummary.id.in_(ids)) + ) + await session.commit() + print(f"\nDeleted {len(ids)} unclean row(s). The dashboard's /api/indicators/" + " endpoint will now fall back to the previous clean row " + "for each (group, tone).") + + +if __name__ == "__main__": + p = argparse.ArgumentParser() + p.add_argument("--dry-run", action="store_true") + p.add_argument("--concurrency", type=int, default=8, + help="Parallel reviewer calls (default 8)") + asyncio.run(main(p.parse_args())) From a6e476b8512717fd4c12992b9f859be4c0d75e90 Mon Sep 17 00:00:00 2001 From: Giorgio Gilestro Date: Fri, 29 May 2026 14:26:37 +0200 Subject: [PATCH 68/69] review: reject financial advice in indicator-summary reads MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a new UNCLEAN criterion to the reviewer agent's system prompt: direct recommendation language (buy/sell/hold/accumulate/trim/rotate), allocation guidance (overweight/underweight, "X% in bonds"), price targets, and personalised framing ("you should", "investors should") all trigger a reject. The operator is not licensed to give investment advice; this is editorial commentary on public data. The generator's system prompt already forbids buy/sell language, but a prompt-only constraint is not an enforcement layer. 
The reviewer agent — already in the pipeline for chain-of-thought / truncation / meta-commentary — is the right place to enforce the regulatory boundary structurally: rows that drift into advice get dropped, and the API falls back to the previous compliant row. Descriptive / interpretive language about market state remains explicitly allowed ("valuations are stretched", "real yields are restrictive"). The criterion is state vs action: states publish, actions don't. Co-Authored-By: Claude Opus 4.7 --- app/services/output_review.py | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/app/services/output_review.py b/app/services/output_review.py index 401096d..833b927 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -60,7 +60,22 @@ Mark UNCLEAN if the text contains ANY of: - Partial / truncated content. Starts mid-word, mid-number, mid-clause. - Visible internal numbers without clear meaning ("change 1y +5.9%?"), raw column names ("as_of 2026-01-01"), or any debug-like fragments. -- Anything other than the finished, publishable interpretation. +- FINANCIAL ADVICE or any phrasing that recommends an action the + reader should take. This service is editorial commentary on public + data, not investment advice; the operator is not licensed to give + it. Reject any of: + * Buy/sell/hold/accumulate/trim/exit/enter/rotate language. + * Allocation guidance ("overweight", "underweight", + "X% in bonds", "increase exposure to"). + * Price targets or specific level predictions ("will reach $X", + "target Y", "expect Z by year-end"). + * Personalised framing ("you should", "investors should", + "consider buying", "we recommend"). + DESCRIPTIVE / INTERPRETIVE language about market state is fine — + "valuations are stretched", "real yields are restrictive", "rates + and credit disagree". The test: does the text describe a STATE, or + does it suggest an ACTION? States are fine; actions are not. 
+- Anything else other than the finished, publishable interpretation. Return ONLY a JSON object with this exact shape: {"clean": true | false, "reason": "<≤20 words, plain text>"}

From f9534f7ad697d8330aa1593fba1faf5baec1b8c2 Mon Sep 17 00:00:00 2001
From: Giorgio Gilestro
Date: Fri, 29 May 2026 14:40:04 +0200
Subject: [PATCH 69/69] review: gate strategic-log, portfolio, chat, and digest on reviewer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Extends the reviewer agent, which previously protected only indicator summaries, to every AI-generated surface that reaches a user. The reviewer's prompt already rejects scratchpad, truncation, meta-commentary, and (since a6e476b) financial advice; wiring it in turns those rules from prompt-level "asks" into structural gates.

Four call sites updated:

- ai_log_job.run(): after each tone/analysis variant is generated, pass it through review_read. On reject, log the reason and skip the StrategicLog insert; the API's existing "latest StrategicLog" lookup falls back to the previous clean log.
- services/portfolio_analysis.analyse(): on reject, raise a clean RuntimeError that the /api/analyze router already maps to HTTP 502 with a retry-able message. Portfolio analysis isn't cached server-side, so the user retries; the reviewer's verdict reason is recorded in the error column of the AICall ledger row, which carries status "leaked".
- routers/chat.chat(): on reject, instead of returning the raw assistant content we return a short refusal explaining the limit and inviting a rephrase. Adds ~1-2 s of latency per turn (one extra LLM call to Haiku), the only user-facing latency tax.
- jobs/email_digest_job._generate_variants(): on reject, the variant is dropped for the cycle. Recipients on the rejected tone get no digest email this run, which is better than delivering inbox copy that drifts into advice (emails are unrecallable once sent).
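The shape shared by the four call sites above can be sketched as one small function. This is a hedged illustration only: `gate`, `Generated`, and `toy_review` are hypothetical names, and `Verdict` here is a simplified stand-in carrying the same three fields (clean, reason, cost_usd) that the patches pass around. The real call sites differ in what they do after a reject.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:  # simplified stand-in for the reviewer's verdict
    clean: bool
    reason: str
    cost_usd: Optional[float]


@dataclass
class Generated:  # hypothetical container for a generator result
    content: str
    cost_usd: Optional[float]


async def gate(candidate, review):
    """Review a candidate; return (publishable text or None, combined cost, status)."""
    verdict = await review(candidate.content)
    # Reviewer cost is added to generation cost whether or not the text passes.
    full_cost = (candidate.cost_usd or 0.0) + (verdict.cost_usd or 0.0)
    if not verdict.clean:
        return None, full_cost, "leaked"
    return candidate.content, full_cost, "ok"


async def toy_review(text):
    """Hypothetical reviewer: rejects personalised 'you should' framing."""
    advice = "you should" in text.lower()
    return Verdict(clean=not advice,
                   reason="advice" if advice else "ok",
                   cost_usd=0.25)
```

A caller would publish the first element when it is not None and otherwise fall back, raise, refuse, or skip, depending on the surface; in both branches the combined cost is what gets written to the ledger.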
In every case the AICall ledger row records the reviewer cost so month_spend stays accurate across all paths. The reviewer system prompt is slightly generalised to cover both the indicator-summary case and the longer-form log/digest/chat case: - removes "short interpretive read" framing - softens the "any question" rule so genuine rhetorical structure in a long-form log doesn't trigger a reject tests/conftest.py grows an autouse fixture that stubs review_read to clean=True in every consumer module. Tests that mock the generator shouldn't have to also mock the safety gate behind it; tests that specifically want the reject branch can override with their own monkeypatch. test_output_review.py is unaffected — it imports review_read directly. Co-Authored-By: Claude Opus 4.7 --- app/jobs/ai_log_job.py | 26 +++++++++++++++++-- app/jobs/email_digest_job.py | 22 +++++++++++++++- app/routers/chat.py | 40 +++++++++++++++++++++++++++++- app/services/output_review.py | 24 ++++++++++-------- app/services/portfolio_analysis.py | 36 +++++++++++++++++++++++---- tests/conftest.py | 32 ++++++++++++++++++++++++ 6 files changed, 161 insertions(+), 19 deletions(-) diff --git a/app/jobs/ai_log_job.py b/app/jobs/ai_log_job.py index 9b5683e..197faa5 100644 --- a/app/jobs/ai_log_job.py +++ b/app/jobs/ai_log_job.py @@ -25,6 +25,7 @@ from app.services.llm_prompts import ( build_system_prompt, build_user_prompt, ) +from app.services.output_review import review_read from app.services.openrouter import ( active_model, call_llm, @@ -200,6 +201,27 @@ async def run() -> None: tone=tone, analysis=analysis, error=str(e)[:200]) continue + # Reviewer gate: catches chain-of-thought, truncation, + # and (regulatory-critical) any financial-advice phrasing + # that drifted past the generator's system prompt. Drop + # rejected variants; the API falls back to the previous + # clean StrategicLog row. 
+ verdict = await review_read(client, result.content) + full_cost = (result.cost_usd or 0.0) + (verdict.cost_usd or 0.0) + if not verdict.clean: + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, status="leaked", + )) + await session.commit() + log.warning("ai_log.reviewer_rejected", + tone=tone, analysis=analysis, + reason=verdict.reason, + preview=result.content[:120]) + continue + slog = StrategicLog( generated_at=utcnow(), model=result.model, @@ -210,14 +232,14 @@ async def run() -> None: content=result.content, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + cost_usd=full_cost, ) session.add(slog) session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + cost_usd=full_cost, status="ok", )) await session.commit() diff --git a/app/jobs/email_digest_job.py b/app/jobs/email_digest_job.py index dc89e5b..4cbd865 100644 --- a/app/jobs/email_digest_job.py +++ b/app/jobs/email_digest_job.py @@ -41,6 +41,7 @@ from app.services.openrouter import ( call_llm, llm_configured, ) +from app.services.output_review import review_read from app.services.translation import translate @@ -93,12 +94,31 @@ async def _generate_variants(session, client, kind: str, ctx: dict) -> dict[str, [{"role": "system", "content": sys_}, {"role": "user", "content": usr}], ) + # Reviewer gate. Digest emails land in inboxes — once + # delivered they're unrecallable, so a financial-advice slip + # has more reach than the dashboard. Drop rejected variants; + # users on that tone get no digest this cycle (better than + # delivering bad copy). 
+ verdict = await review_read(client, result.content) + full_cost = (result.cost_usd or 0.0) + (verdict.cost_usd or 0.0) + if not verdict.clean: + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, status="leaked", + error=f"reviewer: {verdict.reason}", + )) + await session.commit() + log.warning("digest.reviewer_rejected", kind=kind, tone=tone, + reason=verdict.reason, preview=result.content[:120]) + continue out[tone] = result.content session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + cost_usd=full_cost, status="ok", )) await session.commit() diff --git a/app/routers/chat.py b/app/routers/chat.py index f213637..20f99e5 100644 --- a/app/routers/chat.py +++ b/app/routers/chat.py @@ -24,6 +24,10 @@ from app.routers.api import _md_to_html from app.services.i18n import respond_in_clause from app.services.llm_prompts import build_chat_system_prompt from app.services.openrouter import call_llm, month_start +from app.services.output_review import review_read + +from app.logging import get_logger +log = get_logger("chat") router = APIRouter(dependencies=[Depends(require_token)]) @@ -176,6 +180,11 @@ async def chat( try: async with httpx.AsyncClient(follow_redirects=True) as client: result = await call_llm(client, msgs) + # Reviewer gate. The chat turn could solicit advice with a + # leading question; the generator's system prompt forbids it, + # but the reviewer is the enforcement layer. ~1-2 s extra + # latency per turn on top of the generation call. 
+ verdict = await review_read(client, result.content) except Exception as e: session.add(AICall( model=s.OPENROUTER_MODEL, status="error", error=str(e)[:500], @@ -183,11 +192,40 @@ async def chat( await session.commit() raise HTTPException(status_code=502, detail=f"OpenRouter error: {e}") + full_cost = (result.cost_usd or 0.0) + (verdict.cost_usd or 0.0) + if not verdict.clean: + # Rejected reply. Record the cost and surface a generic refusal + # the user can retry, rather than letting potentially non-compliant + # text reach them. + session.add(AICall( + model=result.model, + prompt_tokens=result.prompt_tokens, + completion_tokens=result.completion_tokens, + cost_usd=full_cost, status="leaked", + error=f"reviewer: {verdict.reason}", + )) + await session.commit() + log.warning("chat.reviewer_rejected", reason=verdict.reason, + preview=result.content[:120]) + refusal = ( + "I can't generate that reply — it would have crossed into " + "investment advice or specific recommendations, which I'm " + "not licensed to give. Try rephrasing as a question about " + "what the data means rather than what to do." + ) + return { + "role": "assistant", + "content": refusal, + "content_html": _md_to_html(refusal), + "prompt_tokens": result.prompt_tokens, + "completion_tokens": result.completion_tokens, + } + session.add(AICall( model=result.model, prompt_tokens=result.prompt_tokens, completion_tokens=result.completion_tokens, - cost_usd=result.cost_usd, + cost_usd=full_cost, status="ok", )) await session.commit() diff --git a/app/services/output_review.py b/app/services/output_review.py index 833b927..4fbb2fb 100644 --- a/app/services/output_review.py +++ b/app/services/output_review.py @@ -39,25 +39,29 @@ DEFAULT_REVIEWER_MODEL = "anthropic/claude-haiku-4.5" _SYSTEM_PROMPT = """\ You are a strict editor for a financial-markets dashboard. The author -was asked to produce a short interpretive read for human readers. 
-You receive their proposed read and decide if it is publishable as-is. +was asked to produce editorial commentary on public market data for +human readers. You receive the proposed text — it may be a one-line +read, a multi-paragraph daily log, a portfolio analysis, a chat +reply, or an email digest — and decide if it is publishable as-is. -Mark CLEAN only if the text reads like a finished interpretation a -reader could see on a public dashboard without confusion. +Mark CLEAN only if the text reads like finished editorial commentary +a reader could see on a public dashboard without confusion. Mark UNCLEAN if the text contains ANY of: -- Chain-of-thought / scratchpad markers used as thinking — phrases like +- Chain-of-thought / scratchpad markers — the author thinking on the + page rather than presenting finished commentary. Phrases like "Let me", "Let's see", "we need to", "actually" (correcting itself), - "wait", "hmm", "or rather", "I should". + "wait", "hmm", "or rather", "I should". Rhetorical questions used + as structure are fine; questions that the author then answers in + front of the reader (self-questioning) are not. - Self-questioning parentheticals: "Q1 2026? Actually Q4 2025?", "is it X or Y?", any place where the author appears to be working out the answer in front of the reader. -- Multiple rhetorical questions or any question that interrupts the - declarative voice. A clean interpretive read is assertive. - Meta-commentary about the task, output format, word limits, or instructions — e.g. "as required by the constraints", "the prompt asks", "let me address each". -- Partial / truncated content. Starts mid-word, mid-number, mid-clause. +- Partial / truncated content. Starts mid-word, mid-number, mid-clause, + ends mid-thought. - Visible internal numbers without clear meaning ("change 1y +5.9%?"), raw column names ("as_of 2026-01-01"), or any debug-like fragments. 
- FINANCIAL ADVICE or any phrasing that recommends an action the @@ -75,7 +79,7 @@ Mark UNCLEAN if the text contains ANY of: "valuations are stretched", "real yields are restrictive", "rates and credit disagree". The test: does the text describe a STATE, or does it suggest an ACTION? States are fine; actions are not. -- Anything else other than the finished, publishable interpretation. +- Anything else other than the finished, publishable commentary. Return ONLY a JSON object with this exact shape: {"clean": true | false, "reason": "<≤20 words, plain text>"} diff --git a/app/services/portfolio_analysis.py b/app/services/portfolio_analysis.py index 450f948..1f6bea7 100644 --- a/app/services/portfolio_analysis.py +++ b/app/services/portfolio_analysis.py @@ -33,6 +33,7 @@ from app.logging import get_logger from app.models import AICall from app.services.i18n import LANGUAGES, respond_in_clause from app.services.llm_prompts import build_system_prompt +from app.services.output_review import review_read from app.services.openrouter import ( LogResult, active_model, @@ -322,6 +323,8 @@ async def analyse( s = get_settings() system, user = build_prompt(req) + review_cost = 0.0 + review_reason: str | None = None async with httpx.AsyncClient() as client: try: llm: LogResult = await call_llm( @@ -340,15 +343,31 @@ async def analyse( llm = None log.error("portfolio_analysis.failed", error=error_msg) + # Reviewer gate. This is the highest-risk surface — the model is + # commenting on a real user's holdings, so any drift into + # buy/sell or allocation language is a regulatory hazard. Drop + # the response on a reject and surface a retry-able error to the + # caller; no analysis is ever persisted server-side anyway. 
+ if llm is not None: + verdict = await review_read(client, llm.content) + review_cost = verdict.cost_usd or 0.0 + if not verdict.clean: + status = "leaked" + error_msg = f"reviewer rejected: {verdict.reason}" + review_reason = verdict.reason + log.warning("portfolio_analysis.reviewer_rejected", + reason=verdict.reason, preview=llm.content[:120]) + + full_cost = ((llm.cost_usd or 0.0) + review_cost) if llm else None # Ledger row — NO portfolio data, just metadata. Same row whether the - # call succeeded or failed, so cost-cap and rate-limit logic can - # observe the attempt. + # call succeeded, failed, or was rejected by the reviewer, so + # cost-cap and rate-limit logic can observe the attempt. session.add(AICall( called_at=utcnow(), model=llm.model if llm else active_model(), prompt_tokens=llm.prompt_tokens if llm else None, completion_tokens=llm.completion_tokens if llm else None, - cost_usd=llm.cost_usd if llm else None, + cost_usd=full_cost, status=status, error=error_msg, )) @@ -356,19 +375,26 @@ async def analyse( if llm is None: raise RuntimeError(error_msg or "portfolio analysis failed") + if review_reason is not None: + # Reviewer rejected the candidate. Treat as a generation failure + # at the API layer so the user sees a retry-able error rather + # than potentially non-compliant advice. + raise RuntimeError( + "AI analysis couldn't be generated cleanly — please try again." 
+ ) log.info( "portfolio_analysis.ok", n_positions=len(req.positions), prompt_tokens=llm.prompt_tokens, completion_tokens=llm.completion_tokens, - cost_usd=llm.cost_usd, + cost_usd=full_cost, ) return AnalysisResult( content=llm.content, model=llm.model, prompt_tokens=llm.prompt_tokens, completion_tokens=llm.completion_tokens, - cost_usd=llm.cost_usd, + cost_usd=full_cost, generated_at=datetime.now(timezone.utc), ) diff --git a/tests/conftest.py b/tests/conftest.py index b032028..e49e229 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -22,6 +22,38 @@ os.environ.setdefault("CASSANDRA_MOCK", "1") import pytest +@pytest.fixture(autouse=True) +def stub_reviewer(monkeypatch): + """Replace review_read with a clean-passing stub in every consumer + module. Tests that mock the generator's call_llm shouldn't also + have to mock the reviewer that runs after it — the reviewer is a + safety gate, not behaviour under test. + + Tests in test_output_review.py exercise review_read through its + own module and are unaffected. Tests that want to assert the + reviewer-rejected branch can override with their own + monkeypatch.setattr — later wins. + """ + from app.services.output_review import Verdict + + async def _clean(_client, _candidate): + return Verdict(clean=True, reason="stubbed-by-conftest", cost_usd=0.0) + + for mod_path in ( + "app.services.portfolio_analysis", + "app.routers.chat", + "app.jobs.ai_log_job", + "app.jobs.email_digest_job", + "app.jobs.indicator_summary_job", + ): + try: + mod = __import__(mod_path, fromlist=["review_read"]) + except ImportError: + continue + if hasattr(mod, "review_read"): + monkeypatch.setattr(mod, "review_read", _clean) + + @pytest.fixture async def db_factory(tmp_path): """Per-test sqlite engine + async session factory.
            Ticker  Name  Qty  Avg  Qty  Avg  Last  P/L%