Commit graph

173 commits

Author SHA1 Message Date
838f227175 settings: drop the broker-list line from the import lede
Removed "Trading 212 is recognised natively and other formats (IBKR,
Fidelity, Schwab…) are auto-detected" from the import section's
lede paragraph — internal/marketing noise that doesn't help the user
once they're already on the import screen with a file picker in
front of them. Kept the surrounding sentence ("Drop a portfolio CSV
from any broker. We'll parse it…") and the T212 export-path hint
since the latter is concrete instructional content for T212 users.
2026-05-29 15:58:47 +02:00
dbb14340db fix: ascii quotes in settings.html script tags
The two <script src="{{ url_for(...) }}"> lines for the sync scripts
had Unicode smart-quotes (' / ') instead of ASCII apostrophes —
left over from a copy-paste at some point. Jinja's tokenizer hit the
first one and raised TemplateSyntaxError, so /settings returned a
500. Replaced with ASCII quotes and added the missing ?v=ASSET_VERSION
cache-buster the other static URLs already use.
2026-05-29 15:34:45 +02:00
21835afebe analyze: send the live toggle lang from the frontend, log resolution
The /api/analyze flow previously read principal.user.lang from the
DB on every request and ignored anything the client might send. That
races the language toggle's PATCH: a user can flip the toggle and
click Generate/Regenerate before the PATCH /api/settings/language
hits the DB, so the analysis is sent with the OLD persisted lang
while the toggle visually reads as the new one. From the user's POV
the analysis comes back in the wrong language.

Frontend portfolio.js now reads the live #lang-toggle data-lang
attribute (the same source the UI itself uses) and includes it in
the /api/analyze body. The dataset attribute is updated optimistically
by cassandraSetLang() before the PATCH fires, so it always reflects
what the user is looking at.

Backend universe.py prefers payload["lang"] when present and falls
back to user.lang otherwise — older clients (scripts, direct curl)
that don't send anything still get the DB-stored preference. The
resolution path is logged so we can confirm in prod which lang
actually drove a given request.
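
A minimal sketch of the resolution order described above, assuming a plain dict payload. The function name `resolve_lang`, the log message format, and the validation against a supported-language set are illustrative assumptions, not the project's actual code:

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger("analyze")

# Assumption: the five shipped languages named elsewhere in this log.
SUPPORTED_LANGS = {"en", "it", "es", "fr", "de"}


def resolve_lang(payload: dict, stored_lang: Optional[str]) -> Tuple[str, str]:
    """Return (lang, source): prefer the request body, else the DB value."""
    sent = payload.get("lang")
    if isinstance(sent, str) and sent.lower() in SUPPORTED_LANGS:
        lang, source = sent.lower(), "payload"
    elif stored_lang in SUPPORTED_LANGS:
        lang, source = stored_lang, "db"
    else:
        lang, source = "en", "default"
    # Logging the source shows which path actually drove the request.
    logger.info("analyze.lang_resolved lang=%s source=%s", lang, source)
    return lang, source
```

Older clients that send no `lang` key fall through to the stored preference, which matches the fallback behaviour in the message.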

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 15:32:58 +02:00
13dd3a8330 i18n: prepend a strong language directive for portfolio + chat
Reports that portfolio AI analysis was coming back in English even
for IT-toggled users. Traced the chain (DB user.lang IS set to it,
router passes it into the payload, parse_request reads it, build_prompt
appends respond_in_clause), so the wiring is correct end-to-end. The
model was simply ignoring the single-sentence tail nudge: when the
system prompt is hundreds of lines of English and the user message
adds more English context, "Respond in Italian." at the end is easy
to drop on the floor.

Add a new services/i18n.language_directive_lead() that returns a
strong, explicit top-of-prompt block — "# LANGUAGE — write everything
in <X>" plus the verbatim-tickers-and-numbers carve-out — meant to
be PREPENDED so the model anchors on the target language before it
reads the bulk of the instructions. Combined with the existing tail
clause it's belt-and-suspenders: top + bottom of the prompt both
say "in this language".

Applied to portfolio_analysis.build_prompt() and chat.py — the two
surfaces that generate user-facing prose in real time (the strategic
log + indicator summaries get post-hoc translation via translate(),
so the directive isn't needed there).

Empty-string return for en / unknown lang means callers can wire
it in unconditionally; no extra plumbing in i18n callsites.
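
The empty-string contract can be sketched as follows. The language-name table, the exact directive wording, and the `build_prompt` signature here are assumptions for illustration; only the function name `language_directive_lead` and the "empty for en / unknown" behaviour come from the message:

```python
# Hypothetical display names; the real service may keep its own table.
LANGUAGE_NAMES = {"it": "Italian", "es": "Spanish", "fr": "French", "de": "German"}


def language_directive_lead(lang):
    """Return a top-of-prompt language block, or "" for en / unknown."""
    name = LANGUAGE_NAMES.get((lang or "").lower())
    if name is None:
        return ""
    return (
        f"# LANGUAGE: write everything in {name}\n"
        "Keep tickers and numbers verbatim.\n\n"
    )


def build_prompt(system_prompt, lang):
    # Callers prepend unconditionally; "" makes this a no-op for English.
    return language_directive_lead(lang) + system_prompt
```

Because the lead is an empty string for English, call sites need no `if lang != "en"` branch.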

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 15:21:00 +02:00
736d161990 ui: portfolio actions row + AI analysis regenerate
Two small UX changes to the portfolio panel:

1. "Forget this pie" is destructive enough to belong in edit-mode
   only. The button now hides by default and only surfaces when the
   #portfolio-panel.pf-editing class is on the panel (same surface
   that already shows per-row × and the add-position form). The
   element stays in the DOM so the existing click handler keeps
   working without re-mount.

2. "Generate AI analysis" disappears once an analysis exists. In its
   place a small "Regenerate" button is rendered inside the
   collapsible analysis box — in the summary header, right-aligned
   next to the timestamp. The button stops the summary's default
   toggle action so a click regenerates without collapsing the
   panel. runAnalysis() now tolerates either pf-analyze or pf-regen
   as the trigger, and showAnalysis() takes an optional
   onRegenerate callback so callers can wire the button to the
   current pie/enriched closure context. Re-hydration after the
   60s portfolio refresh passes the same callback so the button
   survives a refresh cycle.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 15:04:08 +02:00
652995feea ui: log panel bottom-aligns with portfolio via contain:size
Third attempt at fixing the dashboard's right-column alignment, this
time with the structural cause identified explicitly.

Previous attempts (a55168d, 8347c90) changed align-self on #log-panel
to control how the panel filled its grid area. They got the box
edges aligned, but the underlying problem was a different one:
CSS Grid auto-sizes each row by MAX(intrinsic content height across
items in that row). When the log content is taller than indicators +
portfolio combined, the grid grows rows 2-3 to fit it; portfolio
ends up in a stretched row with empty space below the actual content.

The fix is to stop the log's content from contributing to the grid
row sizing at all. `contain: size` on the log panel declares "my
contents do not affect my intrinsic size" — the grid then sizes rows
2-3 from indicators + portfolio alone, and the log stretches to
inhabit that combined height. A flex column inside the panel
(min-height: 0 on every level of the chain) lets .panel-body fill
the remaining height below the header and scroll instead of
overflowing.

The 1100px mobile breakpoint undoes the constraint: at that width
the grid restructures to a single column, the log no longer shares
a row with indicators + portfolio, and `contain: size` would just
collapse the panel to zero. There the log expands naturally and
page scroll handles it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:56:11 +02:00
f9534f7ad6 review: gate strategic-log, portfolio, chat, and digest on reviewer
Extends the reviewer agent — previously only protecting indicator
summaries — to every AI-generated surface that reaches a user. The
reviewer's prompt already rejects scratchpad, truncation,
meta-commentary, and (since a6e476b) financial advice; wiring it in
turns those rules from prompt-level "asks" into structural gates.

Four call sites updated:

- ai_log_job.run() : after each tone/analysis variant is generated,
  pass through review_read. On reject, log the reason and skip the
  StrategicLog insert; the API's existing "latest StrategicLog" lookup
  falls back to the previous clean log.

- services/portfolio_analysis.analyse() : on reject, raise a clean
  RuntimeError that the /api/analyze router already maps to HTTP 502
  with a retry-able message. Portfolio analysis isn't cached server-
  side, so the user retries; the reviewer's verdict reason goes into
  the AICall ledger as the leaked-status row's error column.

- routers/chat.chat() : on reject, instead of returning the raw
  assistant content we return a short refusal explaining the limit
  and inviting a rephrase. Adds ~1-2 s of latency per turn (one extra
  LLM call to Haiku) — the only user-facing latency tax.

- jobs/email_digest_job._generate_variants() : on reject, the variant
  is dropped for the cycle. Recipients on the rejected tone get no
  digest email this run, which is better than delivering inbox copy
  that drifts into advice (emails are unrecallable once sent).

In every case the AICall ledger row records the reviewer cost so
month_spend stays accurate across all paths.
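
A rough sketch of the chat gate, under stated assumptions: `Verdict`, `gate_chat_reply`, and the refusal text are hypothetical stand-ins, and the reviewer is injected as a callable so the example runs without any LLM call:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical refusal copy; the real wording is not given in this log.
REFUSAL = "I can't share that reply as written. Could you rephrase the question?"


@dataclass
class Verdict:
    clean: bool
    reason: str
    cost_usd: float = 0.0


def gate_chat_reply(candidate: str, generation_cost: float,
                    review: Callable[[str], Verdict]):
    """Return (text_to_send, total_cost_usd, reject_reason_or_None)."""
    verdict = review(candidate)
    # Reviewer cost is always added so spend tracking covers both calls.
    total = generation_cost + verdict.cost_usd
    if verdict.clean:
        return candidate, total, None
    return REFUSAL, total, verdict.reason
```

The other three call sites differ only in what happens on reject (skip the insert, raise, or drop the variant); the cost accounting is the same in each.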

The reviewer system prompt is slightly generalised to cover both the
indicator-summary case and the longer-form log/digest/chat case:
- removes "short interpretive read" framing
- softens the "any question" rule so genuine rhetorical structure in
  a long-form log doesn't trigger a reject

tests/conftest.py grows an autouse fixture that stubs review_read to
clean=True in every consumer module. Tests that mock the generator
shouldn't have to also mock the safety gate behind it; tests that
specifically want the reject branch can override with their own
monkeypatch. test_output_review.py is unaffected — it imports
review_read directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:40:04 +02:00
a6e476b851 review: reject financial advice in indicator-summary reads
Adds a new UNCLEAN criterion to the reviewer agent's system prompt:
direct recommendation language (buy/sell/hold/accumulate/trim/rotate),
allocation guidance (overweight/underweight, "X% in bonds"), price
targets, and personalised framing ("you should", "investors should")
all trigger a reject.

The operator is not licensed to give investment advice; this is
editorial commentary on public data. The generator's system prompt
already forbids buy/sell language, but a prompt-only constraint is
not an enforcement layer. The reviewer agent — already in the
pipeline for chain-of-thought / truncation / meta-commentary — is the
right place to enforce the regulatory boundary structurally: rows
that drift into advice get dropped, and the API falls back to the
previous compliant row.

Descriptive / interpretive language about market state remains
explicitly allowed ("valuations are stretched", "real yields are
restrictive"). The criterion is state vs action: states publish,
actions don't.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 14:26:37 +02:00
cd485fe646 scripts: one-off purge of unclean IndicatorSummary rows
Iterates every IndicatorSummary in the DB and asks the reviewer agent
(services/output_review.review_read) whether each row's content is
publishable. Rows the reviewer flags as unclean are deleted along
with their translation rows. The API's existing fallback path —
serve the latest IndicatorSummary by (group, tone) — picks up the
previous clean row automatically.

Concurrency defaults to 8 reviewer calls in flight; on the 3245-row
prod archive that completes in ~10 minutes for ~$1 of Haiku cost.

Idempotent: a second run only re-evaluates whatever's still in the
table. --dry-run skips the deletion stage. After the live pipeline
fix landed (JSON-mode + reviewer at write time) this script should
not find anything on subsequent invocations.
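
One way to bound the number of reviewer calls in flight is an `asyncio.Semaphore`. The sketch below assumes rows arrive as `(row_id, content)` pairs and that `review` and `delete` are injected callables; these are stand-ins for the real DB and reviewer plumbing, which this log does not show:

```python
import asyncio


async def purge_unclean(rows, review, delete, concurrency=8, dry_run=False):
    """Review every row with at most `concurrency` calls in flight.

    `review` is an async callable returning True for publishable content;
    `delete` removes one row by id. Returns the ids flagged as unclean.
    """
    limit = asyncio.Semaphore(concurrency)

    async def check(row_id, content):
        async with limit:
            return row_id, await review(content)

    results = await asyncio.gather(*(check(i, c) for i, c in rows))
    unclean = [row_id for row_id, clean in results if not clean]
    if not dry_run:
        # Deletion stage; skipped entirely under --dry-run.
        for row_id in unclean:
            delete(row_id)
    return unclean
```

Running it twice is harmless: the second pass only sees whatever rows survived the first.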
2026-05-29 13:56:47 +02:00
385c5fdc60 review: strip markdown code-fences from JSON verdicts
Haiku 4.5 occasionally wraps its JSON response in a markdown code
fence even with response_format={"type":"json_object"} enforced:

    ```json
    {"clean": true, "reason": "polished read"}
    ```

Live testing the new reviewer caught this — every verdict was being
dropped as "reviewer returned non-JSON". Strip a single leading and
trailing fence before json.loads. Defensive for any model that does
the same (Claude variants commonly fence JSON even when told not to).

Adds a unit test covering fenced output.
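
A minimal sketch of the fence stripping, not the project's actual helper: `strip_code_fence` is a hypothetical name, and the regex only handles the single-block shape shown above (an opening fence with an optional language tag, a body, a closing fence):

```python
import json
import re

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence

# One opening fence with an optional language tag, the body, one closing fence.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the fence body if `text` is one fenced block, else `text` stripped."""
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text.strip()


fenced = FENCE + 'json\n{"clean": true, "reason": "polished read"}\n' + FENCE
verdict = json.loads(strip_code_fence(fenced))
```

Unfenced JSON passes through unchanged apart from whitespace trimming, so the helper is safe to apply to every reviewer response.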
2026-05-29 13:27:37 +02:00
788563a81f ai: route reviewer through OpenRouter + Claude Haiku 4.5
The DeepSeek-V4-flash reviewer was unreliable in production: it pads
its JSON verdicts with internal chain-of-thought even when the prompt
forbids it, so the verdict gets truncated at any reasonable max_tokens
cap and the parser drops it as malformed (a false-negative verdict
that would purge clean rows). A live run on 50 rows reproduced the
failure on 8 of 12 rejections, even at 800 tokens.

Fix: pin the reviewer call to OpenRouter with anthropic/claude-haiku-4.5.
Haiku answers structured-output classification tersely (no scratchpad
preamble), which means a 300-token cap is comfortably above the
~30-token JSON verdict. Cost is roughly the same (~$0.0001-$0.0003 per
review) and the latency tax is smaller.

To enable the pinned-provider call without disrupting other callers,
call_llm grows an optional `provider` parameter: when set, only that
provider is used (no fallback chain). All existing call sites
default to provider=None and keep the chain behaviour.
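
The pinned-provider behaviour might look roughly like this. The provider names in `FALLBACK_CHAIN`, the `providers` mapping, and the error handling are assumptions made so the sketch is runnable; the real `call_llm` signature is not shown in this log beyond the optional `provider` parameter:

```python
from typing import Callable, Dict, Optional

# Hypothetical default order; the real chain lives in the project's settings.
FALLBACK_CHAIN = ["deepseek", "openrouter"]


def call_llm(prompt: str, providers: Dict[str, Callable[[str], str]],
             provider: Optional[str] = None) -> str:
    """Try each provider in order; with `provider` set, try only that one."""
    order = [provider] if provider is not None else FALLBACK_CHAIN
    last_error: Optional[Exception] = None
    for name in order:
        try:
            return providers[name](prompt)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    raise RuntimeError(f"all providers failed: {order}") from last_error
```

With `provider=None` a failing first provider is skipped silently; with a pinned provider the same failure surfaces to the caller, which is what the reviewer path wants.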

REVIEWER_MODEL is read from settings via getattr-with-fallback so an
env override can swap models without code changes — useful if we want
to A/B test against e.g. gemini-2.5-flash later.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:21:26 +02:00
8b9d3c9c3e ai: bump reviewer max_tokens 300 → 800
Live re-check on 50 recent IndicatorSummary rows after the previous
120 → 300 bump still produced 4 'reviewer returned non-JSON' verdicts
out of 12 rejections. DeepSeek-V4-flash sometimes prefixes its JSON
output with a short stretch of thinking even though response_format
is enforced, which truncates the JSON at the back end of the 300-token
cap.

800 tokens is comfortably above any realistic verdict + preamble at
~$0.00022/call (DeepSeek output rates). Negligible cost given the
hourly call volume.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:16:57 +02:00
0550063316 ai: bump reviewer max_tokens 120 → 300
A live sanity-check on 50 recent IndicatorSummary rows found 6 of 10
reviewer rejections were the reviewer hitting its own max_tokens cap
mid-verdict ('{"clean": false, "reason": "Truncated sent…'). The
parser then dropped the candidate as malformed JSON, producing a
false-negative verdict that would have purged legitimately clean
rows.

300 tokens is well above the ~30-token verdict the prompt asks for;
the extra headroom removes the artefact at ~$0.00015 per call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:15:42 +02:00
45fa31bb2b ai: structured-output + reviewer agent for indicator summaries
Replaces the regex-based clean_summary / looks_like_leakage pipeline
that produced the 2026-05-29 valuation-read leak. Two layers of defence
in depth:

1. JSON-mode generation. The per-group and aggregate summary system
   prompts now require the model to emit a single object
   {"read": "..."}; response_format={"type":"json_object"} is passed
   through to the provider so the API enforces well-formed JSON. Prose
   outside the field is physically impossible. The "read" field is the
   only schema slot, so the model has nowhere to spill scratchpad
   into the envelope.

2. Reviewer agent. services/output_review.review_read() makes a second
   small LLM call that judges whether the candidate "read" string is
   publishable. It catches the residual failure mode — scratchpad
   INSIDE the field ("Let's see…", multi-question parentheticals,
   meta-commentary) — and returns a JSON verdict {"clean": bool,
   "reason": str}. Any failure (provider error, parse error, missing
   field) returns clean=false (fail-safe). Cost ~$0.0001/check; latency
   ~1-2 s in the hourly job, no user-facing latency.
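
The two parsing steps can be sketched as below. The names `extract_read` and `parse_verdict` and the reason strings are illustrative (the commit's own helpers are `_extract_read` and `review_read`, whose bodies are not shown here); the point is that every malformed input maps to a rejection rather than a pass:

```python
import json


def extract_read(raw):
    """Return the "read" string from a {"read": "..."} envelope, else None."""
    try:
        obj = json.loads(raw)
    except (TypeError, ValueError):
        return None
    if not isinstance(obj, dict):
        return None
    read = obj.get("read")
    if not isinstance(read, str) or not read.strip():
        return None
    return read.strip()


def parse_verdict(raw):
    """Return (clean, reason); anything malformed maps to clean=False."""
    try:
        obj = json.loads(raw)
    except (TypeError, ValueError):
        return False, "reviewer returned non-JSON"
    if not isinstance(obj, dict) or not isinstance(obj.get("clean"), bool):
        return False, "reviewer verdict missing 'clean'"
    return obj["clean"], str(obj.get("reason", ""))
```

Requiring `clean` to be a real boolean means a string such as `"true"` is also treated as a failed review, which keeps the gate fail-safe.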

The old regex scaffolding (_LEAK_PATTERNS, clean_summary,
looks_like_leakage, _TRAILING_QUOTE) is deleted entirely. It produced
false positives (chopped legitimate "The indicators are…" leaders) and
false negatives (never matched the chain-of-thought patterns the model
actually emits). The reviewer agent is strictly better on both.

On reviewer/parse rejection: don't persist a new IndicatorSummary; the
API's existing fallback to the previous good row continues to serve
the panel. Failures are logged as ind_summary.json_invalid /
ind_summary.reviewer_rejected so we can measure the rejection rate.

Reviewer cost is added to the row's recorded cost_usd so the monthly
budget cap covers the full pipeline.

Adds tests/test_output_review.py: 11 cases covering _extract_read
(JSON envelope handling — invalid JSON, missing field, wrong types,
empty values) and review_read (clean / unclean verdicts plus three
fail-safe paths for malformed reviewer responses).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:10:52 +02:00
19d4854f50 llm: support JSON-mode + stop publishing the reasoning field
Two changes to the LLM call path that together close the
chain-of-thought leakage surface:

1. _call_provider accepts an optional `response_format` (forwarded to
   the OpenAI-shaped API — DeepSeek and OpenRouter both honour
   {"type": "json_object"}). Threaded through call_llm so callers can
   force structured output without monkey-patching the body. The
   indicator-summary job will use this next: it'll require the model
   to emit {"read": "..."} and parse the field, making prose outside
   the JSON object physically impossible to publish.

2. Empty `content` no longer falls back to the `reasoning` field.
   `reasoning` is the model's internal scratchpad — "Let's see...",
   half-formed math, planning notes. We had a fallback that surfaced
   it when content was null, but the field is intended for debugging
   the model, not for publication. After the 2026-05-29 valuation
   read leaked into production, the fallback is gone: an empty
   content row now raises so the caller retries or skips, and the
   previous good row remains visible. Test updated to assert this
   safer behaviour.
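
A minimal sketch of both changes, assuming an OpenAI-shaped request body and response message. `build_request_body` and `extract_content` are hypothetical names, not the project's `_call_provider` internals:

```python
def build_request_body(model, messages, response_format=None):
    """Build a chat-completions body, forwarding response_format when given."""
    body = {"model": model, "messages": messages}
    if response_format is not None:
        body["response_format"] = response_format
    return body


def extract_content(message: dict) -> str:
    """Return the publishable text of a chat-completion message.

    Raises instead of falling back to `reasoning`, which holds scratchpad.
    """
    content = message.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("provider returned empty content")
    return content
```

The caller decides what to do with the exception (retry or skip); either way nothing from the `reasoning` field can reach a published row.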

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 13:02:36 +02:00
8347c90235 ui: drop log-content's fixed-viewport scroll cap
The dashboard's log panel now stretches in the grid to bottom-align
with the portfolio (a55168d), but .log-content still carried
max-height: calc(100vh - 240px) + overflow-y: auto from an older
layout. That produced an inner scrollbar inside the panel AND left
visible dead space below the scrolling region. Removing the cap lets
the panel grid handle the height and the page scroll handle very long
logs; no more nested scroll region.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:58:06 +02:00
a55168d20a ui: log panel stretches to portfolio bottom; AI analysis stays expanded
Two small fixes to the dashboard right column based on user feedback:

1. layout.css — drop align-self:start from #log-panel.
   The panel previously shrank to its content, leaving the right-hand
   column visually shorter than the indicators+portfolio stack on the
   left. Removing the override lets the grid stretch the panel to the
   full row span so the two columns now bottom-align. The log content
   still sits at the top of the panel; any extra height is empty
   padding inside the box.

2. portfolio.js — re-hydrate AI analysis expanded.
   The 60s auto-refresh rebuilds the portfolio mount and re-attaches
   the previously-generated analysis from localStorage, but the
   <details> element was re-attached with open:false — collapsing it
   under the user's cursor every minute. Users reasonably perceived
   that as "the analysis disappeared". Hydrate as open:true so the
   body stays visible; the user can still click the summary to
   collapse manually within a refresh window.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:35:10 +02:00
6c4c711830 ui: log page tone badge follows the toggle (novice / pro)
The Strategic Log Archive panel header used to show two engineery
badges sourced from server config:
  new logs use: tone intermediate analysis speculative

Both were misleading:
- The tone badge described the SERVER's generator setting, not the
  user's reading preference — confusingly disconnected from the
  Novice | Pro toggle in the topbar that actually controls what AI
  panels render.
- The analysis flag is always SPECULATIVE in production, so the badge
  carried no information.

Drop the "new logs use:" prefix and the analysis badge. The tone badge
now mirrors the user's toggle: NOVICE → "novice", INTERMEDIATE → "pro"
(same data values; just the display label flips, matching the header
relabel from 3e1a14f).

Wiring lives in base.html: a new cassandraSyncToneBadge(tone) helper
updates the #tone-badge element when present. Called from
DOMContentLoaded (so the initial badge picks up the localStorage tone)
and from cassandraSetTone (so toggling the header updates the badge
live, without a page refresh).

current_tone / current_analysis are removed from _log_page_context —
log.html was the only consumer and neither key is referenced now.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:17:49 +02:00
259146ecdc fix: don't put literal Jinja syntax inside JS comments in base.html
The previous commit's i18n explanatory comment included the snippet
{% if user_lang == 'it' %} as illustration — but Jinja parses the
whole template, including content inside JS // comments, so that
literal got picked up as a real (unclosed) tag and every page rendered
with a TemplateSyntaxError. Rewrite the comment without the literal
Jinja syntax.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:03:44 +02:00
fca05aef7a i18n: live-swap chat sidebar labels on language toggle
The strategic log content already refreshes via HTMX on lang-changed
(server-side translation lookup), but the chat sidebar's static labels
— title, hint, helper lede, textarea placeholder, Send button — were
baked into the HTML by Jinja at page render and only updated after a
full reload.

Add a tiny client-side i18n dictionary (CASSANDRA_I18N) plus
applyI18n(lang) in base.html. cassandraSetLang() now calls
applyI18n(newLang) right after the language PATCH succeeds and before
firing the HTMX triggers, so labels swap in step with the AI content.

Convention: <element data-i18n="key">…</element> sets textContent;
<input data-i18n-placeholder="key" …> sets .placeholder. Initial
render still goes through the existing {% if user_lang == 'it' %}
Jinja blocks so there's no flash of English on page load for IT users
— applyI18n is a no-op until the toggle is clicked.

Only the chat sidebar has bindings today. Adding more labels later is
a matter of dropping a key into the dict and tagging the element.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 12:01:28 +02:00
48f022b71b i18n: stop truncating IT translations + localise the chat sidebar
Three connected fixes after the user spotted the 2026-05-28 IT log
cutting off mid-sentence:

1. translation: bump max_tokens 4000 → 8000.
   call_llm()'s default cap was 4000, which is what the English log
   generator itself uses as its ceiling. Italian expands roughly 15-25 %
   over English in tokens, so any near-cap English source produced an
   IT translation that hit finish_reason=length and returned a
   truncated body — silently, because _call_provider() only raises when
   content is fully empty. The strategic_log_translations table has
   dozens of rows where completion_tokens landed at exactly 4000 with
   content well under half the source length. 8000 gives ample
   headroom for any of the five LANGUAGES we ship (en/it/es/fr/de).

2. log.html: localise the chat sidebar strings.
   user_lang was already passed into the template by pages.py, so an
   inline {% if user_lang == 'it' %} keeps it simple. Covers the
   "Ask Cassandra" title, the "grounded on…" hint, the helper lede,
   the textarea placeholder, and the Send button label.

3. chat endpoint: append respond_in_clause(user.lang) to the system
   prompt. The chat conversation can now happen in IT — the model's
   first reply lands in the right language even when the user's first
   turn is short.

scripts/backfill_truncated_translations.py: one-off cleanup utility.
Scans strategic_log_translations for rows whose translated content is
< 70 % of the English source (the truncation signal — IT *expands*
beyond English, so a shorter translation is always suspect), deletes
them, and re-translates via the now-uncapped service. Supports --date,
--since, --all and --dry-run. The 2026-05-28 fan-out has already been
re-translated (13/13 rows). Other historical dates still hold older
truncations; the user can decide whether to backfill those (the script
is idempotent).
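
The truncation signal reduces to a length ratio. This sketch assumes lengths are compared in characters (the message does not say whether the script counts characters or tokens) and uses a hypothetical function name:

```python
def looks_truncated(source_en: str, translated: str, threshold: float = 0.70) -> bool:
    """Flag a translation shorter than `threshold` of its English source."""
    if not source_en:
        # Nothing to compare against; leave the row alone.
        return False
    return len(translated) / len(source_en) < threshold
```

Since Italian normally runs longer than the English source, a ratio below 0.70 is far outside the expected range rather than a borderline case.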

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 11:44:41 +02:00
3e1a14f334 ui: flip tone relabel — "Pro" now maps to INTERMEDIATE, not NOVICE
Reverses the polarity of 71155a6 to match the actual semantics:

- "Novice" stays labelled "Novice" → glossary tooltips, plainer prose.
- "Intermediate" is relabelled "Pro" → terse, assumes fluency, no
  hand-holding. This is the mode an expert reader wants, so the "Pro"
  badge actually fits.

Backend tone values (NOVICE, INTERMEDIATE) are unchanged — no API,
prompt, or stored-preference impact. Only the display strings flip.

Also drops the .tone-toggle button min-width: 10em override added in
71155a6. With "Intermediate" gone from the visible label, the longest
remaining label is "Novice" (6 chars), which fits the shared 5.5em
just like the theme and language toggles.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 11:23:52 +02:00
71155a67be ui: rename tone "Novice" → "Pro"; fit tone-toggle to longest option
User-visible relabel only. Backend tone value stays NOVICE — no API
contract change, no migration on stored user.digest_tone, the
glossary/plain-prose depth of analysis is unchanged. The marketing
intent is that "Pro" reads better than "Novice" on the dashboard
header; landing/pricing/privacy copy still uses the word "Novice" in
flowing prose, so leaving those alone keeps the existing explanations
coherent until they get a copy pass.

Toggle width: the popup expansion (positioned left:0/right:0) is
sized by the container, which previously sized to the active button.
When "Pro" was active the popup was too narrow to fit "Intermediate".
Bumped .tone-toggle button min-width to 10em so both buttons reserve
enough room for the longest label regardless of which one is active.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 11:17:43 +02:00
f57c863145 ui: header toggles expand downward, not sideways
Hovering a toggle (tone, theme, language) previously revealed the
non-active option inline next to the active one, which widened the
toggle and pushed its neighbours sideways. Now the non-active option
appears as a popup ABSOLUTELY POSITIONED below the active one — the
toggle's in-flow footprint stays exactly one button wide and tall, so
the other two toggles next to it never move when the user mouses over
one of them.

Mechanism: inside @media (hover: hover) the container becomes
position:relative and every button defaults to display:none. The
:hover/:focus-within rule renders all options as position:absolute
under the container. Specificity (.X[data=Y] btn[data=Y]) on the
active-button rule then pins the active option back into the static
flow at the top, so only the non-active option(s) end up absolute and the
popup grows downward only. margin-top:-1px makes the popup's top border
overlap the container's bottom border for a single shared edge.
z-index:60 sits above the markets bar (z-50). Touch devices keep
both options side-by-side (the @media gate); the mobile drawer keeps
both visible too.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 11:11:46 +02:00
31a8efc27d ui: regroup topbar + unify the three header toggles
Header layout was visibly broken on desktop after the mobile-drawer
change: flex space-between distributed brand, BETA, tone-toggle, nav
and header-right across the bar, so BETA drifted away from the brand
wordmark and the tone-toggle landed in the middle of the row.

Markup: brand + BETA are now wrapped in .header-left so they ride
together. The tone-toggle moves back inside .header-right next to
theme + lang where it logically belongs. CSS: the header switches to
grid (1fr auto 1fr) on desktop, which truly centres the nav regardless
of side-group widths. The mobile @media block reverts to flex so the
hamburger + slide-out drawer still work.

Toggle redesign (tone, theme, language):

- The single-button theme widget becomes a Light | Dark segmented
  control matching the other two so all three read as one cluster.
  cassandraToggleTheme is replaced by cassandraSetTheme(theme), the
  toggle's data-theme attribute is synced on page load.
- All three share one CSS rule set: same padding, font, border, and
  a min-width so the active-only width matches the expanded width
  (no layout jump on hover).
- On hover-capable devices each toggle collapses to just the active
  option; hovering (or keyboard focus-within) reveals both. Touch
  devices keep both visible — the @media (hover: hover) gate handles
  that and the mobile-drawer block overrides it explicitly so the
  drawer-stacked controls remain full-width with both options shown.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 11:00:11 +02:00
daa3f79a52 mobile: cache-bust static assets so browser picks up CSS/JS edits
User reported phone still showing old behaviour (Qty/Avg portfolio
columns visible) even though the server-side JS had been updated.
Root cause: every <link>/<script> URL was a plain
/static/css/foo.css with no query string, so mobile Chrome served
the file from its HTTP cache rather than refetching it.

Adds a process-startup timestamp to the Jinja environment as
ASSET_VERSION (computed once when templates_env is imported). Every
<link>/<script> reference now appends `?v={{ ASSET_VERSION }}` so a
container restart bumps the URL and the browser refetches. 38 URLs
across 8 templates updated via sed; tests still pass.

Side benefit: future CSS/JS edits no longer require users to hard-
refresh.
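
A sketch of the cache-busting idea in plain Python. The commit appends `?v={{ ASSET_VERSION }}` inline in the templates; the `versioned` helper below is a hypothetical convenience, not something the commit adds:

```python
import time

# Computed once at import time, so it changes only when the process restarts.
ASSET_VERSION = str(int(time.time()))


def versioned(url: str, version: str = ASSET_VERSION) -> str:
    """Append ?v=<version>, or &v=<version> if the URL already has a query."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"
```

A startup timestamp busts the cache on every restart even when no asset changed; a content hash would avoid that, at the cost of reading each file.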
2026-05-28 19:20:49 +02:00
1a20f0a15b mobile: tag Qty/Avg cells in JS-rendered portfolio table
The portfolio table is rendered client-side in portfolio.js (not by
the partials/portfolio.html Jinja template, which is unused for this
view). The previous commit's mobile-hide class made it into the
template but never reached the actual DOM. Adding the class to the
JS-emitted <th> and <td> strings so .dense .mobile-hide { display:
none } actually picks them up at ≤480px.
2026-05-28 19:13:52 +02:00
6459e8c43d mobile: wrap tabs, trim portfolio + markets bar columns
Three pieces of phone-side feedback:

1. Indicator group tabs wrap onto multiple rows instead of
   horizontal-scrolling — every group is visible at a glance. Each
   button keeps its own bottom border so wrapped rows stay
   visually delimited; the container's bottom border is removed.

2. Portfolio holdings table hides Qty and Avg columns on mobile via
   the mobile-hide class (same mechanism as the indicator table).
   Remaining columns are the actionable ones: Ticker, Name, Last,
   P/L, %.

3. Markets bar at the bottom compacts to one row per chip —
   dot + code + change% only. The state word ("open" / "closed")
   is implied by the dot colour; the index label, price, and
   until-time are dropped on mobile. Grid columns drop their 220px
   floor so the full set fits the viewport without horizontal
   scroll (previously the bar scrolled within itself).
2026-05-28 19:10:58 +02:00
8ec4ea1c72 mobile: clamp grid items + table cells to viewport width
User reported the page rendering at ~3x viewport width on Android
Chrome with overflow-x:hidden clipping off most of the content.
Root cause: CSS grid items default to min-width:min-content, and the
indicator table inside the indicators panel has white-space:nowrap
cells. A long Symbol/Label value forces the table wider than its
panel; the panel propagates that minimum width up the grid; the grid
expands the .app-main; .app-main pushes the page wider than the
viewport. overflow-x:hidden then just chops the right portion off.

Fix has three parts:

1. .app and .app-main get min-width:0 and max-width:100vw so the
   shell can't be wider than the viewport regardless of descendants.
2. Every direct child of .app-main (each panel) gets min-width:0
   on mobile so individual panels can shrink past their min-content.
3. table.dense drops white-space:nowrap on text cells at ≤480px —
   long symbols wrap to two lines instead of forcing the table wide.
   Numeric cells keep nowrap (negative percentages reading as
   "−12\n.34%" would be unreadable).

Also adds an overflow-x:auto fallback on .panel-body pre/code so
any code block in AI output scrolls within the panel instead of
blowing the page out.
2026-05-28 19:02:30 +02:00
5ceee96135 mobile: fix drawer stacking + horizontal page overflow
Two related bugs reported on phone:

1. Drawer was unclickable — backdrop covered it. Root cause: the
   .app-header (position:sticky, z-index:50) creates a stacking
   context, so the drawer inside it had its z-index:100 clamped to
   "above other things inside the header" but NOT above siblings of
   the header. The backdrop at root-level z:90 then sat over the
   drawer subtree.

   Fix: when body.drawer-open, raise .app-header z-index to 110
   so its entire descendant tree (drawer included) draws above the
   z:90 backdrop. The page body under the header stays dimmed.

2. Horizontal scrolling on the dashboard. Root cause: the bottom
   markets bar used `grid-template-columns: repeat(auto-fit,
   minmax(220px, 1fr))`, which at 4+ markets blows out to 880px+ and
   forces the page wider than the viewport.

   Fix: on ≤480px the markets bar becomes a horizontally scrolling
   flex strip with min-width:160px per chip — page stays narrow,
   user swipes the bar to see more markets.

Also added overflow-x:hidden to html/body as a defensive net against
the fixed off-screen drawer creating overflow on Safari iOS.
2026-05-28 18:55:04 +02:00
b6da1983d3 mobile: per-view ≤480px rules across the CSS bundle
Adds the @media (max-width: 480px) blocks specified in the design:

- dashboard.css: indicator table hides the 'mobile-hide'-tagged
  columns (Label, Ccy, 1y, anchor, as-of), keeping Symbol / Price /
  1d / 1m. Cell padding + font shrink. Group-tab buttons get a
  bigger touch target.
- panels.css: header padding tightens, scroll-body max-height drops
  to 60vh so log/news stay above the fold in the stacked layout.
- portfolio.css: overall grid keeps 2 cols (already at 640px) with
  tighter gap; action buttons wrap; composer input goes full-width.
- log-chat.css: chat bubbles edge-to-edge, input row stacks,
  font-size:14px on form fields to avoid iOS Safari zoom-on-focus.
- news.css: row collapses to age | (title / source) — source moves
  under the title. Tag filter strip wraps.
- settings.css: form rows stack (label above input). Import picker
  becomes single-column. Buttons full-width.
- auth.css: card padding tightens to free up vertical space when the
  iOS keyboard is up. font-size:14px on inputs.
- public.css: hero headline clamp() lower bound drops to 22px; CTAs
  stack full-width; pricing tier-grid stacks.

indicators.html: tagged the secondary cells with .mobile-hide rather
than relying on positional nth-child — the anchor column is
conditional and would have shifted positions.

336 tests still pass.
2026-05-28 18:43:36 +02:00
2b3ea33884 mobile: hamburger drawer (right-side slide-out)
≤480px gets a hamburger button in the topbar and a fixed slide-out
panel from the right edge (width min(82vw, 320px)). The topbar keeps
only brand + tone toggle + hamburger visible; nav and the
header-right widgets (theme, lang, user menu, version meta) move
into the drawer.

Markup change: nav and .header-right are now wrapped in
.mobile-drawer, which is display:contents on desktop (no layout
effect) and a fixed translateX panel on mobile. The user-menu
dropdown chip hides on mobile and its links surface flat inside the
drawer.

JS: ~50 lines of vanilla. Tapping the hamburger or the backdrop,
pressing ESC, or swiping right on the drawer all close it. Clicking a
nav link inside the drawer closes it after the navigation kicks off so
the panel doesn't linger on the next page.

CSS: per-file @media block at the bottom of layout.css per the
agreed-upon organisation.
2026-05-28 18:36:37 +02:00
4c1793e4e9 docs: mobile responsiveness design spec
Captures the decisions from the brainstorm: phones-only (≤480px),
all views in scope, right-side hamburger drawer, per-file @media
blocks, hide secondary indicator columns. User opted to iterate on
the coded product rather than running through writing-plans; spec
exists so the rationale survives the session.
2026-05-28 18:30:42 +02:00
f9f4f25ef7 tests: backfill coverage for openrouter transport, auth sessions, cadence
Three new test files covering modules the audit flagged as having zero
direct coverage:

- test_openrouter_transport.py (18 tests): provider chain selection,
  endpoint resolution, _call_provider parse path (including the
  reasoning-field fallback and token-based cost estimation), and
  call_llm's cross-provider failover. Uses httpx.MockTransport so no
  network. Patches _call_provider for failover tests to bypass
  tenacity's retry delays.

- test_auth_session.py (7 tests): sign/verify round-trip, tampered
  cookie rejection, expired cookie rejection (via TTL monkeypatch),
  garbage input handling, salt isolation between session and pending
  serializers, and rejection of cookies signed with a different secret.

- test_cadence_policy.py (16 tests): is_active_window weekday/weekend
  + half-open interval boundaries, min_gap_hours across bands,
  should_run gating for first-run / active / off-hours / weekend
  / naive-datetime cases, and the NEWS_POLICY 20-minute / 3-hour
  variations.

Suite goes from 291 to 336 passing.
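
The properties test_auth_session.py exercises (round-trip, tamper
rejection, salt isolation, wrong-secret rejection) can be illustrated
with a stdlib-only stand-in. This is a sketch, not the project's
serializer: sign, verify, _derive_key, and the salt values are
hypothetical names, and TTL/expiry handling is left out.

```python
import hashlib
import hmac
from typing import Optional


def _derive_key(secret: bytes, salt: bytes) -> bytes:
    # Hypothetical scheme: mix the salt into the key so two serializers
    # sharing one secret still cannot verify each other's cookies.
    return hmac.new(secret, salt, hashlib.sha256).digest()


def sign(payload: str, secret: bytes, salt: bytes) -> str:
    mac = hmac.new(_derive_key(secret, salt), payload.encode(), hashlib.sha256)
    return f"{payload}.{mac.hexdigest()}"


def verify(cookie: str, secret: bytes, salt: bytes) -> Optional[str]:
    payload, sep, mac = cookie.rpartition(".")
    if not sep:
        return None  # garbage input: no signature part at all
    expected = hmac.new(
        _derive_key(secret, salt), payload.encode(), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison on bytes; any mismatch is rejected.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return payload
    return None


SECRET = b"test-secret"
cookie = sign("user:42", SECRET, b"session")
```

A cookie signed with the "session" salt verifies only with that same
salt and secret; changing either, or editing the payload, yields None.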
2026-05-28 13:58:28 +02:00
83995e96c8 stripe: detect buyer currency at checkout (GBP/USD/EUR)
Pass `currency` to Stripe checkout for first-time buyers so Stripe
picks the matching `currency_options` rate configured on the Price
in the Dashboard (multi-currency Prices: one Price, per-currency
unit_amount). Operator configures the rates on existing Prices
prod_UaZ0xCpCboUGCN/price_*; this commit is the application-side
signal.

Currency precedence: explicit request body > Cloudflare cf-ipcountry
header > Accept-Language locale > GBP fallback. Only honoured when
the user has no stripe_customer_id yet — Stripe locks currency to
the customer record at first checkout, so existing customers keep
their original currency (they can switch via the portal).

Adds 4 tests: sniffed currency on new customer, body override beats
sniff, currency omitted for existing customer, and unit-tests for
the sniffing fallback chain.
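
The precedence chain above could look roughly like the sketch below.
pick_currency, the country table, and the Accept-Language parsing are
illustrative assumptions, not the code in this commit; only the order
(body > cf-ipcountry > Accept-Language > GBP) and the existing-customer
rule come from the message.

```python
from typing import Optional

SUPPORTED = {"GBP", "USD", "EUR"}
# Hypothetical, deliberately tiny country -> currency table.
COUNTRY_CURRENCY = {"GB": "GBP", "US": "USD", "DE": "EUR", "FR": "EUR", "ES": "EUR"}


def _from_accept_language(header: Optional[str]) -> Optional[str]:
    # Use the region subtag of the first listed locale, e.g. "en-US" -> "US".
    if not header:
        return None
    first = header.split(",")[0].split(";")[0].strip()
    _, _, region = first.partition("-")
    return COUNTRY_CURRENCY.get(region.upper())


def pick_currency(
    body_currency: Optional[str],
    cf_ipcountry: Optional[str],
    accept_language: Optional[str],
    has_stripe_customer: bool,
) -> Optional[str]:
    if has_stripe_customer:
        # Existing customers keep their original currency: send nothing.
        return None
    if body_currency and body_currency.upper() in SUPPORTED:
        return body_currency.upper()
    sniffed = COUNTRY_CURRENCY.get((cf_ipcountry or "").upper())
    return sniffed or _from_accept_language(accept_language) or "GBP"
```

Returning None for existing customers lets the caller omit the
currency parameter entirely rather than send a value Stripe would
reject.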
2026-05-28 12:42:40 +02:00
c5fb4525f3 jobs: per-row savepoint + aggregate logging in translation fan-out
Previously translate_log_for_active_languages and
translate_summary_for_active_languages added every successful
translation to the session and called session.commit() once at the
end. A single bad row (DB error, constraint violation, encoding
mismatch) rolled back the whole batch — losing all the languages that
had succeeded.

Wrap each row in session.begin_nested() so a per-row failure only
loses that one row. Track succeeded/failed counts and log them at the
end — escalating to error if zero succeeded out of N attempted, so
total failure surfaces in monitoring instead of just N warning lines.
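
The per-row savepoint idea can be sketched with stdlib sqlite3 and
explicit SAVEPOINT statements. This swaps in raw SQL for SQLAlchemy's
session.begin_nested(); save_translations, the table, and the logger
name are hypothetical.

```python
import logging
import sqlite3

log = logging.getLogger("translation_fanout")


def save_translations(conn, rows):
    """Insert each (lang, content) row under its own savepoint."""
    succeeded = failed = 0
    conn.execute("BEGIN")
    for lang, content in rows:
        conn.execute("SAVEPOINT per_row")
        try:
            conn.execute(
                "INSERT INTO translations (lang, content) VALUES (?, ?)",
                (lang, content),
            )
        except sqlite3.Error as exc:
            # Undo only this row; earlier rows in the batch survive.
            conn.execute("ROLLBACK TO SAVEPOINT per_row")
            failed += 1
            log.warning("translation %s failed: %s", lang, exc)
        else:
            succeeded += 1
        finally:
            conn.execute("RELEASE SAVEPOINT per_row")
    conn.execute("COMMIT")
    # Escalate when nothing at all succeeded so total failure stands out.
    level = logging.ERROR if rows and succeeded == 0 else logging.INFO
    log.log(level, "translations: %d succeeded, %d failed", succeeded, failed)
    return succeeded, failed


# isolation_level=None leaves transaction control to the explicit SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE translations (lang TEXT PRIMARY KEY, content TEXT NOT NULL)"
)
result = save_translations(conn, [("de", "Hallo"), ("fr", None), ("es", "Hola")])
```

Here the "fr" row violates NOT NULL and is rolled back on its own,
while "de" and "es" are committed with the rest of the batch.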
2026-05-28 12:37:06 +02:00
7348055d72 llm: estimate cost from tokens when provider omits it
DeepSeek's native API returns prompt_tokens/completion_tokens but not
`usage.cost`. OpenRouter returns both. Result: with DeepSeek-direct as
primary (current default), every LogResult.cost_usd was None — and
every downstream cost ledger row (AICall, StrategicLog,
IndicatorSummary, translation tables) stored None instead of the real
spend.

Added a per-model rate table and fallback computation in _call_provider:
when the upstream omits cost, multiply tokens by the table rates. If the
upstream DOES return cost, keep it (authoritative). Falls back to None
if both the upstream and the table miss.

deepseek-v4-flash rates: $0.07/M input, $0.28/M output (per DeepSeek).
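
A minimal sketch of the fallback described above, assuming a
per-million-token rate table. resolve_cost and RATES_PER_MILLION are
hypothetical names; the real logic lives inside _call_provider.

```python
from typing import Optional

# USD per million tokens as (input, output). The deepseek-v4-flash figures
# come from this commit message; the table shape is an assumption.
RATES_PER_MILLION = {"deepseek-v4-flash": (0.07, 0.28)}


def resolve_cost(
    model: str,
    prompt_tokens: Optional[int],
    completion_tokens: Optional[int],
    upstream_cost: Optional[float] = None,
) -> Optional[float]:
    if upstream_cost is not None:
        return upstream_cost  # upstream figure is authoritative
    rates = RATES_PER_MILLION.get(model)
    if rates is None or prompt_tokens is None or completion_tokens is None:
        return None  # neither the upstream nor the table can price this call
    rate_in, rate_out = rates
    return (prompt_tokens * rate_in + completion_tokens * rate_out) / 1_000_000


estimated = resolve_cost("deepseek-v4-flash", 1_000_000, 500_000)
```

With those rates, one million input tokens plus half a million output
tokens come to roughly $0.07 + $0.14 = $0.21.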
2026-05-28 12:36:55 +02:00
355593c4f7 css: split cassandra.css into per-section files
Splits the 2571-line cassandra.css into ten focused stylesheets:
tokens (palette + fonts), layout (chrome), panels, dashboard,
portfolio, log-chat, auth, settings, news, public. base.html and
public_base.html load only what they need; auth pages (login,
verify, unsubscribe confirm) load tokens + layout + auth.

Brand drift-detection test repointed at tokens.css (where the
palette now lives). 291 tests still pass.
2026-05-28 12:31:29 +02:00
78ce8c8b0d alembic: make migration chain SQLite-compatible (fresh upgrade)
Five existing migrations used op.alter_column / op.create_unique_constraint /
op.drop_constraint / op.create_foreign_key directly on the users + quotes +
quotes_daily tables. SQLite has no native support for those operations and
requires Alembic's batch_alter_table copy-and-rename workaround.

This wasn't noticed until now because the test suite uses
Base.metadata.create_all to materialise schema, not the migration chain
itself; and prod is MariaDB. But running `alembic upgrade head` against
a fresh SQLite database (developer onboarding, CI smoke tests, the
test container's own bootstrap) would fail at 0005.

Fixes:
- alembic/env.py: set render_as_batch=True when the dialect is SQLite.
  This auto-wraps any future autogenerated migration but doesn't
  retroactively rewrite existing op.* calls.
- 0005 (widen quotes.symbol), 0013 (referrals), 0018 (polar webhook),
  0019 (stripe), 0023 (users.lang index + qd_symbol widen) explicitly
  wrap their problematic ops in `with op.batch_alter_table(...) as bop`.

Now `alembic upgrade head` + `alembic downgrade base` round-trip cleanly
on a fresh SQLite database. MariaDB prod behaviour unchanged.
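
For context, the copy-and-rename workaround that batch mode relies on
can be reproduced by hand with stdlib sqlite3. This is a rough
illustration of the idea, not Alembic's implementation; the uq_symbol
constraint and the _tmp_quotes name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (id INTEGER PRIMARY KEY, symbol TEXT NOT NULL)")
conn.execute("INSERT INTO quotes (symbol) VALUES ('AAPL'), ('MSFT')")

# SQLite's ALTER TABLE cannot add a constraint to an existing table.
try:
    conn.execute("ALTER TABLE quotes ADD CONSTRAINT uq_symbol UNIQUE (symbol)")
    direct_alter_ok = True
except sqlite3.OperationalError:
    direct_alter_ok = False

# Workaround: build the new shape, copy the rows, drop the old table, rename.
conn.executescript(
    """
    CREATE TABLE _tmp_quotes (
        id INTEGER PRIMARY KEY,
        symbol TEXT NOT NULL,
        CONSTRAINT uq_symbol UNIQUE (symbol)
    );
    INSERT INTO _tmp_quotes (id, symbol) SELECT id, symbol FROM quotes;
    DROP TABLE quotes;
    ALTER TABLE _tmp_quotes RENAME TO quotes;
    """
)

rows = conn.execute("SELECT symbol FROM quotes ORDER BY id").fetchall()
```

After the rename, the existing rows are intact and a duplicate symbol
is rejected by the new constraint.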

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 00:16:09 +02:00
2b9cd875b4 deps: add requirements.lock for reproducible builds
pyproject.toml uses range pins (>=) for all dependencies; without a
lockfile, a fresh `pip install .` on a different day could pull
materially different versions of fastapi, sqlalchemy, httpx, etc.
For a production-shaped service that's a reproducibility risk —
especially since we don't run a CI pipeline that would catch
"works on yesterday's container, fails on today's."

requirements.lock pins every transitive dep (60 packages) to the
exact versions running in the test container today. Dockerfile is
updated so both stages install from the lockfile first, then install
the project itself with --no-deps:

  pip install -r requirements.lock
  pip install --no-deps .

That way pyproject.toml's range pins document the versions we
consider compatible, but the lockfile is what actually gets
installed on every build.

To bump deps later: bump pyproject.toml ranges, rebuild a fresh
venv, `pip freeze` it, save back to requirements.lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 00:07:38 +02:00
f9d448d57b Revert "i18n: add diagnostic logging to localizer + lang-toggle click path"
This reverts commit 74b61a59ed.
2026-05-27 23:55:59 +02:00
74b61a59ed i18n: add diagnostic logging to localizer + lang-toggle click path
2026-05-27 23:37:11 +02:00
833d1775ab routers: extract chat + ops from api.py
api.py was 933 lines mixing three distinct concerns: indicators +
news + strategic log (the JSON/HTMX API proper), the chat endpoint
+ its three private helpers (~200 lines), and the two HTML-only ops
endpoints /markets-bar + /health (~150 lines).

Extracted:
- app/routers/chat.py — POST /api/chat + _latest_quotes_by_group_chat,
  _thesis_headlines_for_chat, _month_spend
- app/routers/ops.py — GET /api/markets-bar + GET /api/health +
  _fmt_price helper

Both new routers use the same dependencies=[Depends(require_token)]
as api.py and are mounted at the /api prefix in app/main.py.
URL surface is byte-identical with no externally-visible change.

api.py shrinks to ~620 lines focused on indicators+news+log+settings.

Helpers shared with the original api.py (_md_to_html, _resolve_tone_param)
are imported from app.routers.api where needed in chat.py to avoid
duplication.

Also updated tests/test_chat_and_log_gates.py to mount chat_router
in its local test app, since /api/chat now lives there.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:43:17 +02:00
b055eea1c2 email: split digest renderer to digest_email.py
email_service.py was 428 lines covering three different concerns:
SMTP transport, OTP/welcome rendering (tightly coupled — same brand
template + theme), and digest rendering (a totally different shape
of email, different layout, different copy cadence). The two halves
changed at different cadences and made the file noisy to navigate.

Extracted render_digest_email + _DIGEST_HTML_TEMPLATE +
_strip_html_to_text to app/services/digest_email.py. SMTP transport
and the OTP/welcome surface stay in email_service.py.

Import sites updated: email_digest_job and test_email_render now
import render_digest_email from digest_email. The OTP/welcome
import sites (auth router, branding tests, test_email_service) are
untouched.

No behaviour change — pure relocation. Templates byte-identical.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:33:06 +02:00
4adc8dfe82 openrouter: split into llm_prompts (prompt engineering) + transport
openrouter.py was 790 lines mixing two orthogonal concerns:
- Prompt engineering (build_system_prompt, build_summary_*,
  build_chat_*, build_daily_digest_*, etc.) — ~400 lines, changes
  weekly as PROMPT_VERSION bumps
- LLM transport (call_llm, _provider_chain, _call_provider, retry
  + fallback machinery) — ~250 lines, rarely changes

Extracted the prompt-engineering surface to app/services/llm_prompts.py.
Transport stays in openrouter.py (consistent with the filename — the
OpenRouter URL is the transport's anchor).

All import sites (jobs, routers, services, tests) split their
multi-import lines into two: prompt-things from llm_prompts, transport
from openrouter. PROMPT_VERSION constant, _TONE_ALIASES, _resolve_tone,
and SYSTEM_PROMPT moved with the prompt functions.

No behaviour change — pure relocation. Function signatures, body, and
naming all preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:27:23 +02:00
a6d686324c models: align translation column naming + add token counts
Three recently-added tables (strategic_log_translations,
indicator_summary_translations, csv_format_templates) drifted from
the codebase's existing naming convention:
- llm_model -> model
- llm_cost_usd -> cost_usd
- content_md -> content  (on the two translation tables; csv_format
  doesn't have a content field)

Also added prompt_tokens and completion_tokens to the three tables;
they were silently dropped at write time despite LogResult exposing
them.

All writer call sites (ai_log_job, indicator_summary_job,
llm_csv_parser) and reader call sites (api.py localized helpers)
updated to match. Tests realigned.

Migration 0025 uses batch_alter_table for SQLite compatibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:18:29 +02:00
e4dc6d0071 i18n: instant lang switch via HTMX trigger + refresh paid-plans terms
2026-05-27 21:02:03 +02:00
f4d9c9f2ec settings: extract sync + import widget JS to static files
The two largest inline <script> blocks in settings.html — the cloud
sync modal/management UI (~145 lines) and the import widget wiring
(~245 lines) — moved to app/static/js/settings-sync.js and
settings-import.js respectively, included via <script src="..."
defer> at the bottom of the template.

Where the inline code referenced Jinja vars or {% if %} guards,
those values are now passed via data-* attributes on the relevant
DOM elements (or via window.cassandra* config objects for structured
data) and read in the static JS.

Smaller blocks (Stripe portal, digest prefs, language select,
invite copy) stay inline — each <40 lines and easier to follow
next to their markup. settings.html drops from 758 lines to roughly
half that.
2026-05-27 20:55:49 +02:00
dcc2c07111 tests: extract _build_session_factory to a shared conftest fixture
The same per-test sqlite-engine setup was duplicated across 14 test
files (~30 lines each). Consolidated into a single async fixture
`db_factory` in tests/conftest.py; tests now take db_factory as a
parameter and use `async with db_factory() as session` directly.

No behaviour change — same function-scope, same in-memory schema
created via Base.metadata.create_all, same app.db._engine /
_session_factory rebinding so module-level helpers see the test
engine. Just ~420 lines of boilerplate removed.
2026-05-27 20:50:09 +02:00
b13caa4c51 ui: name the theme-toggle handler instead of an inline IIFE
The theme toggle's onclick attribute held a 140-character inline
IIFE that was hard to read amongst the other named-function
handlers in the same header. Promoted it to cassandraToggleTheme()
alongside cassandraSetTone / cassandraSetLang.
2026-05-27 20:41:31 +02:00