phase G: data minimisation + passwordless auth + DeepSeek-first LLM

Server no longer holds portfolios. Holdings live in the browser
(localStorage); the server publishes an anonymous ticker_universe and a
gzipped /api/universe payload identical for every authenticated user, so
access patterns can't betray which tickers a user holds. AI commentary
is generated ephemerally from the browser-supplied pie and the cost
ledger row records no positions. Migrations 0009-0011 added the
universe table and dropped positions / portfolio_snapshots /
portfolios.

Authentication is now e-mail OTP only. Migration 0010 dropped
password_hash and email_verified (every active session is by
construction proof of email control). The /signup endpoint is gone;
signup and login share a single email-entry page. Email rendering is
HTML+plain-text multipart with a shared brand palette (app/branding.py)
asserted in sync with the CSS by a drift-detection test.
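
A drift-detection test of this kind might look roughly like the sketch below. The palette keys, the hex values, the `--name` custom-property convention and the inline CSS string are all assumptions for illustration; the real app/branding.py and stylesheet are not shown in this view.

```python
from __future__ import annotations

import re

# Hypothetical stand-ins for app/branding.py and the stylesheet contents.
LIGHT = {"bg": "#f7f5f0", "accent": "#b4530a"}
CSS = """
:root {
  --bg: #f7f5f0;
  --accent: #b4530a;
}
"""


def css_custom_properties(css: str) -> dict[str, str]:
    """Extract `--name: value;` declarations from a CSS string."""
    return {
        name: value.strip().lower()
        for name, value in re.findall(r"--([\w-]+)\s*:\s*([^;]+);", css)
    }


def palette_drift(palette: dict[str, str], css: str) -> dict[str, tuple[str, str | None]]:
    """Return {key: (python_value, css_value)} for every key that disagrees."""
    props = css_custom_properties(css)
    return {
        key: (value, props.get(key))
        for key, value in palette.items()
        if props.get(key) != value.lower()
    }


def test_palette_matches_css() -> None:
    # An empty dict means the Python palette and the CSS agree.
    assert palette_drift(LIGHT, CSS) == {}
```

Returning the mismatches (rather than a bare boolean) lets a failing test name the key that drifted.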

LLM provider defaults to DeepSeek-direct (cheaper, api.deepseek.com)
with OpenRouter as automatic fallback if DeepSeek fails. ai_log_job and
indicator_summary_job now iterate the two tones (NOVICE, INTERMEDIATE)
per cycle so the dashboard's tone toggle is instant; PROMPT_VERSION
bumped to 6 with an educational anti-TA / anti-gambling stance baked
into _CORE. NOVICE mode renders a curated glossary inline (CBOE VIX,
yield curve, HY OAS, etc.) with JS-positioned tooltips that survive
viewport edges and sticky bars. Model name and tokens hidden from the
user UI; still recorded in StrategicLog.model and AICall for admin.

Layout adds a sticky top nav and a sticky bottom markets bar (one chip per
exchange with status LED + headline index + 1d change). Phase H feedback
reporting is queued in tasks/todo.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Giorgio Gilestro 2026-05-18 14:16:57 +01:00
parent 480fd311c5
commit 6e7f57c6b2
54 changed files with 5005 additions and 916 deletions


@ -1,16 +1,17 @@
"""User authentication primitives: password hashing, signup, login.
"""User authentication primitives.
Argon2id for password hashing (argon2-cffi). itsdangerous for signed
session cookies. Tier-aware user creation; phase D adds the actual
tier-based feature gating.
Cassandra is **passwordless**. Every login is an email-OTP round-trip
(see app.services.otp_service + app.services.email_service). This module
just handles user-row lookup and create-on-first-sight.
The trade-off (see Phase G plan in tasks/todo.md):
- Server holds: email, tier, AI cost ledger. No portfolio, no broker keys.
- Dropping passwords gives up nothing worth protecting; in return there are
no password-reset flows, no hash rotation, and no stuffing/breach exposure.
- Every successful session is by construction proof of email control.
"""
from __future__ import annotations
import re
from dataclasses import dataclass
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError, InvalidHashError
from email_validator import EmailNotValidError, validate_email
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
@ -19,112 +20,52 @@ from app.db import utcnow
from app.models import User
# Argon2 default parameters are sensible; we let it pick.
_HASHER = PasswordHasher()
# Reasonable floor. Real password policy lives in Phase E.
MIN_PASSWORD_LENGTH = 8
MAX_PASSWORD_LENGTH = 256
class AuthError(Exception):
"""Raised when signup/login validation fails. The message is safe to
surface to the user as-is."""
def hash_password(plain: str) -> str:
return _HASHER.hash(plain)
def verify_password(plain: str, hashed: str) -> bool:
try:
_HASHER.verify(hashed, plain)
return True
except (VerifyMismatchError, InvalidHashError):
return False
except Exception:
return False
"""Raised on bad input. The message is safe to surface to the user."""
def _validate_email_or_raise(email: str) -> str:
try:
info = validate_email(email, check_deliverability=False)
return info.normalized
return info.normalized.lower()
except EmailNotValidError as e:
raise AuthError(f"Invalid email: {e}")
def _validate_password_or_raise(password: str) -> None:
if not isinstance(password, str):
raise AuthError("Password must be a string")
if len(password) < MIN_PASSWORD_LENGTH:
raise AuthError(
f"Password must be at least {MIN_PASSWORD_LENGTH} characters"
)
if len(password) > MAX_PASSWORD_LENGTH:
raise AuthError("Password too long")
async def create_user(
session: AsyncSession,
email: str,
password: str,
*,
tier: str = "free",
) -> User:
"""Create a new user. Raises AuthError on bad input or duplicate email."""
email = _validate_email_or_raise(email).lower()
_validate_password_or_raise(password)
existing = (await session.execute(
select(User).where(User.email == email)
)).scalar_one_or_none()
if existing:
raise AuthError("An account with this email already exists")
user = User(
email=email,
password_hash=hash_password(password),
tier=tier,
email_verified=False, # phase E enforces verification
settings_json={},
created_at=utcnow(),
)
session.add(user)
await session.commit()
await session.refresh(user)
return user
async def authenticate(
session: AsyncSession,
email: str,
password: str,
) -> User:
"""Return the User if credentials match. Raises AuthError on miss.
Uses the same generic message for unknown-email and wrong-password to
avoid a username-enumeration oracle."""
email = email.strip().lower()
user = (await session.execute(
select(User).where(User.email == email)
)).scalar_one_or_none()
# Always run a hash verification even on unknown-email to keep timing
# similar (mitigates timing-based user enumeration).
if user is None:
verify_password(password, "$argon2id$v=19$m=65536,t=3,p=4$" + "a" * 22 + "$" + "b" * 43)
raise AuthError("Invalid email or password")
if not verify_password(password, user.password_hash):
raise AuthError("Invalid email or password")
user.last_login_at = utcnow()
await session.commit()
return user
async def get_user(session: AsyncSession, user_id: int) -> User | None:
return (await session.execute(
select(User).where(User.id == user_id)
)).scalar_one_or_none()
async def get_user_by_email(session: AsyncSession, email: str) -> User | None:
email = email.strip().lower()
return (await session.execute(
select(User).where(User.email == email)
)).scalar_one_or_none()
async def get_or_create_user(
session: AsyncSession,
email: str,
*,
create_if_missing: bool = True,
tier: str = "free",
) -> User:
"""Look up the user by email; create if absent and create_if_missing.
Raises AuthError on malformed email, or if create_if_missing=False
and the email is unknown.
Callers should set create_if_missing=False when CASSANDRA_SIGNUP_ENABLED
is false, i.e. the operator is running a closed deployment."""
email = _validate_email_or_raise(email)
user = await get_user_by_email(session, email)
if user is not None:
return user
if not create_if_missing:
raise AuthError("Sign-ups are currently disabled. Ask the operator.")
user = User(email=email, tier=tier, settings_json={}, created_at=utcnow())
session.add(user)
await session.commit()
await session.refresh(user)
return user
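
The module above defers the actual code handling to app.services.otp_service, which is not shown here. As a rough, assumption-laden sketch of what an email-OTP round-trip involves (the names, the 6-digit length, the 10-minute TTL and the in-memory store are all hypothetical):

```python
from __future__ import annotations

import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass

OTP_DIGITS = 6          # assumption: not taken from the real service
OTP_TTL_SECONDS = 600   # assumption: 10 minutes


@dataclass
class PendingOtp:
    code_hash: str
    expires_at: float


def _hash(email: str, code: str) -> str:
    # Stored so the raw code is not kept verbatim; a short numeric code
    # gains little offline protection from hashing alone.
    return hashlib.sha256(f"{email}:{code}".encode()).hexdigest()


def issue_otp(store: dict[str, PendingOtp], email: str, now: float | None = None) -> str:
    """Create a fresh zero-padded numeric code and remember its hash."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10 ** OTP_DIGITS):0{OTP_DIGITS}d}"
    store[email] = PendingOtp(_hash(email, code), now + OTP_TTL_SECONDS)
    return code


def verify_otp(store: dict[str, PendingOtp], email: str, code: str, now: float | None = None) -> bool:
    """Single use: the pending entry is consumed by any attempt."""
    now = time.time() if now is None else now
    pending = store.pop(email, None)
    if pending is None or now > pending.expires_at:
        return False
    return hmac.compare_digest(pending.code_hash, _hash(email, code))
```

In this sketch a successful `verify_otp` would be followed by `get_or_create_user`, which is what makes every session proof of email control. Consuming the entry on a wrong guess is a deliberate simplification that also limits brute-force attempts to one per issued code.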


@ -1,19 +1,15 @@
"""Defensive parser for Trading 212 pie-export CSVs + writer that persists
the parsed pie into PortfolioSnapshot/Position rows.
"""Defensive parser for Trading 212 pie-export CSVs.
The parser is pure: no DB, no HTTP, no I/O. The writer (`persist_pie`)
takes a ParsedPie and resolves each position's Slice via InstrumentMap
to find its Yahoo ticker + canonical name before persisting.
The parser is pure: no DB, no HTTP, no I/O. Returns a ParsedPie that
`/api/portfolio/parse` ships to the browser; in Phase G the browser
keeps the pie in localStorage and the server keeps only the anonymous
ticker_universe.
"""
from __future__ import annotations
import csv
import io
from dataclasses import dataclass
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from sqlalchemy.ext.asyncio import AsyncSession
class CSVImportError(ValueError):
@ -200,96 +196,7 @@ def parse_t212_csv(content: str | bytes) -> ParsedPie:
)
# --- Persist parsed pie into portfolio/snapshot/positions -------------------
@dataclass
class PersistResult:
portfolio_id: int
snapshot_id: int
positions_written: int
unmapped_slices: list[str] # slices we couldn't resolve to a Yahoo ticker
portfolio_name: str
is_new_portfolio: bool
async def persist_pie(
session: "AsyncSession",
pie: ParsedPie,
*,
portfolio_name: str | None = None,
source: str = "t212-csv",
currency: str = "GBP",
) -> PersistResult:
"""Write a ParsedPie into Portfolio/PortfolioSnapshot/Position.
- Portfolio is created on first sight of a given name; subsequent uploads
stack as new snapshots under the same portfolio.
- Each position's Slice is resolved to a T212 ticker + name via the
InstrumentMap. Unmapped slices still get stored using their raw CSV
values; we collect them in `unmapped_slices` for the UI to surface.
"""
# Late imports keep this module dependency-light for unit tests.
from sqlalchemy import select
from app.db import utcnow
from app.models import Portfolio, PortfolioSnapshot, Position
from app.services.instrument_map import resolve_slice
name = portfolio_name or pie.name or "Imported pie"
name = name.strip()[:64]
portfolio = (await session.execute(
select(Portfolio).where(Portfolio.name == name)
)).scalar_one_or_none()
is_new = portfolio is None
if portfolio is None:
portfolio = Portfolio(name=name, source=source, currency=currency)
session.add(portfolio)
await session.flush()
snap = PortfolioSnapshot(
portfolio_id=portfolio.id,
snapshot_at=utcnow(),
total_value=pie.value,
cash=None,
invested=pie.invested,
raw_json={
"source": source,
"pie_name": pie.name,
"result": pie.result,
},
)
session.add(snap)
await session.flush()
unmapped: list[str] = []
for p in pie.positions:
resolved = await resolve_slice(session, p.slice)
if resolved and resolved.t212_ticker:
ticker = resolved.t212_ticker
position_name = resolved.name or p.name
else:
ticker = p.slice
position_name = p.name
unmapped.append(p.slice)
session.add(Position(
snapshot_id=snap.id,
ticker=ticker,
name=position_name[:128] if position_name else None,
quantity=p.quantity,
average_price=p.average_price,
current_price=p.current_price,
ppl=p.result,
))
await session.commit()
return PersistResult(
portfolio_id=portfolio.id,
snapshot_id=snap.id,
positions_written=len(pie.positions),
unmapped_slices=unmapped,
portfolio_name=name,
is_new_portfolio=is_new,
)
# persist_pie removed in Phase G — the parsed pie is returned to the
# browser by /api/portfolio/parse and lives in localStorage. The server
# now keeps only the anonymous ticker_universe (see
# app/services/ticker_universe.py).


@ -0,0 +1,191 @@
"""SMTP-backed transactional email.
Sends multipart/alternative: a plain-text body for accessibility / minimal
clients and an HTML body for richer rendering. Designed for cross-client
robustness:
- Inline styles on every element (Outlook desktop ignores <style> blocks).
- `<style>` block in <head> only carries the prefers-color-scheme media
query for clients that respect it (Apple Mail, Gmail web, Outlook.com).
- Light background by default; dark backgrounds are inconsistently
rendered across clients.
- 520-px max width, monospace stack with fallbacks, no external assets
(no remote images, no web fonts), so the email opens identically with
network off.
When SMTP_SERVER is empty, falls back to writing the payload to stdout,
which is convenient for local dev that doesn't want a mail server configured.
"""
from __future__ import annotations
from email.message import EmailMessage
import aiosmtplib
from app import branding
from app.config import get_settings
from app.logging import get_logger
log = get_logger("email")
class EmailSendError(RuntimeError):
"""Raised when SMTP submission fails. The caller should surface a
generic error to the user rather than the SMTP details."""
async def send_email(
to: str,
subject: str,
text_body: str,
html_body: str | None = None,
) -> None:
"""Send a (potentially multipart) email. `text_body` is required —
it's the fallback for clients that can't or won't render HTML."""
s = get_settings()
sender = s.SMTP_FROM or s.SMTP_USER or "noreply@cassandra.local"
msg = EmailMessage()
msg["From"] = sender
msg["To"] = to
msg["Subject"] = subject
msg.set_content(text_body)
if html_body:
msg.add_alternative(html_body, subtype="html")
if not s.SMTP_SERVER:
log.info(
"email.stdout_fallback", to=to, subject=subject,
text_preview=text_body[:120],
multipart=bool(html_body),
)
return
try:
await aiosmtplib.send(
msg,
hostname=s.SMTP_SERVER,
port=s.SMTP_PORT,
username=s.SMTP_USER or None,
password=s.SMTP_PASSWORD or None,
start_tls=s.SMTP_USE_TLS,
timeout=20,
)
log.info("email.sent", to=to, subject=subject)
except Exception as e:
log.error("email.send_failed", to=to, error=str(e)[:200])
raise EmailSendError(f"Failed to send email: {e}") from e
# ---------------------------------------------------------------------------
# OTP email rendering
# ---------------------------------------------------------------------------
_OTP_HTML_TEMPLATE = """\
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="color-scheme" content="light dark">
<meta name="supported-color-schemes" content="light dark">
<title>Your Cassandra sign-in code</title>
<style>
@media (prefers-color-scheme: dark) {{
body {{ background:{D_bg} !important; }}
.card {{ background:{D_surface} !important; border-color:{D_border} !important; }}
.h1 {{ color:{D_text} !important; }}
.muted {{ color:{D_muted} !important; }}
.lead {{ color:{D_text} !important; }}
.code {{ background:{D_bg} !important; color:{D_accent} !important;
border-color:{D_accent} !important; }}
.divider {{ border-color:{D_border} !important; }}
}}
@media (max-width: 540px) {{
.card {{ padding:24px 18px !important; }}
.code {{ font-size:26px !important; letter-spacing:0.3em !important; }}
}}
</style>
</head>
<body style="margin:0; padding:24px 12px; background:{L_bg}; font-family:{FONT_MONO}; color:{L_text}; -webkit-font-smoothing:antialiased;">
<div style="display:none; max-height:0; overflow:hidden; mso-hide:all; font-size:1px; line-height:1px; color:{L_bg};">
Your Cassandra sign-in code {code} expires in {ttl_minutes} minutes.
</div>
<table role="presentation" cellpadding="0" cellspacing="0" border="0" align="center" width="100%" style="max-width:520px; margin:0 auto; border-collapse:separate;">
<tr><td class="card" style="background:{L_surface}; border:1px solid {L_border}; padding:32px 28px;">
<div class="muted" style="font-size:11px; letter-spacing:0.32em; color:{L_muted}; text-transform:uppercase;">
&#9648;&nbsp;CASSANDRA
</div>
<div style="height:22px; line-height:22px; font-size:0;">&nbsp;</div>
<div class="h1" style="font-size:17px; font-weight:normal; color:{L_text}; letter-spacing:0.02em;">
Your sign-in code
</div>
<div style="height:16px; line-height:16px; font-size:0;">&nbsp;</div>
<div class="code" style="font-family:{FONT_MONO}; font-size:32px; letter-spacing:0.4em; text-align:center; padding:20px 12px; border:1px solid {L_accent}; background:{L_surface}; color:{L_accent}; user-select:all;">
{code}
</div>
<div style="height:18px; line-height:18px; font-size:0;">&nbsp;</div>
<div class="lead muted" style="font-size:13px; color:{L_muted}; line-height:1.6;">
This code expires in <span style="color:{L_text};">{ttl_minutes} minutes</span>.
If you didn&rsquo;t request it, you can safely ignore this email &mdash; no changes
will be made to any account.
</div>
<div style="height:26px; line-height:26px; font-size:0;">&nbsp;</div>
<div class="divider" style="border-top:1px solid {L_border};"></div>
<div style="height:14px; line-height:14px; font-size:0;">&nbsp;</div>
<div class="muted" style="font-size:10.5px; color:{L_dim}; letter-spacing:0.06em;">
Sent automatically by Cassandra &middot; do not reply
</div>
</td></tr>
</table>
</body>
</html>
"""
def _html_template_filled(code: str, ttl_minutes: int) -> str:
"""Substitute palette + content into the OTP HTML template."""
return _OTP_HTML_TEMPLATE.format(
code=code,
ttl_minutes=ttl_minutes,
FONT_MONO=branding.FONT_MONO,
**{f"L_{k.replace('-', '_')}": v for k, v in branding.LIGHT.items()},
**{f"D_{k.replace('-', '_')}": v for k, v in branding.DARK.items()},
)
_OTP_TEXT_TEMPLATE = """\
CASSANDRA sign in
Your verification code:
{code}
This code expires in {ttl_minutes} minutes.
If you didn't request it, you can safely ignore this email — no changes
will be made to any account.
Sent automatically by Cassandra · do not reply
"""
def render_otp_email(code: str, ttl_minutes: int) -> tuple[str, str, str]:
"""Returns (subject, text_body, html_body).
Subject embeds the code so users can read it directly from the inbox
list without opening the message, a common practice for OTP emails
(Notion, Substack). The lock-screen exposure tradeoff is minimal:
anyone with phone access who could see the notification could also
open the email."""
subject = f"Cassandra sign-in: {code}"
text = _OTP_TEXT_TEMPLATE.format(code=code, ttl_minutes=ttl_minutes)
html = _html_template_filled(code=code, ttl_minutes=ttl_minutes)
return subject, text, html
async def send_otp(to: str, code: str, ttl_minutes: int) -> None:
subject, text, html = render_otp_email(code, ttl_minutes)
await send_email(to, subject, text, html_body=html)
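
The multipart/alternative structure that `send_email` builds can be inspected with the standard library alone. The sketch below repeats only the message-construction step; `build_message` and the example addresses are illustrative names, not part of the module above.

```python
from __future__ import annotations

from email.message import EmailMessage


def build_message(
    sender: str,
    to: str,
    subject: str,
    text_body: str,
    html_body: str | None = None,
) -> EmailMessage:
    """Plain-text body first; the HTML body is added as an alternative."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(text_body)  # becomes the text/plain part
    if html_body:
        # Converts the message to multipart/alternative and appends text/html.
        msg.add_alternative(html_body, subtype="html")
    return msg


msg = build_message(
    "noreply@example.com",
    "user@example.com",
    "Cassandra sign-in: 123456",
    "Your verification code: 123456\n",
    "<p>Your verification code: <b>123456</b></p>",
)
print(msg.get_content_type())                            # multipart/alternative
print([p.get_content_type() for p in msg.iter_parts()])  # ['text/plain', 'text/html']
```

Part order matters: clients that understand multipart/alternative prefer the last part they can render, so the plain-text body goes first and the HTML body last.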

app/services/fx.py (new file, 106 lines)

@ -0,0 +1,106 @@
"""FX rate fetcher with Redis-backed cache.
The universe endpoint returns prices in each ticker's *local* currency
(USD for NYSE, EUR for Paris, GBP for LSE-after-pence-normalisation,
etc.). The browser needs FX rates to convert these into the pie's base
currency for P/L computation.
Rates are expressed against a USD pivot: `fx[CCY]` = "how many CCY for
1 USD". USD itself is always 1.0. To convert X-currency value to
Y-currency: `value_y = value_x * fx[Y] / fx[X]`.
Yahoo's `=X` symbols give the right shape: `USDGBP=X` returns GBP per
USD. Rates are cached in Redis for 1 hour (FX doesn't move much for
display-purpose P/L; intraday moves are noise at the second decimal).
"""
from __future__ import annotations
import json
from typing import Iterable
import httpx
from app.logging import get_logger
from app.redis_client import get_redis
from app.services.market import fetch_yahoo
log = get_logger("fx")
_CACHE_KEY = "fx:rates:v1"
_CACHE_TTL_SECONDS = 3600 # 1 hour
# Synonyms / shorthand currencies that should resolve to a canonical
# code before lookup. "GBp" (pence) is normalised to GBP at the
# universe endpoint, but we still set up the mapping defensively.
_CANONICALISE = {
"GBP.": "GBP",
"GBX": "GBP",
"GBp": "GBP",
}
def _canonical(ccy: str) -> str:
return _CANONICALISE.get(ccy, ccy)
async def _fetch_one(client: httpx.AsyncClient, ccy: str) -> float | None:
"""Yahoo: `USD<ccy>=X` returns units of <ccy> per 1 USD."""
q = await fetch_yahoo(client, f"USD{ccy}=X", ccy, "")
if q.price is None or q.price <= 0:
return None
return float(q.price)
async def get_rates(currencies: Iterable[str]) -> dict[str, float]:
"""Return `{ccy: units-per-USD}` for every currency requested.
USD is always 1.0. Unknown / fetch-failed currencies are omitted
rather than poisoned; callers must check membership before
converting (browser falls back to "no conversion" for missing
pairs, which keeps the panel readable even when FX is degraded).
Cached in Redis for 1 hour; live fetches happen only on cache miss
or when the cached set doesn't cover all needed currencies."""
wanted = {_canonical(c) for c in currencies if c}
wanted.add("USD") # pivot — always present
r = get_redis()
cached_raw = await r.get(_CACHE_KEY)
cached: dict[str, float] = {}
if cached_raw:
try:
cached = json.loads(cached_raw)
except Exception:
cached = {}
missing = wanted - set(cached.keys())
if not missing:
return {c: cached[c] for c in wanted}
# Fetch any missing rates in parallel. Keep the existing cache to
# avoid re-fetching unchanged currencies.
rates = dict(cached)
rates["USD"] = 1.0
fetch_list = [c for c in missing if c != "USD"]
if fetch_list:
async with httpx.AsyncClient(follow_redirects=True, timeout=15) as client:
import asyncio
results = await asyncio.gather(
*(_fetch_one(client, c) for c in fetch_list),
return_exceptions=True,
)
for c, val in zip(fetch_list, results):
if isinstance(val, Exception):
log.warning("fx.fetch_failed", ccy=c, error=str(val)[:120])
continue
if val is not None:
rates[c] = val
# Persist (merged) cache.
await r.set(_CACHE_KEY, json.dumps(rates), ex=_CACHE_TTL_SECONDS)
log.info("fx.cache_refreshed", count=len(rates))
return {c: rates[c] for c in wanted if c in rates}
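
The USD-pivot formula from the module docstring, `value_y = value_x * fx[Y] / fx[X]`, can be exercised with a small sketch. The real conversion happens in the browser; the `convert` helper and the rates below are made up for illustration.

```python
from __future__ import annotations


def convert(value: float, src: str, dst: str, fx: dict[str, float]) -> float | None:
    """Convert `value` from `src` to `dst` using units-per-USD rates.

    Returns None when either rate is missing, mirroring the rule that
    callers must check membership before converting.
    """
    if src == dst:
        return value
    if src not in fx or dst not in fx:
        return None
    # value / fx[src] is the USD amount; multiplying by fx[dst] gives dst units.
    return value * fx[dst] / fx[src]


# Made-up rates: 1 USD buys 0.8 GBP or 0.9 EUR.
rates = {"USD": 1.0, "GBP": 0.8, "EUR": 0.9}

gbp_to_eur = convert(100.0, "GBP", "EUR", rates)  # roughly 112.5 EUR
usd_to_gbp = convert(50.0, "USD", "GBP", rates)   # roughly 40 GBP
unknown = convert(10.0, "JPY", "GBP", rates)      # None: JPY rate is absent
```

With a single pivot, N currencies need only N rates rather than a rate per pair, which is why the cache is a flat `{ccy: units-per-USD}` mapping.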

app/services/glossary.py (new file, 443 lines)

@ -0,0 +1,443 @@
"""Novice-mode glossary: terms commonly used in macro market commentary,
each paired with a plain-language definition.
Applied via `wrap_glossary(html, tone=...)` in the AI-content rendering path
on the API side. Only NOVICE-tone responses get the wrapping; INTERMEDIATE
users see plain text.
The wrap markup is:
<span class="glossary" data-term="..." data-def="..." tabindex="0">VIX</span>
No `title` attribute is emitted (see `_make_span`): it would render a second
native tooltip alongside the JS-driven one. The definition travels in
`data-def`; touch devices get a tap-to-toggle path from the JS handler in
base.html. Wrapping happens at most once per term per HTML fragment;
repeated occurrences stay plain.
"""
from __future__ import annotations
import html as _html
import re
from dataclasses import dataclass
@dataclass(frozen=True)
class Term:
"""One glossary entry.
`aliases`: alternate forms that should also match (case-insensitive
unless the term is acronym-style, see `case_sensitive`).
`case_sensitive`: when True, the regex matches case-sensitively. Used
for acronyms like VIX, ERP, DXY where lowercase matches would
catch common words.
"""
label: str
definition: str
aliases: tuple[str, ...] = ()
case_sensitive: bool = False
# Curated for macro reads aimed at young investors. Keep definitions
# under ~30 words each — they have to fit in a tooltip.
TERMS: tuple[Term, ...] = (
Term(
"VIX",
"The CBOE Volatility Index. Tracks the market's expected 30-day "
"volatility of the S&P 500 — often called the 'fear gauge'. High "
"VIX = traders pricing in big moves; low VIX = calm complacency.",
case_sensitive=True,
),
Term(
"yield curve",
"A chart of US (or any government's) borrowing costs across "
"maturities — 2-year, 5-year, 10-year, etc. Its shape signals "
"what markets expect from growth and interest rates.",
),
Term(
"inverted yield curve",
"When short-term yields exceed long-term yields. Historically one "
"of the most reliable recession warning signals — it means "
"markets expect rates to be cut in the future.",
),
Term(
"basis point",
"One hundredth of a percent. 100bp = 1%. Markets quote rate "
"changes in basis points so '25bp hike' = a 0.25% rate increase.",
aliases=("basis points", "bp", "bps", "bps."),
),
Term(
"ERP",
"Equity risk premium — the extra return investors demand for "
"owning stocks instead of risk-free Treasuries. Low ERP = stocks "
"look expensive vs. bonds; high ERP = the opposite.",
aliases=("equity risk premium",),
case_sensitive=True,
),
Term(
"HY OAS",
"High-yield option-adjusted spread — the extra yield junk bonds "
"pay over Treasuries. Rising HY OAS = credit markets worried; "
"falling = complacency. A key risk gauge.",
aliases=("high-yield OAS", "high yield OAS", "high-yield spread", "credit spread"),
case_sensitive=True,
),
Term(
"CPI",
"Consumer Price Index — the headline inflation measure. Tracks "
"the average price change of a basket of goods households buy. "
"Released monthly; markets watch it for Fed-rate implications.",
case_sensitive=True,
),
Term(
"breakeven",
"Inflation breakeven — the difference between a regular Treasury "
"yield and an inflation-protected one. Markets' implied inflation "
"expectation for that horizon. Watched as a forward inflation read.",
aliases=("breakevens", "inflation breakeven"),
),
Term(
"duration",
"How sensitive a bond's price is to rate changes. A 10-year "
"duration means roughly a 10% price drop for every 1% rate "
"rise. Long-duration assets get hurt most by rate hikes.",
),
Term(
"Fed",
"The US Federal Reserve — the central bank that sets US interest "
"rates and provides dollar liquidity. Its rate decisions ripple "
"through every asset class globally.",
aliases=("Federal Reserve",),
case_sensitive=True,
),
Term(
"FOMC",
"Federal Open Market Committee — the Fed's rate-setting body. "
"Meets ~8 times a year; its statements and the chair's press "
"conference move markets reliably.",
case_sensitive=True,
),
Term(
"ECB",
"European Central Bank — the euro area's Fed-equivalent. Sets "
"rates for 20 countries; its decisions matter for EUR, bunds, "
"and European banks.",
case_sensitive=True,
),
Term(
"BOJ",
"Bank of Japan — Japan's central bank, the last major holdout of "
"near-zero rates. Its policy shifts move USD/JPY, global "
"carry trades, and long-end yields worldwide.",
case_sensitive=True,
),
Term(
"DXY",
"The Dollar Index — the USD's value against a basket of major "
"currencies (mostly EUR, JPY, GBP). Rising DXY squeezes dollar-"
"denominated debt and pressures commodities.",
aliases=("dollar index",),
case_sensitive=True,
),
Term(
"Brent",
"The international benchmark for crude oil, priced from "
"North Sea fields. Sets the price most of the world's oil "
"tracks. Compare to WTI (the US benchmark).",
case_sensitive=True,
),
Term(
"WTI",
"West Texas Intermediate — the US crude oil benchmark. Priced "
"out of Cushing, Oklahoma. Usually trades a few dollars below "
"Brent because of where it's delivered.",
case_sensitive=True,
),
Term(
"soft landing",
"The Fed's hoped-for outcome: cooling inflation without triggering "
"a recession. Historically rare — most rate-hike cycles end in "
"downturn, not gentle deceleration.",
),
Term(
"hard landing",
"Cooling inflation only because the economy tipped into recession. "
"The opposite of a soft landing — rate hikes work, but at the "
"cost of jobs and growth.",
),
Term(
"Magnificent 7",
"Apple, Microsoft, Alphabet, Amazon, Nvidia, Meta, and Tesla — the "
"seven US megacaps driving most of the S&P 500's gains since 2023. "
"Concentration risk: when they wobble, the index does too.",
aliases=("Mag 7", "Mag-7", "Magnificent Seven"),
),
Term(
"Treasury",
"US government debt. 'Treasuries' covers everything from 4-week "
"T-bills to 30-year bonds. Considered the world's safest asset; "
"their yields are the baseline for almost everything else.",
aliases=("Treasuries", "US Treasury", "US Treasuries"),
case_sensitive=True,
),
Term(
"regime",
"The broad market environment — what's driving prices right now. "
"Examples: 'risk-on regime' (stocks and credit bid), 'rates-driven "
"regime' (yields lead everything). Knowing the regime tells you "
"which signals matter.",
),
Term(
"safe haven",
"An asset investors flock to when scared — gold, the US dollar, "
"Treasuries, sometimes the Swiss franc and yen. Their behaviour "
"in a crisis tells you which fear is dominant.",
),
Term(
"Strait of Hormuz",
"A narrow waterway between Iran and Oman that ~20% of the "
"world's seaborne oil passes through. Tensions there spike "
"oil prices instantly — it's the single most-watched geopolitical "
"chokepoint for energy.",
aliases=("Hormuz",),
),
Term(
"quantitative easing",
"When a central bank prints new money and uses it to buy bonds "
"in the open market. Pushes asset prices up, yields down. The "
"post-2008 and 2020 playbook.",
aliases=("QE",),
),
Term(
"quantitative tightening",
"The reverse of QE — the central bank lets bonds it owns mature "
"without replacing them, shrinking its balance sheet. Drains "
"liquidity from markets.",
aliases=("QT",),
),
Term(
"OAS",
"Option-adjusted spread — the extra yield a corporate bond pays "
"above a Treasury of similar maturity, after accounting for any "
"embedded options. Widening OAS = market pricing more credit risk.",
aliases=("option-adjusted spread",),
case_sensitive=True,
),
Term(
"ATH",
"All-time high — the highest level a price or index has ever "
"reached. Often shorthand: 'S&P at ATH' = S&P 500 making new "
"record highs.",
case_sensitive=True,
),
Term(
"YoY",
"Year-over-year — comparing a value to the same value 12 months "
"earlier. 'CPI +3.8% YoY' = consumer prices are 3.8% higher than "
"they were a year ago.",
aliases=("year-over-year", "year over year"),
case_sensitive=True,
),
Term(
"MoM",
"Month-over-month — comparing a value to the previous month. "
"Useful for spotting recent shifts, but noisier than YoY since "
"one month is a small sample.",
aliases=("month-over-month", "month over month"),
case_sensitive=True,
),
Term(
"GDP",
"Gross domestic product — the total value of goods and services "
"an economy produces. The headline measure of economic size and "
"growth. Markets care most about its rate of change.",
case_sensitive=True,
),
Term(
"PMI",
"Purchasing Managers' Index — a monthly survey of business "
"activity. Reading above 50 = expansion; below 50 = contraction. "
"Leading indicator for the broader economy.",
case_sensitive=True,
),
Term(
"HY",
"High yield — corporate bonds rated below investment grade ('junk "
"bonds'). Pay more interest because there's more risk of default. "
"Their behaviour signals how worried credit markets are.",
aliases=("high yield", "high-yield"),
case_sensitive=True,
),
Term(
"IG",
"Investment grade — corporate bonds rated BBB- or higher by S&P. "
"Considered low default risk. The bulk of the corporate bond "
"market by value sits here.",
aliases=("investment grade", "investment-grade"),
case_sensitive=True,
),
Term(
"EM",
"Emerging markets — economies still industrialising (China, India, "
"Brazil, Mexico, Turkey, etc.). Higher growth potential but more "
"volatile and currency-exposed than developed-market peers.",
aliases=("emerging markets",),
case_sensitive=True,
),
Term(
"DM",
"Developed markets — mature economies with deep capital markets "
"(US, UK, Eurozone, Japan, Australia). Slower growth but more "
"stable than EM. The benchmark for global allocation.",
aliases=("developed markets",),
case_sensitive=True,
),
Term(
"rally",
"A sustained move higher in a price or index. Distinct from a "
"one-day bounce: implies multi-session momentum. The opposite of "
"a sell-off or drawdown.",
aliases=("rallies",),
),
Term(
"sell-off",
"A sustained move lower across a market segment. Usually triggered "
"by a shift in macro expectations (rate scare, growth scare, "
"geopolitical risk) rather than single-stock news.",
aliases=("selloff", "sell off"),
),
Term(
"drawdown",
"How far a price has fallen from its recent peak. A 20% drawdown "
"= a 20% drop from the high. The conventional threshold for a "
"'bear market'.",
),
Term(
"positioning",
"How much of a given asset investors collectively hold (or are "
"short). Crowded long positioning leaves no buyers left when "
"sentiment turns — that's when sell-offs accelerate.",
),
)
def _build_pattern(term: Term) -> re.Pattern:
"""Compile a word-boundary regex for the term + its aliases."""
flags = 0 if term.case_sensitive else re.IGNORECASE
forms = sorted([term.label, *term.aliases], key=len, reverse=True)
escaped = "|".join(re.escape(f) for f in forms)
return re.compile(rf"(?<!\w)({escaped})(?!\w)", flags)
# Pre-compile once; the pattern list is tiny.
_COMPILED: tuple[tuple[Term, re.Pattern], ...] = tuple(
(t, _build_pattern(t)) for t in TERMS
)
# Tags whose text content should NOT be wrapped — wrapping inside <code>
# breaks code samples, inside <a> doubles up tooltips with the link, and
# inside <pre> can break the formatting.
_PROTECTED_BLOCK_RE = re.compile(
r"<(code|pre|a|script|style)\b[^>]*>.*?</\1>",
re.IGNORECASE | re.DOTALL,
)
# Match a single HTML tag (open / close / self-closing) or a named/numeric
# entity. Used to split HTML into alternating "tag" and "text" segments so
# the term substitution only ever runs on text — never inside attribute
# values, where a stray match would corrupt previously-wrapped spans.
_TAG_OR_ENTITY_RE = re.compile(r"<[^>]+>|&[#a-zA-Z0-9]+;")
def _make_span(term: Term, matched_text: str) -> str:
    # No `title=` attribute: it would render a *second* native tooltip
    # alongside the JS-driven one. Mobile users get a tap-to-toggle path
    # from the JS handler in base.html.
    return (
        f'<span class="glossary" '
        f'data-term="{_html.escape(term.label, quote=True)}" '
        f'data-def="{_html.escape(term.definition, quote=True)}" '
        f'tabindex="0">{matched_text}</span>'
    )
def _wrap_first_match_in_text_segments(html: str, term: Term, pattern: re.Pattern) -> tuple[str, bool]:
    """Wrap the very first match of `pattern` that appears outside any
    HTML tag in `html`. Returns (new_html, wrapped). Walks alternating
    tag/text segments so attribute values from earlier wraps are not
    candidates for matching."""
    out_parts: list[str] = []
    last_end = 0
    wrapped = False
    for m in _TAG_OR_ENTITY_RE.finditer(html):
        text_segment = html[last_end:m.start()]
        if not wrapped and text_segment:
            match = pattern.search(text_segment)
            if match:
                out_parts.append(text_segment[:match.start()])
                out_parts.append(_make_span(term, match.group(0)))
                out_parts.append(text_segment[match.end():])
                wrapped = True
            else:
                out_parts.append(text_segment)
        else:
            out_parts.append(text_segment)
        out_parts.append(m.group(0))  # tag / entity, kept verbatim
        last_end = m.end()
    # Trailing text after the final tag.
    if last_end < len(html):
        text_segment = html[last_end:]
        if not wrapped:
            match = pattern.search(text_segment)
            if match:
                out_parts.append(text_segment[:match.start()])
                out_parts.append(_make_span(term, match.group(0)))
                out_parts.append(text_segment[match.end():])
                wrapped = True
            else:
                out_parts.append(text_segment)
        else:
            out_parts.append(text_segment)
    return "".join(out_parts), wrapped
def wrap_glossary(html: str, *, tone: str | None = None) -> str:
    """Wrap the first occurrence of each glossary term in the HTML with a
    `<span class="glossary">` so the frontend can render a tooltip.

    No-op unless `tone == "NOVICE"`. Wrapping is also a no-op if `html` is
    empty or None.

    Wrapping is **tag-aware**: each term is matched only against text
    that lies outside HTML tags. After wrapping a term, the new
    `<span>` becomes part of the HTML; the next term's pass re-walks the
    tag/text segments, so it never matches inside the newly-added
    attribute values (e.g. the `HY` inside `data-term="HY OAS"`).

    Content inside <code>, <pre>, <a>, <script>, and <style> is preserved
    verbatim regardless.
    """
    if not html or (tone or "").upper() != "NOVICE":
        return html or ""

    # 1) Stash protected containers behind sentinels so their inner HTML
    #    is preserved verbatim through the substitution pass.
    placeholders: list[str] = []

    def _stash(m: re.Match) -> str:
        placeholders.append(m.group(0))
        return f"\x00{len(placeholders) - 1}\x00"

    protected = _PROTECTED_BLOCK_RE.sub(_stash, html)

    # 2) Apply each term one at a time, re-splitting tag/text segments
    #    after each wrap so already-inserted spans become tags-to-skip
    #    rather than text-to-match in subsequent passes.
    for term, pattern in _COMPILED:
        protected, _ = _wrap_first_match_in_text_segments(protected, term, pattern)

    # 3) Restore protected blocks.
    def _unstash(m: re.Match) -> str:
        idx = int(m.group(1))
        return placeholders[idx]

    return re.sub(r"\x00(\d+)\x00", _unstash, protected)
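The tag-aware wrapping described in `wrap_glossary` can be illustrated with a small standalone sketch. It is a simplification under stated assumptions, not the module's code: `wrap_first`, `TAG_OR_ENTITY` and the sample definition are hypothetical names chosen for the example, and the protected-block stashing step is left out.

```python
import html as _html
import re

# Same idea as _TAG_OR_ENTITY_RE: a tag or an entity is never matched against.
TAG_OR_ENTITY = re.compile(r"<[^>]+>|&[#a-zA-Z0-9]+;")


def wrap_first(html_text: str, label: str, definition: str) -> str:
    """Wrap the first occurrence of `label` that sits outside tags/entities."""
    pattern = re.compile(rf"(?<!\w)({re.escape(label)})(?!\w)", re.IGNORECASE)
    out: list[str] = []
    last_end = 0
    wrapped = False

    def emit_text(segment: str) -> None:
        # Text segments are the only place a match may be wrapped.
        nonlocal wrapped
        match = None if wrapped else pattern.search(segment)
        if match is None:
            out.append(segment)
            return
        out.append(segment[:match.start()])
        out.append(
            f'<span class="glossary" '
            f'data-term="{_html.escape(label, quote=True)}" '
            f'data-def="{_html.escape(definition, quote=True)}" '
            f'tabindex="0">{match.group(0)}</span>'
        )
        out.append(segment[match.end():])
        wrapped = True

    for m in TAG_OR_ENTITY.finditer(html_text):
        emit_text(html_text[last_end:m.start()])
        out.append(m.group(0))  # tag or entity copied through unchanged
        last_end = m.end()
    emit_text(html_text[last_end:])  # trailing text after the last tag
    return "".join(out)


sample = '<p title="rally">The rally faded; a second rally followed.</p>'
result = wrap_first(sample, "rally", "A sustained move higher.")
```

In this sample the `rally` inside the `title` attribute is left alone, only the first occurrence in the text is wrapped, and the second occurrence stays plain.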


@ -20,7 +20,13 @@ OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Bump when the composed prompt changes meaningfully. Stored on every
# StrategicLog row so historical logs can be linked to the prompt that produced
# them.
PROMPT_VERSION = 5
#
# v6 (2026-05-17): TONE shrinks to NOVICE | INTERMEDIATE (PRO dropped). New
# educational stance baked into _CORE — explicit anti-TA, anti-gambling-mindset
# framing aimed at young investors entering the trading world. NOVICE retuned
# to be pedagogical (defining terms, anti-pattern teach-backs); INTERMEDIATE
# kept terse but with light-touch educational nudges. See tasks/todo.md.
PROMPT_VERSION = 6
# --- Core: invariant across tone/analysis settings ----------------------------
@ -92,6 +98,23 @@ predicted X and X did not happen". Both are useful; conflating them is not.
- No buy/sell recommendations. Triggers are pre-set elsewhere; your job is \
to report whether reality is confirming, modifying, or refuting the thesis.
# Stance (educational, anti-TA, anti-gambling)
The target reader is most likely young, new to investing, and at risk of \
treating markets like a horse race they need to "read" via chart patterns. \
Cassandra is the corrective.
- **No technical analysis.** Head-and-shoulders, RSI thresholds, Fibonacci \
levels, Elliott waves, "support/resistance": these are descriptions of past \
crowd behaviour, not predictions. Don't use them; don't legitimise them. If \
you mention a price level, frame it as a positioning fact (e.g. "the level \
where the latest tranche of buyers entered"), not a signal.
- **No gambling framing.** Markets are not a coin flip and not a horse race. \
Never present a position as a single decisive moment, a "now or never", or a \
bet to be won. Every read should follow the shape: *regime -> implication -> \
what would change the regime*.
- **Macro causality, every time.** Price moves get explained through \
fundamentals, geopolitics, monetary policy, and structural shifts, not \
chart shapes. Even short paragraphs need the cause, not just the effect.
# System temperature (closing line, mandatory)
Close the log with a single sentence on a line of its own, formatted exactly:
@ -121,25 +144,92 @@ read shifts to Z")."""
# --- Tone: audience-shaping block --------------------------------------------
_TONE: dict[str, str] = {
"NOVICE": """# Audience: novice
The reader is new to markets. Define jargon the first time it appears (a \
short clause in parentheses is fine). Avoid ticker shorthand without context. \
Prefer everyday phrasing: "the price of US government debt fell, pushing \
yields higher" rather than "the long end backed up". Keep paragraphs short. \
Target ~600 words instead of ~800 so density stays digestible.""",
"NOVICE": """# Audience: novice — likely a young investor new to markets
This reader probably arrived from social media, treats charts as predictions, \
and is one bad week away from quitting. Your job is to **educate them out of \
the gambling mindset** without ever being preachy. Calm, patient, slightly \
teacherly. Never condescending.
"INTERMEDIATE": """# Audience: intermediate
- **Define jargon the first time it appears.** A short clause in parentheses \
is fine: "yield curve (the chart of borrowing costs across different \
maturities)", "ERP (equity risk premium: the extra return investors demand \
for owning stocks instead of safe bonds)", "basis point (one hundredth of a \
percent: 25bp = 0.25%)".
- **Avoid ticker shorthand without context.** Use "Apple (AAPL)" on first \
mention, then "Apple" or the ticker after.
- **Everyday phrasing over jargon** where the meaning survives: "the price \
of US government debt fell, pushing yields up" rather than "the long end \
backed up"; "investors are paying more for the same earnings" rather than \
"multiple expansion".
- **One analogy per concept, used sparingly.** Use them to bridge to \
something concrete the reader already understands, not to entertain.
# Educational teach-backs (NOVICE-specific, when warranted)
When the day's data makes a common misconception concrete, drop in ONE \
teach-back of one to two sentences. Don't force it. Don't moralise. Examples \
of moments to do this:
- Anyone treating chart patterns as predictions: \
"Patterns like head-and-shoulders describe what crowds did, not what they \
will do; they're stories told after the fact, not edges."
- Anyone fixated on day-to-day moves: \
"A 1% one-day move in a stock is roughly what you'd expect by chance. The \
multi-week trend is where the information lives."
- Anyone treating one ticker as a coin flip: \
"A single name's monthly move is mostly noise. The regime (what bonds, the \
dollar, and credit are doing together) tells you whether ANY stock is \
likely to drift up or down."
- Anyone trying to "time the bottom" or "buy the dip": \
"Catching the bottom is a different game from owning the next cycle. The \
first needs you to be right within days; the second needs you to be roughly \
right within years."
Limit yourself to one teach-back per log. Skip them entirely if the day's \
data doesn't naturally invite one.
# Length
Target ~700 words. Slightly more than INTERMEDIATE because explanations \
need breathing room.""",
"INTERMEDIATE": """# Audience: intermediate — reads the news, learning to \
connect macro to markets
Assume the reader knows market basics (yield curves, breakevens, HY OAS, \
sector ETFs). Use common terms without defining them, but stay clear of \
deep institutional shorthand ("the belly", "duration trade", "carry pickup"). \
Target ~700 words lean and clear, no padding.""",
sector ETFs, the difference between cyclical and defensive, what a basis \
point is). Use common terms without defining them, but stay clear of deep \
institutional shorthand ("the belly", "duration trade", "carry pickup", \
"the RV book", "off-the-run").
"PRO": """# Audience: professional
Assume institutional vocabulary. Use dense market shorthand freely. Don't \
define standard terms. Target ~800 words. Density of insight > readability.""",
Light-touch educational nudges are welcome when the day's data warrants — \
e.g. "with rates this volatile, technical levels in equities are mostly \
distraction" — but keep them to a passing clause, not a paragraph. Don't \
moralise.
# Length
Target ~600 words. Lean and clear, no padding.""",
}
# Legacy values map to the closest current value. The remap is silent:
# _resolve_tone below does not log, so a stale caller config will not show
# up in the logs on its own.
_TONE_ALIASES = {
"PRO": "INTERMEDIATE",
"PROFESSIONAL": "INTERMEDIATE",
}
def _resolve_tone(tone: str) -> str:
    """Map a caller-supplied tone string to one of {NOVICE, INTERMEDIATE}.
    Unknown tones fall back to INTERMEDIATE. The legacy PRO value is mapped
    to INTERMEDIATE (audience pivot, see PROMPT_VERSION v6 notes)."""
    upper = (tone or "").upper().strip()
    if upper in _TONE:
        return upper
    if upper in _TONE_ALIASES:
        return _TONE_ALIASES[upper]
    return "INTERMEDIATE"
# --- Analysis: forward-vs-backward focus -------------------------------------
_ANALYSIS: dict[str, str] = {
@ -161,7 +251,7 @@ the trip-wires that decide between scenarios.""",
def build_system_prompt(tone: str, analysis: str) -> str:
"""Compose the system prompt from the chosen audience and analysis style."""
tone_block = _TONE.get(tone.upper(), _TONE["INTERMEDIATE"])
tone_block = _TONE[_resolve_tone(tone)]
analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"])
return "\n\n".join([_CORE, tone_block, analysis_block])
@ -192,7 +282,7 @@ def build_summary_system_prompt(tone: str, analysis: str) -> str:
"""A lean, focused system prompt for the per-indicator-group hourly
summary. INTERPRETATION not description: the reader has the table
next to this paragraph; they don't need numbers recited at them."""
tone_block = _TONE.get(tone.upper(), _TONE["INTERMEDIATE"])
tone_block = _TONE[_resolve_tone(tone)]
analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"])
return f"""You write a TINY interpretation (≤60 words, 2-3 sentences) \
of ONE indicator group for a strategic markets dashboard.
@ -239,7 +329,7 @@ def build_summary_user_prompt(group_name: str, quotes: list[dict]) -> str:
def build_aggregate_summary_system_prompt(tone: str, analysis: str) -> str:
"""System prompt for the cross-group aggregate read shown on the dashboard.
Wider lens than a per-group summary: synthesise across all groups."""
tone_block = _TONE.get(tone.upper(), _TONE["INTERMEDIATE"])
tone_block = _TONE[_resolve_tone(tone)]
analysis_block = _ANALYSIS.get(analysis.upper(), _ANALYSIS["SPECULATIVE"])
return f"""You write a single SHORT cross-asset INTERPRETATION (≤80 \
words, 2-4 sentences) for the dashboard header. The reader is glancing \
@ -381,30 +471,95 @@ def build_user_prompt(
return "\n".join(parts)
def _provider_chain() -> list[str]:
    """Ordered list of providers to try: primary, then fallback (unless
    the fallback is unset, the same as primary, or has no API key)."""
    s = get_settings()
    primary = (s.LLM_PROVIDER or "deepseek").lower()
    fallback = (s.LLM_FALLBACK or "").lower()
    chain = [primary]
    if fallback and fallback != primary:
        chain.append(fallback)
    # Drop providers with no API key configured.
    return [p for p in chain if _provider_has_key(p)]


def _provider_has_key(provider: str) -> bool:
    s = get_settings()
    if provider == "deepseek":
        return bool(s.DEEPSEEK_API_KEY)
    if provider == "openrouter":
        return bool(s.OPENROUTER_API_KEY)
    return False


def _endpoint_for(provider: str) -> tuple[str, str, str, dict[str, str]]:
    """Resolve (url, api_key, default_model, extra_headers) for a specific
    provider. Raises if its API key isn't set."""
    s = get_settings()
    if provider == "deepseek":
        if not s.DEEPSEEK_API_KEY:
            raise RuntimeError("DEEPSEEK_API_KEY not set")
        return s.DEEPSEEK_URL, s.DEEPSEEK_API_KEY, s.DEEPSEEK_MODEL, {}
    if provider == "openrouter":
        if not s.OPENROUTER_API_KEY:
            raise RuntimeError("OPENROUTER_API_KEY not set")
        return (
            OPENROUTER_URL,
            s.OPENROUTER_API_KEY,
            s.OPENROUTER_MODEL,
            {
                # OpenRouter-specific attribution headers.
                "HTTP-Referer": "https://github.com/local/cassandra",
                "X-Title": "Cassandra",
            },
        )
    raise RuntimeError(f"Unknown LLM provider: {provider!r}")


def llm_configured() -> bool:
    """At least one provider in the configured chain has an API key."""
    return bool(_provider_chain())


def active_model() -> str:
    """Return the model name of the *first* provider in the configured
    chain (the one that would be tried first). Used to label AICall ledger
    rows when no actual call result is available yet."""
    chain = _provider_chain()
    if not chain:
        return "unknown"
    s = get_settings()
    return s.DEEPSEEK_MODEL if chain[0] == "deepseek" else s.OPENROUTER_MODEL
@retry(
reraise=True,
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=2, min=2, max=30),
retry=retry_if_exception_type((httpx.HTTPStatusError, httpx.TransportError)),
)
async def call_openrouter(
async def _call_provider(
client: httpx.AsyncClient,
provider: str,
messages: list[dict],
model: str,
max_tokens: int = 4000,
model: str | None,
max_tokens: int,
) -> LogResult:
s = get_settings()
if not s.OPENROUTER_API_KEY:
raise RuntimeError("OPENROUTER_API_KEY not set")
"""One provider call with tenacity retries on transport/HTTP errors.
Lives inside the retry decorator so retries happen within a provider,
not across the fallback chain."""
url, api_key, default_model, extra_headers = _endpoint_for(provider)
used_model = model or default_model
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
**extra_headers,
}
r = await client.post(
OPENROUTER_URL,
headers={
"Authorization": f"Bearer {s.OPENROUTER_API_KEY}",
"Content-Type": "application/json",
"HTTP-Referer": "https://github.com/local/cassandra",
"X-Title": "Cassandra",
},
json={"model": model, "messages": messages, "max_tokens": max_tokens},
url,
headers=headers,
json={"model": used_model, "messages": messages, "max_tokens": max_tokens},
timeout=180,
)
r.raise_for_status()
@ -416,19 +571,68 @@ async def call_openrouter(
if not content:
finish = data["choices"][0].get("finish_reason")
raise RuntimeError(
f"OpenRouter returned empty content (finish_reason={finish}, "
f"model={model}, max_tokens={max_tokens})"
f"LLM returned empty content (finish_reason={finish}, "
f"provider={provider}, model={used_model}, max_tokens={max_tokens})"
)
usage = data.get("usage") or {}
return LogResult(
content=content,
model=model,
# Record provider+model so admin can see which path produced this row.
model=f"{provider}/{used_model}",
prompt_tokens=usage.get("prompt_tokens"),
completion_tokens=usage.get("completion_tokens"),
cost_usd=usage.get("cost") or usage.get("total_cost"),
)
async def call_llm(
    client: httpx.AsyncClient,
    messages: list[dict],
    model: str | None = None,
    max_tokens: int = 4000,
) -> LogResult:
    """Provider-aware chat completion with fallback. Tries primary
    (LLM_PROVIDER) first; if it raises after retries, falls through to
    LLM_FALLBACK. Raises only if every provider in the chain fails.

    The returned LogResult.model is prefixed with the provider that
    actually answered (e.g. ``deepseek/deepseek-v4-flash`` or
    ``openrouter/deepseek/deepseek-v4-flash``), which is useful admin
    metadata even though we hide it from the user-facing UI."""
    chain = _provider_chain()
    if not chain:
        raise RuntimeError("No LLM provider configured (no API key set)")

    last_exc: Exception | None = None
    for i, provider in enumerate(chain):
        try:
            result = await _call_provider(
                client, provider, messages, model, max_tokens,
            )
            if i > 0:
                from app.logging import get_logger
                get_logger("llm").info(
                    "llm.fallback_succeeded", provider=provider, attempt=i + 1,
                )
            return result
        except Exception as e:
            last_exc = e
            if i + 1 < len(chain):
                from app.logging import get_logger
                get_logger("llm").warning(
                    "llm.primary_failed_trying_fallback",
                    provider=provider, error=str(e)[:200],
                )
            continue
    # Re-raise the last exception so callers see the failure mode.
    assert last_exc is not None
    raise last_exc


# Back-compat alias for any straggling import sites.
call_openrouter = call_llm
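The fallback loop in `call_llm` can be tried without network access using a minimal sketch. It makes several assumptions: `call_with_fallback`, `flaky_primary` and `steady_fallback` are hypothetical stand-ins, providers are plain async callables rather than HTTP endpoints, and the per-provider tenacity retries are omitted.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable

# Hypothetical provider type: takes a prompt, returns the reply text.
Provider = Callable[[str], Awaitable[str]]


async def call_with_fallback(
    chain: list[tuple[str, Provider]], prompt: str
) -> tuple[str, str]:
    """Try each (name, provider) in order; return (name, reply) of the first success."""
    if not chain:
        raise RuntimeError("No LLM provider configured")
    last_exc: Exception | None = None
    for name, provider in chain:
        try:
            return name, await provider(prompt)
        except Exception as exc:
            # Remember the failure and move on to the next provider.
            last_exc = exc
    # Every provider failed: surface the last failure mode to the caller.
    assert last_exc is not None
    raise last_exc


async def flaky_primary(prompt: str) -> str:
    raise ConnectionError("primary is down")


async def steady_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


answered_by, reply = asyncio.run(
    call_with_fallback(
        [("primary", flaky_primary), ("fallback", steady_fallback)], "hello"
    )
)
```

Returning the provider name alongside the reply mirrors how `LogResult.model` is prefixed with the provider that actually answered.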
def month_window() -> tuple[datetime, datetime]:
"""[start, now] in UTC for the current calendar month."""
now = datetime.now(timezone.utc)

app/services/otp_service.py Normal file

@ -0,0 +1,153 @@
"""Email-OTP generation & verification.
A code is a 6-digit numeric string (000000-999999). We store an argon2
hash so leaking the DB alone doesn't reveal active codes. Each code has
a 15-minute TTL and 5 attempts before it gets marked dead. Generating a
new code for an email invalidates any earlier unused ones (one valid
code at a time per email).
Rate limit: at most one new code per 60 seconds per email. Prevents an
attacker spamming the user's inbox via the /resend endpoint.
"""
from __future__ import annotations
import secrets
from datetime import datetime, timedelta, timezone
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
from sqlalchemy import desc, select, update
from sqlalchemy.ext.asyncio import AsyncSession
from app.db import utcnow
from app.models import EmailOTP
def _as_utc(d: datetime) -> datetime:
    """MariaDB returns naive datetimes; tag them UTC so arithmetic with
    tz-aware utcnow() doesn't blow up."""
    return d if d.tzinfo is not None else d.replace(tzinfo=timezone.utc)
_HASHER = PasswordHasher()
OTP_LENGTH = 6
OTP_TTL_MINUTES = 15
MAX_ATTEMPTS = 5
RESEND_COOLDOWN_SECONDS = 60
class OTPError(Exception):
    """User-safe error message for OTP failures."""


def _generate_code() -> str:
    return f"{secrets.randbelow(10 ** OTP_LENGTH):0{OTP_LENGTH}d}"


def _hash_code(code: str) -> str:
    return _HASHER.hash(code)


def _check_code(code: str, hashed: str) -> bool:
    try:
        _HASHER.verify(hashed, code)
        return True
    except VerifyMismatchError:
        return False
    except Exception:
        return False
async def _latest_unused(session: AsyncSession, email: str) -> EmailOTP | None:
    return (await session.execute(
        select(EmailOTP)
        .where(EmailOTP.email == email)
        .where(EmailOTP.used_at.is_(None))
        .order_by(desc(EmailOTP.created_at))
        .limit(1)
    )).scalar_one_or_none()


async def can_request_new(session: AsyncSession, email: str) -> tuple[bool, int]:
    """Returns (allowed, seconds_until_allowed)."""
    # Normalise like issue() and verify() do, so the lookup matches the
    # lowercased email stored on the row.
    email = email.strip().lower()
    latest = await _latest_unused(session, email)
    if latest is None:
        return True, 0
    age = (utcnow() - _as_utc(latest.created_at)).total_seconds()
    if age >= RESEND_COOLDOWN_SECONDS:
        return True, 0
    return False, int(RESEND_COOLDOWN_SECONDS - age)
async def issue(
    session: AsyncSession,
    email: str,
    *,
    purpose: str = "signup",
) -> str:
    """Generate a fresh code, persist its hash, invalidate any prior unused
    codes for this email. Returns the plaintext code so the caller can mail
    it. Caller is responsible for rate-limit check via can_request_new()."""
    email = email.strip().lower()

    # Invalidate prior unused codes for this email so only one is valid.
    await session.execute(
        update(EmailOTP)
        .where(EmailOTP.email == email)
        .where(EmailOTP.used_at.is_(None))
        .values(used_at=utcnow())
    )

    code = _generate_code()
    now = utcnow()
    row = EmailOTP(
        email=email,
        code_hash=_hash_code(code),
        created_at=now,
        expires_at=now + timedelta(minutes=OTP_TTL_MINUTES),
        attempts=0,
        purpose=purpose,
    )
    session.add(row)
    await session.commit()
    return code
async def verify(
    session: AsyncSession,
    email: str,
    code: str,
) -> bool:
    """Validate the user-submitted code against the latest unused OTP for
    this email. On success, mark the OTP used. Raises OTPError on user-
    facing failures (expired, too many attempts, no code outstanding)."""
    email = email.strip().lower()
    code = code.strip()
    if not (code.isdigit() and len(code) == OTP_LENGTH):
        raise OTPError("Code must be a 6-digit number")

    latest = await _latest_unused(session, email)
    if latest is None:
        raise OTPError("No verification code outstanding for this email")

    if _as_utc(latest.expires_at) < utcnow():
        latest.used_at = utcnow()
        await session.commit()
        raise OTPError("This code has expired — request a new one")

    if latest.attempts >= MAX_ATTEMPTS:
        latest.used_at = utcnow()
        await session.commit()
        raise OTPError("Too many attempts — request a new code")

    if not _check_code(code, latest.code_hash):
        latest.attempts += 1
        await session.commit()
        remaining = MAX_ATTEMPTS - latest.attempts
        if remaining <= 0:
            raise OTPError("Too many attempts — request a new code")
        raise OTPError(f"Incorrect code ({remaining} attempts left)")

    latest.used_at = utcnow()
    await session.commit()
    return True
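The code lifecycle above (TTL, attempt budget, single use) can be exercised in memory with a rough sketch. It rests on assumptions that differ from the service: a salted SHA-256 digest stands in for argon2 so the example stays stdlib-only, `OtpRecord` is a hypothetical substitute for the `EmailOTP` row, and `verify` returns status strings instead of raising `OTPError`.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

OTP_LENGTH = 6
OTP_TTL = timedelta(minutes=15)
MAX_ATTEMPTS = 5


def _digest(code: str, salt: bytes) -> bytes:
    # Stand-in for argon2 (assumption): salted SHA-256, illustration only.
    return hashlib.sha256(salt + code.encode()).digest()


@dataclass
class OtpRecord:
    salt: bytes
    code_hash: bytes
    expires_at: datetime
    attempts: int = 0
    used: bool = False


def issue(now: datetime) -> tuple[str, OtpRecord]:
    """Generate a zero-padded numeric code and the record that stores its hash."""
    code = f"{secrets.randbelow(10 ** OTP_LENGTH):0{OTP_LENGTH}d}"
    salt = secrets.token_bytes(16)
    return code, OtpRecord(salt, _digest(code, salt), now + OTP_TTL)


def verify(record: OtpRecord, code: str, now: datetime) -> str:
    """Return one of: ok, wrong, expired, locked, no_code."""
    if record.used:
        return "no_code"  # a used record is no longer outstanding
    if record.expires_at < now:
        record.used = True
        return "expired"
    if record.attempts >= MAX_ATTEMPTS:
        record.used = True
        return "locked"
    if not hmac.compare_digest(_digest(code, record.salt), record.code_hash):
        record.attempts += 1
        return "locked" if record.attempts >= MAX_ATTEMPTS else "wrong"
    record.used = True
    return "ok"


now = datetime(2026, 5, 18, 12, 0, tzinfo=timezone.utc)
code, record = issue(now)
wrong_code = "1" * OTP_LENGTH if code == "0" * OTP_LENGTH else "0" * OTP_LENGTH
first_try = verify(record, wrong_code, now)
second_try = verify(record, code, now + timedelta(minutes=1))
replay = verify(record, code, now + timedelta(minutes=2))
```

A correct code works once; replaying it afterwards finds no outstanding code, which matches the single-use behaviour of `used_at` in the service.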


@ -0,0 +1,356 @@
"""Ephemeral portfolio analysis — generates AI commentary from a pie that
exists only in the request's memory.
Phase G data-minimisation guarantee: this module **never writes the pie
to the database, to logs, to Redis, or to disk**. The positions list
enters as a function argument, is used to construct a prompt, the LLM
returns text, and the positions are dropped on function return. The
`ai_calls` ledger row written for the call contains model + token counts
+ cost, no holdings.
Inputs come from the browser's localStorage. The server's role is to:
1. Validate shape + sanitise free-text fields (prompt-injection defence).
2. Compute summary stats (concentration, top-N, currency mix); these
reduce the LLM payload and let us cap the prompt size.
3. Call the LLM via the `call_llm` helper (primary provider first, then
the configured fallback).
4. Write the cost ledger row (no holdings).
5. Return the commentary text + token / cost metadata.
"""
from __future__ import annotations
import json
import math
import re
from dataclasses import dataclass
from datetime import datetime, timezone
import httpx
from sqlalchemy.ext.asyncio import AsyncSession
from app.config import get_settings
from app.db import utcnow
from app.logging import get_logger
from app.models import AICall
from app.services.openrouter import (
LogResult,
active_model,
build_system_prompt,
call_llm,
)
log = get_logger("portfolio_analysis")
PROMPT_VERSION = 1
# Hard caps on prompt construction to keep token spend bounded regardless
# of pie size. A pie with 200 positions is real — we summarise the tail.
MAX_POSITIONS_INLINED = 25
MAX_NAME_LENGTH = 64
MAX_PROMPT_BYTES = 40_000
# ---------------------------------------------------------------------------
# Input shape
# ---------------------------------------------------------------------------
@dataclass
class Position:
    """One holding as supplied by the browser. Field names match the
    /api/portfolio/parse response shape."""
    yahoo_ticker: str
    name: str
    qty: float
    avg_cost: float
    currency: str | None = None


@dataclass
class AnalysisRequest:
    positions: list[Position]
    prices: dict[str, dict]  # {ticker: {p, c, d:{1d,1m,1y}, ...}}
    base_currency: str = "GBP"
    anchor: str | None = None
    tone: str = "INTERMEDIATE"  # NOVICE | INTERMEDIATE (legacy PRO maps to INTERMEDIATE)
    analysis: str = "SPECULATIVE"  # DRY | SPECULATIVE


@dataclass
class AnalysisResult:
    content: str
    model: str
    prompt_tokens: int | None
    completion_tokens: int | None
    cost_usd: float | None
    generated_at: datetime
# ---------------------------------------------------------------------------
# Input validation + sanitisation
# ---------------------------------------------------------------------------
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
# Prompt-injection markers commonly used to break out of context. They are
# detected, not stripped: parse_request replaces a flagged name with the
# ticker rather than rejecting the whole position.
_INJECTION_TOKENS = (
"ignore previous", "ignore above", "system:", "assistant:",
"you are now", "</system>", "<|im_start|>", "<|im_end|>",
)
def _sanitise_text(value: str, max_len: int) -> str:
    """Strip control chars, collapse whitespace, truncate. Used on
    user-supplied name fields before they reach the LLM."""
    if not isinstance(value, str):
        return ""
    cleaned = _CONTROL_CHARS.sub(" ", value).strip()
    cleaned = re.sub(r"\s+", " ", cleaned)
    return cleaned[:max_len]


def _looks_injected(value: str) -> bool:
    lower = value.lower()
    return any(token in lower for token in _INJECTION_TOKENS)
def parse_request(payload: dict) -> AnalysisRequest:
    """Validate + sanitise the JSON the browser sent. Raises ValueError on
    malformed input. The browser is trusted *minimally*: strings are
    sanitised, numbers coerced, oversized inputs truncated."""
    raw_positions = payload.get("positions") or []
    if not isinstance(raw_positions, list) or not raw_positions:
        raise ValueError("positions must be a non-empty list")

    positions: list[Position] = []
    for p in raw_positions[:200]:  # hard cap on input length
        if not isinstance(p, dict):
            continue
        ticker = _sanitise_text(p.get("yahoo_ticker", ""), 32).upper()
        if not ticker:
            continue
        name = _sanitise_text(p.get("name", ""), MAX_NAME_LENGTH)
        if _looks_injected(name):
            # Drop the name rather than the whole position; this preserves
            # the ticker (which has structure that constrains injection).
            name = ticker
        try:
            qty = float(p.get("qty") or 0)
            avg_cost = float(p.get("avg_cost") or 0)
        except (TypeError, ValueError):
            continue
        # Reject NaN / inf: float() accepts these and they'd poison the
        # prompt with garbage if they reached the LLM.
        if not (math.isfinite(qty) and math.isfinite(avg_cost)):
            continue
        if qty <= 0:
            continue
        currency = _sanitise_text(p.get("currency", "") or "", 8) or None
        positions.append(Position(
            yahoo_ticker=ticker, name=name, qty=qty,
            avg_cost=avg_cost, currency=currency,
        ))

    if not positions:
        raise ValueError("no valid positions after sanitisation")

    prices = payload.get("prices") or {}
    if not isinstance(prices, dict):
        prices = {}

    base_currency = _sanitise_text(payload.get("base_currency", "GBP"), 8) or "GBP"
    anchor = _sanitise_text(payload.get("anchor") or "", 32) or None
    tone = _sanitise_text(payload.get("tone", "INTERMEDIATE"), 16) or "INTERMEDIATE"
    analysis = _sanitise_text(payload.get("analysis", "SPECULATIVE"), 16) or "SPECULATIVE"

    return AnalysisRequest(
        positions=positions, prices=prices, base_currency=base_currency,
        anchor=anchor, tone=tone, analysis=analysis,
    )
# ---------------------------------------------------------------------------
# Pre-LLM summarisation: keep prompt size bounded
# ---------------------------------------------------------------------------
def _enrich(req: AnalysisRequest) -> list[dict]:
    """Join positions with their current prices; compute per-position
    value, P/L. Returns a list sorted by current value descending."""
    out = []
    for p in req.positions:
        pq = req.prices.get(p.yahoo_ticker) or {}
        price = pq.get("p")
        currency = p.currency or pq.get("c")
        value = (price * p.qty) if isinstance(price, (int, float)) else None
        invested = p.avg_cost * p.qty
        ppl = (value - invested) if value is not None else None
        ppl_pct = ((value / invested - 1) * 100) if (value is not None and invested) else None
        out.append({
            "ticker": p.yahoo_ticker,
            "name": p.name,
            "qty": round(p.qty, 6),
            "avg_cost": round(p.avg_cost, 4),
            "current_price": price,
            "currency": currency,
            "value": round(value, 2) if value is not None else None,
            "invested": round(invested, 2),
            "ppl": round(ppl, 2) if ppl is not None else None,
            "ppl_pct": round(ppl_pct, 2) if ppl_pct is not None else None,
            "change_1d_pct": pq.get("d", {}).get("1d") if isinstance(pq.get("d"), dict) else None,
        })
    out.sort(key=lambda r: r["value"] if r["value"] is not None else -1, reverse=True)
    return out
def _summarise(enriched: list[dict]) -> dict:
    """Aggregate stats for the model: concentration, currency mix,
    P/L overall. Saves tokens by not making the LLM compute these."""
    total_value = sum((r["value"] or 0) for r in enriched)
    total_invested = sum(r["invested"] for r in enriched)
    by_ccy: dict[str, float] = {}
    for r in enriched:
        if r["currency"] and r["value"] is not None:
            by_ccy[r["currency"]] = by_ccy.get(r["currency"], 0) + r["value"]
    top_n = enriched[:5]
    top_share = (sum(r["value"] or 0 for r in top_n) / total_value * 100) if total_value else None
    return {
        "n_positions": len(enriched),
        "total_value": round(total_value, 2),
        "total_invested": round(total_invested, 2),
        "total_ppl": round(total_value - total_invested, 2) if total_value else None,
        "total_ppl_pct": round((total_value / total_invested - 1) * 100, 2)
                         if (total_value and total_invested) else None,
        "top5_share_pct": round(top_share, 1) if top_share is not None else None,
        "currency_split_pct": {
            k: round(v / total_value * 100, 1)
            for k, v in by_ccy.items()
        } if total_value else {},
    }
# ---------------------------------------------------------------------------
# Prompt construction
# ---------------------------------------------------------------------------
_SYSTEM_OVERRIDES = """\
# Mode: portfolio commentary
You are writing a short read of ONE investor's portfolio. Be specific to
the holdings shown. Frame each observation as analysis ("this allocation
implies X under scenario Y"), not advice ("buy X" / "sell Y" are forbidden).
# Output
- Open with one TL;DR sentence on the portfolio's *posture* (defensive,
cyclical, concentrated, etc.).
- Then 3-5 short paragraphs covering, in order of relevance to this pie:
concentration / single-name risk; sector or geography tilt;
currency exposure if multi-currency; notable winners or laggards;
what would invalidate the current posture.
- ~350 words. No bullet lists. No buy/sell recommendations.
- Do not repeat the input data verbatim; interpret it.
"""
def build_prompt(req: AnalysisRequest) -> tuple[str, str]:
    """Returns (system_message, user_message). Pure function: pie data
    flows in, prompt strings flow out, nothing is stored."""
    enriched = _enrich(req)
    summary = _summarise(enriched)

    # Truncate the per-position table to keep the prompt bounded.
    head = enriched[:MAX_POSITIONS_INLINED]
    tail_count = max(0, len(enriched) - MAX_POSITIONS_INLINED)

    system = build_system_prompt(req.tone, req.analysis) + "\n\n" + _SYSTEM_OVERRIDES

    user_parts = [
        f"# Portfolio commentary request — {utcnow().strftime('%Y-%m-%d')}",
        f"Base currency: {req.base_currency}",
    ]
    if req.anchor:
        user_parts.append(f"Anchor reference date: {req.anchor}")
    user_parts.append("\n## Portfolio summary")
    user_parts.append("```json\n" + json.dumps(summary, indent=2) + "\n```")
    user_parts.append(f"\n## Top {len(head)} positions by value"
                      + (f" ({tail_count} smaller positions omitted)" if tail_count else ""))
    user_parts.append("```json\n" + json.dumps(head, indent=2, default=str) + "\n```")
    user_parts.append(
        "\n## Task\nWrite the portfolio read per the system prompt. ~350 words. "
        "No preamble, no headers other than the TL;DR opener."
    )
    user = "\n".join(user_parts)

    # Cap on prompt size (token-cost protection). len() counts characters,
    # so MAX_PROMPT_BYTES is applied as a character count, not encoded bytes.
    if len(user) > MAX_PROMPT_BYTES:
        user = user[:MAX_PROMPT_BYTES] + "\n[truncated]"
    return system, user
# ---------------------------------------------------------------------------
# Orchestration
# ---------------------------------------------------------------------------
async def analyse(
session: AsyncSession,
req: AnalysisRequest,
) -> AnalysisResult:
"""The whole pipeline: prompt → LLM → ledger row → result. The `req`
object is function-local: once this function returns, the pie is
eligible for garbage collection. No DB writes mention positions."""
s = get_settings()
system, user = build_prompt(req)
async with httpx.AsyncClient() as client:
try:
llm: LogResult = await call_llm(
client,
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user},
],
max_tokens=2000,
)
status = "ok"
error_msg = None
except Exception as e:
status = "failed"
error_msg = str(e)[:500]
llm = None
log.error("portfolio_analysis.failed", error=error_msg)
# Ledger row — NO portfolio data, just metadata. Same row whether the
# call succeeded or failed, so cost-cap and rate-limit logic can
# observe the attempt.
session.add(AICall(
called_at=utcnow(),
model=llm.model if llm else active_model(),
prompt_tokens=llm.prompt_tokens if llm else None,
completion_tokens=llm.completion_tokens if llm else None,
cost_usd=llm.cost_usd if llm else None,
status=status,
error=error_msg,
))
await session.commit()
if llm is None:
raise RuntimeError(error_msg or "portfolio analysis failed")
log.info(
"portfolio_analysis.ok",
n_positions=len(req.positions),
prompt_tokens=llm.prompt_tokens,
completion_tokens=llm.completion_tokens,
cost_usd=llm.cost_usd,
)
return AnalysisResult(
content=llm.content,
model=llm.model,
prompt_tokens=llm.prompt_tokens,
completion_tokens=llm.completion_tokens,
cost_usd=llm.cost_usd,
generated_at=datetime.now(timezone.utc),
)
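
The size cap in `build_prompt` compares `len(user)`, which counts characters, against a constant named in bytes. A minimal sketch of a byte-accurate cap follows, assuming UTF-8 is the encoding that matters. `truncate_utf8` and its `marker` parameter are hypothetical names for illustration, not part of this commit.

```python
def truncate_utf8(text: str, max_bytes: int, marker: str = "\n[truncated]") -> str:
    """Return text unchanged if it fits in max_bytes of UTF-8; otherwise cut
    it at a character boundary and append marker (marker is not counted
    against the limit). Hypothetical helper, for illustration only."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any,
    # so the result is always valid text.
    return raw[:max_bytes].decode("utf-8", errors="ignore") + marker


if __name__ == "__main__":
    print(truncate_utf8("short prompt", 100))
    print(truncate_utf8("ééé", 3))
```

Whether the cap should be in bytes or characters depends on what it protects: token cost tracks neither exactly, so the existing character-based check may be good enough.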

@ -0,0 +1,195 @@
"""Server-wide ticker universe — the set of Yahoo tickers Cassandra currently
tracks, without user attribution.
Population happens in two stages to mitigate the timing-correlation leak:
1. **Buffer.** When /api/portfolio/parse or /api/analyze sees a ticker, it
pushes that ticker into a Redis set keyed by the 5-minute wall-clock
bucket: ``ticker_universe:buffer:<bucket>``. The buffer expires
automatically (TTL = 2 hours, plenty of slack to recover from a missed
flush).
2. **Flush.** A scheduler job runs at fixed 5-minute boundaries (xx:00,
xx:05, ...), reads the *previous* bucket (now closed, no more writes
landing), and INSERTs new tickers into the `ticker_universe` table.
Multiple users' uploads in the same bucket batch together; intra-bucket
ordering is discarded because a Redis set is unordered. The longer a
bucket stays open, the more uploads it absorbs and the harder
timing-correlation gets.
Refresh of `last_referenced_at` for already-known tickers happens
synchronously in the same request; it's just an UPDATE and doesn't leak
membership.
Eviction: passive aging via a daily cron that prunes rows older than
UNIVERSE_EVICTION_TTL.
"""
from __future__ import annotations
import time
from datetime import datetime, timedelta, timezone
from typing import Iterable
from sqlalchemy import delete, insert, select, update
from sqlalchemy.dialects.mysql import insert as mysql_insert
from sqlalchemy.ext.asyncio import AsyncSession
from app.db import utcnow
from app.logging import get_logger
from app.models import TickerUniverse
from app.redis_client import get_redis
log = get_logger("ticker_universe")
# Bucket width for the timing-mitigation flush. 5 minutes is a sane default:
# small enough that the price feed isn't *that* stale, large enough that
# multiple uploads in a busy hour batch together. At alpha scale (1-10
# users) bucketing has limited statistical effect; we keep it anyway so
# the property is in place when traffic grows.
BUCKET_SECONDS = 5 * 60
BUFFER_TTL_SECONDS = 2 * 60 * 60 # 2h slack for a missed flush
UNIVERSE_EVICTION_TTL = timedelta(days=60)
def _as_utc(d: datetime) -> datetime:
return d if d.tzinfo is not None else d.replace(tzinfo=timezone.utc)
def _bucket_key(now_ts: float | None = None) -> str:
ts = int(now_ts if now_ts is not None else time.time())
bucket = (ts // BUCKET_SECONDS) * BUCKET_SECONDS
return f"ticker_universe:buffer:{bucket}"
def _previous_bucket_key(now_ts: float | None = None) -> str:
ts = int(now_ts if now_ts is not None else time.time())
bucket = ((ts // BUCKET_SECONDS) - 1) * BUCKET_SECONDS
return f"ticker_universe:buffer:{bucket}"
def _normalise(ticker: str) -> str:
"""Canonicalise a ticker: strip surrounding whitespace and uppercase the
whole string. Yahoo tickers are case-sensitive (AAPL is not the same as
aapl in their world), so this assumes the canonical form is uppercase.
Suffixes like .L / .DE / .HK are conventionally uppercase already."""
return ticker.strip().upper()
async def buffer_tickers(tickers: Iterable[str]) -> int:
"""Push tickers into the current 5-min flush bucket. Returns the number
of tickers newly added to the bucket (Redis SADD semantics), which can be
lower than the number of distinct tickers passed in. Safe to call with an
empty iterable.
Already-known tickers are still buffered; the flush job collapses them
via ON DUPLICATE KEY UPDATE. Cheap and avoids a synchronous DB read here."""
items = [_normalise(t) for t in tickers if t and t.strip()]
if not items:
return 0
r = get_redis()
key = _bucket_key()
added = await r.sadd(key, *items)
await r.expire(key, BUFFER_TTL_SECONDS)
return int(added)
async def refresh_references(
session: AsyncSession,
tickers: Iterable[str],
) -> int:
"""Bump last_referenced_at for tickers already in the universe.
Returns rows updated. Tickers not yet in the universe are silently
ignored; they'll land via the buffered flush path."""
items = list({_normalise(t) for t in tickers if t and t.strip()})
if not items:
return 0
res = await session.execute(
update(TickerUniverse)
.where(TickerUniverse.yahoo_ticker.in_(items))
.values(last_referenced_at=utcnow())
)
await session.commit()
return int(res.rowcount or 0)
async def flush_buffer(session: AsyncSession) -> dict[str, int]:
"""Read the previous 5-min bucket from Redis, INSERT any new tickers
into ticker_universe (collision-safe), and delete the bucket. Returns
counts for observability.
Idempotent: re-running on the same bucket is a no-op because the bucket
is deleted on success."""
r = get_redis()
key = _previous_bucket_key()
tickers = await r.smembers(key)
if not tickers:
return {"buffered": 0, "inserted": 0}
now = utcnow()
payload = [
{"yahoo_ticker": t, "currency": None,
"first_seen_at": now, "last_referenced_at": now}
for t in tickers
]
# ON DUPLICATE KEY UPDATE: existing rows just get their last_referenced_at
# bumped. INSERT IGNORE would also work but the timestamp refresh is
# useful (a ticker that's been buffered means an active user has it).
stmt = mysql_insert(TickerUniverse).values(payload)
stmt = stmt.on_duplicate_key_update(last_referenced_at=stmt.inserted.last_referenced_at)
res = await session.execute(stmt)
await session.commit()
# Note: with ON DUPLICATE KEY UPDATE, MySQL's affected-rows count is 1 per
# newly inserted row and 2 per updated existing row, so the figure below
# is an affected-rows total rather than a pure insert count.
inserted = int(res.rowcount or 0)
await r.delete(key)
log.info("universe.flush", buffered=len(tickers), affected=inserted)
return {"buffered": len(tickers), "inserted": inserted}
async def evict_stale(session: AsyncSession, ttl: timedelta = UNIVERSE_EVICTION_TTL) -> int:
"""Passive aging: delete rows not referenced within the TTL window.
Returns rows deleted."""
cutoff = utcnow() - ttl
res = await session.execute(
delete(TickerUniverse)
.where(TickerUniverse.last_referenced_at < cutoff)
)
await session.commit()
deleted = int(res.rowcount or 0)
if deleted:
log.info("universe.evicted", count=deleted, ttl_days=ttl.days)
return deleted
async def get_all_tickers(session: AsyncSession) -> list[str]:
"""Returns every ticker currently tracked. Order is unspecified."""
rows = (await session.execute(select(TickerUniverse.yahoo_ticker))).scalars().all()
return list(rows)
async def upsert_tickers(session: AsyncSession, tickers: Iterable[str]) -> int:
"""Synchronous upsert into ticker_universe, bypassing the Redis flush
buffer. Used by the /api/portfolio/parse endpoint so the dashboard
has live prices immediately after upload, rather than waiting up to
5 minutes for the buffer to flush.
Returns the count of distinct tickers in the input. The DB-level
side-effect is "row created" for new tickers and "last_referenced_at
bumped" for existing ones.
At alpha scale (1-10 users) the buffer's timing-correlation mitigation
has limited statistical effect anyway, so bypassing it costs little.
Once usage grows past 10 users this path is to be deprecated in favour
of the buffered path, per the Phase G plan."""
items = list({_normalise(t) for t in tickers if t and t.strip()})
if not items:
return 0
now = utcnow()
payload = [
{"yahoo_ticker": t, "currency": None,
"first_seen_at": now, "last_referenced_at": now}
for t in items
]
stmt = mysql_insert(TickerUniverse).values(payload)
stmt = stmt.on_duplicate_key_update(
last_referenced_at=stmt.inserted.last_referenced_at,
)
await session.execute(stmt)
await session.commit()
return len(items)
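
The bucket arithmetic behind `_bucket_key` and `_previous_bucket_key` can be checked in isolation. The sketch below reproduces it without Redis, under the same `BUCKET_SECONDS` value. `bucket_start` and `buffer_key` are hypothetical names introduced here, and the timestamps are arbitrary examples.

```python
BUCKET_SECONDS = 5 * 60  # same width as in the module above


def bucket_start(ts: float, offset: int = 0) -> int:
    """Start (epoch seconds) of the bucket containing ts, shifted by
    `offset` whole buckets (offset=-1 gives the previous bucket)."""
    return (int(ts) // BUCKET_SECONDS + offset) * BUCKET_SECONDS


def buffer_key(ts: float, offset: int = 0) -> str:
    """Redis key for that bucket, in the format the module uses."""
    return f"ticker_universe:buffer:{bucket_start(ts, offset)}"


if __name__ == "__main__":
    upload_ts = 1_700_000_123   # an upload lands 23 s into a bucket
    flush_ts = 1_700_000_400    # the flush job fires at the next boundary
    # The key written at upload time is the key the flush job reads as
    # its "previous" bucket.
    print(buffer_key(upload_ts))
    print(buffer_key(flush_ts, offset=-1))
```

Because the flush reads only the immediately preceding bucket, a ticker buffered at any point inside a bucket is picked up by the job that fires at the next boundary, provided that job actually runs.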