The user pointed out that the only genuinely per-user AI surface is
portfolio analysis. The strategic log AND the email digest are both
shared cycles — generated once per cycle, consumed by many users.
For the digest, this means:
- _generate_variants still produces one English variant per tone (as
today, unchanged)
- A new helper translates each variant once per active non-en lang in
parallel via asyncio.gather, producing a {(tone, lang): content}
table for the duration of the job run
- The per-user send loop selects (user.digest_tone, user.lang),
falling back to the English variant of the same tone on miss
Translation count per run = tones × non-en active langs = 3 today.
100 Italian users no longer mean 100 translation calls.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
18 KiB
Localization (Italian active, ES/FR/DE WIP) — Design Spec
Date: 2026-05-27 Status: Draft — pending implementation plan
Context
All AI-generated content (strategic log, daily email digest, portfolio analysis) is English-only today. The operator wants to add Italian translation as the first localization, with Spanish, French, and German listed as "coming soon" in the settings UI but not yet functional. Italian must work end-to-end from settings dropdown to rendered output; the other three exist as commitments and design placeholders so adding them later is a flag flip.
This is foundational plumbing: it touches every LLM call site we ship today and shapes how every future AI feature handles language. Doing it first means later features (qty/cost edit narratives, P/L summaries, alert text, etc.) inherit the i18n wiring for free instead of needing a retrofit.
Goals
- A user can pick
Italianofrom a settings dropdown and immediately see every AI-generated surface in Italian. - Adding
es,fr, ordelater is a one-line change to a constant plus optionally validating the dropdown's enabled set. - Translation cost stays in the "noise" range — we use the same DeepSeek-4-flash model the rest of the system uses (~$0.28/M output tokens). No separate "cheap translation" plumbing.
- Strategic-log reads stay instant for non-English users — no read-time translation latency.
Non-goals
- UI label translation. The dashboard buttons, settings labels, headings, and other chrome remain English. Only the AI's own output is localized.
- Translation of indicator summaries. The same pattern will apply when those become user-facing prose, but they aren't surfaced today.
- Backfilling translations for historical strategic logs. Translation only happens going forward, at the moment a new English log is written.
- Activation of Spanish/French/German. They appear in the dropdown as "coming soon" with disabled options; the value-validation layer in the settings POST refuses them.
Two distinct translation paths
The system has two categories of AI-generated content, with different generation patterns:
Per-user content (portfolio analysis only)
Portfolio analysis is the only AI-generated surface whose content is
genuinely per-user — each call's input is the user's own pie. Here we
add the "Respond in Italian." clause to the system prompt when
user.lang != 'en'. One LLM call, no extra cost, no extra latency.
Shared content (strategic log, email digest)
Strategic log and email digest are generated once per cycle (hourly, daily) and consumed by many users. We do NOT generate them per-user per-language. Instead:
-
Strategic log:
ai_log_jobwrites the English row as today, then translates it to each active non-English language and persists instrategic_log_translations(one row per(log_id, lang))./logserves the translation matching the user'slang, falling back to English. -
Email digest: the digest job already generates one English variant per tone (NOVICE / INTERMEDIATE / PRO). We extend the same cycle so that for each tone variant, the job ALSO produces a translation for each active non-English language. The translations live alongside the English variants in memory for the duration of the job run; the per-user send step selects the matching
(tone, lang)cell. No new persistence — variants exist only for the lifetime of the job.
Why translate-after rather than generate-N-times: the shared content involves expensive context assembly (live market data, headlines, log history). Re-running the full generation in each language duplicates that work; translating the rendered output preserves a single source of truth and only spends LLM tokens on the actual prose conversion.
Why no per-user LLM call for the digest: 100 Italian users would otherwise mean 100 translation calls per day. With the shared cycle we make 3 translations per day (one per tone) regardless of how many Italian users receive that variant.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ User has user.lang preference │
│ Values: 'en' (default) | 'it' (active) | 'es'/'fr'/'de' (WIP) │
└─────────────────────────────────────────────────────────────────┘
│
├─ Per-user surface (portfolio analysis only)
│ └─ prompt assembly threads user.lang to
│ respond_in_clause() → appended to system prompt
│ when lang != 'en'. Single call_llm, no extra cost.
│
├─ Shared surface — strategic log
│ ├─ ai_log_job writes the English row as today
│ ├─ SELECTs distinct users.lang where lang != 'en'
│ │ (no tier gating)
│ ├─ asyncio.gather of one translate() call per language
│ └─ Each result → INSERT into strategic_log_translations
│ keyed by (log_id, lang) UNIQUE
│
└─ Shared surface — email digest
├─ Job builds one English variant per tone (existing
│ _generate_variants behaviour, unchanged)
├─ For each (variant, active non-en lang), translate
│ via asyncio.gather; results live in memory
└─ Per-user send loop looks up (user.digest_tone,
user.lang) in the in-memory dictionary; falls back
to the English variant of the same tone on miss
Data model
users.lang (new column)
ALTER TABLE users
ADD COLUMN lang VARCHAR(8) NOT NULL DEFAULT 'en';
Existing rows pick up the en default. Application-level validation
restricts writes to the ACTIVE_LANGUAGES set; the database column
accepts anything in VARCHAR(8) (no CHECK constraint — we want to
add new languages without a migration).
strategic_log_translations (new table)
CREATE TABLE strategic_log_translations (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
log_id BIGINT NOT NULL,
lang VARCHAR(8) NOT NULL,
content_md TEXT NOT NULL,
generated_at DATETIME(6) NOT NULL,
llm_model VARCHAR(64),
llm_cost_usd FLOAT,
CONSTRAINT fk_slt_log
FOREIGN KEY (log_id) REFERENCES strategic_logs(id) ON DELETE CASCADE,
CONSTRAINT uq_slt_log_lang UNIQUE (log_id, lang)
);
ON DELETE CASCADE means evicting an old strategic log row also drops its translations. The UNIQUE constraint prevents duplicate translations for the same log/lang combo.
Components
app/services/i18n.py (new)
LANGUAGES = {
"en": "English",
"it": "Italian",
"es": "Spanish",
"fr": "French",
"de": "German",
}
# Set of language codes that users can actually pick from the settings
# dropdown. ES/FR/DE remain in LANGUAGES so their labels render, but
# the settings POST validator and the strategic-log translation fan-out
# both consult this set.
ACTIVE_LANGUAGES = {"en", "it"}
def respond_in_clause(lang: str) -> str:
"""Suffix appended to per-user LLM system prompts.
Returns an empty string for 'en' (the default everywhere already).
Otherwise returns "\n\nRespond in <Language>." so the model knows
to write its output in the user's language.
"""
if not lang or lang == "en":
return ""
name = LANGUAGES.get(lang, "English")
return f"\n\nRespond in {name}."
app/services/translation.py (new)
async def translate(
client: httpx.AsyncClient,
text: str,
target_lang: str,
) -> tuple[str, LogResult]:
"""Translate ``text`` (markdown) to ``target_lang``.
Uses the default ``call_llm`` provider chain — DeepSeek-4-flash via
the OG API is already cheap enough ($0.28/M output) that a separate
'translation model' setting would be over-engineering.
Returns ``(translated_markdown, LogResult)`` so the caller can
persist provenance (model + cost) alongside the translation.
Raises on provider failure; caller decides whether to surface or
swallow.
"""
System prompt: "Translate the following markdown to {language}. Preserve all formatting (headings, lists, links, emphasis). Do NOT translate ticker symbols, company names, numbers, percentages, or dates. Output ONLY the translated markdown — no preamble, no commentary."
app/models.py (modified)
User: addlang: Mapped[str] = mapped_column(String(8), nullable=False, default="en", server_default="en")- New class
StrategicLogTranslationmatching the table above
app/jobs/ai_log_job.py (modified)
After the existing English log row is persisted, add a translation fan-out:
# Select distinct active non-English languages.
async with session_factory() as session:
rows = (await session.execute(
select(User.lang).distinct()
.where(User.lang.in_(ACTIVE_LANGUAGES - {"en"}))
)).scalars().all()
active_langs = list(rows)
if active_langs:
async with httpx.AsyncClient(...) as client:
results = await asyncio.gather(*[
translate(client, log_row.content_md, lang)
for lang in active_langs
], return_exceptions=True)
for lang, result in zip(active_langs, results):
if isinstance(result, Exception):
log.warning("log.translate.failed", lang=lang, error=str(result)[:200])
continue
translated_md, llm_log = result
session.add(StrategicLogTranslation(
log_id=log_row.id, lang=lang,
content_md=translated_md,
generated_at=utcnow(),
llm_model=llm_log.model,
llm_cost_usd=llm_log.cost_usd,
))
await session.commit()
Errors in individual language translations are logged but do not fail the job. Missing translations get rendered as the English fallback at read time.
app/jobs/email_digest_job.py (modified)
The job already builds one English variant per tone in
_generate_variants(...). After that returns, the job translates each
variant into every active non-English language (parallel via
asyncio.gather), and exposes a (tone, lang) -> content lookup that
_send_one(...) consults using the recipient's user.lang.
- Variants live only in memory for the duration of the job run.
- A failed translation for
(tone, lang)is logged and that cell falls back to the English variant of the same tone. The send proceeds — the user still gets a digest, just in English that day. - The subject line is part of each variant's content, so it gets translated as part of the same call.
app/services/portfolio_analysis.py (modified)
AnalysisRequestgains alang: str = "en"field, populated by the route fromprincipal.user.langanalyse(...)appendsrespond_in_clause(req.lang)to its system prompt
app/routers/universe.py (modified — the /api/analyze route)
Read the current user's lang and put it in the payload before calling
analyse(...). (The current route gets the principal via Depends.)
app/routers/pages.py / the /log resolution (modified)
When rendering /log (and the /log/{day} historical variant), look
up the user's lang. If lang != 'en', attempt to fetch the matching
StrategicLogTranslation; if present, render that. If absent, fall
back to the English StrategicLog.content_md. No silent error — the
fallback is the intended graceful path.
Settings UI (app/templates/settings.html modified)
New section under existing user preferences (alongside the digest-tone toggle):
<details class="settings-section">
<summary class="settings-section__head">Language</summary>
<p class="settings-section__lede">
The language the AI uses for the strategic log, your daily digest,
and portfolio commentary. UI labels stay in English for now.
</p>
<form method="post" action="/settings/language" class="settings-row">
<select name="lang" id="lang-select">
<option value="en" {% if user.lang == 'en' %}selected{% endif %}>English</option>
<option value="it" {% if user.lang == 'it' %}selected{% endif %}>Italiano</option>
<option value="es" disabled>Español (coming soon)</option>
<option value="fr" disabled>Français (coming soon)</option>
<option value="de" disabled>Deutsch (coming soon)</option>
</select>
<button type="submit" class="settings-btn">Save</button>
</form>
</details>
Settings POST endpoint (new)
@router.post("/settings/language")
async def set_language(
lang: str = Form(...),
cu: CurrentUser = Depends(require_auth),
session: AsyncSession = Depends(get_session),
):
if lang not in ACTIVE_LANGUAGES:
raise HTTPException(status_code=400, detail="unsupported language")
if cu.user is None:
raise HTTPException(status_code=403, detail="user required")
cu.user.lang = lang
await session.commit()
return RedirectResponse(url="/settings#language", status_code=303)
Server-side validation against ACTIVE_LANGUAGES is the gate that
keeps ES/FR/DE non-functional even if someone POSTs them by hand.
Error handling
| Case | Behaviour |
|---|---|
| Translation provider down at ai_log_job time | English row still written. Translation row missing for that hour and language. Next hour retries. No retroactive backfill in v1. |
| Translation returns malformed markdown | Stored anyway (we trust DeepSeek output enough that this is rare). Operator can delete a bad row by hand. |
User has lang=it but no IT translation for the latest log |
Fall back to English silently. Better than an empty pane. |
User saves an unsupported lang (es/fr/de/xx) via raw POST |
400 — validated against ACTIVE_LANGUAGES. |
Migrating an existing user with no lang column |
The DEFAULT 'en' clause on the migration handles it; no application code change needed. |
| User picks Italian, then logs change reaches them mid-hour | The next ai_log_job tick generates and translates a fresh log; users see the IT version on the next refresh. |
Tests
Backend (tests/test_i18n.py, tests/test_translation.py,
tests/test_localization_integration.py):
respond_in_clause('en')returns empty stringrespond_in_clause('it')includes the word "Italian"respond_in_clause('xx')falls back to "English" (defensive)translate()mocked happy path returns the translated text + LogResulttranslate()provider failure raises- ai_log_job: with no non-en users, no translation calls happen (mock asserts call_count=0)
- ai_log_job: with one user at
lang='it', one translation row written with the rightlangandlog_id - ai_log_job: translation failure on one lang doesn't fail the job; the other lang's row still writes
/logserves IT row whenuser.lang='it'and an IT translation exists/logfalls back to English whenuser.lang='it'but no IT translation exists/settings/languagePOST: acceptsen/it, rejectses/fr/de/xxwith 400analyse()system prompt contains"Respond in Italian."whenlang='it'(assert on the messages list passed to call_llm)- digest job system prompt likewise contains the clause when the user is Italian
Verification
End-to-end manual check after deploy:
- Switch a paid test user to Italian via the settings dropdown. Confirm
users.lang='it'in the DB. - Wait for the next hourly log generation (or trigger manually via cron/admin). Confirm a new
strategic_log_translationsrow exists withlang='it'andcontent_mdclearly Italian. - Open the dashboard as that user. Strategic log renders in Italian.
- Trigger the daily digest send for that user (CLI:
python -m app.cli send-test-digest user@x daily). Confirm the received email is in Italian. - Click "Analyse my portfolio" on the dashboard. Confirm the AI commentary is in Italian.
- Switch the same user back to English. Confirm the next dashboard refresh shows the English log. The IT translation row stays in the DB (other IT users still benefit).
- Inspect the dropdown. Verify ES/FR/DE appear with "(coming soon)" suffix and the option is disabled.
- Attempt
curl -X POST /settings/language -d lang=eswith a valid session cookie. Expect 400.
Migration / rollout
- Alembic migration
0022_localizationaddsusers.langand createsstrategic_log_translations. Existing rows pick upendefault. - App restart picks up the new code paths. Pre-existing English logs stay as-is. The first ai_log_job tick after deploy generates the first Italian translation for whatever active IT users exist (likely zero on day one until someone opts in).
- Removing localization later (if needed) is harmless: setting any
user's
langback toenmakes their experience identical to the pre-localization state.
Out-of-scope clarifications
- We do not translate UI labels. Italian users see English buttons, headings, and tooltips. Future scope.
- We do not translate user-supplied input (e.g. portfolio names, any free-text fields). Only AI-generated output is localized.
- The email subject line is part of each variant's content, so it
gets translated alongside the body in the same
translate()call per (tone, lang) cell — no separate subject-translation path. - We do not surface translation cost in any user-visible UI. Strategic
log translation cost lands in
strategic_log_translations.llm_cost_usd; digest translation cost is captured in the existingai_callsledger via the underlyingcall_llmcalls. - We do not gate strategic-log translation on user tier. Any user
with
lang='it'triggers Italian translation for that hour's log, regardless of whether they are paid, on credit, or free. Rationale: Italian + UK are the first markets the operator is targeting, so Italian availability is part of the public-facing experience — a free-tier visitor needs to see the AI in Italian to convert. At ~$0.005/day total cost the gating overhead is not worth the savings.