localization: digest is shared, not per-user (corrected design)

The user pointed out that the only genuinely per-user AI surface is portfolio analysis. The strategic log AND the email digest are both shared cycles: generated once per cycle, consumed by many users.

For the digest, this means:

- _generate_variants still produces one English variant per tone (as today, unchanged)
- A new helper translates each variant once per active non-en lang in parallel via asyncio.gather, producing a {(tone, lang): content} table for the duration of the job run
- The per-user send loop selects (user.digest_tone, user.lang), falling back to the English variant of the same tone on miss

Translation count per run = tones × non-en active langs = 3 today. 100 Italian users no longer mean 100 translation calls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 16:22:41 +02:00
commit 2ecf250d53
parent 8af1da12dd
2 changed files with 279 additions and 111 deletions
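
Note: the digest fan-out described in the commit message can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this commit: `translate`, `build_translation_table`, and `pick_variant` are hypothetical names, and the `translate` body is a stand-in for the real LLM-backed call.

```python
import asyncio

TONES = ("NOVICE", "INTERMEDIATE", "PRO")


async def translate(text: str, lang: str) -> str:
    """Hypothetical stand-in for the LLM-backed translate() call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if lang == "xx":
        # Simulated failure, used to exercise the fallback path.
        raise RuntimeError("simulated translation failure")
    return f"[{lang}] {text}"


async def build_translation_table(variants, langs):
    """Translate every (tone, lang) pair concurrently.

    Failed cells are logged and left out of the table, so lookups
    for them fall back to the English variant of the same tone.
    """
    keys = [(tone, lang) for tone in variants for lang in langs]
    results = await asyncio.gather(
        *(translate(variants[tone], lang) for tone, lang in keys),
        return_exceptions=True,  # one failure must not cancel the rest
    )
    table = {}
    for key, result in zip(keys, results):
        if isinstance(result, BaseException):
            print(f"translation failed for {key}: {result}")
            continue
        table[key] = result
    return table


def pick_variant(variants, table, tone, lang):
    """Return the (tone, lang) cell, or the English variant on a miss."""
    return table.get((tone, lang), variants[tone])
```

With three tones and Italian as the only active non-English language, the table holds three cells per run however many recipients there are; a user whose `(tone, lang)` cell is missing simply receives the English variant of the same tone.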
--- a/docs/superpowers/specs/2026-05-27-localization-italian-design.md
+++ b/docs/superpowers/specs/2026-05-27-localization-italian-design.md
@@ -6,10 +6,10 @@
 ## Context

 All AI-generated content (strategic log, daily email digest, portfolio
-analysis, follow-up chat) is English-only today. The operator wants to
-add Italian translation as the first localization, with Spanish,
-French, and German listed as "coming soon" in the settings UI but not
-yet functional. Italian must work end-to-end from settings dropdown to
+analysis) is English-only today. The operator wants to add Italian
+translation as the first localization, with Spanish, French, and
+German listed as "coming soon" in the settings UI but not yet
+functional. Italian must work end-to-end from settings dropdown to
 rendered output; the other three exist as commitments and design
 placeholders so adding them later is a flag flip.

@@ -49,30 +49,44 @@ retrofit.
 The system has two categories of AI-generated content, with different
 generation patterns:

-### Per-user content (analyse, digest, chat)
+### Per-user content (portfolio analysis only)

-Each call already produces output for exactly one user. The fix is
-trivial: the user's `lang` threads into the prompt assembly, and the
-system prompt gains a `"Respond in Italian."` clause when `lang != 'en'`.
-One LLM call, no extra cost, no extra latency.
+Portfolio analysis is the only AI-generated surface whose *content* is
+genuinely per-user — each call's input is the user's own pie. Here we
+add the `"Respond in Italian."` clause to the system prompt when
+`user.lang != 'en'`. One LLM call, no extra cost, no extra latency.

-### Shared content (strategic log)
+### Shared content (strategic log, email digest)

-The hourly `ai_log_job` writes a single English log row used by every
-user. To serve non-English users, we generate the English log as today,
-then translate it to each active non-English language via a separate
-LLM call and store the result in a new `strategic_log_translations`
-table. Translations are fanned out in parallel with `asyncio.gather` so
-total translation time is max(single call), not sum. The `/log`
-endpoint serves the translation matching the requester's `lang`,
-falling back to English if none exists.
+Strategic log and email digest are generated once per cycle (hourly,
+daily) and consumed by many users. We do NOT generate them per-user
+per-language. Instead:

-Why translate-after rather than generate-N-times: the strategic log
-includes live market data, headlines, and references that are
-expensive to assemble. Re-running the full generation in each language
-duplicates that work; translating the rendered output preserves a
-single source of truth (the English original) and only spends LLM
-tokens on the actual prose conversion.
+- **Strategic log**: `ai_log_job` writes the English row as today,
+  then translates it to each active non-English language and persists
+  in `strategic_log_translations` (one row per `(log_id, lang)`).
+  `/log` serves the translation matching the user's `lang`, falling
+  back to English.
+
+- **Email digest**: the digest job already generates one English
+  variant per tone (NOVICE / INTERMEDIATE / PRO). We extend the same
+  cycle so that for each tone variant, the job ALSO produces a
+  translation for each active non-English language. The translations
+  live alongside the English variants in memory for the duration of
+  the job run; the per-user send step selects the matching
+  `(tone, lang)` cell. No new persistence — variants exist only for
+  the lifetime of the job.
+
+Why translate-after rather than generate-N-times: the shared content
+involves expensive context assembly (live market data, headlines, log
+history). Re-running the full generation in each language duplicates
+that work; translating the rendered output preserves a single source
+of truth and only spends LLM tokens on the actual prose conversion.
+
+Why no per-user LLM call for the digest: 100 Italian users would
+otherwise mean 100 translation calls per day. With the shared cycle
+we make tones × active non-English languages = 3 translations per day
+today, regardless of how many Italian users receive each variant.

 ## Architecture

@@ -82,18 +96,27 @@ tokens on the actual prose conversion.
 │  Values: 'en' (default) | 'it' (active) | 'es'/'fr'/'de' (WIP)  │
 └─────────────────────────────────────────────────────────────────┘
        │
-        ├─ Per-user surfaces (portfolio analyse, daily digest, chat)
+        ├─ Per-user surface (portfolio analysis only)
        │     └─ prompt assembly threads user.lang to
        │        respond_in_clause() → appended to system prompt
        │        when lang != 'en'. Single call_llm, no extra cost.
        │
-        └─ Shared surfaces (strategic log)
-              ├─ ai_log_job writes the English row as today
-              ├─ Then SELECTs distinct users.lang where lang != 'en'
-              │  AND user has active paid access
-              ├─ asyncio.gather of one translate() call per language
-              └─ Each result → INSERT into strategic_log_translations
-                 keyed by (log_id, lang) UNIQUE
+        ├─ Shared surface — strategic log
+        │     ├─ ai_log_job writes the English row as today
+        │     ├─ SELECTs distinct users.lang where lang != 'en'
+        │     │  (no tier gating)
+        │     ├─ asyncio.gather of one translate() call per language
+        │     └─ Each result → INSERT into strategic_log_translations
+        │        keyed by (log_id, lang) UNIQUE
+        │
+        └─ Shared surface — email digest
+              ├─ Job builds one English variant per tone (existing
+              │  _generate_variants behaviour, unchanged)
+              ├─ For each (variant, active non-en lang), translate
+              │  via asyncio.gather; results live in memory
+              └─ Per-user send loop looks up (user.digest_tone,
+                 user.lang) in the in-memory dictionary; falls back
+                 to the English variant of the same tone on miss
 ```

 ## Data model
@@ -233,12 +256,18 @@ read time.

 ### `app/jobs/email_digest_job.py` (modified)

-The digest is already per-user and assembles its own prompt. Thread
-`user.lang` through:
+The job already builds one English variant per tone in
+`_generate_variants(...)`. After that returns, the job translates each
+variant into every active non-English language (parallel via
+`asyncio.gather`), and exposes a `(tone, lang) -> content` lookup that
+`_send_one(...)` consults using the recipient's `user.lang`.

-- `_generate_variants(...)` accepts a `target_lang` param
-- The system prompt assembly appends `respond_in_clause(target_lang)`
-- Subject-line generation runs in the same call, so it's localized too
+- Variants live only in memory for the duration of the job run.
+- A failed translation for `(tone, lang)` is logged and that cell
+  falls back to the English variant of the same tone. The send
+  proceeds — the user still gets a digest, just in English that day.
+- The subject line is part of each variant's content, so it gets
+  translated as part of the same call.

 ### `app/services/portfolio_analysis.py` (modified)

@@ -364,15 +393,15 @@ End-to-end manual check after deploy:

 - We do not translate UI labels. Italian users see English buttons,
  headings, and tooltips. Future scope.
-- We do not translate user-generated content (chat questions the user
-  types). Only the AI's output is localized; user-supplied input flows
-  through unchanged.
-- We do not translate the email subject line independently. The same
-  per-user LLM call that generates the digest body also generates the
-  subject in the target language.
-- We do not surface translation cost in any user-visible UI. Cost is
-  recorded in `strategic_log_translations.llm_cost_usd` and the existing
-  `ai_calls` ledger picks up per-user calls as today.
+- We do not translate user-supplied input (e.g. portfolio names, any
+  free-text fields). Only AI-generated output is localized.
+- The email subject line is part of each variant's content, so it
+  gets translated alongside the body in the same `translate()` call
+  per (tone, lang) cell — no separate subject-translation path.
+- We do not surface translation cost in any user-visible UI. Strategic
+  log translation cost lands in `strategic_log_translations.llm_cost_usd`;
+  digest translation cost is captured in the existing `ai_calls` ledger
+  via the underlying `call_llm` calls.
 - We do **not** gate strategic-log translation on user tier. Any user
  with `lang='it'` triggers Italian translation for that hour's log,
  regardless of whether they are paid, on credit, or free. Rationale: