i18n: stop truncating IT translations + localise the chat sidebar

Three connected fixes after the user spotted the 2026-05-28 IT log
cutting off mid-sentence:

1. translation: bump max_tokens 4000 → 8000. call_llm()'s default cap
   was 4000, which is also the ceiling the English log generator
   itself uses. Italian expands roughly 15-25 % over English in
   tokens, so any near-cap English source produced an IT translation
   that hit finish_reason=length and returned a truncated body. This
   happened silently, because _call_provider() only raises when
   content is fully empty. The strategic_log_translations table has
   dozens of rows where completion_tokens landed at exactly 4000 with
   content well under half the source length. 8000 gives ample
   headroom for any of the five LANGUAGES we ship (en/it/es/fr/de).

2. log.html: localise the chat sidebar strings. user_lang was already
   passed into the template by pages.py, so an inline
   {% if user_lang == 'it' %} keeps it simple. Covers the
   "Ask Cassandra" title, the "grounded on…" hint, the helper lede,
   the textarea placeholder, and the Send button label.

3. chat endpoint: append respond_in_clause(user.lang) to the system
   prompt. The chat conversation can now happen in IT: the model's
   first reply lands in the right language even when the user's first
   turn is short.

scripts/backfill_truncated_translations.py: one-off cleanup utility.
Scans strategic_log_translations for rows whose translated content is
< 70 % of the English source (the truncation signal: IT *expands*
beyond English, so a shorter translation is always suspect), deletes
them, and re-translates via the service with the raised 8000-token
cap. Supports --date, --since, --all and --dry-run.

The 2026-05-28 fan-out has already been re-translated (13/13 rows).
Other historical dates still hold older truncations; the user can
decide whether to backfill those (the script is idempotent).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
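The truncation signal used by the backfill script can be sketched roughly as follows. This is an illustration only, not the script's actual code: the TranslationRow shape, the field names, and the choice to compare character lengths (rather than tokens) are assumptions; only the 70 % threshold comes from the message above.

```python
from dataclasses import dataclass

# Threshold from the commit message: a translation shorter than 70 % of
# its English source is treated as a likely truncation.
TRUNCATION_RATIO = 0.70


@dataclass
class TranslationRow:
    """Hypothetical stand-in for a strategic_log_translations row."""
    log_date: str
    lang: str
    source_text: str      # English original
    translated_text: str  # stored translation


def is_suspect(row: TranslationRow, ratio: float = TRUNCATION_RATIO) -> bool:
    """Return True when the translation looks truncated by length alone."""
    source_len = len(row.source_text)
    if source_len == 0:
        return False  # nothing to compare against
    # Assumption: lengths are compared in characters, not tokens.
    return len(row.translated_text) < ratio * source_len


def find_suspects(rows):
    """Return the rows a backfill pass would delete and re-translate."""
    return [row for row in rows if is_suspect(row)]
```

A --dry-run mode would presumably stop after find_suspects() and only report the rows, without deleting or re-translating anything.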
2026-05-29 11:44:41 +02:00
commit 48f022b71b
parent 3e1a14f334
4 changed files with 180 additions and 5 deletions
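The silent failure described in the message (a finish_reason=length reply being accepted because the content was not empty) could also be caught at the call site. The sketch below is not part of this commit; LLMResult, ensure_complete and TruncatedCompletionError are hypothetical names, and the real return type of call_llm() may expose these fields differently.

```python
from dataclasses import dataclass
from typing import Optional


class TruncatedCompletionError(RuntimeError):
    """Raised when the provider stopped because it hit the token cap."""


@dataclass
class LLMResult:
    """Hypothetical result shape; the real call_llm() result may differ."""
    content: Optional[str]
    finish_reason: Optional[str]
    completion_tokens: int


def ensure_complete(result: LLMResult) -> str:
    """Return the stripped content, rejecting empty or cut-off replies."""
    content = (result.content or "").strip()
    if not content:
        raise ValueError("empty completion")
    # "length" is the finish reason named in the commit message for
    # replies that stopped at max_tokens.
    if result.finish_reason == "length":
        raise TruncatedCompletionError(
            f"completion cut off after {result.completion_tokens} tokens"
        )
    return content
```

With a guard like this, a reply that stops at the cap would raise instead of being stored as a complete translation.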
--- a/app/routers/chat.py
+++ b/app/routers/chat.py
@@ -21,6 +21,7 @@ from app.db import get_session, utcnow
 from app.jobs._market_context import REFERENCE_LINE
 from app.models import AICall, Headline, Quote, StrategicLog
 from app.routers.api import _md_to_html
+from app.services.i18n import respond_in_clause
 from app.services.llm_prompts import build_chat_system_prompt
 from app.services.openrouter import call_llm, month_start

@@ -160,6 +161,13 @@ async def chat(
        headlines=headlines,
        reference_line=REFERENCE_LINE,
    )
+    # Respect the user's interface language preference: append a single
+    # localized "respond in" nudge so the assistant answers in IT when
+    # the user has lang=it. The prompt + history (which includes the
+    # user's own question, often in their language) are usually enough,
+    # but the nudge guarantees the first reply lands correctly.
+    user_lang = principal.user.lang if principal and principal.user else "en"
+    system_prompt = system_prompt + respond_in_clause(user_lang)

    msgs = [{"role": "system", "content": system_prompt}]
    for m in history:
--- a/app/services/translation.py
+++ b/app/services/translation.py
@@ -65,7 +65,13 @@ async def translate(
        {"role": "system", "content": system_prompt},
        {"role": "user",   "content": text},
    ]
-    result = await call_llm(client, messages)
+    # Italian / Spanish / French / German typically expand the token count
+    # 15-25 % over English (longer words, more sub-word splits). Our
+    # strategic-log generator runs up to its own 4000-token cap, so a 4000
+    # cap here would silently truncate any near-cap source. 8000 gives
+    # ample headroom for every language we currently support and costs
+    # nothing extra unless the model actually emits more tokens.
+    result = await call_llm(client, messages, max_tokens=8000)

    content = (result.content or "").strip()
    # Strip code fences if the model wrapped its output despite the system rule.
--- a/app/templates/log.html
+++ b/app/templates/log.html
@@ -33,21 +33,28 @@
    {% if paid %}
    <aside id="chat-sidebar" class="log-page__chat">
      <div class="chat-header">
-        <span class="chat-title">Ask Cassandra</span>
-        <span class="chat-hint">grounded on the latest log + live data</span>
+        <span class="chat-title">{% if user_lang == 'it' %}Chiedi a Cassandra{% else %}Ask Cassandra{% endif %}</span>
+        <span class="chat-hint">{% if user_lang == 'it' %}basato sull'ultimo log + dati in tempo reale{% else %}grounded on the latest log + live data{% endif %}</span>
      </div>
      <div id="chat-thread" class="chat-thread">
        <div class="chat-msg chat-msg--system">
+          {% if user_lang == 'it' %}
+          Fai domande sull'analisi di oggi. Il modello vede l'ultimo log
+          strategico, le quotazioni di mercato in tempo reale per tutti i
+          gruppi e le ultime 24h di titoli filtrati per tesi. Un refresh
+          della pagina cancella questa conversazione.
+          {% else %}
          Ask about today's analysis. The model sees the latest strategic log,
          live market readings across all groups, and the last 24h of
          thesis-filtered headlines. Refresh wipes this conversation.
+          {% endif %}
        </div>
      </div>
      <form id="chat-form" class="chat-form" autocomplete="off">
        <textarea id="chat-input" rows="2"
-                  placeholder="e.g. why is the defence sleeve flat through Hormuz?"
+                  placeholder="{% if user_lang == 'it' %}es. perché il comparto difesa è piatto nonostante Hormuz?{% else %}e.g. why is the defence sleeve flat through Hormuz?{% endif %}"
                  required></textarea>
-        <button id="chat-send" type="submit">Send</button>
+        <button id="chat-send" type="submit">{% if user_lang == 'it' %}Invia{% else %}Send{% endif %}</button>
      </form>
    </aside>
    {% else %}