i18n: stop truncating IT translations + localise the chat sidebar

Three connected fixes after the user spotted the 2026-05-28 IT log
cutting off mid-sentence:

1. translation: bump max_tokens 4000 → 8000.
   call_llm()'s default cap was 4000, which is what the English log
   generator itself uses as its ceiling. Italian expands roughly 15-25 %
   over English in tokens, so any near-cap English source produced an
   IT translation that hit finish_reason=length and returned a
   truncated body — silently, because _call_provider() only raises when
   content is fully empty. The strategic_log_translations table has
   dozens of rows where completion_tokens landed at exactly 4000 with
   content well under half the source length. 8000 gives ample
   headroom for any of the five LANGUAGES we ship (en/it/es/fr/de).

2. log.html: localise the chat sidebar strings.
   user_lang was already passed into the template by pages.py, so an
   inline {% if user_lang == 'it' %} keeps it simple. Covers the
   "Ask Cassandra" title, the "grounded on…" hint, the helper lede,
   the textarea placeholder, and the Send button label.

3. chat endpoint: append respond_in_clause(user.lang) to the system
   prompt. The chat conversation can now happen in IT — the model's
   first reply lands in the right language even when the user's first
   turn is short.
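
A cap of 8000 only moves the ceiling, so a truncated reply can still
slip through without an error. A minimal sketch of a guard that would
surface it, assuming the result exposes content, finish_reason and
completion_tokens; LLMResult, TruncatedCompletionError and
ensure_complete are hypothetical names, not part of this commit:

```python
from dataclasses import dataclass
from typing import Optional


class TruncatedCompletionError(RuntimeError):
    """Raised when the provider stopped because it hit the token cap."""


@dataclass
class LLMResult:
    # Hypothetical result shape; the real call_llm() return type is not
    # shown in this commit.
    content: Optional[str]
    finish_reason: Optional[str]
    completion_tokens: int = 0


def ensure_complete(result: LLMResult, max_tokens: int) -> str:
    """Return the stripped content, or raise if it is empty or truncated."""
    content = (result.content or "").strip()
    if not content:
        raise ValueError("empty completion")
    if result.finish_reason == "length":
        # Non-empty but cut off at the cap: the case that used to pass silently.
        raise TruncatedCompletionError(
            f"completion hit the {max_tokens}-token cap "
            f"({result.completion_tokens} tokens emitted)"
        )
    return content
```

Raising here, rather than storing the body, would keep a cut-off
translation out of strategic_log_translations in the first place.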

scripts/backfill_truncated_translations.py: one-off cleanup utility.
Scans strategic_log_translations for rows whose translated content is
< 70 % of the English source (the truncation signal — IT *expands*
beyond English, so a shorter translation is always suspect), deletes
them, and re-translates via the service with its cap raised to 8000
tokens. Supports --date, --since, --all and --dry-run. The 2026-05-28
fan-out has already been
re-translated (13/13 rows). Other historical dates still hold older
truncations; the user can decide whether to backfill those (the script
is idempotent).
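
The 70 % truncation signal can be sketched as below. This illustrates
the heuristic only and is not the script's actual code: the row shape,
the function names and the choice to measure length in characters are
all assumptions.

```python
from typing import Iterable, List, Tuple

# Hypothetical row shape: (row_id, english_source, translated_content).
Row = Tuple[int, str, str]


def looks_truncated(source: str, translation: str, threshold: float = 0.70) -> bool:
    """True when the translation is shorter than `threshold` x the source.

    Length is measured in characters here; that unit is an assumption.
    """
    if not source:
        return False  # nothing to compare against
    return len(translation) / len(source) < threshold


def select_suspect_rows(rows: Iterable[Row], threshold: float = 0.70) -> List[int]:
    """Return the ids of rows that should be deleted and re-translated."""
    return [row_id for row_id, src, dst in rows if looks_truncated(src, dst, threshold)]
```

A row that has been re-translated successfully no longer falls under
the threshold, so a second run skips it. That is what makes a scan of
this kind safe to repeat.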

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Giorgio Gilestro 2026-05-29 11:44:41 +02:00
parent 3e1a14f334
commit 48f022b71b
4 changed files with 180 additions and 5 deletions

@@ -21,6 +21,7 @@ from app.db import get_session, utcnow
 from app.jobs._market_context import REFERENCE_LINE
 from app.models import AICall, Headline, Quote, StrategicLog
 from app.routers.api import _md_to_html
+from app.services.i18n import respond_in_clause
 from app.services.llm_prompts import build_chat_system_prompt
 from app.services.openrouter import call_llm, month_start
@@ -160,6 +161,13 @@ async def chat(
 headlines=headlines,
 reference_line=REFERENCE_LINE,
 )
+# Respect the user's interface language preference: append a single
+# localized "respond in" nudge so the assistant answers in IT when
+# the user has lang=it. The prompt + history (which includes the
+# user's own question, often in their language) are usually enough,
+# but the nudge guarantees the first reply lands correctly.
+user_lang = principal.user.lang if principal and principal.user else "en"
+system_prompt = system_prompt + respond_in_clause(user_lang)
 msgs = [{"role": "system", "content": system_prompt}]
 for m in history:

@@ -65,7 +65,13 @@ async def translate(
 {"role": "system", "content": system_prompt},
 {"role": "user", "content": text},
 ]
-result = await call_llm(client, messages)
+# Italian / Spanish / French / German typically expand the token count
+# 15-25 % over English (longer words, more sub-word splits). Our
+# strategic-log generator runs up to its own 4000-token cap, so a 4000
+# cap here would silently truncate any near-cap source. 8000 gives
+# ample headroom for every language we currently support and costs
+# nothing extra unless the model actually emits more tokens.
+result = await call_llm(client, messages, max_tokens=8000)
 content = (result.content or "").strip()
 # Strip code fences if the model wrapped its output despite the system rule.

@@ -33,21 +33,28 @@
 {% if paid %}
 <aside id="chat-sidebar" class="log-page__chat">
 <div class="chat-header">
-<span class="chat-title">Ask Cassandra</span>
-<span class="chat-hint">grounded on the latest log + live data</span>
+<span class="chat-title">{% if user_lang == 'it' %}Chiedi a Cassandra{% else %}Ask Cassandra{% endif %}</span>
+<span class="chat-hint">{% if user_lang == 'it' %}basato sull'ultimo log + dati in tempo reale{% else %}grounded on the latest log + live data{% endif %}</span>
 </div>
 <div id="chat-thread" class="chat-thread">
 <div class="chat-msg chat-msg--system">
+{% if user_lang == 'it' %}
+Fai domande sull'analisi di oggi. Il modello vede l'ultimo log
+strategico, le quotazioni di mercato in tempo reale per tutti i
+gruppi e le ultime 24h di titoli filtrati per tesi. Un refresh
+della pagina cancella questa conversazione.
+{% else %}
 Ask about today's analysis. The model sees the latest strategic log,
 live market readings across all groups, and the last 24h of
 thesis-filtered headlines. Refresh wipes this conversation.
+{% endif %}
 </div>
 </div>
 <form id="chat-form" class="chat-form" autocomplete="off">
 <textarea id="chat-input" rows="2"
-placeholder="e.g. why is the defence sleeve flat through Hormuz?"
+placeholder="{% if user_lang == 'it' %}es. perché il comparto difesa è piatto nonostante Hormuz?{% else %}e.g. why is the defence sleeve flat through Hormuz?{% endif %}"
 required></textarea>
-<button id="chat-send" type="submit">Send</button>
+<button id="chat-send" type="submit">{% if user_lang == 'it' %}Invia{% else %}Send{% endif %}</button>
 </form>
 </aside>
 {% else %}