news: auto-tag headlines + market-aware cadence + filter UI

- Move news_job from hourly to 3x/hour (cron 10,30,50), with a CadencePolicy
  gate that throttles to active hours (07-21 UTC weekdays at 20 min), off-hours
  (3 h), weekends (6 h). Keeps the daytime feed fresh without spamming RSS
  sources overnight.
- Tag each headline on ingestion via DeepSeek (BATCH_SIZE=25, max_tokens=4000,
  json.JSONDecoder().raw_decode + per-row regex recovery for resilient parsing).
  Vocabulary: 16 tags including new EU / USA / AI / Conflict. NULL tags are
  picked up automatically on the next news_job run, so back-tagging is implicit
  rather than a separate migration step.
- Tag UI: pill bar above the feed with off → include → exclude cycle on click;
  shift-click jumps straight to exclude. State persists in localStorage and is
  injected into /api/news requests via htmx:configRequest. Per-row chips sit to
  the right of the headline (new 5-column grid: age | source | title | tags |
  UTC) so vertical density stays high.
- Strategic log header bug: model was hallucinating "(Updated 21:30 UTC)" in
  future tense. Bumped PROMPT_VERSION 6→7, added explicit ban on time-of-day
  clauses, and supply the actual current UTC time in the user prompt so the
  model has no need to invent one.

Migration 0012 adds headlines.tags (JSON, nullable). Tests cover vocabulary
integrity, validation/normalisation, and the JSON-recovery parser (17 tests).
This commit is contained in:
Giorgio Gilestro 2026-05-21 23:25:03 +01:00
parent 6e7f57c6b2
commit 2013bfa8cc
15 changed files with 745 additions and 25 deletions

View file

@ -26,7 +26,10 @@ OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# framing aimed at young investors entering the trading world. NOVICE retuned
# to be pedagogical (defining terms, anti-pattern teach-backs); INTERMEDIATE
# kept terse but with light-touch educational nudges. See tasks/todo.md.
PROMPT_VERSION = 6
# v7 (2026-05-18): Forbid "(Updated HH:MM UTC)" clauses in the date header —
# the model was hallucinating future times. The user prompt now carries the
# actual current UTC time so the model has accurate temporal context.
PROMPT_VERSION = 7
# --- Core: invariant across tone/analysis settings ----------------------------
@ -49,7 +52,11 @@ cover the same event, read the gap in framing — that's the data.
implications is filler.
# Structure
- One-line date header + any anchor framing (e.g. "Week 11 since Hormuz").
- One-line date header containing ONLY the date (e.g. `2026-05-18`) and \
optional anchor framing on the same line (e.g. "Week 11 since Hormuz"). \
**Never include a time-of-day clause like "(Updated 21:30 UTC)"** \
generation time is recorded as metadata elsewhere. Inventing a future or \
arbitrary time in the header confuses readers.
- Immediately after the date header with **nothing** in between write a \
TL;DR. Format it as:
@ -423,7 +430,12 @@ def build_user_prompt(
"""Assemble the user message from already-fetched-and-persisted data.
If `previous_log` is a StrategicLog from earlier today, it's included
as 'Update mode' context the model will revise rather than restart."""
parts = [f"# Strategic log request — {today.strftime('%Y-%m-%d')}"]
parts = [
f"# Strategic log request — {today.strftime('%Y-%m-%d')}",
# Explicit current time so the model doesn't hallucinate one. The
# date header it writes MUST stay date-only (per system prompt).
f"Current time: {today.strftime('%Y-%m-%d %H:%M UTC')}",
]
if anchor:
parts.append(f"Anchor reference date: {anchor}")
if reference_line: