news: auto-tag headlines + market-aware cadence + filter UI
- Move news_job from hourly to 3x/hour (cron 10,30,50), with a CadencePolicy gate that throttles to active hours (07-21 UTC weekdays at 20 min), off-hours (3 h), weekends (6 h). Keeps the daytime feed fresh without spamming RSS sources overnight. - Tag each headline on ingestion via DeepSeek (BATCH_SIZE=25, max_tokens=4000, json.JSONDecoder().raw_decode + per-row regex recovery for resilient parsing). Vocabulary: 16 tags including new EU / USA / AI / Conflict. NULL tags are picked up automatically on the next news_job run, so back-tagging is implicit rather than a separate migration step. - Tag UI: pill bar above the feed with off → include → exclude cycle on click; shift-click jumps straight to exclude. State persists in localStorage and is injected into /api/news requests via htmx:configRequest. Per-row chips sit to the right of the headline (new 5-column grid: age | source | title | tags | UTC) so vertical density stays high. - Strategic log header bug: model was hallucinating "(Updated 21:30 UTC)" in future tense. Bumped PROMPT_VERSION 6→7, added explicit ban on time-of-day clauses, and supply the actual current UTC time in the user prompt so the model has no need to invent one. Migration 0012 adds headlines.tags (JSON, nullable). Tests cover vocabulary integrity, validation/normalisation, and the JSON-recovery parser (17 tests).
This commit is contained in:
parent
6e7f57c6b2
commit
2013bfa8cc
15 changed files with 745 additions and 25 deletions
|
|
@ -67,6 +67,10 @@ class Headline(Base):
|
|||
published_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False)
|
||||
fetched_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=utcnow)
|
||||
fingerprint: Mapped[str] = mapped_column(String(40), nullable=False) # sha1 of normalised title
|
||||
# Semantic content tags from app.services.news_tagging. NULL = not yet
|
||||
# tagged; the next news_job run picks it up. Each entry is one of the
|
||||
# values in news_tagging.TAG_VOCABULARY.
|
||||
tags: Mapped[list[str] | None] = mapped_column(JSON, nullable=True)
|
||||
|
||||
__table_args__ = (
|
||||
UniqueConstraint("fingerprint", name="uq_headlines_fingerprint"),
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue