read.markets/tests/test_glossary.py
Giorgio Gilestro 6e7f57c6b2 phase G: data minimisation + passwordless auth + DeepSeek-first LLM
Server no longer holds portfolios. Holdings live in the browser
(localStorage); the server publishes an anonymous ticker_universe and a
gzipped /api/universe payload identical for every authenticated user, so
access patterns can't betray which tickers a user holds. AI commentary
is generated ephemerally from the browser-supplied pie and the cost
ledger row records no positions. Migrations 0009-0011 added the
universe table and dropped positions / portfolio_snapshots /
portfolios.
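
A minimal sketch of the uniform-payload idea above, assuming a hypothetical build_universe_payload helper and a made-up {"tickers": [...]} shape (neither is taken from the repository). The point it illustrates: the server serialises the whole anonymous universe once, every user receives the same bytes, and filtering to actual holdings happens on the client. The real client is browser code; select_holdings stands in for it here.

```python
import gzip
import json


def build_universe_payload(tickers):
    """Serialise the full ticker universe; the output does not depend on any user."""
    # Sort and de-duplicate so the bytes are stable regardless of input order.
    body = json.dumps({"tickers": sorted(set(tickers))}, separators=(",", ":"))
    # mtime=0 keeps the gzip header deterministic across calls.
    return gzip.compress(body.encode("utf-8"), mtime=0)


def select_holdings(payload, held):
    """Client-side step: keep only the tickers the user holds locally."""
    universe = json.loads(gzip.decompress(payload))["tickers"]
    return [ticker for ticker in universe if ticker in held]


payload = build_universe_payload(["MSFT", "AAPL", "VOD.L"])
mine = select_holdings(payload, {"AAPL", "VOD.L"})
```

Because the response is byte-identical for every caller, request logs for this endpoint carry no information about an individual's positions.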

Authentication is now e-mail OTP only. Migration 0010 dropped
password_hash and email_verified (every active session is by
construction proof of email control). The /signup endpoint is gone;
signup and login share a single email-entry page. Email rendering is
HTML+plain-text multipart with a shared brand palette (app/branding.py)
asserted in sync with the CSS by a drift-detection test.
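
A rough sketch of what a palette drift check could look like. The palette values, the CSS custom-property names and the palette_drift helper are all assumptions for illustration; the actual contents of app/branding.py and the stylesheet are not shown in this commit view.

```python
import re

# Hypothetical palette; the real one lives in app/branding.py.
BRAND_PALETTE = {"brand-primary": "#0b3d91", "brand-accent": "#f2a900"}

# Hypothetical stylesheet excerpt using CSS custom properties.
CSS_SAMPLE = """
:root {
  --brand-primary: #0B3D91;
  --brand-accent: #f2a900;
}
"""

_CSS_VAR = re.compile(r"--([\w-]+)\s*:\s*(#[0-9a-fA-F]{3,8})\s*;")


def css_custom_properties(css):
    """Extract `--name: #hex;` declarations, lower-casing the colour values."""
    return {name: value.lower() for name, value in _CSS_VAR.findall(css)}


def palette_drift(palette, css):
    """Return {name: (python_value, css_value)} for every mismatching colour."""
    css_vars = css_custom_properties(css)
    return {
        name: (value.lower(), css_vars.get(name))
        for name, value in palette.items()
        if css_vars.get(name) != value.lower()
    }
```

A test would then assert that palette_drift(...) is empty, so a colour changed on one side only fails CI instead of shipping mismatched e-mails.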

LLM provider defaults to DeepSeek-direct (cheaper, api.deepseek.com)
with OpenRouter as automatic fallback if DeepSeek fails. ai_log_job and
indicator_summary_job now iterate the two tones (NOVICE, INTERMEDIATE)
per cycle so the dashboard's tone toggle is instant; PROMPT_VERSION
bumped to 6 with an educational anti-TA / anti-gambling stance baked
into _CORE. NOVICE mode renders a curated glossary inline (CBOE VIX,
yield curve, HY OAS, etc.) with JS-positioned tooltips that survive
viewport edges and sticky bars. Model name and tokens hidden from the
user UI; still recorded in StrategicLog.model and AICall for admin.

Layout adds a sticky top nav, a sticky bottom markets bar (one chip per
exchange with status LED + headline index + 1d change), and
Phase H feedback reporting is queued in tasks/todo.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 14:16:57 +01:00


"""Unit tests for the Novice-mode glossary wrap. Pure-function; no DB / HTTP."""
from __future__ import annotations

import pytest

from app.services.glossary import wrap_glossary


def test_no_op_when_tone_is_not_novice():
    """Wrap is gated by tone: INTERMEDIATE and unset both pass through."""
    text = "VIX spiked to 22."
    assert wrap_glossary(text, tone="INTERMEDIATE") == text
    assert wrap_glossary(text, tone=None) == text
    assert wrap_glossary(text, tone="") == text


def test_no_op_when_html_is_empty():
    assert wrap_glossary("", tone="NOVICE") == ""
    assert wrap_glossary(None, tone="NOVICE") == ""


def test_wraps_first_occurrence_only():
    """A term that appears twice gets wrapped only on the first hit;
    repeating tooltips on every word is noisy."""
    out = wrap_glossary("VIX is high; VIX matters.", tone="NOVICE")
    assert out.count('class="glossary"') == 1
    assert '>VIX</span>' in out
    # Second occurrence stays plain.
    assert "; VIX matters" in out


def test_wraps_multiple_distinct_terms():
    out = wrap_glossary("VIX rose; the yield curve flattened.", tone="NOVICE")
    assert 'data-term="VIX"' in out
    assert 'data-term="yield curve"' in out


def test_acronyms_are_case_sensitive():
    """VIX matches; 'vix' alone shouldn't (avoid false positives)."""
    assert 'class="glossary"' in wrap_glossary("VIX up.", tone="NOVICE")
    assert 'class="glossary"' not in wrap_glossary("vix up.", tone="NOVICE")


def test_phrase_terms_match_case_insensitively():
    """'yield curve' should match regardless of capitalisation."""
    out_lower = wrap_glossary("the yield curve flattened.", tone="NOVICE")
    out_title = wrap_glossary("The Yield Curve flattened.", tone="NOVICE")
    assert 'class="glossary"' in out_lower
    assert 'class="glossary"' in out_title


def test_aliases_match():
    """'credit spread' aliases through to the canonical HY OAS entry."""
    out = wrap_glossary("the credit spread widened today.", tone="NOVICE")
    assert 'class="glossary"' in out
    assert 'data-term="HY OAS"' in out


def test_word_boundary_prevents_substring_match():
    """ERP shouldn't match inside 'WERP', 'HERP', etc."""
    out = wrap_glossary("WERPS isn't a term.", tone="NOVICE")
    assert 'class="glossary"' not in out


def test_definition_is_escaped_in_data_attr():
    """A definition with quotes/HTML must be HTML-escaped in attributes
    so it doesn't break the surrounding markup."""
    out = wrap_glossary("VIX moved.", tone="NOVICE")
    # data-def="..." must use &quot; not raw ", &amp; not raw &.
    assert 'data-def="' in out
    # The S&P 500 reference in the VIX definition uses an ampersand; it
    # should be escaped.
    assert "&amp;P 500" in out
    assert '"P 500' not in out  # raw " inside attr would break


def test_skips_content_inside_code_blocks():
    """Wrapping inside <code> would mangle source examples; we skip those."""
    html = "Outside: VIX is up. <code>Inside: VIX is up.</code>"
    out = wrap_glossary(html, tone="NOVICE")
    # The first VIX (outside) should be wrapped.
    assert '<span class="glossary"' in out
    # The VIX inside <code> stays plain.
    assert "<code>Inside: VIX is up.</code>" in out


def test_skips_content_inside_anchor_tags():
    """Wrapping inside <a> would double up on tooltips and break the link."""
    html = '<a href="/x">VIX explainer</a> and VIX here too.'
    out = wrap_glossary(html, tone="NOVICE")
    # Anchor content untouched.
    assert '<a href="/x">VIX explainer</a>' in out
    # The non-anchor VIX got wrapped.
    assert '<span class="glossary"' in out


def test_preserves_original_casing_in_wrapped_text():
    """The visible text inside the span should match what was in the source,
    not be replaced with the canonical label."""
    out = wrap_glossary("The Yield Curve is flat.", tone="NOVICE")
    assert ">Yield Curve</span>" in out
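
For orientation, here is one possible shape for a function that would satisfy the behaviours these tests pin down. It is a sketch under assumptions, not the contents of app/services/glossary.py: the GLOSSARY entries and their definitions are invented, the "all-caps label means case-sensitive acronym" rule is a guess, and the regex-based tag splitting is a simplification that a real implementation might replace with an HTML parser.

```python
from __future__ import annotations

import html
import re

# Hypothetical entries: canonical term -> (definition, aliases).
GLOSSARY = {
    "VIX": ('Expected 30-day volatility of the S&P 500, the "fear gauge".', ()),
    "yield curve": ("Bond yields plotted against maturity.", ()),
    "HY OAS": ("Extra yield of high-yield bonds over Treasuries.", ("credit spread",)),
    "ERP": ("Equity risk premium.", ()),
}

# <code> and <a> elements are protected whole; any other tag is protected alone.
_PROTECTED = re.compile(
    r"(<code\b[^>]*>.*?</code>|<a\b[^>]*>.*?</a>|<[^>]+>)", re.S | re.I
)


def _pattern(label: str) -> re.Pattern[str]:
    # Assumption: all-caps labels are acronyms and match case-sensitively.
    flags = 0 if label.isupper() else re.IGNORECASE
    return re.compile(rf"\b{re.escape(label)}\b", flags)


_COMPILED = {
    term: (definition, [_pattern(label) for label in (term, *aliases)])
    for term, (definition, aliases) in GLOSSARY.items()
}


def _wrap_segment(segment: str, seen: set[str]) -> str:
    """Wrap the first hit of each not-yet-seen term inside one text segment."""
    hits = []
    for term, (definition, patterns) in _COMPILED.items():
        if term in seen:
            continue
        matches = [m for p in patterns if (m := p.search(segment))]
        if matches:
            first = min(matches, key=lambda m: m.start())
            hits.append((first.start(), first.end(), term, definition))
    hits.sort()
    out, cursor = [], 0
    for start, end, term, definition in hits:
        if start < cursor:
            continue  # overlaps a span already emitted
        seen.add(term)
        out.append(segment[cursor:start])
        out.append(
            f'<span class="glossary" data-term="{html.escape(term)}" '
            f'data-def="{html.escape(definition)}">{segment[start:end]}</span>'
        )
        cursor = end
    out.append(segment[cursor:])
    return "".join(out)


def wrap_glossary(text: str | None, tone: str | None = None) -> str:
    if not text:
        return ""
    if tone != "NOVICE":
        return text
    # With one capturing group, re.split alternates text / protected parts.
    parts = _PROTECTED.split(text)
    seen: set[str] = set()
    for i in range(0, len(parts), 2):
        parts[i] = _wrap_segment(parts[i], seen)
    return "".join(parts)
```

Matching is done against the original text segments only, so a term that appears inside an inserted data-def attribute is never wrapped a second time.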