ai: bump reviewer max_tokens 120 → 300

A live sanity check on 50 recent IndicatorSummary rows found that 6 of
10 reviewer rejections were caused by the reviewer hitting its own
max_tokens cap mid-verdict ('{"clean": false, "reason": "Truncated
sent…'). The parser then rejected the half-written JSON as malformed,
so the candidate was dropped: a false-negative verdict that would have
purged legitimately clean rows.

300 tokens is well above the ~30-token verdict the prompt asks for;
the extra headroom removes the artefact at ~$0.00015 per call.
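
For illustration only (not part of this change), a minimal sketch of
how a truncated reply could be told apart from a genuinely malformed
one before it counts as a rejection. ParsedVerdict and parse_verdict
are hypothetical names, and the sketch assumes the caller can read the
finish_reason reported by an OpenAI-compatible API:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedVerdict:
    # status is "ok", "truncated" (cut off by max_tokens) or "malformed".
    # Hypothetical type, not the project's own Verdict class.
    status: str
    clean: Optional[bool] = None
    reason: str = ""


def parse_verdict(raw: str, finish_reason: str) -> ParsedVerdict:
    """Classify a reviewer reply instead of treating every failure as a rejection."""
    # A reply stopped by the token cap is incomplete by definition,
    # so it is reported as truncated without attempting to parse it.
    if finish_reason == "length":
        return ParsedVerdict(status="truncated")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ParsedVerdict(status="malformed")
    # The verdict must be a JSON object with a boolean "clean" field.
    if not isinstance(data, dict) or not isinstance(data.get("clean"), bool):
        return ParsedVerdict(status="malformed")
    return ParsedVerdict(
        status="ok",
        clean=data["clean"],
        reason=str(data.get("reason", "")),
    )
```

With a check of this kind, a "truncated" status could be retried or
logged rather than counted as a verdict against the row.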

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Giorgio Gilestro 2026-05-29 13:15:42 +02:00
parent 45fa31bb2b
commit 0550063316

@@ -84,7 +84,15 @@ async def review_read(client: httpx.AsyncClient, candidate: str) -> Verdict:
     try:
         result = await call_llm(
             client, messages,
-            max_tokens=120,
+            # 300 tokens is comfortably above the 30-token JSON verdict
+            # the prompt asks for. An earlier 120-token cap was producing
+            # frequent finish_reason=length cutoffs that left the JSON
+            # half-written ('{"clean": false, "reason": "Text…'), which
+            # the parser then rejected as malformed — a false-negative
+            # in the verdict. The extra headroom costs ~$0.00015 per
+            # call (DeepSeek output rates) and removes that whole class
+            # of artefact.
+            max_tokens=300,
             response_format={"type": "json_object"},
         )
     except Exception as e: