Add barrier_picker_app — Dockerised web picker for barrier opening

A FastAPI app + plain HTML5 video page that replaces the matplotlib
picker. Browse to http://host:8000/, scrub through each video with
arrow keys (±5 s, ±1 s with Shift, ±0.1 s with Ctrl, ±1 frame with
,/.), and click one of three buttons:
  - All barriers open      — every ROI usable
  - Upper barrier opens    — ROIs 1,3,5 usable; lower row marked bad
  - Lower barrier opens    — ROIs 2,4,6 usable; upper row marked bad

The current playhead time is recorded as opening_s; bad_rois is set
accordingly. Keyboard shortcuts are also available (1/2/3 for the
three modes, s/u for skip/unusable). Refresh-safe: every submission
persists to data/metadata/barrier_opening.csv before advancing.
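
The persist-before-advancing behaviour described above can be sketched as a keyed upsert. This is only an illustration using the standard library, not the code in app.py (which uses pandas); `upsert_row`, `KEY`, `COLS` and the temp-file-plus-rename step are assumptions made for the example.

```python
import csv
import os
import tempfile
from pathlib import Path

# Hypothetical column subset; the real CSV has more columns.
KEY = ("machine_name", "session_date", "session_time")
COLS = [*KEY, "opening_s", "bad_rois"]

def upsert_row(csv_path: Path, row: dict) -> None:
    """Insert `row`, replacing any existing row with the same key."""
    rows = []
    if csv_path.exists():
        with csv_path.open(newline="") as f:
            rows = list(csv.DictReader(f))
    key = tuple(str(row[k]) for k in KEY)
    # Drop the previous annotation for this recording, if any.
    rows = [r for r in rows if tuple(r.get(k, "") for k in KEY) != key]
    rows.append({c: row.get(c, "") for c in COLS})
    # Write to a temp file in the same folder, then rename over the target,
    # so a crash mid-write cannot leave a truncated CSV behind.
    fd, tmp = tempfile.mkstemp(dir=csv_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    os.replace(tmp, csv_path)
```

Submitting the same recording twice leaves a single row holding the latest values, which is what makes a page refresh harmless.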

Server uses byte-range streaming so seeking inside long videos is
fast. Dockerfile + docker-compose.yml mount the data volume RO and
the metadata folder RW.
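
As a rough sketch of the byte-range handling mentioned above: the server reads a `Range: bytes=START-END` header (END optional), clamps it to the file size, and answers with a matching `Content-Range`. The helper name `parse_range` is hypothetical, and suffix ranges such as `bytes=-500` are deliberately not handled here.

```python
import re

RANGE_RE = re.compile(r"^bytes=(\d+)-(\d*)$")

def parse_range(header: str, file_size: int) -> tuple[int, int]:
    """Return the inclusive (start, end) byte offsets for a Range header."""
    m = RANGE_RE.match(header.strip())
    if m is None:
        raise ValueError("unsupported Range header")
    start = int(m.group(1))
    # An empty END means "to the end of the file".
    end = int(m.group(2)) if m.group(2) else file_size - 1
    end = min(end, file_size - 1)
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end

start, end = parse_range("bytes=100-", 1000)
print(f"Content-Range: bytes {start}-{end}/1000")  # Content-Range: bytes 100-999/1000
```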

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Giorgio Gilestro 2026-05-01 12:33:28 +01:00
parent 24403e0474
commit 1a7542def2
7 changed files with 611 additions and 5 deletions

@ -0,0 +1,16 @@
FROM python:3.12-slim
WORKDIR /app
# Reason: keep the image small — we only need pandas + fastapi + uvicorn.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
COPY static/ ./static/
ENV HOST=0.0.0.0 PORT=8000
EXPOSE 8000
CMD ["python", "app.py"]

@ -0,0 +1,70 @@
# Cupido — web-based barrier-opening picker
A small FastAPI + HTML5-video app for annotating the barrier-opening
moment in each tracked recording. Lives in its own Docker container so
it can run on the lab server without polluting any existing
environment.
## What it does
For every video referenced by `all_video_info_merged.tsv` that has a
tracking DB on disk and isn't yet in `barrier_opening.csv`, it serves
a `<video>` element pre-loaded with that mp4. The analyst plays /
scrubs / pauses at the moment the barrier opens, then clicks one of:
| button | meaning | written to `bad_rois` |
|---|---|---|
| **All barriers open** | every ROI (1..6) is usable post-opening | _empty_ |
| **Upper barrier opens** | only the top row opens (ROIs 1, 3, 5) | `2,4,6` |
| **Lower barrier opens** | only the bottom row opens (ROIs 2, 4, 6) | `1,3,5` |
There are also **Skip** (advance without saving) and **Unusable**
(save the row with `opening_s = NaN`) buttons.
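
The table above boils down to a small lookup from button mode to excluded ROIs. The sketch below is illustrative only; `bad_rois_for` is a hypothetical helper name, and the mode strings `all`, `upper` and `lower` are the ones the page sends to the server.

```python
# Row layout taken from the table above: upper row = 1, 3, 5; lower row = 2, 4, 6.
ROIS_UPPER = (1, 3, 5)
ROIS_LOWER = (2, 4, 6)

def bad_rois_for(mode: str) -> str:
    """Return the comma-separated ROIs to exclude for a button mode."""
    excluded = {
        "all": (),            # every ROI stays usable
        "upper": ROIS_LOWER,  # only the upper row opened, so the lower row is bad
        "lower": ROIS_UPPER,  # only the lower row opened, so the upper row is bad
    }
    if mode not in excluded:
        raise ValueError(f"unknown mode: {mode!r}")
    return ",".join(str(r) for r in excluded[mode])

print(bad_rois_for("upper"))  # 2,4,6
```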
## Keyboard shortcuts
- <kbd>Space</kbd> — play / pause
- <kbd>←</kbd> / <kbd>→</kbd> — ±5 s
- <kbd>Shift</kbd>+arrows — ±1 s
- <kbd>Ctrl</kbd>+arrows — ±0.1 s
- <kbd>,</kbd> / <kbd>.</kbd> — ±1 frame
- <kbd>1</kbd> / <kbd>2</kbd> / <kbd>3</kbd> — All / Upper / Lower
- <kbd>s</kbd> — skip ; <kbd>u</kbd> — unusable
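
For reference, the seek shortcuts reduce to a signed offset in seconds. The page implements this in JavaScript; the Python sketch below only mirrors that logic, with `seek_delta` as a hypothetical name. It assumes 25 fps for frame stepping (the page makes the same assumption) and lets Ctrl take priority over Shift.

```python
def seek_delta(key: str, shift: bool = False, ctrl: bool = False,
               fps: float = 25.0) -> float:
    """Return the signed seek offset in seconds for one key press."""
    if key in (",", "."):
        step = 1.0 / fps  # one frame at the assumed frame rate
        return -step if key == "," else step
    if key in ("ArrowLeft", "ArrowRight"):
        step = 0.1 if ctrl else (1.0 if shift else 5.0)
        return -step if key == "ArrowLeft" else step
    return 0.0  # any other key does not seek

print(seek_delta("ArrowRight"))             # 5.0
print(seek_delta("ArrowLeft", shift=True))  # -1.0
print(seek_delta("."))                      # 0.04
```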
## Run locally (docker compose)
```bash
cd scripts/barrier_picker_app
docker compose up --build
```
Then browse to http://localhost:8000/.
The container mounts:
- `/mnt/data/projects/cupido` (data volume, read-only)
- `/mnt/ethoscope_data/videos` (source mp4s, read-only)
- `data/metadata/` from the repo (read-write — for persisting
`barrier_opening.csv`)
Adjust paths in `docker-compose.yml` if your layout differs.
## Run without Docker (development)
```bash
cd scripts/barrier_picker_app
pip install -r requirements.txt
python app.py # serves on http://localhost:8000
```
By default it expects:
- merged TSV at `/mnt/data/projects/cupido/all_video_info_merged.tsv`
- inventory at `/cupido/data/metadata/video_inventory.csv`
- writes to `/cupido/data/metadata/barrier_opening.csv`
Override via environment variables:
```bash
CUPIDO_DATA_VOLUME=/path/to/data \
CUPIDO_INVENTORY_CSV=$(pwd)/../../data/metadata/video_inventory.csv \
CUPIDO_OUTPUT_CSV=$(pwd)/../../data/metadata/barrier_opening.csv \
python app.py
```
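A minimal sketch of how these overrides resolve, assuming the defaults listed above. `load_config` is a hypothetical helper written for this example; the app itself reads the variables at import time.

```python
import os
from pathlib import Path

def load_config(env=None) -> dict:
    """Resolve the three paths from the environment, with the documented defaults."""
    env = os.environ if env is None else env
    data_volume = Path(env.get("CUPIDO_DATA_VOLUME", "/mnt/data/projects/cupido"))
    return {
        "data_volume": data_volume,
        # The merged TSV always lives at the root of the data volume.
        "tsv_path": data_volume / "all_video_info_merged.tsv",
        "inventory_csv": Path(env.get(
            "CUPIDO_INVENTORY_CSV", "/cupido/data/metadata/video_inventory.csv")),
        "output_csv": Path(env.get(
            "CUPIDO_OUTPUT_CSV", "/cupido/data/metadata/barrier_opening.csv")),
    }

cfg = load_config({"CUPIDO_DATA_VOLUME": "/path/to/data"})
print(cfg["tsv_path"])
```

Note that the location of the merged TSV follows `CUPIDO_DATA_VOLUME`; there is no separate variable for it.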

@ -0,0 +1,280 @@
"""FastAPI server for the web-based barrier-opening picker.

Browse to http://<host>:8000/ and you'll see a video player loaded with
the next un-annotated video from the queue. Use the arrow keys to
scrub (←/→ ±5 s, Shift+←/→ ±1 s, Ctrl+←/→ ±0.1 s), space to pause/play,
or click the seekbar. When the barrier opens, click one of:

    [All barriers open]    every ROI is usable post-opening
    [Upper barrier opens]  only ROIs 1,3,5 are usable
    [Lower barrier opens]  only ROIs 2,4,6 are usable

The current playhead time is recorded as the barrier-opening moment;
ROI inclusion is set accordingly. There are also Skip and Unusable
buttons.

The queue is built from the merged TSV plus the inventory: every
unique (machine, date, time) that has both a tracking DB and an mp4
on disk. Entries already in barrier_opening.csv stay in the queue but
are flagged as done. Submissions persist to barrier_opening.csv after
every click, so the app is refresh-safe.

Configuration (environment variables):

    CUPIDO_DATA_VOLUME    /mnt/data/projects/cupido (data volume)
    CUPIDO_INVENTORY_CSV  /cupido/data/metadata/video_inventory.csv
    CUPIDO_OUTPUT_CSV     /cupido/data/metadata/barrier_opening.csv
"""
from __future__ import annotations

import os
import re
from dataclasses import dataclass
from pathlib import Path

import pandas as pd
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import FileResponse, JSONResponse, Response
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

# ─── Config ──────────────────────────────────────────────────────────────
DATA_VOLUME = Path(os.environ.get("CUPIDO_DATA_VOLUME", "/mnt/data/projects/cupido"))
INVENTORY_CSV = Path(os.environ.get(
    "CUPIDO_INVENTORY_CSV", "/cupido/data/metadata/video_inventory.csv"
))
OUTPUT_CSV = Path(os.environ.get(
    "CUPIDO_OUTPUT_CSV", "/cupido/data/metadata/barrier_opening.csv"
))
TSV_PATH = DATA_VOLUME / "all_video_info_merged.tsv"

# Reason: the (date, time, machine_uuid) prefix encoded in every tracking
# DB filename and every inventory mp4 filename.
DB_NAME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})_(\d{2}-\d{2}-\d{2})_([0-9a-f]{32})__"
)

OUT_COLS = ["machine_name", "session_date", "session_time",
            "opening_s", "trim_first_s", "bad_rois", "notes"]

# ROI numbering in the HD mating arena (verified via tracking_geometry):
#   upper row = ROIs 1, 3, 5 (y ≈ 0.125)
#   lower row = ROIs 2, 4, 6 (y ≈ 0.795)
ROIS_UPPER = "1,3,5"
ROIS_LOWER = "2,4,6"


@dataclass(frozen=True)
class QueueItem:
    idx: int
    machine_name: str
    session_date: str
    session_time: str
    mp4_path: str
    duration_s: float | None
    done: bool


# ─── Queue building ─────────────────────────────────────────────────────
def _build_queue() -> list[QueueItem]:
    """Build the ordered queue of pickable videos."""
    if not TSV_PATH.exists():
        raise RuntimeError(f"merged TSV not found at {TSV_PATH}")
    if not INVENTORY_CSV.exists():
        raise RuntimeError(f"inventory not found at {INVENTORY_CSV}")
    tsv = pd.read_csv(TSV_PATH, sep="\t")
    inv = pd.read_csv(INVENTORY_CSV)

    # Index the inventory by (machine, date, time) for O(1) lookups.
    inv_by_key: dict[tuple[str, str, str], dict] = {}
    for r in inv.itertuples(index=False):
        inv_by_key[(r.machine_name, r.session_date, r.session_time)] = {
            "mp4_path": r.mp4_path,
            "duration_s": float(r.duration_s) if pd.notna(r.duration_s) else None,
        }

    # Keys that already have a row in the output CSV.
    if OUTPUT_CSV.exists():
        out = pd.read_csv(OUTPUT_CSV)
        done_keys = set(zip(out["machine_name"],
                            out["session_date"],
                            out["session_time"]))
    else:
        done_keys = set()

    seen: set[tuple[str, str, str]] = set()
    items: list[QueueItem] = []
    for col in ("training_db_path", "testing_db_path"):
        for row in tsv.itertuples(index=False):
            db = getattr(row, col)
            if not isinstance(db, str) or not db:
                continue
            db_path = Path(db)
            if not db_path.exists():
                continue
            m = DB_NAME_RE.match(db_path.name)
            if not m:
                continue
            session_date, session_time = m.group(1), m.group(2)
            key = (row.machine_name, session_date, session_time)
            if key in seen:
                continue
            seen.add(key)
            inv_row = inv_by_key.get(key)
            if inv_row is None or not Path(inv_row["mp4_path"]).exists():
                continue
            items.append(QueueItem(
                idx=len(items),
                machine_name=row.machine_name,
                session_date=session_date,
                session_time=session_time,
                mp4_path=inv_row["mp4_path"],
                duration_s=inv_row["duration_s"],
                done=key in done_keys,
            ))
    return items


# ─── App ───────────────────────────────────────────────────────────────
app = FastAPI(title="Cupido barrier-opening picker")
STATIC_DIR = Path(__file__).parent / "static"
app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")


@app.get("/")
async def index() -> FileResponse:
    return FileResponse(STATIC_DIR / "index.html")


@app.get("/api/queue")
async def get_queue() -> JSONResponse:
    queue = _build_queue()
    return JSONResponse([
        {
            "idx": q.idx,
            "machine_name": q.machine_name,
            "session_date": q.session_date,
            "session_time": q.session_time,
            "duration_s": q.duration_s,
            "done": q.done,
        }
        for q in queue
    ])


def _stream_video(file_path: Path, request: Request) -> Response:
    """HTTP Range-aware video streaming."""
    # Imported here so this function does not depend on the module header.
    from fastapi.responses import StreamingResponse

    file_size = file_path.stat().st_size
    range_header = request.headers.get("range")
    if range_header is None:
        return FileResponse(file_path, media_type="video/mp4",
                            headers={"Accept-Ranges": "bytes"})
    # Parse "bytes=START-END" (END optional)
    m = re.match(r"bytes=(\d+)-(\d*)", range_header)
    if not m:
        raise HTTPException(status_code=416, detail="bad Range header")
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) else file_size - 1
    end = min(end, file_size - 1)
    if start > end:
        raise HTTPException(status_code=416, detail="range not satisfiable")
    chunk_size = end - start + 1

    def iterfile():
        # Yield the requested byte range in 64 KiB pieces.
        with open(file_path, "rb") as f:
            f.seek(start)
            remaining = chunk_size
            while remaining > 0:
                buf = f.read(min(64 * 1024, remaining))
                if not buf:
                    break
                yield buf
                remaining -= len(buf)

    # Stream the pieces instead of joining them into one bytes object, so an
    # open-ended request ("bytes=0-") never loads the whole file into memory.
    return StreamingResponse(
        iterfile(),
        status_code=206,
        media_type="video/mp4",
        headers={
            "Content-Range": f"bytes {start}-{end}/{file_size}",
            "Accept-Ranges": "bytes",
            "Content-Length": str(chunk_size),
        },
    )


@app.get("/api/video/{idx}")
async def get_video(idx: int, request: Request) -> Response:
    queue = _build_queue()
    if not 0 <= idx < len(queue):
        raise HTTPException(status_code=404, detail="idx out of range")
    return _stream_video(Path(queue[idx].mp4_path), request)


class Submission(BaseModel):
    idx: int
    time_s: float | None  # None when marking unusable
    mode: str             # "all" | "upper" | "lower" | "unusable" | "skip"
    notes: str = ""


@app.post("/api/submit")
async def submit(payload: Submission) -> dict:
    queue = _build_queue()
    if not 0 <= payload.idx < len(queue):
        raise HTTPException(status_code=404, detail="idx out of range")
    item = queue[payload.idx]
    if payload.mode == "skip":
        return {"status": "skipped"}
    if payload.mode == "unusable":
        row = {
            "machine_name": item.machine_name,
            "session_date": item.session_date,
            "session_time": item.session_time,
            "opening_s": float("nan"),
            "trim_first_s": 0,
            "bad_rois": "",
            "notes": payload.notes or "unusable",
        }
    else:
        if payload.time_s is None:
            raise HTTPException(status_code=400, detail="time_s required")
        bad_rois = {
            "all": "",
            "upper": ROIS_LOWER,  # upper-only opens → lower row is bad
            "lower": ROIS_UPPER,  # lower-only opens → upper row is bad
        }.get(payload.mode)
        if bad_rois is None:
            raise HTTPException(status_code=400, detail=f"unknown mode: {payload.mode}")
        row = {
            "machine_name": item.machine_name,
            "session_date": item.session_date,
            "session_time": item.session_time,
            "opening_s": round(payload.time_s, 1),
            "trim_first_s": 0,
            "bad_rois": bad_rois,
            "notes": payload.notes,
        }

    OUTPUT_CSV.parent.mkdir(parents=True, exist_ok=True)
    if OUTPUT_CSV.exists():
        out = pd.read_csv(OUTPUT_CSV)
    else:
        out = pd.DataFrame(columns=OUT_COLS)
    for col in OUT_COLS:
        if col not in out.columns:
            out[col] = ""
    # Replace any existing row for this key.
    mask = ~((out["machine_name"] == row["machine_name"])
             & (out["session_date"] == row["session_date"])
             & (out["session_time"] == row["session_time"]))
    out = pd.concat([out[mask], pd.DataFrame([row])], ignore_index=True)
    out[OUT_COLS].to_csv(OUTPUT_CSV, index=False)

    # NaN is not valid JSON, so report an unusable video's opening_s as null
    # in the response (the CSV keeps it as an empty cell).
    response_row = dict(row)
    if payload.mode == "unusable":
        response_row["opening_s"] = None
    return {"status": "saved", "row": response_row}


if __name__ == "__main__":
    import uvicorn

    host = os.environ.get("HOST", "0.0.0.0")
    port = int(os.environ.get("PORT", "8000"))
    uvicorn.run("app:app", host=host, port=port, reload=False)

@ -0,0 +1,21 @@
services:
  picker:
    build: .
    image: cupido-barrier-picker
    container_name: cupido-barrier-picker
    ports:
      - "8000:8000"
    volumes:
      # Project data volume (videos + tracking DBs + merged TSV), read-only.
      - /mnt/data/projects/cupido:/mnt/data/projects/cupido:ro
      # Source video tree: mount at the same path the inventory references
      # (so the mp4_path strings in video_inventory.csv resolve unchanged).
      - /mnt/ethoscope_data/videos:/mnt/ethoscope_data/videos:ro
      # Repo's data/metadata folder: mount read-write so the app can persist
      # barrier_opening.csv. Adjust the host path to your local checkout.
      - ../../data/metadata:/cupido/data/metadata:rw
    environment:
      CUPIDO_DATA_VOLUME: /mnt/data/projects/cupido
      CUPIDO_INVENTORY_CSV: /cupido/data/metadata/video_inventory.csv
      CUPIDO_OUTPUT_CSV: /cupido/data/metadata/barrier_opening.csv
    restart: unless-stopped

@ -0,0 +1,4 @@
fastapi>=0.110
uvicorn[standard]>=0.27
pandas>=2.0
pydantic>=2.0

@ -0,0 +1,214 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>Cupido — barrier-opening picker</title>
<style>
  * { box-sizing: border-box; }
  body { margin: 0; font-family: -apple-system, "Segoe UI", system-ui, sans-serif;
         background: #1a1a1a; color: #e8e8e8; }
  header { padding: 0.6rem 1rem; background: #111; border-bottom: 1px solid #333;
           display: flex; align-items: center; gap: 1.5rem; }
  h1 { margin: 0; font-size: 1rem; font-weight: 500; color: #ccc; }
  #status { font-family: ui-monospace, "SF Mono", monospace; font-size: 0.85rem;
            color: #9aa; }
  #info { font-family: ui-monospace, monospace; font-size: 0.85rem; color: #cce; }
  main { display: flex; flex-direction: column; align-items: center; padding: 1rem; }
  video { width: 100%; max-width: 1400px; height: auto; background: #000;
          border-radius: 4px; }
  #controls { margin-top: 1rem; display: flex; gap: 1rem; flex-wrap: wrap;
              justify-content: center; }
  button { padding: 0.7rem 1.4rem; font-size: 0.95rem; border: 1px solid #444;
           background: #2a2a2a; color: #eee; border-radius: 4px; cursor: pointer;
           transition: all 80ms; }
  button:hover { background: #383838; border-color: #666; }
  button:active { transform: translateY(1px); }
  button.primary { background: #2d5; color: #053; border-color: #1a4;
                   font-weight: 600; }
  button.primary:hover { background: #3e6; }
  button.warn { background: #d84; color: #311; border-color: #b62; }
  button.warn:hover { background: #ea5; }
  button.muted { background: #2a2a2a; color: #888; }
  .mute-divider { width: 1px; background: #333; margin: 0 0.3rem; }
  #help { margin-top: 1rem; font-size: 0.8rem; color: #889; text-align: center;
          font-family: ui-monospace, monospace; }
  kbd { background: #222; border: 1px solid #444; border-radius: 3px;
        padding: 0.1rem 0.4rem; font-size: 0.8rem; color: #ddd;
        font-family: ui-monospace, monospace; }
  #progress { font-size: 0.85rem; color: #889; margin-left: auto; }
  #flash { position: fixed; top: 0.5rem; right: 0.5rem; padding: 0.5rem 1rem;
           border-radius: 4px; opacity: 0; transition: opacity 200ms;
           font-family: ui-monospace, monospace; font-size: 0.85rem; }
  #flash.show { opacity: 1; }
  #flash.ok { background: #2d5; color: #042; }
  #flash.err { background: #d44; color: white; }
</style>
</head>
<body>
<header>
<h1>Cupido — barrier picker</h1>
<span id="info">loading…</span>
<span id="progress"></span>
</header>
<main>
  <video id="player" controls preload="auto"></video>
  <div id="controls">
    <button class="primary" data-mode="all">All barriers open <kbd>1</kbd></button>
    <button class="primary" data-mode="upper">Upper barrier opens <kbd>2</kbd></button>
    <button class="primary" data-mode="lower">Lower barrier opens <kbd>3</kbd></button>
    <span class="mute-divider"></span>
    <button class="muted" data-mode="skip">Skip <kbd>s</kbd></button>
    <button class="warn" data-mode="unusable">Unusable <kbd>u</kbd></button>
    <span class="mute-divider"></span>
    <button class="muted" id="prev">◀ Previous</button>
    <button class="muted" id="next">Next ▶</button>
  </div>
  <div id="help">
    <kbd>Space</kbd> play/pause &nbsp;·&nbsp;
    <kbd>←</kbd> / <kbd>→</kbd> ±5 s &nbsp;·&nbsp;
    <kbd>Shift</kbd>+arrows ±1 s &nbsp;·&nbsp;
    <kbd>Ctrl</kbd>+arrows ±0.1 s &nbsp;·&nbsp;
    <kbd>,</kbd> / <kbd>.</kbd> ±1 frame
  </div>
  <div id="flash"></div>
</main>
<script>
  const player = document.getElementById('player');
  const info = document.getElementById('info');
  const progress = document.getElementById('progress');
  const flash = document.getElementById('flash');
  let queue = [];
  let cursor = 0;

  function showFlash(msg, kind = 'ok') {
    flash.textContent = msg;
    flash.className = 'show ' + kind;
    setTimeout(() => { flash.className = ''; }, 1800);
  }

  function updateProgress() {
    const done = queue.filter(q => q.done).length;
    progress.textContent = `${done}/${queue.length} done`;
  }

  function loadCursor() {
    if (queue.length === 0) {
      info.textContent = 'queue empty';
      return;
    }
    cursor = ((cursor % queue.length) + queue.length) % queue.length;
    const item = queue[cursor];
    info.textContent =
      `[${cursor + 1}/${queue.length}] ${item.machine_name} ${item.session_date} ${item.session_time} ` +
      (item.duration_s ? `(${(item.duration_s/60).toFixed(1)} min)` : '') +
      (item.done ? ' — already done' : '');
    player.src = `/api/video/${item.idx}`;
    player.load();
  }

  async function fetchQueue() {
    const resp = await fetch('/api/queue');
    queue = await resp.json();
    // Jump to first not-yet-done
    const firstUndone = queue.findIndex(q => !q.done);
    cursor = firstUndone === -1 ? 0 : firstUndone;
    updateProgress();
    loadCursor();
  }

  async function submit(mode) {
    if (queue.length === 0) return;
    const item = queue[cursor];
    const payload = {
      idx: item.idx,
      time_s: (mode === 'skip' || mode === 'unusable') ? null : player.currentTime,
      mode: mode,
      notes: '',
    };
    try {
      const resp = await fetch('/api/submit', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload),
      });
      if (!resp.ok) {
        const err = await resp.json();
        showFlash('error: ' + (err.detail || resp.status), 'err');
        return;
      }
      const result = await resp.json();
      if (result.status === 'saved') {
        item.done = true;
        updateProgress();
        const t = result.row.opening_s;
        const bad = result.row.bad_rois;
        showFlash(
          `saved ${item.machine_name} ${item.session_date} ${item.session_time}: ` +
          (Number.isNaN(t) || t === null ? 'unusable' :
            `${t.toFixed(1)}s${bad ? ' (bad ROIs: ' + bad + ')' : ''}`)
        );
      } else if (result.status === 'skipped') {
        showFlash(`skipped ${item.machine_name} ${item.session_date} ${item.session_time}`);
      }
      cursor = (cursor + 1) % queue.length;
      loadCursor();
    } catch (e) {
      showFlash('network error: ' + e.message, 'err');
    }
  }

  // Button handlers
  document.querySelectorAll('button[data-mode]').forEach(btn => {
    btn.addEventListener('click', () => submit(btn.dataset.mode));
  });
  document.getElementById('prev').addEventListener('click', () => {
    cursor = (cursor - 1 + queue.length) % queue.length;
    loadCursor();
  });
  document.getElementById('next').addEventListener('click', () => {
    cursor = (cursor + 1) % queue.length;
    loadCursor();
  });

  // Keyboard shortcuts
  document.addEventListener('keydown', (e) => {
    if (e.target.tagName === 'INPUT' || e.target.tagName === 'TEXTAREA') return;
    // Prevent the browser default (e.g. video focus side effects on space).
    const stop = () => { e.preventDefault(); e.stopPropagation(); };
    switch (e.key) {
      case ' ':
        stop();
        if (player.paused) player.play(); else player.pause();
        break;
      case 'ArrowLeft':
        stop();
        player.currentTime -= e.ctrlKey ? 0.1 : (e.shiftKey ? 1 : 5);
        break;
      case 'ArrowRight':
        stop();
        player.currentTime += e.ctrlKey ? 0.1 : (e.shiftKey ? 1 : 5);
        break;
      case ',':
        stop();
        // Step back one frame (assume 25 fps if unknown)
        player.currentTime -= 1 / 25;
        break;
      case '.':
        stop();
        // Step forward one frame (same 25 fps assumption)
        player.currentTime += 1 / 25;
        break;
      case '1': stop(); submit('all'); break;
      case '2': stop(); submit('upper'); break;
      case '3': stop(); submit('lower'); break;
      case 's': stop(); submit('skip'); break;
      case 'u': stop(); submit('unusable'); break;
    }
  });

  fetchQueue();
</script>
</body>
</html>