Remove data/raw/ entirely — all bulky data now under /mnt/data/projects/cupido/

Deleted the 5 stale pre-pipeline tracking DBs and the data/raw/ directory.
Dropped DATA_RAW from config.py; build_video_inventory now scans
TRACKING_OUTPUT_DIR for already-tracked sessions. Notebooks no longer
import DATA_RAW. README, PLANNING and todo updated to reflect that the
repo holds only code + small curated metadata, never bulky DBs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
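
The change to build_video_inventory described above could look roughly like the following. This is a minimal sketch, not the code from this commit: the helper names `find_tracked_sessions` and `mark_tracked`, the one-`.db`-file-per-session layout, and the value given to `TRACKING_OUTPUT_DIR` are all assumptions made for illustration. The real value is defined in config.py.

```python
from __future__ import annotations

from pathlib import Path

# Assumption: placeholder value only; the real constant is defined in config.py.
TRACKING_OUTPUT_DIR = Path("/mnt/data/projects/cupido")


def find_tracked_sessions(tracking_dir: Path, suffix: str = ".db") -> set[str]:
    """Return the file stem of every tracking DB found under tracking_dir.

    Hypothetical helper: assumes one DB file per tracked session.
    """
    if not tracking_dir.is_dir():
        # A missing or unmounted data directory means nothing is tracked yet.
        return set()
    return {p.stem for p in tracking_dir.rglob(f"*{suffix}") if p.is_file()}


def mark_tracked(video_names: list[str], tracked: set[str]) -> list[dict]:
    """Flag each video whose stem matches an already-tracked session."""
    return [
        {"video": name, "tracked": Path(name).stem in tracked}
        for name in video_names
    ]
```

Returning an empty set when the directory is missing keeps the inventory step usable on machines where /mnt/data is not mounted. Whether the real function behaves this way is not stated in the commit message.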
Giorgio Gilestro 2026-05-01 09:20:25 +01:00
parent 9f3ee24a23
commit 23050360ea
9 changed files with 37 additions and 70 deletions

.gitignore (6 lines changed)

@@ -1,9 +1,7 @@
# Large data files (reproducible from raw DBs)
data/raw/*.db
# Generated CSVs (regenerable from the tracking DBs + the merged TSV)
data/processed/*.csv
# Offline-tracking outputs (regenerable from videos + target JSONs)
# DBs and target JSONs live outside the repo at /mnt/data/projects/cupido/
# Tracking DBs and target JSONs live outside the repo at /mnt/data/projects/cupido/
data/metadata/video_inventory.csv
data/logs/*.log