Remove data/raw/ entirely — all bulky data now under /mnt/data/projects/cupido/

Deleted the 5 stale pre-pipeline tracking DBs and the data/raw/ directory.
Dropped DATA_RAW from config.py; build_video_inventory now scans
TRACKING_OUTPUT_DIR for already-tracked sessions. Notebooks no longer
import DATA_RAW. README, PLANNING and todo updated to reflect that the
repo holds only code + small curated metadata, never bulky DBs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Giorgio Gilestro 2026-05-01 09:20:25 +01:00
parent 9f3ee24a23
commit 23050360ea
9 changed files with 37 additions and 70 deletions

@@ -55,14 +55,14 @@ See `docs/bimodal_hypothesis.md` for detailed methodology.
 ### Recap
-Tracked so far (5 sessions, all from 2025-07-15, machines 076/145/268). The DBs in
-`data/raw/` use tracker `ConstrainedMultiFlyTracker` and template
-`HD_Mating_Arena_6_ROIS.json` (2 flies × 6 ROIs per video).
+Tracked so far (5 sessions, all from 2025-07-15, machines 076/145/268). Those
+were re-tracked through the unified pipeline and now live at
+`/mnt/data/projects/cupido/tracked/` (no separate `data/raw/` anymore — the
+old pre-pipeline copies were deleted on 2026-05-01).
-The metadata file `../all_video_info_merged.xlsx` indexes a different set of
-experiments: 7 dates from 2024-09-17 → 2024-10-21, 16 ethoscope machines,
-63 unique (date, machine) sessions = 484 ROI-rows. **None of the already-tracked
-sessions are in this xlsx — these are fresh recordings to track.**
+The metadata file `/mnt/data/projects/cupido/all_video_info_merged.xlsx`
+indexes a different set of experiments: 7 dates from 2024-09-17 → 2024-10-21,
+16 ethoscope machines, 63 unique (date, machine) sessions = 484 ROI-rows.
 Inventory: see `data/metadata/video_inventory.csv` (built by
 `scripts/build_video_inventory.py`).
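
The commit message says `build_video_inventory` now scans `TRACKING_OUTPUT_DIR` for already-tracked sessions, but the script itself is not shown here. The following is only a minimal sketch of what such a scan might look like, not the repository's actual code. It assumes that `TRACKING_OUTPUT_DIR` resolves to the tracked-output path named in the README, that tracking outputs are files with a `.db` suffix, and that the file stem identifies a session; `find_tracked_sessions` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: in the real repo this constant comes from config.py; the value
# here is the tracked-output location mentioned in the README.
TRACKING_OUTPUT_DIR = Path("/mnt/data/projects/cupido/tracked")


def find_tracked_sessions(root: Path = TRACKING_OUTPUT_DIR,
                          suffix: str = ".db") -> set[str]:
    """Return the stems of all files ending in `suffix` found under `root`.

    Hypothetical helper: the '.db' suffix and the use of the file stem as a
    session identifier are assumptions made for illustration.
    """
    if not root.is_dir():
        # Storage not mounted or nothing tracked yet: report no sessions.
        return set()
    # Recurse so that per-machine or per-date subfolders are also covered.
    return {p.stem for p in root.rglob(f"*{suffix}") if p.is_file()}
```

An inventory builder could then flag a video as already tracked when its session identifier appears in the returned set. Returning an empty set for a missing directory keeps the script usable on machines where `/mnt/data` is not mounted.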