Remove data/raw/ entirely — all bulky data now under /mnt/data/projects/cupido/
Deleted the 5 stale pre-pipeline tracking DBs and the data/raw/ directory. Dropped DATA_RAW from config.py; build_video_inventory now scans TRACKING_OUTPUT_DIR for already-tracked sessions. Notebooks no longer import DATA_RAW. README, PLANNING and todo updated to reflect that the repo holds only code + small curated metadata, never bulky DBs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 9f3ee24a23
commit 23050360ea
9 changed files with 37 additions and 70 deletions
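The commit message says build_video_inventory now scans TRACKING_OUTPUT_DIR for already-tracked sessions. A minimal sketch of what such a scan could look like follows. The helper name `tracked_sessions` and the session-name convention (file name minus `_tracking.db`) are assumptions for illustration, not the actual code in config.py or build_video_inventory; the directory and the `*_tracking.db` pattern come from the README changes in this commit.

```python
from pathlib import Path

# Location taken from the README text in this commit; the real config.py
# may define TRACKING_OUTPUT_DIR differently.
TRACKING_OUTPUT_DIR = Path("/mnt/data/projects/cupido/tracked")
DB_SUFFIX = "_tracking.db"


def tracked_sessions(tracking_dir: Path = TRACKING_OUTPUT_DIR) -> set[str]:
    """Return names of sessions that already have a tracking DB.

    Hypothetical helper: a session counts as tracked when a file named
    <session>_tracking.db exists directly inside tracking_dir.
    """
    if not tracking_dir.is_dir():
        # Data volume not mounted or nothing tracked yet.
        return set()
    return {
        db_path.name[: -len(DB_SUFFIX)]
        for db_path in tracking_dir.glob(f"*{DB_SUFFIX}")
    }
```

An inventory builder could then mark a video as done when its session name appears in the returned set, without needing any path under data/raw/.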
26
README.md
@@ -14,9 +14,11 @@ python -m venv .venv
 source .venv/bin/activate
 pip install -r requirements.txt
 
-# Get the data files (not in git - ask lab for copies)
-# Place .db files in data/raw/
-# Place large .csv files in data/processed/
+# Project data lives outside the repo at /mnt/data/projects/cupido/:
+# tracked/ → SQLite tracking DBs
+# targets/ → target-point JSONs
+# all_video_info_merged.{xlsx,tsv} → metadata spreadsheet
+# Generated CSVs land in data/processed/ (gitignored).
 
 # Run the main analysis notebook
 jupyter notebook notebooks/flies_analysis_simple.ipynb
@@ -66,7 +68,7 @@ python scripts/pick_targets.py --redo # re-pick already-picked videos
 
 # 3) batch tracking (idempotent, can run in background)
 python scripts/track_videos.py --jobs 4 # parallel
-# output → /mnt/data/projects/cupido/tracked/*_tracking.db (SQLite, same schema as data/raw/)
+# output → /mnt/data/projects/cupido/tracked/*_tracking.db (SQLite)
 ```
 
 See `tasks/todo.md` "Offline Tracking" section for the full plan, and
@@ -80,9 +82,9 @@ tracking/
 ├── PLANNING.md # Architecture & conventions
 ├── requirements.txt # Python dependencies
 ├── data/
-│ ├── raw/ # SQLite tracking databases (gitignored)
-│ ├── metadata/ # Experiment metadata CSVs
-│ └── processed/ # Generated analysis CSVs (gitignored)
+│ ├── metadata/ # Experiment metadata CSVs (small, hand-curated)
+│ ├── processed/ # Generated analysis CSVs (gitignored)
+│ └── logs/ # Tracker logs (gitignored)
 ├── scripts/ # Python analysis scripts
 │ ├── config.py # Shared path constants
 │ ├── load_roi_data.py # Extract data from DBs
@@ -107,13 +109,13 @@ tracking/
 ## Data Pipeline
 
 ```
-SQLite DBs (data/raw/)
+SQLite DBs (/mnt/data/projects/cupido/tracked/) + merged TSV
 │
-▼ load_roi_data.py / notebook step 1
-ROI CSVs (data/processed/*_roi_data.csv)
+▼ scripts/load_roi_data.py
+single DataFrame stamped with experimental metadata
 │
-▼ notebook steps 2-4
-Aligned Distance CSVs (data/processed/*_distances_aligned.csv)
+▼ notebooks/flies_analysis_simple.ipynb (steps 2–4)
+Aligned distance CSVs (data/processed/*_distances_aligned.csv)
 │
 ├──▶ Plots (figures/)
 ├──▶ Statistical tests