Initial commit: organized project structure for student handoff

Reorganized flat 41-file directory into structured layout with:
- scripts/ for Python analysis code with shared config.py
- notebooks/ for Jupyter analysis notebooks
- data/ split into raw/, metadata/, processed/
- docs/ with analysis summary, experimental design, and bimodal hypothesis tutorial
- tasks/ with todo checklist and lessons learned
- Comprehensive README, PLANNING.md, and .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Giorgio Gilestro 2026-03-05 16:08:36 +00:00
commit e7e4db264d
27 changed files with 3105 additions and 0 deletions

data/processed/README.md

@@ -0,0 +1,39 @@
# Processed Data
Large CSV files generated by the analysis pipeline. All of them are gitignored (~370 MB in total) and can be regenerated.
## Files and Regeneration
| File | Description | Generated By |
|------|-------------|--------------|
| `trained_roi_data.csv` | Raw tracking data for trained ROIs | `scripts/load_roi_data.py` or notebook step 1 |
| `untrained_roi_data.csv` | Raw tracking data for untrained ROIs | `scripts/load_roi_data.py` or notebook step 1 |
| `trained_distances.csv` | Pairwise distances (unaligned) | `scripts/calculate_distances.py` |
| `untrained_distances.csv` | Pairwise distances (unaligned) | `scripts/calculate_distances.py` |
| `trained_distances_aligned.csv` | Distances aligned to barrier opening | Notebook step 4 |
| `untrained_distances_aligned.csv` | Distances aligned to barrier opening | Notebook step 4 |
| `trained_tracked.csv` | Identity-tracked fly positions | Notebook step 7 |
| `untrained_tracked.csv` | Identity-tracked fly positions | Notebook step 7 |
| `trained_max_velocity.csv` | Max velocity over 10s windows | Notebook step 7 |
| `untrained_max_velocity.csv` | Max velocity over 10s windows | Notebook step 7 |
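
Because these files are gitignored, a fresh clone will not contain them. The following is a minimal sketch, not part of the project's scripts, of how one might check which of the ten files listed above are still missing. The function names and the `data/processed` path passed to them are assumptions made for illustration.

```python
from pathlib import Path

# File names are built from the table above; the helper names are hypothetical.
GROUPS = ("trained", "untrained")
SUFFIXES = (
    "roi_data",
    "distances",
    "distances_aligned",
    "tracked",
    "max_velocity",
)


def expected_files():
    """Return the ten expected CSV names, e.g. 'trained_distances.csv'."""
    return [f"{group}_{suffix}.csv" for suffix in SUFFIXES for group in GROUPS]


def missing_files(processed_dir):
    """Return the expected CSV names that are absent from processed_dir."""
    processed_dir = Path(processed_dir)
    return [
        name for name in expected_files()
        if not (processed_dir / name).is_file()
    ]
```

Calling `missing_files("data/processed")` from the repository root would return the names that still need to be regenerated, or an empty list if everything is present.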
## To Regenerate All Data
Run the full notebook `notebooks/flies_analysis_simple.ipynb` with:
```python
recalculate_distances = True
recalculate_tracking = True
```
**Warning**: Identity tracking and velocity calculations are slow (30 minutes or more).
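
The notebook's own logic is not reproduced here, but flags like these usually gate a "recompute or load from cache" step. A rough sketch of that pattern, assuming a hypothetical `compute_rows` callable that returns a non-empty list of dictionaries with identical keys:

```python
import csv
from pathlib import Path


def load_or_recalculate(csv_path, compute_rows, recalculate=False):
    """Return rows from csv_path, recomputing first if asked or if the file is missing.

    compute_rows is a hypothetical stand-in for an expensive pipeline step.
    Values come back as strings because they are always re-read from the CSV.
    """
    csv_path = Path(csv_path)
    if recalculate or not csv_path.is_file():
        rows = compute_rows()
        if not rows:
            raise ValueError("compute_rows returned no rows")
        csv_path.parent.mkdir(parents=True, exist_ok=True)
        with csv_path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    # Read back from disk so cached and fresh results have the same shape.
    with csv_path.open(newline="") as handle:
        return list(csv.DictReader(handle))
```

With `recalculate=False`, the expensive step runs only when the CSV does not exist yet, which is why leaving the flags off avoids the long run once the files have been generated.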
## Column Reference
### Distance CSVs (`*_distances_aligned.csv`)
- `machine_name`: Ethoscope machine ID (string)
- `ROI`: ROI number (1-6)
- `aligned_time`: Time in ms relative to barrier opening (0 = opening)
- `distance`: Euclidean distance between flies in pixels
- `n_flies`: Number of flies detected at this time point
- `area_fly1`, `area_fly2`: Bounding box areas (w*h) in pixels^2
- `group`: "trained" or "untrained"
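
As an illustration of how these columns fit together, the sketch below computes the mean inter-fly distance per group for time points at or after barrier opening. It uses only the standard library and a few made-up rows; the machine names and all numeric values are placeholders, and the handling of blank `distance` cells is an assumption about how missing detections are stored.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; real files live in data/processed/.
SAMPLE = """machine_name,ROI,aligned_time,distance,n_flies,area_fly1,area_fly2,group
ETHOSCOPE_X,1,-1000,120.5,2,300,280,trained
ETHOSCOPE_X,1,0,100.0,2,310,290,trained
ETHOSCOPE_X,1,1000,50.0,2,305,285,trained
ETHOSCOPE_Y,2,1000,80.0,2,295,300,untrained
ETHOSCOPE_Y,2,2000,,1,295,,untrained
"""


def mean_distance_after_opening(handle):
    """Mean distance per group for rows with aligned_time >= 0."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(handle):
        # Assumption: distance is blank when fewer than two flies are detected.
        if row["distance"] == "" or float(row["n_flies"]) < 2:
            continue
        # Negative aligned_time means the barrier has not opened yet.
        if float(row["aligned_time"]) < 0:
            continue
        totals[row["group"]] += float(row["distance"])
        counts[row["group"]] += 1
    return {group: totals[group] / counts[group] for group in totals}


result = mean_distance_after_opening(io.StringIO(SAMPLE))
print(result)  # {'trained': 75.0, 'untrained': 80.0}
```

To run it on real data, replace the `io.StringIO(SAMPLE)` argument with an open handle to one of the `*_distances_aligned.csv` files.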