cupido/docs/experimental_design.md
Giorgio e7e4db264d Initial commit: organized project structure for student handoff
2026-03-05 16:08:36 +00:00

Experimental Design: Barrier-Opening Social Interaction Assay

Overview

This document describes the experimental design of a Drosophila behavioral tracking experiment conducted as part of the Cupido project. The experiment uses a barrier-opening assay to measure social interaction patterns between trained and untrained flies.

Date: July 15, 2025

Species: Drosophila melanogaster, Canton-S (CS) wild-type strain

Assay Description

The barrier-opening assay places two flies in each Region of Interest (ROI), separated by a physical barrier. The barrier is opened manually after a delay from the start of recording that differs between sessions (20 to 75 s in this dataset), allowing the flies to interact socially. The primary behavioral metric is the distance between the two flies over time following barrier opening, which serves as a proxy for social engagement.
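The distance metric can be illustrated with a minimal sketch. It assumes the two flies' positions are already available as per-timestamp (x, y) pairs in pixels; how the tracking tables distinguish the two individuals is not specified in this document, so the `track_a`/`track_b` layout and both function names are hypothetical.

```python
import math


def inter_fly_distance(pos_a, pos_b):
    """Euclidean distance in pixels between two (x, y) positions."""
    return math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])


def distance_trace(track_a, track_b):
    """Distance at every timestamp present in both tracks.

    track_a, track_b: dicts mapping t (ms) -> (x, y).
    This layout is an assumption made for illustration only.
    """
    shared = sorted(track_a.keys() & track_b.keys())
    return [(t, inter_fly_distance(track_a[t], track_b[t])) for t in shared]
```

Restricting the trace to shared timestamps avoids comparing positions sampled at different moments; interpolation would be an alternative if the two tracks are not sampled together.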

Experimental Groups

  • Trained: Flies that received prior social experience before the assay.
  • Untrained: Socially naive flies with no prior social experience.

Note: The exact training protocol (duration, conditions, group sizes) should be documented separately. "Trained" refers to flies with prior social experience; "untrained" refers to socially naive individuals.

Equipment and Recording Parameters

| Parameter | Value |
|---|---|
| Resolution | 1920 x 1088 pixels |
| Frame rate | 25 fps |
| Video codec | H.264 |
| Quality | 28q |
| ROIs per session | 6 (each containing a pair of flies) |
| Tracking output | SQLite databases (one per session) |

Machines and Recording Sessions

Three ethoscope machines (076, 145, and 268) produced tracking data. A fourth (Machine 139) has metadata entries but no tracking data.

| Machine | Session Start Time | Barrier Opening (s) | Status |
|---|---|---|---|
| ETHOSCOPE_076 | 16:03:10 | 52 | OK |
| ETHOSCOPE_076 | 16:31:34 | 25 | OK |
| ETHOSCOPE_145 | 16:03:27 | 42 | OK |
| ETHOSCOPE_145 | 16:31:41 | 20 | OK |
| ETHOSCOPE_268 | 16:32:05 | 75 | OK |
| ETHOSCOPE_139 | 16:31:52 | Not recorded | DATA MISSING |

Total sessions: 6 (5 with tracking data, 1 missing)

ROI-to-Group Mapping

Each session contains 6 ROIs. The assignment of trained/untrained groups to ROIs varies across sessions.

| Machine | Session | ROI 1 | ROI 2 | ROI 3 | ROI 4 | ROI 5 | ROI 6 |
|---|---|---|---|---|---|---|---|
| 076 | 16:03:10 | Trained | Untrained | Trained | Untrained | Trained | Untrained |
| 076 | 16:31:34 | Trained | Trained | Trained | Untrained | Untrained | Untrained |
| 145 | 16:03:27 | Trained | Trained | Trained | Untrained | Untrained | Untrained |
| 145 | 16:31:41 | Trained | Trained | Trained | Untrained | Untrained | Untrained |
| 268 | 16:32:05 | Untrained | Untrained | Untrained | Trained | Trained | Trained |
| 139 | 16:31:52 | Trained | Trained | Trained | Untrained | Untrained | Untrained |

Sample Sizes

| Group | ROIs (total) | ROIs (with data) | ROIs (missing) |
|---|---|---|---|
| Trained | 18 | 15 | 3 (Machine 139) |
| Untrained | 18 | 15 | 3 (Machine 139) |
| Total | 36 | 30 | 6 |
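These counts follow directly from the ROI-to-group mapping. The sketch below re-derives them from the mapping as transcribed in this document; the dictionary layout and the `tally` helper are illustrative and are not part of the project's scripts.

```python
from collections import Counter

T, U = "Trained", "Untrained"

# ROI 1..6 group assignment per (machine, session), copied from the mapping table.
ROI_GROUPS = {
    ("076", "16:03:10"): [T, U, T, U, T, U],
    ("076", "16:31:34"): [T, T, T, U, U, U],
    ("145", "16:03:27"): [T, T, T, U, U, U],
    ("145", "16:31:41"): [T, T, T, U, U, U],
    ("268", "16:32:05"): [U, U, U, T, T, T],
    ("139", "16:31:52"): [T, T, T, U, U, U],
}

# Sessions with metadata but no tracking database.
MISSING = {("139", "16:31:52")}


def tally(roi_groups, missing):
    """Return (total, with_data) ROI counts per group."""
    total, with_data = Counter(), Counter()
    for session, groups in roi_groups.items():
        total.update(groups)
        if session not in missing:
            with_data.update(groups)
    return total, with_data


total, with_data = tally(ROI_GROUPS, MISSING)
for group in (T, U):
    # Columns: group, total ROIs, ROIs with data, ROIs missing
    print(group, total[group], with_data[group], total[group] - with_data[group])
```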

Tracking Database Schema

Each recording session produces a SQLite database file containing tables ROI_1 through ROI_6. Each table has the following columns:

| Column | Type | Description |
|---|---|---|
| id | INTEGER | Row identifier |
| t | INTEGER | Timestamp in milliseconds from start of recording |
| x | REAL | Horizontal position in pixels |
| y | REAL | Vertical position in pixels |
| w | REAL | Width of detected object (pixels) |
| h | REAL | Height of detected object (pixels) |
| phi | REAL | Angle/orientation of detected object |
| is_inferred | INTEGER | Whether the position was inferred (not directly detected) |
| has_interacted | INTEGER | Whether an interaction was detected |
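As a minimal sketch of reading one ROI table with this schema, the function below uses Python's standard `sqlite3` module. The table and column names come from the schema above; the function name `load_roi` and the choice to return a list of dictionaries are assumptions made for illustration.

```python
import sqlite3

COLUMNS = ("id", "t", "x", "y", "w", "h", "phi", "is_inferred", "has_interacted")


def load_roi(conn, roi):
    """Return all rows of table ROI_<roi> as dicts, ordered by timestamp."""
    if not isinstance(roi, int) or not 1 <= roi <= 6:
        raise ValueError(f"roi must be an integer from 1 to 6, got {roi!r}")
    # Table names cannot be bound as SQL parameters, so the name is built
    # from the validated integer rather than from arbitrary input.
    query = f"SELECT {', '.join(COLUMNS)} FROM ROI_{roi} ORDER BY t"
    return [dict(zip(COLUMNS, row)) for row in conn.execute(query)]
```

A typical call would open a session database with `sqlite3.connect(db_path)`, where `db_path` points at one of the `*_tracking.db` files, and then call `load_roi(conn, 1)` through `load_roi(conn, 6)`.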

Known Issues and Data Caveats

  1. Machine 139 missing data: Metadata entries exist for ETHOSCOPE_139 (session 16:31:52) in the metadata CSV, but no corresponding tracking database file is present and no barrier opening time was recorded. This accounts for 6 missing ROIs (3 trained, 3 untrained). The cause needs investigation.

  2. Time unit mismatch between files: The tracking databases store time (t) in milliseconds, while 2025_07_15_barrier_opening.csv stores barrier opening times in seconds. The analysis pipeline converts barrier opening times to milliseconds for alignment.

  3. Machine name type inconsistency: Both the metadata CSV and 2025_07_15_barrier_opening.csv are described as storing machine identifiers as integers, so leading zeros are not preserved when the values are read (076 becomes 76), whereas tracking database filenames use zero-padded names (e.g., ETHOSCOPE_076). Converting identifiers to zero-padded three-digit strings is required when constructing tracking database filenames, and normalizing both CSV files to the same form before matching avoids mismatches.
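Caveats 2 and 3 both come down to small normalization steps. The helpers below are a sketch of one way to handle them, not the project's actual pipeline code; all three function names are hypothetical.

```python
def machine_name(machine_id):
    """Normalize 76, "76", or "076" to the zero-padded form "ETHOSCOPE_076"."""
    return f"ETHOSCOPE_{int(machine_id):03d}"


def barrier_opening_ms(seconds):
    """Convert a barrier opening time from seconds to milliseconds."""
    return int(round(float(seconds) * 1000))


def align_to_barrier(t_ms, opening_s):
    """Time relative to barrier opening, in ms (negative before opening)."""
    return t_ms - barrier_opening_ms(opening_s)
```

For example, `machine_name(76)` and `machine_name("076")` both give `"ETHOSCOPE_076"`, and a tracking timestamp of 60000 ms in a session whose barrier opened at 52 s maps to 8000 ms after opening.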

Source Files

| File | Description |
|---|---|
| 2025_07_15_metadata_fixed.csv | ROI-to-group mapping (trained/untrained) |
| 2025_07_15_barrier_opening.csv | Barrier opening times per machine/session |
| `*_tracking.db` | SQLite tracking databases (one per session) |