Move metadata xlsx/TSV to /mnt/data/projects/cupido/
Consolidates everything bulky (tracking DBs, targets, metadata spreadsheet) under a single DATA_VOLUME root outside the ownCloud-synced repo.

Notebooks now use a visible DATA_DIR = Path(...) idiom rather than walking up the filesystem with PROJECT_ROOT.parent, which is easier for students with no Python background to follow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
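The idiom change described in the commit message can be sketched as follows. This is a minimal illustration, not the exact code from the old notebooks: how `PROJECT_ROOT` was derived is not shown in this diff, so the `Path.cwd()` expression below is an assumption. The `DATA_DIR` paths are the ones that appear in the updated notebook cell.

```python
from pathlib import Path

# Old style (illustrative only): locate files by walking up from wherever
# the notebook happens to run. How PROJECT_ROOT was really derived is an
# assumption here; the diff only mentions PROJECT_ROOT.parent.
PROJECT_ROOT = Path.cwd().parent.parent
old_metadata_tsv = PROJECT_ROOT.parent / "all_video_info_merged.tsv"

# New style: one visible, absolute root for all bulky / regenerable data.
DATA_DIR = Path("/mnt/data/projects/cupido")

tracked_dir = DATA_DIR / "tracked"
targets_dir = DATA_DIR / "targets"
metadata_tsv = DATA_DIR / "all_video_info_merged.tsv"

# The new paths do not depend on the current working directory.
print(metadata_tsv)
```

With the new style, a reader can see where the data lives by reading one line, and moving a notebook to another folder does not change what the paths point to.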
parent ec56e51bf9
commit f176224150
8 changed files with 102 additions and 160 deletions
|
|
@@ -16,11 +16,11 @@
|
|||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 00 \u00b7 Welcome to the Cupido fly-tracking project\n",
|
||||
"# 00 · Welcome to the Cupido fly-tracking project\n",
|
||||
"\n",
|
||||
"Hi! You're about to start working on a project that studies how *Drosophila*\n",
|
||||
"(fruit flies) form **memories of mating experiences** \u2014 and whether trained\n",
|
||||
"flies behave differently from na\u00efve ones in their later courtship.\n",
|
||||
"(fruit flies) form **memories of mating experiences** — and whether trained\n",
|
||||
"flies behave differently from naïve ones in their later courtship.\n",
|
||||
"\n",
|
||||
"**You don't need any prior experience with Python or data science to follow\n",
|
||||
"along.** This series of notebooks will walk you through everything, one\n",
|
||||
|
|
@@ -48,7 +48,7 @@
|
|||
"metadata": {},
|
||||
"source": [
|
||||
"You should have seen `Hello, fly world!` printed and the number `2`\n",
|
||||
"appear underneath. If something else happened, ask Giorgio \u2014 that's a\n",
|
||||
"appear underneath. If something else happened, ask Giorgio — that's a\n",
|
||||
"sign the environment isn't set up right.\n",
|
||||
"\n",
|
||||
"If this is the very first time you're using JupyterLab, take 10 minutes\n",
|
||||
|
|
@@ -61,7 +61,7 @@
|
|||
" (Python that the computer runs).\n",
|
||||
"- The **kernel** is the running Python process behind the notebook. It\n",
|
||||
" remembers everything you've defined. If something gets weird, restart\n",
|
||||
" the kernel: top menu \u2192 *Kernel* \u2192 *Restart Kernel\u2026*.\n",
|
||||
" the kernel: top menu → *Kernel* → *Restart Kernel…*.\n",
|
||||
"- `Shift + Enter` runs a cell and moves to the next one.\n",
|
||||
"- `Ctrl + Enter` runs a cell and stays put.\n"
|
||||
]
|
||||
|
|
@@ -74,7 +74,7 @@
|
|||
"\n",
|
||||
"Drosophila males court females with a stereotyped sequence (chasing,\n",
|
||||
"wing-extension, tapping). When a male is rejected by a female (e.g.\n",
|
||||
"because she's already mated), he **learns** to suppress his courtship \u2014\n",
|
||||
"because she's already mated), he **learns** to suppress his courtship —\n",
|
||||
"even toward new, receptive females, for a while. This is a textbook\n",
|
||||
"example of *non-associative learning* in invertebrates ([review on\n",
|
||||
"PubMed](https://pubmed.ncbi.nlm.nih.gov/?term=courtship+conditioning+drosophila)).\n",
|
||||
|
|
@@ -85,7 +85,7 @@
|
|||
" species recorded.)\n",
|
||||
"- How long does the memory last? (training_length_hr,\n",
|
||||
" consolidation_length_hr columns in the metadata.)\n",
|
||||
"- Are there **individual differences** \u2014 do some males learn while others\n",
|
||||
"- Are there **individual differences** — do some males learn while others\n",
|
||||
" don't? (The \"bimodal hypothesis\" in `docs/bimodal_hypothesis.md`.)\n",
|
||||
"\n",
|
||||
"Your job, broadly, will be to **turn videos of flies into numbers and\n",
|
||||
|
|
@@ -100,17 +100,17 @@
|
|||
"\n",
|
||||
"1. **Training**: a male fly is placed with a non-receptive (mated) female.\n",
|
||||
" He courts, gets rejected, eventually gives up.\n",
|
||||
"2. *Wait* for some hours (the \"consolidation\" period \u2014 gives memory time\n",
|
||||
"2. *Wait* for some hours (the \"consolidation\" period — gives memory time\n",
|
||||
" to form).\n",
|
||||
"3. **Testing**: same male is placed with a fresh receptive female.\n",
|
||||
" Does he court her vigorously, or has he learned to give up easily?\n",
|
||||
"\n",
|
||||
"Each experiment runs in an **HD mating arena** \u2014 a small chamber with\n",
|
||||
"Each experiment runs in an **HD mating arena** — a small chamber with\n",
|
||||
"6 sub-arenas (we call them **ROIs**, for \"regions of interest\"). Each ROI\n",
|
||||
"contains one couple (a male and a female). A camera films the whole arena\n",
|
||||
"from above. So one **video** gives us 6 simultaneous experiments.\n",
|
||||
"\n",
|
||||
"The setup uses [Ethoscopes](https://www.ethoscope.com/) \u2014 open-source\n",
|
||||
"The setup uses [Ethoscopes](https://www.ethoscope.com/) — open-source\n",
|
||||
"behavioural recording boxes built in this lab. Each ethoscope is a\n",
|
||||
"machine; we have 16 in total, named `ETHOSCOPE_067`, `ETHOSCOPE_076`, etc.\n"
|
||||
]
|
||||
|
|
@@ -124,7 +124,7 @@
|
|||
"For each video, the **tracker** (a piece of software that runs after the\n",
|
||||
"recording) finds the flies frame-by-frame and writes their positions to a\n",
|
||||
"**SQLite database** (a single file, ending in `.db`). One DB per video.\n",
|
||||
"Inside each DB there are 6 tables called `ROI_1`, `ROI_2`, \u2026, `ROI_6` \u2014\n",
|
||||
"Inside each DB there are 6 tables called `ROI_1`, `ROI_2`, …, `ROI_6` —\n",
|
||||
"one per sub-arena. Each row of an ROI table is **one fly detection at one\n",
|
||||
"moment in time** with these columns:\n",
|
||||
"\n",
|
||||
|
|
@@ -139,7 +139,7 @@
|
|||
"| `has_interacted` | (legacy column, mostly unused) |\n",
|
||||
"\n",
|
||||
"If a single ROI has two flies that the tracker can see, you'll get **two\n",
|
||||
"rows with the same `t`** \u2014 one for each fly. If only one fly is detected\n",
|
||||
"rows with the same `t`** — one for each fly. If only one fly is detected\n",
|
||||
"(maybe they're on top of each other), you'll get one row.\n",
|
||||
"\n",
|
||||
"That's the heart of the data. Everything else (distances, velocities,\n",
|
||||
|
|
@@ -149,51 +149,25 @@
|
|||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Where everything lives\n",
|
||||
"\n",
|
||||
"Take a moment to memorize these locations \u2014 you'll come back to them often.\n",
|
||||
"\n",
|
||||
"| what | where |\n",
|
||||
"|---|---|\n",
|
||||
"| Tracking DBs (SQLite, one per video) | `/mnt/data/projects/cupido/tracked/` |\n",
|
||||
"| Target JSONs (the user-clicked reference points) | `/mnt/data/projects/cupido/targets/` |\n",
|
||||
"| Source video files | `/mnt/ethoscope_data/videos/` |\n",
|
||||
"| Project code (this repo) | `/home/gg/ownCloud/Work/Projects/coding/cupido/tracking/` |\n",
|
||||
"| The metadata table (xlsx + TSV) | `/home/gg/ownCloud/Work/Projects/coding/cupido/all_video_info_merged.tsv` |\n",
|
||||
"| Your notebooks | `notebooks/getting_started/` (this folder) |\n",
|
||||
"\n",
|
||||
"Let's verify a couple of these from inside Python:\n"
|
||||
]
|
||||
"source": "## Where everything lives\n\nTake a moment to memorize these locations — you'll come back to them often.\n\n| what | where |\n|---|---|\n| Tracking DBs (SQLite, one per video) | `/mnt/data/projects/cupido/tracked/` |\n| Target JSONs (the user-clicked reference points) | `/mnt/data/projects/cupido/targets/` |\n| The metadata table (xlsx + TSV) | `/mnt/data/projects/cupido/all_video_info_merged.tsv` |\n| Source video files | `/mnt/ethoscope_data/videos/` |\n| Project code (this repo) | `/home/gg/ownCloud/Work/Projects/coding/cupido/tracking/` |\n| Your notebooks | `notebooks/getting_started/` (this folder) |\n\nNotice the pattern: **everything bulky or regenerable lives under\n`/mnt/data/projects/cupido/`**. The repository itself only stores code,\ndocumentation, and small metadata files. We'll refer to that data\ndirectory as `DATA_DIR` from here on.\n\nLet's verify a couple of these from inside Python:\n"
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"metadata": {},
|
||||
"execution_count": null,
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from pathlib import Path\n",
|
||||
"\n",
|
||||
"tracked = Path(\"/mnt/data/projects/cupido/tracked\")\n",
|
||||
"targets = Path(\"/mnt/data/projects/cupido/targets\")\n",
|
||||
"\n",
|
||||
"n_dbs = len(list(tracked.glob(\"*_tracking.db\")))\n",
|
||||
"n_jsons = len(list(targets.glob(\"*.json\")))\n",
|
||||
"\n",
|
||||
"print(f\"Tracking DBs available: {n_dbs}\")\n",
|
||||
"print(f\"Target JSONs available: {n_jsons}\")\n"
|
||||
]
|
||||
"source": "from pathlib import Path\n\n# Single root for all the bulky / regenerable project data.\nDATA_DIR = Path(\"/mnt/data/projects/cupido\")\n\ntracked_dir = DATA_DIR / \"tracked\"\ntargets_dir = DATA_DIR / \"targets\"\nmetadata_tsv = DATA_DIR / \"all_video_info_merged.tsv\"\n\nprint(f\"Tracking DBs available: {len(list(tracked_dir.glob('*_tracking.db')))}\")\nprint(f\"Target JSONs available: {len(list(targets_dir.glob('*.json')))}\")\nprint(f\"Metadata TSV exists: {metadata_tsv.exists()}\")\n"
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You should see roughly 113 tracking DBs and 130 target JSONs. If those\n",
|
||||
"numbers are zero, the storage volume isn't mounted \u2014 ask Giorgio.\n",
|
||||
"numbers are zero, the storage volume isn't mounted — ask Giorgio.\n",
|
||||
"\n",
|
||||
"> **Note**: the tracking DBs are read-only inside the JupyterLab\n",
|
||||
"> container. You can read them but not modify or delete them. That's a\n",
|
||||
"> deliberate safety measure \u2014 we don't want analysis code accidentally\n",
|
||||
"> deliberate safety measure — we don't want analysis code accidentally\n",
|
||||
"> corrupting the source data.\n"
|
||||
]
|
||||
},
|
||||
|
|
@@ -203,26 +177,26 @@
|
|||
"source": [
|
||||
"## Glossary (refer back as needed)\n",
|
||||
"\n",
|
||||
"- **ROI** \u2014 *region of interest*. One sub-arena inside the HD mating\n",
|
||||
" arena. There are 6 ROIs per video, numbered 1\u20136.\n",
|
||||
"- **fly** \u2014 one detection in a single (t, ROI) cell. Two flies in the\n",
|
||||
"- **ROI** — *region of interest*. One sub-arena inside the HD mating\n",
|
||||
" arena. There are 6 ROIs per video, numbered 1–6.\n",
|
||||
"- **fly** — one detection in a single (t, ROI) cell. Two flies in the\n",
|
||||
" same ROI at the same time = two rows with the same `t`.\n",
|
||||
"- **trained** \u2014 the male had a training session before testing.\n",
|
||||
"- **naive** \u2014 the male is a control (no training).\n",
|
||||
"- **training session** \u2014 the recording where the male meets the\n",
|
||||
"- **trained** — the male had a training session before testing.\n",
|
||||
"- **naive** — the male is a control (no training).\n",
|
||||
"- **training session** — the recording where the male meets the\n",
|
||||
" non-receptive female (he gets rejected).\n",
|
||||
"- **testing session** \u2014 the recording where the male meets a fresh\n",
|
||||
"- **testing session** — the recording where the male meets a fresh\n",
|
||||
" receptive female (we measure his courtship).\n",
|
||||
"- **t (milliseconds)** \u2014 time within one session, starting at 0.\n",
|
||||
"- **(x, y) pixels** \u2014 fly position in the image. Top-left is (0, 0); x\n",
|
||||
"- **t (milliseconds)** — time within one session, starting at 0.\n",
|
||||
"- **(x, y) pixels** — fly position in the image. Top-left is (0, 0); x\n",
|
||||
" grows to the right, y grows **downward** (this is the image-coordinate\n",
|
||||
" convention, opposite of math class).\n",
|
||||
"- **machine_name** \u2014 which ethoscope recorded the video, e.g.\n",
|
||||
"- **machine_name** — which ethoscope recorded the video, e.g.\n",
|
||||
" `ETHOSCOPE_076`.\n",
|
||||
"- **species** \u2014 `Melanogaster/CS`, `Sechellia`, `Simulans`, `Yakuba`,\n",
|
||||
"- **species** — `Melanogaster/CS`, `Sechellia`, `Simulans`, `Yakuba`,\n",
|
||||
" `Erecta`, `Willistoni`, or `CS`.\n",
|
||||
"\n",
|
||||
"If you bump into other terms in the code, ask. Don't guess \u2014 biology\n",
|
||||
"If you bump into other terms in the code, ask. Don't guess — biology\n",
|
||||
"codebases pick up jargon over the years.\n"
|
||||
]
|
||||
},
|
||||
|
|
@@ -234,16 +208,16 @@
|
|||
"\n",
|
||||
"When you're ready, open these notebooks **in order**:\n",
|
||||
"\n",
|
||||
"1. `01_python_pandas_basics.ipynb` \u2014 just enough Python and pandas to\n",
|
||||
"1. `01_python_pandas_basics.ipynb` — just enough Python and pandas to\n",
|
||||
" read and manipulate tabular data.\n",
|
||||
"2. `02_explore_one_database.ipynb` \u2014 open one tracking DB, plot a fly's\n",
|
||||
"2. `02_explore_one_database.ipynb` — open one tracking DB, plot a fly's\n",
|
||||
" trajectory, see what the numbers actually look like.\n",
|
||||
"3. `03_compare_trained_vs_naive.ipynb` \u2014 your first real analysis,\n",
|
||||
"3. `03_compare_trained_vs_naive.ipynb` — your first real analysis,\n",
|
||||
" comparing groups of flies.\n",
|
||||
"\n",
|
||||
"After those, the notebooks one level up (`flies_analysis.ipynb`,\n",
|
||||
"`flies_analysis_simple.ipynb`) contain the analysis pipeline that the\n",
|
||||
"previous student built \u2014 those will make sense once you've worked\n",
|
||||
"previous student built — those will make sense once you've worked\n",
|
||||
"through the tutorials.\n",
|
||||
"\n",
|
||||
"Don't try to power through all of them in one sitting. Run a few cells,\n",
|
||||
|
|
@@ -252,4 +226,4 @@
|
|||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
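The notebook text in this diff says each tracking DB has tables `ROI_1` to `ROI_6`, and that two flies detected at the same moment appear as two rows with the same `t`. A minimal sketch of that layout, using an in-memory SQLite database as a stand-in for a real file. The column list (`t`, `x`, `y`) and the sample values are assumptions for illustration; the real tables have more columns, which are not fully shown here.

```python
import sqlite3
from collections import Counter

# In-memory stand-in for one tracking DB. A real file could be opened
# read-only with sqlite3.connect("file:<path>?mode=ro", uri=True).
conn = sqlite3.connect(":memory:")

# Assumed minimal schema: time in milliseconds plus pixel position.
conn.execute("CREATE TABLE ROI_1 (t INTEGER, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO ROI_1 VALUES (?, ?, ?)",
    [
        (0, 10, 20), (0, 30, 40),    # two flies seen at t=0
        (40, 11, 21),                # only one fly seen at t=40
        (80, 12, 22), (80, 31, 41),  # two flies seen at t=80
    ],
)

# Count detections per time point: 2 means both flies were found.
rows = conn.execute("SELECT t FROM ROI_1").fetchall()
detections_per_t = Counter(t for (t,) in rows)
both_seen = sorted(t for t, n in detections_per_t.items() if n == 2)

print(both_seen)
conn.close()
```

Time points with a single row are the ones where the tracker found only one fly, for example when the two flies overlap.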
|
||||