Move metadata xlsx/TSV to /mnt/data/projects/cupido/

Consolidates everything bulky (tracking DBs, targets, metadata
spreadsheet) under a single DATA_VOLUME root outside the ownCloud-synced
repo. Notebooks now use a visible DATA_DIR = Path(...) idiom rather than
walking up the filesystem with PROJECT_ROOT.parent — easier for students
with no Python background to follow.
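
A minimal sketch of the two idioms, for comparison. The "old" half is an
assumed reconstruction (the exact previous notebook code may have differed);
only DATA_DIR, the /mnt/data/projects/cupido path and the TSV file name are
taken from this commit.

```python
from pathlib import Path

# Old idiom (assumed shape): find the data by walking up from wherever
# the notebook happens to be running. The result depends on the current
# working directory, which is hard for a newcomer to predict.
PROJECT_ROOT = Path.cwd()
old_tsv_path = PROJECT_ROOT.parent / "all_video_info_merged.tsv"

# New idiom: name the data root once, in plain sight, and build every
# sub-path from it with the / operator.
DATA_DIR = Path("/mnt/data/projects/cupido")
tsv_path = DATA_DIR / "all_video_info_merged.tsv"

print(tsv_path.as_posix())
```

With the new idiom, moving the data again means editing a single line.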

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Giorgio Gilestro 2026-05-01 08:47:15 +01:00
parent ec56e51bf9
commit f176224150
8 changed files with 102 additions and 160 deletions


@ -16,7 +16,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 01 \u00b7 Python and pandas \u2014 just enough to be dangerous\n",
"# 01 · Python and pandas — just enough to be dangerous\n",
"\n",
"This notebook teaches the **minimum** Python and `pandas` you need to read\n",
"the rest of the project's code and write your own analyses.\n",
@ -28,10 +28,10 @@
"\n",
"External resources, in order of how much time they take:\n",
"\n",
"- \ud83e\udd98 [Python in 10 minutes (very condensed)](https://www.stavros.io/tutorials/python/)\n",
"- \ud83d\udc0d [Official Python tutorial \u2014 chapters 3\u20135](https://docs.python.org/3/tutorial/introduction.html)\n",
"- \ud83d\udc3c [pandas in 10 minutes (official)](https://pandas.pydata.org/docs/user_guide/10min.html)\n",
"- \ud83d\udcda [Python for Data Analysis (the book)](https://wesmckinney.com/book/) \u2014 free online\n"
"- 🦘 [Python in 10 minutes (very condensed)](https://www.stavros.io/tutorials/python/)\n",
"- 🐍 [Official Python tutorial — chapters 3–5](https://docs.python.org/3/tutorial/introduction.html)\n",
"- 🐼 [pandas in 10 minutes (official)](https://pandas.pydata.org/docs/user_guide/10min.html)\n",
"- 📚 [Python for Data Analysis (the book)](https://wesmckinney.com/book/) — free online\n"
]
},
{
@ -90,7 +90,7 @@
"message = \"We tracked \" + str(n_flies) + \" \" + species + \" males.\"\n",
"print(message)\n",
"\n",
"# A nicer way to build strings \u2014 f-strings (note the leading 'f'):\n",
"# A nicer way to build strings — f-strings (note the leading 'f'):\n",
"print(f\"We tracked {n_flies} {species} males.\")\n"
]
},
@ -111,7 +111,7 @@
"outputs": [],
"source": [
"machines = [\"ETHOSCOPE_076\", \"ETHOSCOPE_082\", \"ETHOSCOPE_086\"]\n",
"print(machines[0]) # first item \u2014 Python counts from 0!\n",
"print(machines[0]) # first item — Python counts from 0!\n",
"print(machines[-1]) # last item\n",
"print(len(machines)) # how many items\n",
"print(machines + [\"ETHOSCOPE_140\"]) # concatenate (returns a new list)\n"
@ -212,7 +212,7 @@
" return days / 7\n",
"\n",
"print(fly_age_in_weeks(14)) # 2.0\n",
"print(fly_age_in_weeks(5)) # 0.714\u2026\n"
"print(fly_age_in_weeks(5)) # 0.714…\n"
]
},
{
@ -242,12 +242,12 @@
"source": [
"## 9. Meet pandas\n",
"\n",
"Real data is rarely a single number \u2014 it's a **table** with rows and\n",
"Real data is rarely a single number — it's a **table** with rows and\n",
"columns (think Excel). `pandas` is the library that handles tables in\n",
"Python. The two main objects are:\n",
"\n",
"- **`Series`** \u2014 a single column with a name.\n",
"- **`DataFrame`** \u2014 a whole table.\n",
"- **`Series`** — a single column with a name.\n",
"- **`DataFrame`** — a whole table.\n",
"\n",
"By convention we import pandas as `pd`. Always.\n"
]
@ -257,17 +257,7 @@
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Read the project's metadata TSV (Tab-Separated Values).\n",
"tsv_path = \"/home/gg/ownCloud/Work/Projects/coding/cupido/all_video_info_merged.tsv\"\n",
"df = pd.read_csv(tsv_path, sep=\"\\t\")\n",
"\n",
"# How big is it?\n",
"print(f\"Rows: {len(df)}\")\n",
"print(f\"Columns: {df.shape[1]}\")\n"
]
"source": "import pandas as pd\nfrom pathlib import Path\n\n# All the project's bulky data lives under /mnt/data/projects/cupido/.\n# This pattern — define one DATA_DIR variable, then build sub-paths from\n# it — is much easier to read (and to update) than hard-coding long\n# strings everywhere.\nDATA_DIR = Path(\"/mnt/data/projects/cupido\")\ntsv_path = DATA_DIR / \"all_video_info_merged.tsv\"\n\n# Read the project's metadata TSV (Tab-Separated Values).\ndf = pd.read_csv(tsv_path, sep=\"\\t\")\n\n# How big is it?\nprint(f\"Rows: {len(df)}\")\nprint(f\"Columns: {df.shape[1]}\")\n"
},
{
"cell_type": "markdown",
@ -368,7 +358,7 @@
"mel_only = df[df[\"species\"] == \"Melanogaster/CS\"]\n",
"print(f\"Melanogaster/CS rows: {len(mel_only)}\")\n",
"\n",
"# Combine conditions with & (and) | (or) \u2014 and wrap each part in parentheses.\n",
"# Combine conditions with & (and) | (or) — and wrap each part in parentheses.\n",
"trained_mel = df[(df[\"male\"] == \"trained\") & (df[\"species\"] == \"Melanogaster/CS\")]\n",
"print(f\"trained Mel rows: {len(trained_mel)}\")\n"
]
@ -497,4 +487,4 @@
]
}
]
}
}