Move metadata xlsx/TSV to /mnt/data/projects/cupido/

Consolidates everything bulky (tracking DBs, targets, metadata spreadsheet) under a single DATA_VOLUME root outside the ownCloud-synced repo.

Notebooks now use a visible DATA_DIR = Path(...) idiom rather than walking up the filesystem with PROJECT_ROOT.parent, which is easier for students with no Python background to follow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 08:47:15 +01:00
commit f176224150
parent ec56e51bf9
8 changed files with 102 additions and 160 deletions
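
The DATA_DIR idiom described in the commit message can be sketched as below. This is only an illustration, not the notebooks' actual code: the `PROJECT_ROOT` value and the `tracking` subfolder name are hypothetical, and only `DATA_DIR` and the TSV filename are taken from the diff.

```python
from pathlib import Path

# Old style (assumed shape, based on the commit message): derive the data
# location by walking up from the repo. PROJECT_ROOT is a hypothetical value.
PROJECT_ROOT = Path("/home/user/repo/cupido")
old_tsv_path = PROJECT_ROOT.parent / "all_video_info_merged.tsv"

# New style: one visible root, with every sub-path built from it.
DATA_DIR = Path("/mnt/data/projects/cupido")
tsv_path = DATA_DIR / "all_video_info_merged.tsv"
tracking_dir = DATA_DIR / "tracking"  # hypothetical subfolder name

# Building a Path does not touch the disk, so this runs on any machine.
print(tsv_path)
print(tracking_dir)
```

If the data volume moves, only the `DATA_DIR` line needs to change; every path derived from it follows automatically.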
--- a/notebooks/getting_started/01_python_pandas_basics.ipynb
+++ b/notebooks/getting_started/01_python_pandas_basics.ipynb
@@ -16,7 +16,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# 01 \u00b7 Python and pandas \u2014 just enough to be dangerous\n",
+    "# 01 · Python and pandas — just enough to be dangerous\n",
    "\n",
    "This notebook teaches the **minimum** Python and `pandas` you need to read\n",
    "the rest of the project's code and write your own analyses.\n",
@@ -28,10 +28,10 @@
    "\n",
    "External resources, in order of how much time they take:\n",
    "\n",
-    "- \ud83e\udd98 [Python in 10 minutes (very condensed)](https://www.stavros.io/tutorials/python/)\n",
-    "- \ud83d\udc0d [Official Python tutorial \u2014 chapters 3\u20135](https://docs.python.org/3/tutorial/introduction.html)\n",
-    "- \ud83d\udc3c [pandas in 10 minutes (official)](https://pandas.pydata.org/docs/user_guide/10min.html)\n",
-    "- \ud83d\udcda [Python for Data Analysis (the book)](https://wesmckinney.com/book/) \u2014 free online\n"
+    "- 🦘 [Python in 10 minutes (very condensed)](https://www.stavros.io/tutorials/python/)\n",
+    "- 🐍 [Official Python tutorial — chapters 3–5](https://docs.python.org/3/tutorial/introduction.html)\n",
+    "- 🐼 [pandas in 10 minutes (official)](https://pandas.pydata.org/docs/user_guide/10min.html)\n",
+    "- 📚 [Python for Data Analysis (the book)](https://wesmckinney.com/book/) — free online\n"
   ]
  },
  {
@@ -90,7 +90,7 @@
    "message = \"We tracked \" + str(n_flies) + \" \" + species + \" males.\"\n",
    "print(message)\n",
    "\n",
-    "# A nicer way to build strings \u2014 f-strings (note the leading 'f'):\n",
+    "# A nicer way to build strings — f-strings (note the leading 'f'):\n",
    "print(f\"We tracked {n_flies} {species} males.\")\n"
   ]
  },
@@ -111,7 +111,7 @@
   "outputs": [],
   "source": [
    "machines = [\"ETHOSCOPE_076\", \"ETHOSCOPE_082\", \"ETHOSCOPE_086\"]\n",
-    "print(machines[0])         # first item \u2014 Python counts from 0!\n",
+    "print(machines[0])         # first item — Python counts from 0!\n",
    "print(machines[-1])        # last item\n",
    "print(len(machines))       # how many items\n",
    "print(machines + [\"ETHOSCOPE_140\"])  # concatenate (returns a new list)\n"
@@ -212,7 +212,7 @@
    "    return days / 7\n",
    "\n",
    "print(fly_age_in_weeks(14))    # 2.0\n",
-    "print(fly_age_in_weeks(5))     # 0.714\u2026\n"
+    "print(fly_age_in_weeks(5))     # 0.714…\n"
   ]
  },
  {
@@ -242,12 +242,12 @@
   "source": [
    "## 9.  Meet pandas\n",
    "\n",
-    "Real data is rarely a single number \u2014 it's a **table** with rows and\n",
+    "Real data is rarely a single number — it's a **table** with rows and\n",
    "columns (think Excel). `pandas` is the library that handles tables in\n",
    "Python. The two main objects are:\n",
    "\n",
-    "- **`Series`** \u2014 a single column with a name.\n",
-    "- **`DataFrame`** \u2014 a whole table.\n",
+    "- **`Series`** — a single column with a name.\n",
+    "- **`DataFrame`** — a whole table.\n",
    "\n",
    "By convention we import pandas as `pd`. Always.\n"
   ]
@@ -257,17 +257,7 @@
   "metadata": {},
   "execution_count": null,
   "outputs": [],
-   "source": [
-    "import pandas as pd\n",
-    "\n",
-    "# Read the project's metadata TSV (Tab-Separated Values).\n",
-    "tsv_path = \"/home/gg/ownCloud/Work/Projects/coding/cupido/all_video_info_merged.tsv\"\n",
-    "df = pd.read_csv(tsv_path, sep=\"\\t\")\n",
-    "\n",
-    "# How big is it?\n",
-    "print(f\"Rows: {len(df)}\")\n",
-    "print(f\"Columns: {df.shape[1]}\")\n"
-   ]
+   "source": "import pandas as pd\nfrom pathlib import Path\n\n# All the project's bulky data lives under /mnt/data/projects/cupido/.\n# This pattern — define one DATA_DIR variable, then build sub-paths from\n# it — is much easier to read (and to update) than hard-coding long\n# strings everywhere.\nDATA_DIR = Path(\"/mnt/data/projects/cupido\")\ntsv_path = DATA_DIR / \"all_video_info_merged.tsv\"\n\n# Read the project's metadata TSV (Tab-Separated Values).\ndf = pd.read_csv(tsv_path, sep=\"\\t\")\n\n# How big is it?\nprint(f\"Rows: {len(df)}\")\nprint(f\"Columns: {df.shape[1]}\")\n"
  },
  {
   "cell_type": "markdown",
@@ -368,7 +358,7 @@
    "mel_only = df[df[\"species\"] == \"Melanogaster/CS\"]\n",
    "print(f\"Melanogaster/CS rows: {len(mel_only)}\")\n",
    "\n",
-    "# Combine conditions with & (and) | (or) \u2014 and wrap each part in parentheses.\n",
+    "# Combine conditions with & (and) | (or) — and wrap each part in parentheses.\n",
    "trained_mel = df[(df[\"male\"] == \"trained\") & (df[\"species\"] == \"Melanogaster/CS\")]\n",
    "print(f\"trained Mel rows: {len(trained_mel)}\")\n"
   ]
@@ -497,4 +487,4 @@
   ]
  }
 ]
-}
+}