Colab-ready bootstrap with fully pinned deps + collapsed install cell #149

Open
bendichter wants to merge 44 commits into master from colab-bootstrap-locked

bendichter (Member) commented May 11, 2026

Closes #140. Replaces #141.

Summary

Adds a Colab-ready install bootstrap to every runnable notebook in the repo (41 notebooks). The bootstrap pins every Python package — direct and transitive — at exact versions, and presents to the user as a single collapsed cell in Colab rather than ~110 lines of pin metadata.

This PR supersedes #141. Two structural improvements over the prior approach:

  1. Full transitive lock. PR #141 used loose direct-dep pins (pynwb<3, hdmf<4, etc.) and accumulated transitive-breakage patches over time. Here, every notebook's install cell carries a fully resolved ==-pinned set generated by uv pip compile against Python 3.12 / linux. No transitive drift; reproducible on any future Colab session.
  2. Collapsed install cell in Colab. The install cell uses cellView: "form" + #@title ... { display-mode: "form" }, so in Colab it renders as a single titled bar with a play button. The ~100-line pinned set is hidden behind the form. In Jupyter / nbviewer the #@title line is just a comment — graceful degradation.
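A minimal sketch of how such a lock could be produced and turned into quoted install arguments. The `--python-version` and `--python-platform` flags are the ones named later in this thread; the function names and the idea of a `requirements.in` input file are hypothetical, and `compile_lock` assumes `uv` is on the PATH.

```python
import subprocess


def compile_lock(requirements_in: str) -> str:
    """Run `uv pip compile` and return the resolved requirements text."""
    cmd = [
        "uv", "pip", "compile", requirements_in,
        "--python-version", "3.12",
        "--python-platform", "linux",
    ]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def parse_pins(compiled: str) -> list[str]:
    """Extract requirement lines, skipping comments and blank lines."""
    pins = []
    for line in compiled.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop "# via ..." annotations
        if not line or line.startswith("#"):
            continue
        pins.append(line)
    return pins


def install_command(pins: list[str]) -> str:
    """Build the `!uv pip install --system ...` line for the install cell."""
    return "!uv pip install --system " + " ".join(f'"{p}"' for p in pins)
```

The parsing step is what lets the resolved set be inlined into the notebook cell, so the notebook needs no separate lockfile at runtime.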

Per-notebook shape

Each notebook gets four cells prepended at the top:

| # | Type | Contents |
| --- | --- | --- |
| 0 | markdown | Open-in-Colab badge linking to the notebook on master |
| 1 | markdown | `## Installing requirements` + 1-sentence explanation |
| 2 | code (cellView: form) | `#@title Installing requirements (click ▶ to run) { display-mode: "form" }`, then `!pip install -q uv`, then `!uv pip install --system "<pkg>==<ver>" ...` for every resolved package, then optional `!curl` lines for sibling helper .py files |
| 3 | markdown | Restart-runtime admonition |

No other notebook content is changed.
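The four-cell shape can be sketched with plain dictionaries, without depending on the `nbformat` package. This is an illustration only: the helper names, the wording of the markdown cells, and the example URL are hypothetical, while the `#@title` line, the `cellView: "form"` metadata, and the install commands follow the description above.

```python
import uuid

COLAB_BADGE = "https://colab.research.google.com/assets/colab-badge.svg"


def new_cell(cell_type, source, metadata=None):
    """Return a minimal nbformat-4.5-style cell dict with an id."""
    cell = {
        "cell_type": cell_type,
        "id": uuid.uuid4().hex[:8],
        "metadata": metadata or {},
        "source": source,
    }
    if cell_type == "code":
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def bootstrap_cells(colab_url, pins, helper_fetches=()):
    """Build the four prepended cells: badge, heading, install, restart note."""
    install = [
        '#@title Installing requirements (click ▶ to run) { display-mode: "form" }',
        "!pip install -q uv",
        "!uv pip install --system " + " ".join(f'"{p}"' for p in pins),
    ]
    # Optional helper-file downloads, given as (filename, url) pairs.
    install += [f"!curl -sL -o {name} {url}" for name, url in helper_fetches]
    return [
        new_cell("markdown", f"[![Open In Colab]({COLAB_BADGE})]({colab_url})"),
        new_cell("markdown", "## Installing requirements\n\nRun the cell below first."),
        new_cell("code", "\n".join(install), {"cellView": "form"}),
        new_cell("markdown", "Restart the runtime after the install finishes."),
    ]


def prepend_bootstrap(notebook, cells):
    """Return a copy of the notebook dict with the bootstrap cells on top."""
    updated = dict(notebook)
    updated["cells"] = list(cells) + list(notebook["cells"])
    return updated
```

Returning a copy rather than mutating in place keeps the original notebook dict available for a before/after comparison when reviewing.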

Scope

41 notebooks: the 36 runnable notebooks covered by #141 that still exist on master, plus 5 notebooks that landed on master after #141 was opened.

From PR #141 (36): 000039/AllenInstitute/Create_manifest, 000055/BruntonLab/peterson21/* (2), 000108/chunglab/demo/* (3), 000363/MAP/demo/browse_map_ephys_data, 000402/MICrONS/* (2), 000458/AllenInstitute/reanalysis, 000458/FlatironInstitute/* (2), 000559/dattalab/markowitz_gillis_nature_2023/* (3), 000582/Sargolini2006/000582_Sargolini2006_demo, 000718/CaiLab/zaki_2024/000718_demo, 000727/clandinin/simple_data_access/reading_data, 000728/AllenInstitute/visual_coding_ophys_tutorial, 000947/TurnerLab/public_demo/000947_demo, 000971/lernerlab/seiler_2024/* (2), 001038/DombeckLab/001038_demo, 001075/001075_paper_figure_1d, 001084/HoweLab/001084_demo, 001170/ReimerLab/public_demo/001170_demo, 001754/CatalystNeuro/001754_demo, dandi/DANDI User Guide, Part I/II (2), demos/NWBWidget-demo, tutorials/bcm_2024/analysis-demo, tutorials/cosyne_2023/* (2), tutorials/neurodatarehack_2024/* (2), tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset.

New on master since #141 (5): 000409/IBL/bwm_usage_notebook (replaces the three pre-existing IBL notebooks that were deleted on master), and 001636/TurnerLab/motor_cortex/{antidromic_detection_tutorial,turner_m1_glm,turner_m1_peth,turner_m1_usage}. Deps for these are sourced from the colocated environment.yml.

Notebooks dropped vs #141 (3): 000409/IBL/01_list_datasets, 02_behaviour_psychometric_curve, 03_analysis_Imbizo_2023 — deleted on master.

How to review

Each notebook is an independent commit. The bootstrap insertions for a given notebook only touch that notebook's .ipynb. There's one additional sweep commit (d2879d1) that bumps nbformat_minor from 4 → 5 on 15 legacy notebooks; required because cell id fields (used by the prepended cells) are an nbformat-4.5+ feature.
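A sketch of what the nbformat_minor sweep could look like, using only the standard library. The function names are hypothetical, and the one-space JSON indent is an assumption about how the notebooks are serialized.

```python
import json
from pathlib import Path


def ensure_cell_ids_supported(notebook: dict) -> bool:
    """Bump nbformat_minor to 5 for nbformat 4.x notebooks below 4.5.

    Returns True when the notebook dict was modified.
    """
    if notebook.get("nbformat") == 4 and notebook.get("nbformat_minor", 0) < 5:
        notebook["nbformat_minor"] = 5
        return True
    return False


def bump_file(path: Path) -> bool:
    """Apply the bump to a notebook on disk, rewriting it only when needed."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    changed = ensure_cell_ids_supported(notebook)
    if changed:
        text = json.dumps(notebook, indent=1, ensure_ascii=False) + "\n"
        path.write_text(text, encoding="utf-8")
    return changed
```

Note that the 4.5 schema expects every cell to carry an `id`; this sketch only changes the version field and does not add ids to pre-existing cells.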

Key differences from PR #141:

  • No "fix pins" follow-up commits: the transitive lock covers everything that the numpy<2, zarr<3, cellpose<4, ipython_genutils, pynwb<3, hdmf<4 pins and similar patches were working around.
  • Notebooks that previously had git+https://... direct deps (e.g. brunton-lab-to-nwb) are now locked to the exact resolved git commit via uv pip compile.
  • Helper-file fetches (!curl -sL -o foo.py ... for files like plot_utils.py, notebook_helpers.py, stream_nwbfile.py, ng_utils.py) are preserved from #141 and continue to run inside the install cell.

Verification

Pilot (tutorials/cosyne_2023/simple_dandiset_search.ipynb) was verified in Colab before fanning out. Verification of the remaining notebooks is the same loop:

  1. Open the notebook in Colab from this branch.
  2. Confirm the install cell renders as a collapsed ▶ Installing requirements bar.
  3. Click ▶, then Runtime → Restart session, then Run all cells below.
  4. Verify the notebook executes end-to-end.
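For step 1, the Colab link for a notebook on a given branch can be built mechanically. This sketch assumes the repository is `dandi/example-notebooks` (inferred from the fork name mentioned later in this thread, not stated in the PR body) and uses Colab's `github/<org>/<repo>/blob/<branch>/<path>` URL form; the function name is hypothetical.

```python
from urllib.parse import quote


def colab_url(repo: str, branch: str, notebook_path: str) -> str:
    """Build an open-in-Colab URL for a notebook hosted on GitHub."""
    # quote() keeps "/" but percent-encodes spaces and commas in file names.
    return (
        "https://colab.research.google.com/github/"
        f"{repo}/blob/{branch}/{quote(notebook_path)}"
    )


pilot = colab_url(
    "dandi/example-notebooks",
    "colab-bootstrap-locked",
    "tutorials/cosyne_2023/simple_dandiset_search.ipynb",
)
```

Percent-encoding matters for notebooks such as `dandi/DANDI User Guide, Part I.ipynb`, whose names contain spaces and a comma.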

Out of scope

  • "Open in Colab" pills on the index README: that change lives on branch add-colab-links, held until this PR merges (same arrangement as #141).
  • Pre-existing jetTransient validation noise in one cell output of tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset.ipynb — not introduced by this PR; flagged but left for a separate cleanup.

🤖 Generated with Claude Code

Replace loose dep pins with a full transitive lock (resolved via
`uv pip compile` against Python 3.12 / linux). Wrap the install cell
in Colab's `#@title ... display-mode: form` so it renders as a single
collapsed bar in Colab instead of confronting the user with ~110
lines of pin metadata.

Pilot only — once verified in Colab the same shape gets applied to
the other 40 notebooks tracked by #141.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bendichter and others added 28 commits May 11, 2026 16:40
bendichter and others added 13 commits May 11, 2026 16:45
Required after prepending bootstrap cells that carry IDs (nbformat 4.5+
feature). No other notebook content changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bendichter bendichter changed the title from "Colab bootstrap with pinned deps + collapsed install cell" to "Colab-ready bootstrap with fully pinned deps + collapsed install cell" May 11, 2026
@bendichter bendichter marked this pull request as ready for review May 11, 2026 21:26
bendichter and others added 2 commits May 12, 2026 12:49
Carries forward the content fix from PR #141 (commit 5b7959a) — modern
pandas requires `df.groupby(...)[["col1", "col2"]]` instead of the
legacy tuple form, which raises "Cannot subset columns with a tuple
with more than one element. Use a list instead."

Caught by the headless test sweep of PR #149.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
uv pip compile resolved pynwb==2.8.3 alongside hdmf==6.0.1 / hdmf==5.0.1
because pynwb 2.8.x's dep declaration doesn't constrain hdmf's upper bound.
The combo is broken: hdmf>=4 introduced `external_resources` as an
abstractmethod that pynwb 2.x doesn't implement, raising
"Can't instantiate abstract class NWBFile without an implementation for
abstract method 'external_resources'" at read time.

Adds hdmf<4 to the direct-dep constraints for these three notebooks and
regenerates the transitive lock. Caught by the headless test sweep.

Affects: 001038/DombeckLab, 001084/HoweLab, 000971/.../optogenetics_example_notebook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bendichter (Member, Author) commented:

Headless test results — PR #149 (colab-bootstrap-locked)

Environment: Docker linux/amd64 (Rosetta-emulated x86_64 on Apple Silicon) · Python 3.12 · per-notebook fresh system install via uv pip install --system <pinned-set> · execution via nbconvert --to script + ipython (the ZMQ kernel hangs under Rosetta — direct ipython script execution sidesteps that).
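The convert-then-execute loop described above might look roughly like the following. It is a sketch, not the actual test rig: the function names are hypothetical, the 3600 s cap is the figure quoted later in this comment, and it assumes `ipython` exits non-zero when the script raises.

```python
import subprocess
from pathlib import Path

TIMEOUT_S = 3600  # per-notebook cap quoted in this comment


def build_commands(notebook: Path, work_dir: Path) -> list[list[str]]:
    """Return the convert and execute commands for one notebook."""
    script = work_dir / (notebook.stem + ".py")
    convert = [
        "jupyter", "nbconvert", "--to", "script",
        "--output-dir", str(work_dir), str(notebook),
    ]
    # ipython (not python) so that `!shell` lines, which nbconvert rewrites
    # to get_ipython().system(...), still work in the exported script.
    execute = ["ipython", str(script)]
    return [convert, execute]


def run_headless(notebook: Path, work_dir: Path) -> str:
    """Run one notebook and classify the outcome as pass, fail, or timeout."""
    convert, execute = build_commands(notebook, work_dir)
    subprocess.run(convert, check=True)
    try:
        result = subprocess.run(execute, timeout=TIMEOUT_S)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "pass" if result.returncode == 0 else "fail"
```

Separating the timeout outcome from ordinary failures is what allows a long-running notebook to be reported in its own category rather than as an execution error.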

Total notebooks tested: 41

After fixes

  • 25 passed in initial sweep
  • 2 newly passing after a lockfile-bug fix (this PR's fault): 001038, 001084
  • 1 advanced past its original bug after a content-regression fix carried over from #141: validate_lev6 now gets past the tuple bug but hits the 1-hr headless cap, because the notebook genuinely walks every zarr chunk in the dandiset (expected behavior, also flagged in this state by the prior batch). It is counted among the 14 failures below (category D).
  • 14 still failing — categorized below; none are this PR's responsibility

Net pass rate: 27/41 (66%) of notebooks run end-to-end under the headless test rig. The remaining 14 fail for reasons unrelated to the lockfile/bootstrap approach itself — see categories below.

Bugs found and fixed in this PR

1. pynwb<3 + missing hdmf<4 → broken lockfile (3 notebooks)

uv pip compile happily resolved pynwb==2.8.3 alongside hdmf==6.0.1 (or 5.0.1) because pynwb 2.8.x's own dep declaration doesn't pin an hdmf upper bound. The combo is broken: hdmf ≥4 introduced external_resources as an abstract method that pynwb 2.x doesn't implement. Result at read time: Can't instantiate abstract class NWBFile without an implementation for abstract method 'external_resources'.
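The failure mode can be reproduced in miniature with the standard library's `abc` module. The classes below are stand-ins, not hdmf or pynwb code: they only show how a base class that gains an abstract method breaks instantiation of a subclass written before that method existed.

```python
from abc import ABC, abstractmethod


class ContainerV3(ABC):
    """Stand-in for an older base class with no extra abstract methods."""


class ContainerV4(ABC):
    """Stand-in for a newer base class that adds an abstract method."""

    @abstractmethod
    def external_resources(self):
        ...


def make_subclass(base):
    """Define a subclass the way an older downstream library would: no override."""
    return type("NWBFile", (base,), {})


def try_instantiate(base) -> str:
    """Instantiate the subclass and report whether Python allowed it."""
    try:
        make_subclass(base)()
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"
```

Because the check happens at instantiation rather than at import or install time, a resolver sees nothing wrong with the combination; only an explicit upper bound in the direct deps keeps it out of the lock.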

Fix: explicitly added hdmf<4 to the direct deps and regenerated the lock for:

  • 001038/DombeckLab/001038_demo.ipynb (hdmf: 6.0.1 → 3.14.6)
  • 001084/HoweLab/001084_demo.ipynb (hdmf: 6.0.1 → 3.14.6)
  • 000971/lernerlab/seiler_2024/optogenetics_example_notebook.ipynb (hdmf: 5.0.1 → 3.14.6); silently broken even though the headless test happened to pass, since it would have failed at NWB-write time

General implication: any notebook with pynwb<3 in its direct deps must also carry hdmf<4 explicitly. The PR-141-era playbook called this out but a few notebooks slipped through.
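One way to keep notebooks from slipping through would be a small check over each resolved pin set. This is a hypothetical guard, not something in the PR; it flags exactly the pairing described above (pynwb below 3 together with hdmf 4 or later) and ignores anything that is not a plain `name==version` pin.

```python
import re

# Captures the package name and the major version of a `name==X.Y.Z` pin.
PIN_RE = re.compile(r"^([A-Za-z0-9_.\-]+)==(\d+)(?:\.|$)")


def major_versions(pins):
    """Map normalized package name to major version for each exact pin."""
    majors = {}
    for pin in pins:
        match = PIN_RE.match(pin.strip())
        if match:
            name = match.group(1).lower().replace("_", "-")
            majors[name] = int(match.group(2))
    return majors


def has_broken_pynwb_hdmf_combo(pins) -> bool:
    """True when the lock pairs pynwb < 3 with hdmf >= 4."""
    majors = major_versions(pins)
    pynwb, hdmf = majors.get("pynwb"), majors.get("hdmf")
    return pynwb is not None and hdmf is not None and pynwb < 3 and hdmf >= 4
```

For example, `has_broken_pynwb_hdmf_combo(["pynwb==2.8.3", "hdmf==6.0.1"])` reports the broken lock, while the regenerated pair with `hdmf==3.14.6` does not.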

2. Lost content fix for validate_lev6 (1 notebook)

Carried forward commit 5b7959a from #141 — pandas requires df.groupby(...)[["col1", "col2"]] (list) not ("col1", "col2") (tuple) for multi-column subset. Was fixed on the old PR-141 branch but never reached master, so my fresh-off-master branch regressed it. Re-applied as a 1-char fix.
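A small illustration of the list form on a toy frame. The `group` column and the data are made up for the example; `col1` and `col2` are the placeholder names used above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "col1": [1, 2, 3],
        "col2": [10, 20, 30],
    }
)

# List form: selects both columns from the grouped frame.
totals = df.groupby("group")[["col1", "col2"]].sum()

# The legacy tuple form, df.groupby("group")["col1", "col2"], is rejected by
# modern pandas with the "Cannot subset columns with a tuple" error quoted
# in the commit message above.
```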

Remaining 14 failures — categorized

A. Pre-existing notebook content bugs (6) — out of scope

| Notebook | Symptom |
| --- | --- |
| 000039/AllenInstitute/Create_manifest.ipynb | `NameError: name 'pd' is not defined` (notebook uses `pd` without `import pandas as pd`) |
| 000108/chunglab/demo/2021-09-27_dandi-demo.ipynb | `TypeError: list indices must be integers` (DANDI API response shape changed) |
| 000363/MAP/demo/browse_map_ephys_data.ipynb | `File does not exist: ../../dandiset/...` (hardcoded local path) |
| dandi/DANDI User Guide, Part I.ipynb | `NameError: dandiset_id` (fill-in-the-blank cell, by design) |
| demos/NWBWidget-demo.ipynb | `ValueError: path ... not found in dandiset 000054` (stale asset path) |
| 000055/BruntonLab/peterson21/Fig_coarse_labels.ipynb | `ImportError: cannot import name 'align_by_times' from 'nwbwidgets.utils.timeseries'` (colocated plot_utils.py calls a removed nwbwidgets API) |

All of these are flagged in my project memory (project_colab_rollout_triage.md) as known content bugs that exist independently of the Colab-readiness rollout. Fixing them belongs in a separate content-cleanup PR.

B. Headless-environment limitations (4) — would pass in Colab

| Notebook | Symptom |
| --- | --- |
| 000559/dattalab/.../read_avi.ipynb | `libxcb.so.1: cannot open shared object file` (opencv loads xcb; Colab images have it, the slim test container doesn't) |
| 000582/Sargolini2006_demo.ipynb | `could not locate runnable browser` (notebook calls `webbrowser.open()`; works on Colab) |
| 001170/ReimerLab/.../001170_demo.ipynb | same `webbrowser.open()` issue |
| 000718/CaiLab/zaki_2024/000718_demo.ipynb | `EOFError: EOF when reading a line` (notebook calls `input()`; Colab is interactive) |

These would all run cleanly in Colab. The test rig is more austere than the Colab runtime by design.

C. Network / auth / external services (3)

| Notebook | Symptom |
| --- | --- |
| 000402/MICrONS/coregistration/microns_nwb_coreg_notebook.ipynb | `AuthException: token not set up` (needs a CAVE API token, already documented as a prerequisite) |
| 000055/BruntonLab/peterson21/dashboard.ipynb | `ValueError: h5py was built without ROS3 support` (PyPI h5py wheel lacks the ROS3 driver). Notebook would need to switch to remfile/fsspec streaming, or have ROS3-enabled h5py preinstalled |
| tutorials/cosyne_2023/advanced_asset_search.ipynb | same h5py ROS3 issue |

These are likely also broken on current Colab (since Colab uses the same PyPI h5py wheel). The notebooks need a content update to switch streaming backends; this PR's bootstrap can't fix it.

D. Headless test timeout (1)

| Notebook | Symptom |
| --- | --- |
| 000108/chunglab/demo/validate_lev6.ipynb | Hit the rig's 3600 s cap: under Rosetta emulation, walking every zarr chunk in the dandiset takes longer than that. After the tuple fix above, the notebook gets past the initial bug; it just doesn't finish within the test budget |

Pass list (27)

000402/MICrONS/demo/000402_microns_demo, 000409/IBL/bwm_usage_notebook, 000458/AllenInstitute/reanalysis, 000458/FlatironInstitute/000_lindi_vs_fsspec_streaming, 000458/FlatironInstitute/001_summarize_contents, 000559/dattalab/.../reproduce_figure1d, 000559/dattalab/.../reproduce_figure_S1, 000727/clandinin/.../reading_data, 000728/AllenInstitute/visual_coding_ophys_tutorial, 000947/TurnerLab/public_demo/000947_demo, 000971/lernerlab/.../fiber_photometry_example_notebook, 000971/lernerlab/.../optogenetics_example_notebook, 001038/DombeckLab/001038_demo, 001075/001075_paper_figure_1d, 001084/HoweLab/001084_demo, 001636/TurnerLab/motor_cortex/antidromic_detection_tutorial, 001636/TurnerLab/motor_cortex/turner_m1_glm, 001636/TurnerLab/motor_cortex/turner_m1_peth, 001636/TurnerLab/motor_cortex/turner_m1_usage, 001754/CatalystNeuro/001754_demo, dandi/DANDI User Guide, Part II, tutorials/bcm_2024/analysis-demo, tutorials/cosyne_2023/simple_dandiset_search, tutorials/neurodatarehack_2024/advanced_asset_search, tutorials/neurodatarehack_2024/simple_dandiset_search, tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset, 000108/chunglab/demo/dashboard.

Slowest pass: 000402/MICrONS/demo/000402_microns_demo at 3016 s (cell 52 streams ~67 GB of data — this is the notebook's purpose, not a bug).

bendichter added a commit that referenced this pull request May 12, 2026
Reworks the CI test logic per review feedback:

- Default behavior is to test every `.ipynb` in the repo. No longer
  filters on "has a uv pip install --system cell" — that filter
  silently let unbootstrapped notebooks through, which is the opposite
  of what we want.

- Adds `.github/notebook-test-exclusions.txt` for the legitimate
  skip cases: DataJoint (needs MySQL/Postgres), repo-operational meta
  notebooks (archive_stats, dandiset_size_plot), the legacy
  cosyne_2020 tutorial, and structurally-blocked notebooks (RutishauserLab,
  reproduce_figure_S3, BruntonLab plot_utils breakage). Each entry has
  a comment explaining why.

- Adds `.github/scripts/list_notebooks.py` to enumerate testable
  notebooks. In `--changed-only` mode it intersects with the PR's
  changed files. Both workflows now use it.

- Adding a new `.ipynb` without a Colab-bootstrap install cell is now
  a hard failure with a pointed error message telling the contributor
  to either add the bootstrap (link to #149 pattern) or add the
  notebook to the exclusion list with justification.

- Weekly-sweep report breaks failures into a 'Missing the Colab-bootstrap
  install cell' section (actionable, contributor-facing) vs other failures
  (execution errors, install errors, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
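The selection logic described in this commit message could be sketched as follows. This is not the contents of `.github/scripts/list_notebooks.py`; the function names are hypothetical, the exclusion-file format (one path per line, `#` comments) is an assumption, and the bootstrap marker is the `uv pip install --system` string the message mentions.

```python
BOOTSTRAP_MARKER = "uv pip install --system"


def load_exclusions(text: str) -> set[str]:
    """Parse an exclusion file: one path per line, `#` starts a comment."""
    paths = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            paths.add(entry)
    return paths


def select_notebooks(all_paths, exclusions, changed=None):
    """Return testable notebooks, optionally limited to changed files."""
    selected = [p for p in all_paths if p.endswith(".ipynb") and p not in exclusions]
    if changed is not None:
        changed_set = set(changed)
        selected = [p for p in selected if p in changed_set]
    return sorted(selected)


def has_bootstrap(notebook: dict) -> bool:
    """True when any code cell contains the bootstrap install command."""
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        if cell.get("cell_type") == "code" and BOOTSTRAP_MARKER in source:
            return True
    return False
```

Keeping `has_bootstrap` separate from selection matches the stated intent: every non-excluded notebook is selected, and a missing bootstrap becomes a reported failure rather than a silent skip.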
bendichter added a commit to alessandratrapani/example-notebooks that referenced this pull request May 12, 2026
Adds the four-cell Colab bootstrap (badge / install requirements header /
collapsed install cell / restart admonition) with a fully transitive lock
generated via 'uv pip compile --python-version 3.12 --python-platform
linux'. Direct deps from environment.yml plus an explicit pynwb<3 + hdmf<4
constraint (the latter avoids the broken pynwb-2.x-with-hdmf-6 combo;
see dandi#149).

Also drops the 'client.dandi_authenticate()' call. The notebook's own
inline comment said to remove it once 001172 was published — verified
via the DANDI API that the dandiset's embargo_status is now OPEN, so
the call is no longer needed (and would break Colab use because Colab
users don't have DANDI creds wired up).

Follows the same pattern as the 41 notebooks bootstrapped in dandi#149.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>