Colab-ready bootstrap with fully pinned deps + collapsed install cell #149
bendichter wants to merge 44 commits
Replace loose dep pins with a full transitive lock (resolved via `uv pip compile` against Python 3.12 / linux). Wrap the install cell in Colab's `#@title ... display-mode: form` so it renders as a single collapsed bar in Colab instead of confronting the user with ~110 lines of pin metadata. Pilot only — once verified in Colab the same shape gets applied to the other 40 notebooks tracked by #141. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Required after prepending bootstrap cells that carry IDs (nbformat 4.5+ feature). No other notebook content changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
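As a rough illustration of that sweep, the sketch below bumps `nbformat_minor` to 5 only when a notebook already contains cells with `id` fields. It works on plain dicts as loaded with `json`; the function name `bump_minor_for_cell_ids` and the sample notebook are hypothetical, not the script behind the commit.

```python
import json


def bump_minor_for_cell_ids(notebook: dict) -> dict:
    """Raise nbformat_minor to 5 when any cell carries an `id` field."""
    has_ids = any("id" in cell for cell in notebook.get("cells", []))
    needs_bump = notebook.get("nbformat") == 4 and notebook.get("nbformat_minor", 0) < 5
    if has_ids and needs_bump:
        notebook["nbformat_minor"] = 5
    return notebook


# Hypothetical legacy notebook (nbformat 4.4) after a bootstrap cell with an id was prepended.
legacy = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "id": "colab-badge", "metadata": {}, "source": ["badge"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "outputs": [], "execution_count": None},
    ],
}

upgraded = bump_minor_for_cell_ids(legacy)
print(json.dumps({"nbformat_minor": upgraded["nbformat_minor"]}))
```

Only the format version is touched; the cells themselves are left as they were, matching the "no other notebook content changes" constraint.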
Carries forward the content fix from PR #141 (commit 5b7959a) — modern pandas requires `df.groupby(...)[["col1", "col2"]]` instead of the legacy tuple form, which raises "Cannot subset columns with a tuple with more than one element. Use a list instead." Caught by the headless test sweep of PR #149. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
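A minimal sketch of that pandas change, assuming a toy DataFrame; the column names `session`, `rate` and `depth` are made up for illustration and do not come from the affected notebook.

```python
import pandas as pd

# Hypothetical data; the real notebook's columns are different.
df = pd.DataFrame(
    {
        "session": ["a", "a", "b", "b"],
        "rate": [1.0, 3.0, 5.0, 7.0],
        "depth": [10, 20, 30, 40],
    }
)

# List selection: accepted by both older and modern pandas.
means = df.groupby("session")[["rate", "depth"]].mean()
print(means)

# Legacy tuple selection: modern pandas rejects it with the error quoted above.
# Older releases may still accept it, in which case nothing is printed here.
try:
    df.groupby("session")["rate", "depth"].mean()
except ValueError as err:
    print(f"tuple form rejected: {err}")
```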
uv pip compile resolved pynwb==2.8.3 alongside hdmf==6.0.1 / hdmf==5.0.1 because pynwb 2.8.x's dep declaration doesn't constrain hdmf's upper bound. The combo is broken: hdmf>=4 introduced `external_resources` as an abstractmethod that pynwb 2.x doesn't implement, raising "Can't instantiate abstract class NWBFile without an implementation for abstract method 'external_resources'" at read time. Adds hdmf<4 to the direct-dep constraints for these three notebooks and regenerates the transitive lock. Caught by the headless test sweep. Affects: 001038/DombeckLab, 001084/HoweLab, 000971/.../optogenetics_example_notebook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
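One way to catch this combination before a notebook runs is to scan the resolved pins. The sketch below is an assumption about how such a check could look, not part of this PR; `parse_pins` and `has_broken_pynwb_hdmf_combo` are hypothetical helper names, and the "fixed" hdmf version is a placeholder for any pin below 4.

```python
def parse_pins(lines):
    """Map lowercase package name -> exact version for every `name==version` line."""
    pins = {}
    for line in lines:
        requirement = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in requirement:
            continue
        name, version = requirement.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def major(version):
    """Leading integer component of a dotted version string."""
    return int(version.split(".", 1)[0])


def has_broken_pynwb_hdmf_combo(pins):
    """True when a pynwb 2.x pin is resolved alongside hdmf 4 or newer."""
    if "pynwb" not in pins or "hdmf" not in pins:
        return False
    return major(pins["pynwb"]) == 2 and major(pins["hdmf"]) >= 4


broken_lock = ["pynwb==2.8.3", "hdmf==6.0.1  # via pynwb"]
fixed_lock = ["pynwb==2.8.3", "hdmf==3.14.5"]  # hypothetical hdmf pin below 4

print(has_broken_pynwb_hdmf_combo(parse_pins(broken_lock)))  # True
print(has_broken_pynwb_hdmf_combo(parse_pins(fixed_lock)))   # False
```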
Headless test results: PR #149 (colab-bootstrap-locked)
Environment: Docker linux/amd64 (Rosetta-emulated x86_64 on Apple Silicon) · Python 3.12 · per-notebook fresh system install via …
Total notebooks tested: 41
After fixes
Net pass rate: 27/41 (66%) of notebooks run end-to-end under the headless test rig. The remaining 14 fail for reasons unrelated to the lockfile/bootstrap approach itself; see the categories below.
Bugs found and fixed in this PR
1. …
A. Known content bugs (6)
| Notebook | Symptom |
|---|---|
| `000039/AllenInstitute/Create_manifest.ipynb` | `NameError: name 'pd' is not defined`: notebook uses `pd` without `import pandas as pd` |
| `000108/chunglab/demo/2021-09-27_dandi-demo.ipynb` | `TypeError: list indices must be integers`: DANDI API response shape changed |
| `000363/MAP/demo/browse_map_ephys_data.ipynb` | `File does not exist: ../../dandiset/...`: hardcoded local path |
| `dandi/DANDI User Guide, Part I.ipynb` | `NameError: dandiset_id`: fill-in-the-blank cell, by design |
| `demos/NWBWidget-demo.ipynb` | `ValueError: path ... not found in dandiset 000054`: stale asset path |
| `000055/BruntonLab/peterson21/Fig_coarse_labels.ipynb` | `ImportError: cannot import name 'align_by_times' from 'nwbwidgets.utils.timeseries'`: colocated `plot_utils.py` calls a removed nwbwidgets API |
All of these are flagged in my project memory (project_colab_rollout_triage.md) as known content bugs that exist independently of the Colab-readiness rollout. Fixing them belongs in a separate content-cleanup PR.
B. Headless-environment limitations (4) — would pass in Colab
| Notebook | Symptom |
|---|---|
| `000559/dattalab/.../read_avi.ipynb` | `libxcb.so.1: cannot open shared object file`: opencv loads xcb; Colab images have it, the slim test container doesn't |
| `000582/Sargolini2006_demo.ipynb` | `could not locate runnable browser`: notebook calls `webbrowser.open()`; works on Colab |
| `001170/ReimerLab/.../001170_demo.ipynb` | Same `webbrowser.open()` issue |
| `000718/CaiLab/zaki_2024/000718_demo.ipynb` | `EOFError: EOF when reading a line`: notebook calls `input()`; Colab is interactive |
These would all run cleanly in Colab. The test rig is more austere than the Colab runtime by design.
C. Network / auth / external services (3)
| Notebook | Symptom |
|---|---|
| `000402/MICrONS/coregistration/microns_nwb_coreg_notebook.ipynb` | `AuthException: token not set up`: needs a CAVE API token (already documented as a prerequisite) |
| `000055/BruntonLab/peterson21/dashboard.ipynb` | `ValueError: h5py was built without ROS3 support`: the PyPI h5py wheel lacks the ROS3 driver. The notebook would need to switch to remfile/fsspec streaming, or have a ROS3-enabled h5py preinstalled |
| `tutorials/cosyne_2023/advanced_asset_search.ipynb` | Same h5py ROS3 issue |
These are likely also broken on current Colab (since Colab uses the same PyPI h5py wheel). The notebooks need a content update to switch streaming backends; this PR's bootstrap can't fix it.
D. Headless test timeout (1)
| Notebook | Symptom |
|---|---|
| `000108/chunglab/demo/validate_lev6.ipynb` | Hit the rig's 3600 s cap: under Rosetta emulation, walking every zarr chunk in the dandiset takes longer than that. After the tuple fix above, the notebook gets past the initial bug; it just doesn't finish within the test budget |
Pass list (27)
000402/MICrONS/demo/000402_microns_demo, 000409/IBL/bwm_usage_notebook, 000458/AllenInstitute/reanalysis, 000458/FlatironInstitute/000_lindi_vs_fsspec_streaming, 000458/FlatironInstitute/001_summarize_contents, 000559/dattalab/.../reproduce_figure1d, 000559/dattalab/.../reproduce_figure_S1, 000727/clandinin/.../reading_data, 000728/AllenInstitute/visual_coding_ophys_tutorial, 000947/TurnerLab/public_demo/000947_demo, 000971/lernerlab/.../fiber_photometry_example_notebook, 000971/lernerlab/.../optogenetics_example_notebook, 001038/DombeckLab/001038_demo, 001075/001075_paper_figure_1d, 001084/HoweLab/001084_demo, 001636/TurnerLab/motor_cortex/antidromic_detection_tutorial, 001636/TurnerLab/motor_cortex/turner_m1_glm, 001636/TurnerLab/motor_cortex/turner_m1_peth, 001636/TurnerLab/motor_cortex/turner_m1_usage, 001754/CatalystNeuro/001754_demo, dandi/DANDI User Guide, Part II, tutorials/bcm_2024/analysis-demo, tutorials/cosyne_2023/simple_dandiset_search, tutorials/neurodatarehack_2024/advanced_asset_search, tutorials/neurodatarehack_2024/simple_dandiset_search, tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset, 000108/chunglab/demo/dashboard.
Slowest pass: 000402/MICrONS/demo/000402_microns_demo at 3016 s (cell 52 streams ~67 GB of data — this is the notebook's purpose, not a bug).
Reworks the CI test logic per review feedback:
- Default behavior is to test every `.ipynb` in the repo. It no longer filters on "has a uv pip install --system cell"; that filter silently let unbootstrapped notebooks through, which is the opposite of what we want.
- Adds `.github/notebook-test-exclusions.txt` for the legitimate skip cases: DataJoint (needs MySQL/Postgres), repo-operational meta notebooks (archive_stats, dandiset_size_plot), the legacy cosyne_2020 tutorial, and structurally-blocked notebooks (RutishauserLab, reproduce_figure_S3, BruntonLab plot_utils breakage). Each entry has a comment explaining why.
- Adds `.github/scripts/list_notebooks.py` to enumerate testable notebooks. In `--changed-only` mode it intersects with the PR's changed files. Both workflows now use it.
- Adding a new `.ipynb` without a Colab-bootstrap install cell is now a hard failure with a pointed error message telling the contributor to either add the bootstrap (link to #149 pattern) or add the notebook to the exclusion list with justification.
- Weekly-sweep report breaks failures into a 'Missing the Colab-bootstrap install cell' section (actionable, contributor-facing) vs other failures (execution errors, install errors, etc).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
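The selection rules in that commit message can be sketched roughly as follows. This is not the contents of `.github/scripts/list_notebooks.py`; the function names, the one-path-per-line exclusion format with `#` comments, and the sample paths are all assumptions made for illustration. The bootstrap check keys on the `uv pip install --system` marker mentioned above.

```python
from pathlib import PurePosixPath


def parse_exclusions(text):
    """Read one path per line, ignoring blank lines and `#` comments (assumed format)."""
    excluded = set()
    for raw in text.splitlines():
        entry = raw.split("#", 1)[0].strip()
        if entry:
            excluded.add(entry)
    return excluded


def select_notebooks(repo_files, excluded, changed_files=None):
    """Every non-excluded .ipynb; with changed_files, only those the PR touched."""
    notebooks = [
        path for path in repo_files
        if PurePosixPath(path).suffix == ".ipynb" and path not in excluded
    ]
    if changed_files is not None:
        changed = set(changed_files)
        notebooks = [path for path in notebooks if path in changed]
    return sorted(notebooks)


def has_bootstrap_cell(notebook):
    """True when some code cell contains the `uv pip install --system` marker."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        if "uv pip install --system" in source:
            return True
    return False


def missing_bootstrap(notebooks_by_path):
    """Sorted paths of notebooks that lack the install cell (would fail CI)."""
    return sorted(
        path for path, nb in notebooks_by_path.items() if not has_bootstrap_cell(nb)
    )


# Hypothetical repository layout and exclusion file.
exclusions_text = """\
# needs a database server
labA/datajoint_demo.ipynb

labB/archive_stats.ipynb  # repo-operational meta notebook
"""
repo_files = [
    "README.md",
    "labA/datajoint_demo.ipynb",
    "labB/archive_stats.ipynb",
    "labC/demo.ipynb",
    "labD/tutorial.ipynb",
]
excluded = parse_exclusions(exclusions_text)
all_testable = select_notebooks(repo_files, excluded)
changed_only = select_notebooks(repo_files, excluded, changed_files=["labD/tutorial.ipynb", "README.md"])

notebooks = {
    "labC/demo.ipynb": {"cells": [{"cell_type": "code", "source": [
        "!pip install -q uv\n", '!uv pip install --system "pynwb==2.8.3"']}]},
    "labD/tutorial.ipynb": {"cells": [{"cell_type": "code", "source": "print('no install cell')"}]},
}
print(all_testable, changed_only, missing_bootstrap(notebooks))
```

Testing everything by default and listing skips explicitly means a notebook without the install cell shows up as a failure rather than silently dropping out of the sweep.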
Adds the four-cell Colab bootstrap (badge / install requirements header / collapsed install cell / restart admonition) with a fully transitive lock generated via 'uv pip compile --python-version 3.12 --python-platform linux'. Direct deps from environment.yml plus an explicit pynwb<3 + hdmf<4 constraint (the latter avoids the broken pynwb-2.x-with-hdmf-6 combo; see dandi#149). Also drops the 'client.dandi_authenticate()' call. The notebook's own inline comment said to remove it once 001172 was published — verified via the DANDI API that the dandiset's embargo_status is now OPEN, so the call is no longer needed (and would break Colab use because Colab users don't have DANDI creds wired up). Follows the same pattern as the 41 notebooks bootstrapped in dandi#149. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #140. Replaces #141.
Summary
Adds a Colab-ready install bootstrap to every runnable notebook in the repo (41 notebooks). The bootstrap pins every Python package — direct and transitive — at exact versions, and presents to the user as a single collapsed cell in Colab rather than ~110 lines of pin metadata.
This PR supersedes #141. Two structural improvements over the prior approach:
- … `pynwb<3`, `hdmf<4`, etc.) and accumulated transitive-breakage patches over time. Here, every notebook's install cell carries a fully resolved `==`-pinned set generated by `uv pip compile` against Python 3.12 / linux. No transitive drift; reproducible on any future Colab session.
- … `cellView: "form"` + `#@title ... { display-mode: "form" }`, so in Colab it renders as a single titled bar with a play button. The ~100-line pinned set is hidden behind the form. In Jupyter / nbviewer the `#@title` line is just a comment, so it degrades gracefully.

Per-notebook shape
Each notebook gets four cells prepended at the top:
1. Badge cell: … master
2. Header cell: `## Installing requirements` + 1-sentence explanation
3. Install cell (`cellView: form`): `#@title Installing requirements (click ▶ to run) { display-mode: "form" }` → `!pip install -q uv` → `!uv pip install --system "<pkg>==<ver>" ...` for every resolved package → optional `!curl` lines for sibling helper `.py` files
4. Restart admonition

No other notebook content is changed.
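To make the install cell's shape concrete, here is a sketch that assembles it as an nbformat-style dict. `build_install_cell`, the cell id, the example pins and the helper URL are hypothetical, and placing all pins on one `uv pip install` line is a simplification; the actual cells may lay the pins out differently.

```python
import json


def build_install_cell(pins, helper_urls=None):
    """Assemble a collapsed Colab install cell from exact pins and optional helper files."""
    lines = [
        '#@title Installing requirements (click ▶ to run) { display-mode: "form" }',
        "!pip install -q uv",
    ]
    quoted = " ".join(f'"{name}=={version}"' for name, version in sorted(pins.items()))
    lines.append(f"!uv pip install --system {quoted}")
    for filename, url in (helper_urls or {}).items():
        lines.append(f"!curl -sL -o {filename} {url}")
    return {
        "cell_type": "code",
        "execution_count": None,
        "id": "colab-install",  # hypothetical id; cell ids need nbformat 4.5+
        "metadata": {"cellView": "form"},  # Colab renders the cell as a collapsed form
        "outputs": [],
        # nbformat convention: every line but the last ends with a newline
        "source": [line + "\n" for line in lines[:-1]] + [lines[-1]],
    }


cell = build_install_cell(
    {"pynwb": "2.8.3", "hdmf": "3.14.5"},  # illustrative pins only
    {"plot_utils.py": "https://example.org/plot_utils.py"},  # placeholder URL
)
print(json.dumps(cell, indent=2, ensure_ascii=False))
```

Outside Colab the `#@title` line is an ordinary Python comment, so the same cell still runs as a plain code cell.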
Scope
41 notebooks: the 36 runnable notebooks covered by #141 that still exist on master, plus 5 notebooks that landed on master after #141 was opened.
From PR #141 (36):
`000039/AllenInstitute/Create_manifest`, `000055/BruntonLab/peterson21/*` (2), `000108/chunglab/demo/*` (3), `000363/MAP/demo/browse_map_ephys_data`, `000402/MICrONS/*` (2), `000458/AllenInstitute/reanalysis`, `000458/FlatironInstitute/*` (2), `000559/dattalab/markowitz_gillis_nature_2023/*` (3), `000582/Sargolini2006/000582_Sargolini2006_demo`, `000718/CaiLab/zaki_2024/000718_demo`, `000727/clandinin/simple_data_access/reading_data`, `000728/AllenInstitute/visual_coding_ophys_tutorial`, `000947/TurnerLab/public_demo/000947_demo`, `000971/lernerlab/seiler_2024/*` (2), `001038/DombeckLab/001038_demo`, `001075/001075_paper_figure_1d`, `001084/HoweLab/001084_demo`, `001170/ReimerLab/public_demo/001170_demo`, `001754/CatalystNeuro/001754_demo`, `dandi/DANDI User Guide, Part I/II` (2), `demos/NWBWidget-demo`, `tutorials/bcm_2024/analysis-demo`, `tutorials/cosyne_2023/*` (2), `tutorials/neurodatarehack_2024/*` (2), `tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset`.

New on master since #141 (5):
`000409/IBL/bwm_usage_notebook` (replaces the three pre-existing IBL notebooks that were deleted on master), and `001636/TurnerLab/motor_cortex/{antidromic_detection_tutorial,turner_m1_glm,turner_m1_peth,turner_m1_usage}`. Deps for these are sourced from the colocated `environment.yml`.

Notebooks dropped vs #141 (3):
`000409/IBL/01_list_datasets`, `02_behaviour_psychometric_curve`, `03_analysis_Imbizo_2023`: deleted on master.

How to review
Each notebook is an independent commit. The bootstrap insertions for a given notebook only touch that notebook's `.ipynb`. There's one additional sweep commit (d2879d1) that bumps `nbformat_minor` from 4 → 5 on 15 legacy notebooks; required because cell `id` fields (used by the prepended cells) are an nbformat-4.5+ feature.

Per-PR-#141 difference highlights:
- … `numpy<2`, `zarr<3`, `cellpose<4`, `ipython_genutils`, `pynwb<3`, `hdmf<4` etc. were patching.
- `git+https://...` direct deps (e.g. `brunton-lab-to-nwb`) are now locked to the exact resolved git commit via `uv pip compile`.
- … `!curl -sL -o foo.py ...` for files like `plot_utils.py`, `notebook_helpers.py`, `stream_nwbfile.py`, `ng_utils.py`) are preserved from #141 and continue to run inside the install cell.

Verification
Pilot (`tutorials/cosyne_2023/simple_dandiset_search.ipynb`) was verified in Colab before fanning out. Verification of the remaining notebooks is the same loop:
- … ▶ Installing requirements bar.

Out of scope
- … `add-colab-links`, held until this PR merges (same arrangement as #141).
- … `jetTransient` validation noise in one cell output of `tutorials/open_data_quick_start_2026/Get-to-know-a-Dandiset.ipynb`: not introduced by this PR; flagged but left for a separate cleanup.

🤖 Generated with Claude Code