feat(server): /load backend=polars, /load_expr config parity, per-client search isolation (#851)#882
paddymul wants to merge 2 commits into
…g, init_sd
The headless ServerDataflow that powers mode="buckaroo" sessions
already supports these kwargs (it inherits CustomizableDataflow); they
just weren't exposed through the POST /load body. Now you can match a
notebook BuckarooInfiniteWidget invocation server-side:
POST /load { "session": "demo", "path": "...", "mode": "buckaroo",
"column_config_overrides": {...},
"extra_grid_config": {"rowHeight": 70},
"init_sd": {...} }
All three are optional. Existing /load calls (which don't include the
new fields) are unchanged — body.get(...) returns None and the
dataflow handles None just like the widget does.
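The backward-compatibility claim above can be illustrated with a minimal sketch. The helper name `extract_dataflow_kwargs` and the `data.parquet` path are hypothetical, chosen only for illustration; the actual handler code is not shown in this commit message. The sketch assumes the request body has already been parsed into a dict.

```python
from typing import Any, Optional

# The three optional fields named in the commit message.
OPTIONAL_KEYS = ("column_config_overrides", "extra_grid_config", "init_sd")


def extract_dataflow_kwargs(body: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Hypothetical helper: pull the optional config fields out of a /load body."""
    # dict.get returns None for absent keys, so a body without the new
    # fields yields None for each of them.
    return {key: body.get(key) for key in OPTIONAL_KEYS}


# "data.parquet" is a made-up path for this example.
old_body = {"session": "demo", "path": "data.parquet", "mode": "buckaroo"}
new_body = {**old_body, "extra_grid_config": {"rowHeight": 70}}

old_kwargs = extract_dataflow_kwargs(old_body)
new_kwargs = extract_dataflow_kwargs(new_body)
```

With this shape, an older client that never sends the new fields passes `None` for all three, which the commit message says the dataflow treats the same way the widget does.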
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures work-in-progress standalone-server /load enhancements that were sitting uncommitted on feat/load-column-config-overrides.
* backend='polars' for POST /load: new data_loading_polars.py (PolarsServerDataflow, load_file_polars, get_metadata_polars, handle_infinite_request_buckaroo_polars). mode='buckaroo' can now build a polars-backed dataflow. handlers.py validates/routes the backend; websocket_handler.py routes the infinite row-fetch to the polars handler. polars stays optional: it is imported lazily, only when requested.
* Per-client search_string (#851): search_string moves off the shared SessionState onto each DataStreamHandler. Two clients sharing a session were clobbering each other's live-typed filter. A targeted highlight overlay (_send_highlight_overlay) goes to the typing client only; the term is stripped before snapshotting onto the session and reset to "" on every new /load.
* /load_expr config parity: LoadExprHandler forwards column_config_overrides / extra_grid_config / init_sd to XorqServerDataflow (mirrors the /load kwargs added in 6bf4c12).
* styling: merge init_sd displayer_args before injecting lowcode-op highlight metadata, so an init_sd-promoted string column still picks up highlight_phrase and an explicit init_sd highlight wins.
* deps: add matplotlib (used by customizations/histogram.py).
* js: DFViewerDirect storybook story (direct <DFViewer> consumer demo).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
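The per-client search_string change can be sketched roughly as follows. These classes are simplified stand-ins that reuse the names `SessionState` and `DataStreamHandler` from the commit message; the fields and the methods `on_search` and `on_new_load` are assumptions made for illustration, not the real server code.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Simplified stand-in: state shared by every client on one session."""
    path: str = ""
    buckaroo_state: dict = field(default_factory=dict)
    # Note: no search_string here any more.


class DataStreamHandler:
    """Simplified stand-in: one instance per connected client."""

    def __init__(self, session: SessionState) -> None:
        self.session = session
        # The live-typed filter lives on the handler, not the shared session.
        self.search_string = ""

    def on_search(self, term: str) -> None:
        # Hypothetical hook: only this client's filter changes.
        self.search_string = term

    def on_new_load(self) -> None:
        # Hypothetical hook: a new /load resets the filter to "".
        self.search_string = ""


shared = SessionState(path="demo")
client_a = DataStreamHandler(shared)
client_b = DataStreamHandler(shared)
client_a.on_search("foo")
client_b.on_search("bar")
```

Because each handler owns its own `search_string`, the second client typing "bar" leaves the first client's "foo" untouched, even though both point at the same session object.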
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9fe25809d8
    elif ext in (".parquet", ".parq"):
        return pl.read_parquet(path)
    elif ext == ".json":
        return pl.read_ndjson(path)
Load regular JSON files with the polars backend
When POST /load is called with backend='polars' for a .json file, this uses the NDJSON reader, so ordinary JSON documents that the existing pandas /load path accepts via pd.read_json (for example a records array like [{"a":1}]) fail even though they have the same .json extension. This makes the new backend unexpectedly reject valid JSON inputs; use the regular JSON reader here or reserve read_ndjson for an NDJSON-specific extension/option.
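One way to address this, sketched under assumptions, is to sniff the file's flavor before picking a reader. The function `detect_json_flavor` below is hypothetical and uses only the standard library; a loader could then dispatch to `pl.read_json` for "json" and `pl.read_ndjson` for "ndjson". It parses the whole text, so it is an illustration of the distinction rather than something tuned for large files.

```python
import json


def detect_json_flavor(text: str) -> str:
    """Hypothetical helper: return "json" or "ndjson" for the given file text."""
    stripped = text.strip()
    if not stripped:
        raise ValueError("empty input")
    try:
        # A regular JSON document (e.g. a records array) parses as a whole.
        json.loads(stripped)
        return "json"
    except json.JSONDecodeError:
        pass
    # Otherwise require that every non-blank line is its own JSON value.
    for line in stripped.splitlines():
        if line.strip():
            json.loads(line)  # raises JSONDecodeError if the line is invalid
    return "ndjson"


records_flavor = detect_json_flavor('[{"a": 1}, {"a": 2}]')
ndjson_flavor = detect_json_flavor('{"a": 1}\n{"a": 2}\n')
```

A single-line file such as `{"a": 1}` is valid under both formats; this sketch classifies it as "json".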
What this does
Bundles a set of standalone-server /load enhancements that were sitting uncommitted on feat/load-column-config-overrides. They extend the server so a POST /load (and /load_expr) call can reach the same configuration surface a notebook BuckarooInfiniteWidget already has, add a polars loading path, and fix a multi-client live-search bug.

1. backend='polars' for POST /load

New module buckaroo/server/data_loading_polars.py:
- load_file_polars / get_metadata_polars: eager parquet/csv/tsv/json read into a pl.DataFrame.
- PolarsServerDataflow: polars analogue of ServerDataflow (polars analysis / autocleaning / stats / sampling), capped at pre_limit=1_000_000 rows for the stats pipeline.
- handle_infinite_request_buckaroo_polars: polars row-fetch handler; applies the live search_string as a literal substring match on String columns, matching the pandas search_df_str semantics.

handlers.py validates backend (pandas by default, or polars; polars is only valid with mode='buckaroo') and routes the load/build path. websocket_handler.py routes the infinite row-fetch to the polars handler when session.backend == 'polars'. polars stays an optional dependency: it is imported lazily, only when a request actually asks for backend='polars'.

2. Per-client search_string (fixes #851)

search_string was stored on the shared SessionState, so two browser clients on the same session clobbered each other's live-typed filter (and each other's highlight). It now lives on the DataStreamHandler instance:
- self.search_string instead of session.search_string.
- _send_highlight_overlay sends an initial_state with highlight_phrase injected into string-column displayer_args to only the typing client, never as a broadcast, and is skipped when a dataflow rebuild is already going to broadcast the highlight anyway.
- The term is stripped before snapshotting buckaroo_state onto the session, and reset to "" on every new /load or /load_expr push.

3. /load_expr config-override parity

LoadExprHandler now forwards column_config_overrides, extra_grid_config, and init_sd to XorqServerDataflow, mirroring the /load kwargs added in 6bf4c12b.

4. init_sd vs. highlight ordering (styling.py)

init_sd displayer_args are now merged before lowcode-op highlight metadata is injected, so a column whose pandas type is obj but which init_sd promotes to displayer: 'string' still picks up highlight_phrase. The highlight injection skips any key the caller already set, so an explicit init_sd highlight wins.

Tangential (also uncommitted, rode along)

- pyproject.toml / uv.lock: add matplotlib (used by buckaroo/customizations/histogram.py).
- packages/buckaroo-js-core/src/stories/DFViewerDirect.stories.tsx: a Storybook story demonstrating the direct <DFViewer> consumer pattern.

Base note

This PR is based on main but includes commit 6bf4c12b (/load accepts column_config_overrides, extra_grid_config, init_sd) in addition to the new work, because the polars and /load_expr changes build directly on that plumbing and that commit is not yet in main. If 6bf4c12b lands separately first, rebasing will collapse this PR's diff to just the new commit.

Testing

- pytest tests/unit/server/: 103 passed, 1 skipped (the one failure, test_mcp_uvx_install, is a network uvx install that timed out; unrelated).
- pytest tests/unit/.../customizable_dataflow_test.py tests/unit/server/test_server.py -k "highlight or init_sd or string": 6 passed.
- ruff check + paddy-format --check (full tree): clean.

🤖 Generated with Claude Code
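The init_sd vs. highlight ordering described in this PR can be sketched as a two-step merge. The function `build_displayer_args` and the shape of its inputs (a dict mapping column name to a displayer_args dict) are assumptions for illustration; the actual styling.py code is not shown here.

```python
from copy import deepcopy


def build_displayer_args(base: dict, init_sd_args: dict, highlight_phrase: str) -> dict:
    """Hypothetical sketch: merge init_sd overrides first, then inject the highlight."""
    merged = deepcopy(base)  # leave the caller's dict untouched
    # Step 1: apply init_sd displayer_args, which may promote a column to 'string'.
    for column, override in init_sd_args.items():
        merged.setdefault(column, {}).update(override)
    # Step 2: inject the highlight into string displayers only, and only
    # where the caller has not already set highlight_phrase.
    if highlight_phrase:
        for args in merged.values():
            if args.get("displayer") == "string":
                args.setdefault("highlight_phrase", highlight_phrase)
    return merged


base = {"name": {"displayer": "obj"}, "city": {"displayer": "string"}}
init_sd = {
    "name": {"displayer": "string"},           # promoted by init_sd
    "city": {"highlight_phrase": "explicit"},  # explicit highlight from init_sd
}
result = build_displayer_args(base, init_sd, "typed")
```

Running the merge before the injection is what lets the promoted `name` column receive the typed phrase, while `setdefault` keeps the explicit `city` highlight in place.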