feat(cubestore): error when a query plan node materializes too many rows#10997
feat(cubestore): error when a query plan node materializes too many rows#10997waralexrom wants to merge 3 commits into
Conversation
Add MaterializedRowsLimitExec that counts rows at the points of the plan where they accumulate in memory: sort and window inputs, join build sides, aggregation outputs, worker results and the final query result. When a single node materializes more than the limit, the query fails with a suggestion to create a pre-aggregation for that stage. Configured via CUBESTORE_MATERIALIZED_ROWS_LIMIT (default 500000, 0 disables the check).
|
Claude finished @waralexrom's task in 2m 45s —— View job Claude PR ReviewVerdict: LGTM overall — solid design, clean implementation. 4 low-severity findings, all posted inline. Review detailsCompleted checklist
Summary of findings (all low severity)
Positives
Nits (not posted inline)
|
Include the observed row count, the limit and the env var name in the error message. Cover stage labels with e2e and plan-shape tests, pin the hash join build-side and counter lifetime invariants in comments.
|
Claude finished @waralexrom's task in 46s —— View job Claude PR Review (re-run after
|
| # | Prior concern | Resolution in ed1529c |
|---|---|---|
| 1 | HashJoinExec left-as-build invariant unpinned |
Pinning comment added at optimizations/materialized_rows_limit.rs:26 (// HashJoinExec always builds the hash table from its left input.) |
| 2 | Atomic counter not reset across execute() calls |
Doc comment added on the rows field documenting the invariant (Never reset: plans are built per query and executed once.). Behavior unchanged but the contract is now explicit — acceptable. |
| 3 | Error message omitted observed count, limit, env var | Now reads materialized at least {total} rows which exceeds the limit of {limit} rows … adjust the CUBESTORE_MATERIALIZED_ROWS_LIMIT environment variable. Unit test asserts each substring. |
| 4 | E2E test only covered "query result" stage | materialized_rows_limit now also asserts 'aggregation groups' and 'worker result' labels. New materialized_rows_limit_plan test uses the pretty printer to assert all four stage labels (sort input, query result, aggregation groups, worker result) appear in the planned router/worker plans — locks the labels against optimizer reshuffles, which is what I wanted. |
Other observations
- Pretty printer gains a
MaterializedRowsLimitExeccase soshow_materialized_rows_limit_nodes=trueproduces stableMaterializedRowsLimit, stage: …output — exactly what the new plan test needs. - Off-by-one:
total > limit⇒ exactlylimitrows pass,limit+1fails. Error message says "at least {total}" which matches. - Hash join / cross join stage labels are still untested at the E2E layer (only worker result + aggregation groups exercise the error path). Minor; the plan test covers the label surface for the common stages.
Codecov Report✅ All modified and coverable lines are covered by tests.
Additional details and impacted files@@ Coverage Diff @@
## master #10997 +/- ##
===========================================
- Coverage 83.32% 58.50% -24.82%
===========================================
Files 255 216 -39
Lines 77177 17151 -60026
Branches 0 3494 +3494
===========================================
- Hits 64304 10035 -54269
+ Misses 12873 6607 -6266
- Partials 0 509 +509
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…hreshold A single plan node should not materialize more than one partition's worth of rows, so CUBESTORE_MATERIALIZED_ROWS_LIMIT now defaults to the configured CUBESTORE_PARTITION_SPLIT_THRESHOLD (2M rows by default) instead of a flat 500k. Drop the env var mention from the error message.
|
Claude finished @waralexrom's task in 37s —— View job Claude PR Review (re-run after
|
Summary
CubeStore now returns an error when a single physical plan node materializes more rows than the configured limit, telling the user to consider creating a pre-aggregation for that stage. Guards against queries that accumulate huge intermediate results in memory instead of failing late or OOMing.
Changes
MaterializedRowsLimitExecwrapper node (modeled afterCheckMemoryExec): a pass-through stream with a shared atomic row counter across all partitions of the wrapped node; exceeding the limit fails the query with a user-facing error naming the stage.finalize_physical_planonly at points where rows actually accumulate in memory (streaming nodes are not counted):SortExecinput (skipped when TopKfetch <= limit— the buffer holdsmin(input, fetch)rows)HashJoinExecbuild side andCrossJoinExecleft sideAggregateExecoutput (hash groups; sorted/inline streaming aggregates excluded)WindowAggExecinputWorkerExecinput (worker result collected before sending to the router)CUBESTORE_MATERIALIZED_ROWS_LIMIT, default 500000,0disables the check. PlumbedConfigObj→QueryExecutorImpl→CubeQueryPlanner.show_materialized_rows_limit_nodesflag, so existing plan assertions are unaffected.Only the query execution path (
QueryExecutor) is affected — ingestion, compaction and cachestore paths are untouched.Testing
MaterializedRowsLimitExec(under/over limit, error message includes stage and pre-aggregation suggestion).sql::tests::materialized_rows_limitwith limit 4: a 3-row query succeeds, a 6-row query fails with the pre-aggregation hint.cubestore --libsuite (206),cubestore-sql-testsin-process (168) and cluster (167) suites all green with the default 500k limit enabled.