Enable multi-threaded execution for TableFunction #6

Draft
otegami wants to merge 3 commits into main from refactor/extract-executor-module
otegami commented Apr 3, 2026

Summary

Enable multi-threaded execution for DuckDB::TableFunction on DuckDB >= 1.5.0 by introducing per-worker proxy threads.

DuckDB invokes table function callbacks from its own worker threads, which are not Ruby threads. Since rb_thread_call_with_gvl crashes when called from non-Ruby threads, we previously forced single-threaded execution. This PR gives each DuckDB worker thread a dedicated Ruby proxy thread that acquires the GVL on its behalf, making table function callbacks safe under multi-threaded DuckDB execution.

Add a per-worker proxy: one dedicated Ruby thread per DuckDB worker
thread, using the same mutex/condvar hand-off protocol as the global
executor but private to a single worker. This lets callbacks from
different workers run concurrently instead of serializing through the
one global executor queue.
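
The hand-off described above can be modeled at the Ruby level. This is only a sketch under stated assumptions: the real proxy lives in the C extension, and the class name `WorkerProxy` and its methods `dispatch` and `shutdown` are hypothetical names chosen for illustration. Each proxy owns one thread, one mutex, and one condition variable, so two workers with separate proxies never wait on each other.

```ruby
# Illustrative model only: a dedicated thread that runs jobs for ONE caller,
# coordinated through a private mutex/condvar pair.
class WorkerProxy
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @job = nil      # pending callable handed over by the worker
    @done = false   # true once the proxy has finished the current job
    @result = nil
    @error = nil
    @stop = false
    @thread = Thread.new { run }
  end

  # Called by the worker: hand over a job and block until the proxy ran it.
  def dispatch(&job)
    @mutex.synchronize do
      @job = job
      @done = false
      @cond.broadcast
      @cond.wait(@mutex) until @done
      error, @error = @error, nil
      raise error if error
      @result
    end
  end

  # Ask the proxy thread to exit and wait for it.
  def shutdown
    @mutex.synchronize do
      @stop = true
      @cond.broadcast
    end
    @thread.join
  end

  private

  def run
    loop do
      job = @mutex.synchronize do
        @cond.wait(@mutex) until @job || @stop
        return if @stop
        pending, @job = @job, nil
        pending
      end
      result = error = nil
      begin
        result = job.call   # runs outside the lock, on the proxy thread
      rescue StandardError => e
        error = e
      end
      @mutex.synchronize do
        @result, @error, @done = result, error, true
        @cond.broadcast
      end
    end
  end
end

# Two workers, each with a private proxy, dispatch independently.
proxies = Array.new(2) { WorkerProxy.new }
workers = proxies.map.with_index do |proxy, i|
  Thread.new { proxy.dispatch { "worker #{i} ran on its own proxy" } }
end
puts workers.map(&:value)
proxies.each(&:shutdown)
```

Because the state is private to one worker, the model assumes a single caller per proxy; sharing one proxy between callers would need an extra queue, which is the job of the global executor.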

rbduckdb_function_executor_dispatch_via_proxy() routes the non-Ruby
thread path (Case 3) through a given proxy when non-NULL, falling back
to the global executor when NULL; the existing dispatch() now delegates
to it with NULL, so behavior is unchanged. Live proxies are held in a
GC-protected array. The new symbols are unused until table function
integration lands, so this commit is behavior-preserving (full suite
green).
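
The routing rule can be sketched in Ruby as follows. This is a simplified model, not the C implementation: `QueueExecutor`, `GLOBAL_EXECUTOR`, and the Ruby-level `dispatch_via_proxy` and `dispatch` are hypothetical stand-ins, `nil` plays the role of `NULL`, and only the non-Ruby thread path is modeled.

```ruby
# Hypothetical stand-in for an executor: one thread draining a job queue.
class QueueExecutor
  def initialize
    @jobs = Queue.new
    @thread = Thread.new do
      loop do
        item = @jobs.pop
        break if item.nil?        # queue closed and drained
        job, reply = item
        reply << job.call
      end
    end
  end

  # Run the job on the executor thread and return its result.
  def dispatch(&job)
    reply = Queue.new
    @jobs << [job, reply]
    reply.pop
  end

  def shutdown
    @jobs.close
    @thread.join
  end
end

GLOBAL_EXECUTOR = QueueExecutor.new

# Route through the given proxy when present, else fall back to the global one.
def dispatch_via_proxy(proxy, &job)
  (proxy || GLOBAL_EXECUTOR).dispatch(&job)
end

# The pre-existing entry point delegates with nil, so callers see no change.
def dispatch(&job)
  dispatch_via_proxy(nil, &job)
end
```

Keeping the old entry point as a thin wrapper is what lets the new path land as dead code first: nothing passes a non-nil proxy until the table function integration does.
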
otegami force-pushed the refactor/extract-executor-module branch 2 times, most recently from 83684e6 to f5e821e May 31, 2026 04:40

otegami added 2 commits May 31, 2026 12:45

rbduckdb_worker_proxy_destroy waits for the proxy thread to exit, but
the proxy thread must run Ruby code (removing itself from the
GC-protection array) before exiting, which needs the GVL. DuckDB may
invoke this destructor either from a worker thread (no GVL) or from a
Ruby thread that holds the GVL, depending on when it tears down the
local state. In the latter case waiting with the GVL held would
deadlock: the proxy thread can never acquire the GVL to finish.

Split the wait into proxy_join_func and run it via
rb_thread_call_without_gvl when the caller holds the GVL; wait directly
otherwise. The proxy is still dead code at this point (no integration
yet), so behavior is unchanged.
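
The deadlock and its fix can be reproduced with a toy model. Everything here is an assumption made for illustration: a plain `Mutex` named `GVL` stands in for the real GVL (in actual Ruby code, `Thread#join` already releases the GVL, so the problem only arises in C), `LIVE_PROXIES` stands in for the GC-protection array, and `start_proxy` and `destroy_proxy` are hypothetical names.

```ruby
GVL = Mutex.new      # stand-in for the GVL, for illustration only
LIVE_PROXIES = []    # stand-in for the GC-protection array

# Start a proxy thread that waits for a stop signal, then unregisters itself.
def start_proxy(stop)
  thread = Thread.new do
    stop.pop
    # Cleanup needs the lock, just as the real proxy needs the GVL.
    GVL.synchronize { LIVE_PROXIES.delete(Thread.current) }
  end
  LIVE_PROXIES << thread   # registered before the proxy can reach its cleanup
  thread
end

# Stop the proxy and wait for it, whether or not the caller holds the lock.
def destroy_proxy(thread, stop)
  stop << :stop
  if GVL.owned?
    # Waiting with the lock held would block the proxy's cleanup forever,
    # so release it for the duration of the join and re-acquire afterwards.
    GVL.unlock
    begin
      thread.join
    ensure
      GVL.lock
    end
  else
    thread.join   # caller does not hold the lock: wait directly
  end
end
```

Removing the `GVL.unlock` branch and calling `destroy_proxy` inside `GVL.synchronize` makes the model hang, which mirrors the failure the commit avoids.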

Wire the execute path to per-worker proxy threads on DuckDB >= 1.5.0.
A local_init callback registered via duckdb_table_function_set_local_init
runs once per worker thread, creates a proxy (allocating its Ruby thread
under the GVL through the global executor, since local_init runs on a
non-Ruby thread), and stores it as thread-local init data. The execute
callback retrieves that proxy and dispatches through it via
rbduckdb_function_executor_dispatch_via_proxy, so callbacks from
different workers run concurrently instead of serializing on the single
global executor. DuckDB frees each proxy through rbduckdb_worker_proxy_destroy.

bind and init stay on the global executor (not on the hot path). On
DuckDB < 1.5.0 the local_init hook is absent and the execute callback
keeps using the global executor unchanged.

Verified: with SET threads=4 plus cardinality/max_threads hints, a
GVL-releasing callback reaches max_concurrent=4 (vs 2 on the global
executor) for a ~2x speedup; results are identical. The added test
asserts correctness of the local_init -> proxy -> destroy lifecycle
under multi-threaded execution (throughput is checked manually to avoid
CI flakiness).
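
A `max_concurrent` figure like the one above can be gathered with a small counter. The PR does not show its measurement code, so this is a hypothetical probe: `ConcurrencyProbe` and `track` are invented names, plain Ruby threads stand in for DuckDB workers, and `sleep` stands in for a GVL-releasing callback body.

```ruby
# Hypothetical probe: records the peak number of callbacks running at once.
class ConcurrencyProbe
  attr_reader :max_concurrent

  def initialize
    @mutex = Mutex.new
    @current = 0
    @max_concurrent = 0
  end

  # Wrap a callback body; the counter is raised on entry and lowered on exit.
  def track
    @mutex.synchronize do
      @current += 1
      @max_concurrent = @current if @current > @max_concurrent
    end
    yield
  ensure
    @mutex.synchronize { @current -= 1 }
  end
end

probe = ConcurrencyProbe.new
workers = Array.new(4) do
  Thread.new do
    5.times { probe.track { sleep 0.02 } }   # sleep releases the GVL
  end
end
workers.each(&:join)
puts "max_concurrent=#{probe.max_concurrent}"
```

The peak depends on scheduling, which is one reason to assert only on results in CI and check throughput by hand.
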
otegami force-pushed the refactor/extract-executor-module branch from f5e821e to 7f65ec7 May 31, 2026 04:52