feat\!: Replace data_batch state machine with RAII lock accessor types#99
feat\!: Replace data_batch state machine with RAII lock accessor types#99bwyogatama wants to merge 2 commits intomainfrom
Conversation
|
/ok to test 5380f13 |
|
I incorporated a lot of @dhruv9vats and @aminaramoon feedback into this PR. @dhruv9vats there are 2 changes i made to your idea:
Reviews are appreciated, I am not surprised if I am missing something while working on this. |
|
One thing that I still don't know is the best API to pop For example, when popping data batch from the repo for downgrade it allows us to skip data batch that's currently in the But right now, since the state is gone, we will just pop Let me know if people have some ideas @aminaramoon @dhruv9vats |
|
Thanks for the refactor @bwyogatama !
These are, IMO, 2 different use cases. The current use case we are addressing is: we want to be able to have multiple readers + enforce a single writer in a streamlined fashion using standard sync utilities (shared_lock / unique_lock). If we remove |
We would need the probing ability for planned observability in the near future. We should think about how we want to do that. |
| * that expose the data_batch state when certain events occur, like state transitions. | ||
| * Key characteristics: | ||
| * - Compiler-enforced read/write separation via accessor types | ||
| * - PtrType agnostic — works with shared_ptr, unique_ptr, or stack allocation |
There was a problem hiding this comment.
I think we dont want this to be PtrType agnostic, we are intentionally keeping it so this is only shared_ptr.
| { | ||
| } | ||
| private: | ||
| friend class synchronized_data_batch; |
There was a problem hiding this comment.
It is okay to be explicit, but I believe we dont need the friend class declarations.
|
|
||
| /** | ||
| * @brief Destructor decrements processing count and potentially transitions state. | ||
| * @brief RAII read-only accessor. Borrows the parent, does not extend lifetime. |
There was a problem hiding this comment.
We want to extend the lifetime.
| data_batch_processing_handle& operator=(const data_batch_processing_handle&) = delete; | ||
| /** | ||
| * @brief Downgrade from mutable → read-only. | ||
| * Internally releases unique lock, then blocks until shared lock acquired. |
There was a problem hiding this comment.
then blocks until
I feel the user should be allowed to make this decision, and inspect if they can get a write lock quickly or not.
| }; | ||
| /** | ||
| * @brief Upgrade from read-only → mutable. | ||
| * Internally releases shared lock, then blocks until unique lock acquired. |
There was a problem hiding this comment.
Again, should be the users choice whether they want to wait or not.
| * as try_to_create_task(). Always succeeds when it returns. | ||
| */ | ||
| void wait_to_create_task(); | ||
| // -- Immutable field exposed on wrapper (lock-free, for repository lookups) -- |
There was a problem hiding this comment.
The use of const for the batch_id_ field in the data_batch should be considered.
| bool subscribe(); | ||
| void unsubscribe(); | ||
| size_t get_subscriber_count() const; |
There was a problem hiding this comment.
we can circumvent this by having an API on the data repository along the lines of std::optional<mutable_data_batch> try_pop() {
std::optional<mutable_data_batch> result = std::nullopt;
auto position = std::ranges::find_if(batches_, [&to_return](std::shared_ptr<synchronized_data_batch>& batch) {
result = batch.try_get_mutable();
return result != std::nullopt;
});
if (pos != batches_.end()) {
std::erase(pos);
}
return result;
}or something like this. And then have a blocking API that can be used if none of the batches can be readily downgraded. |
Redesign data_batch concurrency model with compile-time enforced data access safety. The new 3-class design uses data_batch (idle/unlocked), read_only_data_batch (shared lock), and mutable_data_batch (exclusive lock). All data access requires acquiring a lock through RAII accessor types, and move semantics make stale references a compile error. - Rewrite data_batch.hpp/cpp with new type system - Migrate data_repository and data_repository_manager to new types - Rewrite all test files for the new API - Add representation_converter support for new batch types - Fix pixi.toml channel configuration for cudf dependencies Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
31ffd86 to
85316e8
Compare
…filtered repository pop - Add batch_state enum (idle, read_only, mutable_locked) and atomic _state member to data_batch for observable lock state - data_batch inherits enable_shared_from_this for non-consuming transitions - Add non-static to_read_only(), to_mutable(), try_to_read_only(), try_to_mutable() that use shared_from_this (caller's shared_ptr preserved) - Add static readonly_to_mutable() and mutable_to_readonly() for direct locked-to-locked transitions via move - All existing static transitions now update _state - Replace pop_data_batch() with pop_idle_data_batch(), pop_read_only_data_batch(), pop_mutable_data_batch() that filter by batch_state - pop_data_batch_by_id and get_data_batch_by_id unchanged Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ok to test acd69dd |
Summary
Replace the 4-state finite state machine (
batch_state: idle, task_created, processing, in_transit) with a 3-class design wheredata_batchis the "idle" state and all data access requires acquiring a lock through RAII accessor types.Core invariant: it is impossible to read or mutate batch data without holding the appropriate lock, and move semantics make stale references a compile error.
Design
Observable state
data_batchexposes an atomicbatch_stateenum (idle,read_only,mutable_locked) viaget_state(). Updated during every state transition. Enables the repository to filter batches by lock state.State transitions
Static methods (move the smart pointer, nullifying the source — works with both
shared_ptrandunique_ptr):Non-static methods (use
shared_from_this— caller'sshared_ptris NOT consumed):Locked-to-locked transitions (static, consume via move):
Non-blocking variants (
try_to_read_only,try_to_mutable) returnstd::optional— available as both static and non-static.State-filtered repository
pop_data_batch()replaced with state-specific FIFO pops that filter byget_state():pop_data_batch_by_id()andget_data_batch_by_id()unchanged — caller checks state after retrieval.Breaking changes
data_batchnow inheritsstd::enable_shared_from_this<data_batch>data_batchhas observablebatch_stateenum andget_state()methodbatch_stateenum replaced (old: idle/task_created/processing/in_transit → new: idle/read_only/mutable_locked)pop_data_batch()removed — usepop_idle_data_batch(),pop_read_only_data_batch(), orpop_mutable_data_batch()idata_batch_probe,data_batch_processing_handle,lock_for_processing_result/statusremovedpop_data_batch_by_id()andget_data_batch_by_id()no longer taketarget_statedata_repository_manager::add_data_batch_implusesif constexprinstead of SFINAEFiles changed
data_batch.hpp,data_batch.cppenable_shared_from_this, locked↔locked transitionsdata_repository.hpppop_idle/read_only/mutable_data_batch()data_repository.cpp,data_repository_manager.cpprepresentation_converter.cpptest_data_batch.cpp,test_data_repository.cpp,test_data_repository_manager.cppTest plan
pixi run buildcompiles cleanly (all 59 targets)pixi run testpasses (100% tests passed)try_to_read_only/try_to_mutablereturnnulloptwhen lock is heldget_state()returns correct value after every transition typeshared_ptrreadonly_to_mutable/mutable_to_readonlyupgrade/downgrade correctly🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com