fix: Add retry logic for RedisStorageClient and SqlStorageClient#1838
fix: Add retry logic for RedisStorageClient and SqlStorageClient#1838Mantisus wants to merge 8 commits intoapify:masterfrom
RedisStorageClient and SqlStorageClient#1838Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces a unified StorageWriteError and adds retry behavior for write operations (KeyValueStore.set_value and Dataset.push_data) to reduce the risk of data loss on transient storage failures (notably SQL/Redis).
Changes:
- Add
StorageWriteErrorand update SQL/Redis/FS storage clients to raise it on write failures instead of swallowing errors. - Add retry parameters (
max_attempts,wait_time_between_retries) toKeyValueStore.set_valueandDataset.push_data. - Add unit tests asserting that client-level write failures surface as
StorageWriteError.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/crawlee/errors.py |
Introduces StorageWriteError for consistent write-failure signaling. |
src/crawlee/storages/_key_value_store.py |
Adds retry loop to KeyValueStore.set_value (currently logs instead of raising after retries). |
src/crawlee/storages/_dataset.py |
Adds retry loop to Dataset.push_data (currently logs instead of raising after retries). |
src/crawlee/storage_clients/_sql/_key_value_store_client.py |
Converts SQL write failures into StorageWriteError and avoids with_simple_commit swallow behavior. |
src/crawlee/storage_clients/_sql/_dataset_client.py |
Converts SQL write failures into StorageWriteError and avoids with_simple_commit swallow behavior. |
src/crawlee/storage_clients/_redis/_key_value_store_client.py |
Wraps Redis write failures as StorageWriteError. |
src/crawlee/storage_clients/_redis/_dataset_client.py |
Wraps Redis write failures as StorageWriteError. |
src/crawlee/storage_clients/_file_system/_key_value_store_client.py |
Wraps filesystem write failures as StorageWriteError. |
src/crawlee/storage_clients/_file_system/_dataset_client.py |
Wraps filesystem write failures as StorageWriteError. |
src/crawlee/storage_clients/_sql/_request_queue_client.py |
Adds an extra requests_db emptiness guard (currently redundant). |
tests/unit/storage_clients/_sql/test_sql_kvs_client.py |
Adds test ensuring SQL KVS write errors raise StorageWriteError. |
tests/unit/storage_clients/_sql/test_sql_dataset_client.py |
Adds test ensuring SQL Dataset write errors raise StorageWriteError. |
tests/unit/storage_clients/_redis/test_redis_kvs_client.py |
Adds test ensuring Redis KVS write errors raise StorageWriteError. |
tests/unit/storage_clients/_redis/test_redis_dataset_client.py |
Adds test ensuring Redis Dataset write errors raise StorageWriteError. |
tests/unit/storage_clients/_file_system/test_fs_kvs_client.py |
Adds test ensuring FS KVS write errors raise StorageWriteError. |
tests/unit/storage_clients/_file_system/test_fs_dataset_client.py |
Adds test ensuring FS Dataset write errors raise StorageWriteError. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
KeyValueStore.set_value and Dataset.push_dataKeyValueStore.set_value and Dataset.push_data
vdusek
left a comment
There was a problem hiding this comment.
This should be thought through carefully as it extends the public interface of the public components (storages). We want to keep the public interface consistent in JS & Py implementations as much as possible.
vdusek
left a comment
There was a problem hiding this comment.
We just had a brief discussion about this with @janbuchar, and we think this should be handled at the storage client level rather than leaking into the storages (frontend). One reason is that the Apify API client used by ApifyStorageClient already has its own retry mechanism. Adding another layer here would duplicate that behavior. Therefore, each storage client should manage retries on its own.
KeyValueStore.set_value and Dataset.push_dataRedisStorageClient and SqlStorageClient
4323326 to
d09828c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Description
retry_on_errordecorator with exponential backoffSqlStorageClientandRedisStorageClientto handle transient failuresIssues