Make Harbor library downloads async-safe #1887
finally:
    output = capture.get().strip()

thread = threading.Thread(target=run_download, name="harbor-download")
🟡 Medium harbor/taskset.py:149
`download_with_harbor()` starts a non-daemon `Thread` and blocks on `thread.join()`. If the caller is interrupted (e.g. `Ctrl-C`) during the join, the main thread unwinds and `TemporaryDirectory.__exit__()` deletes `output_dir`, but the background download thread keeps running because it is neither stopped nor a daemon. The process can then hang on shutdown waiting for the thread to finish, and the worker races against deletion of the directory it is writing to. Marking the thread as `daemon=True` lets the process exit without waiting for it.
Suggested change:
- thread = threading.Thread(target=run_download, name="harbor-download")
+ thread = threading.Thread(target=run_download, name="harbor-download", daemon=True)
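To make the suggestion concrete, here is a minimal sketch of a daemon worker combined with a cooperative stop signal. `start_download_thread`, `fake_download`, and `stop_requested` are hypothetical names used only for illustration; they are not part of the PR. The stop event is an extra assumption: `daemon=True` by itself only removes the shutdown hang, it does not stop the worker from writing into a directory that is being deleted.

```python
import threading


def start_download_thread(target):
    """Start `target` on a daemon thread named like the one in the PR."""
    # daemon=True: the interpreter does not wait for this thread at shutdown.
    thread = threading.Thread(target=target, name="harbor-download", daemon=True)
    thread.start()
    return thread


# Hypothetical stand-in for the real download: it waits until asked to stop.
stop_requested = threading.Event()


def fake_download():
    stop_requested.wait(timeout=5.0)


worker = start_download_thread(fake_download)
try:
    worker.join(timeout=0.1)  # the fake download is still in progress here
    still_running = worker.is_alive()
finally:
    # Ask the worker to stop before any temporary directory would be removed,
    # then wait for it so nothing is still writing during cleanup.
    stop_requested.set()
    worker.join()

print(worker.daemon, still_running, worker.is_alive())
```

Note that Python stops daemon threads abruptly at interpreter shutdown, so a partially written download may be left behind; that is acceptable only if the output directory is disposable.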
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 636a96f.
thread = threading.Thread(target=run_download, name="harbor-download")
thread.start()
thread.join()
Worker thread failures not surfaced
Medium Severity
`download_with_harbor` starts Harbor in a worker thread and only records failures caught as `Exception` from `download_command`. Uncaught errors in that thread (including setup before the inner `try`, or `BaseException` subclasses) are not stored in `error`, so after `thread.join()` the caller can return normally even though the download never completed successfully.
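One way to close this gap, sketched under assumptions: wrap the whole thread body in a single `try` that catches `BaseException`, and treat a missing result as a failure. `run_in_thread` and `outcome` are hypothetical names, not the PR's code.

```python
import threading


def run_in_thread(func):
    """Run `func` on a worker thread and re-raise whatever it raised in the caller."""
    outcome = {}

    def runner():
        try:
            # Everything the worker does lives inside this try, so setup
            # errors are recorded as well.
            outcome["value"] = func()
        except BaseException as exc:  # also catches SystemExit and KeyboardInterrupt
            outcome["error"] = exc

    thread = threading.Thread(target=runner, name="harbor-download", daemon=True)
    thread.start()
    thread.join()

    if "error" in outcome:
        raise RuntimeError(f"download failed: {outcome['error']!r}") from outcome["error"]
    if "value" not in outcome:
        # Defensive: the thread ended without recording a result or an error.
        raise RuntimeError("download thread finished without a result")
    return outcome["value"]


def exits_like_a_cli():
    # CLI entry points often signal failure with SystemExit, which is not an Exception.
    raise SystemExit(2)


print(run_in_thread(lambda: "done"))
try:
    run_in_thread(exits_like_a_cli)
except RuntimeError as exc:
    print(type(exc.__cause__).__name__)
```

Checking for a missing `value` is what keeps the caller from returning normally when the worker ended without storing anything.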
Approvability
Verdict: Needs human review. 1 blocking correctness issue found. Unresolved review comments identify potential threading bugs: a non-daemon thread causing shutdown issues and uncaught BaseException failures. These substantive concerns warrant human review before merging.


Summary
- … `asyncio.run(...)` does not collide with Verifiers' async eval/validate loop.
- … `RuntimeError` on download failures.

Validation
- `uv run --frozen --python 3.12 --extra harbor` async smoke: uncached `harbor/hello-world` download inside `asyncio.run(...)` loaded `hello-world/hello-world`.
- `uv run --frozen --python 3.12 --extra harbor` selector-error smoke: `repo + registry_url` raised a Verifiers `RuntimeError` containing Harbor's own `Error: --repo and --registry-url are mutually exclusive`.
- `uv run --frozen --python 3.12 --extra harbor` local registry smoke: `registry_path=Path("~/harbor-registry-test-1884.json")` expanded and loaded `3d_print_shop_t0` from `general-agent@2026-06-25`.
- `uv run --frozen --python 3.12 python`: `import verifiers.v1.tasksets` succeeds without Harbor installed.
- `uv run --frozen --python 3.11 python`: `import verifiers.v1.tasksets` succeeds without Harbor installed.
- `uv lock --check --python 3.11`
- `uv lock --check --python 3.12`
- `uv run --python 3.13 ty check verifiers`
- `uv run --frozen --python 3.12 pre-commit run --files verifiers/v1/tasksets/harbor/taskset.py`
- `git push` pre-push hook: ruff check, ruff format, sync check, `ty (ci parity)`

Note
Make Harbor library downloads async-safe by running them in a dedicated thread
- … `taskset.py` are now executed in a separate thread via a new `download_with_harbor` helper, making them safe to call from async contexts.
- … `harbor.cli.download` at import time; Harbor presence is validated at call time, raising `ImportError` with an install hint if missing.
- … `RuntimeError` with the exit code and captured console output for easier debugging.
- … `ImportError` at import time must now handle it at call time.

Macroscope summarized 636a96f.
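The reason a separate thread helps can be shown with the standard library alone. This is a sketch, not the PR's code: `fetch`, `sync_download`, and `sync_download_in_thread` are hypothetical stand-ins for a library function that calls `asyncio.run(...)` internally. `asyncio.run` refuses to start when the calling thread already has a running event loop, whereas a fresh thread has none.

```python
import asyncio
import threading


async def fetch():
    # Hypothetical stand-in for the coroutine a library runs internally.
    await asyncio.sleep(0)
    return "ok"


def sync_download():
    """Synchronous API that calls asyncio.run() internally."""
    coro = fetch()
    try:
        return asyncio.run(coro)
    finally:
        coro.close()  # no-op after success; avoids a "never awaited" warning on failure


def sync_download_in_thread():
    """Run sync_download() on a thread that has no running event loop."""
    result = {}

    def runner():
        result["value"] = sync_download()

    thread = threading.Thread(target=runner, name="harbor-download", daemon=True)
    thread.start()
    thread.join()
    return result["value"]


async def main():
    try:
        direct = sync_download()
    except RuntimeError:
        # asyncio.run() cannot be called from a running event loop.
        direct = "RuntimeError"
    threaded = sync_download_in_thread()
    return direct, threaded


outcome = asyncio.run(main())
print(outcome)  # ('RuntimeError', 'ok')
```

The `thread.join()` call still blocks the calling event loop for the duration of the download; the thread only avoids the nested-loop error.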
Note
Medium Risk
Changes how and when Harbor is loaded and how download errors surface; behavior is localized to Harbor taskset caching but affects any async path that triggers uncached downloads.
Overview
Harbor dataset fetching no longer imports Harbor at module load time, so core Verifiers can import tasksets without the optional `harbor` extra until a download actually runs.

Downloads are routed through a new `download_with_harbor` helper that lazy-imports Harbor, runs `download_command` on a dedicated `harbor-download` thread (so Harbor's internal `asyncio.run` does not clash with Verifiers' async eval loop), and wraps failures in a `RuntimeError` that includes exit codes and captured Rich console output when available.
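The lazy import and the error wrapping described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `load_optional` and `run_and_wrap` are hypothetical helpers, and `contextlib.redirect_stdout` stands in for the Rich console capture the PR uses.

```python
import importlib
import io
from contextlib import redirect_stdout


def load_optional(module_name, extra):
    """Import a module at call time and raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is not installed; install the optional '{extra}' extra"
        ) from exc


def run_and_wrap(command):
    """Run a CLI-style callable; convert failures into RuntimeError with captured output."""
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            command()
    except SystemExit as exc:
        if exc.code in (None, 0):
            return buffer.getvalue().strip()  # a clean exit is not an error
        output = buffer.getvalue().strip()
        raise RuntimeError(f"download failed (exit code {exc.code}): {output}") from exc
    except Exception as exc:
        output = buffer.getvalue().strip()
        raise RuntimeError(f"download failed: {exc!r}: {output}") from exc
    return buffer.getvalue().strip()


def failing_command():
    # Hypothetical CLI entry point: prints a message, then exits non-zero.
    print("Error: --repo and --registry-url are mutually exclusive")
    raise SystemExit(2)


try:
    run_and_wrap(failing_command)
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

Because the import happens inside `load_optional`, a module that uses it can be imported without the optional dependency; the `ImportError` only appears when a download is actually requested.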