Parallelize volume header key derivation across layouts on mount #1793
damianrickard wants to merge 1 commit into
Volume mounting auto-detects the layout by probing each candidate layout's header, each deriving its key (PBKDF2 or Argon2) independently. This ran one layout at a time, so with Argon2 (memory-hard, single-lane, seconds-to-minutes per pass) the probe is slow -- worst for hidden volumes, where the outer and hidden headers are derived serially.

- Add VolumeHeader::DecryptHeaderParallel: dispatch every (candidate layout x KDF) derivation to the encryption thread pool at once, so different layouts' expensive KDFs run concurrently. Candidates are then resolved in their original (priority) order -- candidate N is considered only once every higher-priority candidate is known not to decrypt and not to throw -- so the serial detection's layout priority and exception ordering are preserved. Within a candidate the first KDF whose key decrypts the header wins, so a fast match does not wait on that candidate's slow KDFs.
- Volume::Open gathers the candidate headers and uses it when no specific KDF is requested; the serial per-layout path is otherwise unchanged.
- If the pool is not already running (e.g. in the elevated core service), Volume::Open starts it only for the derivation and stops it before returning -- so it is NOT running when the caller fork()s the FUSE daemon (fork() in a multithreaded process is unsafe; the FUSE daemon starts its own pool after that fork). The GUI's persistent pool is left untouched.
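The pool lifetime rule described above (start only if not already running, stop before returning) could be expressed as a scope guard, so that it also holds on early returns and exceptions. The following is a minimal sketch under stated assumptions, not the code in this PR: ToyPool, ScopedPoolStart and OpenWithTemporaryPool are hypothetical names, and ToyPool only tracks a flag instead of managing real threads. It avoids lambdas and other C++11 features.

```cpp
#include <cassert>

// Hypothetical stand-in for the real thread pool; only the lifecycle matters here.
class ToyPool {
public:
    ToyPool() : running_(false) {}
    void Start() { running_ = true; }    // a real pool would spawn worker threads
    void Stop() { running_ = false; }    // a real pool would join worker threads
    bool IsRunning() const { return running_; }
private:
    bool running_;
};

// Starts the pool only if it was not already running, and stops it again on
// scope exit only if this guard was the one that started it.
class ScopedPoolStart {
public:
    explicit ScopedPoolStart(ToyPool &pool)
        : pool_(pool), startedHere_(!pool.IsRunning()) {
        if (startedHere_) pool_.Start();
    }
    ~ScopedPoolStart() {
        if (startedHere_) pool_.Stop();
    }
private:
    ScopedPoolStart(const ScopedPoolStart &);             // non-copyable (C++03 style)
    ScopedPoolStart &operator=(const ScopedPoolStart &);  // non-assignable
    ToyPool &pool_;
    bool startedHere_;
};

// Hypothetical open routine: the pool is available for the derivation and is
// back in its previous state by the time the caller could fork().
bool OpenWithTemporaryPool(ToyPool &pool) {
    ScopedPoolStart guard(pool);
    assert(pool.IsRunning());  // derivations would be dispatched here
    return true;
}
```

With this shape, a pool that was idle before the call is idle again afterwards, while a pool that some other owner started earlier is left running.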
Thanks for the PR. The performance goal is interesting, but I can’t merge this as-is.

First, the new lambda in Volume.cpp breaks the Linux compatibility baseline, especially CentOS 6 / GCC 4.4, where lambdas are not supported. More importantly, starting/stopping the global EncryptionThreadPool from

Also, parallelizing across layouts can run multiple Argon2 derivations at the same time, significantly increasing peak memory use and potentially causing correct mounts to fail under memory pressure.

I think this needs a redesign: avoid lambdas, avoid ad-hoc global pool lifecycle changes in
Summary
On mount, Volume::Open auto-detects the volume layout by probing each candidate layout's header, deriving each header's key independently. Two things made this slow:

- [...] EncryptionThreadPool, so the derivations ran single-threaded on a single core.

With Argon2id (memory-hard, single-lane, seconds-to-minutes per derivation) this is slow, and worst for hidden volumes, where the outer and hidden headers are both probed.
This change derives the candidate layouts' header keys concurrently via the thread pool, so the expensive KDFs overlap and use all available cores.
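As a rough illustration of the idea, the sketch below dispatches every (candidate layout × KDF) job up front and then resolves candidates strictly in priority order. It is a toy model under stated assumptions, not the PR's implementation: Candidate and ResolveInPriorityOrder are hypothetical names, each job is a plain callable standing in for "derive the key and try to decrypt the header", and C++11 std::async replaces EncryptionThreadPool for brevity, so it does not target the GCC 4.4 baseline raised in the review. Futures returned by std::async block in their destructor, so this sketch still waits for outstanding jobs before it returns.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <thread>
#include <vector>

// Hypothetical model of one candidate layout: one job per KDF.
// A job returns true if its derived key decrypts the header.
struct Candidate {
    std::string layout;
    std::vector<std::function<bool()> > kdfJobs;
};

// Returns the index of the winning candidate, or -1 if none decrypts.
// An exception thrown by a job propagates when its candidate is resolved.
int ResolveInPriorityOrder(const std::vector<Candidate> &candidates) {
    // Dispatch every (candidate x KDF) job at once so the derivations overlap.
    std::vector<std::vector<std::future<bool> > > pending(candidates.size());
    for (std::size_t c = 0; c < candidates.size(); ++c)
        for (const auto &job : candidates[c].kdfJobs)
            pending[c].push_back(std::async(std::launch::async, job));

    // Candidate c is examined only after every earlier candidate has finished
    // without decrypting and without throwing.
    for (std::size_t c = 0; c < pending.size(); ++c) {
        std::size_t remaining = pending[c].size();
        while (remaining > 0) {
            for (auto &f : pending[c]) {
                if (!f.valid()) continue;  // result already consumed
                if (f.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
                    continue;              // this KDF is still running
                --remaining;
                if (f.get()) return static_cast<int>(c);  // first decrypting KDF wins
            }
            if (remaining > 0)
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    return -1;
}
```

Polling within a candidate means a quick successful KDF is picked up while slower KDFs of the same candidate are still running, whereas a lower-priority candidate can never win ahead of a higher-priority one that is still pending.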
Change
- VolumeHeader::DecryptHeaderParallel(candidates, password, pim): dispatches every (candidate layout × KDF) derivation to EncryptionThreadPool at once. Candidates are then resolved strictly in their original (priority) order: candidate N is considered only once every higher-priority candidate is known not to decrypt and not to throw, so the serial detection's layout priority and exception ordering (e.g. HigherVersionRequired) are preserved. Within a candidate, the first KDF whose derived key decrypts the header wins, so a fast match does not wait on that candidate's slower KDFs.
- Volume::Open gathers the candidate headers and uses the parallel path when no specific KDF is requested; the existing serial per-layout path is the unchanged fallback (specific KDF requested, or thread pool unavailable).
- Volume::Open starts the pool for the derivation only if it is not already running, and stops it again before returning (see below). This is what lets the core-service mount path use the pool at all, without keeping pool threads alive across the subsequent FUSE fork().

Thread-pool lifetime / fork safety
fork() in a multithreaded process is unsafe, and the mount path fork()s to launch the FUSE service. So Volume::Open only starts the pool if it is not already running, and stops it before returning:

- [...] main()) is left untouched.

Scope
Platform-independent; touches only src/Volume/ (Volume.cpp, VolumeHeader.cpp, VolumeHeader.h). No on-disk format change. The serial detection path and its results are unchanged.

Testing
Built and tested on Apple Silicon (macOS, FUSE-T). The real-world impact is large: a hidden volume on an external device that previously took over 15 minutes to mount (in one attempt it had still not completed when I cancelled it) now mounts in under a minute — contents verified, clean dismount. Normal/PBKDF2 volumes mount without regression (a fast KDF match returns without waiting on Argon2). The serial fallback path is exercised via specific-KDF mounts.