wip: reduce size of thread buckets at larger maxtermsize #829
Draft
jodavies wants to merge 2 commits into
Splitting the buffer will allow valgrind to detect out-of-bounds accesses of deferbuffer or threadbuffer. Start with a small compressbuffer, and double it as needed. This saves a lot of memory with many threads and large maxtermsize.
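The "start small and double as needed" strategy for the compressbuffer can be illustrated with a minimal sketch. This is not the code from these commits: the `GrowBuf` type and the `growbuf_*` functions are hypothetical names, the buffer holds plain `int` words, and overflow of the size arithmetic is not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; names are illustrative, not FORM's. */
typedef struct {
    int    *data;      /* storage, counted in words */
    size_t  capacity;  /* words allocated */
    size_t  used;      /* words currently in use */
} GrowBuf;

/* Allocate a small initial buffer. Returns 0 on success, -1 on failure. */
static int growbuf_init(GrowBuf *b, size_t initial_words)
{
    assert(initial_words > 0);
    b->data = malloc(initial_words * sizeof *b->data);
    if (b->data == NULL) return -1;
    b->capacity = initial_words;
    b->used = 0;
    return 0;
}

/* Make room for `extra` more words, doubling the capacity until it fits. */
static int growbuf_reserve(GrowBuf *b, size_t extra)
{
    size_t needed = b->used + extra;
    size_t newcap = b->capacity;
    int *p;

    if (needed <= newcap) return 0;          /* already large enough */
    while (newcap < needed) newcap *= 2;     /* grow geometrically */
    p = realloc(b->data, newcap * sizeof *b->data);
    if (p == NULL) return -1;                /* old buffer remains valid */
    b->data = p;
    b->capacity = newcap;
    return 0;
}

/* Append n words, growing the buffer only when necessary. */
static int growbuf_append(GrowBuf *b, const int *src, size_t n)
{
    if (growbuf_reserve(b, n) != 0) return -1;
    memcpy(b->data + b->used, src, n * sizeof *src);
    b->used += n;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
static void growbuf_free(GrowBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->capacity = 0;
    b->used = 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final size, so a worker that never needs a large compressbuffer never pays for one.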
Using "large" values of MaxTermSize (say, 5M, 10M, ...) results in very large allocations for the thread buffers, since they have historically been sized as ThreadBucketCount(500)/4*MaxTermSize * 2 * workers, which easily reaches 100GB or more. Reduce the size of these allocations such that nothing changes with the default setup, and beyond that the buffers grow as log(MaxTermSize), up to the point where we enforce a BUCKETMINTERMS*MaxTermSize limit. From there they grow linearly, but BUCKETMINTERMS is much smaller than the default ThreadBucketCount.
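To get a feel for the historical formula quoted above, the arithmetic can be written out as a small sketch. The function names are hypothetical. It also rests on two assumptions that the text above does not state: that MaxTermSize is counted in words, and that a word is 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Words allocated for the thread buffers under the historical formula:
 *   ThreadBucketCount/4 * MaxTermSize * 2 * workers
 * Assumption: max_term_size is given in words.
 */
static uint64_t old_thread_buffer_words(uint64_t bucket_count,
                                        uint64_t max_term_size,
                                        uint64_t workers)
{
    return bucket_count / 4 * max_term_size * 2 * workers;
}

/* Convert a word count to bytes; word_bytes is an assumption of the caller. */
static uint64_t words_to_bytes(uint64_t words, uint64_t word_bytes)
{
    return words * word_bytes;
}
```

With ThreadBucketCount = 500, MaxTermSize = 5,000,000 and 24 workers this gives 3×10^10 words, or 120 GB at an assumed 4 bytes per word, which is in line with the "100GB or more" figure.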
Related: #831 separately takes the mbox1l test allocation to 353GB. The forcer test is unaffected (its configured small+large is not the limiting factor).
These commits address the second point of the discussion in #795. With the default buffer sizes nothing changes, but beyond that the size of the thread buckets increases only logarithmically with maxtermsize, until eventually it increases linearly again to ensure that at least `BUCKETMINTERMS=2` terms fit. This linear increase is much slower than the current behaviour, which is `125*MaxTermSize`.
The compressbuffer can also start small, and be doubled in size only when necessary.
Only the `forcer` and `mbox1l` tests in form-bench set a larger-than-default maxtermsize. With 24 workers, the forcer allocation (80K maxtermsize) goes from 22.4GB to 20.4GB, and the mbox1l allocation (3M maxtermsize) drops from 389GB to 260GB. There is no measurable change in performance.
Any thoughts on this?