Skip to content

GH-3558: Properly close buffers#3559

Open
Fokko wants to merge 2 commits into
apache:masterfrom
Fokko:fd-correctly-close-buffers
Open

GH-3558: Properly close buffers#3559
Fokko wants to merge 2 commits into
apache:masterfrom
Fokko:fd-correctly-close-buffers

Conversation

@Fokko
Copy link
Copy Markdown
Contributor

@Fokko Fokko commented May 13, 2026

Rationale for this change

With updating to a later version of Hadoop, we got the error that there are lingering buffers that are not closed properly. This PR addresses this issue by properly releasing the acquired buffers.

Initially caught by @pan3793 in #3360 (comment)

Closes #3558
Closes #3356

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if @gszadovszky could take a look at this as this is related to TrackingByteBufferAllocator.

Comment thread pom.xml
<shade.prefix>shaded.parquet</shade.prefix>
<!-- Guarantees no newer classes/methods/constants are used by parquet. -->
<hadoop.version>3.3.0</hadoop.version>
<hadoop.version>3.4.0</hadoop.version>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to reflect this in the PR title?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the underbound of the supported Hadoop version, so I think it is best to set this to 3.3.0 before merging this. Unless @steveloughran has an opinion on a reasonable lower-bound :)

@Override
public void close() throws LeakedByteBufferException {
if (!allocated.isEmpty()) {
LeakedByteBufferException ex = new LeakedByteBufferException(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a behavior change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is, but it is not covered in any tests. WDYT @gszadovszky ?

Copy link
Copy Markdown
Contributor Author

@Fokko Fokko May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was working properly since Hadoop did raise exception around not properly closing buffers, while this exception wasn't being thrown. Let me dig a bit further here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The whole point of this allocator implementation is to fail if something was not released. It may make sense to actually release the buffers so the tests won't leak, be we need to throw that exception, otherwise this class is pointless.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I've created this class, I've caught a couple of leaks in parquet-java code with it. Not having that exception thrown during our unit tests means the leaks are fixed not that the monitoring class is not working :)

@Fokko Fokko force-pushed the fd-correctly-close-buffers branch 3 times, most recently from eb75160 to ba219c3 Compare May 13, 2026 12:43
@Fokko Fokko force-pushed the fd-correctly-close-buffers branch from ba219c3 to d1a7cec Compare May 13, 2026 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Properly close buffers TestParquetReader fails with hadoop 3.4.2

3 participants