Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORC-2054: [C++] fix return wrong result if lack of hasnull #2055

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

shuai-xu
Copy link

What changes were proposed in this pull request?

This pr fix the bug that if the column statistics in a orc file is not fully written, and lack of hasnull field, user may get a wrong result using c++ to read it.
For example, a file struct<string col1, string col2>, has 10 lines, col1 all has value, col2 all is null. the column 1's stat written by trino may be
numberOfValues: 10
stringStatistics {
minimum: "10"
maximum: "100"
sum: 565
}. col2's stat is numberOfValues: 0. They all have no hasnull field. When we want to get where col2 is null, we will get nothing.

Why are the changes needed?

User may get a wrong result with this bug.

How was this patch tested?

Add unit tests.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CPP label Oct 25, 2024
@wgtmac
Copy link
Member

wgtmac commented Oct 27, 2024

Thanks for the fix! It makes sense to me. Please make the CI happy.

BTW, has Trino fixed this issue on its writer side?

@shuai-xu
Copy link
Author

Thanks for the fix! It makes sense to me. Please make the CI happy.

BTW, has Trino fixed this issue on its writer side?

The ci is fixed, it seems need approval to rerun. We are trying to fix it on Trino.

@wgtmac
Copy link
Member

wgtmac commented Oct 28, 2024

Please fix the format check. Thanks!

@shuai-xu
Copy link
Author

Please fix the format check. Thanks!

Done, CI passed.

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@dongjoon-hyun
Copy link
Member

What is the target version for this fix, @shuai-xu and @wgtmac ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants