Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: use fast uproot.num_entries method instead of TTree.num_entries #1197

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

pfackeldey
Copy link
Contributor

@pfackeldey pfackeldey commented Oct 23, 2024

uproot.num_entries is a fast path to get the number of events. It should always be faster than TTree.num_entries. This may become noticeable for preprocessing very large filesets.

The nanoevents factory should use uproot.num_entries aswell (at some point), in this PR I just removed the overhead of calculating it twice.

@ikrommyd
Copy link
Contributor

@pfackeldey I was wondering because I find preprocessing slow (many minutes for large datasets), did you try it and have seen speed improvement?

@pfackeldey
Copy link
Contributor Author

pfackeldey commented Oct 23, 2024

@ikrommyd I have not measured it, because I don't have an analysis setup to do so. If you can, I'd be happy to see the difference!
I'm not sure how much one can gain here (I didn't profile the preprocess method), but calculating the number of entries should be always faster with uproot.num_entries compared than TTree.num_entries.
(I noticed slow preprocessing aswell when running the AGC - actually I might be able to measure the performance gain with the 200GBps challenge?)

@pfackeldey pfackeldey changed the title use fast uproot.num_entries method instead of TTree.num_entries perf: use fast uproot.num_entries method instead of TTree.num_entries Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants