Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize BlockLocators Iterator with Parallel Processing #3430

Draft
wants to merge 3 commits into
base: staging
Choose a base branch
from

Conversation

meetrick
Copy link

@meetrick meetrick commented Nov 5, 2024

Motivation

This PR aims to optimize the BlockLocators into_iter method by introducing parallel processing. The existing sequential processing approach can be time-consuming when handling large datasets of block data. By leveraging rayon for parallel processing, this change seeks to improve performance in merging checkpoints and recents maps, providing a more efficient solution for large-scale data handling.

Test Plan

Generate a dataset of 1000 items to compare processing speeds.

Related PRs

(Link any related PRs here)

This PR enhances the BlockLocators iterator by introducing parallel
processing with rayon. The checkpoints and recents maps are combined
using into_par_iter, leveraging multi-threading to potentially
improve performance.

Signed-off-by: Hwangjae Lee <[email protected]>
This patch introduces a performance test for the BlockLocators iterator,
outputting the processing times.
@meetrick meetrick marked this pull request as draft November 5, 2024 22:14
@meetrick
Copy link
Author

meetrick commented Nov 5, 2024

hey guys!

I expected that processing speed would increase with parallel processing. However, after testing, I observed that up to a certain number of items, parallel processing introduced overhead, which caused it to take longer. Based on my test code, sequential processing performed better up to approximately 100,000 items. The performance is also affected by the machine’s capabilities.

I would appreciate reviewers’ opinions on this. It seems that additional changes may be needed in the backend to fully accommodate these updates.

self.checkpoints
.into_par_iter()
.chain(self.recents.into_par_iter())
.collect::<Vec<_>>(),
Copy link
Collaborator

@ljedrz ljedrz Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this be collected directly into a BTreeMap via from_par_iter?

@ljedrz
Copy link
Collaborator

ljedrz commented Nov 7, 2024

@meetrick in general, I'm not convinced that parallelizing the iteration itself is likely to be a performance improvement at any scale, unless we can utilize rayon to do the sorting that's involved when creating the BTreeMap; in other words, the operation that we should consider is BTreeMap::from_par_iter, as indicated in the removed comment.

@meetrick
Copy link
Author

meetrick commented Nov 7, 2024

@ljedrz It seems I misunderstood the TODO. I’ll work on it again. Thanks for your input.

@ljedrz
Copy link
Collaborator

ljedrz commented Nov 7, 2024

In addition, it is possible that sorting is not always necessary; I don't recall the related logic very well, but it seems to me that checkpoints and recents might already be in the right order at least in some scenarios. in such cases, avoiding the iteration and collecting to a BTreeMap would have a positive impact on performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants