MapReduce multithread combined storage #77

Open
Konard opened this issue Oct 27, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@Konard
Member

Konard commented Oct 27, 2021

We have the ability to set the minimum value of the internal references range.

public LinksConstants(int targetPart, Range<TLinkAddress> possibleInternalReferencesRange, Range<TLinkAddress>? possibleExternalReferencesRange)

We can also allow referencing any address, even one that does not exist in the current storage.

Then we can create a separate file, or allocate a separate section, for every N new links.
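
The sectioning idea can be sketched as a mapping from a global link address to a (section, offset) pair. This is an illustration only, not the project's actual API; `LINKS_PER_SECTION` and the `u64` address type are assumptions:

```rust
// Hypothetical fixed number of links per section (the issue suggests
// 64 MB or a user-defined size; here we count links directly).
const LINKS_PER_SECTION: u64 = 1 << 20;

/// Map a global link address to (section index, offset within section).
fn locate(address: u64) -> (u64, u64) {
    (address / LINKS_PER_SECTION, address % LINKS_PER_SECTION)
}

fn main() {
    let (section, offset) = locate(3 * LINKS_PER_SECTION + 42);
    println!("section={section} offset={offset}"); // section=3 offset=42
}
```

Because the mapping is pure arithmetic, any thread can decide which section (and therefore which worker) owns an address without touching shared state.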

Later, we will be able to combine all read and write operations on the combined storage.

Each section of the links storage can be allocated in a separate heap block or in a separate file, and each section can always be accessed by one specific thread. This means all read and write operations can be spread across multiple threads, distributing the load without any locks at all: only lock-free queues are needed to stream requests and results to and from the threads.

Each request is mapped to all threads; when the results are ready, they are reduced to a single result.
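
A minimal sketch of this map/reduce dispatch, using standard channels in place of lock-free queues (all names are illustrative, and the request shown — counting addresses below a threshold — is a stand-in for a real storage operation): each section is owned by exactly one thread, the request is broadcast to every section thread, and the partial results are reduced into one answer.

```rust
use std::sync::mpsc;
use std::thread;

/// Broadcast one request ("count addresses below `threshold`") to a
/// thread per section, then reduce the per-section partial counts.
fn map_reduce_count(sections: Vec<Vec<u64>>, threshold: u64) -> usize {
    let (tx, rx) = mpsc::channel();
    for section in sections {
        let tx = tx.clone();
        // Map: each section is read by exactly one thread, so the
        // link data itself needs no locks.
        thread::spawn(move || {
            let partial = section.iter().filter(|&&a| a < threshold).count();
            tx.send(partial).unwrap();
        });
    }
    drop(tx); // close the channel once only workers hold senders
    // Reduce: fold the per-section results into a single answer.
    rx.iter().sum()
}

fn main() {
    let sections = vec![vec![1, 5, 9], vec![2, 6], vec![3, 7, 11]];
    println!("{}", map_reduce_count(sections, 6)); // prints 4
}
```

A production version would keep the worker threads alive and stream many requests through per-thread queues instead of spawning per request, but the ownership pattern is the same.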

In the case of heap allocation, each section (64 MB or any user-defined size) can be allocated separately, without the need to copy data, which saves additional CPU resources.
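
The no-copy property can be demonstrated directly (a sketch; the tiny section capacity is an assumption for the example): because each section is its own heap allocation, adding a section never moves existing data, unlike growing one contiguous buffer.

```rust
const SECTION_CAPACITY: usize = 4; // hypothetical links per section

/// Allocate one more fixed-capacity section as a separate heap block.
fn add_section(sections: &mut Vec<Vec<u64>>) {
    sections.push(Vec::with_capacity(SECTION_CAPACITY));
}

fn main() {
    let mut sections = vec![Vec::with_capacity(SECTION_CAPACITY)];
    let first = sections[0].as_ptr();
    add_section(&mut sections);
    // The first section's heap block was neither copied nor moved.
    assert_eq!(sections[0].as_ptr(), first);
    println!("{} sections, existing data untouched", sections.len());
}
```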

In the case of mmap allocation, there is no need to close files (new ones will simply be opened), so there is no need to force-flush data to disk; this also saves resources during regular operation of the storage.
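
A sketch of that growth pattern using plain files rather than mmap (the file names and directory are illustrative), since the point is only that earlier section files stay open while new ones are added, with no close and no forced flush:

```rust
use std::fs::File;
use std::io::Write;
use std::path::Path;

/// Open one more section file; earlier sections are left open as-is.
fn open_section(dir: &Path, index: usize) -> std::io::Result<File> {
    File::create(dir.join(format!("section-{index}.links")))
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir();
    let mut sections = Vec::new();
    for i in 0..3 {
        // Growing the storage only opens a new file; nothing is
        // closed or flushed for the existing sections.
        sections.push(open_section(&dir, i)?);
    }
    // All section handles remain writable at the same time.
    for f in &mut sections {
        f.write_all(b"link")?;
    }
    Ok(())
}
```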

The trees in each section are smaller, which also helps with scaling.

There is no need to use more than C+1 threads, where C is the number of memory channels in the system.

@Konard Konard added the enhancement New feature or request label Oct 27, 2021
@uselessgoddess
Member

I used a similar approach in my buffered iterator library.
But how do we link trees from different blocks?

@uselessgoddess
Member

No need to use more than C+1 threads, where C is the number of memory channels in the system.

Really no need? rayon gives a 4-fold speedup in quicksort on a system with two memory channels.
