Can you do this, but async? #14

Open
nichoth opened this issue Jan 11, 2019 · 4 comments

Comments

@nichoth

nichoth commented Jan 11, 2019

I was thinking about doing a 'live' map-reduce operation, where you group multiple log values under a key (this is the map function, choosing the key) and then reduce them to get the value for that key. To get the previous reduced value from leveldb, though, you would need to read the current value for this key, which is an async operation.

I have a feeling that it could get weird when you try to keep all the constraints we want, like atomic writes, while also optimizing the writes by batching a number of input values. I haven't thought this through too much yet, but figured it wouldn't hurt to think out loud.

@dominictarr
Contributor

I have daydreamed about this occasionally. The naive way is to just get the current value, then write the new value before processing the next one. That is easy because of pull streams, but probably slow (how slow?).
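A minimal sketch of that naive approach, for illustration only. The store here is a hypothetical promise-wrapped `Map` standing in for leveldb, and `createStore`, `processSequentially`, and the example `map`/`reduce` functions are assumptions, not part of this module's API.

```javascript
// Hypothetical stand-in for an async key-value store such as leveldb.
// It is a Map wrapped in promises, not the real level API.
function createStore () {
  const data = new Map()
  return {
    data,
    get: async (key) => data.get(key),
    put: async (key, value) => { data.set(key, value) }
  }
}

// Example map and reduce (assumed shapes): map chooses the key,
// reduce folds one item into the previous value for that key.
const map = (msg) => msg.author
const reduce = (acc, msg) => (acc || 0) + msg.size

// Naive approach: finish the read and the write for one item
// before touching the next one, so no two updates can interleave.
async function processSequentially (store, items) {
  for (const item of items) {
    const key = map(item)
    const previous = await store.get(key)
    await store.put(key, reduce(previous, item))
  }
}
```

Every item costs one read and one write against the store, which is where the "probably slow" concern comes from.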

You could make that much faster by adding caching (caching is basically go-fast-sauce, just squirt it on your database and it goes faster), but it also adds edge cases. For example, don't evict something from the cache while it's being written. And if you update a key that is already being written, it's probably better to wait for that write to finish before you write it again, in case the second write lands first.
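One way those two edge cases could be handled, as a rough sketch: keep a per-key promise for the write in flight, chain later writes for the same key behind it, and refuse to evict while a write is pending. `createCachedStore` and its `get`/`put`/`evict` methods are hypothetical names, and the wrapped `store` is assumed to expose promise-returning `get(key)` and `put(key, value)`.

```javascript
// Sketch of a write-through cache in front of an async store.
// Assumes store.get(key) and store.put(key, value) return promises.
function createCachedStore (store) {
  const cache = new Map()   // key -> latest known value
  const pending = new Map() // key -> promise for the last queued write

  async function get (key) {
    if (cache.has(key)) return cache.get(key)
    const value = await store.get(key)
    // A put may have happened while we were reading; if so, keep it.
    if (!cache.has(key)) cache.set(key, value)
    return cache.get(key)
  }

  function put (key, value) {
    cache.set(key, value)
    // Wait for any earlier write to this key, so writes land in order.
    const previous = pending.get(key) || Promise.resolve()
    const tracked = previous
      .catch(() => {}) // a failed earlier write should not block this one
      .then(() => store.put(key, value))
      .then(() => {
        // Only clear the marker if no newer write was queued meanwhile.
        if (pending.get(key) === tracked) pending.delete(key)
      })
    pending.set(key, tracked)
    return tracked
  }

  // Refuse to evict a key whose write has not finished yet.
  function evict (key) {
    if (pending.has(key)) return false
    cache.delete(key)
    return true
  }

  return { get, put, evict }
}
```

Reads of a recently written key are served from memory, while the per-key chain keeps a slow first write from overwriting a faster second one.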

Hmm, probably a simple way would be to write all dirty records together as a batch.

@dominictarr
Contributor

The most important thing is that the state of the view should be consistent with some sequence number. You shouldn't just write a single record at a time; you need to write them in order, or rather, in a batch, up to some point.
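A sketch of how dirty records and the sequence number might be written together, under stated assumptions. The store is a hypothetical in-memory stand-in whose `batch(ops)` is treated as atomic, with operations shaped like leveldb-style `{ type: 'put', key, value }` entries. `createView`, `add`, `flush`, and `SEQ_KEY` are illustrative names, not this module's API, and calls to `add` and `flush` are assumed to be awaited one at a time.

```javascript
// Hypothetical in-memory store; batch() is assumed to apply all
// operations atomically, as a leveldb batch would.
function createBatchStore () {
  const data = new Map()
  return {
    data,
    get: async (key) => data.get(key),
    batch: async (ops) => {
      for (const op of ops) data.set(op.key, op.value)
    }
  }
}

// Hypothetical key under which the view's sequence number is stored.
const SEQ_KEY = '\x00seq'

function createView (store, map, reduce) {
  const dirty = new Map() // key -> reduced value not yet written
  let seq = -1            // sequence number of the last item folded in

  // Fold one log item into the in-memory state.
  // Assumes callers await each add() before starting the next.
  async function add (item, itemSeq) {
    const key = map(item)
    // Dirty records are newer than the store, so check them first.
    const previous = dirty.has(key) ? dirty.get(key) : await store.get(key)
    dirty.set(key, reduce(previous, item))
    seq = itemSeq
  }

  // Write every dirty record plus the sequence number in one batch,
  // so the stored view always matches exactly one point in the log.
  async function flush () {
    if (dirty.size === 0) return
    const ops = [...dirty].map(([key, value]) => ({ type: 'put', key, value }))
    ops.push({ type: 'put', key: SEQ_KEY, value: seq })
    await store.batch(ops)
    dirty.clear() // only forget records once the batch has succeeded
  }

  return { add, flush }
}
```

Because the sequence number travels in the same batch as the records, a restart can read it back and resume from the next log entry without guessing which records were written.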

@mixmix
Member

mixmix commented Oct 4, 2019

I'm interested in doing async maps as well.

It looks like this would be a major version change: changing the signature of map, and inside here probably changing the signature of pull-write as well to have an async reduce? Hmm, that might be hard, or make for unpredictable queues?

@dominictarr
Contributor

@mixmix when designing a thing like a database, it's important to think about what you are actually trying to do. What is an example of a query you are wanting to write?


3 participants