-
Notifications
You must be signed in to change notification settings - Fork 626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hashset feature] Replace dict with hashset for keys and expire #1178
base: hashset
Are you sure you want to change the base?
Conversation
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
This changes the type of command tables from dict to hashset. Command table lookup takes ~3% of overall CPU time in benchmarks, so it is a good candidate for optimization. My initial SET benchmark comparison suggests that hashset is about 4.5 times faster than dict and this replacement reduced overall CPU time by 2.79% 🥳 --------- Signed-off-by: Rain Valentine <[email protected]> Co-authored-by: Rain Valentine <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
* Add hashsetRehashMicroseconds * Change hashsetTwoPhasePopDelete to *not* call the element destructor * Change a field name in the hashset iterator * Add missing prototype for hashsetIsRehashingPaused Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
…ty buckets" Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
Signed-off-by: Viktor Söderqvist <[email protected]>
|
||
#include "dict.h" | ||
#include "hashset.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to include this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR currently targets the hashset
feature branch, where this file exists.
Do you think I should target the unstable branch instead? I'm trying to find a way to split it into smaller PRs...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we first create a PR to unstable with just the hashset implementation and replace dict with hashet in the command lookup table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed this, I thought this was the PR for the hashset and I was interested. You can ignore me :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, seriously, I think it can be very hard to get a huge feature branch merged. We should split it up in some way.
The implementation of the hashset is in the hashset branch in this repo. PTAL!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't look at db.c yet
} | ||
if (val->hasembkey) { | ||
uint8_t hdr_size = *(uint8_t *)data; | ||
data += 1 + hdr_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if we made hdr_size
into key_offset
and have the +1 included in the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the value we store in key_offset
byte includes its own size?
Yeah, that would work. I copied this code from how keys are currently embedded in dict entry.
sds oldkey = *(sds *)data; | ||
if (oldkey != NULL) sdsfree(oldkey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't quite make sense to me with the assert at the top of the function. If valkeyGetKey(val) == NULL
then there should be nothing here - not embedded or a pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. This code isn't ready yet. I want to rewrite it to avoid code duplication and improve it in general.
/* Size of embedded value (EMBSTR) including \0 term. */ | ||
min_size += sizeof(struct sdshdr8) + val_len + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not call sdscopytobuffer() again? How do we know it always uses sdshdr8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EMBSTR was always hard-coded to using sdshdr8. This is old code, refactored. I wanted to avoid rewriting it.
In the future, a good thing would be to get rid of the ptr
field in robj and instead compute it using a function. Then we can save another 8 bytes for embedded strings.
sdscopytobuffer(data + 1, key_sds_size, key, data); | ||
data += 1 + key_sds_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the +1? Is that for the sds header size? Is there a reason it's not assigned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The +1 is the byte where we store the sds header size.
Assigned? Not sure what you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, in this function where we allocate and initialize a brand new valkey object, when do we initialize this embedded header size value? What value is it going to have when this function ends?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The byte at data
is assigned by sdscopytobuffer
by the last parameter we pass to it.
struct sdshdr8 *sh = (void *)data; | ||
sh->flags = SDS_TYPE_8; | ||
sh->len = val_len; | ||
size_t capacity = bufsize - (min_size - val_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct but I found it rather confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It's old code, just moved/copied around. You're welcome to suggest simplification, or do it as a follow-up. I wanted to focus on getting things to work first and to avoid changing more than necessary to keep the diff less huge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a suggestion for a code comment or different variable names?
src/memory_prefetch.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think we'll implement something, or will we end up completely removing prefetch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes... Help wanted. :)
Signed-off-by: Viktor Söderqvist <[email protected]>
8fe59b3
to
3038293
Compare
Keys and expire are embedded in
robj
.More info to come.