-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stateful partitioners #758
Conversation
Any measurable performance impact? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a part of this change, you should also tackle issues with public API for calculating tokens.
Currently, there are some problems with that: There are 3 functions to calculate Token: Session::calculate_token
, Session::calculate_token_for_partition_key
and ClusterData::compute_token
. It doesn't make sense to put them inside Session
or ClusterData
- wrong level of abstraction. A similar case for functions to calculate partition key.
More context: #757 (comment)
My initial exploration on fixing it: https://github.com/avelanarius/scylla-rust-driver/tree/token_calc_cleanup
Number of allocations has dropped (6 -> 5 per insert, 19 -> 18 per select). Before:
After:
|
Please prepare a microbenchmark for this. While an allocation is now avoided, the new implementation has more branches, so it will be interesting to see whether the tradeoff was worth it. We are basically interested in the We already have a small amount of benchmarks in |
1707c35
to
948c355
Compare
948c355
to
d5c7737
Compare
After benchmarks and some changes to the algorithm, the new way seems to outperform the old way by 15% to 40% in various cases. I believe this suffices to merge this. |
NICE! Well done! Would be happy to see details of this tests. |
I'll clean the benchmarks up and include them in our codebase, so they can guard against unintended performance regressions in the future. |
126d77a
to
c0353dc
Compare
CI fails due to use of |
0aa8953
to
286044f
Compare
I reimplemented |
286044f
to
c9851ed
Compare
Rebased on |
frame::types, | ||
frame::value::ValueList, | ||
transport::partitioner::{calculate_token_for_partition_key, Murmur3Partitioner}, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: putting the commit with the benchmark at the beginning of the PR should make it easier to use it to compare performance before and after the changes.
c9851ed
to
e107525
Compare
e107525
to
624e79e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another batch of comments - hopefully the last one this time.
/// Instances of this trait are created by a `Partitioner` and are stateful. | ||
/// At any point, one can call `finish()` and a `Token` will be computed | ||
/// based on values that has been fed so far. | ||
pub trait PartitionerHasher { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The first thing that comes to my mind when I see PartitionerHasher
is that it hashes partitioners. I don't have a good, alternative suggestion, though.
prepared_metadata: &PreparedMetadata, | ||
partition_key: Option<&Bytes>, | ||
pub(crate) fn new_prepared<'ps>( | ||
partition_key: Option<impl Iterator<Item = (&'ps [u8], &'ps ColumnSpec)> + Clone>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be that much generic? Does it really make sense to put here anything that isn't an Option<PartitionKey>
? Perhaps doing so will make it easier to fix the lifetime issue in partition_key_displayer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried it up and gave up. Apparently, the Iterator
returned by PartitionKey::iter()
is less problematic in terms of lifetime issue than PartitionKey
itself.
624e79e
to
959520b
Compare
An `else` branch is introduced to make code flow clearer, and a typo is fixed.
The Partitioner API is changed into stateful one, resembling std::hash traits. The motivation is simple: having Partitioners hold state enables feeding them with data in chunks instead of one contiguous slice, which saves us an allocation. Since now, Partitioners' users are encouraged to first build a hasher using `Partitioner::build_hasher()` and then feed it in chunks with `write()`. Finally, `finish()` is to be called and a token is returned. The `Partitioner::hash()` method was renamed to `hash_one()` for closer correspondence to `std::hash::BuildHasher` trait.
This is an umbrella enum for all available PartitionerHashers. It is analogous to PartitionerName in this regard, as well as when it comes to its usage.
In the next few commits, the logic concerning partition key and token computation will be split into two steps: 1) extracting partition key from serialized values 2) calculating token based on the extracted partition key Therefore, to gain more granularity over errors, `PartitionKeyError` is divided into `PartitionKeyExtractionError` and `TokenCalculationError`. This will be needed for clean error handling in the new API.
The recently added `calculate_token_for_partition_key()` duplicates the logic that `ClusterData::compute_token()` uses. To deduplicate, `compute_token()` now calls `calculate_token_for_partition_key()`.
It is clear that the algorithm for computing partition key for a given prepared statement and bound values can be divided into steps. The first of them involves extracting values that consistute the partition key and putting them in proper partition key order. PartitionKey struct performs this step on construction, and serves as an entry point for further steps, with token calculation among others. For those further steps to be done, an iterator over partition key values is accessible, which yields pairs (value, spec) for each column that constitutes partition key, in partition key order. It will be used to avoid materialising partition key in a buffer.
959520b
to
4c7e268
Compare
The tests require PartialEq, Eq on TableSpec and ColumnSpec, so these traits were derived unconditionally.
Traces generated upon requests are expected to contain the partition key if available. In order to avoid an allocation for that, `PartitionKey` struct is used, whose iterator yields the required values one by one and then they are deserialised. Partition key is extracted only once, and token is calculated only once, too. Not to raise our MSRV, `Option::unzip()` is temporarily reimplemented as an util `unzip_option()`. It is to be replaced with `Option::unzip()` once MSRV is raised to at least 1.66.
It's no longer useful, as now we operate straight on serialized values before they get encoded in partition key format.
`Session::calculate_token()` is modified so that it takes advantage of the new Partitioner API and hence avoids allocating the whole partition key. Additionally, `calculate_token()` is now called upon queries straight, instead of `calculate_partition_key()` first. As a bonus, the mess around various flavours of `[calculate|compute]_partition_key` functions is finally gone: two of them could be completely removed.
It has no usages left, as `build_hasher()` is to be used now.
It should be exposed that calculating token without materialising partition key is feasible now, and promote this way.
In effort to combat overloading Session with nonrelated functionality, token computation functions are moved out from session.rs. Specifically, `calculate_token()` is moved to prepared.rs and `calculate_token_for_partition_key()` is moved to partitioner.rs.
The benchmark's results confirm that the new algorithm for stateful partitioners is more performant that the old one, by about 15% to 40%. In the future, the benchmark can be run to prevent unintended performance regressions in partitioners.
4c7e268
to
0e21804
Compare
Motivation
Currently, partitioners require the whole input to be passed in a single contiguous slice, which often forces an allocation that could else be avoided.
What's done
std::hash
API:Partitioner
corresponds toBuildHasher
as it becomes kind of a factory,PartitionerHasher
is introduced and corresponds toHasher
. It contains the hashing state, which enables feeding it with multiple chunks of data separately.A test is added that ensures consistent hashing result no matter how the partition key is partitioned into chunks.
To this end,
PartitionKey
struct is introduced, whose motivation is as follows: the algorithm for computing partition key for a given prepared statement and bound values can be divided into steps. The first of them involves extracting values that consistute the partition key and putting them in proper partition key order.PartitionKey
performs this step on construction, and serves as an entry point for further steps, with token calculation among others. For those further steps to be done, an iterator over partition key values is accessible, which yields pairs (value, spec) for each column that constitutes partition key, in partition key order.PartitionKey
's iterator provides values to be deserialised by the mechanism added recently in session: include partition key in RequestSpan in human readable form #766. Allocation is then saved there, too.calculate_token_from_partition_key()
is moved topartitioner.rs
,calculate_token()
toprepared_statement.rs
, and a number of helper functions are deleted as no longer needed.Pre-review checklist
[ ] I have adjusted the documentation in./docs/source/
.[ ] I added appropriateFixes:
annotations to PR description.