
Shard selecting load balancing #791

Closed
wants to merge 6 commits

Conversation

wprzytula
Collaborator

Motivation

Most of our drivers, being inherited from Cassandra, load balance only over nodes, not specific shards. Multiple ideas have arisen that could benefit from shard-selecting load balancing. Among them:

  • shard-aware batching (Shard aware batching - add Session::shard_for_statement & Batch::enforce_target_node #738);
  • tablets support:
    with tablets enabled (at the moment experimental in ScyllaDB), the target shard is not derived from the token (computed from the partition key), but rather read from system.tablets. Therefore, a load balancer should be able to decide a target shard on its own, abstracting over either the token ring or tablets being used for cluster topology.
  • overloaded shard optimisation:
    some tests have shown that sometimes, when a shard is particularly overloaded, it may be beneficial (performance-wise) to send the request to the proper node but a wrong shard. That shard would then do part of the work that the overloaded shard would otherwise have to do itself.

Design

  • The LB policy now returns a (NodeRef, Shard) pair, enabling finer-grained control over targeted shards.
  • regarding tablets support: ReplicaLocator is the place where the abstraction over either token ring or tablets is to be implemented. Ideally, the LB policy does not have to be aware of the actual mechanism (token ring or tablets) being used for a particular query.

What's done

  • internal and public load-balancing-related interfaces are changed from NodeRef to (NodeRef, Shard) pair,
  • shard selection logic is removed from NodeConnectionPool; a method is added there that returns a connection to a specific shard,
  • Session's logic propagates the load balancing policy's target shard down to the connection pool,
  • a stub implementation of shard selection is added to ReplicaLocator. At the moment, it simply computes the shard based on the token, the same way as it was done in the connection pool layer before.
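To make the data flow above concrete, here is a minimal, self-contained sketch of how a (NodeRef, Shard) pair might travel from a load-balancing policy down to the connection pool. All types here (`Node`, `NodeRef`, `Shard`, `TokenModuloPolicy`, `connection_for_shard`) are simplified stand-ins invented for illustration, not the driver's real API:

```rust
// Sketch of the (NodeRef, Shard) flow described above, using simplified
// stand-in types; the real driver's RoutingInfo, ClusterData and
// connection pool are far richer than this.

/// Stand-in for the driver's shard identifier.
type Shard = u32;

/// Stand-in for a node known to the cluster metadata.
#[derive(Debug)]
struct Node {
    id: u8,
    shard_count: u32,
}

/// Borrowed reference to a node, as a policy would return it.
type NodeRef<'a> = &'a Node;

/// Hypothetical trimmed-down policy trait: the key point is that `pick`
/// now returns a (node, shard) pair instead of just a node.
trait LoadBalancingPolicy {
    fn pick<'a>(&self, cluster: &'a [Node], token: i64) -> Option<(NodeRef<'a>, Shard)>;
}

/// Toy policy: picks the first node and derives the shard from the token,
/// mimicking the stub shard selection added to ReplicaLocator.
struct TokenModuloPolicy;

impl LoadBalancingPolicy for TokenModuloPolicy {
    fn pick<'a>(&self, cluster: &'a [Node], token: i64) -> Option<(NodeRef<'a>, Shard)> {
        let node = cluster.first()?;
        let shard = (token.unsigned_abs() as u32) % node.shard_count;
        Some((node, shard))
    }
}

/// Stand-in for the pool: instead of choosing a shard itself, it now
/// receives the target shard decided by the policy.
fn connection_for_shard(node: &Node, shard: Shard) -> String {
    format!("conn(node={}, shard={})", node.id, shard)
}

fn main() {
    let cluster = vec![
        Node { id: 1, shard_count: 4 },
        Node { id: 2, shard_count: 4 },
    ];
    let (node, shard) = TokenModuloPolicy.pick(&cluster, 10).unwrap();
    // The session layer would propagate the policy's shard choice here.
    println!("{}", connection_for_shard(node, shard));
}
```

The point of the sketch is only the signature change: shard selection moves out of the pool and into whatever sits behind the policy.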

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

```rust
    &'a self,
    query: &'a RoutingInfo,
    cluster: &'a ClusterData,
) -> Option<(NodeRef<'a>, Shard)>;
```
Contributor
I'm more for hiding the (NodeRef, Shard) pair in some struct, e.g.

```rust
pub struct PlanElement<'n> {
    node_ref: NodeRef<'n>,
    shard: Shard,
}
```

Converting the load_balancing interfaces to use it instead of a plain tuple would give us greater flexibility in adding/removing PlanElement's fields without breaking the API.

Collaborator Author

If we hide them, then how could a user implement their own policies?

Contributor

By using PlanElements produced by the locator.

Collaborator Author

What operations do you propose to be pub on PlanElement? Will it be possible to examine its contents? Will it be possible to alter them? What about crafting one's own PlanElement?

Contributor

What operations do you propose to be pub on PlanElement? Will it be possible to examine its contents?

Viewing shard and node_ref should be user-accessible.

What about crafting one's own PlanElement?

I don't immediately see why one would like to craft their own PlanElement. The objects required to do so (NodeRef + Shard) can only be obtained through the locator.

Will it be possible to alter them?

Allowing the user to influence shard selection sounds like a good reason for altering the shard field.
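The access rules discussed here (read both fields, write only the shard) could be sketched roughly as follows. This is a hypothetical illustration with placeholder `Node`/`NodeRef`/`Shard` types, not a proposal from the PR itself:

```rust
// Hypothetical PlanElement with the accessors discussed above:
// read access to both fields, write access only to `shard`.

type Shard = u32;

#[derive(Debug)]
struct Node {
    id: u8,
}

type NodeRef<'a> = &'a Node;

pub struct PlanElement<'n> {
    node_ref: NodeRef<'n>,
    shard: Shard,
}

impl<'n> PlanElement<'n> {
    /// Viewing node_ref is user-accessible.
    pub fn node_ref(&self) -> NodeRef<'n> {
        self.node_ref
    }

    /// Viewing shard is user-accessible.
    pub fn shard(&self) -> Shard {
        self.shard
    }

    /// Write access only to `shard`, letting a user policy influence
    /// shard selection without being able to forge a node reference.
    pub fn set_shard(&mut self, shard: Shard) {
        self.shard = shard;
    }
}

fn main() {
    let node = Node { id: 7 };
    // In the real design only the locator would construct this;
    // direct construction works here because we are in the same module.
    let mut element = PlanElement { node_ref: &node, shard: 0 };
    element.set_shard(3);
    println!("node={}, shard={}", element.node_ref().id, element.shard());
}
```

Keeping the fields private while exposing getters and a single setter is one way to preserve the "fields can change without breaking the API" flexibility argued for earlier in the thread.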

Collaborator Author

I don't immediately see why one would like to craft their own PlanElement. Objects required to do so (NodeRef + Shard) can only be obtained through locator.

Precisely the case of #738 is one where a user would like to craft their own PlanElement, or at least a case close to crafting one's own.

Collaborator Author

BTW, is ReplicaLocator only responsible for locating replicas, or is it, from the LB policy's point of view, also the only source of knowledge about not-necessarily-replica nodes in the cluster?

@Lorak-mmk
Collaborator

I'll close this - we won't be merging this PR on its own; it will be part of the tablets PR.

3 participants