Quorum size limit #41

Closed
tylertreat opened this issue Oct 6, 2018 · 4 comments · Fixed by #312
Labels
enhancement, roadmap (Included on the current product roadmap)

Comments

@tylertreat
Member

Currently, all servers in a cluster participate in the metadata Raft quorum. This severely limits the scalability of the cluster (e.g. in a 100-node cluster, 51 servers have to respond to commit any change, Raft has n^2 messages, etc.).
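
To make the 51-of-100 figure concrete, here is a minimal sketch of the majority arithmetic. The function names are illustrative only and are not taken from Liftbridge or the Raft library.

```go
package main

// majority returns how many voters must acknowledge a log entry before
// Raft can commit it: a strict majority of the voting members.
func majority(voters int) int {
	return voters/2 + 1
}

// tolerated returns how many voters can fail while a majority of the
// original voting set is still reachable.
func tolerated(voters int) int {
	return voters - majority(voters)
}
```

With these definitions, `majority(100)` is 51, while `majority(3)` is 2 and `majority(5)` is 3, which is why capping the number of voters keeps commits cheap regardless of cluster size.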

Liftbridge should allow only a subset of servers to form the Raft quorum. The Raft library should make this fairly straightforward, since non-quorum servers can still run Raft as non-voters, receiving committed logs and following the state without increasing the quorum size. The challenge is the UX for promoting one of them to a quorum member when an existing quorum member fails, but the Raft library provides a way to do this.

@RussellLuo

As far as I know, Jocko has the same issue, which is because we are trying to eliminate the external Raft-service dependency. Looking forward to the solution!

@tylertreat
Member Author

tylertreat commented Oct 18, 2018

I think this is actually easier to solve than the author of that issue lets on, especially since both Liftbridge and Jocko use the same Raft implementation as Consul. This library provides an API for adding and removing cluster nodes, including "non-voters": nodes that run Raft and still receive committed logs but don't increase the quorum size. It's mostly a problem of configuration. I have some ideas on how to do this.
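
As a rough illustration of the "problem of configuration" framing, the sketch below admits joining servers as voters until a cap is reached and as non-voters afterwards. Everything here is an assumption for illustration: the `Membership` interface is a simplified, hypothetical stand-in for the Raft handle (the real library's add-voter and add-non-voter calls take more parameters), `Join` and `memCluster` are invented names, and treating a cap of zero or less as "no limit" is a guess, not confirmed behavior.

```go
package main

// Suffrage records whether a server counts toward the quorum.
type Suffrage int

const (
	Voter Suffrage = iota
	Nonvoter
)

// Membership is a hypothetical, simplified view of the Raft handle.
// It is NOT the real library interface.
type Membership interface {
	AddVoter(id string) error
	AddNonvoter(id string) error
	VoterCount() int
}

// Join admits a server as a voter while the quorum is below maxQuorum
// and as a non-voter once the cap is reached. A maxQuorum of zero or
// less is treated as "no limit" (an assumption of this sketch).
func Join(m Membership, id string, maxQuorum int) (Suffrage, error) {
	if maxQuorum <= 0 || m.VoterCount() < maxQuorum {
		return Voter, m.AddVoter(id)
	}
	return Nonvoter, m.AddNonvoter(id)
}

// memCluster is an in-memory Membership used only to exercise Join.
type memCluster struct {
	voters    []string
	nonvoters []string
}

func (c *memCluster) AddVoter(id string) error {
	c.voters = append(c.voters, id)
	return nil
}

func (c *memCluster) AddNonvoter(id string) error {
	c.nonvoters = append(c.nonvoters, id)
	return nil
}

func (c *memCluster) VoterCount() int { return len(c.voters) }
```

Joining five servers through `Join` with a cap of 3 leaves three voters and two non-voters in `memCluster`, so the commit majority stays at 2 however many servers join later.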

@RussellLuo

Yeah, per the docs, it looks feasible! Then the remaining problem is how to turn a non-voter into a voter after an existing voter fails.
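
One possible shape for that promotion decision is sketched below: pick a failed voter to drop and a live non-voter to take its place, so the quorum size stays constant. This is an assumption-laden sketch, not how Liftbridge resolved the issue. The `Server` and `Plan` types and `PlanReplacement` are hypothetical, how liveness is detected is out of scope, and applying the plan through the Raft library's membership calls is left to the caller.

```go
package main

import "sort"

// Server is a hypothetical snapshot of one cluster member.
type Server struct {
	ID    string
	Voter bool
	Alive bool // how liveness is determined is outside this sketch
}

// Plan describes one membership change that keeps the quorum size fixed.
type Plan struct {
	Remove  string // failed voter to drop from the quorum
	Promote string // live non-voter to promote in its place
}

// PlanReplacement returns a replacement plan if there is both a failed
// voter and a live non-voter. Candidates are chosen in ID order so every
// node computing the plan from the same snapshot picks the same pair.
func PlanReplacement(servers []Server) (Plan, bool) {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Server(nil), servers...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })

	var plan Plan
	for _, s := range sorted {
		switch {
		case s.Voter && !s.Alive && plan.Remove == "":
			plan.Remove = s.ID
		case !s.Voter && s.Alive && plan.Promote == "":
			plan.Promote = s.ID
		}
	}
	if plan.Remove == "" || plan.Promote == "" {
		return Plan{}, false
	}
	return plan, true
}
```

Removing the failed voter as part of the same plan matters: promoting a non-voter without dropping the dead one would grow the quorum and raise the majority needed to commit.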

@riaan53

riaan53 commented Jan 12, 2019

This is not exactly related to this topic, but it might be worthwhile to have a look. I just found an interesting project (I haven't experimented with it yet) that uses high-performance Raft groups similar to CockroachDB. It currently depends on an externally installed RocksDB, but it looks like the author is planning a Go KV store, or you can implement your own store interface. The claimed numbers seem a bit wild for running them through a Raft quorum, but I will do some investigation later. https://github.com/lni/dragonboat

tylertreat added the roadmap label Jul 2, 2020
tylertreat added a commit that referenced this issue Dec 30, 2020
Add a new configuration, `clustering.raft.max.quorum.size`, to allow
limiting the number of servers that participate in the Raft quorum.

Resolves #41.