
add(RFC): sharding + index alloc #566

Merged

merged 12 commits into master from add/rfc51-waku-relay-sharding on Jan 30, 2023

Conversation

kaiserd
Contributor

@kaiserd commented Jan 26, 2023

This PR is part of the secure scaling roadmap
and addresses vacp2p/research#160

Here is how it integrates into the bigger picture:

Notes

  • The index allocation in the form of an informational RFC is a suggestion.
    We could also opt to manage allocation in another document, or do allocation in another way.

  • We could also add/mention various levels of network segregation in this RFC (or in another document).
    With this RFC, apps can have segregated shard clusters.
    Imo, this level of segregation should be enough.
    We could, however, additionally segregate the discv5 discovery network (at the cost of connectivity),
    as well as completely segregate the gossipsub network (with the current version of the RFC, specific control messages are shared beyond shard boundaries).

cc @Menduist @rymnc @LNSD @fryorcraken @cammellos @corpetty

for global shard 43.
And for shard 43 of the Status app (which has allocated index 16):

`subscribe("/waku2/xxx", 16, 43)`
Contributor

What would the actual pubsub topic string look like?

Contributor Author

Added in df6c44a
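
For reference, a minimal sketch of how such a topic string can be built, assuming the `/waku/2/rs/<cluster-index>/<shard>` shape the RFC settled on (the helper name is hypothetical):

```python
def shard_pubsub_topic(cluster_index: int, shard: int) -> str:
    """Build the pubsub topic string for a static shard.
    Assumes the /waku/2/rs/<cluster-index>/<shard> format."""
    return f"/waku/2/rs/{cluster_index}/{shard}"

# Shard 43 of the Status app (allocated cluster index 16):
assert shard_pubsub_topic(16, 43) == "/waku/2/rs/16/43"
```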


| key | value |
|--- |--- |
| `relay-shard-0` | `0x0000100000000000` |
Contributor

Being conscious of the size limit of the ENR, wouldn't shard-0 be enough to describe the information?

Suggested change
| `relay-shard-0` | `0x0000100000000000` |
| `shard-0` | `0x0000100000000000` |
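
As an aside on what such a value encodes, a minimal sketch of building the 8-byte bit vector, assuming shard n is flagged by the n-th most significant bit (the bit order here is an assumption, not taken from the RFC):

```python
def shard_bitvector(shards: set[int]) -> bytes:
    """Encode a set of shard indices (0..63) as an 8-byte bit vector.
    Assumption: shard n maps to the n-th most significant bit."""
    bits = 0
    for n in shards:
        if not 0 <= n < 64:
            raise ValueError(f"shard index out of range: {n}")
        bits |= 1 << (63 - n)
    return bits.to_bytes(8, "big")

# Under this ordering, a node on shard 19 of the cluster would
# advertise the value shown in the table above:
assert shard_bitvector({19}).hex() == "0000100000000000"
```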

Contributor Author

Agreed. I used relay-shard because there might be other kinds of shards.
What about rshard?

Contributor Author

Moved it to rshard in all occurrences in 4e4bc56.
Wdyt?

Contributor

I wonder about the other types of shards. Relay is the backbone of the network; a relay shard therefore impacts all other protocols (store, light push, filter). So a relay shard sounds like the most impactful type of shard there can be in Waku.

However, rshard sounds good as a more cautious and future-proof name.

Contributor Author

I thought about potential New Message Dissemination Methods (currently the last item on vacp2p/research#154).
These would be an alternative to Waku Relay and would come with other trade-offs, e.g. better suited for 1:1 communication, or lower latency but weaker anonymity guarantees. These dissemination networks could be sharded, too.
(But this is in the farther future.)

Contributor

@jm-clius left a comment

Yes! Thanks. This LGTM as a raw spec and static sharding makes sense to me as an initial strategy for scaling.

content/docs/rfcs/51/README.md — 6 outdated review threads, resolved
@kaiserd force-pushed the add/rfc51-waku-relay-sharding branch from 6066e8f to 123ea35 on January 27, 2023 14:12
Contributor

@rymnc left a comment

LGTM!

Contributor

@alrevuelta left a comment

nice! Left some comments.

A more generic question: what is the intention behind having both static sharding and automatic sharding? I mean, do you plan the network to support both, or is static sharding the most immediate scaling solution and automatic sharding the evolution of it?

Wondering if we should just have named sharding (which we currently support without modifying the code) and then aim directly for automatic sharding.

Thanks!

which allow application protocols to scale in the number of content topics.
This document also covers discovery of topic shards.

# Named Sharding
Contributor

Wondering if "named sharding" is already covered here.

Contributor Author

@kaiserd Jan 30, 2023

Yes. The document mentions this (in this section), along with the option (in the note) to merge RFC 23 here.
I put this into this RFC to consolidate sharding strategies into one RFC, and to categorize this approach as "named sharding", distinguishing it from the other strategies.
We could leave RFC 23 as an informational RFC discussing naming strategies, or merge it here and deprecate RFC 23.

| 13 | reserved | |
| 14 | reserved | |
| 15 | reserved | |
| 16 | Status | Status main net |
Contributor

Since Waku is permissionless, how do you enforce this? I mean, this could be an internal recommendation, but any app can send messages to the Status mainnet shard. So wondering about the impact it will have if people don't respect this.

Contributor Author

@kaiserd Jan 30, 2023

Similar to the IANA process, there would be no enforcement.
Apps could, however, make their shards permissioned on the app layer.
An attacker who controls enough nodes can still overtake the shards,
or significantly increase load.
Part of this will be addressed by DoS mitigation research.
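
As an illustration of app-layer permissioning (not part of the RFC; the signature scheme and names here are hypothetical), an app could discard messages on its shard that are not signed with an app-distributed key:

```python
# Sketch using PyNaCl; any signature scheme would do. Nothing here is
# enforced by Waku itself: relay nodes still carry the traffic, the
# app merely rejects messages that fail validation.
from nacl.signing import VerifyKey
from nacl.exceptions import BadSignatureError

def validate_app_message(payload: bytes, signature: bytes,
                         app_key: VerifyKey) -> bool:
    """Accept a message on a permissioned shard only if it carries a
    valid signature from the app's key."""
    try:
        app_key.verify(payload, signature)
        return True
    except BadSignatureError:
        return False
```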

A shard cluster is either globally available to all apps (like the default pubsub topic),
specific to an app protocol,
or reserved for automatic sharding (see next section).
In total, there are $2^{16} \times 64 = 4194304$ shards for which Waku manages discovery.
Contributor

Wondering if there is any rationale behind these numbers?

How is the mapping to pubsub topics done? Reading what's below, is it 1 shard per topic? Will we have 4194304 gossipsub topics?

Contributor Author

@kaiserd Jan 30, 2023

64 shards per shard cluster is chosen to match the Eth ENR shard representation.
2^16 for the index is the next byte boundary after 2^8, which seemed too low and would not save significant space in the ENR.
(Also, 2^16 is the whole IANA port range, and ranges in this RFC match the IANA ranges.)

If there are strong arguments for other numbers, we can of course adjust while in the raw phase.

How is the mapping to pubsub topics done?

For static sharding: up to the app layer. The document states this.

Reading what's below, is it 1 shard per topic?

One shard per pubsub topic yes.

Will we have 4194304 gossipsub topics?

yes.
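
For concreteness, a back-of-envelope sketch of the ENR cost implied by these choices; the 2-byte-index + 8-byte-bit-vector layout is an assumption based on the numbers discussed above, not a normative encoding:

```python
INDEX_BITS = 16           # 2^16 shard cluster indices
SHARDS_PER_CLUSTER = 64   # matches the Eth ENR shard representation

index_bytes = INDEX_BITS // 8             # 2 bytes per cluster index
bitfield_bytes = SHARDS_PER_CLUSTER // 8  # 8-byte shard bit vector
print(index_bytes + bitfield_bytes)       # 10 bytes per advertised cluster

# An 8-bit index would save only 1 byte per entry, which is why 2^8
# "would not save significant space in the ENR".
print(2**INDEX_BITS * SHARDS_PER_CLUSTER)  # 4194304 shards in total
```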

Contributor

Will we have 4194304 gossipsub topics?

yes.

Can gossipsub scale to this amount of topics?

Contributor Author

(For a long time at least,) most topics/shards would not be used.
For a very large number of pubsub topics, we might have to adjust (limit) some of the control messages.
As long as the number of control messages (that cross pubsub topic boundaries) sent and received is < O(number of pubsub topics), it would be fine.

Contributor

Can gossipsub scale to this amount of topics?

This would be good to check with the libp2p team in terms of how this would work in practice. I can't find it now, but a long time ago I tried to have multiple pubsub topics (basically using content topics as pubsub topics) and there were problems with creating meshes for pubsub topics. If a client is listening to these topics ahead of time it is probably fine, but it is something worth checking with network testing too. Maybe Nimbus knows of some potential gotchas here?

cc @Menduist re libp2p and @jm-clius re network testing (not sure who to ping re this)

@kaiserd
Contributor Author

kaiserd commented Jan 30, 2023

Thank you for the feedback.

A more generic question: what is the intention behind having both static sharding and automatic sharding?

With automatic sharding, apps do not have to manage sharding.
With static sharding, apps have the option to manage the mapping.
Since static sharding is easier to realize, we do this first for the MVP.

I mean do you plan the network to support both,

yes

@kaiserd merged commit 7c23eea into master Jan 30, 2023
Assigning content topics to specific shards is up to app protocols,
but the discovery of these shards is managed by Waku.

These shards are managed in an array of $2^{16}$ shard clusters.
Contributor

Pull up these constants and define them above perhaps?
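
Following up on this suggestion, a sketch of how such constants could be pulled up and defined in one place (names here are illustrative, not taken from the RFC):

```python
CLUSTER_COUNT = 2**16        # number of shard clusters in the index space
SHARDS_PER_CLUSTER = 64      # shards per cluster (Eth ENR bit-vector width)
TOTAL_SHARDS = CLUSTER_COUNT * SHARDS_PER_CLUSTER  # 4_194_304
```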

Contributor

@oskarth left a comment

Thanks for this, quite thorough as a raw spec! I know it has been merged but just a few comments

