Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for autosharded pubsub topics #1723

Merged
merged 1 commit into from
Dec 21, 2023

Conversation

adklempner
Copy link
Member

@adklempner adklempner commented Nov 17, 2023

Problem

js-waku needs to support scaling via autosharding by:

  • allowing the developer to specify whether to use autosharding or static sharding when creating a node
  • when using autosharding, libp2p pubsub topics should be determined using content topics, specifically the application and version fields

Solution

  • Adds optional parameters autosharding and clusterId when creating an encoder or decoder. Defaults to static sharding behavior.
  • Adds autosharding and contentTopics fields to ShardInfo. Updates logic for determining pubsub topics from shard info to branch on whether autosharding is enabled. All protocols use this to determine pubsub topics using the content topics if autosharding is enabled. Defaults to using static sharding.
  • Adds functions to NimGoNode for calling autosharding-specific RPC endpoints in tests
  • Tests each protocol on js-waku nodes configured with autosharding using above endpoints on nwaku node

Notes

@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch 3 times, most recently from 2edcf4e to cfd2671 Compare November 30, 2023 09:13
Copy link

github-actions bot commented Nov 30, 2023

size-limit report 📦

Path Size Loading time (3g) Running time (snapdragon) Total time
Waku core 81.96 KB (+3.41% 🔺) 1.7 s (+3.41% 🔺) 206 ms (-18.22% 🔽) 1.9 s
Waku Simple Light Node 261.28 KB (+0.05% 🔺) 5.3 s (+0.05% 🔺) 412 ms (-15.05% 🔽) 5.7 s
ECIES encryption 31.94 KB (-56.71% 🔽) 639 ms (-56.71% 🔽) 164 ms (-55.15% 🔽) 803 ms
Symmetric encryption 31.93 KB (-56.71% 🔽) 639 ms (-56.71% 🔽) 249 ms (-26.78% 🔽) 887 ms
DNS discovery 123.17 KB (+0.43% 🔺) 2.5 s (+0.43% 🔺) 375 ms (+16.49% 🔺) 2.9 s
Privacy preserving protocols 119.3 KB (-6.59% 🔽) 2.4 s (-6.59% 🔽) 447 ms (+30.55% 🔺) 2.9 s
Light protocols 79.42 KB (+3.41% 🔺) 1.6 s (+3.41% 🔺) 184 ms (-28.36% 🔽) 1.8 s
History retrieval protocols 78.31 KB (+3.47% 🔺) 1.6 s (+3.47% 🔺) 214 ms (+10.53% 🔺) 1.8 s
Deterministic Message Hashing 5.92 KB (0%) 119 ms (0%) 49 ms (-0.12% 🔽) 167 ms

@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch from cfd2671 to 1899da7 Compare December 1, 2023 06:30
@adklempner adklempner marked this pull request as ready for review December 1, 2023 22:13
@adklempner adklempner requested a review from a team as a code owner December 1, 2023 22:13
autosharding?: boolean;
contentTopics?: string[];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it be assumed that if content topics are passed then we are using autosharding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated in latest commit

ephemeral = false,
metaSetter
}: EncoderOptions,
autosharding = false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why having the parameter outside the EncoderOptions object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to EncoderOptions in latest commit

@@ -21,6 +21,7 @@ export interface Waku2 {
export interface ShardInfo {
clusterId: number;
shards: number[];
contentTopics?: string[];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ENR does not contain contentTopics info so I think this can be classified as an abuse of the interface.

It may make more sense to have a different interface with the content topics. And actually, this interface should have shards XOR contentTopics as the source of truth shoul dbe one or the other, not both.

Maybe something like

type inputForShard = ShardInfo | ContentTopicsInfo

interface ContentTopicsInfo {
	clusterId: number
	contentTopics: string[]
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in latest commit

@@ -99,6 +105,7 @@ export interface EncoderOptions extends BaseEncoderOptions {
* in [26/WAKU2-PAYLOAD](https://rfc.vac.dev/spec/26/).
*/
export function createEncoder({
autosharding = false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not great. If a dev use the "Autosharding" create node option. they should not need to then pass it again on the encoder.
I think we can merge this and then do a follow-up where the shard decision can be left at protocol time when node is autosharding on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well but wasn't sure how to "link" the node settings with the encoder settings, since they are initialized separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in latest commit

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well but wasn't sure how to "link" the node settings with the encoder settings, since they are initialized separately.

Will checkyyour code and sorry for the short ccomment, was on the phone.

My initial thoughts is about moving the logic of deciding the shard from the encoder class to the protocols class on the node.

];
for (const [topic, shard] of contentTopics) {
expect(contentTopicToShardIndex(topic)).to.eq(shard);
}
});
});

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a fast check here and confirm that any content topic with same app name are in the same shard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in latest commit

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in latest commit

I don't see it.

I'd expect something using the fast-check package.

e.g.

import fc from "fast-check";

  it("Content topics for the same app go in the same shard", function () {
    fc.assert(
      fc.Property(
        fc.string({ minLength: 1 }), // app name
	fc.number(), // app version 1
        fc.string({ minLength: 1 }), // topic name 1
        fc.string({ minLength: 1 }), // encoding 1
	fc.number(), // app version 2 (I don't remember if this is used for autosharding)
        fc.string({ minLength: 1 }), // topic name 2
        fc.string({ minLength: 1 }), // encoding 2
        (appName, appVersion1, topicName1, encoding1, appVersion2, topicName2, encoding2) => {
          const contentTopic1 = `/${appName}/${appVersion1}/${topicName1}/${encoding1}`;
          const contentTopic2 = `/${appName}/${appVersion2}/${topicName2}/${encoding2}`;
	
	      expect(contentTopicToShardIndex(contentTopic1)).to.eq(contentTopicToShardIndex(contentTopic1));
})
})

Then do the same where appname are different

Comment on lines 127 to 130
{ pubsubTopicShardInfo, contentTopic, ephemeral, metaSetter }: EncoderOptions,
autosharding = false,
clusterId = 0
): Encoder {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment as above: we can move these two args into EncoderOptions if they're important for the encoder creation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


why do we need the clusterId flag? perhaps we can leverage the shardInfo arg and even remove the autosharding flag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in latest commit

Comment on lines 198 to 205
pubsubTopicShardInfo?: SingleShardInfo,
autosharding = false,
clusterId = 0
): Decoder {
return new Decoder(
pubsubTopicShardInfo
autosharding
? contentTopicToPubsubTopic(contentTopic, clusterId)
: pubsubTopicShardInfo
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment as aboev

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in latest commit

@@ -21,6 +21,7 @@ export interface Waku2 {
export interface ShardInfo {
clusterId: number;
shards: number[];
contentTopics?: string[];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch from 431d8a8 to 24a7d67 Compare December 16, 2023 06:48
@@ -2,7 +2,10 @@ import type { PubsubTopic } from "./misc.js";

export interface SingleShardInfo {
clusterId: number;
shard: number;
/**
* Specifying this field indicates to the encoder/decoder that static sharding should be used.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Specifying this field indicates to the encoder/decoder that static sharding should be used.
* Specifying this field indicates to the encoder/decoder that static sharding must be used.

Just to clarify is not a preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

let subscription: IFilterSubscription;
let messageCollector: MessageCollector;

const customContentTopic1 = "/waku/2/content/test.js";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const customContentTopic1 = "/waku/2/content/test.js";
const customContentTopic1 = "/waku/2/content/null";

Just to keep in line with expected content topic format. or you could use utf8 too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated all to use utf8 instead of test.js

Comment on lines +259 to +261
const subscription2 = await waku.filter.createSubscription(
pubsubTopicToSingleShardInfo(autoshardingPubsubTopic2)
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API does not look great. possibly a future improvement is that the user just pass their content topic here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened an issue for this #1764

@@ -184,3 +193,176 @@ describe("Static Sharding: Peer Management", function () {
});
});
});

describe("Autoharding: Peer Management", function () {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
describe("Autoharding: Peer Management", function () {
describe("Autosharding: Peer Management", function () {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

];
for (const [topic, shard] of contentTopics) {
expect(contentTopicToShardIndex(topic)).to.eq(shard);
}
});
});

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in latest commit

I don't see it.

I'd expect something using the fast-check package.

e.g.

import fc from "fast-check";

  it("Content topics for the same app go in the same shard", function () {
    fc.assert(
      fc.Property(
        fc.string({ minLength: 1 }), // app name
	fc.number(), // app version 1
        fc.string({ minLength: 1 }), // topic name 1
        fc.string({ minLength: 1 }), // encoding 1
	fc.number(), // app version 2 (I don't remember if this is used for autosharding)
        fc.string({ minLength: 1 }), // topic name 2
        fc.string({ minLength: 1 }), // encoding 2
        (appName, appVersion1, topicName1, encoding1, appVersion2, topicName2, encoding2) => {
          const contentTopic1 = `/${appName}/${appVersion1}/${topicName1}/${encoding1}`;
          const contentTopic2 = `/${appName}/${appVersion2}/${topicName2}/${encoding2}`;
	
	      expect(contentTopicToShardIndex(contentTopic1)).to.eq(contentTopicToShardIndex(contentTopic1));
})
})

Then do the same where appname are different

@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch from 24a7d67 to 29670d4 Compare December 19, 2023 04:55
];
for (const [topic, shard] of contentTopics) {
expect(contentTopicToShardIndex(topic)).to.eq(shard);
}
});

it("topics with same application and version share the same shard", () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fryorcraken here is the test I added

@@ -97,7 +98,9 @@ export class BaseProtocol implements IBaseProtocol {
return filterPeers(allPeersForProtocol, numPeers, maxBootstrapPeers);
}

initializePubsubTopic(shardInfo?: ShardInfo): PubsubTopic[] {
initializePubsubTopic(
shardInfo?: ShardInfo | ContentTopicInfo
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShardInfo | ContentTopicInfo this is a common pattern at this point - makes sense to have a type alias

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in latest commit

pubsubTopicShardInfo.shard
? singleShardInfoToPubsubTopic(pubsubTopicShardInfo)
: contentTopicToPubsubTopic(contentTopic, pubsubTopicShardInfo.clusterId);
}
return new Encoder(
contentTopic,
ephemeral,
pubsubTopicShardInfo
Copy link
Collaborator

@weboko weboko Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens in more than 2 places at this point.
Let's abstract it (e.g utility function or something else)

pubsubTopicShardInfo
      ? pubsubTopicShardInfo.shard
        ? singleShardInfoToPubsubTopic(pubsubTopicShardInfo)
        : contentTopicToPubsubTopic(
            contentTopic,
            pubsubTopicShardInfo.clusterId
          )
      : DefaultPubsubTopic,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

abstracted away in latest commit

@@ -197,10 +206,13 @@ class Decoder extends DecoderV0 implements IDecoder<DecodedMessage> {
export function createDecoder(
contentTopic: string,
symKey: Uint8Array,
pubsubTopicShardInfo?: SingleShardInfo
pubsubTopicShardInfo?: SingleShardInfo,
autosharding = false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to move forward with adding this parameter - other create functions should follow too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to replace this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in latest commit

export function determinePubsubTopic(
contentTopic: string,
pubsubTopicShardInfo?: SingleShardInfo,
defaultPubsubTopic: string = "/waku/2/default-waku/proto"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's refer to the constant here and not hardcoded value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in latest commit

@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch from 2da60ec to cf46315 Compare December 21, 2023 18:01
tests: use a generator for sharded pubsub topics
set pubsub topic in encoder/decoder based on sharding type
add function for grouping content topics by pubsub topic
add autosharding config to create options
add autoshard rpc endpoints to nwaku and use in tests
set autoshard pubsub topics in all protocols
fix rebase with static sharding
removes unused function
remove console logs
remove autosharding from ShardInfo, add to EncoderOptions
fix enr and encoder/decoder options
test that same application/version hashes to same shard index
update comment on shard field
fix spelling of autosharding
fix content topic protocol in tests
add sharding type alias and function to determine topic in encoders/decoders
move DefaultPubsubTopic from core to interfaces
@adklempner adklempner force-pushed the adklempner/autoshard-encoder-decoder branch from cf46315 to 2bc3735 Compare December 21, 2023 18:04
@adklempner adklempner merged commit 197926d into master Dec 21, 2023
10 of 11 checks passed
@adklempner adklempner deleted the adklempner/autoshard-encoder-decoder branch December 21, 2023 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: validate content topics in encoder and decoder
4 participants