Stream-Rail distribution with addr + port #1411

narategithub · 2024-07-01T06:20:23Z

When a stream forwards stream data, it uses the peer address to determine which endpoint in the rail carries the data. This patch also adds the port to the calculation, so that stream data is spread more evenly across the endpoints in the rail.
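The difference between the two selection schemes can be sketched as follows. This is only an illustration under assumptions: `struct peer_addr`, `ep_idx_by_addr`, and `ep_idx_by_addr_port` are hypothetical names, not the identifiers used in `ldms_stream.c`, and FNV-1a is a stand-in hash chosen for brevity rather than the function the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a peer address; NOT the real struct ldms_addr. */
struct peer_addr {
	uint8_t  addr[16];  /* IPv4 uses the first 4 bytes, IPv6 all 16 */
	uint8_t  addr_len;  /* 4 or 16 */
	uint16_t port;      /* host byte order */
};

#define FNV_OFFSET 0xcbf29ce484222325ULL
#define FNV_PRIME  0x100000001b3ULL

/* 64-bit FNV-1a: fold len bytes of buf into the running hash h. */
static uint64_t fnv1a(uint64_t h, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= FNV_PRIME;
	}
	return h;
}

/* Address-only selection: all streams from one host share one endpoint. */
static int ep_idx_by_addr(const struct peer_addr *a, int n_eps)
{
	assert(n_eps > 0);
	return (int)(fnv1a(FNV_OFFSET, a->addr, a->addr_len) % (uint64_t)n_eps);
}

/* Address + port selection: connections from one host may use different endpoints. */
static int ep_idx_by_addr_port(const struct peer_addr *a, int n_eps)
{
	uint8_t port_bytes[2] = { (uint8_t)(a->port >> 8), (uint8_t)(a->port & 0xff) };
	uint64_t h;

	assert(n_eps > 0);
	h = fnv1a(FNV_OFFSET, a->addr, a->addr_len);
	h = fnv1a(h, port_bytes, sizeof(port_bytes));
	return (int)(h % (uint64_t)n_eps);
}
```

With address-only selection, two publishers on the same host always map to the same endpoint. Mixing in the port gives them a chance to land on different endpoints, although a collision is still possible for any particular pair.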

narategithub · 2024-07-01T06:29:40Z

@tom95858 it failed the compatibility test because of an IPv6 issue in the test script, which is addressed in #1408. Once that pull request is merged, I'll rerun the test in this PR.

narategithub · 2024-07-01T06:30:55Z

@tom95858 Shall I also include the stream name in the calculation to improve the distribution?

narategithub · 2024-07-12T01:51:19Z

I'll add the stream name to the calculation too. I'll mark this as draft for now.

tom95858 · 2024-07-12T17:34:00Z

@narategithub what's the status of this?

narategithub · 2024-07-13T07:58:21Z

@tom95858 the code is done but the ldms_stream_test script failed. I'm debugging it.

narategithub · 2024-07-13T14:43:59Z

@tom This is ready for your review.

tom95858 · 2024-07-13T16:11:38Z

ldms/src/core/ldms_stream.c

@@ -478,6 +491,7 @@ __stream_deliver(struct ldms_addr *src, uint64_t msg_gn,
 			.name_len = name_len,
 			.data_len = data_len,
 			.name = stream_name,
+			.name_hash = hash,


If you're going to put the hash in the message, you could put the entire thing in there and avoid the computation on every message. The port and addr aren't changing. This would also mean that the same hash is used all the way up the chain from sender to consumer.
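A rough sketch of this suggestion, under stated assumptions: the combined hash of stream name, address, and port is computed once where the message originates, stored in the header, and reused by every forwarder. `struct msg_hdr`, `rail_hash`, `msg_hdr_init`, and `msg_hdr_ep_idx` are hypothetical names for illustration only (the diff above shows just a `name_hash` field), and FNV-1a is a stand-in hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message header; not the layout used in ldms_stream.c. */
struct msg_hdr {
	const char *name;      /* stream name */
	uint64_t    rail_hash; /* hash of name + addr + port, set once at the origin */
};

/* 64-bit FNV-1a: fold len bytes of buf into the running hash h. */
static uint64_t fnv1a(uint64_t h, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Origin side: compute the combined hash once and store it in the header. */
static void msg_hdr_init(struct msg_hdr *hdr, const char *name,
			 const void *addr, size_t addr_len, uint16_t port)
{
	uint8_t port_bytes[2] = { (uint8_t)(port >> 8), (uint8_t)(port & 0xff) };
	uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */

	h = fnv1a(h, name, strlen(name));
	h = fnv1a(h, addr, addr_len);
	h = fnv1a(h, port_bytes, sizeof(port_bytes));
	hdr->name = name;
	hdr->rail_hash = h;
}

/* Forwarder side: reuse the stored hash; no per-message hashing needed. */
static int msg_hdr_ep_idx(const struct msg_hdr *hdr, int n_eps)
{
	assert(n_eps > 0);
	return (int)(hdr->rail_hash % (uint64_t)n_eps);
}
```

Because the hash travels with the message, every hop that has the same number of endpoints in its rail picks the same index for a given source and stream. This only works if the inputs to the hash are known when the message is created.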

narategithub · 2024-07-14

The address does change in the following case.

There is logic that excludes a local address (e.g. 127.0.0.1) from being used as the src address here:

https://github.com/ovis-hpc/ovis/blob/f71e7adcff4a852f3ccc05b143a7794a6bc71986/ldms/src/core/ldms_stream.c#L1131-L1153

This prevents the case where an application running on the same host connects to the sampler ldmsd through the localhost address, which would make the src in the messages from all nodes 127.0.0.1. Basically, if the src address is local, do not resolve it; let the next hop resolve it instead.
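The behavior described above might look roughly like the sketch below. It is not the code at the linked lines: `is_loopback` and `resolve_src` are hypothetical helpers, and treating an address family of 0 as "not resolved yet" is an assumption made for this illustration.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return non-zero if sa is an IPv4 (127.0.0.0/8) or IPv6 (::1) loopback address. */
static int is_loopback(const struct sockaddr *sa)
{
	if (sa->sa_family == AF_INET) {
		const struct sockaddr_in *in = (const struct sockaddr_in *)sa;
		return (ntohl(in->sin_addr.s_addr) >> 24) == 127;
	}
	if (sa->sa_family == AF_INET6) {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
		return IN6_IS_ADDR_LOOPBACK(&in6->sin6_addr);
	}
	return 0;
}

/*
 * Fill in src from the peer address unless the peer is local.
 * Assumption: src->ss_family == 0 marks a source that is still unresolved.
 * Returns 1 if src was set by this call, 0 otherwise.
 */
static int resolve_src(struct sockaddr_storage *src, const struct sockaddr *peer)
{
	if (src->ss_family != 0)
		return 0; /* an earlier hop already resolved it */
	if (is_loopback(peer))
		return 0; /* local peer: leave it for the next hop */
	if (peer->sa_family == AF_INET)
		memcpy(src, peer, sizeof(struct sockaddr_in));
	else if (peer->sa_family == AF_INET6)
		memcpy(src, peer, sizeof(struct sockaddr_in6));
	else
		return 0; /* unknown family: leave unresolved */
	return 1;
}
```

In this sketch, a publisher that connects to the sampler over 127.0.0.1 leaves src unset, and the first hop that sees the message arrive from a non-local peer stamps that peer's address and port.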

tom95858 · 2024-07-15

So there must be code somewhere else that checks the address and family and, if 0, sets it to the address and port of the forwarder? Where is that logic?

tom95858 · 2024-07-15

@narategithub In any event, all we're doing here is randomizing the seed to what we hope will be a random hash. So, as usual, we've found ourselves in a hole and we keep digging. Let's just back up and decide how we're going to generate a random hash.

narategithub · 2024-08-20T15:41:15Z

Closing this PR. This will be a part of adaptive flow control PR.

narategithub requested a review from tom95858 July 1, 2024 06:21

narategithub marked this pull request as draft July 12, 2024 01:51

narategithub force-pushed the stream-rail-dist branch from 53aa7a5 to f147f17 Compare July 13, 2024 14:29

narategithub marked this pull request as ready for review July 13, 2024 14:42

tom95858 reviewed Jul 13, 2024

Also add stream name into the endpoint index assignment

45251ba

Adding the (hash of) stream name into the endpoint index calculation to increase entropy (and hopefully more even distribution among endpoints).

narategithub force-pushed the stream-rail-dist branch from f147f17 to 45251ba Compare July 14, 2024 15:37

narategithub closed this Aug 20, 2024
