feat: Call provide endpoint in batches. #64

ajnavarro · 2022-11-08T10:38:34Z

This will avoid huge provide payloads.

It closes #55

Signed-off-by: Antonio Navarro Perez [email protected]
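For readers skimming the thread, a minimal sketch of count-based batching follows. It is not the code in this PR: `splitBatches` and the sample keys are hypothetical names chosen for illustration, and the only idea taken from the PR is that the provide call is issued once per fixed-size slice of items.

```go
package main

import "fmt"

// splitBatches splits items into consecutive slices of at most batchSize
// elements. A non-positive batchSize returns everything as a single batch.
// Hypothetical helper for illustration; not taken from client/client.go.
func splitBatches[T any](items []T, batchSize int) [][]T {
	if len(items) == 0 {
		return nil
	}
	if batchSize <= 0 {
		return [][]T{items}
	}
	batches := make([][]T, 0, (len(items)+batchSize-1)/batchSize)
	for start := 0; start < len(items); start += batchSize {
		end := start + batchSize
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	// Placeholder keys; a real client would pass CIDs here.
	keys := []string{"key-a", "key-b", "key-c", "key-d", "key-e"}
	for i, batch := range splitBatches(keys, 2) {
		// A real client would call the provide endpoint once per batch.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

Each returned batch shares the backing array of the input slice, so no items are copied.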

lidel

Quick nits:

- batch size based on item count is not enough; we may need a second one based on byte size (see the comment below for why)
- the server side does not enforce batch size limits, which is a DoS vector
- limits should be documented in specs (IPIP-337: Delegated Content Routing HTTP API specs#337)

lidel · 2022-11-08T12:52:43Z

client/client.go

+	}
+
+	if c.ProvideBatchSize == 0 {
+		c.ProvideBatchSize = 30000 // this will generate payloads of ~1MB in size


nit (1): this comes with various assumptions about the average size of a single item, which will quickly get outdated when we have WebTransport enabled by default (/webtransport multiaddrs with two /certhash segments will balloon the size of the batch beyond the initial estimate), or when we add more transports in the future.

In other places, such as UnixFS autosharding, we've moved away from ballpark counting of items on the assumption that they have some arbitrary average size, and switched to calculating the total size of the final block.

Thoughts on switching to byte size, or having two limits? ProvideBatchSize (current one) and ProvideBatchByteSize (new one)
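To make the two-limit idea concrete, here is a hedged sketch of batching by both item count and accumulated byte size. `batchBySize`, its parameters, and the `sizeOf` callback are assumptions made for illustration; how the real client would measure an item's encoded size is not specified in this thread.

```go
package main

import "fmt"

// batchBySize groups items so that each batch holds at most maxItems entries
// and at most maxBytes bytes in total, as measured by the caller-supplied
// sizeOf. A non-positive limit disables that limit. An item that alone
// exceeds maxBytes is still emitted, in a batch of its own.
// Hypothetical helper for illustration; not part of this PR.
func batchBySize[T any](items []T, maxItems, maxBytes int, sizeOf func(T) int) [][]T {
	var batches [][]T
	var current []T
	currentBytes := 0
	for _, item := range items {
		n := sizeOf(item)
		overCount := maxItems > 0 && len(current)+1 > maxItems
		overBytes := maxBytes > 0 && currentBytes+n > maxBytes
		// Flush the running batch before it would break either limit.
		if len(current) > 0 && (overCount || overBytes) {
			batches = append(batches, current)
			current = nil
			currentBytes = 0
		}
		current = append(current, item)
		currentBytes += n
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	// Strings stand in for encoded items; their length stands in for byte size.
	items := []string{"aaaa", "bb", "cccccc", "d", "ee"}
	byLen := func(s string) int { return len(s) }
	for i, batch := range batchBySize(items, 3, 7, byLen) {
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

The count limit keeps the existing behaviour, while the byte limit bounds the payload even if individual items grow, for example through longer multiaddrs.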

ajnavarro · 2022-11-14

> nit (1): this comes with various assumptions about average size of a single item, which will quickly get outdated when we have WebTransport enabled by default (/webtransport multiaddrs with two /certhash segments will balloon the size of the batch beyond initial estimate), or add more transports in the future.

In that case we would need to add a specific limit on the number of multiaddrs allowed, but that is not the problem right now.

Checking the raw byte size of the payload would make the code hard to read and understand. In my opinion, it is not worth it: the worst case is a payload of ~2 MB instead of ~900 KB because more multiaddrs were added.

ischasny

LGTM. Thanks for implementing.

ajnavarro · 2022-11-14T11:42:10Z

@lidel:

> batch size based on item count is not enough, we may need a second one based on byte size (see comment below why)

See my comment on why I think checking the byte size is not worth it in this case.

> server side does not enforce batch size limits, which is a DoS vector

It is up to the implementation to limit it.

> limits should be documented in specs

They are here: https://github.com/ipfs/specs/blob/main/reframe/REFRAME_KNOWN_METHODS.md#provide

Call provide endpoint in batches.

bf3fdc2

This will avoid huge provide payloads. Signed-off-by: Antonio Navarro Perez <[email protected]>

ajnavarro requested review from willscott and guseggert November 8, 2022 10:38

ajnavarro self-assigned this Nov 8, 2022

ajnavarro changed the title ~~Call provide endpoint in batches.~~ feat: Call provide endpoint in batches. Nov 8, 2022

lidel suggested changes Nov 8, 2022

ischasny approved these changes Nov 8, 2022

ischasny mentioned this pull request Nov 21, 2022

IPIP-337: Delegated Content Routing HTTP API ipfs/specs#337

Merged

ajnavarro removed their assignment Feb 5, 2023

BigLep closed this Mar 8, 2023