Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP protocol #204

Merged
merged 11 commits into from
May 8, 2023
Merged

HTTP protocol #204

merged 11 commits into from
May 8, 2023

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Apr 26, 2023

This is branching from feat/http which is in #193; using this to make my specific revisions. Working toward #195.

Initial changes:

  • go-multicodec has a released version with this new entry
  • Dealing with the new Selector vs Path complexity which we discussed a few weeks ago (see HTTP Prep: Updating the Request Struct #180 for part of that) and is now here with us.
    • We have to either gracefully handle the case where a user provides a custom selector, or remove the ability to provide anything other than a path+scope. After discussing with @kylehuntsman today, it seems like there may be some desire to keep selectors around, which is fine by me, and we can still decide later to ditch custom selectors. But for HTTP retrievals we just have to rule out the possibility of a custom selector with that retriever, so it will simply error.
    • RetrievalRequest and CarScope themselves are used to figure out which selector (GetSelector()), what the path & querystring are for the HTTP request (GetUrlPath() - which can error), CarScope#TerminalSelectorSpec() to help with selector building and CarScope#AcceptHeader() to generate the appropriate Accept for that scope.

I'll continue to keep notes in here as I go.

@codecov-commenter
Copy link

codecov-commenter commented Apr 26, 2023

Codecov Report

Merging #204 (c812ef7) into feat/http (c22d369) will increase coverage by 2.27%.
The diff coverage is 84.61%.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##           feat/http     #204      +/-   ##
=============================================
+ Coverage      68.16%   70.43%   +2.27%     
=============================================
  Files             63       64       +1     
  Lines           5443     5496      +53     
=============================================
+ Hits            3710     3871     +161     
+ Misses          1533     1409     -124     
- Partials         200      216      +16     
Impacted Files Coverage Δ
cmd/lassie/flags.go 0.00% <ø> (ø)
pkg/session/session.go 57.14% <45.45%> (-1.95%) ⬇️
pkg/retriever/directcandidatefinder.go 64.80% <50.00%> (-1.31%) ⬇️
pkg/internal/testutil/gen.go 92.94% <62.50%> (-7.06%) ⬇️
pkg/types/request.go 55.95% <64.28%> (-5.96%) ⬇️
pkg/types/types.go 80.89% <70.58%> (-6.83%) ⬇️
pkg/retriever/httpretriever.go 73.33% <77.10%> (+73.33%) ⬆️
pkg/retriever/graphsyncretriever.go 87.57% <87.65%> (-4.03%) ⬇️
...g/retriever/prioritywaitqueue/prioritywaitqueue.go 92.39% <88.46%> (-1.90%) ⬇️
pkg/retriever/parallelpeerretriever.go 93.75% <93.75%> (ø)
... and 5 more

... and 1 file with indirect coverage changes

@willscott
Copy link
Contributor

  • having custom selectors error here is probably fine for the first pass
  • the second pass could do something similar to bifrost-gateway, and fall back to an un-optimized block-by-block bitswap-like request pattern in that case. less efficient than potentially optimal, but will always work.

@rvagg
Copy link
Member Author

rvagg commented Apr 26, 2023

I'm pivoting here a little, refactoring the graphsync retriever so we can share the majority of the code between the two retrievers since the primary flow that we need is the same but the actual retrieving is different. In fact, it's almost close enough that you could just replace the "client" and they'd be the same, but not quite. But the whole mechanics of managing the parallel fetches, coordinating them, even selecting which one to next (as per #197) is very close and I'm finding myself copying and pasting a lot of code to bring this up to scratch already.

@rvagg
Copy link
Member Author

rvagg commented Apr 27, 2023

Pushed a new update - my second (but very rough) try and trying to share logic between the graphsyncretriever and the http one. My initial attempt got too messy, trying to make an asbtract shared type that could be extended by two different retrievers. So my new tack was to just flesh out the graphsyncretriever more so it can serve both protocols, just with a different constructor for each. I'l try and disentangle it more (the switch statements are the main smell in here that I think will drive refactoring tomorrow).

The flow is basically the same between the two, with these differences:

  • Metadata comparison - we have no metadata for HTTP so we only have "connect time" to compare (but...)
  • What is "connect time" and when should we connect for HTTP? The current code starts the requests for all candidates in parallel but defers reading the bodies for the serial portion under the assumption that an upstream provider is going to have backpressure enabled that will stop them buffering up a huge response that we don't read promptly (or at all). That could be a bad assumption and maybe we should do request -> read as one operation and not have a separate "connect" time for providers.

@rvagg
Copy link
Member Author

rvagg commented Apr 28, 2023

Refactored this to a form that I don't hate and I think deals with the concerns quite well.

  • parallelPeerRetriever has the guts of the old Graphsync retriever flow but none of the Graphsync logic
  • TransportProtocol is an interface to the actual protocol guts that a parallelPeerRetriever will use to execute the communication pieces. This is implemented by ProtocolGraphsync and ProtocolHttp where the specific protocol pieces live.

So now, parallelPeerRetriever handles the flow of receiving candidates (async), managing their metadata and deduping, getting them started (connect) in parallel, and then performing a retrieval in serial according to prioritisation.

The TransportProtocol implementations do some of the work of managing and comparing metadata to select the next-best candidate (this will be extended a bit for #92), doing a Connect() and then a Retrieve().

We still have the weirdness about what does a "Connect" mean for HTTP, but that question is separate to this abstraction and we need to answer that at some point. For now I'm still going with "start a request from every candidate as soon as we get them" and then "read the body from one candidate at a time until we get what we need".

@rvagg
Copy link
Member Author

rvagg commented Apr 28, 2023

Aaaaand it lives. Working unit test added, so far just a simple single peer fetch, with a mocked http Client so it doesn't touch a server, just gets the bytes of a CAR and we expect to get the same blocks into our client LinkSystem in the same order. See httpretriever_test.go.

In theory, if you were to compile a lassie and the indexer were to serve you candidates that could do HTTP then it should work. Alternatively a --provider pointing to an endpoint that serves a CAR should do the same (probably just a Kubo node would do).

Copy link
Contributor

@willscott willscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very exciting!

pkg/retriever/graphsyncretriever.go Outdated Show resolved Hide resolved
pkg/retriever/graphsyncretriever.go Show resolved Hide resolved
pkg/retriever/httpretriever.go Outdated Show resolved Hide resolved
@olizilla
Copy link

olizilla commented May 3, 2023

I tested this branch against https://freeway.dag.haus and cid+path works as does --car-scope: file.

The only minor snag i hit was --providers expects a multiaddr with a p2p

~/Code/filecoin-project/lassie on rvagg/http  
❯ ./lassie fetch --providers /dns4/freeway.dag.haus/tcp/443/https bafybeihetumsy24mjzbts7t4vpft2rwput44joxxnzxhfh5woq6z46fe2y
2023-05-03T12:24:48.376+0100	FATAL	lassie	lassie/main.go:57	invalid p2p multiaddr

More here: storacha/freeway#34 (comment)

@willscott
Copy link
Contributor

i wonder if you can use the same p2p identity as eipfs for now? i suspect the http transport client is validating that against anything at the moment

@olizilla
Copy link

olizilla commented May 3, 2023

If you want to run any tests against freeway, anything recently uploaded to web3.storage or via the new w3up api (cli here) should be available via https://freeway.dag.haus ... but it's dag root only. fetching via non-root block cid isn't supported via that gateway yet.

@olizilla
Copy link

olizilla commented May 3, 2023

@willscott that's what i did in storacha/freeway#34 (comment) (it works)

@rvagg rvagg marked this pull request as ready for review May 8, 2023 09:44
@rvagg
Copy link
Member Author

rvagg commented May 8, 2023

Sorry, have tuned out notifications for a few days, catching up on some backlog.

The only minor snag i hit was --providers expects a multiaddr with a p2p

This is because we're heavily peer ID centric in the Lassie codebase. You can just stick a random peer ID on the end of your multiaddr for --providers and it'll work anyway.

Screenshot 2023-05-04 at 8 55 17 am

Mainly we use that for internal disambiguation between parallel fetches (only relevant for --providers if you provide >1), metrics reporting (not enabled by default of course), and also will be used for stats collection for peer prioritisation in long-running instances (i.e. daemon). Not ideal UX though if you just want http and just want to give a simple form of the address so we should probably have a way to deal with this.

@rvagg rvagg force-pushed the rvagg/http branch 2 times, most recently from f6e8366 to a8cc6f8 Compare May 8, 2023 10:28
@rvagg
Copy link
Member Author

rvagg commented May 8, 2023

:old-man-shakes-fist-at-mock-timers: (I finally got flaky tests sorted out, after much struggle, using a horrible goroutine scheduling hack, but we shall speak of that no further for now.)

So the tests are complete-enough for now, there's not much more I can test for this basic functionality. All the most interesting stuff will come with #195.

So I'm going to merge this into feat/http for now; so this code can be better reviewed in #193 and I can start working on #195.

@olizilla I should have the basics of #195 in place in the next few days, it's when things might get more interesting because right now, using lassie on this branch is only really testing that you have an HTTP endpoint that responds to a certain path string with a valid CAR, and that's it. We don't care whether it's a v1 or v2, what it says its roots are, or even what, or how many, blocks are in it. #193 will bring in all the strictness we need according to ipfs/specs#402 and where discrepancies or deficiencies in implementations will become more obvious.

@rvagg rvagg merged commit e72f174 into feat/http May 8, 2023
@rvagg rvagg deleted the rvagg/http branch May 8, 2023 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants