Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reframe peer records #279

Closed
wants to merge 2 commits into from
Closed

Reframe peer records #279

wants to merge 2 commits into from

Conversation

aschmahmann
Copy link
Contributor

WIP but wanted to put out an initial draft for some thoughts and discussion.

Background

The Reframe protocol gives us a way to build request-response protocols across multiple transports in a data format agnostic way. The first target use case of this protocol was to be able to handle delegated operation of the various types of requests that the IPFS Public DHT handles. At the moment getting provider records as well as putting and getting IPNS records are supported. The remaining function that can be fully delegated to a third party DHT client is getting peer records (putting provider records and peer records into the DHT are not performable in a delegated way due to protocol limitations described in libp2p/go-libp2p-kad-dht#584).

Motivation

Enabling Reframe to support getting peer records will allow us to start building tooling like a standalone delegated routing server that allows consumers of content to delegate their requests to the server which in turn can query the IPFS Public DHT, Indexers, or any other routing system. More functions will still need to be added going forward to support users who want to make themselves or their content available using this protocol, but those can come later.

Proposal

The simplest proposal here would be to just have a method that requests the multiaddresses for a given PeerID. However, given that libp2p signed peer records are prevalent in the network it should be possible for peers to get those as well and choose to prioritize them if available.

Unfortunately, the libp2p signed peer records are signed protobufs which means: 1) they're not particularly Reframe "friendly" even though we could just send them as bytes 2) we can't reformat the data in a Reframe friendly way because the encoded protobuf is signed and the way in which the data is encoded into the protobuf is non-canonical.

The initial proposal allows the responder send back both unsigned lists of multiaddresses and opaque signed peer records

type GetPeerAddressesRequest struct {
ID Bytes # libp2p PeerID
RecordTypes [String]
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having RecordTypes doesn't seem necessary as the responder could just send back all record types it knows about. Is there value in being able to ask to send back less data here? The same could be argued in other places (e.g. not sending back TransferProtocols we don't support anyway).

I'm planning to remove this, but wanted to leave this as a place for comment if there was demand for it to stay

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do have it, we should have some recommendation of how we identify types - what are these strings supposed to be?

It doesn't seem overly harmful to have, but it does add another point of complexity in implementation.

Copy link
Contributor

@petar petar Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1: if we are to have it, it should be a union of sorts that becomes the formal spec of known record types.

if we are to have it, there must also be a case for "return all types", otherwise we preclude a dumb middlebox from pre-fetching data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the whole, this does seem as an unnecessary micro-optimization (at complexity cost), because the number of records a peer can ever have is limited by the number of types that exist, which is a small number. this is not something that can grow arbitrarily, so ... i would vote no here. the rule of thumb being: can the number of returned results grow in a runtime-dependent manner (e.g. # of providers for a cid depends on runtime conditions, vs number of address records is always capped by the number of types of records which is a compile-time constant)

Comment on lines +299 to +300
| Multiaddresses "multiaddrs"
| Libp2pSignedPeerRecord "769" // the libp2p signed peer record entry interpreted as decimal https://github.com/multiformats/multicodec/blob/f5dd49f35b26b447aa380e9a491f195fd49d912c/table.csv#L129
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't sure which identifiers to use here. In particular it feels weird to be mixing text and codes like this without any sort of standard for how we do this.

@Stebalien has occasionally proposed utilizing / (since it's the 0x90 codec) as a way to bridge text + codes although I'm not sure if that applies here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have codes for some of them already, can we make a code for the reframe-multiaddrs alternative to libp2p signed peer records and standardize on codes?

}

type Multiaddresses [Bytes] # Each element in the list is the binary representation of a complete multiaddr without a peerID suffix
type Libp2pSignedPeerRecord Bytes # https://github.com/libp2p/specs/blob/8c967a22bfbaff1ae41072b358fdba7c5883b6a4/RFC/0003-routing-records.md
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including raw bytes like this doesn't feel great, although in order to use this peers will likely need to understand this format to send addresses in Identify anyhow so not so bad.

Are there better ideas here given that we can't just do something like the below due to the use of (non-canonically encoded) protobufs in signed peer records?

type PeerRecord struct {
      Multiaddresses [Bytes]
      SeqNo optional Int
      Signature optional Bytes
}

Copy link
Contributor

@petar petar Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. One variation, purely for convenience, would be:

type Libp2pSignedPeerRecord struct {
  Multiaddresses [Bytes] # copy of the multiaddresses from the record, for the benefit of systems that don't know/want to check signatures
  Libp2pSignedRecord Bytes
}

type GetPeerAddressesRequest struct {
ID Bytes # libp2p PeerID
RecordTypes [String]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do have it, we should have some recommendation of how we identify types - what are these strings supposed to be?

It doesn't seem overly harmful to have, but it does add another point of complexity in implementation.

Comment on lines +299 to +300
| Multiaddresses "multiaddrs"
| Libp2pSignedPeerRecord "769" // the libp2p signed peer record entry interpreted as decimal https://github.com/multiformats/multicodec/blob/f5dd49f35b26b447aa380e9a491f195fd49d912c/table.csv#L129
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have codes for some of them already, can we make a code for the reframe-multiaddrs alternative to libp2p signed peer records and standardize on codes?

```
{"GetPeerAddressesRequest" : {
"ID" : {"/":{"bytes":"AXIUBPnagss"}},
"Record" : {"/":{"bytes":"AXIUBPnagss"}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there presumably isn't a record in the request, this would be in the response

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I didn't update any of the JSON examples yet (marked with TODO)

} representation keyed

type GetPeerAddressesResponse struct {
Records [PeerRecordType]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to return [PeerRecordType] here rather than just a single PeerRecordType since we're allowed to send back a stream of GetPeerAddressesResponse? Probably doesn't hurt, but we might want to build some conventions here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep the array. A receiver might want to forward/use results as soon as the first one arrives. They have no good way of knowing that more are coming back to back.

@BigLep BigLep requested a review from petar April 25, 2022 15:25
@BigLep
Copy link
Contributor

BigLep commented Apr 25, 2022

@petar : can you take a look at this as well?

@BigLep BigLep added this to the go-ipfs 0.14 milestone Apr 25, 2022
@BigLep BigLep removed this from the go-ipfs 0.14 milestone Oct 4, 2022
@BigLep
Copy link
Contributor

BigLep commented Oct 4, 2022

@aschmahmann @willscott @guseggert : from a planning perspective, are any key immediate usecases blocked on this?

@willscott
Copy link
Contributor

I think we can mark this as deprecated - we added something pretty similar in the publish code paths for delegated routing

@aschmahmann
Copy link
Contributor Author

aschmahmann commented Oct 12, 2022

I think we can mark this as deprecated - we added something pretty similar in the publish code paths for delegated routing

The special code in the publishing path doesn't help here. For example, the DHT doesn't always return addresses for who has the content, just their peerID, and you might need to do another lookup for the peer's addresses.

from a planning perspective, are any key immediate usecases blocked on this?

  1. If you want to allow any system that currently uses the IPFS Public DHT to allow delegating out FindProviders queries to something like https://github.com/ipfs-shipyard/someguy you need someguy to do a FindPeer lookups in the background for peer addresses if they're missing.
  2. Users of the DHT that want to delegate out lookups (e.g. if they wanted to use the DHT as a distributed version of dynamic DNS so they can connect to their machine no matter where it is) will be unable to do so.

How key and immediate these are you likely have better perspective on. If this isn't going to happen in the short term though then I'd recommend making the changes to someguy that are needed in the meanwhile.

@willscott
Copy link
Contributor

maybe the other consideration is to expect an updated format around this with double hashing, which is proposing a variant on authenticated records

@lidel
Copy link
Member

lidel commented Oct 20, 2022

I am late to this PR, but I agree with @aschmahmann this is a gap that should be closed.

We have two basic concepts in routing:

  • ipfs routing findprovs does content routing (tell me whi peerids have this cid)
  • ipfs routing findpeer does peer routing (tell me multiaddrs this peerid has currently)

Currently we have "content+peer routing" in form of FindProviders method, but there is no way to use Reframe for delegated peer routing alone.

This means:

  • IPFS node is unable to shut down local DHT client completely, as connecting via /p2p/peerid would no longer work.
  • This also means we are unable to use Reframe as replacement for Kubo RPC /api/v0/dht/findpeer used in https://github.com/libp2p/js-libp2p-delegated-peer-routing – which will be a problem when we have browser nodes trying to do things like bootstrap using only peerid, or find a new Multiaddr for a peer that moved between networks (and values from cached FindProviders response no longer work)

Adding this will be necessary if we want to base delegated routing in Reframe.

@willscott
Copy link
Contributor

I think this shouldn't be hard for indexers to support if needed

@willscott
Copy link
Contributor

I do wonder if we want it just for hash-derived peer records, versus for direct peer records

@hacdias
Copy link
Member

hacdias commented Oct 3, 2023

R.I.P. Reframe.

Partially superseded by #417 which adds peer schema to /routing/v1 API and defines a convention that could be used for returning signed libp2p records in the future.

@hacdias hacdias closed this Oct 3, 2023
@hacdias hacdias deleted the feat/reframe-peer-records branch October 3, 2023 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants