Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add WebTransport protocol #4874

Closed
wants to merge 27 commits into from
Closed

Conversation

dgarus
Copy link
Contributor

@dgarus dgarus commented Nov 16, 2023

Description

related to #2993

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start! I've left some comments :)

Comment on lines 85 to 89
pub fn make_server_config_with_cert(
certificate: Certificate,
private_key: PrivateKey,
alpn: Vec<Vec<u8>>
) -> Result<rustls::ServerConfig, certificate::GenError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to find a way to make generation of these certificates deterministic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the corresponding go-libp2p pull request libp2p/go-libp2p#1833.

@@ -79,3 +80,23 @@ pub fn make_server_config(

Ok(crypto)
}

/// Create a TLS server configuration for libp2p.
pub fn make_server_config_with_cert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think special-casing this for webtransport would be better.

.map_err(ConnectError)?;
Connecting::new(connecting, handshake_timeout).await
if wt {
todo!("Here should be part that creates connection to a WT server")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I mentioned in the issue, I don't think we need to support dialing of WT addresses.

let this = self.get_mut();

let incoming = this.incoming.get_or_insert_with(|| {
let accept = this.session.accept_bi();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you Arc the WebTransportSession then you can clone it here and fix the lifetime issues.


/// A certificate fingerprint that is assumed to be created using the SHA256 hash algorithm.
#[derive(Eq, PartialEq, Copy, Clone)]
pub struct Fingerprint([u8; 32]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should eventually upstream (parts) of this to rust-multiaddr and have Protocol::Certhash take a Fingerprint.

Comment on lines 43 to 48
/// Formats this fingerprint as uppercase hex, separated by colons (`:`).
///
/// This is the format described in <https://www.rfc-editor.org/rfc/rfc4572#section-5>.
pub fn to_sdp_format(self) -> String {
self.0.map(|byte| format!("{byte:02X}")).join(":")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for anything SDP related in WebTransport :)

if request.method() == &Method::CONNECT {
let session = WebTransportSession::accept(request, stream, h3_conn)
.await.map_err(|e| WebTransportError::Http3Error(e))?;
Ok((peer_id, session))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to run a noise security handshake here, assuming the type query parameter was set to noise, see https://github.com/libp2p/specs/tree/master/webtransport#security-handshake.

let ext = request.extensions();
let proto = ext.get::<Protocol>();
if Some(&Protocol::WEB_TRANSPORT) == proto {
if request.method() == &Method::CONNECT {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

);

let timeout = self.handshake_timeout.clone();
let fut = if self.web_transport_addr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of deciding this based on a boolean, it would be great if we could first accept the QUIC connection and dynamically decide based on what the client actually does.

In the current state, this means we cannot share a listener on the same port for WT and regular QUIC connections.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the quinn docs correctly, we can do connecting.await which gives us a quinn::Connection. We can then call Connection::handshake_data and downcast that to https://docs.rs/quinn/latest/quinn/crypto/rustls/struct.HandshakeData.html.

From there, we can inspect the ALPN protocol and decide whether this is a regular libp2p QUIC connection or in fact a WT transport connection.

Using that, we can then branch correctly on what we want to do, assuming the user is actually listening on both transports! If the user only said listen_on without /webtransport, then we should reject the connection. Similarly, if we are only listening on /webtransport, we should (probably?) reject regular QUIC connections.

I think to achieve this, we will need to slightly modify our listen_on behaviour to remember, which port we are listening on and try to find an existing listener on the same port to add the specific functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current state, this means we cannot share a listener on the same port for WT and regular QUIC connections.

It would be really nice if we could achieve this but I'd also be okay if we land this in a 2nd iteration.

@dgarus dgarus changed the title An attempt to extend QUIC transport implementation Add WebTransport protocol Dec 2, 2023
@dgarus dgarus changed the title Add WebTransport protocol feat: add WebTransport protocol Dec 2, 2023
@thomaseizinger
Copy link
Contributor

@dgarus Please ping me once you'd like me to take another look here!

@dgarus
Copy link
Contributor Author

dgarus commented Dec 22, 2023

@dgarus Please ping me once you'd like me to take another look here!

Hi! Ok.
If you suggest a lib which we could use for deterministic cert generation, it would be great.

@thomaseizinger
Copy link
Contributor

@dgarus Please ping me once you'd like me to take another look here!

Hi! Ok. If you suggest a lib which we could use for deterministic cert generation, it would be great.

I don't think there is one at the moment :(

In libp2p-webrtc, we solved the problem by letting the user pass a certificate in. That way, they can generate one and keep it around (like save it to disk). It would obviously nicer to generate one deterministically but if the choice is between requiring the user to pass one in and not being able to ship webtransport at all, I'd rather pick the former :)

@thomaseizinger
Copy link
Contributor

See also: #3049

@dgarus
Copy link
Contributor Author

dgarus commented Dec 22, 2023

@dgarus Please ping me once you'd like me to take another look here!

Hi! Ok. If you suggest a lib which we could use for deterministic cert generation, it would be great.

I don't think there is one at the moment :(

In libp2p-webrtc, we solved the problem by letting the user pass a certificate in. That way, they can generate one and keep it around (like save it to disk). It would obviously nicer to generate one deterministically but if the choice is between requiring the user to pass one in and not being able to ship webtransport at all, I'd rather pick the former :)

Really, I don't like the idea of generating a cert deterministically because I think it's bad for security. What is the problem with fresh certs?

@thomaseizinger
Copy link
Contributor

Really, I don't like the idea of generating a cert deterministically because I think it's bad for security. What is the problem with fresh certs?

Can you elaborate the security problem that you are seeing? The essence of the certificate is the private key that it is signed with. The node's private key should be sufficient to produce such a key via a HKDF.

The problem with "fresh" certificates is that they are useless unless you communicate the certificate hash to the remote who wants to connect. If we generate a new certificate every time on startup, all addresses that we have previously given out via identify or what other nodes have stored in their kademlia routing table are now useless.

@dgarus
Copy link
Contributor Author

dgarus commented Dec 22, 2023

Can you elaborate the security problem that you are seeing?

I thought about it one more time, and it looks ok.
If someone gets the host's private key and saved TLS traffic, he can't decrypt it in any case.

@thomaseizinger
Copy link
Contributor

Can you elaborate the security problem that you are seeing?

I thought about it one more time, and it looks ok. If someone gets the host's private key and saved TLS traffic, he can't decrypt it in any case.

If a peer's private key is exposed then we have bigger problems I think :)
Pretty much every protocol builds on the assumption that the private key remains - well - private! So tying the certificate to the key just makes operation of the node easier (only one element to persist between restarts instead of two). Long-term, we should definitely achieve that!

@dgarus
Copy link
Contributor Author

dgarus commented Dec 25, 2023

@thomaseizinger
Hi, Thomas!

this means we cannot share a listener on the same port for WT and regular QUIC connections.

What are your thoughts on sharing an endpoint rather than a listener?
I just tried to find an endpoint with the same socket address and use it again.

Please take a look how I did a noise security handshake. I'm not sure that it's correct.

@dgarus
Copy link
Contributor Author

dgarus commented Jan 9, 2024

@thomaseizinger Hi, Thomas!

this means we cannot share a listener on the same port for WT and regular QUIC connections.

What are your thoughts on sharing an endpoint rather than a listener? I just tried to find an endpoint with the same socket address and use it again.

Please take a look how I did a noise security handshake. I'm not sure that it's correct.

@thomaseizinger Hi!
Do you have time to take a look?

@thomaseizinger
Copy link
Contributor

@thomaseizinger Hi, Thomas!

this means we cannot share a listener on the same port for WT and regular QUIC connections.

What are your thoughts on sharing an endpoint rather than a listener? I just tried to find an endpoint with the same socket address and use it again.
Please take a look how I did a noise security handshake. I'm not sure that it's correct.

@thomaseizinger Hi! Do you have time to take a look?

Sorry for the delay, I was on vacation! Will take a look now.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I've left some comments, let me know what you think!

Comment on lines +34 to +36
h3 = { git = "https://github.com/hyperium/h3" }
h3-quinn = { git = "https://github.com/hyperium/h3" }
h3-webtransport = { git = "https://github.com/hyperium/h3" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there no released versions of these libraries yet? We won't be able to merge this without released versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With git versions, everything is ok:

image

But when I define h3 = "0.0.4" I got two versions 0.0.3 and 0.0.4:

image

And, accordingly, I got a lot of errors.
I stuck with this problem for the first time, do you know how to fix it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because there hasn't yet been a release of h3-webtransport with h3 0.0.4. See https://crates.io/crates/h3-webtransport/0.1.0/dependencies. I'd suggest nudging the maintainers for a new release!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG
Thanks, I'll do it a bit later

Comment on lines +553 to +564
fn same_cert_hashes(&self, cert_hashes: &Vec<Multihash<64>>) -> bool {
if self.webtransport.cert_hashes.len() != cert_hashes.len() {
return false
}
let mut set: HashSet<Multihash<64>> = self.webtransport.cert_hashes
.clone().into_iter().collect();
for hash in cert_hashes {
set.remove(hash);
}

set.is_empty()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we solve this more elegantly using HashSet::symmetric_difference?

}

pub fn server_quinn_config(&mut self
) -> Result<(quinn::ServerConfig, libp2p_noise::Config, Vec<Multihash<64>>), certificate::GenError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quinn is currently a private dependency and it would be great if we can keep it that way!

Comment on lines +108 to +112
#[error(transparent)]
CertificateGenerationError(#[from] certificate::GenError),

#[error("WebTransport error.")]
WebTransportError,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the enum is not non_exhaustive so this would be a breaking change. Could we (in a first version at least) not add these variants? Worst case we could "hide" them in a custom IO error using io::Error::new().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I've not checked where these errors are used and whether we could avoid them otherwise).

let mut h3_conn = h3::server::builder()
.enable_webtransport(true)
.enable_connect(true)
.enable_datagram(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can set this to false, we don't support datagrams.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably de-duplicate this one with a macro_rules! :)

@@ -37,7 +37,7 @@ pub use futures_rustls::TlsStream;
pub use upgrade::Config;
pub use upgrade::UpgradeError;

const P2P_ALPN: [u8; 6] = *b"libp2p";
pub const P2P_ALPN: [u8; 6] = *b"libp2p";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would pub(crate) be enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not reviewed this in detail because I am not sure this is useful to us as long as we don't have deterministic certificates.

Instead of rolling certificates, I'd suggest (something like) the following:

  • Give our users a simple API to generate and parse certificates
  • Allow the user to pass in a list of certificates in the QUIC config, explicitly for WebTransport
  • For the QUIC functionality, we can generate one similarly to how we do it currently
  • When the user calls listen_on for webtransport, we validate the certificates have a valid date range (the webtransport spec requires +/- 14 days I think)

Whilst not super convenient, this allows users to externally manage certificates until we can generate them deterministically. What is important is that we give users some way of restarting a libp2p-node and retain the certificate. WebTransport is of no use if the certificate hashes change constantly.

Does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, @thomaseizinger ! I agree with you.

Give our users a simple API to generate and parse certificates

What should it look like?
I could suggest, for example, defining in the settings a folder where WT will save autogenerated certs. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could suggest, for example, defining in the settings a folder where WT will save autogenerated certs. WDYT?

I would like to avoid interacting with the file system as much as possible. My suggestion would be:

  • libp2p::webtransport::Transport::new takes a list of certificates (of type libp2p::webtransport::Certificate)
  • libp2p::webtransport::Certificate::generate allows users generate a new certificate with certain parameters (validity date etc)
  • libp2p::webtransport::Certificate::{parse,to_bytes} allow users to serialize and deserialize certificates

I would completely leave it up to the user to roll those certificates. For an MVP, that is good enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! I look at it from the side of an application developer, but it's a library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine, a user started node with WT and a few certs. But when certs become outdated, the user has to restart node with a new list of certs. And this restart is a wrong thing, IMHO.
WDYT about adding a new cert to WT cert's list?
Also, I think, to manage certs, users should have the ability to get this list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, then the max age of the cert is two weeks in order to use the certHashes feature. I think for an MVP, it is acceptable having to restart a node every two weeks to rotate the certs.

Let's start with that. We can always add more APIs later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! I look at it from the side of an application developer, but it's a library.

It is very important to take the look of the app developer, otherwise we won't be building useful APIs :)
But we also need to make sure we offer flexible APIs and don't provide some that are too opinionated.

@@ -306,7 +331,7 @@ impl<P: Provider> Transport for GenTransport<P> {
&mut self,
addr: Multiaddr,
) -> Result<Self::Dial, TransportError<Self::Error>> {
let (socket_addr, _version, peer_id) =
let (socket_addr, _version, peer_id, _is_webtransport) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we also need to validate that we aren't dialing a webtransport address.

Comment on lines +226 to +229
let endpoint = match self.get_existed_endpoint(&socket_addr, &cert_hashes) {
Some(res) => res,
None => Self::new_endpoint(self.config.endpoint_config(), Some(server_config), socket)?,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this will work as intended. Using the same endpoint in two Listeners will make them compete, no?

Let's consider the following scenario:

  1. User calls listen_on for /quic-v1
  2. User calls listen_on for /webtransport using the same port
  3. Another peer attempts to connect using the /webtransport address

The incoming connection will be picked up by one of the two listeners. Until we inspect the Connecting struct, we don't actually know whether it is a QUIC or a WebTransport connection.

So I think what we need to do is:

  • At most have 1 Listener for a given socket address
  • The listener needs to be one of 3 modes:
    • Only QUIC
    • Only WT
    • Both
  • When we receive a listen_on for a socket with an existing listener, we need to update its mode
  • For incoming connections, we need to inspect it whether it is a QUIC or WT connection and check whether the listener supports whatever is incoming. If yes, upgrade it accordingly.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll try it

@dgarus
Copy link
Contributor Author

dgarus commented Jan 19, 2024

@thomaseizinger Hi, Thomas!
Glad to see you. I'm going back to the PR on next week.

@thomaseizinger
Copy link
Contributor

Btw @dgarus, if you want to push the deterministic certificate story forward: The next step would be to open an issue with ring to inquire about deterministic APIs. See rustls/rcgen#173 (comment).

@dgarus
Copy link
Contributor Author

dgarus commented Aug 20, 2024

Actual PR
#5564

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants