Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compact RSA public key representation #235

Closed
clehner opened this issue Sep 30, 2021 · 8 comments
Closed

Compact RSA public key representation #235

clehner opened this issue Sep 30, 2021 · 8 comments

Comments

@clehner
Copy link
Contributor

clehner commented Sep 30, 2021

RSA public keys can be represented using ASN.1/DER RSAPublicKey in #233. However, there is a concern that potential implementers may consider this a legacy format: w3c-ccg/did-method-key#41 (comment). Addressing this could be an opportunity to find a more compact encoding, as there is some overhead with the RSAPublicKey DER encoding, as it encodes a data structure format that is not really needed here, as the data structure may be understood from the context (multiformats/multicodec and the particular multicodec entry): we just need to represent two positive integers.

In Example 1 in #233 (comment), a RSA public key is represented in 270 bytes, that is 256 bytes for the 2048-bit modulus, 3 bytes for the (public) exponent, 1 byte for the (sometimes needed) leading zero in the modulus, and 10 remaining bytes for the data structure.
@expede and @b5, how would you feel about approaching a more compact representation?

I think this may be more work for some implementers, do to wide support of RSAPublicKey in ASN.1/DER; but I think @dlongley is right that new implementations might not in fact have such support.

Is anyone aware of an appropriate encoding here? As we are in the context multiformats/multicodec, I would think to reuse varint. However some implementations of multicodec (and did:key) might not actually have varint implemented, as they may rely on the static encoded types. The readme lists only two implementations: https://github.com/multiformats/unsigned-varint#implementations. Perhaps a simpler format may be preferable? On the other hand, as an implementer I believe I could work with that, using the Rust implementation.
If a simpler format is preferred, unless there is some existing standard compact and canonical representation for this key type, I would propose the following: one byte for the length of the exponent, followed by the exponent, followed by the modulus.
As integers, the exponent and modulus would be encoded the same as in ASN.1 (after dropping the leading zero where appropriate) and as in JWK (before Base64Url encoding). I think it is the same in OpenSSH public key format as well.

@expede
Copy link
Contributor

expede commented Oct 1, 2021

I'm all for exploring more compact representations. The smaller we can get these, the better. Agreed that 10 bytes of structure seems unnecessary. I'm happy with an unsigned varint, though it's certainly harder to debug looking directly at the raw output. Given that the key is going to be one of a few fixed lengths, do we want a varint specifically? Is the MSB overhead worthwhile in this context?

Further tricks like omitting the exponent (and assuming it's 65537) feel too far 😱

@rvagg
Copy link
Member

rvagg commented Oct 1, 2021

Also keep in mind that the more novel and custom things are in crypto the more difficulty you'll have with standards bodies and implementers actually adopting your thing. Making something new to shave off a few bytes might get points docked when trying to get broad adoption.

@b5
Copy link
Contributor

b5 commented Oct 1, 2021

I'm biased toward the view @rvagg is cautioning us on.

While I don't sleep as well knowing that there's a data structure that isn't actually needed, I do find this kind of thing relying a little too heavily on multicodec. I interpret the design intent behind multicodec to be a multiplexing prefix on existing serialization formats, not a source of serialization logic itself. If we can't point to an existing RSA key representation that matches this compactness, then I think we've found the right balance of adoption potential and wire efficiency.

With that said, @clehner, you mentioned that OpenSSH uses a similar approach, so I personally could be won over if there were folks outside of OpenSSH using OpenSSH-style public key representations. In that case we're just adding momentum to an existing adoption effort.

@expede
Copy link
Contributor

expede commented Oct 1, 2021

I interpret the design intent behind multicodec to be a multiplexing prefix on existing serialization formats, not a source of serialization logic itself.

Oh yeah, that's a good way to view it 👍

standards bodies and implementers actually adopting your thing

On the flip side, over in the W3C did:key repo folks are raising concerns about the existing common RSA formats: w3c-ccg/did-method-key#41 (comment)

I can see a lot of did:key resolver libraries not wanting to take on the legacy code required to parse DER / ASN.1 to validate RSA keys ... so enabling that to be avoided should be a consideration as well.

At the end of the day, an RSA public key is 3 numbers. We can tightly pack them, or use a different serialization format. RSA keys are pretty big at baseline, but agreed that usability is a factor.

FWIW I just want interop and not a huge amount of overhead (some is fine). I'm happy with any well-specified encoding, as long as we don't end up with massive headers that come with some formats.

@clehner
Copy link
Contributor Author

clehner commented Nov 2, 2021

Hi, sorry to leave this hanging.

I put together a comparison of some possible RSA public key representations, here:
https://celehner.com/2021/11/rsapk/

Here is a copy of the table from that page:

Representation Bytes for Reference Key
RSAPublicKey (ASN.1) 270
COSE Key 268
ssh-rsa (SSH Public Key) 279
ssh-rsa without prefix string 268
ssh-rsa without prefix string or modulus length 264
4-byte exponent length 263
1-byte exponent length 260

The "reference key" is the 2048-bit key "Example 1" from #233 (comment).
RSAPublicKey (ASN.1) is the representation we have in multicodec currently.
The representation called "1-byte exponent length" is the one mentioned in #235 (comment) as "one byte for the length of the exponent, followed by the exponent, followed by the modulus".

If we want to stick with a standard format, COSE Key (in Canonical CBOR) is slightly more compact than RSAPublicKey/ASN.1. If we want to compact things further, the "1-byte exponent length" representation is the best I can see right now.

I see it would be unusual to apparently define a format in multicodec, rather than reference an existing format. If we were to use a new format, would it be better to define it somewhere else - maybe even in the did:key specification itself? That could be proposed alongside the test vectors, or in an appendix, etc., and referred to like "RSA public key, according to the did:key method v0.7".

Or if RSAPublicKey/ASN.1 is good enough, I could mark w3c-ccg/did-method-key#41 as ready for review as it is now.

@OR13
Copy link
Contributor

OR13 commented Nov 16, 2021

Don't invent a new key format for RSA to save a few bytes, it will cause lots of translation bugs, and prevent standards adoption.

Pick a standard key format, JWK, CWK, ASN.1, and map that to the registered prefix.

Pop the prefix off and use off the shelf libraries.

The space saving is not worth the translation bugs, especially for RSA.

did:key is not the place to define new key formats, it's the place to use existing / standard ones with DIDs.

The less innovation we do on key formats, the better IMO (both in this repo and in did:key).

I'm in favor of keeping things as is (ASN.1) or using (CWK).

@expede
Copy link
Contributor

expede commented Nov 18, 2021

Or if RSAPublicKey/ASN.1 is good enough, I could mark w3c-ccg/did-method-key#41 as ready for review as it is now.

I'm in favor of keeping things as is (ASN.1) or using (CWK).

I just want interop

@clehner I think the general consensus is to keep with ASN.1, and we can add other versions (with their own prefixes) if and when there's a strong use case for something different

@clehner
Copy link
Contributor Author

clehner commented Nov 19, 2021

Thanks @OR13 and @expede for the feedback.

I think CWK might make more sense as its own multicodec type, as it is not RSA-specific but rather is a more general key representation (like JWK or SPKI) - and maybe for its own DID method rather than in did:key which uses multicodec for key type.

I am seeing the consensus to keep the current entry as is (ASN.1/DER RSAPublicKey). I think this issue can be closed unless there is reason to discuss it further.

@clehner clehner closed this as completed May 25, 2022
@clehner clehner reopened this May 25, 2022
@clehner clehner closed this as not planned Won't fix, can't repro, duplicate, stale May 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants