diff --git a/diagrams/VerifierIssuerCollusion.svg b/diagrams/VerifierIssuerCollusion.svg new file mode 100644 index 0000000..98b05de --- /dev/null +++ b/diagrams/VerifierIssuerCollusion.svg @@ -0,0 +1,3 @@ + + + \ No newline at end of file diff --git a/diagrams/VerifiersCollusion.svg b/diagrams/VerifiersCollusion.svg new file mode 100644 index 0000000..3437acd --- /dev/null +++ b/diagrams/VerifiersCollusion.svg @@ -0,0 +1,4 @@ + + + + \ No newline at end of file diff --git a/index.html b/index.html index bf27a80..b0cfe59 100644 --- a/index.html +++ b/index.html @@ -153,11 +153,59 @@ authors: ["Tobias Looker", "Vasilis Kalos", "Andrew Whitehead", "Mike Lodder"], status: "Draft" }, + "CFRG-PAIRING-FRIENDLY": { + title: "Pairing-Friendly Curves", + href: "https://www.ietf.org/archive/id/draft-irtf-cfrg-pairing-friendly-curves-11.html", + authors: ["Yumi Sakemi", "Tetsutaro Kobayashi", "Tsunekazu Saito", "Riad S. Wahby"], + status: "Draft" + }, "BLS-JOSE-COSE": { title: "Barreto-Lynn-Scott Elliptic Curve Key Representations for JOSE and COSE", href: "https://datatracker.ietf.org/doc/draft-ietf-cose-bls-key-representations/", authors: ["Michael B. 
Jones", "Tobias Looker"], status: "Draft" + }, + Taming_EdDSAs: { + title: "Taming the many EdDSAs", + href: "https://eprint.iacr.org/2020/1244", + authors: ["Konstantinos Chalkias", "François Garillot", "Valeria Nikolaenko"], + date: "2020", + publisher: "Cryptology ePrint Archive, Paper 2020/1244", + doi: "10.1007/978-3-030-64357-7_4" + }, + CDL2016: { + title: "Anonymous Attestation Using the Strong Diffie Hellman Assumption Revisited", + href: "https://eprint.iacr.org/2016/663", + authors: ["Jan Camenisch", "Manu Drijvers", "Anja Lehmann"], + date: "2016", + publisher: "Cryptology ePrint Archive, Paper 2016/663" + }, + TZ2023: { + title: "Revisiting BBS Signatures", + href: "https://eprint.iacr.org/2023/275", + authors: ["Stefano Tessaro", "Chenzhi Zhu"], + date: "2023", + publisher: "Cryptology ePrint Archive, Paper 2023/275" + }, + "NISTIR8053": { + title: "NISTIR 8053: De-Identification of Personal Information", + href: "https://nvlpubs.nist.gov/nistpubs/ir/2015/NIST.IR.8053.pdf", + authors: ["Simson L. Garfinkel"], + date: "October 2015" + }, + "Powar2023": { + title: "SoK: Managing risks of linkage attacks on data privacy", + authors: ["J. Powar", "A. R. Beresford"], + publisher: "Proceedings on Privacy Enhancing Technologies", + date: "2023", + href: "https://petsymposium.org/popets/2023/popets-2023-0043.php" + }, + "Pugliese2020": { + title: "Long-Term Observation on Browser Fingerprinting: Users' Trackability and Perspective", + authors: ["G. Pugliese", "C. Riess", "F. Gassmann", "Z. Benenson"], + publisher: "Proceedings on Privacy Enhancing Technologies", + date: "2020", + href: "https://petsymposium.org/popets/2020/popets-2020-0041.php" } }, lint: {"no-unused-dfns": false}, @@ -252,7 +300,7 @@
+ Before reading this section, readers are urged to familiarize themselves + with general security advice provided in the + + Security Considerations section of the Data Integrity specification. +
+ ++The security of the base proof depends on the security properties of the +associated BBS signature. Digital signatures can exhibit a number of +desirable cryptographic properties [[Taming_EdDSAs]]; among these are: +
+EUF-CMA (existential unforgeability under +chosen message attacks) is usually the minimal security property required +of a signature scheme. It guarantees that any efficient adversary who has the +public key \(pk\) +of the signer and has received an arbitrary number of signatures on +messages of its choice (in an adaptive manner), +\(\{(m_i, \sigma_i)\}_{i=1}^{N}\), +cannot output a valid signature \(\sigma^*\) +for a new message \(m^* \notin \{m_i\}_{i=1}^{N}\) +(except with negligible probability). If the attacker does output a valid +signature on a new message, \((m^*, \sigma^*)\), +it is called an existential forgery. +
+SUF-CMA (strong unforgeability under chosen +message attacks) is a stronger notion than EUF-CMA. It guarantees +that any efficient adversary who has the public key \(pk\) +of the signer and has received an arbitrary number of signatures on messages of its +choice, +\(\{(m_i, \sigma_i)\}_{i=1}^{N}\), +cannot output a new valid signature pair \((m^*, \sigma^*)\) +such that \((m^*, \sigma^*) \notin \{(m_i, \sigma_i)\}_{i=1}^{N}\) +(except with negligible probability). Strong unforgeability implies that an +adversary not only cannot sign new messages, but also cannot find a new signature +on an old message. +
+ ++In [[CDL2016]], under some reasonable assumptions, BBS signatures were proven to +be EUF-CMA. Furthermore, in [[TZ2023]], under similar assumptions, BBS signatures +were proven to be SUF-CMA. In both cases the assumptions are related to the +hardness of the discrete logarithm problem, which is not considered secure +against large-scale quantum computers. +
++Under non-quantum computing conditions, [[CFRG-BBS-SIGNATURE]] provides +additional security guidelines to BBS signature suite implementors. Further +security considerations related to pairing-friendly curves are discussed in +[[CFRG-PAIRING-FRIENDLY]]. +
+
+The security of the derived proof is dependent on the security properties of
+the associated BBS proof. Both [[CDL2016]] and [[TZ2023]] prove that a
+BBS proof is a zero-knowledge proof of knowledge of a BBS signature.
+
+As explained in [[CFRG-BBS-SIGNATURE]] this means: +
++a verifying party in receipt of a proof is unable to determine which signature +was used to generate the proof, removing a common source of correlation. In +general, each proof generated is indistinguishable from random even for two +proofs generated from the same signature. ++
+and +
++The proofs generated by the scheme prove to a verifier that the party who +generated the proof (holder/prover or an agent of theirs) was in possession of a +signature without revealing it. ++
+More precisely, verification of a BBS proof requires the original +issuer's public key as well as the unaltered, revealed BBS messages in +the proper order. +
+TODO: We need to add a complete list of privacy - considerations.
TODO: We need to add a complete list of security - considerations.
++Selective disclosure permits a holder to minimize the information +revealed to a verifier to achieve a particular purpose. In prescribing +an overall system that enables selective disclosure, care has to be taken to +minimize the leakage of additional information that was not meant to be disclosed to the +verifier. Such leakage can occur through artifacts of the +system. These artifacts can come from higher layers of the system, such as +the structure of data, or from the lower-level cryptographic primitives. +
++For example, the BBS signature scheme is an extremely space-efficient scheme for +producing a signature on multiple messages, i.e., the cryptographic +signature sent to the holder is a constant size regardless of the +number of messages. The holder can then selectively disclose +any of these messages to a verifier; however, as part of the +signature scheme, the total number of messages signed by the issuer +has to be revealed to the verifier. If such information leakage needs to +be avoided, then it is recommended to pad the number of messages out to a common +length as suggested in the privacy considerations section of +[[CFRG-BBS-SIGNATURE]]. +
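The padding idea above can be sketched as follows. This is a minimal illustration only: the function name `pad_messages`, the block size of 8, and the use of random 32-byte filler values are all assumptions made for the example, not something prescribed by this document or by [[CFRG-BBS-SIGNATURE]].

```python
import secrets


def pad_messages(messages: list[bytes], block: int = 8) -> list[bytes]:
    """Pad a message list up to the next multiple of `block` (hypothetical helper).

    The filler values are random bytes; how padding messages are actually
    chosen is an assumption here and is left to the issuer.
    """
    if block <= 0:
        raise ValueError("block must be positive")
    # Round the count up to a multiple of `block`, with at least one full block.
    target = max(block, -(-len(messages) // block) * block)
    filler = [secrets.token_bytes(32) for _ in range(target - len(messages))]
    return list(messages) + filler


padded = pad_messages([b"statement-%d" % i for i in range(5)])
print(len(padded))
```

With a common block size, credentials carrying 1 to 8 statements all reveal the same total message count to a verifier, at the cost of some extra signed messages.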
++At the higher levels, how data gets mapped into individual statements +suitable for selective disclosure, i.e., BBS messages, is a potential +source of data leakage. This cryptographic suite is able to eliminate many +structural artifacts used to express JSON data that might leak information +(nesting, map, or array position, etc.) by using JSON-LD processing to transform +inputs into RDF. RDF can then be expressed as a canonical, flat format of simple +subject, property, value statements (referred to as claims in the Verifiable +Credentials Data Model [[VC-DATA-MODEL-2]]). In the following, we examine RDF +canonicalization, a general scheme for mapping a verifiable credential in +JSON-LD format into a set of statements (BBS messages) for +selective disclosure. We show that after this process is performed, there +remains a possible source of information leakage, and we show how this leakage +is mitigated via the use of a keyed pseudorandom function (PRF). +
+
+RDF canonicalization can be used to flatten a JSON-LD VC into a set of
+statements. The algorithm is dependent on the content of the VC and
+also employs a cryptographic hash function to help in ordering the
+statements. In essence, how this happens is that each JSON object that
+represents the subject of claims within a JSON-LD document will be assigned an
+id, if it doesn't have an `@id` field defined. Such ids are known as
+blank node ids. These ids are needed to express claims as simple
+subject, property, value statements such that the subject in each claim can be
+differentiated. The id values are deterministically set per
+[[RDF-CANON]] and are based on the data in the document and the
+output of a cryptographic hash function such as SHA-256.
+
+Below we show two slightly different VCs for a set of windsurf sails and their +canonicalization into a set of statements that can be used for +selective disclosure. By changing the year of the 6.1 size sail we see a major +change in statement ordering between these two VCs. If the holder +discloses information about just their larger sails (the 7.0 and 7.8), the +verifier could tell that something changed about the set of sails, i.e., +information leakage. +
++{ + "@context": [ + "https://www.w3.org/ns/credentials/v2", + { + "@vocab": "https://windsurf.grotto-networking.com/selective#" + } + ], + "type": [ + "VerifiableCredential" + ], + "credentialSubject": { + "sails": [ + { + "size": 5.5, + "sailName": "Kihei", + "year": 2023 + }, + { + "size": 6.1, + "sailName": "Lahaina", + "year": 2023 // Will change this to see the effect on canonicalization + }, + { + "size": 7.0, + "sailName": "Lahaina", + "year": 2020 + }, + { + "size": 7.8, + "sailName": "Lahaina", + "year": 2023 + } + ] + } +} ++
+Canonical form of the above VC. Assignment of blank node ids, i.e., the
+_:c14nX labels, is dependent upon the content of the VC and this also
+affects the ordering of the statements.
+
+_:c14n0 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n0 <https://windsurf.grotto-networking.com/selective#size> "7.8E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n0 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> . +_:c14n1 <https://www.w3.org/2018/credentials#credentialSubject> _:c14n4 . +_:c14n2 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n2 <https://windsurf.grotto-networking.com/selective#size> "7"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n2 <https://windsurf.grotto-networking.com/selective#year> "2020"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n3 <https://windsurf.grotto-networking.com/selective#sailName> "Kihei" . +_:c14n3 <https://windsurf.grotto-networking.com/selective#size> "5.5E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n3 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n0 . +_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n2 . +_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n3 . +_:c14n4 <https://windsurf.grotto-networking.com/selective#sails> _:c14n5 . +_:c14n5 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n5 <https://windsurf.grotto-networking.com/selective#size> "6.1E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n5 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> . ++
+Updated windsurf sail collection, i.e., the 6.1 size sail has been updated to +the 2024 model. This changes the ordering of statements via the assignment of +blank node ids. +
++{ + "@context": [ + "https://www.w3.org/ns/credentials/v2", + { + "@vocab": "https://windsurf.grotto-networking.com/selective#" + } + ], + "type": [ + "VerifiableCredential" + ], + "credentialSubject": { + "sails": [ + { + "size": 5.5, + "sailName": "Kihei", + "year": 2023 + }, + { + "size": 6.1, + "sailName": "Lahaina", + "year": 2024 // New sail to update older model, changes canonicalization + }, + { + "size": 7.0, + "sailName": "Lahaina", + "year": 2020 + }, + { + "size": 7.8, + "sailName": "Lahaina", + "year": 2023 + } + ] + } +} ++
+Canonical form of the previous VC. Note the difference in blank node id +assignment and ordering of statements. +
++_:c14n0 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n0 <https://windsurf.grotto-networking.com/selective#size> "6.1E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n0 <https://windsurf.grotto-networking.com/selective#year> "2024"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n1 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n1 <https://windsurf.grotto-networking.com/selective#size> "7.8E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n1 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> . +_:c14n2 <https://www.w3.org/2018/credentials#credentialSubject> _:c14n5 . +_:c14n3 <https://windsurf.grotto-networking.com/selective#sailName> "Lahaina" . +_:c14n3 <https://windsurf.grotto-networking.com/selective#size> "7"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n3 <https://windsurf.grotto-networking.com/selective#year> "2020"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n4 <https://windsurf.grotto-networking.com/selective#sailName> "Kihei" . +_:c14n4 <https://windsurf.grotto-networking.com/selective#size> "5.5E0"^^<http://www.w3.org/2001/XMLSchema#double> . +_:c14n4 <https://windsurf.grotto-networking.com/selective#year> "2023"^^<http://www.w3.org/2001/XMLSchema#integer> . +_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n0 . +_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n1 . +_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n3 . +_:c14n5 <https://windsurf.grotto-networking.com/selective#sails> _:c14n4 . ++
+To prevent such information leakage from the assignment of these blank node ids
+and the ordering they impose on the statements, an HMAC-based PRF is
+run on the blank node ids. The HMAC secret key is shared only between
+the issuer and holder, and each Base Proof generated
+by the issuer uses a new HMAC key. An example of this can be seen in the
+canonical HMAC test vector of [[DI-ECDSA]].
+
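As a rough illustration of running a keyed PRF over blank node ids, the sketch below applies HMAC-SHA-256 to a canonical label and encodes the result. The function name `hmac_label`, the `_:u` prefix, and the base64url encoding are assumptions made for this example; consult [[DI-ECDSA]] for the normative procedure and test vectors.

```python
import base64
import hashlib
import hmac
import secrets


def hmac_label(hmac_key: bytes, canonical_id: str) -> str:
    """Map a canonical blank node id such as 'c14n0' to a keyed pseudorandom label."""
    digest = hmac.new(hmac_key, canonical_id.encode("utf-8"), hashlib.sha256).digest()
    # URL-safe base64 without padding keeps the label compact and printable.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "_:u" + encoded  # prefix chosen for illustration only


# A fresh random key per base proof, known only to the issuer and holder.
key = secrets.token_bytes(32)
for cid in ("c14n0", "c14n1", "c14n2"):
    print(cid, "->", hmac_label(key, cid))
```

Without the key, the new labels reveal nothing about the original `c14nX` assignment, so a change elsewhere in the credential no longer shows up as a visible reordering.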
+
+As discussed in the next section, for BBS to preserve unlinkability we do not
+use HMAC-based blank node ids but produce a shuffled version of
+the ordering based on the HMAC as shown in test vector .
+Note that this furnishes less information hiding concerning blank node
+ids than in the ECDSA-SD approach, since information about the number of
+blank node ids can leak, but it prevents linkage attacks via the
+essentially unique identifiers produced by applying an HMAC to blank node ids.
+
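The shuffling described above can be sketched as: compute an HMAC for each canonical blank node id, sort by the HMAC output, and hand out short sequential labels in that order. The function name `shuffled_label_map` and the `b0`, `b1`, ... label format are assumptions for illustration, not the normative algorithm of this specification.

```python
import hashlib
import hmac


def shuffled_label_map(hmac_key: bytes, canonical_ids: list[str]) -> dict[str, str]:
    """Relabel blank node ids in an order determined by a keyed HMAC."""
    digests = {
        cid: hmac.new(hmac_key, cid.encode("utf-8"), hashlib.sha256).digest()
        for cid in canonical_ids
    }
    # Sorting by HMAC output yields a key-dependent pseudorandom permutation.
    ordered = sorted(canonical_ids, key=lambda cid: digests[cid])
    # Only the position in the permutation is exposed, never the HMAC value.
    return {cid: f"b{index}" for index, cid in enumerate(ordered)}


ids = [f"c14n{i}" for i in range(6)]
print(shuffled_label_map(b"example-key-for-illustration-only", ids))
```

Because the resulting labels are small integers rather than HMAC outputs, they cannot act as unique identifiers on their own, although the total count of blank nodes remains visible.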
+In some uses of VCs it can be important to the privacy of a holder to +prevent the tracking or linking of multiple different verifier +interactions. In particular we consider two important cases (i) verifier to +issuer collusion, and (ii) verifier to verifier collusion. In the +first case, shown in , a verifier +reports back to the original +issuer of the credential on an interaction with a holder. In +this situation, the issuer could track all the holder +interactions with various verifiers using the issued VC. In the second +situation, shown in , multiple +verifiers collude to share +information about holders with whom they have interacted. +
+ + +
+We use the term unlinkability to describe the property of a VC system
+to prevent such "linkage attacks" on holder privacy. Although the term
+unlinkability is relatively new, section 3.3 of [[NISTIR8053]] discusses and
+gives a case study of "Re-identification through Linkage Attacks". A
+systemization of knowledge on linkage attacks on data privacy can be found
+in [[Powar2023]]. The most widespread use of linkage attacks on user privacy
+occurs via the practice of web browser fingerprinting, a survey of which can be
+found in [[Pugliese2020]].
+
+To quantify the notion of linkage, [[Powar2023]] introduces the idea of an +anonymity set. In the VC case we are concerned with here, the +anonymity set would contain the holder of a particular VC and other +holders associated with a particular issuer. The smaller the anonymity +set, the more likely the holder can be tracked across verifiers. Since a signed +VC contains a reference to a public key of the issuer, the starting size for the +anonymity set for a holder possessing a VC from a particular issuer is the +number of VCs issued by that issuer with that particular public/private key pair. +Non-malicious issuers are expected to minimize the number of public/private key +pairs used to issue VCs. Note that the anonymity set idea is similar to the +group privacy concept in [[vc-bitstring-status-list]]. When we use the term +linkage here we generally mean any mechanism that results in a reduction in size +of the anonymity set. +
++Sources of linkage in a VC system supporting selective disclosure: +
++We discuss each of these below. +
++Cryptographic hashes, HMACs, and digital signatures by their nature generate +highly unique identifiers. The output of a hash function such as SHA-256 is, by its +collision resistance properties, guaranteed to be essentially unique given +different inputs and results in a strong linkage, i.e., it reduces the anonymity set +size to one. Similarly, deterministic signature algorithms such as Ed25519 and +deterministic ECDSA will produce essentially unique outputs for different inputs +and lead to strong linkages. +
++This implies that holders can be easily tracked across +verifiers via digital signature, HMAC, or hash artifacts inside VCs and +hence are vulnerable to verifier-verifier collusion and +verifier-issuer collusion. Randomized signature algorithms such as some +forms of ECDSA can permit the issuer to generate many distinct signatures on the +same inputs and send these to the holder for use with different +verifiers. Such an approach could be used to prevent +verifier-verifier collusion based tracking but cannot help with +verifier-issuer collusion. +
+
+Achieving unlinkability requires specially designed cryptographic signature
+schemes that allow the holder to generate what is called a zero
+knowledge proof of knowledge of a signature (ZKPKS). What this means is that
+the holder can take a signature from the issuer in such a
+scheme and compute a ZKPKS to send to a verifier. This ZKPKS cannot be
+linked back to the original signature, but has all the desirable properties of a
+signature, i.e., the verifier can use it to verify, using the issuer's
+public key, that the messages were signed by the issuer and that the messages have not
+been altered. In addition, the holder can generate as many ZKPKSs as
+desired for different verifiers and these are essentially independent
+and unlinkable. BBS is one such signature scheme that supports this capability.
+
+Although the ZKPKS, known as a BBS proof in this document, has +guaranteed unlinkability properties, BBS when used with selective disclosure has +two artifacts that can contribute to linkability. These are the total number of +messages originally signed and the index values for the revealed statements. +See the privacy considerations in [[CFRG-BBS-SIGNATURE]] for a discussion and +mitigation techniques. +
+
+As mentioned in the section on Issuer's Public Keys of
+[[CFRG-BBS-SIGNATURE]], there is the potential threat that an issuer might
+use multiple public keys, with some of those used to track a specific subset of
+users via verifier-issuer collusion. Since the issuer's public
+key has to be visible to the verifier, i.e., it is referenced in the BBS
+proof (derived proof), this can be used as a linkage point if the issuer
+has many different public keys, and particularly if it uses a subset of those
+keys with a small subset of users (holders).
+
+We saw in the section on information leakage that RDF canonicalization uses a
+hash function to order statements and that a further shuffle of the order
+of the statements is performed based on an HMAC. This can leave a fingerprint
+that might allow for some linkage. How strong the linkage is depends on the
+number of blank nodes, essentially JSON objects within the VC, and the number of
+indexes revealed. Given n blank nodes and k disclosed indexes,
+in the worst case this would be a reduction in the anonymity set size by a
+factor of C(n, k), i.e., the number of combinations of size k
+chosen from a set of n elements. One can keep this number quite low by
+reducing the number of blank nodes in the VC, e.g., by keeping the VC short and
+simple.
+
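To get a feel for how quickly the worst-case factor C(n, k) grows, the following sketch evaluates it for a few illustrative values of n and k. The function name `worst_case_reduction` and the sample values are chosen for this example only.

```python
import math


def worst_case_reduction(n_blank_nodes: int, k_disclosed: int) -> int:
    """Worst-case anonymity set reduction factor C(n, k)."""
    return math.comb(n_blank_nodes, k_disclosed)


for n, k in [(4, 2), (6, 2), (10, 3), (20, 5)]:
    print(f"n={n:2d} k={k}: factor {worst_case_reduction(n, k)}")
# n= 4 k=2: factor 6
# n= 6 k=2: factor 15
# n=10 k=3: factor 120
# n=20 k=5: factor 15504
```

A credential with a handful of blank nodes keeps the factor in the single or double digits, while a credential with twenty blank nodes can already reach tens of thousands.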
+In the [[vc-data-integrity]] specification, a number of properties of the +`proof` attribute of a VC are given. Care has to be taken that optional fields +do not provide strong linkage across verifiers. The optional fields include: +id, created, expires, domain, +challenge, and nonce. For example, the optional +created field is a `dateTimeStamp` object which can specify the +creation date for the proof down to an arbitrary sub-second granularity. Such +information, if present, could greatly reduce the size of the anonymity set. If +the issuer wants to include such information, they ought to make it as coarse +grained as possible, relative to the number of VCs being issued over time. +
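One way an issuer might coarsen the `created` value is to truncate it to the start of the day or month before placing it in the proof. The sketch below is an assumption-laden illustration: the function name `coarsen_created` and the two supported granularities are hypothetical, and the appropriate granularity depends on how many VCs the issuer produces per period.

```python
from datetime import datetime, timezone


def coarsen_created(created: datetime, granularity: str = "day") -> str:
    """Truncate a timestamp to the start of its UTC day or month."""
    utc = created.astimezone(timezone.utc)
    if granularity == "day":
        coarse = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    elif granularity == "month":
        coarse = utc.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError("granularity must be 'day' or 'month'")
    # Emit an XML Schema dateTimeStamp-style string in UTC.
    return coarse.strftime("%Y-%m-%dT%H:%M:%SZ")


precise = datetime(2024, 5, 17, 13, 45, 12, 345678, tzinfo=timezone.utc)
print(coarsen_created(precise))           # 2024-05-17T00:00:00Z
print(coarsen_created(precise, "month"))  # 2024-05-01T00:00:00Z
```

All VCs issued within the same period then share an identical `created` value, so the field no longer distinguishes holders within that period.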
+
+The issuer can also compel a holder to reveal certain
+statements to a verifier via the `mandatoryPointers` input used in the
+creation of the Base Proof. See section
+,
+, and
+. By "compel" we mean that
+a generated Derived Proof will not verify unless these statements are revealed
+to the verifier. If such information is required to be disclosed, care should
+be taken that the anonymity set remains sufficiently large.
+
+As discussed in [[Powar2023]], there are many documented cases of +re-identification of individuals from linkage attacks. Hence the holder +is urged to reveal as little information as possible to help keep the anonymity +set large. In addition, it has been shown a number of times that innocuous +seeming information can be highly unique and thus lead to re-identification +or tracking. See [[NISTIR8053]] for a walk-through of a particularly famous +case of a former governor of Massachusetts and [[Powar2023]] for further +analysis and categorization of 94 such public cases. +
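The effect of each additional revealed attribute on the anonymity set can be illustrated with a toy computation. The population below is entirely made up for this example, and the function name `anonymity_set_size` is hypothetical; real populations are far larger, but the shrinking pattern is the same.

```python
def anonymity_set_size(population: list[dict], revealed: dict) -> int:
    """Count the holders whose attributes match everything revealed so far."""
    return sum(
        all(person.get(name) == value for name, value in revealed.items())
        for person in population
    )


# Synthetic holders, invented purely for illustration.
population = [
    {"zip": "02138", "birth_year": 1945, "sex": "M"},
    {"zip": "02138", "birth_year": 1945, "sex": "F"},
    {"zip": "02138", "birth_year": 1980, "sex": "M"},
    {"zip": "02139", "birth_year": 1945, "sex": "M"},
    {"zip": "02139", "birth_year": 1980, "sex": "F"},
]

print(anonymity_set_size(population, {}))                                   # 5
print(anonymity_set_size(population, {"zip": "02138"}))                     # 3
print(anonymity_set_size(population, {"zip": "02138", "birth_year": 1945})) # 2
print(anonymity_set_size(
    population, {"zip": "02138", "birth_year": 1945, "sex": "M"}))          # 1
```

Each attribute looks harmless on its own, yet three of them together single out one holder in this toy population.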
++It ought to be pointed out that maintaining unlinkability, i.e., anonymity, +requires care in the systems holding and communicating the VCs. Networking +artifacts such as IP address (layer 3) or Ethernet/MAC address (layer 2) are +well known sources of linkage. For example, mobile phone MAC addresses can be +used to track users if they revisit a particular access point; this led to +mobile phone manufacturers providing a MAC address randomization feature. +Public IP addresses generally provide enough information to geolocate an +individual to a city or region within a country, potentially greatly reducing +the anonymity set. +
+