There is a whole field of authentication in cryptography that is often under-discussed (at least in my opinion).
See, we often like to talk about how key exchanges can also be authenticated, by mean of public-key infrastructures (e.g. HTTPS) or by pre-exchanging secrets (e.g. my previous post on sPAKE), but we seldom talk about post-handshake authentication.
Post-handshake authentication is the idea that you can connect to something (often a hardware device) insecurely, and then "augment" the connection via some information provided in a special out-of-band channel.
But enough blabla, let me give you a real-world example: you link your phone with your car and are then asked to compare a few digits (this pairing method is called "numeric comparison" in the bluetooth spec). Here:
Out-of-band channel. you are in your car, looking at your screen, this is your out-of-band channel. It provides integrity (you know you can trust the numbers displayed on the screen) but do not necessarily provide confidentiality (someone could look at the screen through your window).
Short authenticated string (SAS). The same digits displayed on the car's screen and on your phone are the SAS! If they match you know that the connection is secure.
This SAS thing is extremely practical and usable, as it works without having to provision devices with long secrets, or having the user compare long strings of unintelligible characters.
How to do this? You're probably thinking "easy!". An indeed it seems like we could just do a key exchange, and then pass the output in some KDF to create this SAS.
This has been discussedlong-and-large on the internet: with key exchange protocols like X25519 it doesn't work.
The reason is that X25519 does not have contributory behavior: I can send you a public key that will lead to a predictable shared secret. In other words: your public key does not contribute (or very little) to the output of the algorithm.
The correct™ solution here is to give more than just the key exchange output to your KDF: give it your protocol transcript. All the messages sent and received. This puts any man-in-the-middle attempts in the protocol to a stop. (And the lesson is that you shouldn't naively customize a key exchange protocol, it can lead to real world failures.)
a sPAKE is first and foremost a PAKE, which stands for Password-Authenticated Key Exchange.
This simply means that authentication in the key exchange is provided via the knowledge of a password.
The s (resp. b) in front means symmetric (resp. balanced). This indicates that both sides know the password.
Other PAKEs where only one side knows the password exist, these are called aPAKE for asymmetric (or augmented) PAKEs.
Yes I know the nomenclature is a bit confusing :)
The most promising sPAKE scheme currently seems to be SPAKE2, which is in the process of being standardized here.
There are other sPAKEs, like Dragonfly which is used in WPA3, but they don't seem to provide as strong properties as SPAKE2.
The trick to a symmetric PAKE is to use the password to blind the key exchange's ephemeral keypairs.
Note that we can't use the password as is, instead we:
Pass the password into a memory-hard hash function like Argon2 to obtain w. Can you guess why we do this? (leave a comment if you do!)
Convert it to a group element. To do this we simply consider w a scalar and do a scalar multiplication with a generator of our subgroup (M or N depending if you're the client or the server, can you guess why we use different generators?)
NOTE: If you know BLS or OPAQUE, you might be wondering why we don't use a "hash-to-curve" algorithm, this is because we don't need to obtain a group element with an unknown discrete logarithm in SPAKE2.
Once the blinded (with the password) public keys have been exchanged, both sides can compute a shared group element:
Alice computes K = h × alice_private_key × (S - w × N)
Bob computes K = h × bob_private_key × (T - w × M)
Spend a bit of your time to understand these equations.
What happens is that both Alice and Bob first unblind the public key they've received, then perform a key exchange with it, then multiply it with the value h. What's this value h? The cofactor, or simply put: the other annoying subgroup.
Finally Alice and Bob hash the whole transcript, which is the concatenation of:
The message Bob sent S.
The message Alice sent T.
The shared group element K.
The hardened password w.
The hash of this transcript gives us two things:
A shared secret !
A key that is further expanded (via a KDF) to obtain two authentication keys.
These authentication keys sole purpose is to provide key confirmation in the last round-trip of messages.
That is to say at this point, if we don't do anything, we don't know if either Alice or Bob truly managed to compute the shared secret.
Key confirmation is pretty simple, both sides just have to compute an authentication tag with one of the authentication key produced over the transcript.
The final protocol looks a bit dense, but you should be able to decipher it if you've read this far.
Authentication is an overloaded word in cryptography.
In the context of cryptographic primitives like message authentication codes (MACs) and authenticated encryption with associated data (AEAD), authentication really refers to authenticity or integrity. And as the Cambridge dictionary says:
Authenticity. the quality of being real or true.
The poems are supposed to be by Sappho, but they are actually of doubtful authenticity.
The authenticity of her story is beyond doubt.
The proof is in the pudding. When talking about the security properties of primitives like MACs, cryptography talks about unforgeability, which does relate to authenticity.
So whenever you hear things like "is this payload authenticated with HMAC?", think authenticity, think integrity.
In the context of protocols though (e.g. TLS) authentication refers to identification: the concept of proving who you are.
So whenever you hear things like "Is the server authenticated?", think "identities are being proven".
This dual sense really annoys me, but in the end this ambiguity is encompassed in the definition of authentication:
the process or action of proving or showing something to be true, genuine, or valid.
origin/entity authentication. You're proving that an entity really is who they say they are.
message authentication. You're proving that a message is genuine.
Note that an argument against this distinction is the following: to authenticate a message, you need a key. This key comes from somewhere (it's your context, or your "who"). So when you authenticate a message, you are really authenticating the context. This falls short in scenarios where for example you trust the root hash of a merkle tree, which authenticates all of its leaves.
The bottom line is, authentication is about proving that something is what it is supposed to be. And that thing can be a person, or a message, or maybe even something else.
This is not all. In the security world people are confused with authorization vs authentication :)
I've been following the Messaging Layer Security (MLS) standardization a bit.
I really appreciate what the people are doing there, and what they are trying to solve.
I think group messaging is currently a huge mess, as every application I have seen/audited seemed to invent a new way to implement group chat.
A common standard and guidelines would greatly help.
MLS' goal is to provide a solution to end-to-end encryption for group chats. A solution that scales.
If you don't know how the MLS protocol works, I advise you to read Michael Rosenberg's blog post or to watch the Real World Crypto talk on the subject (might not be available at the moment).
Thinking about the standard, I have two questions:
Does a group chat loses any notion of privacy/confidentiality after it gets too large? For example, if you are in a Hong Kong group trying to organize a protest and there are more than 1,000 people in the group, what are the odds that one of them is a cop?
Would a group chat protocol targeting groups with small numbers of participant (let's say 50 at most) be able to provide better security insurances efficiently?
For example, here are two security properties (taken from SoK: Secure Messaging) that MLS does not provide:
Speaker Consistency: All participants agree on the sequence of messages sent by each participant.
This means that if Alice (who is part of a group chat with Bob and Eve) colludes with the server, she can send "I like cats" to Bob and "I like dogs" to Eve.
Global Transcript: All participants see all messages in the same order. Note that this implies speaker consistency
This means that if Alice sends the following messages:
you must decide
a server could re-order these messages so that Bob would see them in the same order, but Eve would see:
you must decide
I have the following open questions:
Are these attacks important to protect against?
Is there an efficient protocol to prevent these attacks for groups of reasonable size?
If we cannot prevent them, can we detect them and warm the users?
If we are willing to change the protocol when going from 2 participants to 3 participants, would be willing to change the protocol when going from N to N+1 participants (where N is the number of participants threshold where confidentiality/privacy fades away)?
This is were everything starts, we now have an open peer-to-peer protocol that everyone on the internet can use to communicate.
The US government introduces the 1991 Senate Bill 266, which attempts to allow "the Government to obtain the plain text contents of voice, data, and other communications when appropriately authorized by law" from "providers of electronic communications services and manufacturers of electronic communications service equipment". The bill fails to pass into law.
Pretty Good Privacy (PGP) - released by Phil Zimmermann.
1993 - The US Government launches a criminal investigation against Phil Zimmermann for sharing a cryptographic tool to the world (at the time crypto exporting laws are a thing).
1995 - Zimmermann publishes PGP's source code in a book via MIT Press, dodging the criminal investigation by using the first ammendment's protection of books.
That's it, PGP is out there, people now have a weapon to fight government surveillance. As Zimmermann puts it:
PGP empowers people to take their privacy into their own hands. There's a growing social need for it. That's why I wrote it.
1995 - The RSA Data Security company proposes S/MIME as an alternative to PGP.
criminal investigation against Zimmermann and PGP is dropped.
PGP Inc is founded by Zimmermann, PGP becomes licensed-software.
GNU Privacy Guard (GPG) - version 0.0.0 released by Werner Koch.
PGP 5 is released.
The original agreement between Viacrypt and the Zimmermann team had been that Viacrypt would have even-numbered versions and Zimmermann odd-numbered versions. Viacrypt, thus, created a new version (based on PGP 2) that they called PGP 4. To remove confusion about how it could be that PGP 3 was the successor to PGP 4, PGP 3 was renamed and released as PGP 5 in May 1997
OpenPGP - This is a definition for security software that uses PGP 5.x as a basis.
GPG version 1.0 released
Extensible Messaging and Presence Protocol (XMPP) is developed by the open source community. XMPP is a federated chat protocol (users can run their own servers) that does not have end-to-end encryption and requires communications to be synchronous (both users have to be online).
2002 - PGP Corporation is formed by ex-PGP members and the PGP license/assets are bought back from Network Associates
We argue that [...] the encryption must provide perfect forward secrecy to protect from future compromises [...] the authentication mechanism must offer repudiation, so that the communications remain personal and unverifiable to third parties
We now have an interesting development: messaging (which is seen as a different way of communication for most people) is getting the same security treatment as email.
2013 - The TextSecure (now Signal) application is introduced, built on top of the TextSecure protocol with Axolotl (now the Signal protocol with the double ratchet) as an evolution of OTR and SCIMP. It provides asynchronous communication unlike other messaging protocols, closing the gap between messaging and email.
Matrix is introduced as a modern alternative to XMPP.
All in all, I should be the perfect user for PGP. Competent, enthusiast, embedded in a similar community. But it just didn't work.
WhatsApp now uses the Signal protocol, adding end-to-end encryption for its billions of users.
Another unexpected development: security professionals are now giving up on encrypted emails, and are moving to secure messaging.
Is messaging going to replace email, even though it feels like a different mean of communication?
Moxie's quotes are quite interesting:
In the 1990s, I was excited about the future, and I dreamed of a world where everyone would install GPG. Now I’m still excited about the future, but I dream of a world where I can uninstall it.
In addition to the design philosophy, the technology itself is also a product of that era. As Matthew Green has noted, “poking through an OpenPGP implementation is like visiting a museum of 1990s crypto.” The protocol reflects layers of cruft built up over the 20 years that it took for cryptography (and software engineering) to really come of age, and the fundamental architecture of PGP also leaves no room for now critical concepts like forward secrecy.
In 1997, at the dawn of the internet’s potential, the working hypothesis for privacy enhancing technology was simple: we’d develop really flexible power tools for ourselves, and then teach everyone to be like us. Everyone sending messages to each other would just need to understand the basic principles of cryptography. [...]
The GnuPG man page is over sixteen thousand words long; for comparison, the novel Fahrenheit 451 is only 40k words. [...]
Worse, it turns out that nobody else found all this stuff to be fascinating. Even though GPG has been around for almost 20 years, there are only ~50,000 keys in the “strong set,” and less than 4 million keys have ever been published to the SKS keyserver pool ever. By today’s standards, that’s a shockingly small user base for a month of activity, much less 20 years.
the first draft of Messaging Layer Security (MLS) is published, a standard for end-to-end encrypted group chat protocols.
EFAIL releases damaging vulnerabilities against most popular PGP and S/Mime implementations.
In a nutshell, EFAIL abuses active content of HTML emails, for example externally loaded images or styles, to exfiltrate plaintext through requested URLs. To create these exfiltration channels, the attacker first needs access to the encrypted emails, for example, by eavesdropping on network traffic, compromising email accounts, email servers, backup systems or client computers. The emails could even have been collected years ago.
Why do people keep telling me to use PGP? The answer is that they shouldn’t be telling you that, because PGP is bad and needs to go away.
EFAIL is the straw that broke the camel's back. PGP is officially dead.
Matrix is out of beta and working on making end-to-end encryption the default.
Moxie gives a controversial talk at CCC arguing that advancements in security, privacy, censorship resistance, etc. are incompatible with slow moving decentralized protocols. Today, most serious end-to-end encrypted messaging apps use the Signal protocol (Signal, Facebook Messenger, WhatsApp, Skype, etc.)
Let me introduce the problem: Alice owns a private key which can sign transactions. The problem is that she has a lot of money, and she is scared that someone will target her to steal all of her funds.
Cryptography offers some solutions to avoid this being a key management problem.
The first one is called Shamir Secret Sharing (SSS), which is simply about splitting the signing private key into n shares.
Alice can then split the shares among her friends. When Alice wants to sign a transaction, she would then have to ask her friends to give her back the shares, that she can use to recreate the signing private key. Note that SSS has many many variants, for example VSSS allows participants to verify that malicious shares are not being used, and PSSS allows participants to proactively rotate their shares.
This is not great though, as there is a small timeframe in which Alice is the single point of failure again (the moment she holds all the shares).
A logical next step is to change the system, so that Alice cannot sign a transaction by herself.
A multi-signature system (or multisig) would require n participants to sign the same transaction and send the n signatures to the system.
This is much better, except for the fact that n signatures means that the transaction size increases linearly with the number of signers required.
We can do better: a multi-signature system with aggregated signatures. Signature schemes like BLS allow you to compress the n signatures in a single signature. Note that it is currently much slower than popular signature schemes like ECDSA and EdDSA, so there must be a trade off between speed and size.
We can do even better though!
So far one still has to maintain a set of n public keys so that a signature can be verified. Distributed Key Generation (DKG) allows a set of participant to collaborate on the construction of a key pair, and on signing operations.
This is very similar to SSS, except that there is never a single point of failure. This makes DKG a Multi-Party Computation (MPC) algorithm.
The BLS signature scheme can also aggregate public keys into a single key that will verify their aggregated signatures, which allows the construction of a DKG scheme as well.
Interestingly, you can do this with schnorr signatures too! The following diagram explains a simplified version of the scheme:
Note two things:
All these schemes can be augmented to become threshold schemes: we don't need n signatures from the n signers anymore, but only a threshold m of n. (Having said that, when people talk about threshold signatures, they often mean the threshold version of DKG.) This way if someone loses their keys, or is on holiday, we can still sign.
Most of these schemes assume that all participants are honest and by default don't tolerate malicious participants. More complicated schemes made to tolerate malicious participants exist.
Unfortunately all of this is pretty new, and as an active field of study no standard has been decided on one algorithm so far.
That's the difference!
One last thing: there's been some recent ideas to use zero knowledge proofs (ZKP) to do what aggregated signatures do but for multiple messages (because all the previous solutions all signed the same message). The idea is to release a proof that you have verified all the signatures associated to a set of messages. If the zero knowledge proof is shorter than all the signatures, it did its job!
I am now half-way in the writing of my book (I wrote 8 chapters out of 16) and I am already exhausted.
It doesn't help that I started writing right before accepting a new position for a very challenging (and interesting) project.
But here I am, half-way there, and I think I'm onto something. I can't wait to get there and look at the finished project as a real paper book :)
To give you some insight into this process, let me share some thoughts.
I quickly realized that I didn’t know everything about crypto. The book isn’t just a dump of my own knowledge, but rather the fruit of hours of research—sometimes a single page would take me hours of reading before writing a single word.
So when I don't have a full day ahead of me, I use my limited time to read articles and do research in topics that I don't fully understand. This is useful, and I make more progress during the week end once I have time to write.
Revising is hard. If writing a chapter takes some effort X, revising a chapter takes effort X^3 . After each chapter, several people at Manning, and in my circle, provide feedback. At the same time, I realize that there's much more I want to write about subject Y and I start pilling up articles and papers that I want to read before I revise the chapter. I end up spending a TON of effort revising and re-visiting chapters.
Getting feedback is hard. I am lucky, I know a lot of people with different levels of knowledge in cryptography. This is very useful when I want to test how different audiences read different chapters. Unfortunately people are good at providing good feedback, and bad at providing bad feedback. And only the bad feedback ends up being useful feedback. If you want to help, [the first chapters are free to read](https://www.manning.com/books/real-world-cryptography?a_aid=Realworldcrypto&a_bid=ad500e09
) and I'm ready to buy you a beer for some constructive negative feedback.
Laying out a chapter is hard. Writing a blog is relatively easy. It's short, self-contained, and often something I've been thinking about for weeks, months, before I put it into writing. Writing a chapter for a book is more like writing a paper: you want it to be perfect. Knowing a lot about the subject makes this even more difficult: you know you can make something great and not achieving that would be disappointing. One strategy that I wish I would have more time to spend on is the following one:
create a presentation about the subject of a chapter
give the presentation and observe what diagrams need revisiting and what parts are hard for an audience to understand
after many iterations put the slides into writing
I'm convinced this is the right approach, but I am not sure how I could optimize for this. If you're in SF and wants me to give you a presentation on one of the chapter of the book, leave a comment here :)
There are several cryptocurrencies that are doing really interesting things, Algorand is one of them.
Their breakthrough was to make a leader-based BFT algorithm work in a permissionless setting (and I believe they are the first ones who managed to do this).
At the center of their system lies a cryptography sortition algorithm. It's quite interesting, so I made a video to explain it!
PS: I've been doing these videos for a while, and I still don't have a cool intro, so if you want to make me a cool intro please do :D