david wong

Hey! I'm David, the author of the Real-World Cryptography book. I'm a crypto engineer at O(1) Labs on the Mina cryptocurrency, previously I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

Understanding PLONK posted 2 weeks ago

PLONK is the state of the art when it comes to general-purpose proof systems. While it was released in 2019, the paper has recently received some updates, and the scheme is still evolving (with Aztec announcing an UltraPLONK version coming soon). This is the scheme that we use at Mina to compress the size of the blockchain from gigabytes to a fixed size of a dozen kilobytes.

While I don't think the core ideas are the hardest to understand, the scheme packs in a myriad of optimizations, which makes it hard to parse. In this post I hope to add some clarity to some aspects of the scheme. Note that I assume that you have some knowledge of how PLONK works.

How PLONK works, the short version

Ultimately, the idea of PLONK is to prove that some polynomial $f(x)$ vanishes on some domain $H \subset \mathbb{F}$ (and I will ignore the permutation argument, which is just another proof). To prove that, we reduce the problem to some other problem. Incrementally, it looks like this:

  • Proving the previous statement is equivalent to proving that the polynomial is divisible by $Z_H(x)$, the polynomial that has all the elements of $H$ as roots (also called vanishing polynomial).
  • Which is equivalent to proving the following identity (for some quotient polynomial $t$): $$f(x) = t(x) \cdot Z_H(x) \; \; \; \forall x \in \mathbb{F}$$
  • Which is equivalent to proving the identity on some random point $z$ (thanks to the Schwartz-Zippel lemma): $$f(z) = t(z) \cdot Z_H(z)$$
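The chain of reductions above can be sketched with toy numbers. This is only an illustration under stated assumptions: the field is the tiny prime field of size 97, $H$ is the set of 4th roots of unity in it (so $Z_H(x) = x^4 - 1$), and the polynomial `t` is made up. None of these values come from PLONK itself.

```python
import random

P = 97  # toy prime field, far too small for real use

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x, mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def poly_mul(a, b):
    """Schoolbook polynomial multiplication mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_divmod(num, den):
    """Polynomial long division mod P, returns (quotient, remainder)."""
    num = num[:]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    inv_lead = pow(den[-1], -1, P)
    for i in range(len(num) - len(den), -1, -1):
        q = num[i + len(den) - 1] * inv_lead % P
        quot[i] = q
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - q * dj) % P
    return quot, num[:len(den) - 1]

H = [1, 22, 96, 75]          # the 4th roots of unity mod 97
Z_H = [P - 1, 0, 0, 0, 1]    # x^4 - 1, vanishes exactly on H

t = [3, 5, 2]                # made-up quotient polynomial
f = poly_mul(Z_H, t)         # so f vanishes on H by construction

# Step 1: f vanishes on H  <=>  Z_H divides f (zero remainder).
quotient, remainder = poly_divmod(f, Z_H)

# Step 3: check the identity at a single random point z.
z = random.randrange(P)
holds_at_z = poly_eval(f, z) == poly_eval(quotient, z) * poly_eval(Z_H, z) % P
print("remainder:", remainder, "identity holds at z:", holds_at_z)
```

In a field this small the Schwartz-Zippel error probability is noticeable; real deployments use fields of roughly 255 bits so that a false identity passing the random check is negligible.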

To prove the last statement, the prover uses a polynomial commitment scheme (specifically, the KZG scheme) to commit to the polynomials $f$ and $t$. The prover then sends the commitments to the verifier. At that point, the verifier has to check that for some random point $z$

$$ f(z) = t(z) \cdot Z_H(z) $$

This is done by sending a random point $z$ to the prover and doing an "opening" of the commitments at this point: the prover sends the values $f(z)$ and $t(z)$ as well as a proof that these are the correct evaluations.

Prover->Verifier: com(f), com(t)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), t(z)
Prover->Verifier: proofs of opening
Note right of Verifier: checks that f(z) = t(z) Z_H(z)

This is in essence the PLONK protocol, except that this is not really what happens in the paper...

More reductions

The newer PLONK actually does one more reduction of the last statement:

  • As per the previous section: we want to prove that $$f(z) = t(z) \cdot Z_H(z)$$
  • Which is equivalent to proving that $z$ is a root of the polynomial $$f(x) - t(x) \cdot Z_H(x)$$
  • Since the verifier already knows one of the polynomials ($Z_H$), they can evaluate it in advance. So the previous statement is equivalent to proving that $z$ is a root of $$r(x) = f(x) - t(x) \cdot Z_H(z)$$

The last two steps are an optimization (called Maller's optimization) that removes the need for the prover to send $t(z)$, as the verifier can use the commitment to $t$ to produce a commitment to $r$ (to verify the opening proof).

These additional reductions moved us from a protocol in which the prover sends openings to let the verifier check an identity by themselves, to a protocol where the prover simply sends openings.

Prover->Verifier: com(f), com(t)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and validate opening proofs

To verify the opening of $r$ for $x = z$, the verifier will have to reconstruct a commitment to $r$ first. That's easy, it is:

$$com(r) = com(f) - com(t) \cdot Z_H(z)$$

which will use:

  • the commitment to $f$ received during the protocol
  • the commitment to $t$ received during the protocol
  • the evaluation of $Z_H(x)$ at $x=z$ which they can do by themselves

Not so fast... t is too large

If you've read PLONK, you've noticed that the prover actually doesn't send a commitment to $t$ directly, because $t$ is too large and polynomial commitment schemes have an upper bound fixed during the trusted setup. (By the way, $t$ is too large because the permutation argument makes it three times as large due to the three witness polynomials.) To circumvent that limitation, the polynomial $t$ is split into three smaller polynomials $t_{lo}, t_{mid}, t_{hi}$ such that:

$$ t(x) = t_{lo}(x) + x^n \cdot t_{mid}(x) + x^{2n} \cdot t_{hi}(x) $$

This means that in our previous protocol, we can't prove directly that $z$ is a root of

$$r(x) = f(x) - t(x) \cdot Z_H(z)$$

instead we have to prove the equivalent that $z$ is a root of

$$r(x) = f(x) - [t_{lo}(x) + x^n \cdot t_{mid}(x) + x^{2n} \cdot t_{hi}(x)] \cdot Z_H(z)$$

This is not great, as the prover cannot produce a commitment to $r$ anymore. The reason is that $x^n$ and $x^{2n}$ cannot be committed to, as they're larger than the upper bound of our polynomial commitment. Instead, notice that since the verifier already knows these monomials, they can pre-evaluate them at $z$ and ask instead for a proof that $z$ is a root of:

$$r(x) = f(x) - [t_{lo}(x) + z^n \cdot t_{mid}(x) + z^{2n} \cdot t_{hi}(x)] \cdot Z_H(z)$$

which is a fine request, as the verifier can produce the commitment of $r$ needed to verify the opening proof:

$$ com(r) = com(f) - [com(t_{lo}) + z^n \cdot com(t_{mid}) + z^{2n} \cdot com(t_{hi})] \cdot Z_H(z) $$
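To make the split concrete, here is a small sketch under assumed toy parameters (field of size 97, $n = 4$, arbitrary coefficients for $t$). It checks that the three chunks recombine to $t(z)$, and that replacing $x^n$ and $x^{2n}$ with the scalars $z^n$ and $z^{2n}$ gives a polynomial of degree below $n$ that still agrees with $t$ at $z$.

```python
P = 97   # toy prime field (assumption, not a PLONK parameter)
n = 4    # size of the domain H, and maximum size of each chunk

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x, mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

# An arbitrary t with 3n coefficients, i.e. too large for a size-n commitment.
t = [(7 * i + 3) % P for i in range(3 * n)]

# t(x) = t_lo(x) + x^n * t_mid(x) + x^(2n) * t_hi(x)
t_lo, t_mid, t_hi = t[:n], t[n:2 * n], t[2 * n:]

z = 10
zn, z2n = pow(z, n, P), pow(z, 2 * n, P)

# Recombining the three evaluations gives back t(z).
recombined_at_z = (poly_eval(t_lo, z)
                   + zn * poly_eval(t_mid, z)
                   + z2n * poly_eval(t_hi, z)) % P

# With z^n and z^(2n) treated as scalars, the combination is a polynomial of
# degree < n: this is the piece that appears inside com(r).
t_linearized = [(lo + zn * mid + z2n * hi) % P
                for lo, mid, hi in zip(t_lo, t_mid, t_hi)]

print(recombined_at_z == poly_eval(t, z), len(t_linearized))
```

Note that `t_linearized` is a different polynomial from $t$: the two only agree at the point $z$, which is all the opening at $z$ needs.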

At this point, the protocol looks more like this:

Prover->Verifier: com(f)
Prover->Verifier: com(t_lo), com(t_mid), com(t_hi)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and validate opening proofs

Uh-oh, what about f?

The big proof in PLONK really boils down to two things:

  1. The permutation argument, which links the wires in our circuit. I ignore this proof in the post.
  2. the main polynomial $f$, which is our circuit.

Since the polynomial $f$ needs to be constructed such that:

  • it does not leak any non-public information to the verifier
  • it does not allow the prover to change fixed parts of the circuit

the prover and the verifier perform a "polynomial dance" to construct the polynomial together. The end product sort of looks like this:

$$ f(x) = a(x) q_L(x) + b(x) q_R(x) + q_M(x) a(x) b(x) + q_O(x) c(x) + q_C(x) $$

where $a, b, c$ are private polynomials that the prover constructs, commits, and sends to the verifier; and $q_L, q_R, q_M, q_O, q_C$ are public polynomials (the selector polynomials) that both the verifier and the prover can construct (and commit to if necessary).
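To see what the selector polynomials encode, it helps to look at the equation row by row, before anything is interpolated into polynomials. The sketch below assumes the common convention where $q_O = -1$ for both addition and multiplication gates, a toy field of size 97, and a made-up two-gate circuit computing $(x + y) \cdot y$. These choices are illustrative and not taken from the post.

```python
P = 97  # toy prime field (assumption)

def gate(a, b, c, qL, qR, qM, qO, qC):
    """Value of the gate equation on one row; 0 means the row is satisfied."""
    return (a * qL + b * qR + qM * a * b + qO * c + qC) % P

# Selector values fixed by the circuit (public).
ADD = dict(qL=1, qR=1, qM=0, qO=P - 1, qC=0)   # a + b - c = 0
MUL = dict(qL=0, qR=0, qM=1, qO=P - 1, qC=0)   # a * b - c = 0

# Witness values chosen by the prover (private): x = 3, y = 4.
rows = [
    (3, 4, 7, ADD),    # 3 + 4 = 7
    (7, 4, 28, MUL),   # 7 * 4 = 28
]

satisfied = all(gate(a, b, c, **selectors) == 0 for a, b, c, selectors in rows)
print("circuit satisfied:", satisfied)
```

Interpolating each column over the domain $H$ gives the polynomials $a, b, c$ and $q_L, q_R, q_M, q_O, q_C$, and "every row evaluates to 0" becomes "$f$ vanishes on $H$".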

So the end protocol looks more like this:

Prover->Verifier: com(a), com(b), com(c)
Prover->Verifier: com(t_lo), com(t_mid), com(t_hi)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: a(z), b(z), c(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and validate opening proofs

And as in the previous section, the verifier needs to reconstruct a commitment to $r$ before being able to ask for an opening, which is now impossible as we're dealing with multiplication of commitments:

$$ \begin{align} r(x) = \; &a(x) q_L(x) + b(x) q_R(x) + a(x) b(x) q_M(x) + c(x) q_O(x) + q_C(x) \\ & - [t_{lo}(x) + z^n \cdot t_{mid}(x) + z^{2n} \cdot t_{hi}(x)] \cdot Z_H(z) \end{align} $$

but since the prover sends the evaluations of $a, b, c$ at $z$ (with proofs), the verifier can use that to simplify the polynomial $r$ to:

$$ \begin{align} r(x) = \; &a(z) q_L(x) + b(z) q_R(x) + a(z) b(z) q_M(x) + c(z) q_O(x) + q_C(x) \\ & - [t_{lo}(x) + z^n \cdot t_{mid}(x) + z^{2n} \cdot t_{hi}(x)] \cdot Z_H(z) \end{align} $$

Finally, the verifier can produce the commitment of $r$ as:

$$ \begin{align} com(r) = \; &a(z) com(q_L) + b(z) com(q_R) + a(z) b(z) com(q_M) + c(z) com(q_O) + com(q_C) \\ & - [com(t_{lo}) + z^n \cdot com(t_{mid}) + z^{2n} \cdot com(t_{hi})] \cdot Z_H(z) \end{align} $$
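The reason this works is that, once $a(z), b(z), c(z), z^n, z^{2n}$ and $Z_H(z)$ are plain scalars, $r$ is a linear combination of polynomials the verifier holds commitments to. Below is a hedged sketch of that linearity. It uses a deliberately insecure stand-in for the commitment (evaluation at a fixed secret `TAU`, which is neither hiding nor binding) and random polynomials over a toy field, so here $r(z)$ is generally not zero. The point is only that the verifier's formula on commitments matches the commitment of $r$.

```python
import random

P = 97      # toy prime field (assumption)
N = 4       # size of H, so Z_H(x) = x^N - 1
TAU = 45    # stand-in for the trusted setup secret (assumption)
rng = random.Random(0)

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x, mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def commit(coeffs):
    """Toy linear 'commitment': evaluation at TAU. Insecure, illustration only."""
    return poly_eval(coeffs, TAU)

def rand_poly():
    return [rng.randrange(P) for _ in range(N)]

a, b, c = rand_poly(), rand_poly(), rand_poly()
q_l, q_r, q_m, q_o, q_c = (rand_poly() for _ in range(5))
t_lo, t_mid, t_hi = rand_poly(), rand_poly(), rand_poly()
public = (q_l, q_r, q_m, q_o, q_c, t_lo, t_mid, t_hi)

z = rng.randrange(2, P)
a_z, b_z, c_z = poly_eval(a, z), poly_eval(b, z), poly_eval(c, z)
zn, z2n = pow(z, N, P), pow(z, 2 * N, P)
zh_z = (zn - 1) % P

def linear_combo(ql, qr, qm, qo, qc, lo, mid, hi):
    """The formula for r: the same on coefficients and on commitments."""
    gates = a_z * ql + b_z * qr + a_z * b_z * qm + c_z * qo + qc
    quotient = lo + zn * mid + z2n * hi
    return (gates - quotient * zh_z) % P

# Prover side: the actual polynomial r(x), coefficient by coefficient.
r = [linear_combo(*coeffs) for coeffs in zip(*public)]
# Verifier side: the same formula applied to commitments only.
com_r = linear_combo(*(commit(poly) for poly in public))

# What r(z) should equal: f(z) - t(z) * Z_H(z).
f_z = (a_z * poly_eval(q_l, z) + b_z * poly_eval(q_r, z)
       + a_z * b_z * poly_eval(q_m, z) + c_z * poly_eval(q_o, z)
       + poly_eval(q_c, z)) % P
t_z = (poly_eval(t_lo, z) + zn * poly_eval(t_mid, z)
       + z2n * poly_eval(t_hi, z)) % P

print(com_r == commit(r), poly_eval(r, z) == (f_z - t_z * zh_z) % P)
```

In an honest proof $f = t \cdot Z_H$, so the last quantity is 0 and the opening of $com(r)$ at $z$ is expected to be 0.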

There's much more to PLONK. I've skipped the circuit part and the permutation argument, and I've also ignored the big pairing equation at the end. These will be subjects for another post :)


Maller optimization to reduce proof size posted 2 weeks ago

In the PLONK paper, they make use of an optimization from Mary Maller in order to reduce the proof size. This is a note explaining this optimization. If you have no idea what these words are, you might want to skip reading this post :)

Explanation

Maller's optimization is used in the "polynomial dance" between the prover and the verifier to reduce the number of openings the prover sends.

Recall that the polynomial dance is the process where the verifier and the prover form polynomials together so that:

  1. the prover doesn't leak anything important to the verifier
  2. the verifier doesn't give the prover too much freedom

In the dance, the prover can additionally perform some steps that will keep the same properties but with reduced communication.


Let's see the protocol where Prover wants to prove to Verifier that

$$\forall x \in \mathbb{F}, \; h_1(x)h_2(x) - h_3(x) = 0$$

given commitments of $h_1, h_2, h_3$.

Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: h1(s), h2(s), h3(s)
Prover->Verifier: 3 proofs of openings
Note right of Verifier: verifies that h1(s)h2(s) - h3(s) = 0

A shorter proof exists. Essentially, if the verifier already has the opening $h_1(s)$, they can reduce the problem to showing that $s$ is a root of the polynomial

$$ L(x) = h_1(\mathbf{s})h_2(x) - h_3(x) $$

given commitments of $h_1, h_2, h_3$ and the evaluation of $h_1$ at the point $s$.

Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: h1(s), L(s)
Prover->Verifier: 2 proofs of openings
Note right of Verifier: forms polynomial com(L) = h1(s)com(h2) - com(h3)
Note right of Verifier: checks that L(s) = 0
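Here is a rough sketch of the verifier forming $com(L)$ from the commitments it already holds. To stay within the standard library it replaces the elliptic curve with a toy group (the subgroup of prime order 1019 inside the integers mod 2039, written multiplicatively), and it commits by evaluating at a fixed secret `TAU` in the exponent. The group, the secret, and the polynomials are all assumptions for illustration, and opening proofs are not modeled.

```python
p, q, g = 2039, 1019, 4   # g generates the subgroup of prime order q mod p
TAU = 321                 # stand-in for the setup secret (assumption)

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x, mod q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def commit(coeffs):
    """Toy commitment: g^(poly(TAU)). A real scheme would use an SRS instead."""
    return pow(g, poly_eval(coeffs, TAU), p)

h1 = [2, 1]            # 2 + x
h2 = [5, 0, 3]         # 5 + 3x^2
h3 = poly_mul(h1, h2)  # honest prover: h3 = h1 * h2

s = 77                        # verifier's random point
h1_s = poly_eval(h1, s)       # opening sent by the prover

# Verifier: com(L) = h1(s) * com(h2) - com(h3), in multiplicative notation.
# Raising to q - 1 inverts an element of a group of order q.
com_L = pow(commit(h2), h1_s, p) * pow(commit(h3), q - 1, p) % p

# Prover: the polynomial L(x) = h1(s) * h2(x) - h3(x) itself.
h2_padded = h2 + [0] * (len(h3) - len(h2))
L = [(h1_s * c2 - c3) % q for c2, c3 in zip(h2_padded, h3)]

print(com_L == commit(L), poly_eval(L, s))
```

Only a scalar multiplication and a subtraction of commitments are needed, which is why two openings ($h_1(s)$ and $L(s)$) are enough instead of three.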

Notes

Why couldn't the prover open the polynomial $L'$ directly?

$$L'(x) = h_1(x)h_2(x) - h_3(x)$$

By doing

Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: L'(s), 1 proof of opening
Note right of Verifier: forms polynomial com(L') = com(h1)com(h2) - com(h3)
Note right of Verifier: verifies that h1(s)h2(s) - h3(s) = 0

The problem here is that you can't multiply the commitments together without using a pairing (if you're using a pairing-based polynomial commitment scheme), and you can only use that pairing once in the protocol.

If you're using an inner-product-based commitment, you can't multiply commitments at all.

Appendix: Original explanation from the PLONK paper

https://eprint.iacr.org/2019/953.pdf

For completeness, see lemma 4.7 of the paper.


Pairing-based polynomial commitments and Kate polynomial commitments posted last month

There's this thing called a Kate polynomial commitment, which is a polynomial commitment primitive that makes use of pairings. There's an excellent post from Dankrad which I would recommend reading instead of this post. I wrote this as a shorter summary of how you can commit to a polynomial, and then prove any evaluation $f(x) = y$.

Here's how it works:

You have a polynomial $f(x) = x^2 + 3x$

and some public parameters:

$$ SRS = \{[1], [s], [s^2], [s^3]\} = \{G, sG, s^2 G, s^3 G\} $$

where $[x] := xG$ for some generator $G$ of an elliptic curve group, and $s$ is toxic waste (something that no one should know) hidden behind the elliptic curve point $G$ (some people call that "hidden in the exponent").

to commit to $f$

To commit to this polynomial, evaluate it at the unknown point $s$. You can do that by playing with the $SRS$:

$$ [f(s)] := [s^2] + 3 [s] = s^2 G + 3 sG = (s^2 + 3s)G $$
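The same computation can be sketched with nothing but the standard library by swapping the elliptic curve for a toy group: the subgroup of prime order 1019 inside the integers mod 2039, with generator 4. This is an assumption made for illustration. The group is written multiplicatively, so $[x]$ becomes `g**x mod p`, adding points becomes multiplying, and scalar multiplication becomes exponentiation. The secret `TOXIC_S` is only kept around so the result can be checked; in a real setup nobody would know it.

```python
p, q, g = 2039, 1019, 4   # toy group: g has prime order q modulo p (assumption)
TOXIC_S = 123             # the toxic waste; would be destroyed after the setup

# SRS = {[1], [s], [s^2], [s^3]}, i.e. g^(s^i) for i = 0..3.
SRS = [pow(g, pow(TOXIC_S, i, q), p) for i in range(4)]

def commit(coeffs, srs):
    """Commit to a polynomial (coefficients lowest degree first) using only the SRS."""
    acc = 1
    for coeff, base in zip(coeffs, srs):
        acc = acc * pow(base, coeff % q, p) % p
    return acc

f = [0, 3, 1]             # f(x) = x^2 + 3x
com_f = commit(f, SRS)    # [f(s)] = [s^2] + 3[s], computed without knowing s

# Sanity check that peeks at the secret: com_f really is [f(s)].
expected = pow(g, (TOXIC_S**2 + 3 * TOXIC_S) % q, p)
print(com_f == expected)
```

The committer never touches `TOXIC_S`: `commit` only combines the public SRS elements, which is exactly what "playing with the $SRS$" means.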

to prove that $f(\zeta) = a$

One day, the verifier asks "what's the evaluation at $\zeta$?" And the prover responds by sending the answer, $a$, and a proof ($[h(s)]$, see below).

The idea behind the proof

Notice that because $\zeta$ is a root of $f(x)-f(\zeta)$, then for some polynomial $h(x)$:

$$ f(x) - f(\zeta) = (x-\zeta) \cdot h(x) $$

Due to this, $h(x) = \frac{f(x)-f(\zeta)}{x-\zeta}$ must be a valid polynomial.

At a high-level:

  • the verifier will compute what they think $[h(x)]$ should be at some random point $s$
  • the prover will send the actual value $[h(s)]$
  • the verifier will check if they match

This works because the Schwartz-Zippel lemma tells us that two polynomials that are different differ at most points.

The proof

Here's the protocol:

  1. the prover sends the verifier a commitment $[\frac{f(s)-f(\zeta)}{s-\zeta}]=[h(s)]$ evaluated at some random point $s$ (the toxic waste).
  2. the verifier constructs a similar $h(s)$ but with the expected value of $f(\zeta)$ instead: $[\frac{f(s) - a}{s-\zeta}]$. The verifier then checks if it's equal to $[h(s)]$.

Note:

  1. The prover can compute $[h(s)]$ easily, because they can just compute the polynomial $h(x)$ first, and then reconstruct it at $s$ with the $SRS$. $$ h(x) = \frac{f(x)-f(\zeta)}{x-\zeta} = a_0 + a_1x + a_2x^2 + \cdots $$ and then $$ [h(s)] := a_0[1] + a_1[s] + a_2[s^2] + \cdots $$

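    for example, with our previous $f(x) = x^2 + 3x$ and $\zeta = 3$, we have $f(3) = 18$ and $h(x) = \frac{x^2 + 3x - 18}{x - 3} = x + 6$, so $[h(s)] = [s] + 6[1]$.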

  2. The verifier cannot compute their own $[h(s)]$ because they cannot divide by $s$ (remember, nobody knows $s$). They need a pairing. Remember, you want to check the following identity hidden in the exponent (using commitments): $$ \frac{[f(s) - a]}{[s-\zeta]} = [h(s)] $$ But since you can't divide with commitments, you can't compute what's on the left-hand side. You can multiply thanks to pairings though. So instead, you could check the following equation: $$ [f(s) - a] = [(s-\zeta)h(s)] $$ and with pairings, you can multiply $[s-\zeta]$ with the proof $[h(s)]$: $$ e([f(s)] - [a], [1]) = e([s-\zeta], [h(s)]) $$
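The whole opening can be walked through with the polynomial from earlier, $f(x) = x^2 + 3x$ at $\zeta = 3$. The sketch below is a toy under loud assumptions: it uses a small multiplicative group instead of a pairing-friendly curve, and since the standard library has no pairings, the final check cheats by using the secret $s$ directly where a real verifier would compute $e([f(s)] - [a], [1]) = e([s-\zeta], [h(s)])$.

```python
p, q, g = 2039, 1019, 4   # toy group of prime order q (assumption)
TOXIC_S = 123             # known to nobody in a real setup
SRS = [pow(g, pow(TOXIC_S, i, q), p) for i in range(4)]

def commit(coeffs, srs):
    acc = 1
    for coeff, base in zip(coeffs, srs):
        acc = acc * pow(base, coeff % q, p) % p
    return acc

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

def divide_by_linear(coeffs, zeta):
    """Synthetic division by (x - zeta), returns (quotient, remainder)."""
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for i in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[i] + carry * zeta) % q
        quotient[i - 1] = carry
    return quotient, (coeffs[0] + carry * zeta) % q

f = [0, 3, 1]                       # x^2 + 3x
zeta = 3
a = poly_eval(f, zeta)              # claimed evaluation f(zeta)

# Prover: h(x) = (f(x) - a) / (x - zeta), then commit to it with the SRS.
numerator = [(f[0] - a) % q] + f[1:]
h, rem = divide_by_linear(numerator, zeta)
com_f, com_h = commit(f, SRS), commit(h, SRS)

# Verifier: [f(s) - a] should equal [(s - zeta) * h(s)].
# Stand-in for the pairing: multiply "in the exponent" using the secret s.
lhs = com_f * pow(g, (q - a) % q, p) % p
rhs = pow(com_h, (TOXIC_S - zeta) % q, p)
print(h, rem, lhs == rhs)
```

Here `h` comes out as `[6, 1]`, i.e. $h(x) = x + 6$, and the remainder is 0. If the prover lied about $a$, the division would leave a non-zero remainder and no valid $[h(s)]$ would make the check pass.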

I'm now at O(1) Labs working on Mina protocol!!! posted May 2021

Hey reader! I haven't posted in a while, but as this is my blog I'm contracted to talk about life events such as this one. I joined O(1) Labs a bit more than a month ago to work on the Mina cryptocurrency. If you don't know about Mina, check it out, it's pretty cool: it uses recursive zero-knowledge proofs to compress a blockchain into a single proof of 11KB. I hope I got you intrigued! I want to say it is one of the most ambitious projects in the space, if not the most ambitious (but I'm biased). As I'm still relatively new there, I don't have much to say besides that, but you can imagine that my posting will switch to more zero-knowledgy type of stuff very soon!


WTF are these security chips? posted April 2021

There seem to be a few interesting trends in “security via hardware” these days.

The first trend is root-of-trust chips. Integrated TPM-like chips that are like crypto Swiss Army knives as they offer many functionalities out of the box. They resemble discrete TPMs but are instead implemented as a coprocessor to the main processor. This makes these newer chips more resistant to physical MITM attacks (as discrete TPMs simply use a bus to communicate with other components). If you don’t know what a TPM is, it’s just a device that performs cryptographic operations and generally sits somewhere on your motherboard. Examples of these integrated security chips are Microsoft’s Pluton and Apple’s secure enclave.

The second trend is confidential computing. There are two types of specialized hardware here:

  • Programmable integrated secure processors; these are similar to the root-of-trust chips, except that they are programmable: you can push code there and run it in a separate trusted execution environment (TEE). It’s pretty useful for applications that require a trusted computing base (TCB); a core program whose security is critical and that does not need to trust the rest of the system. It’s also useful in “cloud scenarios” where you want to run some computation on a remote machine but want to make sure it runs it correctly. Think about Intel SGX, or ARM TrustZone.
  • Confidential VMs; imagine a hardware hypervisor that can run VMs as enclaves. This is usually much more practical than the enclave created by SGX, as you don’t need to write custom code and there are no memory limitations. But it is not clear to me how much security you lose against physical attacks by doing this (especially when papers like this one seem alarming). AMD SEV does this, and both Azure and GCP have started offerings to leverage this technology.

It can be hard to understand the difference between all these types of specialized hardware, the attacks they prevent, and the features they unlock. But essentially, here’s how I think about the two kinds: they both do great against software attacks (minus complex cryptographic attacks), neither is the best tool in the box against a motivated physical attacker (HSMs are “better”), and only confidential computing cares about custom user code.

But it’s easier to understand the difference by looking at some examples. As I only touch on protocols, you can simply imagine these chips as creating a blackbox for code and data that others can’t see and touch (even with a debugger).

Protecting keys and data with a secure enclave

The simplest use case for hardware security chips is to protect data. To protect keys, it’s easy: just generate them in the secure chip and disallow extraction. If you need ‘em, just ask the secure enclave to perform crypto operations with them. To protect data? Encrypt it! That concept is called file-based encryption (FBE) if you’re encrypting individual files, and full-disk encryption (FDE) if it’s the whole disk. FDE sounds much better, as it’s all or nothing. If you're under the shower and you wet your hair a little, you know you'll have to wash it. That’s what most laptops and desktops use.

In practice, FDE is not that great though: it doesn't take into account how we, human beings, use our devices. We often leave them locked, as opposed to turned off, so that background functionalities can keep running. Computers deal with this by just keeping the data-encryption key (DEK) around, even if your computer is locked. Think about that the next time you go to the restroom at Starbucks, leaving your locked computer unattended. Phones do it a bit better by encrypting different types of files depending on if your phone is locked or turned off. It sounds like a good solution, but Zinkus et al. showed that it’s not that great either.

If done well, this is how you typically hear about disk encryption in the news:

A couple of months ago the highly-publicised case of Apple vs. FBI brought attention to the topic of privacy - especially in the context of mobile devices. Following the 2015 San Bernardino terrorist attack, the FBI seized a mobile phone belonging to the shooter, Syed Farook, with the intent to search it for any additional evidence or leads related to the ongoing investigation. However, despite being in possession of the device, the FBI were unable to unlock the phone and access its contents.

Of course, the user should be authenticated before data can be decrypted. This is often done by asking the user for a PIN or password. A PIN or password is not enough though, as it would allow simple brute-force attacks (especially on 4 or 6-digit PINs). In general, solutions try to tie the DEK to both a user credential and a symmetric key kept on the enclave.

What’s that symmetric key? We all know that you can’t hardcode the same key in every device you produce. This is dumb. You end up with attacks like DUHK where thousands of devices are found hardcoding the same secret (and pwning one device breaks all of them). The solution is a per-device key that is either burned into the chip during manufacturing, or created by the chip itself (so-called physically unclonable functions). For example, each Apple secure enclave has a UID, each TPM has a unique endorsement key and attestation key, each OpenTitan chip has a creator root key and an owner root key, etc.

A randomly generated UID is fused into the SoC at manufacturing time. Starting with A9 SoCs, the UID is generated by the Secure Enclave TRNG during manufacturing and written to the fuses using a software process that runs entirely in the Secure Enclave. This process protects the UID from being visible outside the device during manufacturing and therefore isn’t available for access or storage by Apple or any of its suppliers. sepOS uses the UID to protect device-specific secrets. The UID allows data to be cryptographically tied to a particular device. For example, the key hierarchy protecting the file system includes the UID, so if the internal SSD storage is physically moved from one device to another, the files are inaccessible.

To prevent brute-force attacks, Apple’s secure enclave mixes both the UID key and the user PIN with a password-based KDF (password-hashing function) to derive the DEK. Except that I lied: to allow users to change their PIN quickly, the DEK is actually not derived directly, but instead encrypted by a key-encryption key (KEK).
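The KEK/DEK indirection can be sketched as follows. This is not Apple's construction: it is a minimal illustration that assumes PBKDF2-HMAC-SHA256 as the password-based KDF, uses the per-device UID as the salt, and wraps the DEK with a plain XOR as a placeholder for a real key-wrapping cipher (the standard library has none). All names are hypothetical.

```python
import hashlib
import secrets

def derive_kek(uid: bytes, pin: str, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key-encryption key from the per-device UID and the PIN."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), uid, iterations)

def xor_wrap(key: bytes, kek: bytes) -> bytes:
    """Placeholder for a real key-wrapping cipher. XOR is its own inverse."""
    return bytes(k ^ m for k, m in zip(key, kek))

uid = secrets.token_bytes(32)   # hypothetical per-device secret, never leaves the chip
dek = secrets.token_bytes(32)   # the key that actually encrypts the data

# At setup: only the wrapped DEK is stored.
wrapped = xor_wrap(dek, derive_kek(uid, "1234"))

# PIN change: unwrap with the old KEK, rewrap with the new one.
# The DEK is unchanged, so none of the encrypted data has to be touched.
recovered = xor_wrap(wrapped, derive_kek(uid, "1234"))
rewrapped = xor_wrap(recovered, derive_kek(uid, "987654"))

print(xor_wrap(rewrapped, derive_kek(uid, "987654")) == dek)
```

Because the UID is an input to the KDF, an attacker who copies the wrapped DEK off the device cannot brute-force the PIN elsewhere: every guess has to go through the chip that holds the UID.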

Secure boot with a root-of-trust secure chip

When booting your computer, there are different “stages” that will run until you finally get to the screen you want. One problem users face is viruses and malware, which can infect the boot process. You then run on an evil operating system… To protect the integrity of boot, our integrated secure chips provide a “root of trust”, something that we trust 100% and that allows us to trust other stuff down the line. This root of trust is generally some read-only memory (ROM) that cannot be overwritten, and it’s also called one-time programmable memory as it was written during manufacturing and can’t be changed anymore. For example, when powering up a recent Apple device, the very first code that gets executed is inside Apple’s secure enclave ROM (called the Boot ROM). That Boot ROM is tiny, so usually the only thing it does is:

  • Prepare some protected memory and load the next image there (so-called "boot code").
  • Hash the image and verify its signature against the hardcoded public key in the ROM.
  • Execute that code.
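The chain-of-trust idea in these steps can be sketched in a few lines. One deliberate simplification: instead of verifying a signature against a public key hardcoded in the ROM, this sketch pins the SHA-256 digest of the next stage, because the standard library has no signature verification. The stage names and images are made up.

```python
import hashlib

def digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()

def boot(stages, rom_pinned_digest):
    """Run each stage only if it matches what the previous stage expects.

    `stages` is a list of (name, image, digest expected for the next stage).
    """
    expected = rom_pinned_digest
    executed = []
    for name, image, next_expected in stages:
        if digest(image) != expected:
            raise RuntimeError(f"refusing to run {name}: digest mismatch")
        executed.append(name)       # stand-in for "execute that code"
        expected = next_expected
    return executed

kernel = b"kernel v1"
bootloader = b"bootloader v1"

# The ROM pins the bootloader, and the bootloader pins the kernel.
rom_pinned = digest(bootloader)
stages = [
    ("bootloader", bootloader, digest(kernel)),
    ("kernel", kernel, None),
]

print(boot(stages, rom_pinned))
```

Pinning digests means any update needs the previous stage to change as well. Verifying a signature instead lets the vendor ship new images without touching the ROM, which is why the real design uses a hardcoded public key.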

The next boot loader does the same thing, and so on until it gets to the device’s operating system. This is how updates that are not signed by Apple can’t be installed on your phone.

Confidential Computing with a programmable secure processor

There’s been a new paradigm for the last years: the cloud; big companies running servers to host your stuff. Amazon has AWS, Google has GCP, and Microsoft has Azure. Another way to put this is that people are moving from running things themselves, to running things on someone else’s computer. This of course creates some issues in scenarios where privacy is important. To fix that, confidential computing attempts to offer solutions to run client code without the host being able to see it or modify its behavior. SGX’s primary use case seems to be exactly that these days: clients running code that the servers can’t see or tamper with.

One interesting problem that arises is: how can I trust that the response I got from my request indeed came from SGX, and not some impersonator? This is what attestation tries to solve. There are two kinds of attestation:

  • local attestation, when two enclaves running on the same platform need to communicate and prove to each other that they are secure enclaves
  • remote attestation, when a client queries a remote enclave and needs to make sure that it was a legit enclave that produced the result from the request.

Each SGX chip is provided with unique keypairs at manufacturing time: the Root Sealing Keys. The public key part is then signed by some Intel certificate authority. So the first assumption, if we ignore the assumption that the hardware is secure, is that Intel is correctly signing public keys of secure SGX chips only.

With that in mind, you can now obtain a signed attestation, from Intel's CA, that you're talking to a real SGX enclave, and that it is running some code (at least a proof of its digest), etc.


What is Host Card Emulation (HCE)? posted April 2021

It’s 2020, most people have a computer in their pocket: a smart phone. What is the point of a credit card anymore? Well, not much. Nowadays more and more payment terminals support contactless payment via the near-field communication (NFC) protocol, and more and more smartphones ship with an NFC chip that can act as a credit card. NFC for payment is specified as Card Emulation. Literally: it emulates a bank card. But not so fast, banks will prevent you from doing this unless you have a secure element.

Since Apple has full control over its hardware, it can easily add a secure element to its new iPhones to support payment, and this is what Apple did with an embedded secure element bonded onto the NFC chip since the iPhone 6. The secure element communicates directly with the NFC chip, and in turn to NFC readers; thus a compromise of the phone operating system does not impact the secure element.

Google went a different route, creating the concept of a cloud-based secure element, named Host Card Emulation (HCE), and introduced in 2013 in Android 4.4. How does it work? Google stores your credit card information in a secure element in the cloud (instead of your phone), and only gives your phone access to a short-lived single-use account number. This concept of replacing sensitive long-term information with short-lived tokens is called tokenization. Sending a random card number that can be linked to your real one is great for privacy: merchants can’t track you as it’ll look like you’re always using a new card number. If your phone gets compromised, the attacker only gets access to a short-lived secret that can only be used for a single payment.

Tokenization is a common concept in security: replace the sensitive data with some random stuff, and have a table secured somewhere safe that maps the random stuff to the real data. Although Apple theoretically doesn't have to use tokenization, since iPhones have secure elements that can store the real Primary Account Number (PAN), they do use it in order to gain more privacy (it's after all their new bread and butter).
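As a rough illustration of the "table secured somewhere safe", here is a minimal in-memory token vault. The class name, the 16-digit token format, and the single-use policy are assumptions made for the example, not a description of any payment network's actual scheme.

```python
import secrets

class TokenVault:
    """Maps random single-use tokens to the real account number (PAN)."""

    def __init__(self):
        self._table = {}

    def issue(self, pan: str) -> str:
        # A random 16-digit string that carries no information about the PAN.
        token = "".join(secrets.choice("0123456789") for _ in range(16))
        self._table[token] = pan
        return token

    def redeem(self, token: str):
        # pop() makes the token single-use: a replayed token maps to nothing.
        return self._table.pop(token, None)

vault = TokenVault()
pan = "4111111111111111"        # well-known test card number, not a real account

token = vault.issue(pan)        # what the phone hands to the merchant
first_use = vault.redeem(token) # what the issuer resolves it to
replay = vault.redeem(token)    # a second attempt with the same token

print(first_use == pan, replay)
```

The merchant and a compromised phone only ever see `token`, and the mapping back to the PAN lives solely in the vault.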


One password to rule them all, single sign-on (SSO) and password managers posted April 2021

Password reuse is bad, what can we do about it? Naively, users could use different passwords for different websites, but there are two problems with this approach:

  • Users are bad at creating many different passwords.
  • The mental load required to remember multiple passwords is impractical.

To alleviate these concerns, two solutions have been widely adopted:

  • Single sign-on (SSO). The idea of SSO is to allow users to connect to many different services by proving that they own the account of a single service. This way the user only has to remember the password associated with that one service in order to be able to connect to many services. Think "connect with Facebook" type of buttons, as illustrated below.
  • Password Managers. The previous SSO approach is convenient if the different services you use all support it, but this is obviously not scalable for scenarios like the web. A better approach in these extreme cases is to improve the clients as opposed to attempting to fix the issue on the server side. Nowadays, modern browsers have built-in password managers that can suggest complex passwords when you register on new websites, and can remember all of these passwords as long as you remember one master password.


An example of single sign-on (SSO) on the web. By having an account on Facebook or Google, a user can connect to new services (in this example Airbnb) without having to think about a new password.

The concept of SSO is not new in the enterprise world, but its success with normal end-users is relatively recent. Today, two protocols are the main competitors when it comes to setting up SSO:

  • Security Assertion Markup Language 2.0 (SAML). A protocol using the Extensible Markup Language (XML) encoding.
  • OpenID Connect (OIDC). An extension to the OAuth 2.0 (RFC 6749) authorization protocol using the JavaScript Object Notation (JSON) encoding.

SAML is still widely used, mostly in an enterprise setting, but it is at this point a legacy protocol. OpenID Connect, on the other hand, can be seen everywhere on web and mobile applications. You most likely already used it!

While OpenID Connect allows for different types of flows, let's see the most common use case for user authentication on the web via the authorization code flow:

  1. Alice wants to log into some application, let's say example.com, via an account she owns on cryptologie.net (that's just my blog, but let's pretend that you can register an account on it).
  2. example.com redirects her browser to a specific page of cryptologie.net to request an "authorization code." If she is not logged-in in cryptologie.net, the website will first ask her to log in. If she is already logged-in, the website will still confirm with the user that they want to connect to example.com using their identity on cryptologie.net (it is important to confirm user intent).
  3. cryptologie.net redirects Alice back to example.com which then learns the authorization code.
  4. example.com can then query cryptologie.net with this authorization code to confirm Alice's claim that she owns an account on cryptologie.net, and potentially retrieve some additional profile information about that user.


In OpenID Connect (OIDC), Alice (the end-user in OIDC terms) can authenticate to a service example.com (the relying party) using her already existing account on cryptologie.net (the OpenID provider). For the web, the authorization code flow of OIDC is usually used. It starts by having Alice request an "authorization code" from cryptologie.net (and that can only be used by example.com). example.com can then use it to query cryptologie.net for Alice's profile information (encoded as an ID token), and then associate her cryptologie.net identity with an account on their own platform.

There are many important details that I am omitting here. For example, the authorization code that Alice receives in step 2 must be kept secret, as it can be used to log in as her on example.com. Another example: so that example.com cannot reuse the authorization code to connect as Alice on a different website, the OpenID provider cryptologie.net retains an association between this authorization code and the intended audience (example.com). This can be done by simply storing this association in a database, or by having the authorization code contain this information authenticated (I explained a similar technique with cookies in chapter 3).
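The second option, a code that carries its own authenticated audience, could look something like the sketch below. The code format (Base64URL payload, a dot, then an HMAC-SHA256 tag), the function names, and the 60-second lifetime are all assumptions for illustration. OpenID Connect does not mandate any particular format for authorization codes.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

PROVIDER_KEY = secrets.token_bytes(32)   # known only to the OpenID provider

def issue_code(user: str, audience: str, ttl: int = 60) -> str:
    """Create a code that names the user and the only site allowed to redeem it."""
    payload = json.dumps({"sub": user, "aud": audience,
                          "exp": int(time.time()) + ttl}).encode()
    tag = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(tag).decode())

def redeem_code(code: str, relying_party: str):
    """Return the user if the code is authentic, fresh, and meant for this site."""
    try:
        payload_b64, tag_b64 = code.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None
    expected = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    claims = json.loads(payload)
    if claims["aud"] != relying_party or claims["exp"] < time.time():
        return None
    return claims["sub"]

code = issue_code("alice", "example.com")
print(redeem_code(code, "example.com"), redeem_code(code, "evil.example"))
```

One trade-off of this stateless design: nothing stops the same code from being redeemed twice within its lifetime, whereas the database approach can delete the entry on first use.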

By the way, the proof given to example.com in step 4 is called an ID token in OpenID Connect, and is represented as a JSON Web Token (JWT) which is just a list of JSON-encoded data.


In OpenID Connect, if Alice wants to log in on example.com via her account on another website (an OpenID provider), example.com eventually needs to obtain what is called an "ID token." An ID token is some user information encoded via the JSON Web Token (JWT) standard. A JWT is simply the concatenation of three Base64URL-encoded parts separated by dots: a JSON header, a JSON payload, and a signature. In the picture, the website https://jwt.io lets you decode a JWT and learn what every field stands for. In our browser-based example, example.com uses TLS to communicate with (and authenticate) the OpenID provider. Thus, the ID token it receives can be trusted. In other OpenID Connect flows (used, for example, by mobile applications) the ID token can be provided directly by the user, and can thus be tampered with. In this case, an ID token contains a signature which can be verified using the OpenID provider's public key.
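Decoding a JWT by hand takes only a few lines, which is a good way to see that structure. The sketch below builds a sample token and splits it back apart. It signs with HS256 (an HMAC with a shared key) purely because that is available in the standard library. Real ID tokens are typically signed with the provider's private key so that anyone can verify them with the public key. The key and claims are made up.

```python
import base64
import hashlib
import hmac
import json

def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def b64url_decode(part: str) -> bytes:
    # JWTs strip the Base64 padding, so add it back before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

# Build a sample token (demo key and claims are assumptions).
key = b"demo-key-not-a-real-secret"
header = {"alg": "HS256", "typ": "JWT"}
claims = {"iss": "https://cryptologie.net", "sub": "alice", "aud": "example.com"}
signing_input = (b64url_encode(json.dumps(header).encode()) + "."
                 + b64url_encode(json.dumps(claims).encode()))
signature = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
token = signing_input + "." + b64url_encode(signature)

def decode_unverified(jwt: str):
    """Split a JWT into (header, claims, signature) without checking anything."""
    header_b64, claims_b64, sig_b64 = jwt.split(".")
    return (json.loads(b64url_decode(header_b64)),
            json.loads(b64url_decode(claims_b64)),
            b64url_decode(sig_b64))

def verify_hs256(jwt: str, shared_key: bytes) -> bool:
    signed_part, _, sig_b64 = jwt.rpartition(".")
    expected = hmac.new(shared_key, signed_part.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(sig_b64))

print(decode_unverified(token)[1], verify_hs256(token, key))
```

Anyone can read the claims without the key, since they are only encoded and not encrypted. The signature only protects integrity, so nothing decoded this way should be trusted before it is verified.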

Authentication protocols are often considered hard to get right. OAuth 2.0, the protocol OpenID Connect relies on, is notorious for being easy to misuse (see RFC 6819: OAuth 2.0 Threat Model and Security Considerations). On the other hand, OpenID Connect is very well specified. Make sure that you follow the standards and that you look at best practices; this can save you from a lot of trouble.


Here's another example of a pretty large company deciding not to follow this advice. In May 2020, the "Sign-in with Apple" SSO flow, which departed from OpenID Connect, was found to be vulnerable. Anyone could have obtained a valid ID token for any Apple account, just by querying Apple's servers.


SSO is great for users, as it reduces the number of passwords they have to manage, but it does not remove passwords altogether. The user still has to use passwords to connect to OpenID providers. If you're interested to know how cryptography can help to hide passwords, read the rest of this content in my book Real-World Cryptography.


Cryptography and assembly code posted March 2021

Thanks to Filippo streaming his adventures rewriting Golang assembly code into "cleaner" Golang assembly code, I discovered the Avo assembly generator for Golang.

This post is not necessarily about Golang, but Golang is a good example as its standard library is probably the best cryptographic standard library of any programming language.

At dotGo 2019, Michael McLoughlin presented on his Avo tool. In the talk he mentions that there are 24,962 x86 assembly lines in Golang's standard library, and most of it is in the crypto package. A very "awkward" place where "we need very high performance, and absolute correctness". He then shows several examples that he describes as "write-once code".


The talk is really interesting and I recommend you check it out.

I personally spent days trying to understand Golang's SHA-3 assembly implementation. I even created a Go Assembly by Example page to help me in this journey. And I ended up giving up. I just couldn't understand how it worked, the thing didn't make sense. Someone had written it with their own mental model of how they wanted to pass data around. It was horrible.

It's not just a problem of Golang. Look at OpenSSL, for example, which most cryptographic applications and libraries rely on. It contains a huge amount of assembly code to implement cryptography, and that assembly code is sometimes generated by unintelligible perl code.

There are many more good examples out there. The BearSSL TLS implementation by Thomas Pornin, the libsodium cryptographic library by Frank Denis, and the extended Keccak code package by the Keccak team all use assembly code to produce fast cryptography.

We're making such a fuss about readable, auditable, simple and clear cryptographic implementations, but most of that has been thrown out of the window in the quest for performance.

The real problem, from a reviewer's perspective, is that assembly is getting us much further away from the specification. As the role of a reviewer is to match the implementation to the specification, it makes the job hard, perhaps impossible.

Food for thought...
