Hey! I'm David, cofounder of zkSecurity and the author of the Real-World Cryptography book. I was previously a crypto architect at O(1) Labs (working on the Mina cryptocurrency), before that I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.
In this second video, I give some intuition on how to think about zero-knowledge proof systems, with the example of proving the solution of a sudoku, then I give an overview of what I'll explain in this series of videos.
I recently got into general-purpose zero-knowledge proof systems (cryptographic primitives that allow you to prove the execution of a program without revealing some of the inputs), specifically the state-of-the-art PLONK proof system. This is a series of videos I made to explain what I understood and learned in the past few months. There might be some inaccuracies, so I apologize in advance for that. You can check all the videos via the playlist here: https://www.youtube.com/watch?v=RUZcam_jrz0&list=PLBJMt6zV1c7Gh9Utg-Vng2V6EYVidTFCC
In this first video, I simply explain what general-purpose zero-knowledge proofs are, specifically zk-SNARKs, and what PLONK is.
The inner product argument is the following construction: given the commitments (for now let's say the hash) of two vectors $\vec{a}$ and $\vec{b}$ of size $n$ and with entries in some field $\mathbb{F}$, prove that their inner product $\langle \vec{a}, \vec{b} \rangle$ is equal to $z$.
There exist different variants of this inner product argument. In some versions, none of the values ($\vec{a}$, $\vec{b}$ and $z$) are given, only commitments. In another version, which is the one interesting to us and that I will explain here, only $\vec{a}$ is unknown.
How is that useful?
Inner product arguments are useful for several things, but what we're using them for in Mina is polynomial commitments. The rest of this post won't make too much sense if you don't know what a polynomial commitment is, but briefly: it allows you to commit to a polynomial $f$ and then later prove its evaluation at some point $s$. Check my post on Kate polynomial commitments for more on polynomial commitment schemes.
How does that translate to the inner product argument though? First, let's see our polynomial $f$ as a vector of coefficients:
$$
\vec{f} = (f_0, \cdots, f_n) \text{ such that } f(x) = f_0 + f_1 x + f_2 x^2 + \cdots + f_n x^n
$$
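The evaluation $f(s)$ is then the inner product of this coefficient vector with the powers of $s$. Here is a minimal Python sketch of that observation. The modulus `P`, the example polynomial and the helper names are all hypothetical, picked only for illustration (this is not Mina's field):

```python
# Toy prime field modulus (an assumption for illustration only).
P = 97

def inner_product(a, b):
    """Inner product of two equal-length vectors over F_P."""
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b)) % P

def powers(s, n):
    """The vector (1, s, s^2, ..., s^(n-1)) over F_P."""
    return [pow(s, i, P) for i in range(n)]

def evaluate(coeffs, s):
    """Horner evaluation of f(s), with coeffs = (f_0, f_1, ...)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * s + c) % P
    return acc

f = [3, 0, 5, 7]   # hypothetical f(x) = 3 + 5x^2 + 7x^3
s = 10

# f(s) computed as <f, (1, s, s^2, s^3)>
z = inner_product(f, powers(s, len(f)))
```

So proving an evaluation of a committed polynomial is the same as proving an inner product where one vector is secret and the other one is public.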
The idea behind Bootleproof-type of inner product argument
The inner product argument protocol I'm about to explain was invented by Bootle et al. It was later optimized in the Bulletproof paper (hence why we unofficially call the first paper bootleproof), and then some more in the Halo paper. It's the latter optimization that I'll explain here.
A naive approach
So before I get into the weeds, what's the high-level? Well first, what's a naive way to prove that we know the pre-image of a hash $h$, the vector $\vec{a}$, such that $\langle\vec{a}, \vec{b}\rangle = z$? We could just reveal $\vec{a}$ and let anyone verify that indeed, hashing it gives out $h$, and that it also verifies the equation $\langle\vec{a}, \vec{b}\rangle = z$.
$$
\boxed{\begin{aligned}
& \langle \vec{a}, \vec{b} \rangle = z\\
& \text{given } \vec{b} \text{, } z \text{, and a hash of } \vec{a}
\end{aligned}}
\; \overleftarrow{\text{open proof}} \; \boxed{\vec{a}}
$$
Obviously, we have to reveal $\vec{a}$ itself, which is not great. But we'll deal with that later, trust me. What we want to tackle first here is the proof size, which is the size of the vector $\vec{a}$. Can we do better?
Reducing the problem to a smaller problem to prove
The inner product argument reduces the opening proof by using an intermediate reduction proof:
$$
\boxed{\begin{aligned}
& \langle \vec{a}, \vec{b} \rangle = z\\
& \text{given } \vec{b} \text{, } z \text{, and a hash of } \vec{a}
\end{aligned}}
\; \overleftarrow{\text{reduction proof}} \;
\boxed{\begin{aligned}
& \langle \vec{a'}, \vec{b'} \rangle = z'\\
& \text{ given } \vec{b'} \text{, } z' \text{, and a hash of } \vec{a'}
\end{aligned}}
\; \overleftarrow{\text{open proof}} \; \boxed{\vec{a'}}
$$
Where the size of $\vec{a'}$ is half the size of $\vec{a}$, and as such the final opening proof ($\vec{a'}$) is half the size of our naive approach.
The reduction proof is where most of the magic happens, and this reduction can be applied many times ($\log_2(n)$ times to be exact) to get a final opening proof of size 1. Of course the entire proof is not just the final opening proof of size 1, but all the elements involved in the reduction proofs. It can still be much smaller than the original proof of size $n$.
So most of the proof size comes from the multiple reduction subproofs that you'll end up creating on the way. Our proof is really a collection of miniproofs or subproofs.
One last thing before we get started: Pedersen hashing and commitments
To understand the protocol, you need to understand commitments. I've used hashing so far, but hashing with a hash function like SHA-3 is not great as it has no convenient mathematical structure. We need algebraic commitments, which will allow us to prove things on the committed value without revealing the value committed. Usually what we want is some homomorphic property that will allow us to add commitments together and/or multiply them together.
For now, let's see a simple non-hiding commitment: a Pedersen hash. To commit to a single value $x$ simply compute:
$$ x G $$
where the discrete logarithm of $G$ is unknown. To open the commitment, simply reveal the value $x$.
We can also perform multi-commitments with Pedersen hashing. For a vector of values $(x_1, \cdots, x_k)$, compute:
$$ x_1 G_1 + \cdots + x_k G_k $$
where each $G_i$ is distinct and has an unknown discrete logarithm as well.
I'll often shorten the last formula as the inner product $\langle \vec{x}, \vec{G} \rangle$ for $\vec{x} = (x_1, \cdots, x_k)$ and $\vec{G} = (G_1, \cdots, G_k)$.
To reveal a commitment, simply reveal the values $x_i$.
Pedersen hashing allows commitments that are non-hiding but binding, as you can't open them to a different value than the one originally committed.
And as you can see, adding the commitment of $x$ and $y$ gives us the commitment of $x+y$:
$$xG + yG = (x+y)G$$
which will be handy in our inner product argument protocol.
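To see the homomorphism in action, here is a minimal Python sketch of a Pedersen hash over a vector. Everything about the setting is an assumption made for illustration: I use the subgroup of squares modulo the small safe prime `P = 2039` instead of an elliptic curve, the group is written multiplicatively (so $x_1 G_1 + \cdots + x_k G_k$ becomes a product of powers), and the basis is a hardcoded list of squares. These parameters are far too small to be secure.

```python
# Toy group: the order-Q subgroup of squares modulo the safe prime P = 2Q + 1.
# Far too small to be secure; it only illustrates the algebra.
P = 2039
Q = 1019

# Hypothetical basis: squares mod P, so each element lies in the order-Q subgroup.
# In a real setting nobody should know the discrete logs relating them.
BASIS = [4, 9, 25, 49]

def pedersen_hash(values, basis=BASIS):
    """Non-hiding commitment prod(G_i^x_i), i.e. x_1 G_1 + ... in additive notation."""
    assert len(values) <= len(basis)
    acc = 1
    for x, g in zip(values, basis):
        acc = (acc * pow(g, x % Q, P)) % P
    return acc

def add_commitments(c1, c2):
    """The group operation (written as addition in the post)."""
    return (c1 * c2) % P

x = [1, 2, 3, 4]
y = [10, 20, 30, 40]

# Adding the two commitments...
lhs = add_commitments(pedersen_hash(x), pedersen_hash(y))
# ...gives the commitment of the entry-wise sum of the two vectors.
rhs = pedersen_hash([(a + b) % Q for a, b in zip(x, y)])
```

Opening is just revealing `x` and letting the other side recompute `pedersen_hash(x)`.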
The protocol
Set up
Here are the settings of our protocol. Known only to the prover is the secret vector
$$\vec{a} = (a_1, a_2, a_3, a_4)$$
The rest is known to both:
$\vec{G} = (G_1, G_2, G_3, G_4)$, a basis for Pedersen hashing
$A = \langle \vec{a}, \vec{G} \rangle$, the commitment of $\vec{a}$
$\vec{b} = (b_1, b_2, b_3, b_4)$, the powers of some value $s$ such that $\vec{b} = (1, s, s^2, s^3)$
the result of the inner product $z = \langle \vec{a}, \vec{b} \rangle$
For the sake of simplicity, let's pretend that this is our problem, and we just want to halve the size of our secret vector $\vec{a}$ before revealing it. As such, we will only perform a single round of reduction. But you can also think of this step as being already the reduction of another problem twice as large.
We can picture the protocol as follows:
The prover first sends a commitment to the polynomial $f$.
The verifier sends a point $s$, asking for the value $f(s)$. To help the prover perform a proof of correct evaluation, they also send a random challenge $x$.
The prover sends the result of the evaluation, $z$, as well as a proof.
Prover->Verifier: com(f)
Verifier->Prover: s, random x
Prover->Verifier: z = f(s), proof of opening
Does that make sense? Of course what's interesting to us is the proof, and how the prover uses that random $x$.
Reduced problem
First, the prover cuts everything in half. Then they use $x$ to construct linear combinations of these cuts:
This is how the problem is reduced to $\langle \vec{a'}, \vec{b'} \rangle = z'$.
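Concretely, here is one way to write these linear combinations for our vectors of size 4. Which half gets $x$ and which half gets $x^{-1}$ is an assumption here (conventions differ between write-ups); what matters is that $\vec{b}$ and $\vec{G}$ use the opposite choice from $\vec{a}$, so that the terms of the original inner product keep a coefficient of 1:

$$
\begin{aligned}
\vec{a'} &= x^{-1} (a_1, a_2) + x (a_3, a_4)\\
\vec{b'} &= x (b_1, b_2) + x^{-1} (b_3, b_4)\\
\vec{G'} &= x (G_1, G_2) + x^{-1} (G_3, G_4)
\end{aligned}
$$

Expanding the new inner product under that assumption gives:

$$
\langle \vec{a'}, \vec{b'} \rangle = \langle \vec{a}, \vec{b} \rangle + x^{-2} (a_1 b_3 + a_2 b_4) + x^{2} (a_3 b_1 + a_4 b_2)
$$

so $z'$ differs from $z$ only by two cross terms, each scaled by a power of the challenge $x$.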
At this point, the prover can send $\vec{a'}$, $\vec{b'}$, and $z'$ and the verifier can check if indeed $\langle \vec{a'}, \vec{b'} \rangle = z'$. But that wouldn't make much sense, would it? Here we also want:
a proof that proving that statement is the same as proving the previous statement ($\langle \vec{a}, \vec{b} \rangle = z$)
a way for the verifier to compute $z'$, $\vec{b'}$, and $A'$ (the new commitment) by themselves.
The actual proof
The verifier can compute $\vec{b'}$ as they have everything they need to do so.
What about $A'$, the commitment of $\vec{a'}$ which uses the new $\vec{G'}$ basis? It should be the following value:
Similarly to $A'$, the verifier can recompute $z'$ from the previous value $z$ and two scalar values $L_z$ and $R_z$ which the prover needs to provide.
So in the end, the proof has become:
the vector $\vec{a'}$ which is half the size of $\vec{a}$
the $L_a, R_a$ curve points (around two field elements, if compressed)
the $L_z, R_z$ scalar values
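To convince yourself that these updates work, here is a minimal Python sketch of a single round. It is a toy under loud assumptions: the curve points $G_i$ are replaced by plain scalars modulo a small prime `Q` (only linearity matters for this check), the fold puts $x^{-1}$ on the left half of $\vec{a}$ and $x$ on its right half (with the opposite choice for $\vec{b}$ and $\vec{G}$), and $L_a, R_a, L_z, R_z$ are taken to be the cross terms that appear when expanding the folded inner products. All names and values are hypothetical, and `pow(x, -1, Q)` needs Python 3.8 or later.

```python
# Toy scalar field (an assumption for illustration).
Q = 1019

def ip(u, v):
    """Inner product over F_Q."""
    return sum(a * b for a, b in zip(u, v)) % Q

def fold(lo, hi, c_lo, c_hi):
    """Entry-wise c_lo * lo + c_hi * hi over F_Q."""
    return [(c_lo * l + c_hi * h) % Q for l, h in zip(lo, hi)]

# Stand-ins: curve points are replaced by scalars, since only linearity matters here.
a = [5, 7, 11, 13]                       # prover's secret vector
s = 3
b = [pow(s, i, Q) for i in range(4)]     # public vector (1, s, s^2, s^3)
G = [101, 202, 303, 404]                 # hypothetical "basis" (scalars, not real points)

A = ip(a, G)                             # commitment to a
z = ip(a, b)                             # claimed inner product

x = 17                                   # verifier's random challenge
x_inv = pow(x, -1, Q)

# Prover: cross terms sent as part of the proof.
L_a, R_a = ip(a[:2], G[2:]), ip(a[2:], G[:2])
L_z, R_z = ip(a[:2], b[2:]), ip(a[2:], b[:2])

# Prover: folded secret vector, half the size of a.
a_new = fold(a[:2], a[2:], x_inv, x)

# Verifier: folds the public vectors, then updates A and z using the cross terms.
b_new = fold(b[:2], b[2:], x, x_inv)
G_new = fold(G[:2], G[2:], x, x_inv)
A_new = (A + x_inv**2 * L_a + x**2 * R_a) % Q
z_new = (z + x_inv**2 * L_z + x**2 * R_z) % Q

# Final check on the revealed a_new.
commitment_ok = ip(a_new, G_new) == A_new
inner_product_ok = ip(a_new, b_new) == z_new
```

Note that the verifier never touches `a` in this sketch: it only uses `A`, `z`, the challenge, the four cross terms and the revealed `a_new`.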
We can update our previous diagram:
Prover->Verifier: com(f)
Verifier->Prover: s, random x
Prover->Verifier: z = f(s)
Prover->Verifier: a', L_a, R_a, L_z, R_z
In our example, the naive proof was to reveal $\vec{a}$ which was 4 field elements. We are now revealing instead 2 + 2 + 2 = 6 field elements. This is not great, but if $\vec{a}$ was much larger (let's say of size 128), the halved proof would be 64 + 2 + 2 = 68 field elements. Not bad, no? We can do better though... Stay tuned for the next post.
PLONK is the state of the art when it comes to general-purpose proof systems. While it was released in 2019, the paper has recently received some updates, and the scheme is still evolving (with Aztec announcing an UltraPLONK version coming soon). This is the scheme that we use at Mina to compress the size of the blockchain from gigabytes to a fixed size of a dozen kilobytes.
While I don't think the core ideas are the hardest to understand, the scheme packs in a myriad of optimizations, which makes it hard to parse. In this post I hope to add some clarity to some aspects of the scheme. Note that I assume that you have some knowledge of how PLONK works.
How PLONK works, the short version
Ultimately, the idea of PLONK is to prove that some polynomial $f(x)$ vanishes on some domain $H \subset \mathbb{F}$ (and I will ignore the permutation argument, which is just another proof). To prove that, we reduce the problem to some other problem. Incrementally, it looks like this:
Proving the previous statement is equivalent to proving that the polynomial is divisible by $Z_H(x)$, the polynomial that has all the elements of $H$ as roots (also called vanishing polynomial).
Which is equivalent to proving the following identity (for some quotient polynomial $t$):
$$f(x) = t(x) \cdot Z_H(x) \; \; \; \forall x \in \mathbb{F}$$
Which is equivalent to proving the identity on some random point $z$ (thanks to the Schwartz-Zippel lemma):
$$f(z) = t(z) \cdot Z_H(z)$$
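The chain of reductions above can be sketched in a few lines of Python. This is a toy with hypothetical parameters: the field is $\mathbb{F}_{97}$, the domain $H$ is the four 4th roots of unity (so $Z_H(x) = x^4 - 1$), and the quotient `t` is made up. With such a small field the random check has a noticeable chance of being fooled, which is why real systems use fields of roughly 255 bits.

```python
import random

P = 97      # toy prime field (an assumption for illustration)
W = 22      # element of order 4 in F_97, since 22^2 = -1 mod 97
H = [pow(W, i, P) for i in range(4)]    # the domain {1, 22, 96, 75}

def evaluate(coeffs, x):
    """Horner evaluation over F_P, coefficients from low to high degree."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def multiply(f, g):
    """Schoolbook polynomial multiplication over F_P."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out

Z_H = [P - 1, 0, 0, 0, 1]   # vanishing polynomial x^4 - 1
t = [2, 5, 1]               # hypothetical quotient polynomial
f = multiply(t, Z_H)        # f is divisible by Z_H by construction

# Statement 1: f vanishes on every element of H.
vanishes_on_H = all(evaluate(f, h) == 0 for h in H)

# Last statement: the identity holds at a random point z.
z = random.randrange(P)
identity_holds = evaluate(f, z) == (evaluate(t, z) * evaluate(Z_H, z)) % P

# A polynomial that does not vanish on H fails the check with the same t.
g = f[:]
g[0] = (g[0] + 1) % P
cheat_passes = evaluate(g, z) == (evaluate(t, z) * evaluate(Z_H, z)) % P
```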
To prove the last statement, the prover uses a polynomial commitment scheme (specifically, the KZG scheme) to commit to the polynomials $f$ and $t$. The prover then sends the commitments to the verifier. At that point, the verifier has to check that for some random point $z$
$$
f(z) = t(z) \cdot Z_H(z)
$$
This is done by sending a random point $z$ to the prover and doing an "opening" of the commitments at this point: the prover sends the values $f(z)$ and $t(z)$ as well as a proof that these are the correct evaluations.
Prover->Verifier: com(f), com(t)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), t(z)
Prover->Verifier: proofs of opening
Note right of Verifier: checks that \n sum f(z) = t(z)z_H(z)
This is in essence the PLONK protocol, except that this is not really what happens in the paper...
More reductions
The newer PLONK actually does one more reduction of the last statement:
As per the previous section: we want to prove that $$f(z) = t(z) \cdot Z_H(z)$$
Which is equivalent to proving that $z$ is a root of the polynomial
$$f(x) - t(x) \cdot Z_H(x)$$
Since the verifier already knows one of the polynomials ($Z_H$), they can evaluate it in advance. So the previous statement is equivalent to proving that $z$ is a root of
$$r(x) = f(x) - t(x) \cdot Z_H(z)$$
The last two steps are an optimization (called Maller's optimization) that removes the need for the prover to send $t(z)$, as the verifier can use the commitment to $t$ to produce a commitment to $r$ (to verify the opening proof).
These additional reductions moved us from a protocol in which the prover sends openings to let the verifier check an identity by themselves, to a protocol where the prover simply sends openings.
Prover->Verifier: com(f), com(t)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and \n validate opening proofs
To verify the opening of $r$ for $x = z$, the verifier will have to reconstruct a commitment to $r$ first. That's easy, it is:
$$com(r) = com(f) - com(t) \cdot Z_H(z)$$
which will use:
the commitment to $f$ received during the protocol
the commitment to $t$ received during the protocol
the evaluation of $Z_H(x)$ at $x=z$ which they can do by themselves
Not so fast... t is too large
If you've read PLONK, you've noticed that the prover actually doesn't send a commitment to $t$ directly, because $t$ is too large and polynomial commitment schemes have an upper bound fixed during the trusted setup. (By the way, $t$ is too large because the permutation argument makes it three times as large due to the three witness polynomials.) To circumvent that limitation, the polynomial $t$ is split into three smaller polynomials $t_{lo}, t_{mid}, t_{hi}$ such that:
$$t(x) = t_{lo}(x) + x^n t_{mid}(x) + x^{2n} t_{hi}(x)$$
This is not great, as the prover cannot produce a commitment to $r$ anymore. The reason is that $x^n$ and $x^{2n}$ cannot be committed as they're larger than the upper bound of our polynomial commitment. Instead, notice that since the verifier already knows these values, they can pre-evaluate them at $z$ and ask instead for a proof that $z$ is a root of:
$$r(x) = f(x) - Z_H(z) \cdot (t_{lo}(x) + z^n t_{mid}(x) + z^{2n} t_{hi}(x))$$
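The splitting itself is just a matter of chunking coefficients. Here is a minimal Python sketch, assuming chunks of exactly $n$ coefficients each (the paper's exact chunk sizes may differ) and a toy field; the names and values are hypothetical. It checks that recombining the three pieces with the pre-evaluated powers $z^n$ and $z^{2n}$ gives back $t(z)$:

```python
P = 97   # toy prime field (an assumption for illustration)
n = 4    # size of the domain H, and of each chunk in this sketch

def evaluate(coeffs, x):
    """Horner evaluation over F_P, coefficients from low to high degree."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def split_in_three(t, n):
    """Split the coefficients of t into t_lo, t_mid, t_hi of n coefficients each."""
    assert len(t) <= 3 * n
    padded = t + [0] * (3 * n - len(t))
    return padded[:n], padded[n:2 * n], padded[2 * n:]

t = list(range(1, 12))          # hypothetical quotient of degree 10 < 3n
t_lo, t_mid, t_hi = split_in_three(t, n)

z = 5                           # the verifier's random point
z_n = pow(z, n, P)              # pre-evaluated by the verifier
z_2n = pow(z, 2 * n, P)

# t(z) = t_lo(z) + z^n * t_mid(z) + z^(2n) * t_hi(z)
recombined = (evaluate(t_lo, z) + z_n * evaluate(t_mid, z) + z_2n * evaluate(t_hi, z)) % P
```

Each of the three pieces has degree below $n$, so each one fits under the upper bound of the commitment scheme.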
Prover->Verifier: com(f)
Prover->Verifier: com(t_lo), com(t_mid), com(t_hi)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: f(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and \n validate opening proofs
Uh-oh, what about f?
The big proof in PLONK really boils down to two things:
The permutation argument, which links the wires in our circuit. I ignore this proof in the post.
the main polynomial $f$, which is our circuit.
Since the polynomial $f$ needs to be constructed such that:
it does not leak any non-public information to the verifier
it does not allow the prover to change fixed parts of the circuit
the prover and the verifier perform a "polynomial dance" to construct the polynomial together. The end product sort of looks like this:
$$f(x) = a(x) q_L(x) + b(x) q_R(x) + a(x) b(x) q_M(x) + c(x) q_O(x) + q_C(x)$$
where $a, b, c$ are private polynomials that the prover constructs, commits, and sends to the verifier; and $q_L, q_R, q_M, q_O, q_C$ are public polynomials (the selector polynomials) that both the verifier and the prover can construct (and commit to if necessary).
So the end protocol looks more like this:
Prover->Verifier: com(a), com(b), com(c)
Prover->Verifier: com(t_lo), com(t_mid), com(t_hi)
Note right of Verifier: generates random z
Verifier->Prover: z
Prover->Verifier: a(z), b(z), c(z), r(z) = 0
Prover->Verifier: proofs of opening
Note right of Verifier: reconstruct r(x) and \n validate opening proofs
And as in the previous section, the verifier needs to reconstruct a commitment to $r$ before being able to ask for an opening, which is now impossible as we're dealing with multiplication of commitments.
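The way around this is the same trick as before: wherever a product of two committed polynomials appears, replace one side by its opening at $z$, which the verifier receives anyway. As a sketch, assuming $f$ has the gate form $a \cdot q_L + b \cdot q_R + a \cdot b \cdot q_M + c \cdot q_O + q_C$ and that $t$ is split in three as above (both assumptions here, and still ignoring the permutation argument), the verifier can form:

$$
\begin{aligned}
com(r) = \; & a(z) \cdot com(q_L) + b(z) \cdot com(q_R) + a(z) b(z) \cdot com(q_M) + c(z) \cdot com(q_O) + com(q_C) \\
& - Z_H(z) \cdot \left( com(t_{lo}) + z^n \cdot com(t_{mid}) + z^{2n} \cdot com(t_{hi}) \right)
\end{aligned}
$$

Every term is now a scalar times a commitment, which any additively homomorphic commitment scheme supports.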
There's much more to PLONK. I've skipped the circuit part, the permutation argument, I've also ignored the big pairing equation at the end. These will be subjects for another post :)
In the PLONK paper, they make use of an optimization from Mary Maller in order to reduce the proof size. This is a note explaining this optimization. If you have no idea what these words are, you might want to skip reading this post :)
Explanation
Maller's optimization is used in the "polynomial dance" between the prover and the verifier to reduce the number of openings the prover sends.
Recall that the polynomial dance is the process where the verifier and the prover form polynomials together so that:
the prover doesn't leak anything important to the verifier
the verifier doesn't give the prover too much freedom
In the dance, the prover can additionally perform some steps that will keep the same properties but with reduced communication.
Let's see the protocol where Prover wants to prove to Verifier that
$$\forall x \in \mathbb{F}, \; h_1(x)h_2(x) - h_3(x) = 0$$
given commitments of $h_1, h_2, h_3$.
Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: h1(s), h2(s), h3(s)
Prover->Verifier: 3 proofs of openings
Note right of Verifier: verifies that \n h1(s)h2(s) - h3(s) = 0
A shorter proof exists. Essentially, if the verifier already has the opening $h_1(s)$, they can reduce the problem to showing that
$$L(s) = 0 \quad \text{where} \quad L(x) = h_1(s)h_2(x) - h_3(x)$$
given commitments of $h_1, h_2, h_3$ and the evaluation of $h_1$ at a point $s$.
Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: h1(s), L(s)
Prover->Verifier: 2 proofs of openings
Note right of Verifier: forms polynomial com(L) = \n h1(s)com(h2) - com(h3)
Note right of Verifier: checks that L(s) = 0
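Here is a minimal Python sketch of that shorter protocol. The commitment is a toy stand-in: `commit(h)` simply evaluates $h$ at a fixed value `TAU` modulo a small prime, which is linear like a real polynomial commitment but is neither hiding nor binding (in KZG the evaluation happens "in the exponent" and nobody knows the secret point). The polynomials and all names are hypothetical.

```python
Q = 1019    # toy field (an assumption for illustration)
TAU = 123   # stand-in for the trusted-setup secret; publicly known in this toy

def evaluate(coeffs, x):
    """Horner evaluation over F_Q, coefficients from low to high degree."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

def multiply(f, g):
    """Schoolbook polynomial multiplication over F_Q."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % Q
    return out

def pad(h, size):
    """Pad a coefficient list with zeros up to the given size."""
    return h + [0] * (size - len(h))

def commit(h):
    """Toy linear commitment: com(h) = h(TAU)."""
    return evaluate(h, TAU)

# Prover's polynomials, with h3 = h1 * h2 so that the statement is true.
h1 = [1, 2]
h2 = [3, 0, 1]
h3 = multiply(h1, h2)

s = 77                       # verifier's random point
h1_s = evaluate(h1, s)       # first opening sent by the prover

# Prover: L(x) = h1(s) * h2(x) - h3(x), then the second opening L(s).
size = max(len(h2), len(h3))
L = [(h1_s * u - v) % Q for u, v in zip(pad(h2, size), pad(h3, size))]
L_s = evaluate(L, s)

# Verifier: builds com(L) from the commitments and the opening h1(s) alone.
com_L = (h1_s * commit(h2) - commit(h3)) % Q
```

The verifier only ever scales and subtracts commitments, and the prover sends two openings instead of three.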
Notes
Why couldn't the prover open the polynomial $L'$ directly?
$$L'(x) = h_1(x)h_2(x) - h_3(x)$$
By doing
Note left of Prover: commits to h1, h2, h3
Prover->Verifier: com(h1), com(h2), com(h3)
Note right of Verifier: generates random point s
Verifier-->Prover: s
Note left of Prover: evaluates at point s
Prover->Verifier: L'(s), 1 proof of opening
Note right of Verifier: forms polynomial com(L') = \n com(h1)com(h2) - com(h3)
Note right of Verifier: verifies that \n h1(s)h2(s) - h3(s) = 0
The problem here is that you can't multiply the commitments together without using a pairing (if you're using a pairing-based polynomial commitment scheme), and you can only use that pairing once in the protocol.
If you're using an inner-product-based commitment, you can't even multiply commitments anyway.
Appendix: Original explanation from the PLONK paper
There's this thing called a Kate polynomial commitment, which is a polynomial commitment primitive that makes use of pairings.
There's an excellent post from Dankrad which I would recommend reading instead of this post.
I wrote this as a shorter summary of how you can commit to a polynomial, and then prove any evaluation $f(x) = y$.
One day, the verifier asks "what's the evaluation at $\zeta$?"
And the prover responds by sending the answer, $a$, and a proof ($h(s)$, see below).
The idea behind the proof
Notice that because $\zeta$ is a root of $f(x)-f(\zeta)$, then for some polynomial $h(x)$:
$$
f(x) - f(\zeta) = (x-\zeta) \cdot h(x)
$$
Due to this, $h(x) = \frac{f(x)-f(\zeta)}{x-\zeta}$ must be a valid polynomial.
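To make that concrete, here is a minimal Python sketch of the division by $(x - \zeta)$ using synthetic division over a toy field. The modulus, the polynomial and the helper names are hypothetical. The remainder is zero exactly when the claimed value is the real evaluation $f(\zeta)$:

```python
P = 97   # toy prime field (an assumption for illustration)

def evaluate(coeffs, x):
    """Horner evaluation over F_P, coefficients from low to high degree."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def divide_by_linear(f, zeta):
    """Synthetic division of f(x) by (x - zeta) over F_P.

    Returns (quotient coefficients from low to high degree, remainder)."""
    partials = []
    acc = 0
    for c in reversed(f):           # from the highest-degree coefficient down
        acc = (acc * zeta + c) % P
        partials.append(acc)
    remainder = partials.pop()      # the last Horner value is f(zeta)
    return partials[::-1], remainder

f = [3, 0, 5, 7]                    # hypothetical f(x) = 3 + 5x^2 + 7x^3
zeta = 3
a = evaluate(f, zeta)               # the honest answer f(zeta)

# Honest case: (f(x) - a) is divisible by (x - zeta), so h is a valid polynomial.
f_minus_a = [(f[0] - a) % P] + f[1:]
h, remainder = divide_by_linear(f_minus_a, zeta)

# Dishonest case: a wrong answer leaves a non-zero remainder.
f_minus_wrong = [(f[0] - (a + 1)) % P] + f[1:]
_, wrong_remainder = divide_by_linear(f_minus_wrong, zeta)
```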
At a high-level:
the verifier will compute what they think $[h(x)]$ should be at some random point $s$
the prover will send the actual value $[h(s)]$
the verifier will check if they match
This works because the Schwartz-Zippel lemma tells us that two polynomials that are different differ at most points.
The proof
Here's the protocol:
the prover sends the verifier a commitment $[\frac{f(s)-f(\zeta)}{s-\zeta}]=[h(s)]$ evaluated at some random point $s$ (the toxic waste).
the verifier constructs a similar $h(s)$ but with the expected value of $f(\zeta)$ instead: $[\frac{f(s) - a}{s-\zeta}]$. The verifier then checks if it's equal to $[h(s)]$.
Note:
The prover can compute $[h(s)]$ easily, because they can just compute the polynomial $h(x)$ first, and then reconstruct it at $s$ with the $SRS$.
$$
h(x) = \frac{f(x)-f(\zeta)}{x-\zeta} = a_0 + a_1x + a_2x^2 + \cdots
$$
and then
$$
[h(s)] := a_0[1] + a_1[s] + a_2[s^2] + \cdots
$$
for example with our previous $f(x)$ and $\zeta = 3$
The verifier cannot compute their own $[h(s)]$ because they cannot divide by $s$ (remember, nobody knows $s$). They need a pairing.
Remember, you want to check the following identity hidden in the exponent (using commitments):
$$
\frac{[f(s) - a]}{[s-\zeta]} = [h(s)]
$$
But since you can't divide with commitments, you can't compute what's on the left-hand side. You can multiply thanks to pairings though. So instead, you could check the following equation:
$$
[f(s) - a] = [(s-\zeta)h(s)]
$$
and with pairings, you can multiply $[s-\zeta]$ with the proof $[h(s)]$:
$$
e([f(s)] - [a], [1]) = e([s-\zeta], [h(s)])
$$
Hey reader! I haven't posted in a while, but as this is my blog I'm contracted to talk about life events such as this one.
I've joined O(1) Labs a bit more than a month ago to work on the Mina cryptocurrency.
If you don't know about Mina, check it out, it's pretty cool: it uses recursive zero-knowledge proofs to compress a blockchain into a single proof of 11KB.
I hope I got you intrigued!
I want to say it is one of, if not the most, ambitious project in the space (but I'm biased).
As I'm still relatively new there, I don't have much to say besides that, but you can imagine that my posting will switch to more zero-knowledgy type of stuff very soon!
There seem to be a few interesting trends in “security via hardware” these days.
The first trend is root-of-trust chips. Integrated TPM-like chips that are like crypto Swiss Army knives as they offer many functionalities out of the box. They resemble discrete TPMs but are instead implemented as a coprocessor to the main processor. This makes these newer chips more resistant to physical MITM attacks (as discrete TPMs simply use a bus to communicate with other components). If you don’t know what a TPM is, it’s just a device that performs cryptographic operations and generally sits somewhere on your motherboard. Examples of these integrated security chips are Microsoft’s Pluton and Apple’s secure enclave.
The second trend is confidential computing. There are two types of specialized hardware here:
Programmable integrated secure processors; these are similar to the root-of-trust chips, except that they are programmable: you can push code there and run it in a separate trusted execution environment (TEE). It’s pretty useful for applications that require a trusted computing base (TCB); a core program whose security is critical and that does not need to trust the rest of the system. It’s also useful in “cloud scenarios” where you want to run some computation on a remote machine but want to make sure it runs it correctly. Think about Intel SGX, or ARM TrustZone.
Confidential VMs; imagine a hardware hypervisor that can run VMs as enclaves. This is usually much more practical than the enclave created by SGX, as you don’t need to write custom code and there is no memory limitation. But it is not clear to me how much security you lose against physical attacks by doing this (especially when papers like this one seem alarming). AMD SEV does this, and both Azure and GCP have started offerings to leverage this technology.
It can be hard to understand the difference between all these types of specialized hardware, the attacks they prevent, and the features they unlock. But essentially, here’s how I think about the two kinds: they both do great against software attacks (minus complex cryptographic attacks), neither is the best tool in the box against a motivated physical attacker (HSMs are “better”), and only confidential computing cares about custom user code.
But it’s easier to understand the difference by looking at some examples. As I only touch on protocols, you can simply imagine these chips as creating a blackbox for code and data that others can’t see and touch (even with a debugger).
Protecting keys and data with a secure enclave
The simplest use case for hardware security chips is to protect data.
To protect keys, it’s easy: just generate them in the secure chip and disallow extraction.
If you need ‘em, just ask the secure enclave to perform crypto operations with them.
To protect data? Encrypt it! That concept is called file-based encryption (FBE) if you’re encrypting individual files, and full-disk encryption (FDE) if it’s the whole disk.
FDE sounds much better, as it’s all or nothing. If you're under the shower and you wet your hair a little, you know you'll have to wash it. That’s what most laptops and desktops use.
In practice, FDE is not that great though: it doesn't take into account how we, human beings, use our devices.
We often leave them locked, as opposed to turned off, so that background functionalities can keep running.
Computers deal with this by just keeping the data-encryption key (DEK) around, even if your computer is locked.
Think about that the next time you go to the restroom at Starbucks, leaving your locked computer unattended.
Phones do it a bit better by encrypting different types of files depending on if your phone is locked or turned off. It sounds like a good solution, but Zinkus et al. showed that it’s not that great either.
If done well, this is how you typically hear about disk encryption in the news:
A couple of months ago the highly-publicised case of Apple vs. FBI brought attention to the topic of privacy - especially in the context of mobile devices. Following the 2015 San Bernardino terrorist attack, the FBI seized a mobile phone belonging to the shooter, Syed Farook, with the intent to search it for any additional evidence or leads related to the ongoing investigation. However, despite being in possession of the device, the FBI were unable to unlock the phone and access its contents.
Of course, the user should be authenticated before data can be decrypted.
This is often done by asking the user for a PIN or password.
A PIN or password is not enough though, as it would allow simple brute-force attacks (especially on 4 or 6-digit PINs).
In general, solutions try to tie the DEK to both a user credential and a symmetric key kept on the enclave.
What’s that symmetric key?
We all know that you can’t hardcode the same key in every device you produce. This is dumb. You end up with attacks like DUHK where thousands of devices are found hardcoding the same secret (and pwning one device breaks all of them).
The solution is a per-device key that is either burned into the chip during manufacturing, or created by the chip itself (so-called physically unclonable functions).
For example, each Apple secure enclave has a UID, each TPM has a unique endorsement key and attestation key, each OpenTitan chip has a creator root key and an owner root key, etc.
A randomly generated UID is fused into the SoC at manufacturing time. Starting with A9 SoCs, the UID is generated by the Secure Enclave TRNG during manufacturing and written to the fuses using a software process that runs entirely in the Secure Enclave. This process protects the UID from being visible outside the device during manufacturing and therefore isn’t available for access or storage by Apple or any of its suppliers.
sepOS uses the UID to protect device-specific secrets. The UID allows data to be cryptographically tied to a particular device. For example, the key hierarchy protecting the file system includes the UID, so if the internal SSD storage is physically moved from one device to another, the files are inaccessible.
To prevent brute-force attacks, Apple’s secure enclave mixes both the UID key and the user PIN with a password-based KDF (password-hashing function) to derive the DEK.
Except that I lied: to allow users to change their PIN quickly, the DEK is actually not derived directly, but instead encrypted by a key-encryption key (KEK).
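To illustrate the idea, here is a minimal Python sketch of a KEK derived from a device key and a PIN, wrapping a DEK. This is not Apple's actual construction: the way the device key is mixed in (an HMAC over PBKDF2 output), the iteration count, and the XOR-based `toy_wrap` (a placeholder, since the standard library has no key-wrapping or AEAD primitive) are all assumptions made for illustration. On a real device, the step involving `uid_key` would run inside the secure chip.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # illustrative work factor, not a recommendation

def derive_kek(uid_key: bytes, pin: str, salt: bytes) -> bytes:
    """Derive a key-encryption key from the per-device key and the user PIN."""
    stretched = hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, ITERATIONS)
    # Mixing in the device key means PIN guesses must go through the device.
    return hmac.new(uid_key, stretched, hashlib.sha256).digest()

def toy_wrap(kek: bytes, key: bytes) -> bytes:
    """Placeholder for a real key-wrapping scheme: XOR with a KEK-derived pad.

    Applying it twice with the same KEK returns the original 32-byte key."""
    pad = hmac.new(kek, b"dek-wrap", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(key, pad))

uid_key = os.urandom(32)   # stands in for the key fused into the chip
salt = os.urandom(16)
dek = os.urandom(32)       # data-encryption key protecting the files

old_kek = derive_kek(uid_key, "1234", salt)
wrapped = toy_wrap(old_kek, dek)

# Changing the PIN only re-wraps the DEK; the data itself is not re-encrypted.
new_kek = derive_kek(uid_key, "9876", salt)
rewrapped = toy_wrap(new_kek, toy_wrap(old_kek, wrapped))
```

The design choice this shows is the indirection: the files stay encrypted under the same DEK, and only the small wrapped blob changes when the PIN does.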
Secure boot with a root-of-trust secure chip
When booting your computer, there are different “stages” that will run until you finally get to the screen you want.
One problem users face is viruses and malware, and these can infect the boot process.
You then run on an evil operating system…
To protect the integrity of boot, our integrated secure chips provide a “root of trust”, something that we trust 100% and that allows us to trust other stuff down the line.
This root of trust is generally some read-only memory (ROM) that cannot be overwritten, and it’s also called one-time programmable memory as it was written during manufacturing and can’t be changed anymore.
For example, when powering up a recent Apple device, the very first code that gets executed is inside Apple’s secure enclave ROM (called Boot ROM).
That Boot ROM is tiny, so usually the only thing it does is:
Prepare some protected memory and load the next image there (so-called "boot code").
Hash the image and verify its signature against the hardcoded public key in the ROM.
Execute that code.
The next boot loader does the same thing, and so on until it gets to the device’s operating system. This is why updates that are not signed by Apple can’t be installed on your phone.
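The chain of verification can be sketched in Python as follows. One loud simplification: real devices verify a signature against a public key stored in ROM, but since the standard library has no signature verification, this sketch pins a SHA-256 digest of the next stage instead. The `Image` type, the stage names and their contents are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Image:
    name: str
    code: bytes
    next_digest: Optional[str]  # expected digest of the next stage, if any

def digest(image: Image) -> str:
    """Hash the code together with the pinned digest of the next stage."""
    h = hashlib.sha256()
    h.update(image.code)
    h.update((image.next_digest or "").encode())
    return h.hexdigest()

def boot(rom_digest: str, chain: List[Image]) -> List[str]:
    """Verify then 'execute' each stage in order; stop at the first mismatch."""
    executed = []
    expected = rom_digest
    for image in chain:
        if expected is None or digest(image) != expected:
            break                       # refuse to run unverified code
        executed.append(image.name)     # stands in for executing the stage
        expected = image.next_digest
    return executed

# Build the chain backwards: each stage pins the one that follows it.
os_image = Image("os", b"operating system", None)
bootloader = Image("bootloader", b"stage 1 code", digest(os_image))
ROM_DIGEST = digest(bootloader)         # stands in for what is burned into ROM

good = boot(ROM_DIGEST, [bootloader, os_image])

# A tampered operating system is never executed.
evil_os = Image("os", b"evil operating system", None)
bad = boot(ROM_DIGEST, [bootloader, evil_os])
```

With signatures instead of pinned digests, the vendor can ship updated images without touching the ROM, which is why that is what real devices do.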
Confidential Computing with a programmable secure processor
There’s been a new paradigm for the last years: the cloud; big companies running servers to host your stuff. Amazon has AWS, Google has GCP, and Microsoft has Azure. Another way to put this is that people are moving from running things themselves, to running things on someone else’s computer. This of course creates some issues in some scenarios where privacy is important. To fix that, confidential computing attempts to offer solutions to run client code without the host being able to see it or modify its behavior.
SGX's primary use case seems to be exactly that these days: clients running code that the servers can’t see or tamper with.
One interesting problem that arises is: how can I trust that the response I got from my request indeed came from SGX, and not some impersonator? This is what attestation tries to solve. There are two kinds of attestation:
local attestation, when two enclaves running on the same platform need to communicate and prove to each other that they are secure enclaves
remote attestation, when a client queries a remote enclave and needs to make sure that it was a legit enclave that produced the result from the request.
Each SGX chip is provided with unique keypairs at manufacturing time: the Root Sealing Keys.
The public key part is then signed by some Intel certificate authority. So the first assumption, if we ignore the assumption that the hardware is secure, is that Intel is correctly signing public keys of secure SGX chips only.
With that in mind, you can now obtain a signed attestation, from Intel's CA, that you're talking to a real SGX enclave, and that it is running some code (at least a proof of its digest), etc.