david wong

Hey! I'm David, a security consultant at Cryptography Services, the crypto team of NCC Group . This is my blog about cryptography and security and other related topics that I find interesting.

SHA-3 vs the world @ OWASP London

posted 20 hours ago

I just gave a talk at OWASP London on SHA-3 and derived functions + derived protocols. The quality is not so great but here you have it.

It was apparently the first crypto talk in 5 years so I'm glad I revived this part of OWASP =)

comment on this story

Speaking at OWASP

posted Yesterday

I'll be speaking at OWASP London tomorrow. It will be the same talk I just gave at Defcamp two weeks ago, and it will be the last time I give this talk.

It's sold out, but there will be a live streaming posted somewhere (maybe on their facebook page?).

After that, I will be talking at Black Hat Europe about Disco and libDisco. Stay tuned.

comment on this story

Ethernaut CTF walk through

posted 3 days ago

This is a walk through of the Ethernaut capture-the-flag competition where each challenge was an ethereum smart contract you had to break.

I did this at 2am in a hotel room in Romania and ended up not finishing the last challenge because I took too long and didn't want to re-record that part. Basically what I was missing in my malicious contract: a function to withdraw tokens from the victim contract (it would have work since I had a huge amount of token via the attack). I figured I should still upload that as it might be useful to someone.

4 comments

Noise Plug-and-play Implementation in Golang

posted 3 weeks ago

I wrote an implementation of Noise in Go. I've already talked about it here but I've made some progress towards a more usable library.

It is now a real protocol built from the Noise protocol framework!

Noise doesn't work right off-the-bat because it does not have a length field in its messages. This means that two problems can arise:

  • if a noise message is fragmented, the receiver will not be able to know and will not be able to parse the received fragments
  • if noise messages are received concatenated to one another, noise will interpret that as one big message and the integrity verification of the encrypted message (via GCM or Poly1305) will fail

In different words, without an indication of a length, noise cannot know where a message stops. Messages can get fragmented by middleboxes, and can get concatenated just because of latency or the way the other peer send its messages. In lab condition this might not be a problem, but in real life without a framing protocol below Noise things will fail quickly.

This is why by default, you can't implement the components of the Noise specification and expect it to work. Having said that, with this minimal addition of a length field things do work!

But that's not the only problem that the specification fails to tackle. The other problem is the authentication of static public keys.

You see, in Noise you have many different ways of doing handshakes (named Noise_XX, Noise_KN, ...), and some of them do not require one of the peer's authentication. Kind of like the typical browser ↔ webserver scenario where only the webserver will authenticate itself via a certificate (and perhaps the browser will later authenticate itself via credentials in a form). Some patterns that you should never use fail to authenticate both side (Noise_NN) and that is why I haven't implemented them. But for patterns that do authenticate one of the side (or both), problems arise: the Noise specification does not have any safeguards in its algorithms to prevent you from failing to authenticate the other side of the connection.

This means that if you implement Noise following the specification, patterns like Noise_XX where both sides require authentication will happily go through without caring about authenticating anything. This leads to trivial active man-in-the-middle attacks.

What I've done is that:

  • if the pattern in use have your peer send a static public key to the other one, I require you to provide a proof when setting up your peer. Think about a signature from an authoritative root signing key.
  • if the pattern in use have your peer receive a static public key from the other one, I require you to provide a callback function to verify that key, optionally using payloads sent during the handshake (for example, a certificate containing a signature).

To make things truly plug-and-play, I've created helper functions that let you generate an authoritative root signing key and create proofs or callbacks for a set of parameters. I've written some documentation here that should get you started in no time. If things are broken (this is a beta) or not clear please let me now.

And again, don't use this in production.

The biggest achievement of this implementation though is the fact that it is implementing the net.Conn interface of the Golang standard library. This mean that if you're already using networking code in your Go application, you can just replace your net.Conn or tls.Conn with noise.Conn and things will continue to work seamlessly.

comment on this story

Speaking at Defcamp

posted 4 weeks ago

I'll be at Defcamp talking about SHA-3 (the standard), as well as its derived functions and protocols. It's happening next week in Bucharest. If you're around there hit me up!

defcamp

comment on this story

Publishing my smart contract on the main ethereum network

posted 4 weeks ago

warning: as this is a proof of concept to see if 4chan could be implemented on the blockchain, some people might post shocking pictures or videos on there. At the time of the writing nothing "bad" has happened, but take precautions if you're planning to take a look at it :)

This is part II of Writing a DAPP for the Ethereum block chain I guess (where I previously said I would not publish this smart contract on the main network).

Pulling the entire Ethereum blockchain took me all afternoon. The thing is taking 29GB of disk space at the moment.

I sent ~20USD worth of ether (0.10 ether) from Coinbase to my real Mist wallet and prepared to see how much it would cost to deploy my smart contract. Surprisingly it was cheap! It cost me 0.007715952 ETHER which is ~2USD (around 1 million unit of gas at 0.008 ether per million gas) to deploy my contract to 0x470fb19D08c3d2eB8923A31d1408c393Dab09ccF an address computed out of my keypair and a nonce. I do not own its associated private key because I simply will never need it.

etherscan source code

To make sure the smart contract's code was properly open sourced on etherscan.io I had to put the source code there and provide the used compiler's version and the ABI encoded arguments to the controller. The ABI encoded arguments are just the 0-padded hexadecimal encoing of the arguments. I had two uint256 as arguments to my FiveMedium() constructor: the fee to post a thread and the fee to reply to a thread.

constructor

I decided to set the creation of a thread around 1$ and replying to a thread around 10 cents. It was then time to push the button and publish it.

deploying

And voila! You can access the DAPP on davidwong.fr/FiveMedium.

Note: the mandatory gas cost to post or reply on a thread seems to be around 0.30$. This has influenced me to lower down the contract fees to approach the gas fees.

comment on this story

Writing a DAPP for the Ethereum block chain

posted 5 weeks ago

warning: as this is a proof of concept to see if 4chan could be implemented on the blockchain, some people might post shocking pictures or videos on there. At the time of the writing nothing "bad" has happened, but take precautions if you're planning to take a look at it :)

The FiveMedium DAPP browsed from the Mist wallet

dapp

I've been dabbing in smart contract security (see my video here) and I found it natural to try and do a DAPP (decentralized application) myself. How hard can it be?

3.5 days later I've got back into Javascript after many years; I've learned Vue.js (kind of, my code is really ugly) and I've created my first DAPP!

First thing I'll have to say: it's hella fun.

Javascript has gone a long way and the Vue.js framework is just great! I've tried using React but I found it less developer-friendly, harder to learn and lacking in terms of template. It just didn't click. I guess coming from HTML first (and jQuery later) the Vue.js framework just makes more sense to me. It's all about having fun and I wasn't having any with React.

I still want to create, modify and query things the jQuery way, but I'm getting used to the javascript modernities (querySelector, arrow functions, ...) and the Vue.js way. I like it. It will take some time to get rid of my old habits and re-think the way I write front end code but I like it.

Writing the smart contract is quite straight forward. It's 128 lines of Solidity code, but most of it are comments (yes I comment my code). At one point I should publish it on etherscan.io because this is best practice. It's not compiled with the last version of Solidity (boooo!) because I deployed it via Mist and Mist doesn't have the last version of Solidity.

solidity code

Writing the dapp with Vue.js and the web3.js API (the javascript library to interact with the blockchain via an ethereum node) is pretty straight forward as well. The learning curve is not bad and there are tons of resources for beginners. That is, until you test your dapp with a real wallet like the Metamask wallet (integrated as a browser plugin) or the official Mist wallet (Electron app). Different wallets offer different functions and versions of web3 (how it works: they inject the web3 object in your document and you can use it directly from your javascript webapp). They also (for the most part) refuse any synchronous calls on the web3 API without really providing ways for you to debug what is not synchronous in your call. A lot of functions have to be replaced for the asynchronous variants, any async/await has to disappear and you enter callback hell. Not fun.
The worst is that the documentation gets sparser and sparser as you enter the world of real DAPPs. You understand that things are changing really fast, that wallets will soon stop supporting web3.js and that a real API will be provided at some point. Everything is way more experimental than I had thought.
On top of that, Metamask doesn't let you watch for events yet, so say goodbye to your DAPP being "live" for their users.

To make the app offline, meaning browsable without a wallet, I use Infura. You basically just have to switch the url of the node to the ones Infura gives you, and web3 will be able to interact with it the same way it interacts with a real node. This is because the standard works via normal Json-RPC routes using the Json way to structure queries and responses. Unfortunately, like Metamask, Infura doesn't let you listen to events so the app is browsable, but not live.

I haven't taken the time to publish the smart contract on the real ethereum network yet. It's sitting on Rinkeby which is a test network where you can get ethers for free. I'm not going to get rich on a test network, and some of my friends are eyeballing me for this decision (I see you jc) but this is fun and I want people to try it for free :)

Is it hard? No but it's annoying. First I need to pull up the whole Ethereum blockchain.

As of April 19th, 2017, my blockchain size is 23.5 GB total.

from this Quora answer

Second, I need some ethers, and buying ethers from the UK is hard. Of course I already have some (I wouldn't be writing anything about ethereum if I wasn't invested), but I had to go through weeks of research to buy them. (If you're looking for an easy way, learn from my wasted time: transfer money on a Revolut account, change it to euro, do a free SEPA transfer to Kraken, buy ethers there.)

Anyone can replicate my work. The DAPP is client-side code (javascript) so anyone can download it and run it on their own page. the contract is also up there on the blockchain, it's public stuff. I don't really like this, but it is how Ethereum works. If I someday drop the page from the internet, anyone can get it and run it. Run it on their own server, but also run it locally from their own machine.

If you want to see how it works without getting a wallet:

But I recommend you to give a try to this new technology!

Download Mist, set the network to Rinkeby, grab some free ethers from the faucet there and browse to my DAPP FiveMedium.

Have fun!

PS: yes this is heavily inspired by 4chan.

1 comment

Posters, figures and graphics

posted last month

I've polished the design of this blog a bit (with flexbox and css-grid!) and it should look a bit cleaner :)

I've also created a page for graphics. I only have 3 at the moment, but I know that PHD students often present posters like these at conferences so if you know any (or if you have one yourself) and you want me to showcase it there send me a message!

crypto graphics

comment on this story

Zero'ing memory, compiler optimizations and memset_s

posted August 2017

tl;dr: use this code

When a program uses a secret key for some cryptographic operation, it will store it somewhere in memory. This is a problem because it is trivial to read what has been previously stored in memory from a different program, just create something like this:

#include <stdio.h>

int main(){
    unsigned char a[5000];
    for(int i = 0; i < 10000; i++) {
        printf("x", a[i]);
    }
    printf("\n");
}

This will print out whatever was previously there in memory, because the buffer a is not initialized to zeros. Actually, C seldom initializes things to zeros, it can if you specifically use something like calloc instead of malloc or static in front of a global variable/struct/...

EDIT: as Fred Akalin pointed to me, it looks like this is fixed in most modern OS. Colin Perceval notes that there are other issues with not zero'ing memory:

if someone is able to exploit an unrelated problem — a vulnerability which yields remote code execution, or a feature which allows uninitialized memory to be read remotely, for example — then ensuring that sensitive data (e.g., cryptographic keys) is no longer accessible will reduce the impact of the attack. In short, zeroing buffers which contained sensitive information is an exploit mitigation technique.

This is a problem.

To remove a key from memory, developers tend to write something like this:

memset(private_key, 0, sizeof(*private_key));

Unfortunately, when the compiler sees something like this, it will remove it. Indeed, this code is useless since the variable is not used anymore after, and the compiler will optimize it out.

How to fix this issue?

A memset_s function was proposed and introduced in C11. It is basically a safe memset (you need to pass in the size of the pointer you're zero'ing as argument) that will not get optimized out. Unfortunately as Martin Sebor notes:

memset_s is an optional feature of the C11 standard and as such isn't really portable. (AFAIK, there also are no conforming C11 implementations that provide the optional Annex K in which the function is defined.)

To use it, a #define at the right place can be used, and another #define is used as a notice that you can now use the memset_s function.

#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
#include <stdlib.h>

// ...

#ifdef __STDC_LIB_EXT1__
memset_s(pointer, size_data, 0, size_to_remove);

Unfortunately you cannot rely on this for portability. For example on macOS the two #define are not used and you need to use memset_s directly.

Martin Sebor adds in the same comment:

The GCC -fno-builtin-memset option can be used to prevent compatible compilers from optimizing away calls to memset that aren't strictly speaking necessary.

Unfortunately, it seems like macOS' gcc (which is really clang) ignores this argument.

What else can we do?

I asked Robert Seacord who always have all the answers, here's what he gave me in return:

void *erase_from_memory(void *pointer, size_t size_data, size_t size_to_remove) {
    if(size_to_remove > size_data) size_to_remove = size_data;
    volatile unsigned char *p = pointer;
    while (size_to_remove--){
       *p++ = 0;
    }
    return pointer;
}

Does this volatile keyword works?

Time to open gdb (or lldb) to verify what the compiler has done. (This can be done after compiling with or without -O1, -O2, -O3 (different levels of optimization).)

Let's write a small program that uses this code and debug it:

int main(){
    char a[6] = "hello";
    printf("%s\n", a);
    erase_from_memory(a, 6, 6);
}

gdb

  1. we open gdb with the program we just compiled
  2. we set a break point on main
  3. we run the program which will stop in main

disas

We notice a bunch of movb $0x0 ...

Is this it? Let's put a breakpoint on the first one and see what the stack pointer (rsp) is pointing to.

b

It's pointing to the string "hello" as we guessed.

x

Going to the next instruction via ni, we can then see that the first letter h has been removed. Going over the next instructions, we see that the full string end up being zero'ed.

success

It's a success!

The full code can be seen here as an erase_from_memory.h header file that you can just include in your codebase:

#ifndef __ERASE_FROM_MEMORY_H__
#define __ERASE_FROM_MEMORY_H__ 1

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h> 
#include <string.h>

void *erase_from_memory(void *pointer, size_t size_data, size_t size_to_remove) {
  #ifdef __STDC_LIB_EXT1__
   memset_s(pointer, size_data, 0, size_to_remove);
  #else
   if(size_to_remove > size_data) size_to_remove = size_data;
     volatile unsigned char *p = pointer;
     while (size_to_remove--){
         *p++ = 0;
     }
  #endif
    return pointer;
}

#endif // __ERASE_FROM_MEMORY_H__

Many thanks to Robert Seacord!

PS: here is how libsodium does it

EDIT: As Colin Percival wrote here, this problem is far from being solved. Secrets can get copied around in (special) registers which won't allow you to easily remove them.

3 comments

integer promotion in C

posted August 2017

Loup Vaillant wrote a good blog post about his new crypto library Monocypher.

In spite of the obvious controversy of launching a new crypto library, I really like it. Note that this is not me officially endorsing the library, I just think it's cool and I would only consider using it after it had matured a bit more.

The whole thing is one ~1500LOC file and is pretty clear to read. It only implements a few crypto functions.

The blog post mentions a few bugs that were found in his library (and I appreciate how open he is about it). Here's an interesting one:

Bug 5: signed integer overflow
This one was sneaky. I wouldn't have caught it without UBSan.
I was shifting a uint8_t, 24 bits to the left. I failed to realise that integer promotion means this unsigned byte would be converted to a signed integer, and overflow if the byte exceeded 127. (Also, on crazy platforms where integers are smaller than 32 bits, this would never have worked.) An explicit conversion to uint32_t did the trick.
At this point, I was running the various sanitisers just to increase confidence. Since I used Valgrind already, I didn't expect to actually catch a bug. Good thing I did it anyway.
Lesson learned: Never try anything serious in C or C++ without sanitisers. They're not just for theatrics, they catch real bugs.

This is the problem patched.

patch

Simplified, the bad code really looks like this:

uint32_t = uint8_t << 8 * i;

And all the theory behind the problem can be dismissed, if he had written his code with precautions. When I see something like this, the first thing I think about is that it should probably be written like this:

uint32_t = (uint32_t)uint8_t << 8 * i;

This would avoid any weird C problems as a casting (especially to a bigger type) usually goes fine.

OK but what was the problem with the above code?

Well, in C some operations will usually promote the type to something bigger. See the C standard:

shift-expression << additive-expression
The integer promotions are performed on each of the operands

What is an integer promotion? See the C standard:

If an int can represent all values of the original type, the value is converted to an int;
otherwise, it is converted to an unsigned int.
These are called the integer promotions

So looking back at our bad snippet:

uint32_t = uint8_t << 8 * i;
  1. the maximum value of uint8_t is 255, which can largely be hold in a signed int of 16-bit or 32-bit (depends on the architecture). So 01 is promoted to 00 00 00 01 if a signed int is 32-bit (which it probably is). (In the case were we would have been dealing with a uint32-t, there would have been no problems as "big" values that cannot be represented in a signed int of 32-bit would have been promoted to a unsigned int instead of a signed int.)
  2. the bits are shifted on the left. For example of 8 places 00 00 01 00.
  3. the result gets casted to uint32_t. We still get 00 00 01 00.

This doesn't look like an issue, and it probably isn't most of the time. Now imagine if in 1. our value was 80 (which is 1000 0000 in bits).

Imagine now that in 2. we shift it of 24 bits on the left, that will give us 80 00 00 00 which is an all zero bitstring except for the most significant bit (MSB). In an int type the MSB is the signing bit. I believe at this point, the value will be automatically sign extended to the size of the register, so in your 64-bit machine it will be saved as ff ff ff ff 80 00 00 00.

Now in 3. The result now get casted to a uint32_t. Which doesn't do anything but change the value of the pointer. But we now have a wrong result! What we wanted here was 00 00 00 00 80 00 00 00. If you're not convinced, you can run the following script on your computer:

#include <stdio.h>
#include <stdint.h>

int main(){

uint8_t start = -1;
printf("%x\n", start); // prints 0xff
uint64_t result = start << 24;
printf("%llx\n", result); // should print 00000000ff000000, but will print ffffffffff000000
result = (uint64_t)start << 24;
printf("%llx\n", result); // prints 00000000ff000000
return 0;
}

Looking at the binary in Hopper we can see this:

reverse

And we notice the movsxd instruction which is "move doubleword to quadword with sign-extension".
It moves the result of the shift left (shl) into a register, making sure that its result is the same for an int64_t which is the maximum value your register can hold.

comment on this story

How did length extension attacks made it into SHA-2?

posted August 2017

If you don't know about length extension attacks, it is a very simple and straight forward attack that let you forge a new hash by extending another one, letting you pretend that hashing had previously not been terminated.

The attack targets such hashes: SHA-256(key | message) where the key is secret and where | means concatenation.

This is because a SHA-2 hash (unless we're talking about the truncated versions) is literally a full copy of the state of the hash. It is not the state of hashing key and message, but rather key and message and some padding. Because like everything in the symmetric crypto world you need to pad to the block size. I believe this is 512 bits in the Secure Hash Algorithm 2.

The attack lets you take such a hash, and continue the hashing to obtain the hash of key | message | padding | more where more is whatever you want. And all of this without any knowledge of the secret key!

merkle damgard

Interestingly, this comes from the way the Merkle-Damgard construction is applied (without a good finalization function). And because of this hash functions like MD4, MD5, SHA-1 and SHA-2 have all suffered from the same issues. You'd be glad to hear that this issue is fixed in any of the SHA-3 contestant (read: BLAKE2 and SHAKE and SHA-3 are fine). Keccak (SHA-3's winner) fixes it by using a Sponge construction, not letting you see a big part of the state (the capacity) while BLAKE2 fixes it by using the HAsh Iterative FrAmework (HAIFA), using a "number of bits hashed so far" (not including the padding) inside of the compression function.

haifa

While looking at the exact date length extension attacks were found (which I couldn't find), Samuel Neves came up with an interesting response.

twitter

It looks like the NIST was made aware, during the standardization process of SHA-2, that simple fixes would prevent length extension attacks.

This comment from John Kelsey (who later joined the NIST) is from 28 august 2001 (by the way it doesn't make sense to write dates as month/day/year. Nobody can understand it outside of the US. We have an ISO format that specifies a logical year-month-day). In it he talks about the attack, and proposes a simple fix:

Niels Ferguson suggested the following simple fix to me, some time ago: Choose some nonzero constant C0, of the same size as the hash function chaining variable. Hash messages normally, until we come to the last block in the padded message. XOR C0 into the chaining variable input into that last compression function computation. The resulting compression function output is used as the hash result. For concreteness, I propose C0 = 0xa5a5...a5, with the 0xa5 repeated until every byte is filled in. This should be interpreted in little-endian bit ordering.

Why did the NIST ignore this when it could have modified the draft before publication? I have no idea. Is this one more fuck up from their part?

4 comments

The Strobe Protocol Framework

posted August 2017

strobe

Introduction

The Strobe Protocol Framework is a specification, available here, which you can use to implement a primitive called the Strobe Duplex Construction. The implemented Strobe object should respond to a dozen of calls that can be combined together to allow you to generate random numbers, derive keys, hash, encrypt, authenticate, and even build complex symmetric protocols.

The thing is sexy for several reasons:

  1. you only use a single primitive to do all of your symmetric crypto
  2. it makes the code size of your library extremely small, easy to fit in embedded devices and easy to audit
  3. on top of that it allows you to create TLS-like protocols
  4. every message/operation of your protocol depends on all the previous messages/operations

The last one might remind you of Noise which is a protocol framework as well that mostly focus on the asymmetric part (handshake). More on that later :)

Overview

From a high level point of view, here is a very simple example of using it to hash a string:

myHash = Strobe_init("hash")
myHash.AD("something to be hashed")
hash = myHash.PRF(outputLen=16)

You can see that you first instantiate a Strobe object with a custom name. I chose "hash" here but it could have been anything. The point is to personalize the result to your own protocol/system: initializing Strobe with a different name would give you a different hash function.

Here two functions are used: AD and PRF. The first one to insert the data you're about to hash, the second one to obtain a digest of 16 bytes. Easy right?

Another example to derive keys:

KDF = Strobe_init("deriving keys for something")
KDF.KEY(keyInput)
key1 = KDF.PRF(outputLen=16)
key2 = KDF.PRF(outputLen=16)

Here we use a new call KEY which is similar to AD but provides forward-secrecy as well. It is not needed here but it looks nicer and so I'll use it. We then split the output in two in order to form two new keys out of our first one.

Let me now give you a more complex example. So far we've only used Strobe to create primitives, what if I wanted to create a protocol? For example on the client side I could write:

myProtocol = Strobe_init("my protocol v1.0")
myProtocol.KEY(sharedSecret)
buffer += myProtocol.send_ENC("GET /")
buffer += myProtocol.send_MAC(len=16)
// send the buffer
// receive a ciphertext
message = myProtocol.recv_ENC(ciphertext[:-16])
ok = myProtocol.recv_MAC(ciphertext[-16:])
if !ok {
// reset the connection
}

Since this is a symmetric protocol, something similar should be done on the server side. The code above initializes an instance of Strobe called "my protocol v1.0", and then keys it with a pre-shared secret or some key exchange output. Whatever you like to put in there. Then it encrypts the GET request and sends the ciphertext along with an authentication tag of 16 bytes (should be enough). The client then receives some reply and uses the inverse operations to decrypt and verify the integrity of the message. This is what the server must have done when it received the GET request as well. This is pretty simple right?

There's so much more Strobe can do, it is up to you to build your own protocol using the different calls Strobe provides. Here is the full list:

  • AD: Absorbs data to authenticate.
  • KEY: Absorbs a key.
  • PRF: Generates a random output (forward secure).
  • send_CLR: Sends non-encrypted data.
  • recv_CLR: Receives non-encrypted data.
  • send_ENC: Encrypts data.
  • recv_ENC: Decrypts data.
  • send_MAC: Produces an authentication tag.
  • recv_MAC: Verifies an authentication tag.
  • RATCHET: Introduce forward secrecy.

There are also meta variants of some of these operations which allow you to specify that what you're operating on is some frame data and not the real data itself. But this is just a detail.

How does it work?

Under its surface, Strobe is a duplex construction. Before I can explain that, let me first explain the sponge construction.

permutation

A sponge belongs to a field in cryptography called permutation-based cryptography. This is because at its core, it works on top of a permutation. The whole security of the thing is proven as long as your permutation is secure, meaning that it behaves like a random oracle. What's a permutation? Oh sorry, well, imagine the AES block cipher with a fixed key of 00000000000000000. It takes all the possible inputs of 128-bit, and it will give you all the possible outputs of 128-bit. It's a one-to-one mapping, for one plaintext there is always one ciphertext. That's a permutation.

SHA-3 is based on the sponge construction by the way, and it uses the keccak-f[1600] permutation at its core. Its security was assessed by long years of cryptanalysis (read: people trying to break it) and it works very similarly as AES: it has a series of steps that modify an input, and these steps are repeated many many times in what we call rounds. AES-128 has 10 rounds, Keccak-f[1600] has 24 rounds. The 1600 part of the name means that it has an input/ouput size of 1600 bits.

public/secret

So here our permutation is Keccak-f[1600], and we imagine that our input/output is divided into two parts: the public part (rate) and the secret part (capacity). Intuitively we'll say that the bigger the secret part is, the more secure the construction is. And indeed, SHA-3 has several flavors that will use different sizes according to the security advertised.

absorb

The message is padded and split into multiple blocks of the same size as the public part. To absorb them into our sponge, we just XOR each blocks with the public part of the state, then we permute the state.

squeeze

To obtain an output from this construction, we just retrieve the public part of our state. If it's not enough, we permute to modify the state of the sponge, then we collect the new public part so that it can be appended to the previous one. And we continue to do that until we have enough. If it's too much we truncate :)

And that's it! It's a sponge, we absorb and we squeeze. Makes sense right?

This is exactly how SHA-3 works, and the output is your hash.

What if we're not done though? What if we want to continue absorbing, then squeeze again, then absorb again, etc... This would give us a nice property: everything that we squeeze will depend on everything that has been absorbed and squeezed so far. This provides us transcript consistency.

duplex

The Keccak team said we can, and they created the Duplex construction. It's just something that allows us to absorb, to squeeze, to absorb, to squeeze, and on and on...

Building Strobe

"How is Strobe constructed on top of the Duplex construction?" you may ask. And I will give you an intuition of an answer.

Strobe has fundamentally 3 types of internal operations, that are used to build the operations we've previously saw (KEY, AD, Send_ENC, ...). They are the following:

  • default: state = input ⊕ state
  • cbefore: state = input
  • cafter: output, state = input ⊕ state

The default one simply absorbs the input with the state. This is useful for any kind of operation since we want them to affect the outcome of the next ones.

The cbefore internal operation allows you to replace the bits of the state with your input. This is useful when we want to provide forward-secrecy: if the state is later leaked, the attacker will not be able to recover a previous state since bits of the rate have been erased. This is used to construct the KEY, RATCHET and PRF operations. While KEY replaces the state with bits from a key, RATCHET and PRF replaces the state with zeros.

cafter is pretty much the same as the default operation, except that it also retrieves the output of the XOR. If you've seen how stream ciphers or one-time pads work, you might have recognized that this is how we can encrypt our plaintext. And if it wasn't more obvious to you, this is what will be used to construct the Send_ENC operations.

There is also one last thing: an internal flag called forceF that allows you to run the permutation before using any one of these internal operations. This is useful when you need to produce something from the Duplex construction: a ciphertext, a random number, a key, etc... Why? Because we want the result to depend on what happened previously, and since we can have many operations per block size we need to do this. You can imagine problems if we were not to do that: an encryption operation that would not depend on the previously inserted key for example.

Let's see some examples!

KEY

We'll start by keying our protocol. We first absorb the name of the operation (Strobe is verbose). We then permute (via the forceF flag) to start on a fresh block. Since the KEY operation also provides forward-secrecy, the cbefore internal operation is used to replace the bits of the state with the bits of the input (the key).

Send_ENC

After that we want to encrypt some data. We'll absorb the name of the operation (send_ENC), we'll permute (forceF) and we'll XOR our plaintext with the state to encrypt it. We can then send that ciphertext, which is coincidentally also part of the new state of our duplex construction.

I'll give you two more examples. We can't just send encrypted data like that, we need to protect its integrity. And why not including some additional data that we want to authenticate:

AD, Send_MAC

You'll notice, AD does not need to permute the Strobe state, this is because we're not sending anything (or obtaining an output from the construction) so we do not need to depend on what has happened previously yet. For the send_MAC operation we do need that though, and we'll use the cafter internal operation with an input of 16 zeros to obtain the first 16 bytes of the state.

In these description, I've simplified Strobe and omitted the padding. There is also a flag that is differently set depending on who sent the first message. All these details can be learned through the specification.

Now what?

Go play with it! Here is a list of things:

Note that this is still a beta, and it's still experimental.

comment on this story