Communication Theory of Secrecy Systems is a paper published in 1949 by Claude Shannon discussing cryptography from the viewpoint of information theory. It is **one of the foundational treatments (arguably the foundational treatment) of modern cryptography**. It is also a proof that all theoretically unbreakable ciphers must have the same requirements as the one-time pad.

source: wikipedia

I found an old Matthew Green's post where he wrote a really useful list of cryptography blogs and resources

I'll get back here after reading everything.

Studying about smartcard there seem to be a lot about whitboxes to learn, since it is indeed a whitebox: the encryption/decryption that are done inside the cards can be analyzed since you own the card. Analysis are separated in different categories like non-intrusive and intrusive. Intrusive because for efficient analysis you would have to remove some part of the plastic covering the interesting parts and directly plug yourself on the chip. This is what **Differential Power Analysis** (DPA) do, it's a stronger kind of Simple Power Analaysis (SPA).

**Kocher & al** found out about this in 1998 and released a paper that is still very useful today: http://www.cryptography.com/public/pdf/DPA.pdf

The idea is to record the power consumption of the chip along multiple encryptions. You then obtain curves with pics that you can correlate to XORs operations being performed. You can guess what cipher is used, and where are the known rounds/operations of the cipher from the intensities of some peaks, and the periodicity of some patterns. In the paper they study DES which is still the state of the art for block ciphers then.

Looking at a big number of such curves, along with the messages (or ciphertexts) they encrypted, you can focus on one operation and **one bit** of the internal state to find out one bit of one of the subkey. One bit should affect the number of XORs being performed thus you should find a **correlation** between the bit you're looking for and the power consumption at one point. Repeat and find all the other ones. It's powerful because you only need to find one bit of the subkey, one after the other.

It's pretty hard to explain it without pictures (and a video would be even better, that's always something I have been wanting to do, if I dig deeper into it maybe I'll try that). But the basic idea is here, if you want more info check the original paper

I was wondering why Randomized Algorithm were often more efficient than non-randomized algorithm.

Then I looked at a list of random number generators (or **RNG**).

Of course we usually talk about **PRNG** (Pseudo Random Number Generator) since "truly random" is impossible/hard to achieve.

An interesting thing I stumbled into is that you can create a **PRNG** using a block cipher in counter mode, by iterating the counter and always encrypting the same thing, if the block cipher used is good, it should look random.

This sounds solid since ciphers sometimes need to have Ciphertext Indistinguishability from random noise.

To support such deniable encryption systems, a few cryptographic algorithms are specifically designed to make ciphertext messages indistinguishable from random bit strings

Also under the **Ciphertext indistinguishability property** that a cipher should respect, you shouldn't be able to find any relations between the ciphertexts coming from the same input but encrypted with an increasing counter.

MicroCorruption is a "game" made by **Matasano** in which you will have to debug some programs in **assembly**. There is a total of 19 levels and it gets harder and harder by the number. The first levels are made for beginners though! So it seems like a great tool to learn, and that's what our prof Emmanuel Fleury must have thought when he gave us this as homework. One rule: every level is worth one point.

I started writing the solutions here but as people told me "it was unethical to post solutions of a challenge online", I promptly removed them. If someday the challenge will shut down I will post the write ups online so that people can still learn from it :)

I feel like I don't write much about my formation, and that it could be useful to people who are wondering about **studying Cryptography at Bordeaux University**.

There is a good article from a M1 student here: http://journaldumaster.stats.yt/master-csi-presentation/

And as it says there, the master 1 is do-able both for maths and CS people as long as you're willing to catch up in the other subject. There's a lot of theory that will allow you to study more interesting subjects in the second year of Master.

I've talked about some of the subjects but one subject I forgot to talk about was a M1 class: Elliptic Curves, taught by Fabien Pazuki and if you have the chance of taking a class from the guy just do it. He's one of the best math teacher I have had in my life, along with Vincent Borrelli (Surfaces & Curves at Lyon 1) and **some dude I can't remember the name of**. Each one of them were both really passionate and making true efforts to be pedagogical.

Someone wrote about this awesome explanation of what is AES from scratch on the **#crypto** channel on freenode. It's pretty nice!

One of my professor organized a Hacking Week this semester but I didn't have time to do it. Since I'm in holidays I thought I would take a look at it and write a bit about how I solved them.

Here's the **Crypto Challenge number 2** (out of 5) from this **CTF** (Capture The Flag):

user0:$apr1$oTsx8NNn$bAjDZHpM7tCvHermlXKfZ0
user1:$apr1$UxOdpNtW$funTxZxL/8y3m8STvonWj0

user2:$apr1$w7YNTrjQ$0/71H7ze5o9/jCnKLt0mj0
user3:$apr1$AIw2h09/$Ti0TRlU9mDpCGm5zg.ZDP.
user4:$apr1$048HynE6$io7TkN7FwrBk6PmMzMuyC.
user5:$apr1$T2QG6cUw$eIPlGIXG6KZsn4ht/Kpff0
user6:$apr1$2aLkQ0oD$YRb6aFYMkzPoUCj70lsdX0

You have 7 different users with their respective password hashed and you have to find them. It's just the 2nd out of 5 crypto problems, it's pretty basic, but I never brute forced passwords for real before (I remember using **John The Ripper** when I was in middle school but that's for script kiddies).

What's Apr1 ? It's a hash function that uses md5. And md5 is pretty weak, lots of **rainbow tables** on google.

This is how Apr1 looks in PHP according to Wikipedia, also the passwords are supposed to be alpha (a to z) in lowercase.

```
function apr1($mdp, $salt) {
$max = strlen($mdp);
$context = $mdp.'$apr1$'.$salt;
$binary = pack('H32', md5($mdp.$salt.$mdp));
for($i=$max; $i>0; $i-=16)
$context .= substr($binary, 0, min(16, $i));
for($i=$max; $i>0; $i>>=1)
$context .= ($i & 1) ? chr(0) : $mdp{0};
$binary = pack('H32', md5($context));
for($i=0; $i<1000; $i++) {
$new = ($i & 1) ? $mdp : $binary;
if($i % 3) $new .= $salt;
if($i % 7) $new .= $mdp;
$new .= ($i & 1) ? $binary : $mdp;
$binary = pack('H32', md5($new));
}
$hash = '';
for ($i = 0; $i < 5; $i++) {
$k = $i+6;
$j = $i+12;
if($j == 16) $j = 5;
$hash = $binary{$i}.$binary{$k}.$binary{$j}.$hash;
}
$hash = chr(0).chr(0).$binary{11}.$hash;
$hash = strtr(
strrev(substr(base64_encode($hash), 2)),
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/',
'./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
);
return '$apr1$'.$salt.'$'.$hash;
}
```

It seems pretty difficult to reverse. Let's not forget that hashes are one-way functions and that they also lose information. I don't know if they do lose information on a 7-letters-password though, but it seemed quite stupid to go down this road when I could just brute force it.

What language offers a good library to hash with Apr1? Well I didn't know, and I felt like maybe Unix could do it well for me.

Turns out that **OpenSSL** has a command line for it:

`openssl passwd -apr1 -salt SALT PASSWD`

A quick bash script later:

```
#!/bin/bash
test[1]='$apr1$oTsx8NNn$bAjDZHpM7tCvHermlXKfZ0'
salt[1]='oTsx8NNn'
test[2]='$apr1$UxOdpNtW$funTxZxL/8y3m8STvonWj0'
salt[2]='UxOdpNtW'
test[3]='$apr1$w7YNTrjQ$0/71H7ze5o9/jCnKLt0mj0'
salt[3]='w7YNTrjQ'
test[4]='$apr1$AIw2h09/$Ti0TRlU9mDpCGm5zg.ZDP.'
salt[4]='AIw2h09/'
test[5]='$apr1$048HynE6$io7TkN7FwrBk6PmMzMuyC.'
salt[5]='048HynE6'
test[6]='$apr1$T2QG6cUw$eIPlGIXG6KZsn4ht/Kpff0'
salt[6]='T2QG6cUw'
test[7]='$apr1$2aLkQ0oD$YRb6aFYMkzPoUCj70lsdX0'
salt[7]='2aLkQ0oD'
while read line
do
if [ "${#line}" == 7 ]
then
for num in {1..7}
do
noob=$(openssl passwd -apr1 -salt $salt[$num] $line)
if [ "$noob" == "$test[$num]" ];
then
echo $line;
fi
done
fi
done < /usr/share/dict/words
```

I read the `/user/share/dict/words`

that contains a simple dictionary of words on Unix, I try only the 7-letters-words.

The test ran in a few minutes and gave me nothing.

Well, I guess with a 7 letters password they must have used gibberish words. Let's try a real bruteforce:

```
for a in {a..z}
do
for b in {a..z}
do
for c in {a..z}
do
for d in {a..z}
do
for e in {a..z}
do
for f in {a..z}
do
for g in {a..z}
do
truc=$a$b$c$d$e$f$g;
for num in {1..7}
do
noob=$(openssl passwd -apr1 -salt $salt[$num] $truc)
if [ "$noob" == "$test[$num]" ];
then
echo $truc;
fi
done
done
done
done
done
done
done
done
```

It ran and ran and... nothing.

Well. Let's not spend too much on this. There is John The Ripper that does this well and even oclHashcat that does this with the GPU.

Let's create a `john.conf`

with the following to limit the password to 7 letters:

```
[Incremental:Alpha7]
File = $JOHN/alpha.chr
MinLen = 7
MaxLen = 7
CharCount = 26
```

Let's launch John:

`john -i=Alpha7 hackingweek.txt`

(don't forget to put the hashed password in hackingweek.txt).

Wait and wait and wait.. and get the passwords =)

I got asked this question in an interview. And I knew this question beforehands, and that it had to deal with hashtables, but never got to dig into it since I thought nobody would asked me that for a simple internship.

I didn't know how to answer, in my mind I just had a simple php script that would have looked like this:

```
$arr = array(-5, 5, 3, 1, 7, 8);
$target = 8;
for($i = 0; $i < sizeof($arr) - 1; $i++)
{
for($j = $i + 1; $j < sizeof($arr); $j++)
{
if($arr[$i] + $arr[$j] == $target)
echo "pair found: ${arr[i]}, ${arr[j]}";
}
}
```

But it's pretty slow, it's mathematically correct, but it's more of a CS-oriented question. How to implement that quickly for machines? The answer is **hash tables**. Which are implemented as **arrays** in PHP (well, arrays are like super hash tables) and as **dictionaries** in Python.

I came up with this simple example in python:

```
arr = (-5, 5, 3, 1, 7, 8)
target = 8
dic = {}
for i, item in enumerate(arr):
dic[item] = i
if dic.has_key(target - item) and dic[target - item] != i:
print item, (target - item)
```

- iterate the list
- assign the hash of the value to the index of the value in the array
- to avoid finding a pair twice, we do this in the same for loop:

we do the difference of the target sum and the number we're on, we hash it, if we find that in the hash table that's good!
- but it could also be the number itself, so we check for its index, and it has to be different than its own index.

Voilà!
We avoid the n-1! additions and comparisons of the first idea with hash tables (I actually have no idea how fast they are but since most things use hash tables in IT, I guess that it is pretty fast).

One last exam, ECC, and then I'm free to do whatever I want (no I still haven't found an internship, but I talked with **TrueVault**, **Cloudflare**, **MatterMark**, **Spotify** and maybe **Matasano** so this has been a good experience nonetheless).

I stumbled upon the notes of Ben Lynn an ex Stanford's student that took an ECC class there. They're pretty awesome and I kinda want to do something like that on this blog. Maybe next year it's a bit late for that :)

The notes are here

We're learning a lot of algorithm in my **algebre et calcul formel** class. One of them is the **Toom-Cook algorithm** used for multiplication of large integers.

I found a super simple explanation of it on a forum, it helps:

Say, we want to multiply 23 times 35.

We write,

p(x) = 2x + 3,

q(x) = 3x + 5.

We are using our realization that any integer can be written as a polynomial.

Here, p(x), represents 23, and q(x), represents 35, when x equals 10.

We write,

p(x)q(x) = r(x).

That is, p(x) times q(x), equals r(x).

So,

(2x + 3)(3x + 5) = ax^2 + bx + c = r(x).

Now,

p(0)q(0) = r(0).

So,

(2*0 + 3)(3*0 + 5) = a*0 + b*0 + c.

Therefore,

c = 15.

Now,

p(1)q(1) = r(1).

Therefore, when we do the substitutions (for x and c),

a + b = 25.

Now,

p(-1)q(-1) = r(-1).

Therefore, when we do the substitutions (for x and c),

a - b = -13.

Now, we already know c, and we just need to find a and b.

We have two linear equations and two unknowns,

a + b = *25,

a - b = -13.

We just add the two equations and we get,

2a = 12.

Therefore,

a = 6.

Now, we can substitute 6 for a in,

a + b = 25,

and we get,

b = 19.

So,

r(x) = 6x^2 + 19x + 15.

Now, we substitute 10 for x in r(x), and we are done,

r(10) = 600 + 190 + 15 = 805.

Believe it or not!

I knew that my principal cryptography professor Gilles Zémor was a GO player.

Which is pretty amazing in itself :)

But this keeps going on.

I have an algebra class this semester, and I'm trying to understand **Berlekamp's algorithm**. Trying to find videos on youtube about him I discover that he is as well a go player! And doing researches about the game at that! So cool :D

I wanted a recall on how masquerade worked in NAT, and I wanted a fast recall.

What's better than a picture? Nothing of course :D

source

If you read this blog, you know that recently I gave a talk on bitcoins.

I also gave a talk on whitebox cryptography last week.

One part of giving a talk that a lot of people tend to overlook is **making good slides**.
I've always used Powerpoint for that, but for my last talk on whitebox cryptography I had two other persons on my team. Powerpoint was not an option if we were all working on the same file. LaTeX is the solution.

It's a real text file so you can use a revision control system like git, it's constant in its layout. You configure it at the beginning of the file and then you don't have to worry about it later.

We also had a fight (we were tired) on what theme to used. I went for no theme at all. Because everything else is **visual noise**.

Here's a great article from **Zach Holman** on the subject. If you ask me, and I'm not saying my slides are perfect, there are way too many crappy slides out there!