david wong

Hey! I'm David, a security consultant at Cryptography Services, the crypto team of NCC Group . This is my blog about cryptography and security and other related topics that I find interesting.

How to parse scans.io public keys in python

posted December 2015

I wanted to check for weak private exponents in RSA public keys of big website's certificates. I went on scans.io and downloaded the Alex Top 1 Million domains handshake of the day. The file is called zgrab-results and weighs 6.38GB uncompressed (you need google's lz4 to uncompress it, get it with brew install lz4).

Then the code to parse it in python:

with open('rro2asqbnwy45jrm-443-https-tls-alexa_top1mil-20151223T095854-zgrab-results.json') as ff:
    for line in ff:
        lined = json.loads(line)
        if 'tls' not in lined["data"] or 'server_certificates' not in lined["data"]["tls"].keys() or 'parsed' not in lined["data"]["tls"]["server_certificates"]["certificate"]:
            continue
        server_certificate = lined["data"]["tls"]["server_certificates"]["certificate"]["parsed"]
        public_key = server_certificate["subject_key_info"]
        signature_algorithm = public_key["key_algorithm"]["name"]
        if signature_algorithm == "RSA":
            modulus = base64.b64decode(public_key["rsa_public_key"]["modulus"])
            e = public_key["rsa_public_key"]["exponent"]
            N = int(modulus.encode('hex'), 16)
            print "modulus:", N
            print "exponent:", e

I figured if the public exponent was too small (e.g. smaller than 1000000, an arbitrary lower bound), it would not work. Unfortunately it seemed like every single one of these RSA public keys were using the public exponent 65537.

PS: to parse other .csv files, just open sqlite and write .import the_file.csv tab, then .schema tab or any SQL query on tab will work ;)

Well done! You've reached the end of my post. Now you can leave me a comment :)