Erratum: the number 1024 used in the formulas on slides 50 and 51 (slide 15 in the summary version) and on pages 4 to 6 of the paper should be 1042, i.e. (512+9)*2. Correcting it would change our reported theoretical and derived speeds only very slightly.
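
As a quick sanity check of the arithmetic, the following minimal sketch recomputes the corrected constant and compares it with the printed one. The variable names are illustrative only. How the constant enters each formula is not restated here, so the relative difference is just an indication of scale, not the exact change in any reported speed.

```python
# Values taken from the erratum; variable names are hypothetical.
printed = 1024
corrected = (512 + 9) * 2

# Relative difference between the corrected and the printed constant.
relative_change = (corrected - printed) / printed

print(corrected)                  # 1042
print(f"{relative_change:.4%}")   # 1.7578%
```

If a formula depends on this constant linearly or inversely, the affected figure would shift by roughly 1.7 to 1.8 percent; other dependencies would need to be checked against the formulas themselves.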