Score:5

Resistance against timing attacks of AES candidates

It's difficult to implement AES securely and efficiently if the adversary can observe the timing and (approximate) location of memory accesses, unless you have dedicated hardware. The naive implementation uses lookup tables, which are vulnerable to attacks based on caches or on memory bus contention. Timing-invariant implementations exist (using bitslicing) but they're slower.

Are the other candidates to the AES competition equally vulnerable to timing attacks? Especially the other finalists (Serpent, Twofish, RC6, MARS). Does Rijndael maintain its performance advantage if one only considers timing-invariant implementations that don't require special-purpose processor instructions? Were timing attacks at all considered when choosing the winner?

(Note that this is a question about history, not about algorithms to use today. Algorithms invented after 1998 are squarely off-topic.)

Score:8
ru flag

Although I've never implemented any of the AES runner-ups, I thought it would be a fun exercise to come back and look at them in this light. Note I haven't studied any optimized implementations in depth, just the textbook description. So here we go:

RC6

Its side-channel attack resistance boils down to the CPU having a constant-time implementation of rotations (or, at least, shift left and right, which can be composed to efficiently implement a rotation) by a variable amount, given in a register, in constant-time; this operation is required for the key schedule, encryption and decryption.

Hardware to implement this is called a barrel shifter, and interestingly, the main Intel architecture around the time of the AES contest (NetBurst, as used in the Pentium 4), didn't have one. Agner Fog, on the other hand, claims a fixed 4 cycles in the original NetBurst architecture (but perhaps he didn't investigate this with different shift widths) and 1 cycle for the updated P4E architecture as used on later Pentium 4 chips. I think this is the only high-profile case of lacking a barrel in AES- and post-AES-era desktop and server CPUs

Also, it's possible -- and I'd say likely, even -- that some microcontroller CPUs may lack either a barrel shifter or variable-shift instructions. Without that, there's no recurse but to emulate a barrel shifter in software using constant-time techniques (compute shifts, then do a conditional select between the shifted and non-shifted data). If there's support for constant-time shifts by any amount, this can be implemented in $O(\log n)$, using a ladder of shifts by 16, 8, 4, 2 and 1. This would already slow down the algorithm considerably, as data-dependent rotation is essentially the cornerstone of RC6. But it could be worse: some old architectures only did shift by 1, and in that case you'll need a 32-iteration loop, which would absolutely kill performance.

Aside from the issue of rotations, RC6 calls for multiplications. Again, these are mostly implemented in constant-time, but there are exceptions, especially in ARM land: the ARM7T and ARM9T cores, which I believe were available around the time of the AES contest, have variable-time multiplication, and even the latter Cortex-M3 (although the $32 \times 32$-bit multiplication instruction, used by RC6, is constant time).

I believe the remaining operations should be reasonably expected to be implemented in constant time: 32-bit addition, subtraction, XOR, and sequential accesses to the round keys.

Serpent

Like RC6, it employs rotation, but unlike RC6, and crucially, these are by fixed amounts. So even with a hypothetical, very inefficient shifter implementation in hardware (in which execution time is proportional to shift amounts), or in an architecture with only shift-by-1 instructions, no leak is possible since the shift amounts are fixed and thus public. Note that, even lacking a barrel shifter, "[t]he shifted or rotated data, though, does not leak." It also employs S-boxes, which by themselves might lead to similar issues as Rijndael.

But the whole point is moot, since Serpent is basically designed to be implemented in bitslicing mode if you care at all about performance, including the S-boxes. By nature of the (bitwise) operations employed, is the closest to a constant-time guarantee one can expect -- I simply cannot fathom any real-world hardware implementations of bitwise operations that could possibly leak timing data. But of course, you have to implement it using bitslicing to get these guarantees.

Twofish

Like Serpent, it employs rotations by fixed amounts, so as discussed there, this operation doesn't leak timing. However, it employs S-boxes, and from a cursory read of the specification, there's no trick to implement it differently from Rijndael (e.g. the bitslicing mode of Serpent), so it appears to be susceptible to the same timing attacks as Rijndael.

MARS

Employs every vulnerable operation possible: data-dependent rotations, multiplications and S-boxes. Apparently hopeless to implement in constant-time without extensive hardware support.

Ranking AES finalists by resistance to side-channel attacks

Given all of the above, this is how I would rank the AES finalists in terms of susceptibility to timing side-channel attacks, given the prevalence of issues with different primitive operations. The ranking is from best to worst:

  1. Serpent
  2. RC6
  3. Twofish/Rijndael
  4. MARS
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.