Score:0

Does a secret-dependent order of access of a lookup table create a possibility for a timing attack?

st flag

There are attacks that can extract some information from lookup table access that depends on a secret value, e.g. if we access a[0] or a[255] depending on some property of the secret, and a[255] causes a cache miss. But what if only the order of access changes depending on the secret value? E.g. we could have the a[0], a[1], a[2], a[3] or a[3], a[0], a[1], a[2] pattern. The order is different, but for every possible secret value the same set of indices is accessed (and, in my case, the access pattern is always cyclic, not entirely random). Would that still open up a possibility for an attack?

The practical application here is a constant-time implementation of bit shift on multi-limb integers (and if anyone has a reference to an existing implementation of such, I'd be curious to look at it). So the table can be several hundred 32- or 64-bit words.

Score:2
ng flag

Does a secret-dependent order of access of a lookup table create a possibility for a timing attack?

At least it creates a the possibility of a secret-dependent timing variation (this may allow attack, or not, depending on context). Avoiding secret-dependent timing in a portable way requires that memory access patterns (including array indexes) are not secret-dependent.

On many architectures, timing will broadly tend to be minimal when the array is accessed in increasing or decreasing index order, which will maximize cache hits; and larger when the array is accessed in some haphazard order, tending to cause cache misses. The actual timing will depend on the state of the memory subsystem including cache, which on multithread CPUs (including most modern desktop and mobile CPUs) often depends on what other processes do. Having a cyclic access pattern will somewhat mitigate this effect, but how much remains dependent on the target CPU's architecture and activity of other processes.

Application is constant-time implementation of bit shift on multi-limb integers

Except in RC5 and RC6, I have never seldom met the need for implementation of bit shift with time independent of the shift count (this does not happen in RSA, Schnorr signature, DSA, ECDSA, EdDSA for large integers shifted by a variable shift count, except for shift count restricted to less than the limb width).

However it can be done, with much less performance hit than the obvious (multiplication by a power of two obtained in constant time). Assuming only byte access to memory, there's an easy algorithm with constant memory access pattern that shifts by $0$ or $2^k$ bits, and that can be used for each bit of the shift count. Assuming byte access to memory, we only need to special-case $k\in\{0,1,2\}$. Assuming a 64-bit barrel shifter, we can do the lower 6 bit in one pass.

fjarri avatar
st flag
> Except in RC5 and RC6, I have never met the need for implementation of bit shift with time independent of the shift count. This is used in some modular inversion algorithms (decomposing the modulus into `s * 2^k` where `s` is odd) and division/modular reduction algorithms (where one needs to renormalize the modulus to have the MSB set)
fjarri avatar
st flag
Also thanks for the ideas for a fast implementation, this may be better than what I ended up with (which works in `O(N^2)` time with `N` being the number of limbs)
fgrieu avatar
ng flag
@fjarri: For modular inversion algorithms without secret-dependent timing variation, we can use binary (extended) GCD ([this](https://eprint.iacr.org/2020/972.pdf) gives the basic method and improvements). In "division/modular reduction algorithms (where one needs to renormalize the modulus to have the MSB set)", the shift count is limited to less than one limb worth (typically <32 or <64 bit) and a barrel shifter does in a single pass over the data. I updated the answer to make my statement sharper.
fjarri avatar
st flag
I plan to implement eGCD as well, but frankly I would not bet that it will end up faster. The RNS inversion used already existing methods and was quicker to add, to have the functionality in place. "the shift count is limited to less than one limb worth" - it is not for big integers, you have to rescale all the way to the full bitsize, not just to one limb.
fjarri avatar
st flag
Btw, I did actually end up with a much simpler and faster constant-time shift algorithm (O(N * log(N)) thanks to your answer.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.