Score:5

How to prevent power analysis on software level?

nu flag

When attacking RSA with Square&Multiply, one can figure out the secret key by looking at the exponentiation algorithm itself. To prevent this in software, we could use dummy multiplications after each square.

Yet, there are attacks like correlation power analysis on AES, which is vulnerable to it by definition. How can such attacks be prevented on a software level, without using any noise masking hardware components?

Imagine you want to update a device for side-channel resistance whose hardware cannot be changed, how can one proceed for algorithms like DSA, AES?

Score:6
mx flag

Software can do a number of things to thwart power analysis. Fundamentally, power analysis looks at power consumption to extract information about operations carried out inside the device. To make sense of the data it needs to know the correlation between device power consumption and the algorithm steps carried out by the device.

  • make time matching of operations harder
    • randomise order of S-box application in each round
    • shuffle in dummy computations on additional unused values
  • reduce leaked information from each operation by masking values during computations

Here's a good intro paper to look at for algorithm modifications

Less well known are changes to the mode of operation of cryptographic components. Power analysis on a given implementation may require hundreds to thousands of power traces on unique inputs. An attacker feeds plaintexts or ciphertexts of their choice into a keyed cryptographic function. Limiting the number of times a key can be used, or the range of inputs it will accept can limit the attacker's ability to extract information about a key.

As an example, when performing a key derivation using an attacker controlled Nonce from a master key. Instead of doing k=AES_enc(Nonce,K_master) which allows the attacker 2^128 queries of the key, the nonce can be broken into n bit parts and each part mixed with the master key in a single AES operation. Similarly, high level constructions that rotate keys frequently cost little but avoid giving the attacker time to extract information from any one key.

There's an app note from Microchip about that somewhere.

Finally, consider rate limiting. Make acquiring each power trace take longer. As an example, if a bank card is being used to perform a transaction requiring PIN entry, a software timer to delay cryptographic operations until 5-10 seconds after power on of the card might be acceptable.

Another option for rate limiting for online protocols is to use asymmetric cryptography to do an initial challenge response protocol before the device allows a sensitive key to be used. As an example, the device could store the root of a Merkle tree of 2^N hashed answer values. It randomly generates a challenge 0 ≤ challenge < 2^N and expects to receive the corresponding answer value and intermediate leaf hashes that can be used to reach the stored root hash. That's a simple bit of public key cryptography that only requires symmetric primitives. An attacker who has captured k challenge response pairs has their attack slowed by 2^N/k unless they're performing an online attack.

Score:1
ca flag

If one makes the software "look" like it could be in hardware, you can make something is fixed time and relative fixed power. This involves making decisions on how the hardware will behave through software, which involves removing lookup tables and cache hits.

I have a document on how the AES S-BOX is implemented in hardware. I have used this same architecture in software and verified that it is fixed time.

Sadly, you would have to do this for each cryptographic algorithm. You could automate this in a non-optimal way. If you had a logic tree for the algorithm, you could do a Karnaugh reduction to NAND gates (which one can us to implement any circuit), and then just have the CPU do a series of logic operations.

Maarten Bodewes avatar
in flag
Wouldn't some calculations still be constant time in HW and not in SW? For instance when it comes to data you could have a cache hit or miss, while in HW it would just reside in the circuit.
poncho avatar
my flag
@MaartenBodewes: if the memory locations accessed is not correlated to any secret data, that is, perhaps fixed or computable from data we're giving the attacker anyways (e.g. ciphertext), then cache misses are irrelevant (from a security perspective). The issue with cache misses is that the leaker may get some information about when they occur and what locations were being accessed - if the attacker can't leverage that into something we don't want him to know, well, why do we care?
Maarten Bodewes avatar
in flag
Well, this is about ciphers right, so secret information is involved for sure. What I mean to say is that you can emulate hardware or a hardware implementation, but you still have to consider that software is not hardware, and behaves differently e.g. when it comes to CPU & memory timing. For instance, I can imagine AES using tables being secure in HW, which is not secure in SW.
b degnan avatar
ca flag
@MaartenBodewes All base logic operations are constant time in the ALU (well, all the ones that I know). Also, if you math through it all, there's nothing to cache. You'll have your instructions in the cache, but there no data dependent lookups. It makes it relatively slow compared to a classical implementation, but it's constant time.
poncho avatar
my flag
@bdegnan: "All base logic operations are constant time"; common exceptions (which depend on the CPU) include multiply, divide/mod, variable rotate (and for the Falcon implementors, FP operations). Actually, I recall an 8 bit microprocessor where the 16 bit increment wasn't constant time - for people who must work in such an extremely constrained environment, they need to investigate their CPUs
poncho avatar
my flag
"For instance, I can imagine AES using tables being secure in HW, which is not secure in SW."; actually, you can do constant time/memory access table lookups in software - it involves scanning through the entire table on every lookup and using boolean masking logic to pick out the entry you're interested in. This is too expensive to be considered for use in AES - I have seen other scenarios where it made sense...
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.