Score:0

Symmetric encryption algorithm based on multiplication

tf flag
Tom

I've been wondering about this paragraph for some time:

Multiplication is a great mixing function. If you work out what multiplication looks like in terms of ANDs and XORs it becomes apparent how elaborate a 64bit multiply is. The amount of transistors required to implement it in hardware prohibits multiplication from being used in most cryptographic algorithms. But for non-cryptographic PRNGs which only need to run on a general purpose CPU, multiplication is very useful because there is already a hardware implementation.

https://tom-kaitchuck.medium.com/designing-a-new-prng-1c4ffd27124d

Usually in encryption algorithms we use modular addition, rotation, and exclusive OR operations. But is there anything that could stand in the way of using modular multiplication, rotation, and exclusive OR operations?

Modular multuiplication is slower than addition, but it's probably not that much slower, and for sure it is stronger mixing function. Multiplication is in fact a great mixing function, so why it is so rarely used in symmetric cryptography? I think even smarphones can do 64-bit multiplication very quickly and have some hardware implementation for multiplication (but I'm not sure).

Is the slowness of multiplication really such a big problem that multiplication can't find widespread use in fast lightweight encryption algorithms? Probably on IOT devices or RFID chips it can be a problem, but when it comes to computers and smartphones, an encryption algorithm based on multiplication couldn't be a problem, isn't it?

phantomcraft avatar
pf flag
Unless the CPU has a integer hardware multiplier it shouldn't be a problem. The problem starts when using in a very small environment like smartcards when the tiny processor only do very basic operations as the ones you cited,
Score:5
my flag

Is the slowness of multiplication really such a big problem that multiplication can't find widespread use in fast lightweight encryption algorithms? Probably on IOT devices or RFID chips it can be a problem, but when it comes to computers and smartphones, an encryption algorithm based on multiplication couldn't be a problem, isn't it?

Part of the issue appears to be the definition of 'lightweight', and the intended platforms it is targeted. The CPUs on smartphones are actually quite capable; I would not characterize those platforms (or laptop computers) as necessarily 'lightweight'. Lightweight crypto is generally designed with microcontrollers in mind; typically, those microcontrollers don't have built-in 64x64 bit multiplication instructions.

Now, modular multiplication (for modulus a power of 2) can be implemented by a series of shifts and conditional additions; certainly doable, but considerably more expensive than an addition operation.

The other issue would appear to be that modular multiplication isn't as wonderful as you would have hoped. For this discussion, I'll limit my discussion to multiplication modulo a power of 2 (multiple modulo a prime doesn't have these issues; they do have have issues around the range not being a power of 2).

  • Modular multication does not have any 'right-word' propagation; for example, flipping the high bit of one of the inputs would only affect the high bit of the output; the other bits are unaffected. Of course, modular addition has the same issue; however it's also cheaper.

  • Modular multiplication does have strong differentials; the strongest is based around the identity $(-x)*y = -(x*y)$ (and the modulus operation does not break this up).

Both of these issues can be designed around in a proper design; however the fact that you have to do so makes it less attractive. In addition, it begs the question: why not use multiplication in $GF(2^k)$ instead? If we're doing a shift/add implementation, a double/xor implementation of Galois multiplication isn't much more expensive, and it avoids the above two issues...

Tom avatar
tf flag
Tom
Of course multiplication mod 2^n has big issues with mixing the bits (especially low), that's why we combined it with xor and rotate. But so far as I know addition has even bigger issues of this type. I'm not sure but probably even stronger differentials.
Tom avatar
tf flag
Tom
Can multiplication GF(2^k) be similarly fast as multiplication 64-bit numbers mod 2^64? I think it has to be GF(2^64) to get the same amount of bits.
poncho avatar
my flag
@Tom: obviously, any such answer would be quite hardware-dependent. On large CPUs, I believe that they have 'carryless multiply' operations; the necessary fixup isn't free, so it'll be slower than a straight 64x64 multiply, but it wouldn't be bad. On small CPUs (without a multiply), a constant-time double/conditional xor $GF(2^{64})$ should be close to the analogous shift/conditional add 64x64 multiplication...
Tom avatar
tf flag
Tom
I made 32-bit LCG generator and LCG generator in GF(2^32). And multiplication in GF(2^32) perfmorms slightly worse in PractRand. It looks like there are also some problems with low bits. Won't there be some other differentials? I'm not sure. It seems that the multiplication in GF(2) does not mix the bits any better. Or maybe the main advantage is that it's harder to find multiplicative inverse?
poncho avatar
my flag
@Tom: the advantage of GF multiplication is this: if $a \ne 0$ and $b$ is unknown and uniformly distributed, then $a \times b$ will also be uniformly distributed. This is not true for multiplication modulo a power of 2, if $a = 2$, and $a \times b$ will always be even.
Score:4
ye flag

The block cipher IDEA from 1991 used modular multiplication mod $2^{16}+1$ for diffusion (where 0 is mapped to $2^{16}$).

As zero-divisors are not good from a cryptological standpoint, the modulus should be prime, and of the form $2^b+1$ as one prefers to work with bits (and not use the 0), so $b=2, 4, 8, 16$ ($b=1$ would be linear).

If you design a cipher using these modular multiplications, you will run into (at least) two problems:

  • the cryptographic properties of the modular multiplications are not well understood, making it hard for you to show that your cipher is good
  • for smaller devices side-channel attacks have to be considered, but it's hard to protect these modular multiplications against those (especially againt DPA; but already timing attacks might be a problem, if multiplication is not constant time)
poncho avatar
my flag
I would argue that the cryptographical properties of modular multiplication are understood. In addition, as far as DPA, modular multiplication is threshold-friendly, based on the identity $(A+B)*(C+D) = (A*C + B*D) + (A*D + B*C)$ (which extends in the obvious way for 3+-way thresholding)
Tom avatar
tf flag
Tom
Side-channel attacks and timing attacks are really issue that has to be considered, I forgot about it. Hovewer, isn't addition similarly vulnerable to such a problems? Or maybe it isn't because we can easily transform it into shift and xors, which are constant time?
poncho avatar
my flag
@Tom: the main problem with timing and multiply is on lower-end platforms, on those platforms, the multiply instruction might not be constant time (e.g. stop when one of the operands runs out of '1' bits). Now, you could implement a constant-time multiply on those platforms; however a crypto-naïve implementor might not. In contrast, CPUs almost always implement addition in constant time...
Tom avatar
tf flag
Tom
One more question. To get analog of 64-bit multiplication (mod 2^64), what GF should be used? I want to multiply 64-bit numbers and get 64-bit results. Am I right I have to use GF(2^64)? There are examples of GF(2^8) everywhere, AES is also operating on GF(2^8), but it works on bytes. If someone would like to replace 64-bit multiplication, he has to use GF(2^64), am I right? Or maybe it could be something different, like in AES GF(2^8), but on bytes? Then rules of such arithmetic are slighty different, than in GF(2^64), if I understand it right.
ye flag
@poncho: The formulas you gave for threshold implementations hide two implementation issues: The addition is mod $2^{16}+1$, so the additive shares don't fit into 16 bits, and you need a modular reduction that is easily implemented, but you have to consider that the un-reduced values are visible to the attacker (via DPA). But the real problems for a DPA-secure implementation will show up, when you mix the modular multiplication with other operations like xor or addition mod $2^{16}$, as you have to switch between different masking operations, which is non-trivial (and time-consuming).
ye flag
@Tom: You know that the multiplication in the field with $2^{64}$ elements is quite different from multiplication mod $2^{64}$?
Tom avatar
tf flag
Tom
@garfunkel yes, this is different, I know. I studied the subject and wrote my own multiplication programs and even a GF(2^128) multiplication function based on GCM mode and pclmulqdq instructions.
ye flag
@Tom: (From your writing I thought you know, and asked just to be sure.) Two additional differences between using multiplication mod $2^{16}+1$ (using all field elements but the 0) and multiplication in the field with $2^{16}$ (using all field elements including the 0) is that the latter has always the 0 as fixed point, and - much more important - that the latter is linear over F2 ("xor"), so misses one essential property of cryptographic S-Box. So like in AES you might have to use inversion over the field with $2^{64}$ elements instead.
Score:0
ca flag

Encryption, of any type, is costly in hardware area and power. Multipliers are indeed large, and deep and thereby slow compared to the base ALU full adder. I'm going to do a bunch of hand waving, but if you grab a CPU architecture book, you'll find the following: The design of the multipliers are always different, but the MUL unit pipeline for a 64-bit result is 4-deep, and for a 128-bit result is 8-deep on most CPUs. This is because you use chained full adders blocks, and the maximum depth in the 4-deep design is 16. The full-adder depth for an XOR, ROL, or ADD is equivalent to a single full-adder. Super-scaler processors hide much of the delays of multipliers; however, if you are really grinding away on a problem, will find an empty pipeline with a lot of delays.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.