Score:6

Crypto

Why is mod calculation necessary in a one time pad encryption?

GunJack

12/15/22, 9:42 AM

Considering the English alphabets to be encrypted why is the computation of mod 26 necessary after adding the pad to plain text. Is it just that it adds to another level of encryption or is it used so that two letters may end up with the same symbol in the cypher? 3 mod 26 is 3 and 29 mod 26 is 3.

2105

3 + 4

one-time-pad

fgrieu

12/15/22, 9:52 AM

Hints: (1) with mod used, and then not used, ask yourself the following: what is the highest possible ciphertext value? When would it occur? Would an adversary noticing that value in the ciphertext learn something about the plaintext? (2) Dig deeper: assume plaintext consisting of random independent characters, and a random pad: how much information (in bit or fraction thereof per plaintext character) is learned from the plaintext with mod used, and not used?

8

Reply

jjj

12/16/22, 12:34 PM

No two letters end up with the same symbol. Think of it like the alphabet repeats after 25. So 26 would be "a" again. 3 and 29 correspont to the same letter. Mod just makes the number smaller, so you only need 0 to 25

1

Reply

Joshua

12/16/22, 4:58 PM

Am I the only one whose first use of One Time Pad used five digit numbers with the numbers being nowhere close to either end? If it was random Gaussian rather than random uniform it leaks very little information.

0

Reply

Paul Uszak

12/16/22, 8:48 PM

@Joshua No, you can't have a Gaussian OTP key. Unless you're missing the randomness extractor from your TRNG

0

Reply

Score:21

Crypto

bk2204

12/15/22, 8:30 PM

There are two main reasons.

First, when we encrypt data with a symmetric algorithm, we generally want each unit to encrypt or decrypt to a unit of the same size (ignoring padding and MACs). In your case, when we're using English letters, we'd want to also get English letters out, and not a set of random numbers. Similarly, when we're encrypting a byte, we also want to get a byte out, since computers usually work with bytes and it's most convenient to process them that way.

Secondly, and more important, not using modular arithmetic here leaks information, sometimes a lot of information, about the data. For example, if we're using the range 0-25 to represent our letters, if we see a 0 as the encrypted output, we know that both the pad and the input were 0, and if we see 50, we know that both the pad and the input were 25. Similarly, 49 tells us that the two numbers involved were 24 and 25 in some order. With that type of information and statistical analysis, we can probably decrypt the ciphertext.

However, if we used modular arithmetic, then the output value doesn't teach us anything about the pad or the input, since every output value is equally likely. If the pad is truly random and used only once, then it provides perfect confidentiality.

0 + 14

fgrieu

12/16/22, 7:45 AM

"we can probably decrypt the ciphertext": assuming English without space, I think that's going to be challenging. Studying when that can be done is actually an interesting problem!

1

Reply

user253751

12/16/22, 8:50 AM

@fgrieu Challenging, but not necessarily impossible.

1

Reply

John Coleman

12/16/22, 11:46 AM

@user253751 With a random key stream I would say that a ciphertext only attack which succeeds in recovering the plaintext is typically impossible. What you can do is rule out some candidate plaintexts, so it definitely leaks information, but not generally enough for a complete break. For some shorter messages there might be enough information for a lucky guess, especially if there is some prior information about the likely content of the plaintext.

1

Reply

fgrieu

12/16/22, 1:18 PM

If the plaintext was random, I get we'd learn ≈0,7191 bit/symbol, which I think is not enough to reconstruct English even semi-reliably. How much we learn for a particular language plaintext depends on letter frequencies and the value assigned to each letter, but I think it remains significantly less than 1 bit/symbol, and that's still not much.

0

Reply

Jason Goemaat

12/16/22, 3:19 PM

Given that the question asks specifically about mod 26, I assume the message would be in plain english with punctuation and spaces not encrypted. This would be silly if you're trying for real security. You could create tables based on the likelihood of each letter given the value and take into account how often the letter occurs and what range is possible. You could do the same for words and constrain it with the possible and likely letters to come up with a likelihood for each word. Then you could constrain that with parsing the language, and any information you know about the message...

1

Reply

Jason Goemaat

12/16/22, 3:30 PM

For example say the original message is 'a-z' and without mod the values could be added to be 'A-Z' (i.e. 'e' is 4, adding 21 gives 'z', adding 22 gives 'A', etc.). If the ciphertext has a 'd', you know the letter has to be a, b, c, or d. If the ciphertext for a word is 'aB', you know that it is 'a' and some other letter from c to z. That could be 'at', 'an', 'as', but not 'aj', 'ao', 'ap' because those aren't words. 'as' is an advertb or conjunction and the 17th most popular english word, 'at' is the 20th and a preposition, 'an' is 32nd and an article. You could not decrypt the...

1

Reply

Jason Goemaat

12/16/22, 3:32 PM

message with 100% certainty in most cases, but you could write software to come up with the most *likely* decryptions without too much work.

1

Reply

John Coleman

12/16/22, 5:30 PM

@JasonGoemaat In classical cryptography, spaces and punctuation are *suppressed* rather than left unencrypted. Of course this is silly from the point of view of true security, but the unicity distance with a random key would still be close to the whole key length. You could come up with a notion of the most likely decryption, but given the astronomically large number of possible decryptions the probability in question could be well below 1%.

0

Reply

dan04

12/16/22, 9:34 PM

Well, to test this idea, here's a message where each letter is encoded A=0...Z=25 plus a pseudorandom “key” number between 0 and 25: 13-28-30-10-38-22-38-21-36-8-21-15-15-30-39-14-28-16-19-20-39-19-25-18-14-21-26-36-41-31-23-24-21-22-14-14-17-35-11-7-31-28-23-11-36-26-42-20-40-29-12-11-31-28-37.

1

Reply

John Coleman

12/17/22, 11:29 AM

@dan04 interesting idea. There are `37979562564556140050029561655702064479432532295680000000000000000000` possible decryptions, though the vast majority wouldn't correspond to sensible English. Looking at just the first three letters it is easy to find multiple 3-letter words which could encrypt that way (and of course it need not start with a 3-letter word). To make more serious progress I would construct a [DAWG](https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton) for a large word list and write a script for turning cipher-text initial sequences into paths through it.

1

Reply

user253751

12/17/22, 11:31 AM

@dan04 I'm going to assume it's not "congratulations, dramatic illustration, milfhunter privileges" (yes "milfhunter" seriously appears in google's 10k word list). But "congratulations" is a very likely starting word given the context, and "dramatic-" is also reasonably likely. Long words are much more likely to be guessed correctly because there are more constraints on them. Of course, it could just be a coincidence that "congratulations" fits. I did not try smaller words such as "congratulations, you have" or "congratulations, this is"

2

Reply

dan04

12/17/22, 11:51 PM

@user253751: “Congratulations” is indeed the intended first word. Now you just need to find a combination of other English words that (after removing punctuation and spaces) matches the regex `[A-O][D-Z][A-Q][A-T][A-U][O-Z][A-T][A-Z][A-S][A-O][A-V][B-Z][L-Z][Q-Z][G-Z][A-X][A-Y][A-V][A-W][A-O][A-O][A-R][K-Z][A-L][A-H][G-Z][D-Z][A-X][A-L][L-Z][B-Z][R-Z][A-U][P-Z][E-Z][A-M][A-L][G-Z][D-Z][M-Z]`

0

Reply

user253751

12/18/22, 8:15 AM

@dan04 "dramatic illustration, milfhunter privileges" is such a combination :P

2

Reply

John Coleman

12/18/22, 8:39 PM

@dan04 'congratulations: dictionaries provide many choices: hunt secret' is a possible decryption for which I have zero confidence. I wrote a script to find matches in segments of the cipher text using the 10K word list and was surprised at how many there were. Unless my script is buggy, there are nearly 2000 possibilities for the next word after 'congratulations'. Lucky guesses aside, this encryption method seems resistant to complete recovery of the plaintext (which of course doesn't mean it is any good).

1

Reply

Score:19

Crypto

Fax

12/16/22, 3:57 PM

Not using modulus leaks information.

For the English language, it's not so obvious. For an image, on the other hand...

Left half uses addition and modulus, right half uses addition and division by two (i.e. half brightness).

0 + 2

Paul Uszak

12/17/22, 11:12 PM

Not sure that young TUX is an appropriate example (re. OTPs), as uneven character values can't be divided into integers. So of course you've rounded your RBG values (integers). You really need a completely different algorithm/character/key mapping. Which then wouldn't be uniformly random, but rather a group of some sort.

0

Reply

dan04

12/17/22, 11:48 PM

@PaulUszak: To be technically correct, the output format would need to support RGB values in the range [0, 510], and use lossless compression. The image is just being approximated here because of technical restrictions. Point is, if you “encrypt” a raster image and get something that's easily recognizable to the human eye (albeit with random “noise”), then you have a really bad encryption algorithm.

4

Reply

Fax

1/30/23, 3:28 PM

@PaulUszak It's certainly possible to produce a 9-bit lossless image without division and rounding, but even if you have a monitor capable of displaying it, it won't change the conclusion.

0

Reply

Score:3

Crypto

Paul Uszak

12/15/22, 1:03 PM

It's not.

Encryption/decryption can be done in anyway you can think of. Maths isn't needed at all. All it represents is a 1:1 mapping (bijection) between the message and cipher text. The only important thing is that the key material is generated through a truly random (physical) process. It's just that either a modulo operation or XOR makes it simpler for a computer implementation.

This is encryption/decryption without maths using a reciprocal DIANA table:-

Also realise that there are more than 26 characters in cryptography. There are numbers, punctuation (very important as it can change the meaning of a sentence when declaring war on someone). Then there are enumerated codewords, like 73 (meaning something to someone). From the wiki page: " The JN-25 code used in World War II used a code book of 30,000 code groups superencrypted with 30,000 random additives. ". You might heretofore perform the mapping modulo 10, on a digit by digit basis.

0 + 3

Maarten Bodewes

12/15/22, 1:45 PM

The question reads: "why is the computation of mod 26 necessary *after adding the pad to plain text*". Your post is certainly interesting, but it doesn't answer the question. The second part of your answer is also at odds with the assumption in the first line of the question: "*Considering the English alphabets* to be encrypted"

4

Reply

ilkkachu

12/16/22, 10:30 AM

and if you look into how the table is generated, you see e.g. under column `N` the two entries `Ma` and `Nz` next to each other. That tells of modular arithmetic.

1

Reply

Paul Uszak

12/19/22, 10:45 PM

@MaartenBodewes My answer is correct because the question is moot. We can do "A/9" $\to$ which only involves friendship, and no maths.

0

Reply