Cracking an XOR crypt with a know key length

MyselfAndOnlyMe

2/3/24, 3:02 AM

I'm trying to crack a crypt with a known key length. I deduced that the operation made was a hex XOR. Here is the crypt:

330a1448010816101c1e470b0248104711050903040a0844511317130d030817024812150d014817150d1c48050f0751181415071f0618060e510618000a051b190606144822080e1006040a42051d1302101e1b040a423d4651330a14480608101548010816101c1e470f1011511507170d0347161e48050f0751181d060c0548181311140417470b1f48100306181c18080c511c1e4716190d510206180a1d0242051d1302105f4838094205001447231f0c14144e511f1902101448050f07511b010201180d02470b0248180906180f14090d041b5d4716190d030242101a1447111e0514470d050014154212041e14071d115115071d09050206510b040b16181e1013071548010816101c1e4711010d120e07024651370d0509050807024806021014481809160307151201140c510817051b180307511c1902423006150211511a14000b1e0651010d041a5104071f1c04150b141b5106051e4451060c154819061414481302011e051447031f48180916140f03060e5118101516510717470f040b19470d1748050f07511f1e150e154f0247041e071547110418010b1b5f48381342181b51130a14480608101d0c561442170704151619451d0610160d02134217071e0342121a1e174e510e1e0b0e1e1f1809055105100e18144451100a14090547031f0c51150b120d5f

I have tried to use this tool to decrypt it. The tool outputted multiple possible keys and for each key, an attempt do decipher the crypt. I know the result should be plain text English. The closest I got was this:

T-e po1ato ,s a 6tarc-y, t0bero0s cr*p fr*m th  per nnia) nig-tsha!e So)anumetube7osumeL. T-e wo7d po1ato (ay r fer 1o th  pla+t it6elf ,n ad!itio+ to 1he e!ibleetube7. Inethe ndesi whe7e th  spe&ies ,s in!igen*us, 1hereeare 6ome *thereclos ly r late! cul1ivat d po1ato 6peci s.P*tato s we7e in1rodu&ed o0tsid  theeAnde6 reg,on f*ur c ntur,es a"o,a+d ha3e be&ome $n in1egra) par1 of (uch *f th  wor)d's #ood 6uppl<. Iteis t-e wo7ld'sefour1h-la7gestefoodecropi fol)owin" mai?e, w-eat $nd r,ce.

After some digging and manually tweaking the text, I got this:

The potato is a starchy, tuberous crop from the perennial nightshade Solanum tuberosum L. The word "potato" may refer to the plant, itself, in addition to the edible tuber. In the Andes, where the species is indigenous, there are some other closely related cultivated potato species. Potatoes were introduced outside the Andes region four centuries ago, and have become an integral part of much of the world's food supply. It is the world's fourth-largest food crop, following maize, wheat and rice.

It unfortunately did not work. I am now trying to find a clue as to what I should be doing to find the answer.

307

1 + 5

xor

key-size

Amit

2/3/24, 9:41 AM

Hi and welcome to the site. I don't quite understand the issue at hand, you are asking how to find the answer but it seems like you have recovered the plaintext, so you have the answer apparently. If it's the precise XOR key you're looking for, note that for a simple XOR cipher since we have $p \oplus k=c$ where $p$,$k$,$c$ are the plaintext, key and ciphertext, XOR'ing both sides with $p$ you obtain $p \oplus c=k$ , so given the plaintext and the ciphertext you can easily recover the key.

MyselfAndOnlyMe

2/3/24, 1:51 PM

Hi, the plain text I recovered is not the answer to the cypher. I am still trying to find the appropriate text. What I found was the closest I got. Thanks

Amit

2/3/24, 1:58 PM

I don't completely understand, but it sounds to me like you're asking for a technique to obtain the key that will decrypt the ciphertext correctly, and not the correct plaintext (looks like you know what is the resulting plaintext already). If the key used for the XOR encryption is completely random this amounts to a One time pad and it is practically unbreakable save for an exhaustive search attack. I don't think that's the case here. Therefore, I suggest you to take a look at this: https://cryptopals.com/sets/1/challenges/6 As a nice tutorial on how to break a "repeating-key XOR".

Amit

2/3/24, 2:17 PM

Note that if you are uncertain about the encryption method, perhaps XOR is not the only one that was used here. It may be that you found the correct XOR key but there is another layer of encryption applied

Amit

2/3/24, 2:39 PM

.. and also, whenever applying the standard statistical techniques to break such a cipher there is an assumption that the character distribution of the underlying plaintext is very close to the character distribution of the relevant language in general. This is correct only approximately, and becomes less accurate the shorter the plaintext (/ciphertext) is. This is why even if you implement the cracking algorithm 100% correctly, it may still give you a wrong key as the "most likely" key.

Score:1

Crypto

Amit

2/4/24, 11:37 AM

Your problem is a very common one when starting to study techniques of cryptanalysis. Cryptanalysis relies in many cases heavily on statistical properties, and therefore rather than provide you with an accurate answer, a good cryptanalytic technique will instead reduce the solution space to a space that is easy to perform exhaustive search on.

To see how that applies to your issue, note that after correctly deducing that the cipher is a repeated-XOR cipher, and that the key size is 5 bytes long (probably via methods such as Hamming distance estimation) the next step is to fit the letter distribution of the underlying plaintext to the generalized distribution of letters in the (english) language.

But the problem is that this, in general, will not be a perfect fit because while the generalized letter frequency was calculated statistically over enormous amount of text, the plaintext at hand is quite short, and in general will be necessarily shorter than the text based on which the generalized frequency was established.

This is why both at this stage (and actually also at the stage where you estimate the key length) you should design the cryptanalytic code to take into account several of the most likely options, rather than assuming that the best fit is the correct one for the case at hand. For example, as you can see in the Wikipedia page I have linked, there are actually two generalized letter distributions to choose from: one based on texts, another one based on dictionaries. Those two can be your top two attempts for example, upon which you base the first two calculations of the possible key. It is also easy to construct other possible letter distributions by slightly tweaking the original one or consulting other sources. Note that even if you have to consult 1000 possible letter distributions before finding which one yields the correct key, this is still a big improvement over a brute force search over a 5 byte key: $2^{8\cdot5} = 2^{40}$ options is a lot more than $1000 \approx 2^{10}$. I hope that demonstrates the essential point here.

+ 0

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Cracking an XOR crypt with a know key length

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.