Score:1

Compute OTP key if set of plain texts and its set of cipher texts are known

cn flag

Given a set of plain texts $P \subseteq \{0, 1\}^n$. Assume we know the corresponding set of cipher texts $C \subseteq \{0, 1\}^n$ produced by applying one-time pad with an unknown key $k \in \{0, 1\}^n$.

Question: How to compute $k$, based on $P$ and $C$?

My approach: For every pair $(p, c) \in P \times C$, compute the key $k' = p \oplus c$. Output the key most frequent key $k'$.

My question: What is the probability of success of the proposed algorithm?

The problem with this approach is that the most frequent key $k'$ is unique in some cases. In some other cases, $k'$ is not unique. For example, when $P = C = \{0, 1\}^n$.

Score:0
de flag

Given we have the sets $P$ and $C$, to find $k$, we need only find (one of) the pairs $(p,c)\in P \times C$ s.t. $p \oplus k = c$, as with this pair, $k$ is trivial to compute.

First, we can note the property that given two corresponding pairs $(p_1, c_1)$ and $(p_2, c_2) \in P \times C$, $c_1 \oplus c_2 = p_1 \oplus p_2$ - in other words, the difference between two ciphertexts is the same as the difference between their corresponding ciphertexts.

We can use this to which of our plaintexts correspond to the ciphertexts, as assuming $c_i \ne c_j$, the set of differences, $d_i^c$, for a given $c_i$ with all other words in $C$ will be unique. The same holds for a given $p_i$, with $d_i^p \ne d_j^p$ for any other set of differences for a different plaintext in $P$, however for corresponding plaintext ciphertext pairs, these differences will be the same.

For example, if $p_1$ encrypts to $c_1$, then $d_1^p = d_1^c$, and we are able to pair these two as we know these differences are unique to the given plaintext/ciphertext.

This then allows us to calculate $k = p_1 \oplus c_1$.

Score:0
ng flag

Nitpick: As soon as $|P|>1$ (several plaintexts), it's not applied the One Time Pad, but "XOR with a key". I'll assume that.


For many instances of the problem, it can be ruled some keys $k=p\oplus c$ by finding a plaintext $p'$ such $p'\oplus k$ is not in $C$. The question's method does not do this, and considers keys that could be eliminated.

For relatively small $P$ and $C$ (or same size), this algorithm will work nicely:

  • make a set of already tested keys, initially empty.
  • for each pair $(p,c)\in P\times C$
    • $k\gets p\oplus c$
    • if $k$ is not in the set
      • enter $k$ in the set
      • for each $p'$ in $P$
        • if $p'\oplus k$ is not in $C$
          • exit the loop and skip next step, moving to the next pair $(p,c)$
      • print $k$ which is a possible key.

When $|P|=2$, the algorithm let to terminate always outputs 2 keys, and if we stop at the first it's success rate is $1/2$.

For larger $|P|$ forming a small haphazard subset of possible plaintexts, chances are good that there's a single key, and success rate is near $1$. On the other hand it's easy to make pathological $P$ with $|P|=2^n$ allowing $2^n$ keys.


This does not answer the part of the question asking the success rate of the initial algorithm.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.