How to build a periodic PRF from a PRF?

Tristan Nemoz

10/7/22, 1:44 PM

This question may be related to this one, though the construction differs.

Let us consider a PRF $f$. We define $g_k$ as $g_k(x)=f(x)\oplus f(x\oplus k)$. Is $g_k$ a PRF, assuming $k$ is chosen at random?

I tried to prove this as follows. Let us consider an adversary $\mathcal{A}$ that is able to distinguish between $g_k$ and a PRF with non-negligible advantage. Let $\mathcal{R}$ be a reduction that has access to $\mathcal{A}$ and wants to break the PRF security of $f$. In both games, $b=0$ denotes the real world and $b=1$ the random world, where a truly random function is applied instead of $f$ or $g_k$.

At the beginning of the game, $\mathcal{R}$ choses $k$ at random. When $\mathcal{A}$ queries for $x$, $\mathcal{R}$ queries for $x$ and $x\oplus k$, XORs the result and send it back to $\mathcal{A}$. When $\mathcal{A}$ returns its guess $b'$, $\mathcal{R}$ returns the same bit.

In order to prove that $\mathcal{R}$ has a non-negligible advantage, we just have to show that it perfectly simulates an oracle implementing $g_k$. In the case $b=0$, it is the case, nothing differentiates $\mathcal{R}$ from an oracle implementing $g_k$. If $b=1$ however, $\mathcal{A}$ expects to get $\pi(x)$ for a random function $\pi$, while it receives $\pi(x)\oplus\pi(x\oplus k)$. $\pi(x)$ is uniformly random and, by definition of a random function, not related to $\pi(x\oplus k)$, so what $\mathcal{R}$ returns to $\mathcal{A}$ is uniformly random too. It is to be denoted that now that $\pi$ has been defined on $x$ and on $x\oplus k$, $\mathcal{A}$ can predict the value of the encryption of $x\oplus k$ since this would the same as $x$'s. Since $\mathcal{A}$ does not know $k$, this is not a possible strategy. Hence, $\mathcal{A}$ cannot distinguish between these situations.

Is this proof correct? The thing that bothers me is that this new PRF has a lot of collisions, which is quite surprising for a PRF, but I think that the adversary can't find them unless they get to know $k$.

204

1 + 1

provable-security

pseudo-random-function

Samuel Neves

10/7/22, 4:52 PM

Well, there are two bad events in the ideal world: $k = 0$, with probability $2^{-n}$, and $x_i = x_j \oplus k, 0 \le i < j < q$. Summing over all queries, $P[x_i = x_j \oplus k] = q(q-1)/2^n$. So the PRF advantage of $g_k$ is going to be something like $\mathbf{Adv}_f^{\mathrm{PRF}}(2q) + 2^{-n} + \frac{q(q-1)}{2^n}$. You can mount a distinguisher on $g_k$ by querying $\approx 2^{n/2}$ distinct random inputs $x_i$, obtaining a collision. From this collision you can obtain with high probability $k = x_i \oplus x_j$, and confirm with another query that you're working with $g_k$.

Score:4

Crypto

Mikero

10/7/22, 4:47 PM

What you have written is correct and accurate intuition. The adversary indeed cannot distinguish without "guessing" $k$. This can be formalized, and it is this formalization that seems to be missing from your post.

Here is a sketch of a traditional game-based proof of security. Without loss of generality, assume the adversary never repeats a query.

Real hybrid:

Choose random $k$ and a random function $\pi$
For every adversary query $x$, respond with $\pi(x) \oplus \pi(k \oplus x)$

Hybrid 1:

Choose random $k$ and a random function $g$
For every adversary query $x$, if there was a previous query on $x \oplus k$ then return $g(x \oplus k)$; otherwise return $g(x)$

Ideal hybrid:

Choose random $k$ and a random function $g$
For every adversary query $x$, return $g(x)$

Here the observations that complete the proof:

The real hybrid and hybrid #1 are identically distributed. Following the logic of the real hybrid: if there was no prior query on $x \oplus k$, then both $\pi(x)$ and $\pi(x \oplus k)$ are independent of the adversary's view so far, so the result is exactly uniform. Otherwise, the result is exactly the response from the query on $x\oplus k$. This exactly matches the logic of Hybrid #1.
In Hybrid #1 and the ideal hybrid, define a "bad event" as the case where the adversary queries $x$ after previously querying $x \oplus k$. Note that the two hybrids perform exactly the same operations (giving uniform/independent responses) until this bad event happens. Thus these two hybrids are "identical-until-bad" in the terminology of Bellare & Rogaway. By the Bellare-Rogaway "Fundamental Lemma", the adversary's distinguishing probability is bounded by the probability that the bad event happens in the ideal hybrid.
What is the probability of the bad event in the ideal hybrid? A nice thing about this hybrid is that the adversary's view is clearly independent of $k$ --- it would change nothing to imagine $k$ being chosen after the adversary has finished making all its queries. If $k$ is chosen after the adversary's queries, then we see that the bad event happens if there exist queries $x_i, x_j$ such that $k = x_i \oplus x_j$. With $q$ queries there at most $q^2$ possible values of $x_i \oplus x_j$, so $\Pr[bad] \le q^2/2^n$, where $n$ is the length of the $x$'s.

Overall, the adversary's distinguishing advantage is bounded by $q^2/2^n$.