Score:2

Barak et al. proof that black-box obfuscation is impossible

br flag

I have been attempting to analyse the classic proof presented by Barak et al. that claims Black-Box Obfuscation is not possible for (what appears to be) most classes of programs.

The proof is presented in such a manner where it is said that if there exists an encrypted program C'(a, b, x) which returns b if and only if a = x, and another encrypted program D'(a, b, f) which returns 1 if an only if f(a, b, x) = b, then D'(a, b, C(a, b, x)) = 1 with a probability of 1. This will also mean that an attacker will be able to differentiate C'(a, b, x) from another function Z() which returns 0 at all points, as D'(a, b, Z()) = 1 with a probability less than 1.

The proof does not really make sense to me though, as assuming an attacker is unable to test every single value of a and b there would appear to be no way to conclude there is any difference between C'(a, b, x) and Z(). Black-Box Obfuscation would hold to be true however if the only way to differentiate two programs was to test every single input and inspect the output.

Is there anyone that could help explain to me how this proof is truly conclusive to say that Black-Box Obfuscation (for the most part) is not possible?

kodlu avatar
sa flag
please provide a link to the paper if available online
James avatar
br flag
@kodlu I have now included the link to the proof mentioned (the paper describes it in a much more mathematical manner than I did, but as it is quite a famous proof I have acquired a simplified form to present here).
Score:3
us flag

The paper defines two function classes:

\begin{align*} C_{\alpha,\beta}(x) &= \begin{cases} \beta & \mbox{ if } x=\alpha \\ 0^k & \mbox{ otherwise} \end{cases} \\ D_{\alpha,\beta}(F) &= \begin{cases} 1 & \mbox{ if } F(\alpha)=\beta \\ 0 & \mbox{ otherwsie} \end{cases} \end{align*}

The point is that if you are given any circuit $C^*$ (even an obfuscated one) computing the same function as $C_{\alpha,\beta}$ then $D_{\alpha,\beta}(C^*)=1$.

On the other hand, if you only have black-box access to $C_{\alpha,\beta}$, and $\alpha,\beta$ are chosen uniformly, then it will be hard to come up with an input that causes $D_{\alpha,\beta}$ to output 1.

Intuitively, having access to an obfuscation of $C_{a,b}$ gives you strictly more power than having black-box access to $C_{a,b}$.

The proof does not really make sense to me though, as assuming an attacker is unable to test every single value of $\alpha$ and $\beta$ there would appear to be no way to conclude there is any difference between $C_{\alpha,\beta}$ and $Z$ (a function that outputs zero on all inputs).

The attacker doesn't distinguish obfuscations of $C_{\alpha,\beta}$ from obfuscations of $Z$ by trying every input. The attacker distinguishes by passing the obfuscation as input to $D_{\alpha,\beta}$. $D_{\alpha,\beta}$ has the "correct" $\alpha,\beta$ baked into it -- it knows where to look so it can easily distinguish $C_{\alpha,\beta}$ from $Z$.

James avatar
br flag
But how can can a hacker know that `a` and `b` are baked into both obfuscated functions (and hence that `D'(C')` will always return `1`) unless it just happens to be that `a` has been baked in to be equal to `x`? Except for that case, `C'` seems it would still appear identical to `Z`? Also wouldn't this proof then simply suggest that Black-Box Obfuscation is impossible in the very specific scenario that a hacker finds, for instance, two functions that have the exact right baked-in variables?
us flag
When we define security for encryption, we say (for example) that ciphertexts look indistinguishable from random, even if the adversary got to choose the plaintext. When we define VBB security for obfuscation, we say that the obfuscated program leaks no more than black-box access to the function, even if the adversary got to choose the function. Also, later on in the Barak et al. proof they combine both C and D into a single program, which ensures that C and D both have the same $\alpha$ and $\beta$ in mind.
James avatar
br flag
Even if an adversary chose the `a` and `b` for both `C` and `D` though, if we passed them back both `Z'` and `C'` at a later date they wouldn't be able to tell which one is which? That is unless they baked-in `x` as well. If they have baked-in every single variable though then the proof seems a bit redundant as then they are just deliberately creating a function `C` that always returns `1`, and a function `Z` that always returns `0`. If `x` is not baked-in though, then the adversary has to test every single `x` on both `C'` and `Z'` till they produce different results?
Manish Adhikari avatar
us flag
If you want more intuitive answer involving obvious information leakage, consider for a moment that the program $C^*_{\alpha,\beta} (.)$ which calculates the same function as $C_{\alpha,\beta} (.)$ with $\alpha$ and $\beta$ chosen randomly from a large space. Now change $D_{\alpha,\beta} (F)$ slightly such that it returns $\beta$ if $F(\alpha)=\beta$ and $0^k$ otherwise. Given access to $D$ can you extract $\beta$ from a black box running $C$. What about from $C^*$?
James avatar
br flag
I am probably going to have to apologise as I might still be missing something here. I would say though @ManishAdhikari that I would imagine you would extract exactly as much information about $\beta$ from a black-box running $C$ as from $C*$, unless you already knew what $\alpha$ is baked in, or you are able to bake in $\alpha$, $\beta$, and $x$ such that $C_{\alpha,\beta}(x)$ will always return $\beta$ and never $0^k$. In all other cases you would have to test every single $x$ till you found the value that the function returns $\beta$?
Manish Adhikari avatar
us flag
Yes it seems you are still missing the point. I said that $D$ is available to you. With $C^*$, no matter how obfuscated $\alpha$ and $\beta$ might be in the program, you can extract $\beta$ by simply feeding it to $D$. But a black box running $C$ cannot be given to anything and you are stuck with only giving it input and seeing the output. Extracting $\beta$ in this case will require trying a lot of $x$ as you rightly said.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.