Prove that what's running inside a black box is code that has been released publicly

in flag

Suppose someone has a black box which is running some program. The box has some internal state, and we can observe the outputs it produces. Now someone comes along and claims to have released the source code of the program running on that black box. Is there a way to verify that claim, without also requiring the disclosure of its internal state? (A practical example of this could be, say, Twitter's recommendation algorithm.)

A quick, impractical solution I can come up with is to use FHE. With a version of the internal state encrypted under FHE, a third party could independently verify the result of running the source code.

fgrieu avatar
ng flag
There's no way to determine what a black box is doing. That's why it's called black. If we add what's required to see what the black box is doing (including what software it is running), then it's no longer black and is verging on white (update: [this answer]( has good examples). I don't see that FHE can change that.
nialv7 avatar
in flag
@fgrieu You are technically right. But if I can independently verify the black box's output matches the output you would expect from running the released source code, then the black box is _effectively_ running that code.
fgrieu avatar
ng flag
Observing what a box has output so far and matching that with output of reference code proves neither that the box has been effectively running this reference code, nor that in the future it will keep on having the same output as the reference code. So I don't get what the question is actually asking, or how FHE could help.
eps avatar
ca flag
@nialv7 except if the black box is programmed to give you the outputs you want while it is being tested. There have been phone manufacturers caught doing this with benchmark tests (the phone runs faster when it knows it's being benchmarked). A rather extreme example was the Volkswagen emissions scandal, where the car ran differently when it knew it was being tested at the emissions testing facility.
Mooing Duck avatar
us flag
"Not only did his compiler know it was compiling the login function and inject a backdoor, but it also knew when it was compiling itself and injected the backdoor generator into the compiler it was creating. The source code for the compiler thereafter contains no evidence of either virus."
usul avatar
tj flag
By the way, the impossibility of this task is what motivates "trusted" platform modules, where the "black box" is equipped with a trusted piece of hardware which does the verification. See
JeffC avatar
gp flag
@nialv7 One way you might be able to confirm it's the same is if you knew of a bug in the (alleged) copied code and could reproduce that bug in the black box running the new code. I can't say this would prove it 100% but it would be a pretty strong indicator that it was copied. If you could confirm additional bugs, each one would make the case stronger.
Bakuriu avatar
cn flag
What's the purpose of this question? Keep in mind that a *technical* answer may well be very different from a **legal** answer. A judge/jury may decide that it is more likely than not that the blackbox is running some specific software and use that fact to resolve some legal questions/ramifications. In law it's not always necessary to prove something, it depends a lot on the context.
jp flag
@Bakuriu: Given the last sentence of the first paragraph of the question, I would assume an example problem would be: "Is it possible to determine, purely by observing Twitter through its public API, whether the Twitter recommendation system is actually running the code Twitter posted on GitHub."
stevec avatar
cn flag
Not the same question, but related: [*Are reproducible builds practically possible on major app stores?*](
Nat avatar
de flag
Are you asking about a case where the black-box is running a program that _includes_ the original source-code somewhere in it?
Vi. avatar
cn flag
Do you mean something like [risc0](
Christophe Le Besnerais avatar
sh flag
you could publish a fingerprint hash of the source code inside the black box to compare with the public code
sa flag

Another way to think of this: You cannot distinguish whether the black box is running something which happened to match what you expect so far, vs running the actual code. This is a fundamental difference between some input/output representation and actual code. There is no way to get around this.

One very basic problem in computer science is the halting problem. We cannot determine if a given code will halt or keep running independently, since we have an uncountable number of possible inputs and state sequences [by Cantor diagonal argument]. And the representation of what's inside the box by a finite state machine is one very simple representation, it doesn't even have to match this representation, it can have hardware/environmental data dependent interrupts, run processes in parallel, switch between processes, etc. etc.

benrg avatar
cn flag
I understand the first paragraph of this answer, but I can't make sense of any part of the second. There is no black box in the halting problem, so I don't see how it's relevant, and that goes double for "hardware/environmental data dependent interrupts" etc.
TLW avatar
cn flag
@benrg - proving if an arbitrary Turing machine T recognizes a given decidable language L is an undecidable problem, even when given the source of T not just oracle access. You can show this via reducing to solving the halting problem. Construct a TM U_V that first runs V, then checks if the input is in L. Does U_V recognize L? Answer: only if V halts in a finite number of steps. Oops.
user2357112 avatar
in flag
@TLW: That would be relevant if we were trying to prove properties about the code's output given its source code, but we're trying to prove what the *source code* is given its *output*. That's a completely different problem.
TLW avatar
cn flag
@user2357112 - we are trying to prove if an arbitrary Turing machine T recognizes a given decidable language L, given only oracle access to T. This is no less difficult than proving if an arbitrary Turing machine T recognizes a given decidable language L, given the source of T... which can be reduced to solving the halting problem. Sorry if my "not just oracle access" wasn't clear enough last time - I had to cut down my first attempt because it didn't fit in a comment, and obviously over-pruned.
user2357112 avatar
in flag
@TLW: "we are trying to prove if an arbitrary Turing machine T recognizes a given decidable language L" - but we're not trying to prove anything like that. We're trying to prove what the black box's source code is, not what language it recognizes.
cn flag

TL;DR - Not possible, in several ways

I had exactly this problem some decades ago in my DSP patent career, when faced with two competitors' implementations of our idea in encrypted FPGAs.

The code, such as it was, bordered on the trivial, perhaps 10 lines of C equivalent. The problem was that there was an internal state of perhaps 200 bits, and the observable output on each clock cycle was only 3 bits. Fortunately it was possible to control the 40 bits of input to the FPGA.

I attacked it as a Hidden Markov model. I can place the year that this occurred, as I recall our company had 486 PCs on many desks, but a few of the managers and directors had Pentiums. I recorded the output of the FPGA in a file, and then on Friday afternoon, toured the company recruiting every PC I could muster to run my code over the weekend, searching the FPGA output for matching sequences of those 3 bits.

On Monday morning, I harvested the matches, and stitched them together, rather as 'shotgun sequencing' would for DNA.
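The core matching step can be sketched in a few lines. This is an illustrative reconstruction, not the original code: it slides the reference output over the recorded box output and reports the longest agreeing run of 3-bit values, which is what the weekend-long distributed search was looking for.

```python
# Illustrative sketch: find the longest run where the reference
# simulator's output and the recorded black-box output agree.
# ref_out and box_out are lists of 3-bit output values (0..7).
def longest_match(ref_out, box_out):
    """Length of the longest run of matching consecutive outputs."""
    best = 0
    for start in range(len(box_out)):          # offset into box recording
        for offset in range(len(ref_out)):     # offset into reference output
            run = 0
            while (start + run < len(box_out)
                   and offset + run < len(ref_out)
                   and box_out[start + run] == ref_out[offset + run]):
                run += 1
            best = max(best, run)
    return best
```

The brute-force O(n²·m) search is exactly why it took a building full of PCs over a weekend; longer and longer matches are the "lock tumblers clicking into place" described below.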

Faced with competitor A's black box, a reasonable match was visible once I got to a sequence length of 100, and it improved as I got to 1000. There was no improvement at 10k, and by the time I got to 30k the match was deteriorating. I had to tell my boss that I didn't know what was inside, but that it was not our code.

Black box B was more forthcoming. As I increased the match length, it got better with each decade. I could almost hear the lock tumblers clicking into place in my excitement as I got to 100, then 10k, then 100k. This had our code in it. Or at least, something that behaved identically.

Years later, the truth emerged. Both competitors used an IIR implementation of the basic idea, whereas we had used an FIR implementation. However, A used floating point arithmetic with an LSB or two of added noise. B used integer arithmetic, where poles and zeros can cancel exactly. So both were using our basic idea. Neither was running our precise code.

The moral is clear. Even with a problem as small as this one, it's only possible to say that the black box is not running an identical copy of your code. Whether the version it is running is in any way 'equivalent' to yours is a potentially long and expensive argument for patent lawyers.

va flag

The other answers cover the general case well. For the related/implied broader question of "is there any way I can know some other server out there is actually running some specific code", there are some ways that something which would otherwise be a black box can "make itself transparent", so that you can reasonably believe certain code is running.

Code running in e.g. an AWS nitro enclave, an Intel SGX enclave [PDF], Apple's secure enclave, or Android equivalent, can have that system produce an "attestation", which includes a cryptographically signed hash of the system image/running code/etc. If you trust AWS/Intel/etc, and you're given such an attestation, you can in turn trust that certain code is running, without needing to see the rest of its private state.

This isn't without flaws: SGX has had bugs in the past, and in general you've still just moved your trust from one third party to another. But it definitely has its uses; it's the underpinning of Google's SafetyNet (now the Play Integrity API), which lets app developers know e.g. that a phone is running the genuine unmodified application, that the phone isn't rooted, etc.
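A toy sketch of the attestation idea, not any vendor's actual protocol: the trusted hardware hashes the loaded code ("measurement") and signs it; a remote verifier who trusts the signer checks the signature and compares the measurement against the published code. HMAC with a shared key stands in for the vendor's signature scheme here; real enclaves use asymmetric signatures and certificate chains, and all names below are hypothetical.

```python
# Toy model of remote attestation (not a real vendor protocol).
import hashlib
import hmac

VENDOR_KEY = b"shared-key-standing-in-for-vendor-PKI"  # hypothetical

def attest(code_image: bytes):
    """What the trusted hardware would produce: (measurement, signature)."""
    measurement = hashlib.sha256(code_image).digest()
    signature = hmac.new(VENDOR_KEY, measurement, hashlib.sha256).digest()
    return measurement, signature

def verify(measurement: bytes, signature: bytes, expected_code: bytes) -> bool:
    """What a remote verifier checks: the signature is valid, and the
    measurement matches the hash of the publicly released code."""
    ok_sig = hmac.compare_digest(
        hmac.new(VENDOR_KEY, measurement, hashlib.sha256).digest(), signature)
    ok_code = measurement == hashlib.sha256(expected_code).digest()
    return ok_sig and ok_code
```

The point is that the verifier never sees the box's private state, only a signed hash of the code, which is exactly the trust transfer described above: you believe the code claim because you believe the signer.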

hr flag

This is related to the domain of verifiable computation as laid down in

The setting of VC is:

  • A client (think a thin client, like a small laptop) has an expensive function $F$ it needs computed, on an input $x$ that may need to be kept secret, but only a small budget for computation, not enough to compute $F$ on its own.
  • A worker (think a cluster of large computers available as a utility) can do lots of computation, but may be flaky and may leak its inputs.

Using VC, the client can encode the function $F$ and the input $x$ so that the worker can do the computation and return an encoded answer and a proof $\pi$, so that the client can decode the answer $y$ and use the proof $\pi$ to verify $y = F(x)$ more cheaply than by simply recomputing $F$.

The two properties a VC scheme must provide are loosely:

  • Security: The worker can't fool the client into accepting an answer $y' \ne F(x)$.
  • (optional) Privacy: The worker can't learn anything about $x$ or $F(x)$.

It's not quite enough to encrypt the inputs with fully homomorphic encryption: just as public-key encryption doesn't automatically give rise to public-key signatures (no, it's not just “encryption with the private key”), fully homomorphic encryption doesn't automatically give rise to verification either. But FHE is involved in the Gennaro–Gentry–Parno scheme, along with Yao's garbled circuits.

Of course, the worker might do other work than computing the (encoded) function $F$. In fact it would be rather surprising if it didn't; after all, it has to do network communication, run an operating system, and generally exist in a physical world of imperfections. But with VC the client can be confident that the worker did compute $y = F(x)$.

Unfortunately, the setting you describe is slightly different.

  • In the VC setting, the same client provides the secret inputs and verifies the secret outputs—that is, the power to encode inputs is the same as the power to decode and verify outputs. And the client is assumed to keep the inputs and outputs secret from the worker.
  • In your setting, you want one power to encode input—what you call the “internal state”, which is kept secret—and a separate power to decode and verify output—which is revealed to the world.

In your setting, there's not much to verify—if you see an output, the only question to verify is whether there exists any input under which the function could possibly return the output you saw. For a sufficiently general function (a surjection onto the space of outputs), this is vacuously true, so there's nothing to verify; this is likely the case of a Twitter recommendation algorithm.

Perhaps if the black box provided a commitment $c := C_r(x)$ to the secret inputs it has, and revealed an output $y = F(x)$, then it could also provide a non-interactive zero-knowledge proof $\pi$ that it is the same $x$ used to make the commitment $c = C_r(x)$ and the output $y = F(x)$.

This doesn't enable anyone to verify now that the black box is computing $F$, but it would enable everyone to verify later, if $x$ and $r$ are ever revealed, that the black box was computing $F$.
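A rough sketch of this commit-now, audit-later idea, using a hash commitment C_r(x) = SHA-256(r || x) and a toy stand-in for $F$. Without the zero-knowledge proof $\pi$, the auditor simply recomputes $F$ once $x$ and $r$ are revealed; everything here is illustrative.

```python
# Commit-now, audit-later sketch with a hash commitment.
import hashlib
import os

def F(x: bytes) -> bytes:
    """The public function; a toy stand-in for the real computation."""
    return hashlib.sha256(b"F:" + x).digest()

def commit(x: bytes):
    """The box commits to its secret input x with blinding randomness r."""
    r = os.urandom(32)
    return hashlib.sha256(r + x).digest(), r

def audit(c: bytes, y: bytes, x: bytes, r: bytes) -> bool:
    """Later, if (x, r) are revealed, anyone can check that the same x
    produced both the published commitment c and the output y."""
    return c == hashlib.sha256(r + x).digest() and y == F(x)
```

The binding property of the hash is what prevents the box from later exhibiting a different $x'$ that explains the same outputs.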

Of course, you could also just recompute $F$ at that point, but perhaps the proof $\pi$ could be made short enough that verifying it is substantially cheaper than recomputing $F$ as in VC.

And, naturally, any interesting verification here assumes the black box cooperates by providing commitments and proofs, which in the case of Twitter is unlikely to happen.

nialv7 avatar
in flag
Thank you! I feel most of the other answers here, while enlightening in their own ways, are answering the wrong question.
mw flag

Let's model the black box program as something which takes a sequence of inputs and produces an output in response to each input, while also maintaining some internal state.

The answer is very trivially "no" if we consider programs with the same observable behaviour but different "source code" to be different. For example, "always output X" and "if input = X then output X, else output X" are "different" programs but they always have the same observable behaviour. So let's change the problem slightly to say that we don't care about verifying the "source code" of the program, we only want to verify the program's behaviour. Or more formally, we define an equivalence relation on programs which says two programs are "practically the same" if their outputs always match, for any input sequence, and the goal is to verify that the program in the box belongs to the same equivalence class as the one we have in mind.

This is still impossible.

A simple counterexample shows that the program running in the box cannot be recognised in general by observing its outputs. Let's say that the inputs and outputs are either 0 or 1, and the state is a single flag. Consider the following four programs:

  • Program A: On any input, output 0.
  • Program B: Set flag = 0. On input, if flag = 1 then output 0, else set flag = 1 and output the same as the input.
  • Program C: Set flag = 0. On input, if flag = 1 then output 0, else set flag = 1 and output (1 - the input).
  • Program D: Set flag = 0. On any input, if flag = 1 then output 0, else set flag = 1 and output 1.

These four programs are all distinct according to the equivalence relation we defined above. But:

  • If the first input is 0 and the response is 0, the program could be either A or B.
  • If the first input is 0 and the response is 1, the program could be either C or D.
  • If the first input is 1 and the response is 0, the program could be either A or C.
  • If the first input is 1 and the response is 1, the program could be either B or D.

In any case, the response to all subsequent inputs will be 0, so we can't narrow it down further. You would need a "reset" button to be able to tell which of these four programs the box is running.
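The four programs above are tiny one-bit state machines, and the pairwise ambiguity in the bullets is easy to check by simulation. This is an illustrative encoding, with each program as a function from (flag, input) to (new flag, output):

```python
# The four programs, each mapping (flag, input) -> (new_flag, output).
def prog_a(flag, x): return flag, 0                     # always output 0
def prog_b(flag, x): return 1, (0 if flag else x)       # echo input once
def prog_c(flag, x): return 1, (0 if flag else 1 - x)   # invert input once
def prog_d(flag, x): return 1, (0 if flag else 1)       # output 1 once

def run(prog, inputs):
    """Feed an input sequence to a program starting with flag = 0."""
    flag, outputs = 0, []
    for x in inputs:
        flag, y = prog(flag, x)
        outputs.append(y)
    return outputs
```

For example, `run(prog_a, [0, 1, 1])` and `run(prog_b, [0, 1, 1])` both give `[0, 0, 0]`: after the first input the programs are stuck outputting 0, so no continuation separates them.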

The problem here is straightforward: after some inputs, the programs can reach states where their behaviour will stay the same, even though their behaviour would be different were they not stuck in those states. So suppose we forbid this by demanding that the program running in the box has no "sink" states, i.e. from every state the box might be in, there is some sequence of inputs that can take it to any other state.

Unfortunately the answer is still no:

  • Program A: Set flag1 = 0, flag2 = 0. On input of 0, output flag1 and set flag2 = flag1. On input of 1, output flag2, and set flag1 = 1 - flag2.
  • Program B: Set flag1 = 0, flag2 = 1. Then as above.
  • Program C: Set flag1 = 1, flag2 = 0. Then as above.
  • Program D: Set flag1 = 1, flag2 = 1. Then as above.

These programs have no "sink" states. There are four possible states, (0,0), (1,0), (1,1) and (0,1), and the input sequence "1010" cycles through them in that order. But these programs cannot be fully distinguished from outside the black box:

  • If the first input is 0 and the response is 0, the program could be A or B, and either way the state is now (0,0) so all future outputs will match.
  • If the first input is 0 and the response is 1, the program could be C or D, and either way the state is now (1,1) so all future outputs will match.
  • If the first input is 1 and the response is 0, the program could be A or C, and either way the state is now (1,0) so all future outputs will match.
  • If the first input is 1 and the response is 1, the program could be B or D, and either way the state is now (0,1) so all future outputs will match.

In this counterexample, the problem is that the black box has two bits of state, both of which affect the program's behaviour, but if you query one bit of that state then you lose your opportunity to query the other bit. The programs are observably different because they have different initial states, but we can never observe both bits of the initial state to distinguish all four programs from each other.
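The same kind of simulation works for this second counterexample. All four programs share one transition rule and differ only in their initial (flag1, flag2) state; the code below is an illustrative encoding of the rule given above.

```python
# Shared transition rule: (state, input) -> (new_state, output),
# where state is the pair (flag1, flag2).
def step(state, x):
    f1, f2 = state
    if x == 0:
        return (f1, f1), f1        # output flag1, then flag2 := flag1
    else:
        return (1 - f2, f2), f2    # output flag2, then flag1 := 1 - flag2

def simulate(initial_state, inputs):
    """Programs A..D are simulate((0,0), ...) through simulate((1,1), ...)."""
    state, outs = initial_state, []
    for x in inputs:
        state, y = step(state, x)
        outs.append(y)
    return outs
```

Starting from (0,0), the inputs 1, 0, 1, 0 visit (1,0), (1,1), (0,1) and return to (0,0), so there are no sink states; yet after any single input, the two programs consistent with the observed output have collapsed into the same state.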

al flag

If the black box remains sealed, then it is impossible to prove what code is running inside of it.

Any test inputs you give can be handled correctly by whatever-is-in-the-box (WIITB), while different inputs - known only to the attacker who loaded WIITB - might give wildly different output to what you expect.

This extends to any methods you might expose to let inspectors "directly verify" WIITB. A compromised box could produce innocent code when asked.

There is no way around this.

You could (and people often do) fudge a hand-wavy pretense at doing what you ask. It will not actually work, but some people will believe that it does.

gb flag

In general it is not possible to have such certainty. That is quite literally the definition of a black box. However, in some circumstances one can indeed develop statistical methods of gaining confidence that the black box does indeed do what it says.

Zero Knowledge Proofs (ZKP) are an example. In these proofs, we typically do assume we know something about the black box. In particular, we assume that we know all of its inputs (no hidden radios taking instructions from the bad guys), and that it has no state (we don't know what the black box does, but we know it has at least one function which returns the entire system to a known state). We also assume that the system can issue "commitments" to its state, which is not to say that it provides its internal state to a user, but that we can prove some properties of its state later.

The classic example of this is a cave with two entrances. I discover a path between the two entrances, something I think is valuable and want to sell to you. Of course, I don't want to show you the path before I get my money. And you don't want to fork over your money unless you at least know that I've got a path. One solution is that you turn your back and I go in one side or the other. Then you turn to face the entrances and call out whether you want me to appear out of the left or right entrance. If I'm telling the truth about the path, I can always do so. If I'm lying, then I only have a 50% chance of being able to do so. We then reset the scenario and try again. Each time you grow exponentially more sure that I do indeed know the path.
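The exponential gain in confidence is easy to simulate. In this toy model, the prover commits to an entrance before hearing the verifier's challenge; a prover who knows the path can always comply, while a cheater only succeeds when the guess happens to match.

```python
# Toy simulation of the two-entrance cave protocol.
import random

def round_ok(knows_path: bool) -> bool:
    entered = random.choice(["left", "right"])    # prover commits blindly
    challenge = random.choice(["left", "right"])  # verifier's request
    # With the path, the prover can exit either side; without it,
    # the prover succeeds only if the guessed entrance matches.
    return knows_path or entered == challenge

def survives(knows_path: bool, rounds: int) -> bool:
    """True iff the prover passes every round. A cheater passes all n
    rounds with probability 2**-n."""
    return all(round_ok(knows_path) for _ in range(rounds))
```

After, say, 40 rounds, a surviving prover is honest except with probability about 2⁻⁴⁰, which is the "exponentially more sure" in the description above.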

Elements of this approach are visible in several practical situations. The voting system Punchscan includes a very easy to follow example of constructing these ZKP to prove (or, more pedantically, build confidence) that several black boxes were indeed constructed according to the rules. And as for the validity of the underlying assumptions about the inputs and resetting, Google's Native Client (NaCl) is a great example. It is a way to run untrusted native binary code on your computer safely. NaCl applications are formatted in a particular way which lets us formally prove that an application has no back-channels that we don't know about and that it is perfectly resettable. It does this by proving that the untrusted code is not capable of executing any instructions out of a set of "unsafe" instructions. The only way to run those instructions is to ask the NaCl kernel to do so, and it carefully sandboxes those instructions to prevent malicious actions.

The difficulty of acquiring such certainty is why reproducible builds are a highly valued concept. If one can prove with great certainty that two people's builds are identical, then one person's audit of the software can serve as a surrogate for the other person auditing their own copy.

Jonathan Hartley avatar
al flag
This is a very interesting and not entirely unrelated answer. But the OP does not imply that any of the prerequisites for a ZKP hold (e.g. the box may have state, which we know nothing about). So does any of this apply to the question as asked?
Cort Ammon avatar
gb flag
@JonathanHartley My logic when writing this answer was that it's easy to see that, in the general case, it's impossible to peer into a black box to verify it, but there are specific situations where it is possible. The accepted answer actually goes down the same path I did with ZKP. I brought in the NaCl approach to point out an example of how a black box might be confined, reducing the larger "are these arbitrary black boxes the same?" question to a restricted problem where ZKP might apply.
Jonathan Hartley avatar
al flag
Ok then. I don't like it, but if I'm in a minority then so be it.
cn flag

Assumption: "If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck." I suggest that if the I/O characteristics of a black box and inspectable source code achieve the same effect, then semantically the two are identical. The presence of individual pieces of code (e.g. GOTOs or not) becomes irrelevant.

Case 1: If the box is a simple finite state machine as you suggest, then there are techniques such as Monte Carlo simulation: you input various random values whilst observing the outputs. This has been done by NIST for hardwired cryptographic primitives operating in potted boxes. See What's the reason for Monte Carlo tests for block ciphers? for its use. This is probably the more applicable use case.
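A sketch of the Monte Carlo idea, with made-up names and a toy reference function: hammer both the reference implementation and the box with random inputs and compare outputs. A mismatch is conclusive; agreement is only statistical evidence, as the trapped example at the end (which lies on one secret input) illustrates.

```python
# Monte Carlo comparison sketch; `box` is a callable standing in
# for queries to the sealed device.
import random

def monte_carlo_match(reference, box, trials=10_000, bits=64):
    for _ in range(trials):
        x = random.getrandbits(bits)
        if reference(x) != box(x):
            return False   # definitely not the same behaviour
    return True            # consistent so far: evidence, not proof

# A box with one hidden trigger passes random testing with
# overwhelming probability.
ref = lambda x: x * x % 2**61
trapped = lambda x: 0 if x == 0xDEADBEEF else ref(x)
```

This is why the technique fits NIST-style validation of fixed primitives but cannot, on its own, rule out deliberately hidden behaviour.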

Case 2: If the box is complex (Rolls Royce turbofan ECU + 10 miles of wiring loom) then verification will be impossible due to the number of correlated/interdependent states. You'll have to verify through trials (as they do). Remember that Twitter's wiring harness has global reach, is susceptible to random Muskrat tampering and fragmented continuous delivery techniques. Even the US Federal Trade Commission can't verify how it works.

Mooing Duck avatar
us flag
Counterexample: Computer viruses
kr flag
Re: "If the box is a simple finite state machine as you suggest": I don't take the OP as suggesting that; and the term "black box", if interpreted strictly, suggests that we have *no idea* whether it's a simple finite state machine!
br flag
"Semantically identical" is not the same as running the same code. This is what "Chinese wall" development is about.
Paul Uszak avatar
cn flag
@Barmar The point I'm making is that if the data crossing the black box boundary is identical, does it matter what occurs within it? Does it have any bearing on anything? If the duck's a robot that behaves exactly like a yummy duck, it's a duck. The concept is good enough for NIST (follow the link).
br flag
The question says "Now someone comes along and claims the release the source code of the program running on that black box". So the question is about the literal source code, not its abstract behavior. In the real world this comes up in copyright infringement claims.
JRE avatar
ma flag
@PaulUszak It matters greatly whether the code in the black box is the same as the released code. You can read the released code and see that it is free of bad stuff: it isn't a virus, it isn't a keylogger, etc. A black box that emulates that known good code could do other things. It always gives the outputs you expect, but at times when you aren't looking it does something else, something bad. Have a look at [reproducible builds.]( It looks like a duck, quacks like a duck, but is a wolf that is a very good actor.
in flag
@JRE: This answer doesn't claim that the black box is incapable of pretending to be something else. Either there is an input for which it does not so pretend (in which case you can give it that input and observe it drop the facade), or there is no such input (in which case it's semantically equivalent), by the law of the excluded middle. However, it may be infeasible to *find* said input, which IMHO is the real problem here.
bd flag

There is an easy counter-example: I use exactly the published source code, but at the beginning I add a single instruction: if the input is exactly *secret 200 character password*, explode.

You will not find any difference in behavior for any input, as long as you don't know (or find by random luck) the secret 200-character string. Since the program uses the published source code, it behaves the same for every other input.
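The counterexample is easy to express as code; the published program and the trigger below are of course just stand-ins.

```python
# A thin wrapper around the published program that behaves identically
# on every input except one secret trigger no black-box tester will
# ever stumble on.
import secrets

SECRET = secrets.token_hex(100)   # a 200-character trigger, known only to the builder

def published_program(x: str) -> str:
    return x.upper()              # stand-in for the released source code

def box_program(x: str) -> str:
    if x == SECRET:
        return "BOOM"             # the hidden behaviour ("explode")
    return published_program(x)
```

For every input except the trigger, `box_program` and `published_program` are indistinguishable, yet they are plainly not the same program.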

Checksums and sophisticated analysis

For complex scenarios where the program will output its own source code/checksum or other information about itself, the black box could be built like this:

The blackbox contains a device running the actual code inside. The blackbox passes the input into the internal device and then changes the output under certain conditions before passing it on. The blackbox is carefully constructed so the answer is always generated with a proportional timing and power-consumption to the original device.

In reality, hiding something gets harder the more you know about the black box

If you know the hardware/chipsets which are used in the blackbox, or you can reasonably narrow them down, by available technology, size and power-consumption or by x-raying the box, it will become harder for the creator to hide something in the box, but not impossible.

fgrieu avatar
ng flag
There are some complications, including 1) ensuring that a timing analysis does not reveal the addition (which may require optimizing the startup code); 2) determining whether the code includes a feature to list or checksum itself, which could reveal the change, and if so circumventing it (that feature would need to be modified to output the original checksum/code, and again might require some optimization to evade a timing analysis).
Falco avatar
bd flag
@fgrieu I also thought about a timing analysis. But this also requires information about the hardware used inside the box and the algorithms used. The creator of the box can e.g. use a faster computer, run the original code in a virtual machine, and always answer in a timing window proportional to this runtime (e.g. always a factor of 1.3 slower, to account for his added code). This would also solve the checksum problem.
Falco avatar
bd flag
@fgrieu I daresay if you don't know the exact hardware layout I used, it will be near impossible to recognize a single added if-statement from timing attacks.
br flag

A somewhat contrived counterexample, but I believe it proves that a general solution to this cannot exist.

Let's suppose that you have a random number generator in the box, but you think what you might have is a pi digit stream.

Now it's highly unlikely, but theoretically possible, for your random number stream to match the pi digit stream exactly. As such, there would be no way to discern the difference from the outside.

The same could be applied to any system with outputs, it could just be an RNG getting exceedingly lucky and generating what looks like an intelligent output.

bi flag

I've seen image analysis software with a special input image that resulted in a special output image. If your software has such a feature, you can easily test whether your software is running in the black box.

Jonathan Hartley avatar
al flag
No. The first thing an attacker loading different code onto the black box will do is include all the original code, or a functional equivalent of it, to duplicate outputs such as the one you describe, except in particular circumstances which you have no way of guessing. Verifying outputs never comes anywhere close to verifying what is running in the box.
Olek avatar
bi flag
it depends on what you want to prove. If you want to show that someone is using your software, this is a very easy check to implement. Not sure what the OP is asking, though, so I'll leave it at this.
