How to generate a circuit for SHA-256?

user77340

1/20/23, 10:59 AM

In "A Boolean Circuit for SHA-256" by Steven Goldfeder, the author gives a Boolean circuit for SHA-256. I find this method very complicated.

May I ask how to construct a Boolean circuit for a hash function? I mean, given an algorithm of a hash function, how to transform it into a circuit as in the article?

564

0 + 0

hash

sha-256

circuit

fgrieu

1/20/23, 11:35 AM

Is the question focused on SHA-256 (as in title), or asking in general (as in last sentence of body)? Is it wanted a strictly boolean expression (as needed in e.g. proof of knowledge of value with a given hash), or something for a silicon design (which in all likeliness will be sequential to a degree)? It might also be useful to tell the targeted input length, and why it is thought the expression linked is "very complicated", given the [specification of SHA-256](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf#page=6).

user77340

1/21/23, 4:32 AM

@fgrieu yes, actually I am asking in general.

Score:2

Crypto

fgrieu

1/21/23, 6:44 AM

given an algorithm of a hash function, how to transform it into a circuit?

Most importantly we first want to better state the problem

What hash function?
Do we want a circuit for the full hash function and a fixed-length input [which seem best matches the question], or for some part of the hash [like a round of the hash function with input one padded message block in the linked material, if I read it correctly]?
Why do we want "a circuit"? This will impact the nature of what we produce.
- A zero-knowledge proof involving the hash as in the linked example? That will direct towards an expression as a long chain of gates restricted to some varieties (here XOR, AND, and INV)
- Testing how a pure SAT solver handles solving some problem related to the hash (like preimage)? The expression will typically be even more restricted (no XOR), on the other hand negation is usually free.
- Optimized realization in silicon or FPGA? Typically an overly deep purely Boolean expression will be useless, we'll need intermediary latches, and unless the whole thing is deeply pipelined or the hash terribly irregular we'll have some logic reused across rounds. I won't cover that.
In what form do we want the output? For a purely Boolean circuit, most formats will number variables. I guess the example has 512 inputs numbered from 0 to 511, 116246 gates (one per line) each producing one new variable for a total of 116758 variables, and 256 outputs (it could be variables 116502 to 116757, I'm not sure), with this described in the first two lines per some simple convention. Below are one gate per line, with I guess for each
- the number of inputs [1 for NOT and 2 for others in the example]
- the number of outputs [1 in the example]
- the list of input(s)
- the list of [one] output(s)
- the name of the gate [XOR, AND, INV in the example]

The rest broadly follows the caveman's technique to eat a mammoth (one chunk at a time) and the progress from there (tools).

We take the algorithm step by step, unrolling each loop, and express each step as Boolean equations. For example if all variables are 32-bit (like in SHA-256):

a statement like C = A ^ B can be translated into 32 new variables for C, output by 32 XOR gates, requiring 32 lines. We need to keep track of the numbers designating the new variables assigned to C.
E = C + D requires intermediate variables, thus more lines. We need 30 full adders, and then two simplified ones (no carry out for the high-oder bit which thus reduces to one XOR; no carry in for the first which thus reduces to one XOR and one AND).
F = (E<<3)|(E>>29) would require no line, only a reassignment of variables for E.

There are a few tricks that can sometime get simpler expressions, but for hashes of cryptographic interest the expression will remain long. If it was not, the hash would be weak.

It's reasonably easy to make a program that does this from scratch, and in my experience that's easier than finding and mastering something adequate. Existing tools can be useful to automatically simplify expressions, but for most cryptographic hashes a little analysis of the hash's equations will get most simplifications it's possible to get, and might give better results than automated tools.

0 + 0

user77340

1/22/23, 1:42 AM

Thanks for your answer! Yes, I want to use it for MPC implementation, or concretely, I am designing a two-party secure computation protocol in which the prover is to prove the pre-image of SHA256. So the file in "A Boolean Circuit for SHA-256" by Steven Goldfeder" is what I want. However, the circuit they design is for fixed-length input, while I want a circuit for arbitrary length input. I am still struggling on it. Thank you so much!

fgrieu

1/22/23, 6:07 AM

@user77340: for SHA-256 and input of $b$ bits up to $512-64-1=447$-bit ($55$-byte), there is a single block and it's obtained by appending to the input some $512-b$ bits depending only on $b$, and then apply the round equations (it's instructive to derive them, and indispensable to verify them. Also, some $55$ of the $512-b$ bits are always $0$, growing to $65$ for byte-aligned input, allowing some slight simplifications. For arbitrary length you need to append $(-65-b\bmod512)+65$ bits, and apply $\lceil (b+65)/512\rceil$ times the round equations.