Score:1

How does an attacker decrypt a hash function by looking for linearity?

tn flag

Reading the selected answer to Designing a hash function from first principles rather than depending on heuristics is very insightful.

The section on "nonlinearity" suggests that making every equation involved in the hash function into a linear one means the attacker can easily figure out the implementation of the hash function.

If you try to make a cryptographic hash function that only uses XOR and shifting, the equations will be linear even after infinitely many rounds, now the analyst is just a single Gauss elimination away from solving it and now they have the ability to create arbitrary preimages.

In order to avoid that, you need to make your equations non-linear by using the AND operator. Hashes do all sorts of shifting and XOR-ing but this non-linear step is what makes them secure.

But having non-linear terms are not enough. You must make sure the attacker cannot effectively cancel out the non-linear terms out of your equations by fixing some inputs to 1 or 0. Also you need to make sure the terms will not cancel if the attackers takes the difference of two equations. If you have the formula ++ for one and another formula ++ for another bit. Adding these two will give + which is now linear and allows the attacker to solve it to get the relationship between two out bits. Every independent linear equation an attacker can create will reduce the security of your hash with 1 bit.

Can you explain in more detail how this would work? Given some hash function which has been broken, how could you show this linearity situation on that hash function? Basically, say md5 is an example (insecure for some reason). Is it insecure for this linearity reason? If so, what does that mean in terms of the md5 hash function? What are the equations that were found to be linear? If md5 is a bad example, what is a good example? Basically, how do you look for linearity in a hash function bit by bit, what are the techniques to use?

I would like to know the techniques an attacker might use to "solve" the hash function based on input/output alone (without seeing the implementation). But since that is probably too general or too involved a question, this question focuses only on this linear equation aspect. What are the "linear equation" techniques an attacker might use to solve a hash function?

kelalaka avatar
in flag
I don't think that you can find such a simple attack. For example, MD5 attacked with [differential paths](https://www.win.tue.nl/hashclash/On%20Collisions%20for%20MD5%20-%20M.M.J.%20Stevens.pdf), You have to form some algebraic equation to see this. It takes time...
Morrolan avatar
ng flag
The field of linear algebra has a [rich toolset](https://en.wikipedia.org/wiki/System_of_linear_equations#Solving_a_linear_system) to solve systems of linear equations efficiently, e.g. Gaussian elimination. So if one can express the relation between input and output of a hash function as a linear system, breaking it is straight-forward.
kelalaka avatar
in flag
@Morrolan the problem is not solving the systems of linear equations efficiently, rather how to construct them given a hash function. Solving is the easy part, finding a hash function that is solvable is hard. Also, instead of linear relations, if one gets low-level equations then an algebraic attack might help. [Linearization](https://crypto.stackexchange.com/q/75357/18298), XLS etc...
Morrolan avatar
ng flag
@kelalaka My comment was touching upon the aspect of OP's question why a purely linear hash function is not viable - namely because there's tools to trivially solve it.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.