Score:0

Does matrix multiplication of hash digests admit manipulation of the result?

infogulch

Take a sequence of byte buffers, hash each of them, interpret the hash digests as square matrices of 8-bit unsigned integers, and matrix-multiply them in order. Define the final matrix to be the "hash" of the list of elements.
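
A minimal sketch of the idea (SHA-512 is assumed here because its 64-byte digest exactly fills an 8×8 matrix of bytes; the notebook mentioned below has the full details and may differ):

```python
import hashlib
from functools import reduce
import numpy as np

N = 8  # matrix dimension; N*N must equal the digest length in bytes

def element_matrix(buf: bytes) -> np.ndarray:
    """Hash one byte buffer and reinterpret the digest as an N x N uint8 matrix."""
    digest = hashlib.sha512(buf).digest()
    return np.frombuffer(digest, dtype=np.uint8).reshape(N, N)

def matmul_mod256(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Matrix product with elements reduced modulo 256 (i.e. over the integers mod 256)."""
    return ((a.astype(np.uint32) @ b.astype(np.uint32)) % 256).astype(np.uint8)

def list_hash(buffers) -> np.ndarray:
    """List-hash of a sequence of buffers: the ordered product of their element matrices."""
    return reduce(matmul_mod256, (element_matrix(b) for b in buffers))
```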

This definition has some useful properties. In particular, the associativity of matrix multiplication means you can calculate the list-hash of the concatenation of two lists by calculating each list's hash independently and then multiplying the two results; this works for any partitioning of the list. Non-commutativity means that different element orders produce different list-hashes, as one would expect for a list.
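
For example, reusing the sketch above: the list-hash of a concatenated list equals the product of the two partial list-hashes, while reversing the order (almost certainly) changes the result.

```python
left = [b"alpha", b"beta"]
right = [b"gamma", b"delta"]

# Associativity: hash the whole list, or hash each partition and then multiply.
whole = list_hash(left + right)
reduced = matmul_mod256(list_hash(left), list_hash(right))
assert np.array_equal(whole, reduced)

# Non-commutativity: a different element order gives a different list-hash
# (with overwhelming probability).
assert not np.array_equal(whole, list_hash(right + left))
```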

(I explore this definition in more detail, including working code samples, in a Python Jupyter notebook I published, entitled Merklist. You can also play with it yourself on Google Colab, and add hypothes.is annotations on the post for general feedback. I can lift detail from there into this question if needed.)

Question

  1. Is this definition resistant to preimage attacks? Stated another way: is it feasible to select a sequence of elements that results in an arbitrary target list-hash?

Note that the elements must exist, so the element digests that go into the list-hash have preimage resistance based on the underlying hash function (which we can assume holds in the scope of this question). So really the question becomes: Can the order or presence of these hash digests be used to arbitrarily alter the contents of the final matrix? For example, can you generate a sequence of elements that produces a list-hash that is the zero matrix? (Hitting a zero matrix means game over, afaict.)

I did some searching and didn't find answers to any of these questions, though I suspect that could be due as much to my ignorance of the correct terminology as to the non-existence of the answers.

poncho
BTW: I just tried it; when I hashed a sequence of circa 3000 different (not maliciously chosen) preimages together, the result was the all-zero matrix.
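
(For reference, a quick sketch that reproduces this kind of collapse using the helpers from the sketch in the question; the exact element count varies from run to run, and poncho's own test may have differed.)

```python
import os

# Multiply element matrices of random buffers over the integers mod 256
# until the running product collapses to the all-zero matrix.
acc = element_matrix(os.urandom(32))
count = 1
while acc.any() and count < 100_000:
    acc = matmul_mod256(acc, element_matrix(os.urandom(32)))
    count += 1
print(count, "elements:", "all-zero" if not acc.any() else "still nonzero")
```
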
infogulch
Whelp, it looks like my "didn't do his homework" is showing. How embarrassing. Guess I'll have to try again with $GF(256)$, and maybe test it a bit more carefully.
poncho
I suspect that, while $GF(256)$ would be considerably better, even then, unless you exclude noninvertible matrices, a large enough series will still end up as all 0's; it just might take perhaps a million elements rather than 3,000.
infogulch
You think? I feel like if candidate entries are restricted to invertible matrices only that should fix it. Take this example: $X A B C C^{-1} B^{-1} A^{-1}$; this should always result in $X$ again, no matter how long the initial $ABC$ sequence. Like, if it didn't then one or more of the matrices would not be invertible by definition. No? I feel like the biggest problem is going to be finding an invertible matrix for each possible entry in reasonable time.
infogulch
@poncho I just tried it with $GF(256)$ matrix elements and it still seems fine after 10M entries. Soon I'll post another question like this one but with the $GF(256)$ formulation instead.
poncho
If you use $GF(256)$, then a random matrix has approximately a $1/256$ chance of being singular; my suspicion is that, with a long chain, the singular matrices that appear randomly will gradually reduce the rank, eventually leading to a rank of 0 (the all-0's matrix). However, I don't have a good model of how soon this would happen...
infogulch
Yes, you're right. To ensure that no singular matrices are admitted, I'm using rejection sampling in a loop when calculating the 'matrix hash' of an element as you suggested before.
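
(A hedged sketch of such a rejection-sampling loop, assuming the third-party `galois` package for $GF(2^8)$ arithmetic and an illustrative counter-suffix scheme; the actual notebook may do this differently.)

```python
import hashlib
import numpy as np
import galois

GF256 = galois.GF(2**8)
N = 8

def invertible_element_matrix(buf: bytes):
    """Re-hash buf with an incrementing counter until the digest decodes to an
    invertible N x N matrix over GF(2^8), then return that matrix."""
    counter = 0
    while True:
        digest = hashlib.sha512(buf + counter.to_bytes(4, "big")).digest()
        m = GF256(np.frombuffer(digest, dtype=np.uint8).reshape(N, N).copy())
        if np.linalg.det(m) != 0:  # reject singular matrices and try again
            return m
        counter += 1
```
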
poncho
Obviously, if you restrict yourself to invertible elements, well, those form a group, and so you'll never run into a restricted state (such as the all-0's matrix). There might be subtle tricks you could use from group theory to find a collision; however that's not at all obvious...
infogulch
@poncho I published a draft of this using $GF(256)$ in a new post if you're interested: https://blog.infogulch.com/2021/07/15/Merklist-GF.html It uses Julia instead of Python because I wanted to try Julia. You can open it to edit live online. I intend to add an addendum to my original post outlining the problem you found.
Score:1
poncho

For example, can you generate a sequence of elements that produces a list-hash that is the zero matrix?

Yes; the most obvious way is to find a buffer that hashes to a matrix with all $n^2$ matrix elements even; replicate that buffer 8 times, and you'll get a product that's all zeros.
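
A small check of why this works: if every entry of $M$ is even, then $M = 2M'$ and $M^8 = 2^8 M'^8 \equiv 0 \pmod{256}$. For instance (with an arbitrary all-even matrix rather than an actual digest):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 2 * rng.integers(0, 128, size=(8, 8), dtype=np.uint32)  # every entry is even

P = np.eye(8, dtype=np.uint32)
for _ in range(8):
    P = (P @ M) % 256
assert not P.any()  # the product of eight copies of M is the zero matrix mod 256
```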

This takes an expected $2^{n^2}$ work to find such a buffer; for $n=8$, this is plausible.

It could likely be improved; by looking at pairs of noninvertible matrices, it would appear plausible that, with considerably less work than $2^{64}$, you could find two whose product has all elements even (in which case the above observation would apply).

infogulch
That makes a lot of sense. To clarify, is this caused by my choice of ring (integers mod 256), which has factors of 2? Do you think this could be improved by choosing a prime modulus (say, 257)?
poncho
@infogulch: or $GF(2^8)$? Well, it would help, but not all that significantly; one still has a good probability (circa $1/256$ per attempt) of finding noninvertible matrices, and it looks plausible that you could glue them together to find products of decreasing rank, resulting in the all-0 matrix. Actually, my original answer was going to be this, until I realized that the 'all lsbits 0' answer was considerably easier to justify. Still, if you did use a field such as $GF(256)$ or $GF(257)$ and deliberately skipped noninvertible matrices (using rejection sampling), that might work.
infogulch
Thanks for the pointer to Galois fields and invertible matrices. I had considered that invertible matrices might be a nice-to-have, but they're probably a requirement for this to work and not result in degenerate cases like a zero matrix.