Score:0

Is there any way to calculate the hash of (a + b + c) if you only know hash(a), hash(b) and hash(c)?


For example, suppose there are three distinct paragraphs of clear text a, b and c, and you only know hash(a), hash(b) and hash(c). You are then given a clear text d, which claims to be the concatenation of a, b and c. Is there any way to use hash(a), hash(b) and hash(c) to demonstrate that d either is or isn't a+b+c?

Not being a math person, I'm guessing there is a brute-force way: depending on the lengths of a, b and c, try every possible division of d into 3 parts and see if you can match the individual hashes. But that seems computationally intensive unless a, b and c are pretty short. Is there a way that isn't dependent on the lengths of a, b and c?

fgrieu avatar
The title of the question asks if given $H(a)$, $H(b)$ and $H(c)$ we can compute $H(a\mathbin\|b\mathbin\|c)$. The body of the question asks if additionally given $d$, we can tell if $a\mathbin\|b\mathbin\|c=d$. These are different problems. [Mark's answer](https://crypto.stackexchange.com/a/102479/555) addresses the first using an appropriate $H$ (there's no solution with $H$ a standard hash like SHA-256). [poncho's answer](https://crypto.stackexchange.com/a/102477/555) addresses the second problem with $H$ a standard hash, giving a solution of cost linear with the size of $d$.
Score:1

> Not being a math person, I'm guessing there is a brute force way, depending on the length of a, b, and c to try every possible division of d into 3 parts, and see if you can match the individual hashes, but seems computationally intensive unless the a b and c are pretty short.

Actually, it doesn't sound that bad. Suppose $d$ is a megabyte in length. There are a million (plus one) ways that $a$ could be a prefix of $d$ (assuming $a$ is known to be an integer number of bytes; multiply by 8 if it might be an arbitrary bit string). Hash all possible prefixes, and see if any of them matches the known value $hash(a)$. Once you have $a$ (and hence the length of $a$), you can do the same for $b$: with at most a million hashes, you can recover $b$, and then also verify $c$ by hashing whatever remains of $d$.

That should be less than a second on a decent PC, assuming your hash function is, say, SHA-2 or SHA-3 or something similar, provided you reuse intermediate hash values while scanning for matches of $hash(a)$ and $hash(b)$ rather than rehashing every prefix from scratch.
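The scan described above can be sketched in Python as follows. This is only a minimal illustration, under the assumptions that the parts are whole bytes, that the hash is SHA-256, and that the digests are available as raw bytes; the function names `find_end` and `split_by_hashes` are made up for this example. It relies on the fact that a `hashlib` object can report its digest and then keep absorbing data, so each prefix costs one extra byte of hashing rather than a full rehash.

```python
import hashlib


def find_end(d: bytes, target: bytes, start: int = 0, algo: str = "sha256"):
    """Return the smallest end index with hash(d[start:end]) == target, else None.

    Hypothetical helper: one running hash object is reused for every
    candidate length, so the whole scan is linear in len(d).
    """
    h = hashlib.new(algo)
    if h.digest() == target:          # the part could be empty
        return start
    for i in range(start, len(d)):
        h.update(d[i:i + 1])          # absorb one more byte
        if h.digest() == target:      # digest() does not reset the state
            return i + 1
    return None


def split_by_hashes(d: bytes, hashes, algo: str = "sha256"):
    """Split d into parts matching the given digests in order, or return None."""
    parts, start = [], 0
    for target in hashes[:-1]:
        end = find_end(d, target, start, algo)
        if end is None:
            return None
        parts.append(d[start:end])
        start = end
    # The last part must be exactly the remainder of d.
    if hashlib.new(algo, d[start:]).digest() != hashes[-1]:
        return None
    parts.append(d[start:])
    return parts


if __name__ == "__main__":
    a, b, c = b"first paragraph. ", b"second paragraph. ", b"third."
    digests = [hashlib.sha256(x).digest() for x in (a, b, c)]
    print(split_by_hashes(a + b + c, digests) == [a, b, c])
```

If `split_by_hashes` returns `None`, then $d$ is not the concatenation of byte strings with those three digests; if it returns three parts, $d$ is such a concatenation, barring a hash collision.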

Score:0

This is similar to a well-known construction called an incremental hash. See this paper for pointers.

Roughly, incremental hashes are homomorphic with respect to set operations. In other words, suppose you have a database $\mathcal{D}$ for which you

  1. need to maintain the hash $H_{\mathcal{D}} := H(\mathcal{D})$, while
  2. updating it by adding ($\mathcal{D} + e := \mathcal{D}\cup \{e\}$) and removing ($\mathcal{D}-e := \mathcal{D}\setminus\{e\}$) elements.

Then you can use an incremental hash to efficiently update $H_{\mathcal{D}}$ without recomputing $H(\mathcal{D})$ from scratch after each operation.
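To make the idea concrete, here is a toy sketch of such an update rule. It is purely illustrative: the class name `ToySetHash`, the choice of SHA-256 for individual elements, and the combination by addition modulo $2^{256}$ are all assumptions made for this example. They are not taken from the referenced paper and are not claimed to be secure; a real deployment should use a vetted construction with properly chosen parameters.

```python
import hashlib

MOD = 2 ** 256  # toy modulus, chosen for illustration only


def elem_hash(e: bytes) -> int:
    """Map one element to an integer via SHA-256 (an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha256(e).digest(), "big")


class ToySetHash:
    """Running hash of a collection that is updated in O(1) per add/remove."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, e: bytes) -> None:
        # H(D + e) is derived from H(D) and hash(e) alone.
        self.value = (self.value + elem_hash(e)) % MOD

    def remove(self, e: bytes) -> None:
        # Caller must ensure e is actually present; this toy does not check.
        self.value = (self.value - elem_hash(e)) % MOD


if __name__ == "__main__":
    h1, h2 = ToySetHash(), ToySetHash()
    for e in (b"a", b"b", b"c"):
        h1.add(e)
    for e in (b"c", b"a", b"b"):
        h2.add(e)
    print(h1.value == h2.value)  # insertion order does not matter
```

Note that a combination of this kind is independent of order, so on its own it cannot tell $a\mathbin\|b\mathbin\|c$ apart from $c\mathbin\|b\mathbin\|a$.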

kelalaka avatar
Mark, the user only knows the hashes and doesn't know how $d$ is divided into $a,b,c$
Mark avatar
@kelalaka yes, I don't think this answers the question exactly, but it seems close. If they could modify their application so that this becomes an acceptable solution, they could rely on standard constructions (if somewhat niche ones, which is why I thought it worthwhile to mention).