Score:0

Crypto

Verifying Merkle root correctness without completely reconstructing the tree

Adam Ierymenko

10/17/22, 6:30 PM

Lets say Alice has a list of values, and Bob sends Alice a Merkle root that he claims is for this list of values. The Merkle tree construction algorithm is mutually known of course.

Alice can then pick arbitrary values from the list and ask Bob for their Merkle proof.

Lets say Alice wants to avoid constructing the whole tree to verify Bob's Merkle root. How sure can she be that Bob's Merkle root is correct after Bob has supplied N Merkle proofs? In other words what is the probability that Bob has lied about the Merkle root after N queries?

EDIT:

I have an answer in mind but I don't think it's right. Lets say we have a tree with M nodes in total (counting all layers), and each proof in that tree contains N hashes as data points in the proof. If Alice asks for K distinct proofs, then she can be sure the root is correct with a certainty of (N*K)/M.

Seems naive and like this would only be a lower bound though, since there's more to it than that. The tree is made of hashes that have preimage resistance, so it's not a mere question of counting known data points vs. total data points... or is it?

EDIT #2:

Another formulation:

Given a known vector of arbitrary values V and a Merkle root R constructed by building a Merkle tree for V, if I have N Merkle proofs how sure can I be that R was constructed honestly and correctly from V?

Obviously I could rebuild the tree since I have (or can compute) all of V, but lets say I don't want to because it's time consuming etc.

1 + 8

hash-tree

poncho

10/17/22, 6:45 PM

Is this homework?

Adam Ierymenko

10/17/22, 6:48 PM

No, just a question that came up in a discussion with a friend of mine about proof of space systems.

kelalaka

10/17/22, 6:50 PM

This answer is the [canonical Merkle-Tree answer](https://crypto.stackexchange.com/q/71310/18298)

kelalaka

10/17/22, 6:56 PM

In a second read, this question is missing details. Is Bob allowed add additional values? What if Bob deletes one value and you never asked that? Use counterargument to arrive the result! I wonder what you and your friend thought?

Adam Ierymenko

10/17/22, 6:59 PM

No, Bob can't add values because that would change the Merkle root. Lets say I know or *am able to compute* all the values in the list. How can I tell that Bob did likewise and constructed his tree honestly? If I ask him for proofs, how sure am I after N proofs? Seemed interesting to us because it could be a form of proof-of-space-time if the list is expensive to construct.

kelalaka

10/17/22, 7:13 PM

But Bob calculates the root, not Alice, so during the Merkle-Tree setup, Bob can add/delete/modify values.

Adam Ierymenko

10/17/22, 7:19 PM

Yes. Alice wants to get Bob to prove that all values were included without having to calculate all values or construct the whole tree, so she picks values at random and asks for proofs. Any proof that contains a node that conflicts with the same value in another proof proves that Bob is being dishonest. How many queries to reach, say, a 99% certainty that Bob didn't cheat?

Adam Ierymenko

10/17/22, 10:33 PM

One more thing that may not have been clear: when Bob sends the root to Alice, he commits to it. He can't change the root so he can't actually add or remove values. What he could have done though is cheat by making an incomplete tree or using a few incorrect values.

Score:1

Crypto

poncho

10/18/22, 3:29 AM

In other words what is the probability that Bob has lied about the Merkle root after N queries?

Well, here is one way Bob could cheat; he could take the list of $M$ values $$V_1, V_2, ..., V_M$$ and instead of forming a Merkle tree based on that, select a value $c$ index and a value $V'_c \ne V_c$, and instead form the list $$V_1, V_2, ..., V_{c-1}, V'_c, V_{c+1}, ..., V_M$$ and form the Merkle tree based on that list (and give Alice the computed root, which is with quite high probability different from the root from the original list).

Then, when Alice gives him the $K$ indices to base the $K$ proofs on, he forms them based on the modified list and the tree he computed. He will be detected as cheating only if one of the indices she asked for happens to be the index $c$; if none of the indices she asked for happened to be for index $c$, then the proofs she would receive are all valid and consistent with each other (as they correspond to a self-consistent Merkle tree, with the revealed leafs being the values she expects)

The probability of Bob being caught cheating, that is, the probability that one of the indices Alice picked is $c$, is $K/M$. Hence the detection probability against any strategy is no more than that (it might be less if there is an even better strategy Bob could employ).

So, to detect Bob cheating with probability 0.99, we have $K > 0.99M$; at that point, it'll actually be less computation for Alice if she were just to recompute the tree herself.

Seemed interesting to us because it could be a form of proof-of-space-time if the list is expensive to construct

Now, looking at the problem as a proof-of-work is a bit more interesting (proof-of-space doesn't work, as the tree and authentication paths can be computed without increasing the amount of computation required significantly); obviously, the cheating strategy I gave above doesn't break that problem, as Bob is doing the full amount of work. If I had more time, I'd try to investigate that more...

0 + 0

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Verifying Merkle root correctness without completely reconstructing the tree

TH: ตรวจสอบความถูกต้องของราก Merkle โดยไม่ต้องสร้างต้นไม้ใหม่ทั้งหมด

RO: Verificarea corectitudinii rădăcinii Merkle fără a reconstrui complet arborele

RU: Проверка правильности корня Меркла без полной реконструкции дерева

VI: Xác minh tính chính xác của gốc Merkle mà không cần xây dựng lại hoàn toàn cây

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.