Score:1

Are such verification wormholes known, or even possible?


1. Scenario

Suppose that we have a source that is generating one random value per, say, minute. So we have random value $x_1$ in $1$st minute, $x_2$ in $2$nd minute, $\ldots$, $x_n$ in the $n$th minute, and so forth.

The distribution of values $x_1, x_2, \ldots$ is not entirely uniform random, but follows this rule: for any $i \ge 1$, $x_i = (y_i, y_{i+1})$, where $y_i$ is the unique identifier of $x_i$, and $y_{i+1}$ is the unique identifier of the upcoming $x_{i+1}$ that is going to arrive in the next minute.

In other words, at any given $i$th minute, the current $x_i$ and the next $x_{i+1}$ are known, but the one after that, $x_{i+2}$, is absolutely unknown, i.e. random. Below is an example:

Minute 1:  x_1 = (334234   , 234829129   )
Minute 2:  x_2 = (234829129, 983220      )
Minute 3:  x_3 = (983220   , 831231236347)
...
Minute n:  x_n = (643532754, 3534611     )
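To make the structure of the source concrete, here is a minimal Python sketch; the use of 64-bit random identifiers and the `source` generator are my illustrative assumptions, not part of the question:

```python
import secrets

def source(n):
    """Yield n values x_i = (y_i, y_{i+1}): each value carries its own
    identifier and the identifier of the value arriving next minute."""
    y_next = secrets.randbits(64)  # identifier of the first value
    for _ in range(n):
        y_i, y_next = y_next, secrets.randbits(64)
        yield (y_i, y_next)

values = list(source(3))
# consecutive values are linked: x_i's second component is x_{i+1}'s first
assert all(values[i][1] == values[i + 1][0] for i in range(len(values) - 1))
```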

One way to hash those $n$ values, in order, is to hash each value as it arrives, e.g. $h_1 = f(x_1)$, then chain it with the next newly arriving one, e.g. $h_2 = f(x_2 \parallel h_1)$.

In other words, the hash of the ordered list of input, in the $n$th minute is defined recursively as $h_n = f(x_n \parallel h_{n-1})$, with the base case being $h_1 = f(x_1)$.
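The recursion above can be sketched in a few lines of Python; using SHA-256 for $f$ and raw byte strings for the $x_i$ are my assumptions for illustration:

```python
import hashlib

def f(data: bytes) -> bytes:
    """Stand-in for the hash function f (SHA-256 here, as an assumption)."""
    return hashlib.sha256(data).digest()

def chain_hash(xs):
    """h_1 = f(x_1), h_n = f(x_n || h_{n-1}); only O(1) state per minute."""
    h = b""  # empty previous hash makes the first step the base case f(x_1)
    for x in xs:
        h = f(x + h)
    return h
```

Note that the running state is a single digest, which is exactly why the in-order hasher enjoys $O(1)$ space per minute.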

The advantage of this approach is that, for someone who has been running this process from the beginning, every minute's update takes $O(1)$ time and $O(1)$ space, as he can cache the hash from the previous minute.

The disadvantage of this approach is that someone who was not following the process, and wishes to verify that $h_n$ is indeed the hash of the entire sequence, has to start over from $h_1$ and repeat the entire chain all the way to $h_n$. Effectively, this verification process takes $O(n)$ space and $O(n)$ time.

2. Wormhole

It would be nice if, after every $n$ hashed values, we could discover a wormhole $w_n$ with the following properties:

  • Once the $n$th hash, $h_n$, is legitimately calculated from $h_1$ by following the full recursion above, only then is the wormhole $w_n$ discovered. Otherwise, finding $w_n$ is practically impossible.
  • For a given hash $h'_n$ that is claimed to be $h_n$, the wormhole can shortcut the verification as follows:

$$ w_n(h_1, h'_n) = \begin{cases} 1 & \text{if $h'_n = h_n$}\\ 0 & \text{else}\\ \end{cases} $$

  • The asymptotic worst-case run-time, as well as the asymptotic worst-case space, for computing $w_n(h_1, h'_n)$ is no worse than $O(\log n)$. If it's possible to make it $O(1)$, that would be even better, of course.

Note. This is different from a pre-image attack on hash functions. The difference is as follows:

  • Pre-image attacks allow the attacker to forge an arbitrary input that gives some desired target hash.

  • This wormhole $w_n$ does not allow forging an arbitrary input; rather, it reveals a hidden shortcut that works only for a specific input, allowing $h_n$ to be linked back to $h_1$, and the only way to discover such a wormhole is by legitimately calculating $h_n$ first.

  • Maybe we can call such a wormhole a conditional pre-image attack that does not benefit the adversary.

3. Question

Are such verification wormholes known, or even possible?

fgrieu avatar
ng flag
I think there must be an extra input beside $n$, $h_1$ and $h'_n$ to the algorithm computing $w_n(h_1, h'_n)$. In particular, what it does should depend on $x_1$ to $x_n$, right? Therefore, why single out $h_1$, and what _in the problem statement_ prevents from making $h_n$ that extra input, which allows a trivial implementation of $w_n(h_1, h'_n)$? Is it assumed $x_1$ to $x_n$ are implicit inputs to said algorithm?
knaccc avatar
es flag
Do the random values have to be genuinely random and outside of the control of the source, or do the values only need to be indistinguishable from random to an observer?
caveman avatar
in flag
@fgrieu - Right, it depends on $x_1, x_2, \ldots$. However, I wonder, can we pass information about such dependence in the output hashes $h_1, h_2, \ldots$? In other words, can it be that $h_2 = f(x_2, h_1)$ is effectively passing related information in $x_1$ into $h_2$? Subsequently, as the chain goes on, can $h_n$ effectively have related information from $x_n, x_{n-1}, \ldots, x_1$, that's sufficient to create a verification wormhole $w_n(h_1, h'_n)$?
caveman avatar
in flag
@fgrieu - As for your question about the problem statement preventing trivial solutions, if I understand you correctly, it's the requirement that the space complexity for the "wormhole user" must be constrained in $O(\log n)$. But, the "wormhole discoverer" must do the $O(n)$ process.
caveman avatar
in flag
@knaccc - Right. The values are random. E.g. the person who computes $h_i = f(x_i, h_{i-1})$ has absolutely no idea about the next value $x_{i+1}$.
knaccc avatar
es flag
Yes, the person that calculates the hash I assume is only doing so in order to be able to then discard the prior $x_i$ values, yet still be able to verify if someone then supplies them with the correct copy of those values in the future. But my question is: do the random values (i.e. $x_i$) have to be genuinely random and outside of the control of the source, or do the values only need to be indistinguishable from random to an observer?
caveman avatar
in flag
@knaccc - Yes, truly random, not even the source knows what the next value is going to be. Information about previous random values passes to the next ones through the hashes. E.g. $h_{i}$ gets information about the previous value $x_{i-1}$ as it gets its hash $h_{i-1}$.
knaccc avatar
es flag
I assume the source can't be trusted to assert anything about the values? Because if the source is trusted, the source can just release a "checkpoint" every $n$th time, where a signed message containing the latest hash is announced. If the source can't be trusted to assert anything, how can the source be trusted to introduce a wormhole that attests to the correct value of the hash?
fgrieu avatar
ng flag
In other words: the question as stated is answered by using any standard hash for $f$; making the "wormhole discoverer" compute $h_n$ (requiring time $O(n)$, which is optimal); then making the wormhole itself output $1$ iff its second argument matches $h_n$. I conclude the question needs refinement to prevent similarly dull solutions. Like: we want the "wormhole discoverer" to be as fast as a standard hash. Also, we need a purpose for the first parameter of the wormhole, and I don't see that in the question.
caveman avatar
in flag
@fgrieu - Right. I gave one wrong answer about the distribution of $x_1, x_2, \ldots$. It's not entirely random. It works as follows: $x_i = (y_i, y_{i+1})$, where $y_i$ is the unique identifier of $x_i$, and $y_{i+1}$ is the unique identifier of the next upcoming value $x_{i+1}$. So, when $x_1$ is generated in minute 1, the source knows the next identifier $y_2$, but doesn't know $y_3$. So it has randomness in it, but not entirely. As for the trust: we only trust $h_1$ to be right. - Updated the post with that clarification.
Hhan avatar
jp flag
I think the notion of wormhole you are concerned with is closely related to a recent object called a verifiable delay function, see e.g. https://eprint.iacr.org/2018/712
Score:0
pl flag

Edit: Clarifying answer by using a Merkle tree as example.

For a list of values $x_1,\ldots,x_n$, you can compute a Merkle tree with root $h_n$. For a given $h_n'$ claimed to be $h_n$, you can shorten the verification by revealing the authentication path of length $\log_2(n)$ between any known leaf hash $f(x_i)$ and the claimed $h_n'$.

By defining the wormhole $w_{i,n}$ as the authentication path between $f(x_i)$ and $h_n$, you can verify that $h_n' = h_n$ in $\log_2(n)$ calls to $f$. As pseudo-code:

def verify(f_x_i, w, h_n):
    h_i = f_x_i
    for h_j, sibling_is_left in w:
        # concatenate in the order the tree was built: the sibling
        # goes on the left or the right depending on its position
        h_i = f(h_j + h_i) if sibling_is_left else f(h_i + h_j)
    return h_i == h_n

You can then call verify(f(x_i), w_i_n, h_n'). Notably, verifying $h_n'$ does not require redoing all previous hashing, only a subset of logarithmic size called the authentication path.
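As a hedged, self-contained Python sketch of this scheme (SHA-256 for $f$, a perfectly balanced tree with $n = 2^k$ leaves, and the helper name `merkle_root_and_path` are my assumptions):

```python
import hashlib

def f(data: bytes) -> bytes:
    """Stand-in for the hash function f (SHA-256, as an assumption)."""
    return hashlib.sha256(data).digest()

def merkle_root_and_path(leaves, i):
    """Build a perfect Merkle tree over the leaf hashes and return
    (root, authentication path for leaf i). Each path entry records the
    sibling hash and whether it sits to the left of the running hash."""
    level = [f(x) for x in leaves]
    path = []
    while len(level) > 1:
        sib = i ^ 1                          # index of the sibling node
        path.append((level[sib], sib < i))   # True: sibling is on the left
        level = [f(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        i //= 2
    return level[0], path

def verify(f_x_i, path, h_n):
    """Recompute the root from one leaf hash and its authentication path."""
    h = f_x_i
    for sibling, is_left in path:
        h = f(sibling + h) if is_left else f(h + sibling)
    return h == h_n

leaves = [b"x1", b"x2", b"x3", b"x4"]
root, path = merkle_root_and_path(leaves, 2)
assert verify(f(leaves[2]), path, root)  # correct claim accepted
```

For $n = 4$ leaves the path holds $\log_2(4) = 2$ sibling hashes, so verification costs logarithmically many calls to $f$ rather than $n$.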


The drawback of using a Merkle tree here is that we have to assume it's perfectly balanced, i.e. that $n = 2^k$ for some $k$, and appending requires rebuilding the tree, so there is no gain. If you instead append each $h_i$ to a Merkle Mountain Range, there is an analogue of the authentication path that shows a leaf lives under a set of peaks rather than under a single root.

caveman avatar
in flag
How would those peak hash digests link to $h_1$?