Hardening a polynomial checksum scheme

I have a checksum scheme that uses a simple polynomial summation as described here.

Basically I'll take a random value $R$ and a set of inputs $[v_0, v_1, v_2]$ and compute the checksum as $v_0*R + v_1*R^2 + v_2*R^3$. This was designed to give simple collision resistance with incremental checksumming.
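A minimal sketch of this incremental scheme in Python (the prime and the concrete $R$ below are illustrative choices, not values from the post):

```python
# Illustrative parameters, not from the post: any ~256-bit prime works.
p = 2**255 - 19
R = 7919  # stands in for the random value R; sampled uniformly in practice

def checksum(values, R, p):
    """v_0*R + v_1*R^2 + ... + v_{n-1}*R^n  (mod p)."""
    acc = 0
    for i, v in enumerate(values):
        acc = (acc + v * pow(R, i + 1, p)) % p
    return acc
```

Appending a new value $v_n$ only requires adding the single term $v_n R^{n+1}$ to the running total, which is what makes the scheme incremental.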

I'm trying to change this algorithm to be resistant to chosen inputs; that is, an adversary who chooses $V = [v_0, v_1, \ldots]$ should still be unable to break the checksum.

My current approach is to discard the incremental component and instead sample the $R$ value from the inputs. I define a constant $R_c$ and a set of inputs $V = [v_0, v_1, \ldots]$, along with a hash function $H$ modeled as a random oracle. All arithmetic is in $\mathbb{F}_{p}$ where $p \approx 2^{256}$.

First I sample an $R$ value from $V$ and $R_c$ as follows:

$R = H( \sum_{i=0}^{|V|-1} v_i * R_c^i)$

Then I apply the original strategy above with this $R$ value sampled from $V$:

$C = \sum_{i=0}^{|V|-1} v_i * R^i$

Now $C$ is the final checksum of the data.
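The two-step scheme can be sketched as follows. SHA-256 reduced mod $p$ stands in for the random oracle $H$, and the concrete $p$ and $R_c$ are placeholder assumptions, not values from the post:

```python
import hashlib

p = 2**255 - 19   # placeholder ~256-bit prime
R_c = 12345       # placeholder constant R_c

def H(x):
    # Stand-in for the random oracle H: hash a field element back into F_p.
    return int.from_bytes(hashlib.sha256(x.to_bytes(32, "big")).digest(), "big") % p

def poly(values, r):
    # sum_i v_i * r^i  (mod p)
    return sum(v * pow(r, i, p) for i, v in enumerate(values)) % p

def checksum(values):
    R = H(poly(values, R_c))   # sample R from the inputs and R_c
    return poly(values, R)     # C = sum_i v_i * R^i
```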

Does this add security against chosen input attacks, and if so how much? I'm imagining the simple collision case for a single element as (roughly)

$H(v) * v = H(v') * v'$ where $v \neq v'$

This seems awfully close to the security of $H$, is this accurate? Does this change significantly as $|V|$ gets larger?

Daniel S:
I'm not sure what your scheme achieves that is better than just publishing your $R$ value (or just the hash of $V$).
vimwitch:
It's mostly valuable for rank 1 constraint systems in ZK proofs. Executing $H$ many times is expensive whereas combinations of multiplication and addition are cheap. It's also impractical to publish all $R$ values if many hashes are calculated (violates succinctness).
vimwitch:
I should note, $H$ takes a single input, so hashing $V$ takes $|V|$ $H$ invocations. That's what I'm trying to avoid.
poncho:

Does this add security against chosen input attacks, and if so how much?

No, if $R_c$ is public, it is easy to find collisions. The idea is to find a simultaneous collision on both $R$ and $C$.

Here is one approach:

  • Start with provisional inputs: set the low three inputs $v_2, v_1, v_0$ to 0, and choose everything above them arbitrarily (we won't change those values)

  • Compute the value $R = H( \sum_{i=0}^{|V|-1} v_i * R_c^i)$

  • Now, for any arbitrary constant $c$, we can reset $v_2 = c$, $v_1 = -c(R + R_c)$, $v_0 = cRR_c$

For any such value of $c$, the inner sum is unchanged (because $cR_c^2 - c(R + R_c)R_c + cRR_c = 0$, consistent with our initial evaluation), so we get the same value of $R$; the checksum $C$ is also unchanged (because $cR^2 - c(R + R_c)R + cRR_c = 0$ and all higher-order terms are the same).

By choosing two different values of $c$, this gives us a collision. This approach can be adapted to compute preimages, should the attacker find that useful.
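The three steps above can be checked end-to-end in code. The prime, the public $R_c$, and SHA-256 as a stand-in for $H$ are placeholder assumptions:

```python
import hashlib

p = 2**255 - 19   # placeholder prime
R_c = 12345       # assumed public constant

def H(x):
    return int.from_bytes(hashlib.sha256(x.to_bytes(32, "big")).digest(), "big") % p

def poly(values, r):
    return sum(v * pow(r, i, p) for i, v in enumerate(values)) % p

def checksum(values):
    return poly(values, H(poly(values, R_c)))

# Step 1: zero the low three inputs; fix everything above them arbitrarily.
base = [0, 0, 0, 42, 99]

# Step 2: compute R from the provisional inputs.
R = H(poly(base, R_c))

# Step 3: for any c, reset the low terms. Both inner sums are unchanged,
# since c*R_c^2 - c*(R + R_c)*R_c + c*R*R_c = 0, and likewise with r = R.
def forge(c):
    return [(c * R * R_c) % p, (-c * (R + R_c)) % p, c % p] + base[3:]

V1, V2 = forge(1), forge(2)
assert V1 != V2
assert checksum(V1) == checksum(V2) == checksum(base)  # a collision
```

Every choice of $c$ yields another input with the same checksum, so collisions can be generated in bulk.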

vimwitch:
Thank you for the answer. It seems like the algorithm is vulnerable to birthday-type attacks (two random-ish preimages) and preimage recovery. Is it vulnerable to second-preimage attacks? If I have an output $C$, is it trivial to find a second input that produces $C$?
poncho:
@vimwitch: yes, if you can do preimage attacks, then second preimage attacks are easy...