Score:1

Crypto

Checking equivalence among distributed sets

zetaprime

5/29/23, 6:03 PM

I have elements from $\{0, 1\}^{n}$ (range of a hash function)
The master $A$ can have any subset of this range.
There are clients that each have a subset from the space, too.
I want to make sure that the union of the clients' sets is equal to the master set
The communication should be as least as possible.
The elements are secret. (this requirement can be relaxed with a solution that could potentially leak)

thanks to @kelalaka for refining the question.

0 + 0

hash

collision-resistance

homomorphic-signatures

knaccc

5/29/23, 6:37 PM

Isn't it trivial to filter out duplicates?

zetaprime

5/29/23, 6:41 PM

No it's not trivial, because they are distributed. Filtering out duplicates would mean copying the data to multiple locations or to a single location.

zetaprime

5/29/23, 8:41 PM

Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/133625/discussion-between-zetaprime-and-kelalaka).

kelalaka

5/29/23, 8:49 PM

If you can relax the secret, this is better for CS, IMHO.

Maarten Bodewes

5/30/23, 4:38 PM

@knacc & all I've removed a lot of comments of zetaprime as they didn't make sense to third parties (anymore). If anything is missing, please add them again.

Score:1

Crypto

meshcollider

5/29/23, 7:52 PM

A potential option would be to use Bloom filters to identify potential duplicates probabilistically, and then check if they really are duplicates by sending the few potentials. If you use large enough bloom filters this would be sufficient for any size.

Alternatively, which may not be a great fit after all, you could use miniscketch.

From the readme,

Sketches, as produced by this library, can be seen as "set checksums" with two peculiar properties:

Sketches have a predetermined capacity, and when the number of elements in the set is not higher than the capacity, libminisketch will always recover the entire set from the sketch. A sketch of b-bit elements with capacity c can be stored in bc bits.

The sketches of two sets can be combined by adding them (XOR) to obtain a sketch of the symmetric difference between the two sets (i.e., all elements that occur in one but not both input sets).

So assuming you use a large enough capacity, you can just XOR the sketches to identify the unique elements and remove the rest.

0 + 0

kelalaka

5/29/23, 8:00 PM

We don't know the possible size of the universe and the subsets $A$ and the clients. yet

meshcollider

5/29/23, 8:11 PM

Hmm, updated with another potential solution

kelalaka

5/29/23, 8:19 PM

I think miniscketch will fail on the security requirement and Bloom, too

zetaprime

5/29/23, 8:30 PM

for the bloom filters: I can see if all clients would have the blooom filters of others then I guess they could agree on a protocol to discard the duplicates. How do I then check the equality of the union?

zetaprime

5/29/23, 8:33 PM

for the minisketch it writes, "A sketch of b-bit elements with capacity c can be stored in bc bits." unless I misunderstood it, it's somewhat storing all the elements in the sketch. I will read further.

meshcollider

5/29/23, 8:36 PM

@zetaprime yeah the elements are recoverable from the sketch so it's probably too large for your purposes. I was mainly focused on the additive part initially.

meshcollider

5/31/23, 7:12 AM

In terms of testing equality, this may help: [A probabilistic set with no false positives?](https://cstheory.stackexchange.com/questions/6596/a-probabilistic-set-with-no-false-positives). I can imagine one scenario where, after you have removed all of the duplicates between the clients, you could do some sort of RSA accumulator deterministically and then ensure equality of those. Essentially hash each element to a prime (hopefully with negligible chance of collision), and raise a fixed generator to all those prime powers in a finite field. It would be probabilistic though.

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Checking equivalence among distributed sets

TH: การตรวจสอบความเท่าเทียมกันระหว่างชุดที่แจกแจง

RO: Verificarea echivalenței între mulțimile distribuite

RU: Проверка эквивалентности среди распределенных наборов

VI: Kiểm tra sự tương đương giữa các tập phân tán

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.