Why the set membership symbol (∈) is used in formal differential privacy definition?

umityigitbsrn

11/15/23, 9:01 PM

In The Algorithmic Foundations of Differential Privacy (Dwork, C; Roth, A), the formal definition of differential privacy is given as:

The randomized algorithm $\mathcal{M}$ with domain $\mathbb{N}^{|\mathcal{X}|}$ is $(\epsilon, \delta)-$differentially private if for all $\mathcal{S} \subseteq Range(\mathcal{M})$ and for all $x, y \in \mathbb{N}^{|\mathcal{X}|}$ such that $\|x - y\|_1 \leq 1$: $$Pr[\mathcal{M}(x) \in \mathcal{S}] \leq \exp(\epsilon) Pr[\mathcal{M}(y) \in \mathcal{S}] + \delta$$ where $\mathcal{X}$ is collection of records from a universe and $x, y$ are the databases

I can not understand why the '$\in$' operator is used for this definition. I mean, if they used the '=' sign instead of the '$\in$' operator, wouldn't it be the same because the definition should hold for every $\mathcal{S} \subseteq Range(\mathcal{M})$. Can you help me to understand why they use the '$\in$' symbol for formal differential privacy definition?

2 + 0

differential-privacy

Score:1

Crypto

Daniel S

11/16/23, 9:52 AM

The subset statement is stronger. It is true that $$\mathrm{Pr}[\mathcal M(x)\in\mathcal S]=\sum_{\xi\in \mathcal S}\mathrm{Pr}[\mathcal M(x)=\xi].$$

However, if we have the statement $$\mathrm{Pr}[\mathcal M(x)=\xi]\le\exp(\epsilon)\mathrm{Pr}[\mathcal M(y)=\xi]+\delta$$ for every $\xi\in \mathcal S$ and then try to develop a bound for $\mathrm{Pr}[\mathcal M(x)\in\mathcal S]$, the best that we can manage is $$\mathrm{Pr}[\mathcal M(x)\in\mathcal S]=\sum_{\xi\in \mathcal S}\mathrm{Pr}[\mathcal M(x)=\xi]\le\exp(\epsilon)\sum_{\xi\in\mathcal S}\mathrm{Pr}[\mathcal M(y)=\xi]+\sum_{\xi\in\mathcal S}\delta$$ where the RHS is $\exp(\epsilon)\mathrm{Pr}[\mathcal M(y)\in\mathcal S]+|\mathcal S|\delta.$

This is weaker than the definition given due to the extra factor of subset size applied to $\delta$. Trivially the subset bound implies the individual output bound as subsets can have a single element.

+ 0

Score:0

Crypto

kodlu

11/15/23, 10:36 PM

Edit: Your question was unclear (at least to me) until the comment at the bottom of my answer. In that case, as pointed out in the other answer, the set subset statement is stronger: We have a bound that holds uniformly instead of in a pointwise manner.

The set membership $\in$ is different than $\epsilon$ which is a small positive constant used as a parameter in these definitions. So you take $\epsilon>0,$ as small as you want and figure out what the corresponding $\delta$ is for the mechanism you design to be $(\epsilon,\delta)-$differentially private.

So if $x,y$ are close to each other then we have that the algorithm outputs are close, and how close is determined by $(\epsilon,\delta)$ where in $$Pr[\mathcal{M}(x) \in \mathcal{S}] \leq \exp(\epsilon) Pr[\mathcal{M}(y) \in \mathcal{S}] + \delta$$ $\exp(\epsilon)$ is a scalar factor and $\delta$ is an additive factor controlling how near the probability that the algorithm's output for $x$ falls in $\cal S$ compared to the probability that the algorithm's output for $y$ falls in $\cal S$ for all subsets $\cal S$ in the range of the algorithm.

+ 1

umityigitbsrn

11/16/23, 7:17 AM

That is actually not what I ask. I already knew the difference between $\in$ and $\epsilon$. However, I do not understand why the set membership ($\in$) operator is used in the definition instead of the equal sign (=). Specifically, I am trying to ask the reason why they use $Pr[\mathcal{M}(x) \in \mathcal{S}]$ statement rather than using $Pr[\mathcal{M}(x) = \xi]$ where $\xi$ is a specific output.

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Why the set membership symbol (∈) is used in formal differential privacy definition?

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.