Score:1

Advanced Composition in DP is worse than Basic Composition


I have trouble understanding the advanced composition theorem in DP.

Suppose I have two approximate-DP mechanisms ($k = 2$), where each satisfies $(\epsilon = 0.5, \delta = 0.1)$-DP. By basic composition, I know that running the two queries sequentially guarantees $(2 \cdot 0.5, 2 \cdot 0.1) = (1, 0.2)$-DP.

Advanced composition, however, says that if, instead of taking $\delta' = k\cdot \delta$, we are willing to take $\delta' = k\cdot \delta + \tilde{\delta}$ for some $\tilde{\delta}>0$, then our $\epsilon'$ improves from $k\epsilon$ to $\epsilon' = k\cdot \epsilon(\exp(\epsilon) - 1) + \epsilon\sqrt{2 \cdot k \cdot \log (1/\tilde{\delta})}$.

Now, assume I am happy with $\delta' = 0.3$ instead of $\delta' = 0.2$. This means $\tilde{\delta}= 0.1$. So, $$\epsilon' = 2\cdot 0.5(\exp(0.5) - 1) + 0.5\sqrt{2 \cdot 2 \cdot \log (1/0.1)} \approx 2.17 \gg 1.$$
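
For concreteness, here is a quick Python check of this arithmetic (a minimal sketch of my own; it uses natural logarithms, as is standard in these bounds):

```python
# Basic vs. advanced composition for k = 2 copies of a (0.5, 0.1)-DP mechanism.
import math

k, eps = 2, 0.5
delta_tilde = 0.1

eps_basic = k * eps  # basic composition: k * eps = 1.0
eps_advanced = (k * eps * (math.exp(eps) - 1)
                + eps * math.sqrt(2 * k * math.log(1 / delta_tilde)))

print(eps_basic)     # 1.0
print(eps_advanced)  # ~2.17 -- worse than the basic bound here
```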

I don't understand how this improves upon basic composition, since here it obviously does not! Am I doing something wrong?

Edit:

The specific numbers I fixed play no role. In general, we know that we can compose $k$ mechanisms (assume each is $(\epsilon, \delta)$-DP) and get $(k\epsilon, k\delta)$-DP just by basic composition. But, by increasing $k\delta$ a bit, we get an $\epsilon'$ equal to:

$$k \epsilon \underbrace{(e^\epsilon - 1)}_{\geq 0} + \underbrace{\epsilon \sqrt{2 k \log(1 / \tilde{\delta})}}_{\geq 0} $$ which is not always less than $k\epsilon$.

Specifically, let my extra allowance be $\tilde{\delta} = 0.1$. I want to see when advanced composition improves upon basic composition, i.e., when the following holds:

\begin{align}
& k\epsilon > k \epsilon (e^\epsilon - 1) + \epsilon \sqrt{2 k \log(1 / \tilde{\delta})} \\
\iff & k > k (e^\epsilon - 1) + \sqrt{2 k \log(1 / \tilde{\delta})} \\
\iff & \sqrt{k}(2 - e^\epsilon) > \sqrt{2 \log(1 / \tilde{\delta})} \\
\iff & k > \frac{2 \log(1 / \tilde{\delta})}{(2 - e^\epsilon)^2} \\
\iff & k > \frac{2 \log(10)}{(2 - e^\epsilon)^2},
\end{align}

where squaring in the next-to-last step is valid because the inequality can only hold when $2 - e^\epsilon > 0$, i.e., when $\epsilon < \log 2$. Now assume I want to use $2$ mechanisms. Then, I need to have:

\begin{align}
& \log(10) < (2 - e^\epsilon)^2 \\
\iff & \epsilon < \log\left(2 - \sqrt{\log(10)}\right) \approx -0.7286,
\end{align}

which is impossible, since $\epsilon$ must be positive. So, when $k = 2$, and if I am willing to add only $0.1$ to $\delta'$, I can never improve upon basic composition with advanced composition?

Edit 2: We can, in general, say that advanced composition only improves upon the basic composition if the following is satisfied:

$$ \epsilon < \log\left[2 - \sqrt{\frac{2 \log ( 1/\tilde{\delta})}{k}} \right] $$

which requires $k > 2\log(10) \approx 4.6$ (i.e., $k \geq 5$) when, e.g., $\tilde{\delta} = 0.1$, and this threshold increases as $\tilde{\delta}$ decreases.
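
To see this numerically, here is a small script (my own sketch, using the threshold just derived) that reports, for a few values of $k$, the largest $\epsilon$ for which advanced composition can beat basic composition when $\tilde{\delta} = 0.1$:

```python
# For each k, the largest eps for which advanced composition can beat
# basic composition, using the threshold from Edit 2 (delta_tilde = 0.1).
import math

delta_tilde = 0.1
for k in (2, 5, 10, 50, 100):
    arg = 2 - math.sqrt(2 * math.log(1 / delta_tilde) / k)
    eps_max = math.log(arg) if arg > 1 else None  # None: advanced never wins
    print(k, eps_max)
```

For $k = 2$ this prints `None` (advanced composition never wins), and the threshold first becomes positive at $k = 5$, consistent with the derivation above.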

Overall, I feel like advanced composition is really useless when $k$ is not large. Is this true?

travelingbones
I've used this a bit in practice (I have not attempted to prove it in general), but my limited experience is that for small `k`, basic composition was better; for large `k`, advanced composition and others should be considered. I agree with your intuition, and I think the question is a really important one! In my case `k` was fixed and I had my choice of mechanism (e.g., Laplace, Gaussian). I computed epsilon/delta under basic composition, advanced composition, Rényi composition (converting back to epsilon/delta and optimizing over alpha), and zero-concentrated composition, and used the best one.
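
(For illustration, here is a minimal sketch of the Rényi step of that workflow. It assumes the Gaussian mechanism with sensitivity 1 and noise scale `sigma`, uses the standard RDP-to-DP conversion from Mironov's Rényi DP paper, and the parameter values are made up.)

```python
# k-fold composition of the Gaussian mechanism via Renyi DP (RDP):
# compose in RDP, convert back to (eps, delta)-DP, optimize over alpha.
import math

def gaussian_rdp_eps(k, sigma, delta):
    best = float("inf")
    for i in range(2, 10000):               # grid over alpha > 1
        alpha = 1 + i / 100
        rdp = k * alpha / (2 * sigma ** 2)  # RDP of k composed Gaussian queries
        eps = rdp + math.log(1 / delta) / (alpha - 1)  # RDP -> (eps, delta)-DP
        best = min(best, eps)
    return best

print(gaussian_rdp_eps(k=100, sigma=10.0, delta=1e-5))  # ~5.3
```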
independentvariable
Thanks for the very insightful answer, @travelingbones!
Score:2

First, there are other composition results; for example, I believe this one improves on advanced composition. I'll answer a more general question, though (which I think you are getting at).

Given mechanisms $M_1,\dots, M_n$, how does one get the best parameters for their composition?

Ideally we could say "use basic composition for small $k$" and "use advanced composition for large $k$". Unfortunately, one can formally show that it is not this straightforward. The Complexity of Computing the Optimal Composition of Differential Privacy studies, for "input" mechanisms $M_1,\dots, M_n$ with parameters $(\epsilon_1,\delta_1),\dots, (\epsilon_n, \delta_n)$, the problem of finding the "minimal" parameters $(\epsilon^*, \delta^*)$ of the composed mechanism.

Prior results were known. For example:

If $M_1,\dots, M_n$ are all $(\epsilon,\delta)$-private for a fixed pair $(\epsilon, \delta)$, and a target $\delta^*$ is given, then the optimal value of $\epsilon^*$ is the minimal $\epsilon^*\geq 0$ such that $$\frac{1}{(1+e^\epsilon)^n}\sum_{\ell = \lceil (\epsilon^*+n\epsilon)/(2\epsilon)\rceil}^n\binom{n}{\ell}\left(e^{\ell \epsilon}-e^{\epsilon^*}e^{(n-\ell)\epsilon}\right) \leq 1 - \frac{1-\delta^*}{(1-\delta)^n}.$$
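
As a sanity check, here is a minimal sketch of this characterization (assuming I am reading the condition correctly; this is not production privacy accounting), which binary-searches the smallest $\epsilon^*$ satisfying it:

```python
# Optimal homogeneous composition: find the minimal eps_star satisfying
# the displayed condition, by binary search (the LHS decreases in eps_star).
# Assumes delta_star >= 1 - (1 - delta)**n, so some eps_star <= n*eps works.
import math

def condition_holds(eps_star, n, eps, delta, delta_star):
    lo_l = math.ceil((eps_star + n * eps) / (2 * eps))
    lhs = sum(math.comb(n, l) *
              (math.exp(l * eps) - math.exp(eps_star + (n - l) * eps))
              for l in range(lo_l, n + 1))
    lhs /= (1 + math.exp(eps)) ** n
    return lhs <= 1 - (1 - delta_star) / (1 - delta) ** n

def optimal_eps(n, eps, delta, delta_star, tol=1e-6):
    lo, hi = 0.0, n * eps  # basic composition is always an upper bound
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if condition_holds(mid, n, eps, delta, delta_star):
            hi = mid
        else:
            lo = mid
    return hi

# The numbers from the question: two (0.5, 0.1)-DP mechanisms, delta* = 0.3.
print(optimal_eps(n=2, eps=0.5, delta=0.1, delta_star=0.3))  # ~0.57
```

If I have implemented this correctly, the question's numbers give $\epsilon^* \approx 0.57$, i.e., the optimal composition beats both the basic bound ($\epsilon' = 1$) and the advanced bound ($\epsilon' \approx 2.17$).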

The paper I linked extends this result to the case of $M_1,\dots, M_n$ being private with parameters $(\epsilon_1,\delta_1),\dots,(\epsilon_n,\delta_n)$ that are not all the same. They find a similar (but more complicated) expression characterizing the optimal value of $\epsilon^*$ (given a "target" $\delta^*$), and show that computing this optimum is $\#P$-complete, i.e., it is unlikely that it can be done efficiently. This is true even for the composition of pure mechanisms, meaning $\delta_i = 0$ for all $i$.

Perhaps most interesting to you is that there are approximation algorithms for this problem (also in that paper). I do not know whether anyone has implemented them, but if so, they seem like a good option for concretely selecting parameters.

It is worth mentioning as well that there are closely related notions of privacy (namely concentrated differential privacy) where the composition story is more straightforward, while one still (in a certain sense) achieves $O(\sqrt{k})$ scaling under $k$-fold composition, rather than the $O(k)$ scaling that basic composition gives you.
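
To make that concrete, here is a minimal sketch (the Gaussian mechanism with sensitivity 1 and the Bun-Steinke conversion $\rho$-zCDP $\Rightarrow$ $(\rho + 2\sqrt{\rho\log(1/\delta)}, \delta)$-DP are my additions, not part of the answer above):

```python
# Roughly sqrt(k)-style growth of eps under k-fold zCDP composition of
# the Gaussian mechanism (sensitivity 1, noise scale sigma).
import math

def zcdp_eps(k, sigma, delta):
    rho = k / (2 * sigma ** 2)  # zCDP parameters simply add under composition
    return rho + 2 * math.sqrt(rho * math.log(1 / delta))

for k in (1, 10, 100, 1000):
    print(k, round(zcdp_eps(k, sigma=10.0, delta=1e-5), 3))
```

While $k\rho$ stays small, the $2\sqrt{\rho\log(1/\delta)}$ term dominates, so $\epsilon$ grows roughly like $\sqrt{k}$.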
