Score:1

Differential Privacy with Outliers


To use the Laplace mechanism, we need the global sensitivity of the query function. What can we do when the dataset contains one huge outlier (or several outliers), so that the global sensitivity becomes too large and the added noise makes the query outputs meaningless? In which cases can we use the local sensitivity instead? Any ideas, papers, or references would be appreciated.

Score:0

There's really very little that can be done in this sort of situation. The noise is parameterised only by $\Delta f/\epsilon$, where $\Delta f$ represents how much the function can change when such an outlier is added to or removed from a queried set, and $\epsilon$ quantifies how hard it is for an adversary to detect a change. Your options are essentially one or more of the following:

  • make it easier for outliers to be identified (by increasing $\epsilon$);
  • purge your data of outliers (thereby reducing $\Delta f$ by changing the permissible subsets, but putting data integrity into question);
  • change your query function to something smoother so that the effect of outliers is less extreme (reducing $\Delta f$ by changing the function, but potentially losing some of the information that is captured).

All of these have drawbacks.
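
As a rough illustration of the third option, one common way to make a query less sensitive to outliers is to clip every value to a fixed range before summing. The sketch below is not taken from the answer: it assumes the add/remove notion of neighbouring datasets, assumes the bounds `lower` and `upper` are chosen without looking at the data, and uses hypothetical helper names (`laplace_noise`, `clip`, `private_clipped_sum`).

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(x, lower, upper):
    """Clamp x into the interval [lower, upper]."""
    return max(lower, min(upper, x))


def private_clipped_sum(values, lower, upper, epsilon, rng=random):
    """Sum of clipped values plus Laplace noise of scale sensitivity / epsilon."""
    # Assumption: neighbouring datasets differ by adding or removing one record,
    # so one record can change the clipped sum by at most max(|lower|, |upper|).
    sensitivity = max(abs(lower), abs(upper))
    clipped_total = sum(clip(v, lower, upper) for v in values)
    return clipped_total + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 1e6]  # one huge outlier
    print(private_clipped_sum(data, lower=0.0, upper=10.0, epsilon=1.0))
```

The trade-off is the one named in the list: the outlier now contributes at most `upper` to the result, so $\Delta f$ is small, but whatever information the outlier carried beyond the bound is lost.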

Score:0

The following procedure will work:

  1. Take a transformation of the original data that removes the outliers. The simplest such transformation is the rank-normal transformation: take $x$, map it to the empirical CDF at that point, and then apply the probit function to map the uniformly distributed ranks to a normally distributed dataset.
  2. Apply Laplace noise as usual to the transformed, outlier-free dataset.
  3. Apply the inverse of the transforms above to arrive back in the original space.
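
The steps above can be sketched roughly as follows, using only the Python standard library. This is one possible reading of the procedure, not the answerer's code: the function names are hypothetical, the empirical CDF is scaled by $n + 1$ (an assumption, to keep the probit finite), and the noise `scale` is left as a free parameter because the answer does not say how to calibrate it.

```python
import bisect
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: inv_cdf is the probit function


def laplace_noise(scale, rng=random):
    """Draw one Laplace(0, scale) sample as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def rank_normal_transform(values):
    """Map each value to a normal score via its empirical CDF rank."""
    sorted_vals = sorted(values)
    n = len(sorted_vals)
    scores = []
    for v in values:
        # Number of points <= v; dividing by n + 1 keeps the result inside (0, 1).
        rank = bisect.bisect_right(sorted_vals, v)
        scores.append(_STD_NORMAL.inv_cdf(rank / (n + 1)))
    return scores, sorted_vals


def inverse_rank_normal(scores, sorted_vals):
    """Map normal scores back to the original scale via empirical quantiles."""
    n = len(sorted_vals)
    out = []
    for z in scores:
        u = _STD_NORMAL.cdf(z)
        # Undo the rank / (n + 1) scaling, then clamp to a valid index.
        idx = max(0, min(n - 1, round(u * (n + 1)) - 1))
        out.append(sorted_vals[idx])
    return out


def noisy_release(values, scale, rng=random):
    """Transform, add Laplace noise in the transformed space, transform back."""
    scores, sorted_vals = rank_normal_transform(values)
    noisy = [z + laplace_noise(scale, rng) for z in scores]
    return inverse_rank_normal(noisy, sorted_vals)


if __name__ == "__main__":
    data = [3.0, 1.0, 2.0, 1e6, 2.5]  # one huge outlier
    print(noisy_release(data, scale=0.5))
```

In the transformed space the outlier is just the largest normal score, so the noise no longer has to be scaled to its raw magnitude. Note that the transform and its inverse are both built from the data itself, which this sketch does not account for.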