When Existing Techniques Preserve Differential Privacy
It is no secret that online companies, hospitals, credit-card companies and governments hold massive datasets composed of our sensitive personal details. Information from such datasets is often released using privacy-preserving heuristics, which have repeatedly been shown to fail. That is why, in recent years, differential privacy has been gaining much attention as an approach to data analysis that adheres to a strong and mathematically rigorous notion of privacy. Indeed, differentially private analogs of many existing data-analysis techniques have already been devised. These are, however, new algorithms that add random noise on top of the existing techniques.
In this talk we will demonstrate how existing techniques that were developed independently of any privacy consideration preserve differential privacy by themselves, when their parameters are properly set. The main focus of the talk will be the Johnson-Lindenstrauss Transform, which preserves differential privacy provided that the input satisfies certain ``well spread'' properties. We will discuss applications of this algorithm to approximating multiple linear regressions and to statistical inference. Moreover, focusing on linear regression, we will exhibit additional techniques that preserve privacy: regularization, the addition of random data points, and Bayesian sampling.
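To make the main object concrete, here is a minimal sketch of a Gaussian Johnson-Lindenstrauss projection applied to a data matrix, assuming NumPy. It is an illustration only, not the algorithm analyzed in the talk: the function names are hypothetical, ``well spread'' is read here as a lower bound on the singular values of the input, and the threshold w is an assumed parameter whose relation to the privacy parameters is not specified in this abstract.

```python
import numpy as np


def min_singular_value(A):
    """Smallest singular value of A, used here as one possible
    measure of how 'well spread' the input is (an assumption)."""
    return float(np.linalg.svd(A, compute_uv=False).min())


def jl_sketch(A, r, rng):
    """Project the n rows of A (n x d) down to r rows.

    R has i.i.d. N(0, 1) entries; scaling by 1/sqrt(r) gives
    E[(RA)^T (RA)] = A^T A, so the second-moment matrix used in
    linear regression is preserved in expectation.
    """
    n = A.shape[0]
    R = rng.standard_normal((r, n))
    return (R @ A) / np.sqrt(r)


def guarded_jl_sketch(A, r, w, rng):
    """Release the sketch only if every singular value of A is at
    least w. The threshold w is hypothetical: how it must depend on
    r and on the privacy parameters is not stated here."""
    if min_singular_value(A) < w:
        raise ValueError("input is not well spread enough for threshold w")
    return jl_sketch(A, r, rng)
```

The projection itself adds no noise on top of the data; the randomness of R is the only source of randomness, which is the sense in which an existing technique may be private by itself once the input condition holds.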
(Time permitting, we will survey a very different technique, de-noising neural networks, that also aligns with the definition of differential privacy in the local model.)
The talk is self-contained and no prior knowledge is assumed.