[Busec] MSR talk on differential privacy

Sharon Goldberg goldbe at cs.bu.edu
Fri Oct 28 10:29:51 EDT 2011

WHO:                    Kamalika Chaudhuri
TITLE:                  Sample Complexity Bounds for Differentially
Private Classification
WHEN:                   Monday, October 31st, 2011
WHERE:                  Paul conference room (first floor), 1 Memorial
Drive, Cambridge
TIME:                   4:00 - 5:00 PM

We study the problem of differentially private classification --
namely, learning a classifier from sensitive training data, while
still preserving the privacy of individuals in the data set. A natural
question to ask is: how many samples does a learning algorithm need
in order to guarantee a given level of privacy and accuracy?
Previous work studied this question in the context of discrete data
distributions, providing an upper bound, as well as lower bounds for
some specific classes. In the first part of this talk, we show a
lower bound that holds for general classification problems obeying
certain fairly loose conditions.

In the second part of this talk, we consider this question in the
context of learning infinite hypothesis classes on continuous data
distributions. We show that in this case, even for very simple
hypothesis classes, any algorithm that uses a finite number of
examples and guarantees differential privacy must fail to return an
accurate classifier for at least some unlabeled data distributions.
This contrasts with the case of finite hypothesis classes or
discrete data domains, where distribution-free private learning is
possible (as shown in previous work).

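For readers unfamiliar with the privacy notion underlying the talk, the following is a minimal sketch (not from the talk itself) of an epsilon-differentially private release of a count via the standard Laplace mechanism. The function names and the example predicate are illustrative; the point is that the noise scale 1/epsilon is independent of the number of records, so the relative error of the released statistic shrinks as the sample grows -- the kind of privacy/accuracy/sample-size trade-off the abstract refers to.

```python
import math
import random

def laplace_noise(scale):
    # Sample from the Laplace(0, scale) distribution via inverse-CDF:
    # draw u uniform in (-0.5, 0.5) and invert the Laplace CDF.
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_count(records, predicate, epsilon):
    """Release a count satisfying epsilon-differential privacy.

    Adding or removing a single record changes the true count by at
    most 1 (sensitivity 1), so Laplace noise with scale 1/epsilon
    gives the epsilon-DP guarantee.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

For example, privately estimating how many of 100 records satisfy a predicate adds noise of constant magnitude about 1/epsilon, whereas the true count grows with the data set, illustrating why larger samples make private estimates proportionally more accurate.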
Joint work with Daniel Hsu.
