[Busec] [BUSec] Alexander Pretschner tomorrow (Wed. 10am)

Foteini Baldimtsi foteini at baldimtsi.com
Tue Sep 29 19:57:45 EDT 2015


Just a reminder for tomorrow's seminar. 10am at Hariri Alexander
Pretschner  from Technische Universität München (currently visiting MIT)
will talk about his work on distributed data usage control.


BUsec Calendar:  http://www.bu.edu/cs/busec/
BUsec Mailing list: http://cs-mailman.bu.edu/mailman/listinfo/busec

The busec seminar gratefully acknowledges the support of BU's Center for
Reliable Information Systems and Cyber Security (RISCS).


***************
How I know you printed my email
Speaker: Alexander Pretschner, Technische Universität München
Wednesday Sept 30, 2015  10-11am
Hariri Seminar Room, MCS180

*Abstract:* This talk tackles the problem of specifying, monitoring and
enforcing data usage requirements of the kind, “print my email at most
twice,” “notify me upon dissemination of my address,” “no more than three
copies of a confidential document in the company,” “delete all copies of a
movie within thirty days,” “keep financial record for five years,” and the
like.

We first sketch how to formally express such requirements, and how to
semi-automatically transform user-level policies into technical policies
that can be observed or enforced by our infrastructure. As a second step,
we present this infrastructure that can act both post factum, for
accountability purposes, and preventively. It builds on two main ideas.
One, requirements come at various levels of abstraction: prohibiting
screenshots, writing files, playing songs, copying database rows can most
conveniently observed and controlled by monitors at different layers of
abstraction. Two, when data is to be protected, usually all of its
representations are meant to be protected: a picture comes as network
packets, pix map, cache file, DOM object. This requires information flow
tracking technology across the layers of a system and across systems.

Practical information flow tracking across layers and across systems with
multiple monitors faces significant challenges. These include performance,
over-approximations as well as completeness of the data flow analyses, and
security of the infrastructure. Time permitting, we sketch mitigations
based on quantity and structure of data as well as hybrid static-dynamic
information flow trackers.


*Bio:* Alexander Pretschner currently is a visiting researcher in MIT’s
distributed information group. Since 2012, he has held the chair of
software engineering at Technische Universität München in Germany. Prior
appointments include positions at the Karlsruhe Institute of Technology,
the Fraunhofer Institute for Experimental Software Engineering, and ETH
Zurich. Research interests include systems testing, security, and data
usage control. More information athttps://www22.in.tum.de/en/pretschner/.


***************************


Practical Differential Privacy
Speaker: George Kellaris
Wednesday Oct. 14th, 2015  10-11am\
Hariri Seminar Room, MCS180

Abstract: We focus on publishing statistics on a private database with
epsilon-differential privacy. We target at three scenarios; (i) when the
statistics are computed over a static database with high sensitivity, (ii)
when the statistics are published continuously over data that are updated
by a stream, and (iii) when we wish to publish 1-dimensional histograms.
For the first scenario, we address one-time publishing of non-overlapping
counts computed over highly sensitive data. These statistics are useful in
a wide and important range of applications, including transactional,
traffic and medical data analysis. Prior work on epsilon-differential
privacy publishes such statistics with prohibitively low utility in several
practical scenarios. Towards this end, we present GS, a method that
pre-processes the counts by elaborately grouping and smoothing them via
averaging. This step acts as a form of preliminary perturbation that
diminishes sensitivity, and enables GS to achieve epsilon-differential
privacy through low Laplace noise injection. The grouping strategy is
dictated by a sampling mechanism, which minimizes the smoothing
perturbation. We demonstrate the superiority of GS over its competitors,
and confirm its practicality, via extensive experiments on real datasets.
For the second scenario, we address continuously publishing of
non-overlapping counts. Numerous applications require continuous
publication of statistics for monitoring purposes, such as real-time
traffic analysis, timely disease outbreak discovery, and social trends
observation. These statistics may be derived from sensitive user data and,
hence, necessitate privacy preservation. Although epsilon-differential
privacy is a notable paradigm for offering strong privacy guarantees in
statistics publishing, there is limited literature that adapts this concept
to settings where the statistics are computed over an infinite stream of
“events" (i.e., data items generated by the users), and published
periodically. These works aim at hiding a single event over the entire
stream. We argue that, in most practical scenarios, sensitive information
is revealed from multiple events occurring at contiguous time instances.
Towards this end, we put forth the novel notion of w-event privacy over
infinite streams, which protects an event sequence occurring in w
successive time instants. We first formulate our privacy concept, motivate
its importance, and introduce a methodology for achieving it. We next
design two instantiations, whose utility is independent of the stream
length. Finally, we confirm the practicality of our solutions experimenting
with real data.
For the third scenario, we focus on the problem of differentially private
histogram publication, for range-sum query answering. Specifically, we
derive an 1-dimensional histogram from a given dataset, such that (i) it
satisfies epsilon-differential privacy, and (ii) it achieves high utility
for queries that request the sum of contiguous histogram bins. Existing
schemes are distinguished into two categories: "data-aware" and
"data-oblivious". Data-aware methods exploit data characteristic to
increase utility. However, they have superquadratic running time, and their
error increases with the query range. On the other hand, data-oblivious
methods are fast but may yield lower utility for datasets commonly found in
practice, especially for short ranges. We propose schemes that combine and
improve characteristics of both approaches, with emphasis on both
efficiency and utility. Towards this goal, we formulate a principled
approach, which defines a small set of simple modules, based on which we
can devise a variety of more complex schemes. We first express the
state-of-the-art methods in terms of these modules, which allows us to
identify the performance bottlenecks. Next, we design novel efficient and
effective schemes based on non-trivial module combinations. We
experimentally evaluate all mechanisms on three real datasets with diverse
characteristics, and demonstrate the benefits of our proposals over
previous work.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs-mailman.bu.edu/pipermail/busec/attachments/20150929/8841537a/attachment.html>


More information about the Busec mailing list