[Dmbu-l] Fwd: [cs-talks] John Liagouris, Friday March 31st, 11:00am @ MCS 148

Charalampos Mavroforakis cmav at bu.edu
Tue Mar 28 19:11:05 EDT 2017

Update: The talk will be at MCS 148

- Harry

*Understanding Distributed Dataflow Systems*

*John Liagouris,* *Post-Doctoral Researcher, ETH Zurich*

*Friday March 31st, 11am – 12:30pm*


In this talk I will present our recent work on understanding distributed
dataflow systems like Apache Spark, Apache Flink, and Google’s TensorFlow.
The first part of the talk will focus on understanding the semantics of
distributed dataflows: Why does a dataflow return certain results and what
should output explanations look like? To answer such questions, we leverage
existing work in data provenance, and we advance the state of the art to
provide output explanations that are both sufficient and concise. The
second part of the talk will focus on understanding the performance of
distributed dataflows: Why is a dataflow execution slow and what are the
bottlenecks in the pipeline? To answer such questions, we leverage existing
work on critical path methods, and we advance the state of the art to
analyse the performance of dynamic and continuous computations in near-real
time. We have implemented our ideas in a prototype system, Strymon, that
builds on top of the novel Timely Dataflow framework written in Rust.
Strymon’s ultimate goal is to provide fast and meaningful insights into
complex enterprise datacenters by processing logs of events collected at
all levels of the software and hardware stack in real time.


John Liagouris is a post-doctoral researcher at ETH Zurich, and a member of
the Systems Group. Before joining ETHZ, he was a visiting research fellow
at the University of Hong Kong (2013-2014), and a research assistant at the
Institute for the Management of Information Systems (IMIS) of the Research
and Innovation Center 'Athena', Greece (2009-2015). Dr. Liagouris obtained
a diploma in Electrical and Computer Engineering in 2008, and a PhD in
2015, both from NTU Athens, Greece. His research interests lie in the areas
of datacenter monitoring, modelling and simulation, real-time analytics,
graph data management, distributed system profiling, and software defined

cs-talks mailing list
cs-talks at cs.bu.edu