[NRG] Notification: The Dataflow Model: A Practical Approach to Balancing Cor... @ Mon Nov 16, 2015 11am - 12pm (NRG at BU)

Google Calendar calendar-notification at google.com
Sun Nov 15 11:00:04 EST 2015


This is a notification for:

Title: The Dataflow Model: A Practical Approach to Balancing Correctness,  
Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing  
(Ugur Kaynar)

Authors: Tyler Akidau, Robert Bradshaw, Craig Chambers, Slava Chernyak,  
Rafael J. Fernández-Moctezuma, Reuven Lax, Sam McVeety, Daniel Mills,  
Frances Perry, Eric Schmidt, Sam Whittle (Google)

Abstract: Unbounded, unordered, global-scale datasets are increasingly  
common in day-to-day business (e.g. Web logs, mobile usage statistics, and  
sensor networks). At the same time, consumers of these datasets have  
evolved sophisticated requirements, such as event-time ordering and  
windowing by features of the data themselves, in addition to an insatiable  
hunger for faster answers. Meanwhile, practicality dictates that one can  
never fully optimize along all dimensions of correctness, latency, and cost  
for these types of input. As a result, data processing practitioners are  
left with the quandary of how to reconcile the tensions between these  
seemingly competing propositions, often resulting in disparate  
implementations and systems. We propose that a fundamental shift of  
approach is necessary to deal with these evolved requirements in modern  
data processing. We as a field must stop trying to groom unbounded datasets  
into finite pools of information that eventually become complete, and  
instead live and breathe under the assumption that we will never know if or  
when we have seen all of our data, only that new data will arrive, old data  
may be retracted, and the only way to make this problem tractable is via  
principled abstractions that allow the practitioner the choice of  
appropriate tradeoffs along the axes of interest: correctness, latency, and  
cost. In this paper, we present one such approach, the Dataflow Model,  
along with a detailed examination of the semantics it enables, an overview  
of the core principles that guided its design, and a validation of the  
model itself via the real-world experiences that led to its development.

When: Mon Nov 16, 2015 11am - 12pm Eastern Time
Where: MCS 148
Video call: https://plus.google.com/hangouts/_/bu.edu/nrg  
<https://plus.google.com/hangouts/_/bu.edu/nrg?hceid=NTYwam42bnQ1aGo0b2YzcnNyaWNoZnB0aW9AZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ.p99jsbkk8u39gm9u64takc7v1k>
Calendar: NRG at BU
Who:
     * doucette at bu.edu - creator

Event details:  
https://www.google.com/calendar/event?action=VIEW&eid=cDk5anNia2s4dTM5Z205dTY0dGFrYzd2MWsgNTYwam42bnQ1aGo0b2YzcnNyaWNoZnB0aW9AZw

Invitation from Google Calendar: https://www.google.com/calendar/

You are receiving this email at the account nrg-l at cs.bu.edu because you are  
subscribed for notifications on calendar NRG at BU.
