Technische Universität Berlin
DIMA – Databases and Information Management Group
The Apache Flink Platform
for Parallel Batch and Stream Analysis
Jonas Traub | Tilmann Rabl | Fabian Hueske | Till Rohrmann | Volker Markl
In this talk
▪ Apache Flink Primer
• Architecture
• Execution Engine
• API Examples
▪ Stream Processing with Apache Flink
• Micro Batching vs. Native Streaming
• Flexible Windows/Stream Discretization
• Fault Tolerance with distributed snapshotting
▪ Conclusion
Technische Universität Berlin - The Apache Flink Platform for Parallel Batch and Stream Analysis - FGDB 2015
Apache Flink Primer
What is Flink?
A platform for distributed batch and streaming analytics
Streaming dataflow runtime
Flink in the Analytics Ecosystem
(Figure: the analytics stack, by layer)
Applications: Hive, Mahout, Cascading, Pig, Crunch, Giraph, …
Data processing engines: MapReduce, Flink, Spark, Storm, Tez
App and resource management: Yarn, Mesos
Storage, streams: HDFS, HBase, Kafka
What can I do with it?
An engine that can natively support all these workloads:
• Stream processing
• Batch processing
• Machine Learning at scale
• Graph Analysis
Sneak peek: Two of Flink’s APIs
// record type shared by both examples
case class Word (word: String, frequency: Int)
DataStream API (streaming):
val lines: DataStream[String] = env.fromSocketStream(...)
// word count over a 5-second window, evaluated every second
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .keyBy("word")
  .window(Time.of(5, SECONDS)).every(Time.of(1, SECONDS))
  .sum("frequency")
  .print()
DataSet API (batch):
val lines: DataSet[String] = env.readTextFile(...)
// word count over the whole input
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()
Execution Model
▪ Flink program = DAG* of operators and intermediate results
▪ Operator = computation + state
▪ Intermediate result = logical stream of records
(Figure: example dataflow with map, join, and sum operators)
Architecture
▪ Pipelined/Streaming engine
• Complete DAG deployed
(Figure: Job Manager with Worker 1, Worker 2, Worker 3, and Worker 4)
Flink Stream Processing
Ingredients of a Streaming System
▪ Pipelined Execution Engine
▪ Streaming Windows/Discretization
▪ Fault Tolerance
▪ High Level Programming API (or language)
Micro Batching vs Native Streaming
Discretized Streams (D-Streams): a stream discretizer issues a sequence of batch jobs
while (true) {
  // get next few records
  // issue batch computation
}
Native streaming: long-standing operators
while (true) {
  // process next record
}
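The two loops can be contrasted in a small, self-contained Java sketch. This is not Flink or Spark code: the names `ProcessingModels`, `microBatch`, and `nativeStreaming` are hypothetical, and a finite list stands in for the unbounded stream. Both variants compute a running sum, but the micro-batch variant emits one result per batch, while the native variant keeps operator state across records and emits one result per record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of any Flink or Spark API.
class ProcessingModels {

    // Micro-batching: cut the stream into small batches and run one batch job per cut.
    static List<Integer> microBatch(List<Integer> stream, int batchSize) {
        List<Integer> results = new ArrayList<>();
        int sum = 0;
        for (int start = 0; start < stream.size(); start += batchSize) {
            int end = Math.min(start + batchSize, stream.size());
            // "get next few records"
            List<Integer> batch = stream.subList(start, end);
            // "issue batch computation"
            for (int record : batch) {
                sum += record;
            }
            results.add(sum); // one result per batch
        }
        return results;
    }

    // Native streaming: a long-standing operator handles each record as it arrives.
    static List<Integer> nativeStreaming(List<Integer> stream) {
        List<Integer> results = new ArrayList<>();
        int sum = 0; // operator state lives across records
        for (int record : stream) {
            sum += record;    // "process next record"
            results.add(sum); // one result per record
        }
        return results;
    }
}
```

In the micro-batch variant the result latency is bounded below by the batch size; in the native variant each record is reflected in the output immediately.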
Stream Discretization
▪ Data is unbounded
• Interested in a (recent) part of it, e.g., the last 10 days
▪ Most common windows: time and count
• Mostly in sliding, fixed, and tumbling form
▪ Need for data-driven window definitions
• e.g., user sessions (periods of user activity followed by inactivity), price changes, etc.
Great read: Tyler Akidau, "The world beyond batch: Streaming 101"
https://beta.oreilly.com/ideas/the-world-beyond-batch-streaming-101
Flink’s Discretization
▪ Allows very flexible windowing
▪ Borrows ideas from, and extends, IBM’s SPL
• SLIDE = Trigger = When to emit a window
• RANGE = Eviction = What the window contains
▪ Allows for lots of optimization
• Not part of this talk
The Discretizer Operator
Streams are represented as FIFO queues of data-items.
The window operator keeps a FIFO buffer.
After some time, data-items expire (they are deleted).
The window operator is event-driven by data-item arrivals.
1.) Trigger Policies (TPs): specify when to emit the current buffer content as a window.
2.) Eviction Policies (EPs): specify when data-items are removed from the buffer.
Query Example (window of size 3):
dataStream.window(Count.of(3))
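One way to read the trigger/eviction split is as two independent checks that run on every data-item arrival. The sketch below is plain Java, not Flink code: `CountWindowBuffer`, `range`, and `slide` are hypothetical names, and mapping `window(Count.of(3))` to range = 3 with slide = 3 (a tumbling count window) is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a count-based discretizer; not Flink's implementation.
class CountWindowBuffer<T> {
    private final Deque<T> buffer = new ArrayDeque<>(); // FIFO buffer of data-items
    private final int range; // eviction policy: keep at most `range` items
    private final int slide; // trigger policy: emit after every `slide` arrivals
    private int sinceLastTrigger = 0;

    CountWindowBuffer(int range, int slide) {
        this.range = range;
        this.slide = slide;
    }

    // Called on every data-item arrival; returns a window if the trigger fires, else null.
    List<T> onArrival(T item) {
        buffer.addLast(item);
        // Eviction: drop the oldest items until the buffer fits the range.
        while (buffer.size() > range) {
            buffer.removeFirst();
        }
        // Trigger: emit a copy of the current buffer content as a window.
        sinceLastTrigger++;
        if (sinceLastTrigger == slide) {
            sinceLastTrigger = 0;
            return new ArrayList<>(buffer);
        }
        return null;
    }
}
```

With `new CountWindowBuffer<>(3, 3)` the buffer behaves as a tumbling window of three items; with `new CountWindowBuffer<>(3, 1)` it becomes a sliding window that emits the last three items on every arrival. Only the two policy parameters change, which is the point of separating trigger from eviction.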
Flexible Windowing
▪ Windows can be any combination of (multiple) triggers and evictions
• Arbitrary tumbling, sliding, session, etc. windows can be constructed.
▪ Common triggers/evictions are part of the API
• Time, Count, and Delta
▪ Even more flexibility: define your own UDF trigger/eviction
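A data-driven window such as a user session can be expressed with the same two ideas: the trigger fires when the gap between consecutive items exceeds a threshold, and the emitted items are then evicted. The following is a minimal sketch in plain Java, not a Flink API; `SessionWindows`, `sessionize`, and `maxGap` are hypothetical names, and the input is assumed to be sorted by timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a data-driven (session) window; not a Flink API.
class SessionWindows {

    // Splits sorted event timestamps into sessions: a gap larger than maxGap closes the window.
    static List<List<Long>> sessionize(List<Long> timestamps, long maxGap) {
        List<List<Long>> sessions = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : timestamps) {
            // Trigger: inactivity longer than maxGap ends the current session.
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > maxGap) {
                sessions.add(current);
                // Eviction: the emitted items leave the buffer; start a fresh one.
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            sessions.add(current); // flush the last open session at end of input
        }
        return sessions;
    }
}
```

On an unbounded stream the final flush would not exist; the last session would stay open until a later item (or a timer) shows that the gap has been exceeded.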
Fault Tolerance and Operator State
Comparing Fault Tolerance Solutions
Distributed snapshots (Flink):
• Based on consistent global snapshots
• Algorithm inspired by Chandy-Lamport
• Low runtime overhead
• Stateful exactly-once semantics
Message tracking/acks, as in Storm (at-least-once guarantee)
RDD re-computation, as in Spark
Example: A Stateful Map (counter)
// Imports assume the Flink 0.9/0.10 package layout.
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.checkpoint.Checkpointed;

public class Counter implements MapFunction<Long, Long>, Checkpointed<Long> {
  // persistent counter
  private long counter = 0;

  // replaces every input value with the number of records seen so far
  @Override
  public Long map(Long value) {
    return ++counter;
  }

  // regularly persists state during normal operation
  @Override
  public Long snapshotState(long checkpointId, long checkpointTimestamp) {
    return counter;
  }

  // restores state on recovery from failure
  @Override
  public void restoreState(Long state) {
    counter = state;
  }
}
Distributed Snapshots
(Figure: timeline t1, t2, t3 with snapshots "snap - t1" and "snap - t2"; reset from snap t2)
Assumptions
• repeatable sources
• reliable FIFO channels
Taking Snapshots
Initial approach (e.g., Naiad)
• Pause execution on t1, t2, ...
• Collect state
• Restore execution
Asynchronous Snapshots in Flink
Push checkpoint barriers through the data flow
• Before barrier: part of the snapshot
• After barrier: not in snapshot (backed up until the next snapshot)
(Figure: a barrier travelling with the data stream; operator states: checkpoint starting, checkpoint in progress, checkpoint done)
[Carbone et al. 2015] “Lightweight Asynchronous Snapshots for Distributed Dataflows”, Tech. Report.
http://arxiv.org/abs/1506.08603
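The barrier idea can be illustrated for a single operator with one input channel. This is a simplified sketch in plain Java, not Flink's implementation and not the algorithm from the cited report: `BarrierCounter` and `Element` are hypothetical names, the snapshot is stored in memory, and barrier alignment across multiple inputs is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-input sketch of barrier-based snapshotting; not Flink's implementation.
class BarrierCounter {

    // A stream element is either a data record or a checkpoint barrier.
    static final class Element {
        final boolean isBarrier;
        final long value; // record payload, or checkpoint id for a barrier

        private Element(boolean isBarrier, long value) {
            this.isBarrier = isBarrier;
            this.value = value;
        }

        static Element record(long v) { return new Element(false, v); }
        static Element barrier(long checkpointId) { return new Element(true, checkpointId); }
    }

    private long count = 0; // operator state: number of records seen
    final Map<Long, Long> snapshots = new LinkedHashMap<>(); // checkpoint id -> state
    final List<Element> output = new ArrayList<>();          // stands in for the downstream channel

    void process(Element e) {
        if (e.isBarrier) {
            // All records before the barrier are reflected in the state:
            // snapshot it, then forward the barrier downstream.
            snapshots.put(e.value, count);
            output.add(e);
        } else {
            count++;
            output.add(Element.record(count));
        }
    }

    // On failure: reset to a snapshot; a repeatable source replays the records after that barrier.
    void restore(long checkpointId) {
        count = snapshots.get(checkpointId);
    }

    long currentCount() { return count; }
}
```

Because the channel is FIFO, the barrier cleanly separates the records that belong to the snapshot from those that do not, which is why the slide lists reliable FIFO channels and repeatable sources as assumptions.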
Closing
Community
Flink started as the Stratosphere project in 2009, led by TU Berlin.
It entered Apache incubation in April 2014 and graduated in December 2014.
It is now one of the most active big data projects, after over a year in the Apache Software Foundation.
tl;dr: what was this about?
• The Berlin Big Data Center
• Native Streaming with Apache Flink
• Flexible Windowing
• Fault Tolerance with exactly-once guarantees
• Large (and growing!) community
Outlook: Introducing the BBDC
http://bbdc.berlin
BBDC Technology (10,000-foot view)
http://flink-forward.org
Thank you
If you find this exciting, get involved on Flink’s mailing list
or stay tuned by subscribing to news@flink.apache.org,
following flink.apache.org/blog, and @ApacheFlink on Twitter.

LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
