LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis

Technische Universität Berlin
DIMA – Databases and Information Management Group
The Apache Flink Platform
for Parallel Batch and Stream Analysis
Jonas Traub | Tilmann Rabl | Fabian Hueske | Till Rohrmann | Volker Markl

In this talk
▪ Apache Flink Primer
  • Architecture
  • Execution Engine
  • API Examples
▪ Stream Processing with Apache Flink
  • Micro Batching vs. Native Streaming
  • Flexible Windows/Stream Discretization
  • Fault Tolerance with distributed snapshotting
▪ Conclusion
Technische Universität Berlin - The Apache Flink Platform for Parallel Batch and Stream Analysis - FGDB 2015

What is Flink?
A platform for distributed
batch and streaming analytics
Streaming dataflow runtime

Flink in the Analytics Ecosystem
▪ Applications: Hive, Mahout, Cascading, Pig, Crunch, Giraph, …
▪ Data processing engines: MapReduce, Flink, Spark, Storm, Tez
▪ App and resource management: Yarn, Mesos
▪ Storage, streams: HDFS, HBase, Kafka

What can I do with it?
Flink is an engine that can natively support all these workloads:
▪ Stream processing
▪ Batch processing
▪ Machine Learning at scale
▪ Graph Analysis

Sneak peek: Two of Flink’s APIs
case class Word (word: String, frequency: Int)
DataStream API (streaming):
val lines: DataStream[String] = env.fromSocketStream(...)
lines.flatMap {line => line.split(" ")
    .map(word => Word(word,1))}
  .keyBy("word")
  .window(Time.of(5,SECONDS)).every(Time.of(1,SECONDS))
  .sum("frequency")
  .print()
DataSet API (batch):
val lines: DataSet[String] = env.readTextFile(...)
lines.flatMap {line => line.split(" ")
    .map(word => Word(word,1))}
  .groupBy("word").sum("frequency")
  .print()

Execution Model
▪ Flink program = DAG (directed acyclic graph) of operators and intermediate results
▪ Operator = computation + state
▪ Intermediate result = logical stream of records
[Figure: example dataflow with map, join, and sum operators]

Architecture
▪ Pipelined/Streaming engine
  • Complete DAG deployed
[Figure: a Job Manager and four workers (Worker 1 to Worker 4)]

Ingredients of a Streaming System
▪ Pipelined Execution Engine
▪ Streaming Windows/Discretization
▪ Fault Tolerance
▪ High Level Programming API (or language)

Micro Batching vs Native Streaming
Discretized Streams (D-Streams):
[Figure: a stream discretizer splits the stream into a sequence of batch jobs]
while (true) {
  // get next few records
  // issue batch computation
}
Native streaming:
[Figure: long-standing operators]
while (true) {
  // process next record
}
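The two loops above can be contrasted with a small, self-contained sketch. It is not Flink or Spark code: the class and method names are hypothetical, and the "stream" is just an in-memory list. It computes a running sum and records when each result becomes visible, which is the practical difference between the two models.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, no streaming framework involved.
class MicroBatchVsNative {

    // Micro-batching: take the next `batchSize` records, then issue one batch computation.
    // A result only becomes visible when the batch job finishes.
    static List<Long> microBatch(List<Long> stream, int batchSize) {
        List<Long> emitted = new ArrayList<>();
        long sum = 0;
        for (int start = 0; start < stream.size(); start += batchSize) {
            int end = Math.min(start + batchSize, stream.size());
            for (long record : stream.subList(start, end)) {
                sum += record;
            }
            emitted.add(sum);
        }
        return emitted;
    }

    // Native streaming: a long-standing operator updates its state record by record.
    // A result becomes visible after every record.
    static List<Long> nativeStreaming(List<Long> stream) {
        List<Long> emitted = new ArrayList<>();
        long sum = 0;
        for (long record : stream) {
            sum += record;
            emitted.add(sum);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Long> stream = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        System.out.println(microBatch(stream, 2));   // [3, 10, 15]
        System.out.println(nativeStreaming(stream)); // [1, 3, 6, 10, 15]
    }
}
```

Both variants end at the same total; the micro-batch variant simply exposes fewer intermediate results, each delayed until its batch completes.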

Stream Discretization
▪ Data is unbounded
  • Interested in a (recent) part of it, e.g., the last 10 days
▪ Most common window types: time and count
  • Mostly in sliding, fixed, and tumbling form
▪ Need for data-driven window definitions
  • e.g., user sessions (periods of user activity followed by inactivity), price changes, etc.
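As one illustration of a data-driven window, user sessions can be cut wherever the gap between two consecutive events exceeds a threshold. The sketch below is a plain Java assumption of how that could look; `SessionWindows`, `sessions`, and `maxGapMs` are hypothetical names, and the input is assumed to be sorted by timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not Flink API.
class SessionWindows {

    // Splits sorted event timestamps (in ms) into sessions. A new session starts
    // whenever the gap to the previous event is larger than maxGapMs.
    static List<List<Long>> sessions(List<Long> timestamps, long maxGapMs) {
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : timestamps) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > maxGapMs) {
                result.add(current);          // inactivity detected: close the session
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            result.add(current);              // close the last open session
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> events = Arrays.asList(0L, 100L, 250L, 5000L, 5100L, 12000L);
        // Prints [[0, 100, 250], [5000, 5100], [12000]]
        System.out.println(sessions(events, 1000L));
    }
}
```

The window boundaries here depend only on the data, not on a fixed time or count, which is why such windows cannot be expressed as plain tumbling or sliding windows.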
Great read: "The world beyond batch: Streaming 101", Tyler Akidau
https://beta.oreilly.com/ideas/the-world-beyond-batch-streaming-101

Flink’s Discretization
▪ Allows very flexible windowing
▪ Borrows ideas from, and extends, IBM’s SPL
  • SLIDE = Trigger = When to emit a window
  • RANGE = Eviction = What the window contains
▪ Allows for lots of optimization
  • Not part of this talk

The Discretizer Operator
▪ Streams are represented as FIFO queues of data-items
▪ The window operator keeps a FIFO buffer
▪ After some time, data-items expire (they are deleted)

The window operator is event-driven by data-item arrivals
1.) Trigger Policies (TPs): specify when to emit the current buffer content as a window.
2.) Eviction Policies (EPs): specify when data-items are removed from the buffer.
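A minimal sketch of how the two policy types could interact inside an event-driven window operator is given below. All interfaces and classes are hypothetical stand-ins written for this illustration, not Flink's actual API; the ordering (evict, buffer, then check the trigger) is also an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative only: these interfaces and classes are hypothetical, not Flink API.
class WindowSketch {

    // Trigger policy: decides, per arriving item, whether to emit the buffer as a window.
    interface TriggerPolicy<T> { boolean shouldEmit(T item); }

    // Eviction policy: decides how many of the oldest items to drop before buffering the new one.
    interface EvictionPolicy<T> { int numToEvict(T item, int bufferSize); }

    static class WindowOperator<T> {
        private final Deque<T> buffer = new ArrayDeque<>(); // FIFO buffer
        private final TriggerPolicy<T> trigger;
        private final EvictionPolicy<T> eviction;
        final List<List<T>> emitted = new ArrayList<>();

        WindowOperator(TriggerPolicy<T> trigger, EvictionPolicy<T> eviction) {
            this.trigger = trigger;
            this.eviction = eviction;
        }

        // Event-driven: called once per data-item arrival.
        void onArrival(T item) {
            int evict = eviction.numToEvict(item, buffer.size());
            for (int i = 0; i < evict && !buffer.isEmpty(); i++) {
                buffer.pollFirst(); // expire the oldest items
            }
            buffer.addLast(item);
            if (trigger.shouldEmit(item)) {
                emitted.add(new ArrayList<>(buffer)); // emit a copy of the buffer as a window
            }
        }
    }

    // Count-based window: emit every `slide` items, keep at most `range` items.
    static List<List<Integer>> slidingCountWindows(List<Integer> input, int slide, int range) {
        int[] seen = {0};
        TriggerPolicy<Integer> everySlide = item -> ++seen[0] % slide == 0;
        EvictionPolicy<Integer> keepRange = (item, size) -> size >= range ? size - range + 1 : 0;
        WindowOperator<Integer> op = new WindowOperator<>(everySlide, keepRange);
        for (Integer item : input) {
            op.onArrival(item);
        }
        return op.emitted;
    }

    public static void main(String[] args) {
        // Prints [[1, 2], [2, 3, 4], [4, 5, 6]]
        System.out.println(slidingCountWindows(Arrays.asList(1, 2, 3, 4, 5, 6), 2, 3));
    }
}
```

Setting `slide` equal to `range` in this sketch yields tumbling windows: `slidingCountWindows(Arrays.asList(1, 2, 3, 4, 5, 6), 3, 3)` returns `[[1, 2, 3], [4, 5, 6]]`.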

Query Example (window of size 3):
dataStream.window(Count.of(3))

Flexible Windowing
▪ Windows can be any combination of (multiple) triggers & evictions
  • Arbitrary tumbling, sliding, session, etc. windows can be constructed.
▪ Common triggers/evictions part of the API
  • Time, Count & Delta.
▪ Even more flexibility: define your own UDF trigger/eviction
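To make the idea of a user-defined, delta-style trigger concrete, here is a small sketch under stated assumptions. It is plain Java, not the Flink API: `DeltaTriggerSketch` and `deltaWindows` are hypothetical names, the delta function is supplied by the caller, and the choice to clear the buffer after each emitted window (tumbling behaviour) is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.DoubleBinaryOperator;

// Illustrative only: hypothetical names, not Flink API.
class DeltaTriggerSketch {

    // Emits a window whenever delta(reference, item) exceeds the threshold.
    // The reference is the first item seen, then the item that caused the last emission.
    static List<List<Double>> deltaWindows(List<Double> items,
                                           DoubleBinaryOperator delta,
                                           double threshold) {
        List<List<Double>> windows = new ArrayList<>();
        List<Double> buffer = new ArrayList<>();
        boolean hasReference = false;
        double reference = 0.0;
        for (double item : items) {
            if (!hasReference) {
                reference = item;
                hasReference = true;
            }
            buffer.add(item);
            if (delta.applyAsDouble(reference, item) > threshold) {
                windows.add(buffer);          // trigger: emit the buffered items
                buffer = new ArrayList<>();   // eviction: drop everything (assumed tumbling)
                reference = item;
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Double> prices = Arrays.asList(100.0, 101.0, 99.0, 106.0, 107.0, 100.0);
        // User-defined delta: absolute price change since the last emitted window.
        // Prints [[100.0, 101.0, 99.0, 106.0], [107.0, 100.0]]
        System.out.println(deltaWindows(prices, (ref, cur) -> Math.abs(cur - ref), 5.0));
    }
}
```

Swapping the lambda for another distance function changes what counts as a "significant change" without touching the windowing loop, which is the point of pluggable policies.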

Fault Tolerance and Operator State

Comparing Fault Tolerance Solutions
▪ Distributed snapshots (Flink)
  • Based on consistent global snapshots
  • Algorithm inspired by Chandy-Lamport
  • Low runtime overhead
  • Stateful exactly-once semantics
▪ Message tracking/acks (at-least-once guarantee)
▪ RDD re-computation

Example: A Stateful Map (counter)
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.checkpoint.Checkpointed;
public class Counter implements MapFunction<Long, Long>, Checkpointed<Long> {
  // persistent counter
  private long counter = 0;
  @Override
  public Long map(Long value) {
    return ++counter;
  }
  // regularly persists state during normal operation
  @Override
  public Long snapshotState(long checkpointId, long checkpointTimestamp) {
    return counter;
  }
  // restores state on recovery from failure
  @Override
  public void restoreState(Long state) {
    counter = state;
  }
}
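How snapshot and restore combine into exactly-once state can be simulated without a cluster. The sketch below is a simplified assumption: the `Checkpointed` interface is a local stand-in that only mirrors the two methods on the slide, and the "source" is an array that can be replayed from a stored offset (the repeatable-source assumption).

```java
import java.io.Serializable;

// Illustrative only: a local stand-in for the checkpointing contract, not the Flink runtime.
class CheckpointedCounterSketch {

    // Hypothetical interface mirroring the slide's snapshotState/restoreState methods.
    interface Checkpointed<T extends Serializable> {
        T snapshotState(long checkpointId, long checkpointTimestamp);
        void restoreState(T state);
    }

    static class Counter implements Checkpointed<Long> {
        private long counter = 0;

        Long map(Long value) {
            return ++counter;
        }

        @Override
        public Long snapshotState(long checkpointId, long checkpointTimestamp) {
            return counter;
        }

        @Override
        public void restoreState(Long state) {
            counter = state;
        }
    }

    // Processes the source, snapshots after `snapshotAfter` records, "crashes" after
    // `failAfter` records, then recovers from the snapshot and replays the rest.
    static long runWithFailure(long[] source, int snapshotAfter, int failAfter) {
        Counter op = new Counter();
        Long snapshot = 0L;
        int snapshotOffset = 0;
        for (int i = 0; i < failAfter; i++) {
            op.map(source[i]);
            if (i + 1 == snapshotAfter) {
                snapshot = op.snapshotState(1L, 0L);
                snapshotOffset = i + 1; // source position that belongs to the snapshot
            }
        }
        // Failure: the operator instance and its in-memory state are lost.
        Counter recovered = new Counter();
        recovered.restoreState(snapshot);
        long last = snapshot;
        for (int i = snapshotOffset; i < source.length; i++) {
            last = recovered.map(source[i]); // replay from the snapshot position
        }
        return last;
    }

    public static void main(String[] args) {
        long[] source = {5, 6, 7, 8, 9, 10, 11, 12, 13, 14};
        // Ten records in total: the final count is 10 despite the failure.
        System.out.println(runWithFailure(source, 4, 7)); // 10
    }
}
```

Records 5 to 7 are processed twice in this run, yet the counter ends at 10 because their first effect on the state is discarded together with the failed operator.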

Distributed Snapshots
[Figure: timeline t1, t2, t3 with snapshots taken at t1 and t2; after a failure, execution is reset from snapshot t2]
Assumptions
• repeatable sources
• reliable FIFO channels
Taking Snapshots
[Figure: timeline t1, t2, t3 with snapshots taken at t1 and t2; after a failure, execution is reset from snapshot t2]
Initial approach (e.g., Naiad)
• Pause execution on t1, t2, ...
• Collect state
• Resume execution

Asynchronous Snapshots in Flink
Push checkpoint barriers through the data flow
• Before barrier → part of the snapshot
• After barrier → not in snapshot (backup till next snapshot)
[Figure: a barrier flowing through the data stream, with operators labeled "checkpoint starting", "checkpoint in progress", and "checkpoint done"]
[Carbone et al. 2015] “Lightweight Asynchronous Snapshots for Distributed Dataflows”, Tech. Report.
http://arxiv.org/abs/1506.08603
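The barrier idea can be sketched for a single operator with several input channels. The code below is a simplified assumption in the spirit of the cited report, not Flink's implementation: a channel that has delivered its barrier is blocked and its later records are buffered; once every channel has delivered the barrier, the operator snapshots its state and then works off the backlog. The names `BarrierAlignmentSketch`, `onEvent`, and the `BARRIER` marker value are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Illustrative only: a single summing operator with barrier alignment, not Flink code.
class BarrierAlignmentSketch {
    static final long BARRIER = Long.MIN_VALUE; // hypothetical in-band barrier marker

    private final boolean[] blocked;                       // channels that already delivered the barrier
    private final Queue<Long> backlog = new ArrayDeque<>(); // post-barrier records held back
    private long sum = 0;                                  // operator state
    final List<Long> snapshots = new ArrayList<>();

    BarrierAlignmentSketch(int numChannels) {
        blocked = new boolean[numChannels];
    }

    void onEvent(int channel, long event) {
        if (event == BARRIER) {
            blocked[channel] = true;
            if (allBlocked()) {
                snapshots.add(sum);          // state reflects exactly the pre-barrier records
                Arrays.fill(blocked, false); // unblock all channels
                while (!backlog.isEmpty()) {
                    sum += backlog.poll();   // now apply the held-back records
                }
            }
        } else if (blocked[channel]) {
            backlog.add(event);              // after the barrier: not part of this snapshot
        } else {
            sum += event;                    // before the barrier: part of this snapshot
        }
    }

    long state() {
        return sum;
    }

    private boolean allBlocked() {
        for (boolean b : blocked) {
            if (!b) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BarrierAlignmentSketch op = new BarrierAlignmentSketch(2);
        op.onEvent(0, 1);
        op.onEvent(1, 10);
        op.onEvent(0, BARRIER);
        op.onEvent(0, 2);   // held back: arrived after the barrier on channel 0
        op.onEvent(1, 20);  // still before the barrier on channel 1
        op.onEvent(1, BARRIER);
        op.onEvent(0, 3);
        System.out.println(op.snapshots); // [31]
        System.out.println(op.state());   // 36
    }
}
```

Without the blocking step, the record `2` would have been added before the snapshot was taken, and the snapshot would mix pre-barrier and post-barrier data.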

Community
Flink started as the Stratosphere project in 2009, led by TU Berlin.
It entered incubation in April 2014 and graduated in December 2014.
It is now one of the most active big data projects after over a year in the Apache Software Foundation.

tl;dr: what was this about?
• The Berlin Big Data Center
• Native Streaming with Apache Flink
• Flexible Windowing
• Fault Tolerance with exactly once guarantees
• Large (and growing!) community

Outlook: Introducing the BBDC
http://bbdc.berlin

BBDC Technology (10,000-foot view)

http://flink-forward.org

Thank you
If you find this exciting, get involved on Flink’s mailing list or stay tuned by subscribing to news@flink.apache.org, following flink.apache.org/blog, and following @ApacheFlink on Twitter.
