Adaptive Flow Monitoring
& Selective DPI for ONOS
ONS2016
2016. 3. 17
Taesang Choi
ETRI
Topics
OPEN-TAM ONOS Subproject Overview
Development Phase 1
Adaptive(Effective) Flow Monitoring
Development Phase 2
Selective-DPI (Deep Packet Inspection)
Demo & Use-Cases
OPEN-TAM Subproject Overview
ONOS Architecture Review and OPEN-TAM Architecture Design
ONOS wiki page : beginner, user, developer, architecture guide
ONOS Sub Component : analysis of the ONOS architecture and
associated source code
OPEN-TAM Architecture Design
OPEN-TAM Subproject Setup
Kick-off Conference Call with ONOS TSR
Proposal of a Subproject: OPEN-TAM
Creation of OPEN-TAM Future Project Wiki Page
Development and Release Plans
Started with Blackbird 1.1.0 release
Phase 1 result (AFM) was incorporated in the Emu release
Phase 2 will be in the Goldeneye release
Phase 1: Adaptive Flow Monitoring
Adaptive Flow Monitoring Motivation
Default ONOS Flow Monitoring Issues
The default FlowRule service collects all flow information from all devices at every
time interval (default 10 seconds)
In a large-scale carrier network, this can degrade performance at each collection
time because of the number of switches and their associated flows (for example,
WAN: ~500 routers, ~10K ports, ~1-10M flows per port)
A simple way to reduce the overhead is to set a larger collection interval, but
this causes another critical issue: loss of accuracy
Our proposal is an effective flow monitoring scheme, the Adaptive Flow Monitoring
Service, which can minimize collection computing overhead and provide more
accurate flow statistics
Adaptive Flow Monitoring Algorithm
1. Initialize variables and set up tasks() {
  CAL_AND_SHORT_POLL_INTERVAL = 5, MID_POLL_INTERVAL = 10,
  LONG_POLL_INTERVAL = 15, ENTIRE_POLL_INTERVAL = 30
  Schedule each task to run at its corresponding time interval }
2. CAL_AND_SHORT_FLOWS_TASK() {
  If this is the first call or an ENTIRE_POLL_INTERVAL tick, send a FlowStatsRequest
  message for all flow entries
  Else, on every call, calculate the FlowLiveType of each flow and save it in the
  appropriate table, then send a FlowStatsRequest message only for SHORT_FLOW entries }
3. MID_FLOWS_TASK() {
  On every call that is not an ENTIRE_POLL_INTERVAL tick, send a FlowStatsRequest
  message only for MID_FLOW entries }
4. LONG_FLOWS_TASK() {
  On every call that is not an ENTIRE_POLL_INTERVAL tick, send a FlowStatsRequest
  message only for LONG_FLOW entries }
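The task rules above can be sketched as a small function that reports which tasks fire at a given second. This is only an illustration: the class name AfmPollSchedule, the method tasksAt, and the single-letter labels (E = entire poll, C = calculate and short poll, M = mid poll, L = long poll) are assumptions made for this sketch, not ONOS code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not ONOS code): which AFM tasks fire at a given second. */
final class AfmPollSchedule {
    static final int CAL_AND_SHORT_POLL_INTERVAL = 5;
    static final int MID_POLL_INTERVAL = 10;
    static final int LONG_POLL_INTERVAL = 15;
    static final int ENTIRE_POLL_INTERVAL = 30;

    /** Returns the task labels for one tick: E, or C optionally followed by M and/or L. */
    static List<String> tasksAt(int second) {
        List<String> tasks = new ArrayList<>();
        if (second % CAL_AND_SHORT_POLL_INTERVAL != 0) {
            return tasks; // nothing is scheduled between 5-second ticks
        }
        if (second % ENTIRE_POLL_INTERVAL == 0) {
            tasks.add("E"); // the full poll replaces the per-group polls
            return tasks;
        }
        tasks.add("C"); // recalculate flow live types and poll SHORT_FLOWs
        if (second % MID_POLL_INTERVAL == 0) {
            tasks.add("M");
        }
        if (second % LONG_POLL_INTERVAL == 0) {
            tasks.add("L");
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Print the schedule for the first 40 seconds.
        for (int t = 0; t <= 40; t += 5) {
            System.out.println(t + " " + tasksAt(t));
        }
    }
}
```

With the default intervals, the per-group polls are skipped whenever a tick coincides with ENTIRE_POLL_INTERVAL, since the full poll already covers every group.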
AFM Flow Polling Example
[Figure: polling timeline in seconds for the Immediate Flow, Short Flow, Mid Flow, and Long Flow groups. E = ENTIRE_POLL, C = CAL_AND_SHORT_POLL, M = MID_POLL, L = LONG_POLL]
0 (E), 5 (C), 10 (C, M), 15 (C, L), 20 (C, M), 25 (C), 30 (E), 35 (C), 40 (C, M)
CAL_AND_SHORT_POLL_INTERVAL = 5
MID_POLL_INTERVAL = 10 (5*2)
LONG_POLL_INTERVAL = 15 (5*3)
ENTIRE_POLL_INTERVAL = 30 (5*3*2)
Simple polling count: 44 (29+15)
Adaptive polling count: 29
About 51.7% reduction (15/29; measured against the simple polling count, 15/44 is about 34.1%)
FlowStatisticManager Subsystem Design
[Figure: subsystem component diagram. Recoverable labels:]
GetFlowStatistics CLI Component: FlowStatisticService.loadSummary(device) (query & command)
FlowStatisticManager Component: FlowStatisticService, AdminService, ProviderService, DistributedFlowStatisticStore
FlowRuleManager Component: NewDistributedFlowRuleStore, ProviderRegistry, FlowRuleListener (Notify FlowRule Add/Update/Remove)
OpenFlowRuleProvider Component: command, sensing, register & unregister Provider
Protocols
Adaptive Flow Monitoring Design Strategy
Redesigned FlowStatsCollector as NewAdaptiveFlowStatsCollector, used in
OpenFlowRuleProvider for each corresponding switch
Extended StoredFlowEntry into StoredFlowEntryWithType
Divides FlowEntries into four groups: IMMEDIATE_FLOW, SHORT_FLOW,
MID_FLOW, and LONG_FLOW
Uses four time intervals: SHORT_POLL_INTERVAL, MID_POLL_INTERVAL,
LONG_POLL_INTERVAL, and ENTIRE_POLL_INTERVAL
At every SHORT_POLL_INTERVAL, assigns all flows to the appropriate flow
groups and sends a FlowStatsRequest message for all SHORT_FLOWs
At every MID_POLL_INTERVAL, sends a FlowStatsRequest message for all
MID_FLOWs
At every LONG_POLL_INTERVAL, sends a FlowStatsRequest message for
all LONG_FLOWs
Finally, at every ENTIRE_POLL_INTERVAL, sends a FlowStatsRequest message
for all flow entries in the switch
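The slide does not say how a flow is assigned to a group. One plausible rule, shown here purely as an assumption, is to compare the flow's age (seconds since it was installed) with the poll intervals. The class FlowLiveTypeClassifier and its thresholds are hypothetical and are not taken from ONOS.

```java
/** Hypothetical sketch (not ONOS code): pick a polling group from a flow's age. */
final class FlowLiveTypeClassifier {
    enum FlowLiveType { IMMEDIATE_FLOW, SHORT_FLOW, MID_FLOW, LONG_FLOW, UNKNOWN_FLOW }

    private final long shortPollInterval;
    private final long midPollInterval;
    private final long longPollInterval;

    FlowLiveTypeClassifier(long shortPollInterval, long midPollInterval, long longPollInterval) {
        this.shortPollInterval = shortPollInterval;
        this.midPollInterval = midPollInterval;
        this.longPollInterval = longPollInterval;
    }

    /** Assumed rule: the longer a flow has lived, the less often it is polled. */
    FlowLiveType classify(long lifeSeconds) {
        if (lifeSeconds < 0) {
            return FlowLiveType.UNKNOWN_FLOW; // age not known yet
        }
        if (lifeSeconds < shortPollInterval) {
            return FlowLiveType.IMMEDIATE_FLOW;
        }
        if (lifeSeconds < midPollInterval) {
            return FlowLiveType.SHORT_FLOW;
        }
        if (lifeSeconds < longPollInterval) {
            return FlowLiveType.MID_FLOW;
        }
        return FlowLiveType.LONG_FLOW;
    }
}
```

For example, with intervals of 5, 10, and 15 seconds, a flow that has lived 7 seconds would fall into SHORT_FLOW under this assumed rule.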
NewAdaptiveFlowStatsCollector
I(nterface) TypedStoredFlowEntry (extends StoredFlowEntry)
Member
public static enum FlowLiveType { IMMEDIATE_FLOW, SHORT_FLOW,
MID_FLOW, LONG_FLOW, UNKNOWN_FLOW }
Method
public FlowLiveType flowLiveType();
public void setFlowLiveType(FlowLiveType liveType);
C(lass) DefaultTypedFlowEntry (extends DefaultFlowEntry, Impl TypedStoredFlowEntry)
Member
private FlowLiveType liveType;
Method
public DefaultTypedFlowEntry(FlowRule rule, FlowLiveType liveType);
public DefaultTypedFlowEntry(FlowEntry fe, FlowLiveType liveType);
@Override
public FlowLiveType flowLiveType();
@Override
public void setFlowLiveType(FlowLiveType liveType);
NewAdaptiveFlowStatsCollector
C(lass) NewAdaptiveFlowStatsCollector
Member
private final Logger log = getLogger(getClass());
private final OpenFlowSwitch sw;
private ScheduledExecutorService adaptiveFlowStatsScheduler =
newScheduledThreadPool(4, groupedThreads("onos/flow", "device-statscollector-%d"));
private CalAndShortFlowsTask calAndShortFlowsTask;
private MidFlowsTask midFlowsTask;
private LongFlowsTask longFlowsTask;
private int calAndShortPollInterval = 5;
private int midPollInterval = 10;
private int longPollInterval = 15;
private int entirePollInterval = 30;
private InternalDeviceFlowTable deviceFlowTable = new
InternalDeviceFlowTable();
Method
NewAdaptiveFlowStatsCollector(OpenFlowSwitch, int);
synchronized void adjustCalPollInterval(int pollInterval);
private class CalAndShortFlowsTask implements Runnable { }
private class MidFlowsTask implements Runnable { }
private class LongFlowsTask implements Runnable { }
public synchronized void start();
public synchronized void stop();
public boolean addWithFlowRule(FlowRule flowRules);
public boolean addOrUpdateFlows(FlowEntry flowRules);
public boolean removeFlows(FlowEntry flowRules);
public boolean pushFlowMetrics(List<FlowEntry> flowEntries);
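The start() and stop() methods listed above could wire the three tasks to one scheduler roughly as follows. This is a minimal sketch using only java.util.concurrent: the class AdaptivePollerSketch, its constructor, and the helper methods are assumptions for illustration, and the real collector additionally talks to an OpenFlowSwitch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch (not ONOS code): three periodic tasks on one scheduler. */
final class AdaptivePollerSketch {
    // Daemon threads so a forgotten stop() does not keep the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newScheduledThreadPool(4, r -> {
                Thread t = new Thread(r, "afm-poller-sketch");
                t.setDaemon(true);
                return t;
            });
    private final List<ScheduledFuture<?>> handles = new ArrayList<>();
    private final Runnable calAndShortFlowsTask;
    private final Runnable midFlowsTask;
    private final Runnable longFlowsTask;
    private final int calAndShortPollInterval = 5;
    private final int midPollInterval = 10;
    private final int longPollInterval = 15;

    AdaptivePollerSketch(Runnable calAndShort, Runnable mid, Runnable lng) {
        this.calAndShortFlowsTask = calAndShort;
        this.midFlowsTask = mid;
        this.longFlowsTask = lng;
    }

    /** Schedules the tasks; the first task runs at once, the others after one period. */
    synchronized void start() {
        if (!handles.isEmpty()) {
            return; // already started
        }
        handles.add(schedule(calAndShortFlowsTask, 0, calAndShortPollInterval));
        handles.add(schedule(midFlowsTask, midPollInterval, midPollInterval));
        handles.add(schedule(longFlowsTask, longPollInterval, longPollInterval));
    }

    private ScheduledFuture<?> schedule(Runnable task, long initialDelay, long period) {
        return scheduler.scheduleAtFixedRate(task, initialDelay, period, TimeUnit.SECONDS);
    }

    /** Cancels all tasks and shuts the scheduler down; the sketch is single-use. */
    synchronized void stop() {
        for (ScheduledFuture<?> handle : handles) {
            handle.cancel(false);
        }
        handles.clear();
        scheduler.shutdownNow();
    }

    synchronized int scheduledTaskCount() {
        return handles.size();
    }

    boolean isStopped() {
        return scheduler.isShutdown();
    }
}
```

The entire poll is not a separate task here; as in the algorithm slide, it would be handled inside the calculate-and-short task when the tick matches ENTIRE_POLL_INTERVAL.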
OpenFlowRuleProvider
Added NewAdaptiveFlowStatsCollector method call statements
at every flow entry ADD, REMOVED, and UPDATED
Internal C(lass) InternalDeviceFlowTable
Member
private final Map<FlowId, Set<StoredFlowEntryWithType>>
deviceFlowEntries = Maps.newConcurrentMap();
private final Set<StoredFlowEntry> shortFlows = new HashSet<>();
private final Set<StoredFlowEntry> midFlows = new HashSet<>();
private final Set<StoredFlowEntry> longFlows = new HashSet<>();
Method
public getFlowCount();
public TypedStoredFlowEntry getFlowEntry(FlowRule rule);
public Set<StoredFlowEntry> getFlowEntries();
public void add(TypedStoredFlowEntry rule) ;
public void addWithCalAndSetFlowLiveType(TypedStoredFlowEntry rule);
{ getFlowEntriesInternal(rule.id()).add((StoredFlowEntry) rule); }
public boolean remove(FlowEntry rule);
public void checkAndMoveLiveFlowAll();
FlowStatisticManager Class Hierarchy
[Figure: class diagram; C = C(lass), I = I(nterface). Recoverable content:]
C GetFlowStatistics (extends C AbstractShellCommand)
Method: execute() { flowStatsService.load(device, inLiveType, inInstructionType, topn); }
C FlowStatisticManager (I FlowStatisticService)
C InternalFlowRuleStatsListener (I FlowRuleListener): Add/Update/Remove flowRule
C FlowRuleManager: Post FlowRuleEvent Add/Update/Remove
C DistributedFlowStatisticStore (I FlowStatisticStore)
Method: getCurrentFlowStatistic(cp); getPreviousFlowStatistic(cp);
TypedStatistics
FlowStatisticManager Class Design
C(lass) GetFlowStatistics (extends C(lass) AbstractShellCommand)
Method
execute() {
statisticFlowService.loadSummary(device, port);
statisticFlowService.loadAllByType(device, port, flow_type, inst_type);
statisticFlowService.loadTopnByType(device, port, flow_type, inst_type, topn); }
C(lass) TypedFlowEntryWithLoad
Member
ConnectPoint cp;
TypedFlowEntry typedFlowEntry;
Load load;
C(lass) FlowStatisticManager (I(nterface) FlowStatisticService)
Method
@Override
Map<ConnectPoint, SummaryFlowEntryWithLoad> loadSummary(Device device) { }
SummaryFlowEntryWithLoad loadSummary(Device device, PortNumber pNumber) { }
Map<ConnectPoint, List<TypedFlowEntryWithLoad>> loadAllByType(Device device,
FlowLiveType liveType, InstructionType instType) { }
List<TypedFlowEntryWithLoad> loadAllByType(Device device, PortNumber pNumber,
FlowLiveType liveType, InstructionType instType) { }
Map<ConnectPoint, List<TypedFlowEntryWithLoad>> loadTopnByType(Device device,
FlowLiveType liveType, InstructionType instType, int topn) { }
List<TypedFlowEntryWithLoad> loadTopnByType(Device device, PortNumber pNumber,
FlowLiveType liveType, InstructionType instType, int topn) { }
C(lass) SummaryFlowEntryWithLoad
Member
ConnectPoint cp;
Load totalLoad;
Load immediateLoad;
Load shortLoad;
Load midLoad;
Load longLoad;
C(lass) DistributedFlowStatisticStore (I(nterface) FlowStatisticStore)
Member
private Map<ConnectPoint, Set<FlowEntry>> previous = new
ConcurrentHashMap<>();
private Map<ConnectPoint, Set<FlowEntry>> current = new
ConcurrentHashMap<>();
Method
void removeFlowStatistic(FlowRule rule) { }
void addFlowStatistic(FlowEntry rule) { }
void updateFlowStatistic(FlowEntry rule) { }
Set<FlowEntry> getCurrentFlowStatistic(ConnectPoint connectPoint) { }
Set<FlowEntry> getPreviousFlowStatistic(ConnectPoint connectPoint) { }
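Keeping both a previous and a current sample makes it possible to derive a rate from two readings. The sketch below shows that idea only. It is simplified on purpose: it is keyed by a plain string instead of a connect point with a set of FlowEntry objects, and the class FlowStatSampleStore, the Sample type, and the rate method are assumptions for illustration rather than the store's actual API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch (not ONOS code): previous/current samples and a derived rate. */
final class FlowStatSampleStore {
    /** One reading of a flow's byte counter. */
    static final class Sample {
        final long bytes;
        final long timeSeconds;

        Sample(long bytes, long timeSeconds) {
            this.bytes = bytes;
            this.timeSeconds = timeSeconds;
        }
    }

    private final Map<String, Sample> previous = new ConcurrentHashMap<>();
    private final Map<String, Sample> current = new ConcurrentHashMap<>();

    /** Stores a new reading; the reading it replaces becomes the previous one. */
    void update(String flowKey, long bytes, long timeSeconds) {
        Sample old = current.put(flowKey, new Sample(bytes, timeSeconds));
        if (old != null) {
            previous.put(flowKey, old);
        }
    }

    /** Forgets both readings of a removed flow. */
    void remove(String flowKey) {
        previous.remove(flowKey);
        current.remove(flowKey);
    }

    /** Bytes per second between the two latest readings, or -1 if not computable yet. */
    long rate(String flowKey) {
        Sample cur = current.get(flowKey);
        Sample prev = previous.get(flowKey);
        if (cur == null || prev == null || cur.timeSeconds <= prev.timeSeconds) {
            return -1;
        }
        return (cur.bytes - prev.bytes) / (cur.timeSeconds - prev.timeSeconds);
    }
}
```

A flow needs two updates before a rate is available, which is one reason a newly installed flow cannot be ranked by load on its first poll.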
Advanced Adaptive Flow Sampling
To reduce performance overhead further:
At every polling interval, instead of getting the flow stats of all
corresponding flow entries (SHORT, MID, and LONG_FLOWs),
apply one of the following sampling methods to each flow table
group, especially SHORT or MID groups with a large number of
flow entries
No sampling (get all, same as the current method)
Random sampling (one flow entry at every count (default=100) entries)
Top-n sampling (only the top-n (default=1000) flows by bytes-per-second
rate)
Probabilistic sampling (each flow entry is selected with a given probability
(default=1/2))
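The three sampling methods can be sketched as plain list operations. This is an illustration under assumptions: the class FlowSampling and its method names are hypothetical, "random sampling at every count" is interpreted here as keeping every count-th entry, and flows are represented only by their bytes-per-second rate in the top-n case.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch (not ONOS code) of the three sampling methods. */
final class FlowSampling {
    /** Keeps every count-th entry, starting with the first one. */
    static <T> List<T> everyNth(List<T> flows, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        List<T> sampled = new ArrayList<>();
        for (int i = 0; i < flows.size(); i += count) {
            sampled.add(flows.get(i));
        }
        return sampled;
    }

    /** Keeps the n highest bytes-per-second rates, highest first. */
    static List<Long> topN(List<Long> bytesPerSecond, int n) {
        List<Long> sorted = new ArrayList<>(bytesPerSecond);
        sorted.sort(Comparator.reverseOrder());
        int limit = Math.min(Math.max(n, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }

    /** Keeps each entry independently with the given probability. */
    static <T> List<T> probabilistic(List<T> flows, double probability, Random rng) {
        List<T> sampled = new ArrayList<>();
        for (T flow : flows) {
            if (rng.nextDouble() < probability) {
                sampled.add(flow);
            }
        }
        return sampled;
    }
}
```

Passing the Random in from the caller keeps the probabilistic variant reproducible when a fixed seed is used.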
Phase 2: Selective Deep Packet Inspection
Selective-DPI Motivation
Current Problem
Current ONOS flows can be classified and selected only by lower-level FlowSelection
criteria based on FlowRule entries (e.g., ports, ether_type, vlan_id, 5-tuple, etc.)
There is no application classification service for ONOS data plane user data
We propose to add a Selective DPI service that can filter data plane user data
from controller traffic and classify it with application-level granularity by
using open-source DPI software
Selective DPI: OpenDPIManager Architecture - On-ONOS Platform
[Figure: component diagram. Recoverable labels:]
OpenDPIApp Component: DPIServiceManager, RawPackDumpAdminService, RawPacketDumpService, RuleStore
RawPacketDumpManager Component: DPI Engine (x3), RawPacketDumpService (Flow), RawPacketDumper (x3)
Neighbor OpenDPIApps: RawPackDumpAdminService, RawPacketDumpService, DPI Engine (x3), RawPacketDumpService (Flow), RawPacketDumper (x3)
RawPacketDumpProvider Component: Capture Interface (x2)
Selective DPI: OpenDPIManager Architecture - off-ONOS Platform
[Figure: component diagram. Recoverable labels:]
OpenDPIApp Component: DPIServiceManager
RawPacketDumpManager Component
External OpenDPIEngine: DPIServiceAgent, DPI Engine (x3), Stores, Packet Dumper (x3)
RawPacketDumpProvider Component: Capture Interface
ntop NDPI-1.7
Intel DPDK 2.2.0
Selective-DPI: Scalability and Performance
The on-ONOS platform solution can be used for a limited number of flows that
do not affect the performance and reliability of ONOS
The scalability limit of the on-ONOS platform can be determined with field tests
It is a native ONOS DPI solution that can be used for certain premium service
applications or for troubleshooting ONOS control & management traffic
The off-ONOS platform solution is a full-blown, scalable DPI solution
It is a non-native solution
It requires a dedicated stand-alone DPI engine, which can be a hardware-based
solution for performance assurance or a container-based virtual software
function for service agility
The issue seems to be
not choosing one of them,
but finding an appropriate deployment ratio of both solutions depending on
the service requirements
Demo and Use Cases
GetFlowStatistics Command
CLI:onos> get-flow-stats Device[/Port] [--summary | --all | --top number] [--flow type | --instruction type] [--help]
Default: onos> get-flow-stats Device --summary
-s, --summary: show summary flow rule stats based on flow live type
-a, --all: show all flow rule stats
-t, --top N: show only top N flow rule stats, 0 < N <= 1000, default = 100
-f, --flow type: show only flow rules with live type = [IMMEDIATE | SHORT |
MID | LONG]
-i, --instruction type: show only flow rules with instruction type = [DROP |
OUTPUT | GROUP | L0MODIFICATION | L2MODIFICATION | TABLE |
L3MODIFICATION | METADATA]
CLI:onos> get-flow-stats
[Screenshots: output of the following commands]
get-flow-stats --help
get-flow-stats --summary
get-flow-stats --all
get-flow-stats --topn 10
CLI:onos> dpis & -j
Lists the DPI results received from the external DPI engine
in table or JSON format.
dpis --help
CLI:onos> dpis n 2
Lists the latest n (MAX=100) DPI result entries
Selective DPI Use Case for Service Function
Chaining Classification Application (off-Platform)
[Figure: message flow between ONOS Apps (ex; SFC), OpenDPIApp Component (DPIServiceManager), RawPacketDumpManager Component, ONOS Controller, RawPacketDumpProvider Component (Capture Interface), and External OpenDPIEngine (DPIServiceAgent, DPI Engines, Stores, Packet Dumpers)]
1: DPI Start Request: identification of specific flows for Service Chaining Classification
2.1: Create a RawPacket Dumper and start to capture
2.2: Then enforce an instance of DPI Engine
2.3: Return classification results
3: Return the result and notify DPI External PacketDumper Access Information
4: Request redirection of flow for DPI with Access Info
5: Set up Flow Rule for redirection of requested flow(s)
6: Redirection of requested flow(s) to DPI Subsystem
7: Packet dumps into a destined DPI engine for analytics
8: DPI Classification Results for Service Chaining Classification
Thanks
for your Attention