Serverless Compute Platforms on Kubernetes: Beyond Web Applications
Alex Glikson
Senior Research Architect, Cloud Platforms
Carnegie Mellon University, Pittsburgh, USA
(IBM Research, Israel)
KubeCon, May 2019
with Ping-Min Lin (Pinterest), Shengjie Luo (VMware), Ke Chang (Facebook), Shichao Nie (Alibaba)
KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019
Outline
● Introduction
○ Serverless
■ Serverless Compute
● FaaS
● Non-FaaS
● Our Use-Cases
○ Interactive Computing
■ Demo
○ Deep Learning
● Conclusions
Serverless
● Many definitions. In a nutshell:
● Avoid management of servers, as a representative example of tasks that:
○ Keep you distracted from developing your *core* business capabilities, and
○ Can be outsourced to someone you trust, for whom this would be *their* core business
● Serverless = Distraction-Free
● Separation of concerns
● Developer experience??
Serverless = Distraction-Free (Examples)
● Object Storage (example: Amazon S3):
○ Core: data organization
○ Distraction: servers, storage, network, high availability, fault tolerance, replication, consistency
● Micro-services (example: Kubernetes+Istio+…):
○ Core: services logic, interfaces
○ Distraction: infra, scaling, LB, HA/FT, API management, routing, service discovery, databases
● Async/Event-driven (example: Lambda, SNS, etc.):
○ Core: event-processing logic
○ Distraction: eventing, messaging, queuing, notifications, etc. (+infra/scaling/LB/HA/FT/auth/etc.)
● …
Serverless Compute Platform (SCP)
● Platform that executes user-provided code (BYOC)
● Often optimized for specific application patterns
○ Often associated with fine-grained elasticity, scaling to zero, etc.
● Distraction-free
○ Simplified management
■ Deployment, scaling, metering, monitoring, logging, updates, etc
○ Seamless integration with services that the ‘compute’ interacts with (or depends on)
■ Event sources, data, communication middleware, etc.
SCP: Function as a Service (FaaS)
● Platform: General-Purpose FaaS
● Examples: Lambda, Azure Functions, Google Cloud Functions; Kubeless, OpenFaaS, OpenWhisk
● Code: Arbitrary functions
● Application Pattern: (Not too) short-lived, ephemeral functions, triggered by events or requests; high load variability (including periods of idleness); (relatively) low sensitivity to latency
● Management: Fully managed runtime containers; functions and function invocations as first-class citizens
● Integration: Seamless integration with multiple event sources
SCP: Specialized (Embedded) FaaS
● Platform: Programmable network edge FaaS
● Examples: PubNub Functions, Lambda@Edge
● Code: Arbitrary functions (programming languages often limited)
● Application Pattern: High-throughput, low-latency packet processing
● Management: Fully managed isolated runtime
● Integration: The hosting platform
Other (Non-FaaS?) SCPs: Serverless ETL
● Platform: Serverless ETL
● Examples: AWS Glue
● Code: PySpark, PyShell jobs
● Application Pattern: Data-parallel Spark jobs (periodic or ad-hoc); non-parallel pre/post-processing jobs
● Management: Fully managed Spark cluster; Python runtime
● Integration: Data catalogue
Non-FaaS SCP: Cloud-Native Web Applications
● Platform: Cloud-Native Web Applications
● Examples: Knative
● Code: Arbitrary application serving HTTP requests
● Application Pattern: Long-running, scale-out services; linear resource demand per request; often high-throughput, low-latency
● Management: K8s features + code-to-deploy, revisions, canary deployment, etc.
● Integration: Service mesh, build, eventing
What Other Application Patterns Could Justify a Specialized SCP?
● Platform: ?
● Examples: ?
● Code: ?
● Application Pattern: ?
● Management: ?
● Integration: ?
Interactive Computing
● Example: Data Science using Jupyter Notebook
● Architecture 1: Python + Spark
○ Scale-out Spark jobs
○ Requires Spark programming model
● Architecture 2: “pure” Python
○ Local execution, using non-parallel Python libraries
○ Not designed for scale-out, but can take advantage of scale-up
● Other example: Linux Shell
SCP for Interactive Computing
● Platform: Interactive Computing (Jupyter, Shell)
● Code: Python, Bash
● Application Pattern: Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction
● Management: Provisioning, management, scaling of underlying resources
● Integration: Data sources, auth, etc.
● Resulting requirements (slide callouts):
○ Efficient persistence of state across invocations
○ Scale-up rather than scale-out
○ Scale to zero when idle
○ Easily re-programmable (code as payload)
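The combination of "code as payload" and "persistence of state across invocations" can be illustrated with a small sketch. It is not taken from Runbox: the class name `PersistentSession` and the state file are hypothetical, and the standard-library `pickle` module is assumed as the serializer. Each payload runs against a namespace that is saved after every invocation, so a fresh process (for example, after scaling to zero) can pick up where the previous one stopped.

```python
import os
import pickle
import tempfile


class PersistentSession:
    """Hypothetical helper: runs code payloads against a namespace that
    survives the loss of the process that created it."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.namespace = {}
        self.restore()

    def restore(self):
        # Reload variables saved by an earlier incarnation, if any.
        if os.path.exists(self.state_path):
            with open(self.state_path, "rb") as f:
                self.namespace = pickle.load(f)

    def save(self):
        # Keep only user-defined values that pickle can serialize;
        # dunder entries such as __builtins__ and modules are skipped.
        state = {}
        for name, value in self.namespace.items():
            if name.startswith("__"):
                continue
            try:
                pickle.dumps(value)
            except Exception:
                continue
            state[name] = value
        with open(self.state_path, "wb") as f:
            pickle.dump(state, f)

    def run(self, code):
        # The code itself is the payload; state is persisted after each call.
        exec(code, self.namespace)
        self.save()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "state.pkl")
    first = PersistentSession(path)
    first.run("x = 21")
    # Simulate scale-to-zero: a new session sees only the saved file.
    second = PersistentSession(path)
    second.run("y = x * 2")
    print(second.namespace["y"])  # 42
```

Saving after every invocation trades some latency for the ability to recycle the compute at any idle moment; values that cannot be serialized are silently dropped in this sketch, which a real system would have to report.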
Runbox: Elastic Persistent Execution Environment on K8s
https://github.com/slsvm/runbox
[Figure: Runbox architecture. Components shown: Dev Machine, UI (e.g., Jupyter, Bash), Runbox Proxy, Notebook Filesystem, Kubernetes Cluster, Runbox Controller*, Runbox (Pod/RS, Container, Data Volume). Operations shown: Create, Exec, Recycle.]
DEMO – Bash
● https://github.com/slsvm/runbox
● Screenshot callouts:
○ Runbox environment: Pod, Image, Volume (+deployment, side-car)
○ Remote command execution
○ Filesystem synchronization
○ Persistent over recycling of idle resources (e.g., by the Runbox controller)
○ Per-command vertical scaling
DEMO – Jupyter
● https://github.com/slsvm/runbox-jupyter (COMING SOON)
Architecture - Jupyter
[Figure: Jupyter on Runbox. Components shown: Jupyter-Browser, Jupyter Server with Runbox Extension, Notebook Filesystem, Dev Machine, Kubernetes Cluster, Runbox Controller*, Runbox (Pod/RS, Container, Data Volume). Numbered steps: 1 start kernel, 2 create, 3 sync, 4 resize, 5 run cell, 6 resize up, 7 exec, 8 restore, 9 exec, 10 save; steps 11 and 12 carry no recoverable label. Other labels: sync, cold, save, GC.]
Design Details
● Special Jupyter Kernels, delegating execution to a K8s Pod using `kubectl exec`
○ E.g., scp-python, scp-bash
● State is persisted in a K8s volume attached to the Pod
○ Snapshot/restore in-memory state using `dill` in Python and `set/source` in Bash
○ Also, state is synchronized from/to the local machine via a side-car running unison
● Pod is scaled down (optionally, to zero) when nothing is executed
○ E.g., by scaling the containing ReplicaSet, or using in-place Pod vertical scaling (WIP)
○ Tradeoff between capacity for ‘warm’ containers and latency managed by dedicated controller
● When image changes (e.g., after `apt install`), a new image is committed
○ Using tags for versioning; docker-squash to remove redundant layers
● Magics to control the non-functional properties
○ E.g., resource allocation, whether or not image snapshot is needed, etc
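As a rough illustration of the first and third bullets above, the sketch below builds the `kubectl` command lines a kernel could issue to run a cell in a Pod and to scale the containing ReplicaSet. It is an assumption-laden sketch, not the Runbox code: the function names, the `runbox-0` Pod name and the `runbox-rs` ReplicaSet name are hypothetical, and only the argument lists are built (nothing is executed against a cluster).

```python
import shlex


def exec_argv(pod, command, namespace="default", container=None):
    """Build argv for running a shell command inside a Pod via `kubectl exec`."""
    argv = ["kubectl", "exec", pod, "-n", namespace]
    if container:
        argv += ["-c", container]
    # Everything after `--` is passed to the container, not parsed by kubectl.
    argv += ["--", "bash", "-c", command]
    return argv


def scale_argv(replicaset, replicas, namespace="default"):
    """Build argv for scaling the ReplicaSet that contains the Pod."""
    return ["kubectl", "scale", f"replicaset/{replicaset}",
            f"--replicas={replicas}", "-n", namespace]


def as_shell(argv):
    """Render argv as a copy-pasteable, safely quoted shell line."""
    return shlex.join(argv)
```

For example, `as_shell(exec_argv("runbox-0", "echo hi"))` yields a line that could be passed to `subprocess.run` (as a list) or pasted into a terminal, and `scale_argv("runbox-rs", 0)` corresponds to the scale-to-zero case. Note that a Pod recreated after scaling back up generally gets a new name, so a real kernel would need to resolve the Pod before each `exec`; that lookup is omitted here.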
Lessons Learned
● Kubernetes originally focused on scale-out workloads, but can also support scale-up
○ New kind of controller?
● Generic support for application-assisted snapshots could be useful
● For use-cases involving ephemeral compute, API for direct access to volumes could be useful
Deep Learning
● Resource-intensive
○ (1) model training, (2) inference
● Frameworks: TensorFlow, Keras, PyTorch, etc.
● ‘Hot’ research area – new algorithms, frameworks, etc
● Example application: Image Classification
○ Given a model + unlabeled example(s), predict label(s)
○ Compute-intensive, scale-out, can leverage GPUs
● Application domains (pictured): transportation, medicine, smart cities, security, consumer, games, e-commerce
SCP for Deep Learning Inference
● Platform: Deep Learning Inference
● Code: Model inference implementation (Python)
● Application Pattern: Long-running, scale-out services; linear resource demand per request; load variance; can benefit from running on GPUs; potentially large “cold-start” latencies
● Management: Same as Knative (build, serving, eventing); load balancing between GPU and CPU resources; minimal ‘cold-start’ latency
● Integration: K8s, Istio, model storage, etc.
Our Architecture
[Figure: hybrid GPU/CPU serving architecture. Components shown: User, Hybrid Service, GPU-aware Load Balancer (LB), GPU Scheduler, Pool Manager, Standby Pool of Pods, GPU Nodes and CPU Nodes, Knative Services 1 to 4, each with its own Pods and scaling.]
Design Details
● Build: Automatically add HTTP interface
○ Augment the provided inference logic with a Django ‘wrapper’, then use Knative build to deploy it
● Load-balancing across GPU-enabled and CPU-only nodes
○ Patch Knative to support GPU resources
○ Based on model properties, indicate in the Knative service template whether a GPU is preferable
○ Two-level scheduling: 1 GPU service and 1 CPU service for each app; fair time-sharing of GPUs
● Maintain a pool of ‘warm’ Pods
○ “Pool” is a ReplicaSet with ‘warm’ (running) Pods
■ Size is adjusted dynamically by the Pool Controller (cluster utilization, estimated demand)
○ Knative scaling logic consumes a warm Pod from the Pool instead of provisioning a new one
■ Pod “migration” is implemented by label manipulation + update of the Istio side-car via API
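The warm-pool idea in the last group of bullets can be sketched with plain in-memory objects. This is only a model of the label-manipulation step under stated assumptions: the `Pod` and `WarmPool` classes, the `pool=standby` label and the `service` label are hypothetical names, no Kubernetes API is called, and the Istio side-car update is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    labels: dict = field(default_factory=dict)


class WarmPool:
    """In-memory stand-in for a ReplicaSet of warm Pods selected by label."""

    def __init__(self, pods, pool_label="pool", pool_value="standby"):
        self.pool_label = pool_label
        self.pool_value = pool_value
        self.pods = list(pods)
        for pod in self.pods:
            pod.labels[pool_label] = pool_value

    def warm_pods(self):
        # Pods still matched by the pool's label selector.
        return [p for p in self.pods
                if p.labels.get(self.pool_label) == self.pool_value]

    def acquire(self, service):
        """Hand a warm Pod to a service by relabeling; None means cold start."""
        warm = self.warm_pods()
        if not warm:
            return None
        pod = warm[0]
        # Relabeling takes the Pod out of the pool's selector and puts it
        # under the service's selector; no new container has to start.
        del pod.labels[self.pool_label]
        pod.labels["service"] = service
        return pod

    def deficit(self, target_size):
        """How many Pods a pool controller would add to reach the target."""
        return max(0, target_size - len(self.warm_pods()))
```

In this model the scaling logic would call `acquire("my-service")` first and fall back to provisioning a new Pod only when it returns `None`, while a pool controller periodically uses `deficit(target)` to refill the pool; how the target is derived from cluster utilization and estimated demand is not modeled here.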
Lessons Learned
● Standardized HTTP wrappers can be used to deliver FaaS-like experience
○ Can leverage existing open source FaaS solutions (e.g., OpenWhisk)
● More fine-grained management of GPU resources would be beneficial
○ The overhead of 2-level scheduling is substantial
● For reuse of ‘warm’ Pods, stronger notion of ‘similarity’ between Pods is needed
○ E.g., same model version?
● Even pool of size 1 significantly reduces the chances of cold starts
○ Instead of pools, can we reuse priority classes and make Knative scaling logic adjust priorities?
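The first lesson above, that a standardized HTTP wrapper gives a FaaS-like experience, can be sketched as follows. The talk's implementation uses a Django wrapper; this sketch swaps in Python's standard-library `http.server` to stay self-contained, and `predict`, `make_handler`, `serve_in_background` and `call` are hypothetical names chosen for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload):
    """Placeholder for user-provided inference logic (hypothetical)."""
    return {"label": "cat" if payload.get("whiskers") else "unknown"}


def make_handler(fn):
    """Wrap a plain Python function in a JSON-over-HTTP POST handler."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            body = json.dumps(fn(payload)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    return Handler


def serve_in_background(fn, host="127.0.0.1", port=0):
    """Start the wrapper on a daemon thread; port 0 picks a free port."""
    server = HTTPServer((host, port), make_handler(fn))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://{host}:{server.server_port}/"


def call(url, payload):
    """Send one JSON request to the wrapped function and decode the reply."""
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    # Bypass any proxy settings from the environment for local calls.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return json.loads(resp.read())
```

The user supplies only `predict`; everything else is generic and could be added automatically at build time, which is the property that makes the result feel like a function platform even though a long-running HTTP service is what actually gets deployed.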
Conclusions
● “Serverless” = BYOC + distraction-free
● “Serverless” derives different requirements for different workloads
● No one-size-fits-all!
● Lots of opportunities to deliver ‘serverless’ experience for new workloads!
○ Knative can be enhanced to achieve “serverless” goals for DL inference (KFServing?)
○ SCP for Interactive Computing requires new capabilities on top of Kubernetes
■ https://github.com/slsvm/runbox
Questions? Ideas? Suggestions? Collaboration?
● alex dot glikson at gmail dot com
63

Serverless Compute Platforms on Kubernetes

  • 1.
    Serverless Compute Platforms onKubernetes: Beyond Web Applications Alex Glikson Senior Research Architect, Cloud Platforms Carnegie Mellon University, Pittsburgh, USA (IBM Research, Israel) KubeCon, May 2019 with Ping-Min Lin (Pinterest), Shengjie Luo (VMware), Ke Chang (Facebook), Shichao Nie (Alibaba)
  • 2.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ■ Demo ○ Deep Learning ● Conclusions 2
  • 3.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Serverless ● Many definitions. In a nutshell: ● Avoid management of servers, as a representative example of tasks that: ○ Keep you distracted from developing your *core* business capabilities, and ○ Can be outsourced to someone you trust, for whom this would be *their* core business ● Serverless = Distraction-Free ● Separation of concerns ● Developer experience?? 3
  • 4.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Serverless = Distraction-Free (Examples) ● Object Storage: ○ Core: data organization ○ Distraction: servers, storage, network, high availability, fault tolerance, replication, consistency ● Micro-services: ○ Core: services logic, interfaces ○ Distraction: infra, scaling, LB, HA/FT, API management, routing, service discovery, databases ● Async/Event-driven: ○ Core: event-processing logic ○ Distraction: eventing, messaging, queuing, notifications, etc (+infra/scaling/LB/HA/FT/auth/etc) ● … 4 Example: Amazon S3 Example: Kubernetes+Istio+… Example: Lambda, SNS, etc
  • 5.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Serverless Compute Platform (SCP) ● Platform that executes user-provided code (BYOC) ● Often optimized for specific application patterns ○ Often associated fine-grained elasticity, scaling to zero, etc ● Distraction-free ○ Simplified management ■ Deployment, scaling, metering, monitoring, logging, updates, etc ○ Seamless integration with services that the ‘compute’ interacts with (or depends on) ■ Event sources, data, communication middleware, etc. 5
  • 6.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 6 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Application Pattern Management Integration
  • 7.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 7 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern Management Integration
  • 8.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 8 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Integration
  • 9.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 9 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Fully managed runtime containers; functions & function invocations as first class citizens Integration
  • 10.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 10 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Fully managed runtime containers; functions & function invocations as first class citizens Integration Seamless integration with multiple event sources
  • 11.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Specialized (Embedded) FaaS 11 Platform Property Programmable network edge FaaS Examples PubNub Functions, Lambda@Edge Code Application Pattern Management Integration
  • 12.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Specialized (Embedded) FaaS 12 Platform Property Programmable network edge FaaS Examples PubNub Functions, Lambda@Edge Code Arbitrary functions (programming languages often limited) Application Pattern Management Integration
  • 13.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Specialized (Embedded) FaaS 13 Platform Property Programmable network edge FaaS Examples PubNub Functions, Lambda@Edge Code Arbitrary functions (programming languages often limited) Application Pattern High throughput, low latency packet processing Management Integration
  • 14.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Specialized (Embedded) FaaS 14 Platform Property Programmable network edge FaaS Examples PubNub Functions, Lambda@Edge Code Arbitrary functions (programming languages often limited) Application Pattern High throughput, low latency packet processing Management Fully managed isolated runtime Integration
  • 15.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP: Specialized (Embedded) FaaS 15 Platform Property Programmable network edge FaaS Examples PubNub Functions, Lambda@Edge Code Arbitrary functions (programming languages often limited) Application Pattern High throughput, low latency packet processing Management Fully managed isolated runtime Integration The hosting platform
  • 16.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Other (Non-FaaS?) SCPs: Serverless ETL 16 Platform Property Serverless ETL Examples Code Application Pattern Management Integration
  • 17.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Other (Non-FaaS?) SCPs: Serverless ETL 17 Platform Property Serverless ETL Examples AWS Glue Code Application Pattern Management Integration
  • 18.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Other (Non-FaaS?) SCPs: Serverless ETL 18 Platform Property Serverless ETL Examples AWS Glue Code PySpark, PyShell jobs Application Pattern Management Integration
  • 19.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Other (Non-FaaS?) SCPs: Serverless ETL 19 Platform Property Serverless ETL Examples AWS Glue Code PySpark, PyShell jobs Application Pattern Data-parallel Spark jobs (periodic or ad-hoc) Non-parallel pre/post-processing jobs Management Integration
  • 20.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Other (Non-FaaS?) SCPs: Serverless ETL 20 Platform Property Serverless ETL Examples AWS Glue Code PySpark, PyShell jobs Application Pattern Data-parallel Spark jobs (periodic or ad-hoc) Non-parallel pre/post-processing jobs Management Fully managed Spark cluster; Python runtime Integration Data catalogue
  • 21.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 21 Platform Property Cloud-Native Web Applications Examples Code Application Pattern Management Integration
  • 22.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 22 Platform Property Cloud-Native Web Applications Examples Knative Code Application Pattern Management Integration
  • 23.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 23 Platform Property Cloud-Native Web Applications Examples Knative Code Arbitrary application serving HTTP requests Application Pattern Management Integration
  • 24.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 24 Platform Property Cloud-Native Web Applications Examples Knative Code Arbitrary application serving HTTP requests Application Pattern Long-running, scale-out services; Linear resource demand per request Often high-throughput, low-latency Management Integration
  • 25.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 25 Platform Property Cloud-Native Web Applications Examples Knative Code Arbitrary application serving HTTP requests Application Pattern Long-running, scale-out services; Linear resource demand per request Often high-throughput, low-latency Management K8s features + code-to-deploy, revisions, canary deployment, etc Integration
  • 26.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Non-FaaS SCP: Cloud-Native Web Applications 26 Platform Property Cloud-Native Web Applications Examples Knative Code Arbitrary application serving HTTP requests Application Pattern Long-running, scale-out services; Linear resource demand per request Often high-throughput, low-latency Management K8s features + code-to-deploy, revisions, canary deployment, etc Integration Service mash, build, eventing
  • 27.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 What Other Application Patterns Could Justify a Specialized SCP? 27 Platform Property ? Examples ? Code ? Application Pattern ? Management ? Integration ?
  • 28.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ○ Deep Learning ● Conclusions 28
  • 29.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 Interactive Computing ● Example: Data Science using Jupyter Notebook ● Architecture 1: Python + Spark ○ Scale-out Spark jobs ○ Requires Spark programming model ● Architecture 2: “pure” Python ○ Local execution, using non-parallel Python libraries ○ Not designed for scale-out, but can take advantage of scale-up ● Other example: Linux Shell 29
  • 30.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP for Interactive Computing 30 Property Interactive Computing (Jupyter, Shell) Code Application Pattern Management Integration
  • 31.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP for Interactive Computing 31 Property Interactive Computing (Jupyter, Shell) Code Python, Bash Application Pattern Management Integration
  • 32.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP for Interactive Computing 32 Property Interactive Computing (Jupyter, Shell) Code Python, Bash Application Pattern Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction Management Integration Efficient persistence of state across invocations Scale-up rather than scale-out Easily re-programmable (code as payload) Scale to zero when idle
  • 33.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP for Interactive Computing 33 Property Interactive Computing (Jupyter, Shell) Code Python, Bash Application Pattern Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction Management Provisioning, management, scaling of underlying resources Integration Efficient persistence of state across invocations Scale-up rather than scale-out Scale to zero when idle Easily re-programmable (code as payload)
  • 34.
    KubeCon / CloudNativeCon,Barcelona, May 20-23, 2019 SCP for Interactive Computing 34 Property Interactive Computing (Jupyter, Shell) Code Python, Bash Application Pattern Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction Management Provisioning, management, scaling of underlying resources Integration Data sources, auth, etc Efficient persistence of state across invocations Scale-up rather than scale-out Scale to zero when idle Easily re-programmable (code as payload)
  • 35.
    Runbox: Elastic Persistent Execution Environment on K8s
    https://github.com/slsvm/runbox
    [Diagram labels: Dev Machine; UI (e.g., Jupyter, Bash); Runbox Proxy; Notebook; Filesystem; Kubernetes Cluster; Runbox Controller*; Runbox; Pod/RS; Container; Data Volume; operations: Create, Exec, Recycle]
  • 36.
    DEMO – Bash
    ● https://github.com/slsvm/runbox
  • 40.
    ● Runbox environment: Pod, Image, Volume (+ deployment, side-car)
    ● Remote command execution
    ● Filesystem synchronization
    ● Persistent over recycling of idle resource (e.g., by Runbox controller)
    ● Per-command vertical scaling
  • 41.
    DEMO – Jupyter
    ● https://github.com/slsvm/runbox-jupyter (COMING SOON)
  • 48.
    Architecture - Jupyter
    [Diagram labels: Dev Machine; Jupyter-Browser; Jupyter Server; Runbox Extension; Notebook; Filesystem; Kubernetes Cluster; Runbox Controller*; Runbox; Pod/RS; Container; Data Volume. Numbered steps 1-12 include: start kernel, create, sync, resize, run cell, resize up, exec, restore, save. Other labels: sync, cold save, GC]
  • 49.
    Design Details
    ● Special Jupyter Kernels, delegating execution to a K8s Pod using `kubectl exec`
    ○ E.g., scp-python, scp-bash
    ● State is persisted in a K8s volume attached to the Pod
    ○ Snapshot/restore in-memory state using `dill` in Python and `set/source` in Bash
    ○ Also, state is synchronized from/to the local machine via a side-car running unison
    ● Pod is scaled down (optionally, to zero) when nothing is executed
    ○ E.g., by scaling the containing ReplicaSet, or using in-place Pod vertical scaling (WIP)
    ○ Tradeoff between capacity for ‘warm’ containers and latency managed by dedicated controller
    ● When image changes (e.g., after `apt install`), a new image is committed
    ○ Using tags for versioning; docker-squash to remove redundant layers
    ● Magics to control the non-functional properties
    ○ E.g., resource allocation, whether or not image snapshot is needed, etc
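A minimal sketch of three mechanisms from these design details: delegating a command to a Pod with `kubectl exec`, scaling the containing ReplicaSet to zero, and snapshotting in-memory state between invocations. This is not the Runbox code. The function names, the `default` namespace, and the Pod/ReplicaSet names are hypothetical, the functions only build command lines rather than run them, and the standard-library `pickle` is swapped in for `dill` so the example has no third-party dependency.

```python
import pickle
from pathlib import Path
from typing import Any, Dict, List, Optional


def kubectl_exec_argv(pod: str, command: List[str], namespace: str = "default",
                      container: Optional[str] = None) -> List[str]:
    """Build (but do not run) a `kubectl exec` command line for a Pod."""
    argv = ["kubectl", "exec", "-n", namespace, pod]
    if container:
        argv += ["-c", container]
    # Everything after "--" is the command executed inside the container.
    return argv + ["--"] + command


def kubectl_scale_argv(replicaset: str, replicas: int,
                       namespace: str = "default") -> List[str]:
    """Build a `kubectl scale` command line; replicas=0 scales to zero."""
    return ["kubectl", "scale", "-n", namespace,
            f"replicaset/{replicaset}", f"--replicas={replicas}"]


def snapshot_state(namespace: Dict[str, Any], path: Path) -> List[str]:
    """Save the picklable, non-private variables of a namespace to a file.

    Assumption: values that cannot be pickled (e.g., lambdas) are skipped;
    dill, which the slides mention, can serialize more kinds of objects.
    """
    saved: Dict[str, Any] = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue
        try:
            pickle.dumps(value)
        except Exception:
            continue  # not serializable with pickle, leave it out
        saved[name] = value
    path.write_bytes(pickle.dumps(saved))
    return sorted(saved)


def restore_state(namespace: Dict[str, Any], path: Path) -> None:
    """Load a snapshot written by snapshot_state back into a namespace."""
    namespace.update(pickle.loads(path.read_bytes()))
```

In this sketch the snapshot file would live on the volume attached to the Pod, so a kernel could call `snapshot_state(globals(), path)` before the Pod is scaled down and `restore_state(globals(), path)` after it is recreated.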
  • 50.
    Lessons Learned
    ● Kubernetes originally focused on scale-out workloads, but can also support scale-up
    ○ New kind of controller?
    ● Generic support for application-assisted snapshots could be useful
    ● For use-cases involving ephemeral compute, API for direct access to volumes could be useful
  • 52.
    Deep Learning
    ● Resource-intensive
    ○ (1) model training, (2) inference
    ● Frameworks: TensorFlow, Keras, PyTorch, etc.
    ● ‘Hot’ research area: new algorithms, frameworks, etc
    ● Example application: Image Classification
    ○ Given a model + unlabeled example(s), predict label(s)
    ○ Compute-intensive, scale-out, can leverage GPUs
    [Images: transportation, medicine, smart cities, security, consumer games, e-commerce]
  • 57.
    SCP for Deep Learning Inference
    Property: Deep Learning Inference
    ● Code: Model inference implementation (Python)
    ● Application Pattern:
    ○ Long-running, scale-out services; linear resource demand per request; load variance
    ○ Can benefit from running on GPUs; potentially large “cold-start” latencies
    ● Management:
    ○ Same as Knative: build, serving, eventing
    ○ Load balancing between GPU and CPU resources; minimal ‘cold-start’ latency
    ● Integration: K8s, Istio, model storage, etc
  • 58.
    Our Architecture
    [Diagram labels: User; Hybrid Service; GPU-aware Load Balancer; LB; GPU Scheduler; Pool Manager; Standby Pool (Pods); GPU Nodes with Knative Service 1 and Knative Service 2 (Pods, Pod scaling); CPU Nodes with Knative Service 3 and Knative Service 4 (Pods, Pod scaling)]
  • 59.
    Design Details
    ● Build: Automatically add HTTP interface
    ○ Augment the provided inference logic with a Django ‘wrapper’, then use Knative build to deploy it
    ● Load-balancing across GPU-enabled and CPU-only nodes
    ○ Patch Knative to support GPU resources
    ○ Based on model properties, indicate in the Knative service template whether a GPU is preferable
    ○ Two-level scheduling: 1 GPU service and 1 CPU service for each app; fair time-sharing of GPUs
    ● Maintain a pool of ‘warm’ Pods
    ○ “Pool” is a ReplicaSet with ‘warm’ (running) Pods
    ■ Size is adjusted dynamically by the Pool Controller (cluster utilization, estimated demand)
    ○ Knative scaling logic consumes a warm Pod from the Pool instead of provisioning a new one
    ■ Pod “migration” is implemented by label manipulation + update of the Istio side-car via API
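The warm-pool idea can be illustrated with a small in-memory model, offered as a sketch rather than the implementation described in the talk. Pods are plain objects, no Kubernetes or Istio API is called, the label keys `example.com/pool` and `example.com/service` are hypothetical, and the sizing rule in `target_pool_size` is an assumed policy (the slides only say that the size depends on cluster utilization and estimated demand).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

POOL_LABEL = "example.com/pool"        # hypothetical label marking pool membership
SERVICE_LABEL = "example.com/service"  # hypothetical label selected by a service


@dataclass
class Pod:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)


def claim_warm_pod(pods: List[Pod], service: str) -> Optional[Pod]:
    """Move one warm Pod from the pool to a service by rewriting its labels.

    Returns None when the pool is empty, in which case the caller would
    fall back to provisioning a new Pod (a cold start).
    """
    for pod in pods:
        if pod.labels.get(POOL_LABEL) == "warm":
            # Dropping the pool label takes the Pod out of the pool's selector;
            # adding the service label puts it behind the target service.
            del pod.labels[POOL_LABEL]
            pod.labels[SERVICE_LABEL] = service
            return pod
    return None


def target_pool_size(estimated_demand: int, free_slots: int, minimum: int = 1) -> int:
    """Assumed sizing policy: follow demand, keep at least `minimum` warm Pods,
    and never exceed the capacity currently free in the cluster."""
    return min(free_slots, max(minimum, estimated_demand))
```

In a real cluster, the ReplicaSet backing the pool would notice that a claimed Pod no longer matches its selector and start a replacement, which is what keeps the pool topped up after each claim.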
  • 60.
    Lessons Learned
    ● Standardized HTTP wrappers can be used to deliver FaaS-like experience
    ○ Can leverage existing open source FaaS solutions (e.g., OpenWhisk)
    ● More fine-grained management of GPU resources would be beneficial
    ○ The overhead of 2-level scheduling is substantial
    ● For reuse of ‘warm’ Pods, stronger notion of ‘similarity’ between Pods is needed
    ○ E.g., same model version?
    ● Even pool of size 1 significantly reduces the chances of cold starts
    ○ Instead of pools, can we reuse priority classes and make Knative scaling logic adjust priorities?
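As an illustration of the first lesson, a standardized HTTP wrapper around user-provided inference code might look like the sketch below. The talk describes a Django wrapper; here a plain WSGI application from the Python standard library conventions is swapped in to keep the example self-contained. The names `make_wsgi_app` and `toy_predict`, and the JSON request and response shapes, are hypothetical.

```python
import json
from typing import Callable, Dict, Iterable


def make_wsgi_app(predict: Callable[[Dict], Dict]):
    """Wrap a plain `predict(payload) -> result` function as a WSGI app."""

    def app(environ, start_response) -> Iterable[bytes]:
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed",
                           [("Content-Type", "application/json")])
            return [b'{"error": "use POST"}']
        # Read the JSON request body; an empty body is treated as {}.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        body = json.dumps(predict(payload)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return app


def toy_predict(payload: Dict) -> Dict:
    """Stand-in for user-provided inference logic (not a real model)."""
    pixels = payload.get("pixels", [])
    return {"label": f"class-{sum(pixels) % 2}", "inputs": len(pixels)}


# The user supplies only `toy_predict`; the platform adds the HTTP interface.
app = make_wsgi_app(toy_predict)
```

The resulting `app` can be served locally with `wsgiref.simple_server.make_server` or by any WSGI server inside a container image.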
  • 62.
    Conclusions
    ● “Serverless” = BYOC + distraction-free
    ● “Serverless” translates into different requirements for different workloads
    ● No one-size-fits-all!
    ● Lots of opportunities to deliver ‘serverless’ experience for new workloads!
    ○ Knative can be enhanced to achieve “serverless” goals for DL inference (KFServing?)
    ○ SCP for Interactive Computing requires new capabilities on top of Kubernetes
    ■ https://github.com/slsvm/runbox
  • 63.
    Questions? Ideas? Suggestions? Collaboration?
    ● alex dot glikson at gmail dot com