A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learning Pipelines
Brooke Wenig
Jules S. Damji
Spark + AI Summit, SF 6/5/2018
About Us . . .
Brooke Wenig
Databricks Machine Learning Instructor
Data Science Solution Consultant @ Databricks
Software Engineering @ Splunk & MyFitnessPal
MS Machine Learning (UCLA)
Fluent in Chinese
https://www.linkedin.com/in/brookewenig/
Jules S. Damji
Apache Spark Developer & Community Advocate @ Databricks
Program Chair, Spark + AI Summit
Software engineering @ Sun Microsystems, Netscape, @Home, VeriSign, Scalix, Centrify, LoudCloud/Opsware, ProQuest
https://www.linkedin.com/in/dmatrix
@2twitme
Agenda for Today’s Talk
• Impact of Big Data
• Why Apache Spark?
• Short Survey of 3 DL Frameworks
• TensorFlow
• Keras
• Deep Learning Pipelines
• Demo
• Q&A
What has Big Data Done to Us?
Permeated our lives
Source: MIT
Hardest Part of AI isn’t AI, it’s Data
[Figure: “ML Code” surrounded by Configuration, Data Collection, Data Verification, Feature Extraction, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, Monitoring]
Figure 1: Only a small fraction of real-world ML systems is composed of the ML code. The required surrounding infrastructure is vast and complex.
Source: “Hidden Technical Debt in Machine Learning Systems,” Google, NIPS 2015
What’s Apache Spark & Why
Apache Spark: The First Unified Analytics Engine
[Diagram labels: Runtime; Delta; Spark Core Engine]
Big Data Processing: ETL + SQL + Streaming
Machine Learning: MLlib + SparkR
Uniquely combines Data & AI technologies
Survey of Three Deep Learning Frameworks
What’s TensorFlow?
• Open source from Google, 2015
• Current v1.8 API
• Fast: Backend C/C++
• Data flow graphs
• Nodes are functions/operators
• Edges are input or data (tensors)
• Lazy execution
• Eager execution (1.7)
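The graph model in the bullets above can be illustrated with a toy example. The sketch below is plain Python, not TensorFlow's API: the `Node` class and the `constant`, `add`, and `multiply` helpers are hypothetical names invented for illustration. It shows operators as nodes, values flowing along edges, and the difference between building a graph that runs later (lazy) and computing right away (eager).

```python
# Toy dataflow graph in plain Python (hypothetical classes, NOT TensorFlow's API).
# Nodes are operators; edges carry the values ("tensors") that flow between them.


class Node:
    """An operator in the graph; `inputs` are the incoming edges."""

    def __init__(self, op, inputs=()):
        self.op = op                  # function applied to the input values
        self.inputs = tuple(inputs)   # upstream nodes feeding this operator

    def run(self):
        # Lazy execution: nothing is computed until run() is called.
        values = [node.run() for node in self.inputs]
        return self.op(*values)


def constant(value):
    """A source node that emits a fixed value."""
    return Node(lambda: value)


def add(a, b):
    return Node(lambda x, y: x + y, (a, b))


def multiply(a, b):
    return Node(lambda x, y: x * y, (a, b))


# Building the graph only describes the computation (a + b) * c.
a, b, c = constant(2.0), constant(3.0), constant(4.0)
graph = multiply(add(a, b), c)

# Lazy: the result appears only when the graph is run.
lazy_result = graph.run()

# Eager: each operation is evaluated immediately, as in ordinary Python.
eager_result = (2.0 + 3.0) * 4.0

print(lazy_result, eager_result)
```

Separating graph construction from execution is what lets a runtime inspect, optimize, or place the whole computation before running it; eager mode trades that for immediate, debuggable results.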
TensorFlow Programming Stack
[Diagram labels: Use canned estimators; Build models; Keras Models; CPU, GPU, TPU, Android, iOS, …]
Why TensorFlow: Community
• 100K+ stars!
• 11M downloads
• Popular open-source code
• TensorFlow Hub & Blog
○ Code Examples & Tutorials!
○ Learn + share from others
Why TensorFlow: Tools
• Deploy + Serve Models
• TensorBoard
• Visualize Tensors flow
TensorFlow: We Get it … So What?
• Steep learning curve, but powerful!!
• Low-level APIs, but offers control!!
• Expert in Machine Learning, just learn!!
• Yet, high-level Estimators help, you bet!!
• Better, Keras integration helps, indeed!!
What’s Keras?
• Open source Python Library APIs for Deep Learning
• Current v2.1.6 APIs
• François Chollet (Google)
• API spec: TensorFlow, CNTK and Theano
• Easy to Use High-Level Declarative APIs!
• Build layers
– Great for Neural Network Applications
• Fast Experimentation, Modular & Extensible!
Keras Programming Stack
[Diagram labels: Keras API Specification; Specific Impl models: TF-Keras, Theano-Keras, CNTK; TensorFlow Workflow; Use canned estimators; CPU, GPU, TPU, Android, iOS, …]
Why Keras?
• Focuses on Developer Experience
• Popular & Broader Community
• Supports multiple backends
• Modularity
• Sequential Layers
• Multi-layer input networks
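from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(32, input_dim=784))  # first layer declares the input size
model.add(Activation('relu'))
model.add(Dense(32, activation='softmax'))  # layer with softmax outputs
...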
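To make the stacked layers above concrete, here is a rough sketch of the arithmetic such a Sequential stack performs on one input. It uses plain Python rather than Keras: `dense`, `relu`, `softmax`, and `random_layer` are hypothetical helpers, the weights are random rather than trained, and treating the 784 inputs as one flattened 28x28 image is an assumption.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Keep positive values, replace negatives with zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(values)                      # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w1, b1 = random_layer(784, 32, rng)   # plays the role of Dense(32, input_dim=784)
w2, b2 = random_layer(32, 32, rng)    # plays the role of the second Dense(32)

x = [rng.random() for _ in range(784)]   # stand-in for one flattened input

hidden = relu(dense(x, w1, b1))            # Dense followed by Activation('relu')
output = softmax(dense(hidden, w2, b2))    # Dense with softmax activation

print(len(hidden), len(output), round(sum(output), 6))
```

Each `model.add(...)` call appends one such step, and the output of one layer becomes the input of the next.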
Transfer Learning & Deep Learning Pipelines
What’s Transfer Learning?
• Training from scratch requires
• Enormous amounts of data
• A lot of compute resources & time
IDEA: Intermediate representations learned for one task may be useful for other related tasks
Trained Model
SoftMax
GIANT PANDA 0.9
RACCOON 0.05
RED PANDA 0.01
…
Transfer Learning as a Pipeline
Classifier
Dog/Cat?
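The pipeline idea can be sketched in a few lines: a frozen featurizer feeds a small classifier, and only the classifier is trained. Everything here is illustrative and assumed. `pretrained_featurizer` is a hypothetical stand-in for a pre-trained network with its last layer removed, the "images" are made-up pixel lists, and the classifier is a plain logistic regression rather than any particular library's implementation.

```python
import math


def pretrained_featurizer(pixels):
    """Hypothetical stand-in for a frozen, pre-trained network without its
    final classification layer. It is never updated during training."""
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(features, labels, epochs=500, lr=0.5):
    """Logistic regression trained with plain gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(pixels, weights, bias):
    """Run the full pipeline: featurize, then classify."""
    x = pretrained_featurizer(pixels)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up toy "images" (flat pixel lists); label 1 = dog, 0 = cat.
images = [
    [0.9, 0.8, 0.9, 0.7],
    [0.8, 0.9, 0.7, 0.9],
    [0.1, 0.2, 0.1, 0.3],
    [0.2, 0.1, 0.3, 0.1],
]
labels = [1, 1, 0, 0]

# Stage 1: featurize with the frozen model. Stage 2: train only the classifier.
features = [pretrained_featurizer(img) for img in images]
weights, bias = train_classifier(features, labels)

print([round(predict(img, weights, bias), 2) for img in images])
```

Because the featurizer is fixed, only a handful of classifier parameters are learned, which is why this approach can work with far less data and compute than training from scratch.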
When to use Transfer Learning?
• Dataset is small & similar
• Dataset is large & similar
• Dataset is small but different
• Dataset is large and different
Source: Andrej Karpathy’s Transfer Learning
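The slide lists the four cases without the usual recommendation for each. The lookup below is a hedged summary of the rule-of-thumb guidance commonly given for them (as in the CS231n transfer learning notes); the function name and the string labels are hypothetical, and real projects should treat these as starting points rather than rules.

```python
def transfer_strategy(dataset_size, similarity):
    """Return a rule-of-thumb strategy for one of the four cases.

    dataset_size: "small" or "large"
    similarity:   "similar" or "different" (relative to the pre-training data)
    """
    strategies = {
        # Little data, same kind of images: reuse top-layer features as-is.
        ("small", "similar"): "train a linear classifier on top-layer features",
        # Plenty of data, same kind of images: safe to update all layers.
        ("large", "similar"): "fine-tune through the full network",
        # Little data, different images: earlier layers are more generic.
        ("small", "different"): "train a linear classifier on earlier-layer activations",
        # Plenty of data, different images: retrain everything, often
        # still starting from the pre-trained weights.
        ("large", "different"): "fine-tune the whole network or train from scratch",
    }
    return strategies[(dataset_size, similarity)]


for size in ("small", "large"):
    for sim in ("similar", "different"):
        print(size, sim, "->", transfer_strategy(size, sim))
```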
What & Why Deep Learning Pipelines (DLP)?
• Open source from Databricks, 2017
• Current v1.0 APIs w/ Apache Spark 2.3
• Primarily in Python
• Ease of Use & Integration
• Spark MLlib Pipelines & DataFrames
• TensorFlow & Keras
• SQL
– Deploying & Evaluating
• Distributed Hyperparameter Tuning
• Easy for Transfer Learning
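Hyperparameter tuning parallelizes well because each candidate setting can be evaluated independently. The sketch below illustrates that idea with local threads only; it does not use Spark or Deep Learning Pipelines, and `evaluate` is a hypothetical scoring function standing in for "train a model and return its validation score". On a cluster, the same grid would be spread across workers instead of threads.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(params):
    """Hypothetical stand-in for training a model and returning a score.
    By construction it peaks at learning_rate=0.01, batch_size=64."""
    learning_rate, batch_size = params
    return -abs(learning_rate - 0.01) * 100 - abs(batch_size - 64) / 64


# Every combination of the candidate values forms one grid point.
grid = list(product([0.001, 0.01, 0.1], [32, 64, 128]))

# Grid points are independent, so they can be evaluated in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(evaluate, grid))

# Keep the parameters with the highest score.
best_score, best_params = max(zip(scores, grid))

print(best_params)
```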
DEMO
https://dbricks.co/dlf_sai_2018
Takeaways: Which One & What Language?
Takeaways: When to Use TF, Keras or DLP
TensorFlow
• Low-level APIs & Control
• Visualize with TensorBoard
• Train Models or Transfer Learning
• Model Serving
Keras
• High-level APIs
• TensorFlow Backend
• Love Python
• Train models or transfer learning
Deep Learning Pipelines
• Integration with Spark MLlib Pipelines & DataFrames
• Integrated with TF & Keras
• Transfer Learning
Resources
Blog posts, talks, & webinars (http://databricks.com/blog)
• Deep Learning Pipelines
• GPU acceleration in Databricks
• Deep Learning and Apache Spark
• Build Scalable Deep Learning Pipelines
• Deep Learning course: fast.ai
• TensorFlow Tutorials
• TensorFlow Dev Summit
• Keras/TensorFlow Tutorials
• MLflow.org
Docs for Deep Learning on Databricks (http://docs.databricks.com)
• Deep Learning Pipelines Example
• Apache Spark integration
Thank You!
Questions?
brooke@databricks.com
jules@databricks.com (@2twitme)

Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learning by Jules Damji
