Spark Summit: June 2014

This document discusses Spark Summit, an event held in June 2014 about Apache Spark and Databricks. It summarizes Databricks' offerings, including its cloud platform, Apache Spark integration, notebooks, dashboards, jobs, and vision to make big data easy to use. The platform aims to simplify typical big data challenges around setting up clusters, integrating tools, and data analysis/product development.

Spark Summit

June 2014
Apache Spark and Databricks
Adoption
All major Hadoop distributions include Spark





Beyond Hadoop
Partnerships

Partner with Spark distributors to provide a great experience to every Spark user

Partners




Certification
Build a strong application ecosystem

[Diagram: Spark Apps (App Cert), Spark API, Spark Distros (Distros Cert)]

Certification

Free certification process

Scripts for certifying Spark distributions


•  Developed by community
•  Open-source

Anyone will be able to certify any Spark distribution



Training

We’ve been teaching Spark since 2012
•  400+ people this year through Databricks

Just launched a new training program


•  Already held workshops in 5 cities

300+ people signed up for training on Wednesday


Solve Big Data Challenges
Big Promise
Great successes using Big Data







Every organization collects data (your company here!)


Big Challenge
Great successes using Big Data



Google and Facebook spend billions of dollars to develop,
implement, and run data analysis tools and products

Every organization collects data (your company here!)


Typical Story
Your company starts a Big Data initiative

You are tasked to…

1) Build a Hadoop cluster (IT)
   Challenge: clusters are hard to set up and manage

2) Build a data pipeline (engineers, data scientists)
   Challenge: need to integrate a zoo of tools

3) Get insights & build data products (engineers, data scientists, analysts)
   Challenge: tools are hard to use

Typical Data Pipeline

[Diagram: pipeline stages over Data: ETL, Exploration, Dashboards & Reports, Advanced Analytics, Data Products]

Integrate disparate, clunky tools

Hard to navigate data, develop and deploy apps

Vision

Make big data easy


From Challenges to Solutions

Challenges                               Solutions
Clusters hard to set up and manage       Hosted platform
Need to integrate a zoo of tools         Apache Spark
Tools are hard to use                    Interactive Workspace


Databricks Cloud

[Diagram: Databricks Cloud, made up of Databricks Workspace and Databricks Platform]

Databricks Platform

Start clusters in seconds


Zero-cost management
Dynamically scale up & down
Apache Spark

Unifies
•  Streaming
•  SQL
•  Machine learning
•  Graphs

Single system, single API

[Diagram: Databricks Workspace and Databricks Platform]

Databricks Workspace

[Diagram: Databricks Workspace and Databricks Platform; components shown: Notebooks, Dashboards, Jobs, Apps]

Notebooks

Support Python, SQL, Scala



Interactive commands & plots

Online collaboration
Dashboards

WYSIWYG builder

Interactive plots

One-click publishing
Job Launcher

Run arbitrary Spark jobs, programmatically


Dramatically Simplify Data Pipeline

[Diagram: Cloud, with ETL, Exploration, Advanced Analytics, Dashboards & Reports, and Data Products, over Data]

Free users to focus on finding answers & building products
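
As a rough illustration of the simplified pipeline, the sketch below runs ETL, exploration, reporting, and a stand-in for advanced analytics over one shared set of records in plain Python. It is not Spark or Databricks code. The sample events, field names, and function names (etl, explore, report, analyze) are all hypothetical, chosen only to show each stage reading the same parsed data instead of passing it between separate tools.

```python
# Hypothetical sketch: every name here is illustrative, not a Spark or Databricks API.
from collections import defaultdict
from statistics import mean

# Made-up raw input: date, user, action, amount (one line is deliberately malformed).
RAW_EVENTS = [
    "2014-06-30,alice,purchase,30.0",
    "2014-06-30,bob,view,0.0",
    "2014-07-01,alice,purchase,12.5",
    "malformed line",
    "2014-07-01,carol,purchase,20.0",
]


def etl(lines):
    """ETL: parse raw lines into records and drop malformed ones."""
    records = []
    for line in lines:
        parts = line.split(",")
        if len(parts) != 4:
            continue  # skip lines that do not have exactly four fields
        day, user, action, amount = parts
        records.append(
            {"day": day, "user": user, "action": action, "amount": float(amount)}
        )
    return records


def explore(records):
    """Exploration: count events per action type."""
    counts = defaultdict(int)
    for record in records:
        counts[record["action"]] += 1
    return dict(counts)


def report(records):
    """Dashboards & reports: total purchase revenue per day."""
    revenue = defaultdict(float)
    for record in records:
        if record["action"] == "purchase":
            revenue[record["day"]] += record["amount"]
    return dict(revenue)


def analyze(records):
    """Advanced analytics (stand-in): average purchase amount per user."""
    per_user = defaultdict(list)
    for record in records:
        if record["action"] == "purchase":
            per_user[record["user"]].append(record["amount"])
    return {user: mean(amounts) for user, amounts in per_user.items()}


# Every stage consumes the same parsed records.
records = etl(RAW_EVENTS)
action_counts = explore(records)
daily_revenue = report(records)
avg_purchase = analyze(records)
```

In this toy version, changing the parsing logic in etl is immediately visible to every later stage, which is the kind of friction reduction the slide describes.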
Demo
Availability
Started closed beta program earlier this year

Limited availability soon


•  Gradually ramping up
•  Sign up on databricks.com!

3rd Party Apps

[Diagram: Databricks Workspace and … Apps on Databricks Platform]

Databricks Cloud and Spark

Databricks Cloud runs 100% Apache Spark


•  No lock-in: any Databricks Cloud app runs on any certified Spark distribution

Databricks Cloud accelerates Spark adoption


•  Provides the easiest way to learn and use Apache Spark

Databricks Cloud

Dramatically simplify
•  analyzing big data
•  building data products

Fuel growth of Spark ecosystem

[Diagram: Databricks Workspace and Databricks Platform]


Make big data easy

Thank You!
