Big Data Analysis
Ian brings over 20 years of experience in the business analytics software market
with roles spanning consulting services, pre-sales engineering, product
management and product marketing. Ian started his career by co-founding a
business intelligence startup and has worked at Business Objects, Informix,
Epiphany, PeopleSoft and Jaspersoft.
© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
The Value of Big Data for our Customers
Big opportunities
Transactional
• Fraud detection
• Financial services / stock markets
Sub-Transactional
• Weblogs
• Social/online media
• Telecoms events
Non-Transactional
• Web pages, blogs etc.
• Documents
• Physical events
• Application events
• Machine events
Click Stream Analytics
From buying patterns to revenue
Business Challenge
• Monetize buying patterns hidden in billions of
data points
• Quickly analyze multi-channel click stream data
Pentaho Benefits
• Reduced ETL time to analyze blended data
from Hadoop, HBase & data warehouse
• Use of big data analytics to grow revenue from
targeted campaigns
Business Challenge
• Affordably scale machine data from storage
devices for customer support app
• Predict device failure
• Enhance product performance
Pentaho Benefits
• Easy-to-use ETL & analysis for Hadoop, HBase
& Oracle data sources
• 15x cost improvement
• Stronger performance against customer SLAs
Business Challenge
• Gain new revenue source from add-on
module with reporting, analysis & dashboards
• Get to market fast to differentiate
Pentaho Benefits
• Easy to embed & brand
• Broad capabilities result in new revenue stream
• Increased functionality & compelling
visualizations
Continued Leadership:
• Cloud & multi-tenancy ease-of-use
• Simplified REST services for ISVs
• BI Platform SDK enhancements – deep
solution examples, tutorials and training
• Continued focus on standards and
extensibility
The Current Solutions
[Chart, 2005 to 2015: vertical axis 0 to 10,000; surviving label fragments "structured data." and "10%"]
• Apache Pig: High-level language for expressing data analysis programs
• Apache Hive: SQL-like language and metadata repository
• Apache HBase: The Hadoop database. Random, real-time read/write access
• Hue: Browser-based desktop interface for interacting with Hadoop
• Apache ZooKeeper: Highly reliable distributed coordination service
• Oozie: Server-based workflow engine for Hadoop activities
• Flume: Distributed service for collecting and aggregating log and event data
• Sqoop: Integrating Hadoop with RDBMS
• Apache Whirr: Library for running Hadoop in the cloud
ETL Developer
1. Somewhat immature
2. Lack of tooling
3. Steep technical learning curve
4. Hiring qualified people
5. Availability of enterprise-ready products and tools
6. High latency (Hadoop)
7. Running inside the cluster
Scheduling
Modeling
Questions to Ask
Business Drivers
1. Mandate to reduce EDW costs?
2. Clear use case that you need to solve?
3. Do you have access to the necessary technical skill set?
Technical
1. Do you have more than one kind of big data store, for example Hadoop as well as HBase,
MongoDB or Cassandra?
2. Would you prefer to use the same tool for big data stores in addition to your traditional relational
data stores?
3. Are you OK waiting minutes or even hours to access your big data?
4. Are you OK using a spreadsheet-like interface to access and analyze your data?
5. Do you need complete BI capabilities, including reporting, interactive visualization, and predictive
analytics?
6. Do you need to enrich your big data with data from outside of the big data platform?
7. Is the big data you want to analyze bigger than the amount of memory you have available?
http://blog.pentaho.com/tag/ian-fyfe/
Complete Big Data Analytics &
Visual Data Management
Hadoop, NoSQL, Analytic Databases, Relational
Discussion
blog.pentaho.com Facebook.com/Pentaho