SOLUTION BRIEF
CONNECT
Informatica
Industry: Data Integration
Website: www.informatica.com
Company Overview
Informatica provides data integration software
and services that enable organizations to gain
a competitive advantage in today's global
information economy by empowering them
with timely, relevant, and trustworthy data for
their top business imperatives.
Product Overview
The Informatica platform on Cloudera's
enterprise data hub increases productivity
up to 5 times with the ability to access
all types of data from modern and
legacy systems at any latency, process
and integrate data at scale, and deliver
actionable information to business
stakeholders, customers, and partners.
Cloudera and Informatica Unleash the
Power of Hadoop
Big Data and Hadoop for Everyone
Organizations increasingly recognize the potential of big data to transform their
business by improving customer retention and acquisition, increasing operational
efficiencies, enabling better products and service delivery, and generating new business
insights. An analytics-ready Hadoop-based enterprise data hub platform and advanced
data integration are critical technologies to take full advantage of big data.
With Cloudera and Informatica, enterprises have a proven solution and services to
maximize their big data returns by successfully leveraging Hadoop as a key component
of their overall data management infrastructure. One of the biggest challenges associated
with big data projects is a shortage of skilled resources. Informatica and Cloudera address
this challenge, increasing productivity up to 5 times with readily available trained
developers. The combined solution provides a visual development environment and
administration tools, reference architectures, services and training, and over 100,000
readily available trained developers and consultants around the world.
Most Hadoop workloads involve ETL, data integration, and data quality; therefore,
choosing the right platform and tools is important to the success of your data
management initiatives. Informatica has a comprehensive and unified data integration
platform that runs natively on Cloudera's enterprise data hub, so you can easily access all
types of data; ingest data at any latency; profile and discover data domains to understand
data on Hadoop; parse, integrate, and cleanse data on Hadoop using a visual development
environment; and deliver data as actionable information to business stakeholders,
customers, and partners.
Solution Highlights
Increase developer productivity on
Hadoop up to 5 times
Transform and cleanse data on Hadoop
with a visual development environment
Access all types of data (e.g., RDBMS,
mainframe, ERP, social data, and machine
data)
Profile and discover data domains and
their relationships
Parse all types of data to extract
features from unstructured and complex
structured formats
Ingest data into a Hadoop-based EDH
through batch, replication, streaming, and
archiving
[Figure: Informatica capabilities (Profile, Parse, ETL, Cleanse, Match) powered by Cloudera's
enterprise data hub. Sources: Documents and Emails; Machine Device, Cloud; Social Media,
Web Logs; Relational, Mainframe. Ingestion paths: Load, Replicate, Stream, Archive.
Hub layers: Process (Ingest: Sqoop, Flume, Kafka; Transform: MapReduce, Hive, Pig, Spark),
Discover (Analytic Database: Impala; Search: Solr), Model (Machine Learning: SAS, R, Spark,
Mahout), Serve (NoSQL Database: HBase; Streaming: Spark Streaming), Security and
Administration (YARN, Cloudera Manager, Cloudera Navigator), Unlimited Storage (HDFS, HBase).
Delivery: Data Warehouse, Services, Events, Topics. Consumers: Analytics Teams,
Analytics & Op Dashboards, Mobile Apps, Alerts.]
Cloudera Enterprise Benefits
Stores and Analyzes Any Type of Data
Leverage the full power of your data to
achieve pervasive analytics, increase
business visibility, and reduce costs
Bring diverse users and application
workloads to a single, unified pool of
data on common infrastructure; no data
movement required
Enterprise Approach
Compliance-ready perimeter security,
authentication, granular authorization,
and data protection through encryption
and key management
Enterprise-grade data auditing, data
lineage, and data discovery
Industry-Leading Management and Support
Best-in-class holistic interface that
provides end-to-end system management
and zero-downtime rolling upgrades
Open platform ensures easy integration
with existing systems
Open source to achieve stability,
continuous innovation, and portability
Benefits of Informatica
Lower Cost, Minimize Risk
Staff projects with readily available,
trained resources
Develop and Innovate Faster
Develop on Hadoop up to 5 times faster
with a visual development environment
and pre-built transforms
Onboard all types of data at any latency
and discover insights through profiling
and rapid prototyping
Comprehensive
Provides data management infrastructure
for data integration, data governance,
information lifecycle management, and
more
Turn Big Data into Actionable Information
Informatica and Cloudera together deliver a proven solution that simplifies data
integration and data quality on a Hadoop-based enterprise data hub, eliminating the need
to write mountains of code or the need for expert knowledge of Hadoop and the data
source and target systems. The joint solution enables organizations to:
Access all types of data including relational databases, legacy mainframes, enterprise
applications such as ERP and CRM, cloud applications, social data, machine data, and
industry standards data
Automatically ingest data into an enterprise data hub through batch, high-speed data
replication, real-time streaming, and highly compressed archiving
Parse complex formatted data sets to extract features using a visual development
environment and pre-built parsers
Design and execute data pipelines on Hadoop for ETL, data integration, and data
quality using a visual development environment with pre-built transforms and rules so
you can spend more time analyzing and visualizing data to uncover insights, patterns,
and trends
Profile and discover data domains and their relationships in a Hadoop-based
enterprise data hub so you can better understand datasets and identify sensitive
data that you can secure and mask for regulatory compliance
Deliver big data insights as actionable information to business users, customers, and
partners with batch loads to the data warehouse, data services for applications,
event-based processing for real-time notifications and alerts, and subscription to
topics of interest
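
As a rough illustration of the profiling and masking capability listed above, the following
minimal Python sketch guesses a data domain for each column from pattern matches and then
masks the columns it flags as sensitive. It is not Informatica's API or method; the pattern
set, the 0.8 match threshold, the function names, and the sample records are all assumptions
made for illustration.

```python
import re

# Hypothetical domain patterns (assumption); a real profiler would use far richer rules.
DOMAIN_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_phone": re.compile(r"^\d{3}-\d{3}-\d{4}$"),
}


def discover_domains(rows, threshold=0.8):
    """Return {column: domain} for columns whose non-empty values mostly match a pattern."""
    domains = {}
    columns = list(rows[0].keys()) if rows else []
    for column in columns:
        values = [str(row[column]) for row in rows if row.get(column)]
        if not values:
            continue
        for domain, pattern in DOMAIN_PATTERNS.items():
            matches = sum(1 for value in values if pattern.match(value))
            if matches / len(values) >= threshold:
                domains[column] = domain
                break
    return domains


def mask_value(value):
    """Replace all but the last two characters with asterisks."""
    text = str(value)
    return "*" * max(len(text) - 2, 0) + text[-2:]


def mask_sensitive(rows, domains):
    """Return copies of the rows with every discovered sensitive column masked."""
    return [
        {col: mask_value(val) if col in domains and val else val
         for col, val in row.items()}
        for row in rows
    ]


# Sample records (assumption, for illustration only).
records = [
    {"name": "Ada", "contact": "ada@example.com", "phone": "555-010-1234"},
    {"name": "Bob", "contact": "bob@example.org", "phone": "555-010-5678"},
]
domains = discover_domains(records)
masked = mask_sensitive(records, domains)
```

Here the `contact` and `phone` columns are flagged and masked while `name` is left untouched.
The threshold tolerates a share of dirty values in a column, which is typical of raw data
landed on Hadoop.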
About Informatica
Informatica Corporation (Nasdaq:INFA) is the world's number one independent provider
of data integration software. Organizations around the world rely on Informatica to
realize their information potential and drive top business imperatives. Informatica Vibe,
the industry's first and only embeddable virtual data machine (VDM), powers the unique
"Map Once. Deploy Anywhere." capabilities of the Informatica Platform. Worldwide, over
5,000 enterprises depend on Informatica to fully leverage their information assets from
devices to mobile to social to big data residing on-premise, in the Cloud, and across social
networks.
About Cloudera
Cloudera is revolutionizing enterprise data management by offering the first unified
Platform for Big Data, an enterprise data hub built on Apache Hadoop. Cloudera offers
enterprises one place to store, process and analyze all their data, empowering them
to extend the value of existing investments while enabling fundamental new ways to
derive value from their data. Only Cloudera offers everything needed on a journey to
an enterprise data hub, including software for business critical data challenges such as
storage, access, management, analysis, security and search. As the leading educator of
Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,400
partners and a seasoned professional services team help deliver faster time to value.
Finally, only Cloudera provides proactive and predictive support to run an enterprise
data hub with confidence. Leading organizations in every industry plus top public sector
organizations globally run Cloudera in production. www.cloudera.com
cloudera.com
1-888-789-1488 or 1-650-362-0488
Cloudera, Inc. 1001 Page Mill Road, Palo Alto, CA 94304, USA
© 2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera, Inc. in the USA
and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.
cloudera-solutionbrief-informatica-102