1
Enabling Data as a Service
with
JBoss Data Services

Prajod Vettiyattil
Twitter: @prajods

Gnanaguru Sattanathan
Twitter: @gnanagurus
Website: bushorn.com



2
What this session is about


The why and what of data services
How data services work
Use cases
JBoss Data Services Platform



3
Why



4
Proliferation of data
Data Consumers
• Customer portal, Employee portal, ERP, CRM, Accounting, Billing
• Finance, Marketing, Sales, Partner Management, Vendor Management

Data Sources and Data Managers
• SQL, File, Mainframe, NoSQL, Email, Content Management System, ERP
  5
Proliferation: so what?
• Multiplicity of connections
    – High development cost
    – Huge operational overhead
    – Difficult and risky to change Data Sources/Managers
• Dispersed data connectors
• Data duplication
    – Too much ETL
    – Lines of Business copy data
• Duplicated data aggregation
• Impossible to create a “Single source of truth”
• Data ownership issues
• No comprehensive view
    – No data movement dashboards
    – No visibility into the location of data and its status

6
What



7
Data Services and DSP
The basic view
• DSP = Data Services Platform
• Abstracts the data managers/sources
• Presents the data as a service to the consumer
• ETL++

[Diagram, top to bottom]
• Data Consumers: C1, C2, C3, C4, C5, C6
• Data Services Platform: Data Service 1, Data Service 2, Data Service 3, Data Service 4
• Data Managers: D1, D2, D3, D4, D5, D6
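The basic view above can be sketched in a few lines of Python. This is only an illustration of the idea, not JBoss code: the class `DataServicesPlatform`, the functions `crm_source`, `billing_source` and `customer_balances`, and all sample data are hypothetical names invented for this sketch. Consumers call a named data service and never touch the data managers directly.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for two data managers (D1 and D2 in the diagram).
def crm_source() -> List[dict]:
    return [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ben"}]

def billing_source() -> List[dict]:
    return [{"customer_id": 1, "balance": 120.0},
            {"customer_id": 2, "balance": 0.0}]

class DataServicesPlatform:
    """Registry of named data services; hides the sources behind them."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[[], List[dict]]] = {}

    def register(self, name: str, service: Callable[[], List[dict]]) -> None:
        self._services[name] = service

    def call(self, name: str) -> List[dict]:
        # Consumers only know the service name, not where the data lives.
        return self._services[name]()

def customer_balances() -> List[dict]:
    # A data service that combines two data managers into one result.
    balances = {row["customer_id"]: row["balance"] for row in billing_source()}
    return [{"name": c["name"], "balance": balances.get(c["id"], 0.0)}
            for c in crm_source()]

dsp = DataServicesPlatform()
dsp.register("customer_balances", customer_balances)
result = dsp.call("customer_balances")
```

If the billing data later moves to another data manager, only `customer_balances` changes; the consumers keep calling the same service name.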
8
Dashboard in a DSP
Data Dashboard
• Data Connections
• Data movement status
• Errors
• Error Corrections
• Failures
• Alerts




9
How it works



10
Features of a DSP
• Enables architecture principles
     – Separation of concerns
     – Protected variations
• Data adapters
• Data mapping tools and standards
• Data caching
     – Local and distributed
•    Service search and reuse
•    Data security and data usage audit
•    Data access control
•    Central channel for all data requirements
•    Data dashboard
•    Configurable performance and reliability
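Two of the features listed above, local data caching and data usage audit, can be sketched together. The following is a minimal illustration under stated assumptions, not how any particular DSP implements them: `CachedAuditedService`, `slow_lookup`, the 60 second time-to-live and the sample part data are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

class CachedAuditedService:
    """Wraps a data fetch with a local TTL cache and a usage audit trail."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float = 60.0):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[float, Any]] = {}
        # Each audit entry is (user, key, served_from_cache).
        self.audit_log: List[Tuple[str, str, bool]] = []

    def get(self, user: str, key: str) -> Any:
        now = time.monotonic()
        entry = self._cache.get(key)
        hit = entry is not None and now - entry[0] < self._ttl
        if hit:
            value = entry[1]
        else:
            # Cache miss or expired entry: go to the data source.
            value = self._fetch(key)
            self._cache[key] = (now, value)
        self.audit_log.append((user, key, hit))
        return value

calls: List[str] = []

def slow_lookup(part_no: str) -> dict:
    # Hypothetical expensive call to a data source.
    calls.append(part_no)
    return {"part_no": part_no, "price": 42.0}

service = CachedAuditedService(slow_lookup)
first = service.get("alice", "P-100")
second = service.get("bob", "P-100")
```

The second request is served from the cache, so the source is hit only once, while the audit log still records both consumers.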
11
Use cases



12
Auto manufacturing supply chain:
                         Requirements
•    Vehicle ownership experience
•    Business Process Automation
•    Disparate data sources
•    Multiple data feeds
     – Parts catalog
     – Prices
• Dealer updates
     – Parts consumed
     – Parts replaced
     – Part failure statistics
• Customer feedback
     – Post purchase
     – Breakdown support
     – Service Quality Dashboards
• Integration solutions based on batch transfers
     – Unreliable
     – Not traceable
13
Auto manufacturing supply chain:
                        Layer Diagram
[Layers, top to bottom]
• Customer Experience Dashboards, Business Activity Monitoring, Business Processes
• Enterprise Service Bus
• Data Services Platform
• Data sources: Breakdown reports, Parts supplier feeds, Customer feedback, Parts Catalog, Dealer feeds, Dealer Info, Customer Master


14
Enterprise Data Access Layer:
               Requirements
• Golden copy / System of Record / Single source of
  truth
• Shared services team for Enterprise Data
  Management
• Data usage audit
• Data access control
• Reduce request load on Data Management team
• Reduce data maintenance costs




15
Enterprise Data Access Layer:
                      Layer Diagram
[Layers, top to bottom]
• Enterprise Data Consumers
• Data Services
• Data Services Platform: Virtual DB, Metadata, Database drivers, Auditing, Data Aggregation, Data Access Control
• Data sources: Mainframe, Partner Data, Content Management System, Partner Info, Customer Master, Employee Info

16
Reporting risk for securities:
                   Requirements
• Internal and external reporting
     – Risk and margin
• Centralized risk capture and management
• Calculate risk from different customer activities
• Report consolidated data to comply with regulation
     – Dodd-Frank
     – Sarbanes-Oxley Act (SOX)
• Dashboards for higher management




17
Reporting risk for securities:
             Architecture without DSP
[Systems in the diagram, row by row]
• COTS Trading Systems, Customer facing Apps, Partner Apps, Government Systems, Reporting Applications
• Order Mgmt, Execution Mgmt, Liquidity Mgmt, Position Mgmt, Order Book
• Price feeds, Enterprise Middleware Systems (MQ, ESB, FTP, File shares), Trade feeds
• Margin Mgmt, Trade Matching, Clearing, Ref Data Feeds, Payment Systems
• Custom built Apps, Risk Management, Settlement, Ref Data Mgmt, Accounting


18
Reporting risk for securities:
      Patterns in this requirement
• Regulatory requirement for transparency
     – Cannot be met by opaque internal systems
• Data Sources
     – Large number of them
     – Internal and external
• Reports are read heavy
• No real time data requirements
     – Reports run once a quarter or once a year
• No excuses for incorrect data in reports
• Non-discretionary spending



19
JBoss Data Services
          Platform



20
Architecture
• The EDS platform
     – v5 runs on SOA-P
• Teiid
• ModeShape
• Parts of the architecture
     – Data interfaces
     – Data adapters
     – Data virtualization
     – Metadata repository

[Diagram, top to bottom]
• Data consumers (Custom Applications, COTS products, Business Processes, Business Services)
• Data interfaces (JCR API, Web service, JDBC, ODBC, OData, ...)
• Data virtualization and Metadata repository
• Data Adapters
• SOA Platform, Data Services Platform
• Data Sources: SAP, Sybase, Flat file, XML, Salesforce, Oracle, Cassandra, MongoDB
         21
Data sources
• Oracle DB, IBM DB2, MS SQL Server, MySQL, PostgreSQL, Sybase
• Greenplum, Teradata, Netezza, Ingres, Mondrian, MetaMatrix
• LDAP, Salesforce, Delimited file, XML file, Web services, Apache Hive
• MS Excel, MS Access, JBoss Messaging, JBoss HornetQ, TIBCO, IBM MQ

22
Data Mapping
• Teiid Designer
   – Map actual data tables using transforms to virtual
     tables
   – Model Driven Development (MDD): use data models, not SQL
   – Semantic mapping
   – Virtual procedures
      • A set of SQL statements, similar to DB stored procedures
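The idea of mapping actual tables to a virtual table through a transform can be illustrated with an ordinary SQL view. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in; it is not Teiid or Teiid Designer, and the table names, column names and data are hypothetical. In a real deployment the two physical tables would typically live in different data sources.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two physical tables standing in for separate data sources.
    CREATE TABLE crm_customers (cust_id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE billing_accounts (cust_id INTEGER, balance_cents INTEGER);
    INSERT INTO crm_customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO billing_accounts VALUES (1, 12000), (2, 0);

    -- The virtual table: a transform that renames columns, converts units
    -- and joins the physical tables.
    CREATE VIEW customer_balance AS
    SELECT c.cust_id AS id,
           c.full_name AS name,
           b.balance_cents / 100.0 AS balance
    FROM crm_customers c
    JOIN billing_accounts b ON b.cust_id = c.cust_id;
""")

# Consumers query the virtual table only.
rows = conn.execute(
    "SELECT name, balance FROM customer_balance ORDER BY id"
).fetchall()
```

Consumers see `customer_balance` with its own column names and units; the physical layout can change as long as the transform is updated.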




23
Data Standards
• JCR
     – Java Content Repository (JSR-283)
• OData
     – Open Data Protocol
• JDBC
• ODBC
• Others
     – S-RAMP: SOA Repository Artifact Model and Protocol, an OASIS SOA repository spec
     – Web Services
     – REST
     – JMS


24
Access control and Audit
• Teiid
     – passwords
     – MembershipDomains for authentication
     – Data roles
        • Fine grained access and visibility control of tables
     – CRUD level permissions for VDB
     – LDAP integration
• ModeShape
     – LoginContext
     – AuthenticationProvider
     – Role to Action mapping
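Data roles with CRUD level permissions and a role to action mapping both come down to a lookup from role to permitted actions. The sketch below shows that lookup in plain Python; it does not use the Teiid or ModeShape APIs, and the role names, the table name and the `is_allowed` helper are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical data roles: role -> table -> allowed CRUD actions.
DATA_ROLES: Dict[str, Dict[str, Set[str]]] = {
    "analyst": {"customer_balance": {"READ"}},
    "billing_admin": {
        "customer_balance": {"CREATE", "READ", "UPDATE", "DELETE"},
    },
}

def is_allowed(roles: Iterable[str], table: str, action: str) -> bool:
    """Return True if any of the user's roles permits the action on the table."""
    return any(
        action in DATA_ROLES.get(role, {}).get(table, set())
        for role in roles
    )

can_read = is_allowed(["analyst"], "customer_balance", "READ")
can_update = is_allowed(["analyst"], "customer_balance", "UPDATE")
```

Unknown roles and unknown tables fall through to an empty permission set, so access is denied by default.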

25
Teiid and ModeShape
 Feature                     Teiid          ModeShape
 Approach                    Relational     Hierarchical
 Metadata repository         Not suitable   Yes
 Content repository          Not suitable   Yes
 ACID transactions           Yes            Yes
 SQL queries                 Yes            Yes (JCR-SQL)
 Flat file data source       Yes            Not suitable
 Relational DB data source   Yes            Not suitable
 Schema                      Fixed          Optional
 NoSQL data sources          Not suitable   Yes
 Stores data                 No             Yes

26
Summary
• Data Services
     – Why
     – What
     – How
• Use cases
     – Auto Manufacturer
     – Enterprise Data Access Layer
     – Regulatory Reporting
• JBoss DSP
     – Data virtualization
     – Teiid
     – ModeShape
27
Questions

Prajod Vettiyattil
Twitter: @prajods

Gnanaguru Sattanathan
Twitter: @gnanagurus
Website: bushorn.com

Our Open Source Middleware Group on LinkedIn
http://tinyurl.com/be6e93q




28

