Managing Performance and Availability for
25,000 Siebel Contact Center Users with
Oracle Real Application Clusters
Roland Höhl, Perot Systems / Deutsche
Telekom AG
30th of July, 2009
RAC SIG Webseminar
roland.hoehl@ps.net
roland.hoehl@telekom.de
2
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
3
Who am I / Biography Roland Höhl
 46 years old
 Studied at university in Darmstadt, Germany
 Perot Systems employee
Manager Application Development
 Working for Deutsche Telekom as
external consultant in the CRM-T
project since 2006
 In the Deutsche Telekom CRM project responsible for
 technical architecture
 infrastructure sizing and planning
 release management
4
Company overview – Deutsche Telekom
 Internationalization:
Deutsche Telekom is represented in ~ 50 countries
worldwide.
 Employees:
Approx. 242,000 employees worldwide (December 31,
2007)
 Revenue:
62.5 billion euros (2007 financial year)
 T-Home figures
 37 million fixed-line users (PSTN)
 No. 1 in the German DSL market
(12.5 million customers in total)
 150,000 new IPTV (Entertain) customers in 2007
 1,800 sales outlets (biggest integrated sales and
service organization for fixed-line and mobile products)
5
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
6
Facts and figures
 Project scope
 Siebel upgrade from 7.0 to 7.8.2.5
 Change from DB2 on the host (OS/390) to Oracle 10g RAC on Unix
 Reimplementation of customized Siebel business processes
 Data migration from DB2 to Oracle 10g (based on EIM load)
 Building an ESB around Siebel CRM (based on MQSeries)
 Project staffing:
 Oracle – Software vendor (Siebel / Database)
 IBM Global Services – Development Siebel / EAI
 T-Systems - Operations and hosting
 Duration:
 Project start: 11/2005
 First release: 04/2008
 Involved employees: ~800
7
Project Goals
Strategic Goals
 Improve customer satisfaction and raise call center efficiency
 Reduce the time to develop new products and services
 Reduce costs for operations, maintenance and enhancements
 Standardization, risk mitigation and protection of investment
Operative Goals
 Functional scope identical to CRM-C V2.130 (KM, KKM, OM)
 Upgrade to Siebel standard software 7.8
 Align and reduce the application to the Siebel standard
 Reduce interface complexity (service-oriented integration platform)
 Hardware change; change of database platform if applicable
 Master function of CRM-T for customer and contract data (end of parallel operation, KoSi)
 Improved application operation, incident and problem management
8
Project Business Drivers
 Advantages for the customers:
 Improved customer service
 Increased customer satisfaction
 Advantages for the employees:
 Enhanced graphical user interface
 Simplification of IT systems
 Single, complete view of customer information
 Transparency of processes
 Decreased administrative complexity
 Advantages for Deutsche Telekom:
 Simplification of IT processes
 Improvement of IT stability
 Increased flexibility through open standards
 Retirement of legacy systems
 Savings achieved in operations, support and development
 First milestone in the redesign of the IT architecture
9
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
10
Deutsche Telekom – T-Home - CRM Landscape
(Diagram) CRM landscape across 2 data centers: call centers, T-Shops and partners connect via the T-ESB to the central CRM system, which in turn is connected to the legacy systems and the data warehouse.
11
Scope and usage
 89 call centers within Germany
 20,000 named call center users
 7,000 named T-Shop and partner users
 80 million accounts
 up to 40,000 customer contacts per day
(inbound items, i.e. calls, mails, documents)
 up to ~10,000 orders / hour
 ~19 TB of data in the database
 ~40,000,000 user requests / hour
 ~50,000,000 disk reads / hour
 up to 7,000,000,000 buffer gets / hour
 42 legacy systems connected
 several million messages transferred per day
12
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
13
Infrastructure Architecture
14
Web Server Infrastructure
Software Configuration
• AIX 5.3 ML8
• IBM IHS 2.0
• Siebel Web Server Extension 7.8.2.5
Hardware Configuration
• 12 x IBM P5 9133-55A
•4 x 1.65 GHz P5 CPUs
•24 GB Memory
• CISCO CSS 1150
•Frontend and Backend Load balancer
15
Software Configuration
• AIX 5.3 ML8
• Siebel 7.8.2.5
• WebSphere MQ V6 (on EAI Servers only)
• IBM HACMP (Cluster) on EAI Servers
• Genesys CTI Driver
Siebel Application Infrastructure
Hardware Configuration
User-related Servers
• 6 x IBM P595 (eCommunications)
•48 x 5 GHz P5 CPUs
•336 GB Memory
• 3 x IBM P595 (eCommunications)
•32 x 2.1 GHz P5 CPUs
•128 GB Memory
• 2 x IBM P590 (Siebel eConfigurator)
EAI Server
• 8 x IBM P590 (Active)
•8 x 2.1 GHz CPUs
•24 GB Memory
• 8 x IBM P590 (passive – Virtual Servers)
•0.1 x 2.1 GHz CPUs
•24 GB Memory
Batch/Read Audit Servers
• 4 x IBM P590 (active + passive)
•8 x 2.1 GHz CPUs
•32 GB Memory
16
Oracle RAC Infrastructure
Software Configuration
• AIX 5.3 ML8
• ORACLE 10.2.0.3
• ASM, Streams
Hardware Configuration
Oracle RAC Nodes
• 4 x IBM P595
•36 x 2.1 GHz P5 CPUs
•312 GB Memory
SAN Storage 160 TB
• 10 x IBM DS8300
•RAID 10
•16 x 146 GB Disks
•2 x 2 GB/sec FC
Solid State Disks
• 2 x Texas Memory Systems RamSan-500
•2 TB Flash RAID
•2 GB/sec Bandwidth
17
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
18
Session Management for Performance, Availability,
and Scalability
Session Handling
• Up to 20,000 user sessions plus several hundred sessions
from EAI components are handled on the Siebel
system
• 2 options to handle them on the database
•Oracle MTS (shared server)
•Siebel connection pooling
19
Session Management for Performance, Availability,
and Scalability
Session Management
Siebel connection pooling with a 5:1 ratio for user sessions
• Sessions are connected via DB services
•Up to now, every session connects via the same
service
•Multiple services allow load to be separated,
distributed and prioritized across the different
RAC nodes (see the sketch below)
(Diagram: Siebel connection pooling)
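A purely illustrative sketch of how such a dedicated database service could be defined at the SQL level with the DBMS_SERVICE package (service name, TAF settings and load-balancing goals below are assumptions, not the project's actual configuration; on RAC the service would normally be created and assigned to preferred/available instances with srvctl or Enterprise Manager so that the clusterware manages it):

-- Hypothetical service for call center user sessions
BEGIN
  DBMS_SERVICE.CREATE_SERVICE(
    service_name     => 'CRM_CC',
    network_name     => 'CRM_CC',
    goal             => DBMS_SERVICE.GOAL_SERVICE_TIME,      -- runtime load balancing by response time
    clb_goal         => DBMS_SERVICE.CLB_GOAL_SHORT,         -- connection load balancing
    failover_method  => DBMS_SERVICE.FAILOVER_METHOD_BASIC,  -- TAF settings for the service
    failover_type    => DBMS_SERVICE.FAILOVER_TYPE_SELECT,
    failover_retries => 30,
    failover_delay   => 5);
  DBMS_SERVICE.START_SERVICE('CRM_CC');   -- start it on the current instance
END;
/

Separate services of this kind (e.g. for call center users, EAI and batch) are what make it possible to separate, distribute and prioritize load across the RAC nodes as described above.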
20
Session Management for Performance, Availability,
and Scalability
(Diagram: Siebel connection pooling)
Resource Management
• Siebel implements Transparent Application Failover (TAF)
internally
• Bad SQL statements were hard to stop
• The solution was resource profiles
•CPU limitation per call (see the sketch below)
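A hedged sketch of the resource-profile approach (profile name, limit value and user name are assumptions; the actual values used in the project are not stated in this deck):

-- Profile-based resource limits must be enabled first
ALTER SYSTEM SET resource_limit = TRUE SCOPE=BOTH;

-- Hypothetical profile: abort any single call that consumes more than
-- 60 seconds of CPU (CPU_PER_CALL is given in hundredths of a second)
CREATE PROFILE crm_oltp_limit LIMIT
  CPU_PER_CALL 6000;

-- Assign the profile to the (hypothetical) Siebel schema user
ALTER USER siebel PROFILE crm_oltp_limit;

When the limit is exceeded, Oracle aborts the call with ORA-02393 and rolls the statement back; unlike killing the session (after which Siebel simply reconnects and resubmits the statement), this stops the runaway query for good.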
21
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
22
RAC Performance & Availability Aspects
 General memory considerations
(a rough worst-case calculation follows after this list)
 Session size up to 15 MB
 Keep enough memory for the operating system
 Plan for 10-20% more memory in RAC environments
 Number of cluster nodes
 Expected load and hardware limitations define
the minimum number of nodes
 More nodes provide better availability / less
failure impact (MTBF, upgrades, …)
 More nodes allow dedicated resource
assignment
 Fewer nodes reduce interconnect traffic and
management overhead
•Interconnect
•Dedicated infrastructure
•Failover Interfaces / Channel bundling
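A rough, purely illustrative worst-case check using only the figures quoted in this deck (not the project's actual sizing): 20,000 concurrent users at up to 15 MB per database session would need on the order of 20,000 x 15 MB ≈ 300 GB of session memory. With the 5:1 Siebel connection pooling from the previous slides this drops to roughly 4,000 connections x 15 MB ≈ 60 GB across the cluster, which still has to fit on the nodes alongside the SGA and the operating system; hence the extra 10-20% memory headroom recommended above.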
23
RAC Performance & Availability Aspects II
 Disaster Scenarios and Stretch Cluster Limitations
 20 – 50 km Max
 Disaster Recovery
 Consider Rolling Upgrades
 Increase the availability – reduce planned down time
 Define database services as soon as possible
 Group Servers
 Separate load
 Control performance and availability
 Watch out for Architectural bottlenecks
 Centralized Tables, Sequences
 Prevent, Tune, or Invest
24
RAC Performance Configuration
 Consider using partitioning
 Try to spread out the load but keep the data local
 Consider that DB triggers, for example, may fire on any node
 Watch out for frequently changing DB objects
 Consider fewer records per block
 Fewer records per block decrease block concurrency
 Fewer nodes request the same block at any point in time
 Use caching (see the SQL sketch after this list)
 Increase the cache size for sequences
 Pin small tables into the cache
 Oracle features
 ASSM, ASM, Reverse Indexes, Function Based Indexes, Partitioning
 Follow best practices
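A hedged SQL sketch of the caching and records-per-block ideas above (all object names, cache and file sizes are invented for illustration, and any change to a Siebel schema has to respect the vendor's support rules):

-- Larger sequence cache: each instance caches its own range of values,
-- so the nodes stop competing for the sequence (NOORDER avoids
-- cross-instance ordering overhead)
ALTER SEQUENCE siebel.s_example_seq CACHE 10000 NOORDER;

-- Keep a small, hot lookup table in the KEEP buffer pool
-- (db_keep_cache_size must be sized to hold it)
ALTER SYSTEM SET db_keep_cache_size = 512M SCOPE=BOTH SID='*';
ALTER TABLE siebel.s_small_lookup STORAGE (BUFFER_POOL KEEP);

-- Fewer rows per block for a hot, frequently changed table:
-- a non-default block size needs its own buffer cache first
ALTER SYSTEM SET db_2k_cache_size = 256M SCOPE=BOTH SID='*';
CREATE TABLESPACE hot_2k DATAFILE '+DATA' SIZE 2G
  BLOCKSIZE 2K
  EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO;
ALTER TABLE siebel.s_hot_table MOVE TABLESPACE hot_2k;
-- (indexes on the moved table become unusable and must be rebuilt)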
25
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 System Interfaces
 Lessons Learned
 Future plans
Agenda
26
System Monitoring
 Joint Team of Admins
 IBM Tivoli
 Oracle GRID Control
 Quest Foglight Experience Monitor
 Interface monitoring
27
Joint Team of Admins
 Joint team of Siebel, EAI and database admins
 Total overview of the system
 Database monitoring
 Siebel view (functional SQL process monitoring)
 Database view (technical monitoring)
 Queue Monitoring (Siebel inbound and outbound)
 Technical Monitoring
 Logs, CPU, Memory, Storage etc.
 Propose technical improvements
28
System Monitoring
 Joint Team of Admins
 IBM Tivoli
 Oracle GRID Control
 Quest Foglight Experience Monitor
 Interface monitoring
29
IBM Tivoli
 Operational Management
 System Automation
 Monitoring (CPU, Memory…)
 Workload Scheduling
 Service Level Advisor
 Storage Management:
 Tivoli Storage Manager
 SAN Volume Controller
 Alerting
30
System Monitoring
 Joint Team of Admins
 IBM Tivoli
 Oracle GRID Control
 Quest Foglight Experience Monitor
 Interface monitoring
31
Oracle GRID Control
 RAC monitoring
 Long running transactions
 Wait states (see the query sketch below)
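To complement the GRID Control views, a minimal illustrative SQL*Plus check for long-running active sessions across all RAC instances could look like this (threshold and column selection are arbitrary):

-- Active user sessions whose current call has been running for more than
-- 5 minutes, cluster-wide (GV$ views aggregate all instances)
SELECT inst_id, sid, serial#, username, sql_id, event,
       last_call_et AS seconds_active
  FROM gv$session
 WHERE status = 'ACTIVE'
   AND type   = 'USER'
   AND last_call_et > 300
 ORDER BY last_call_et DESC;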
32
System Monitoring
 Joint Team of Admins
 IBM Tivoli
 Oracle GRID Control
 Quest Foglight Experience Monitor
 Interface monitoring
33
Quest Foglight Experience Monitor
Coverage:
 Around 60 business
processes are monitored
(90% coverage)
 Various report sets are available
to users
Applicable Fields:
 Monitoring of software quality
after deployment
 Alarm based control of the
production environment
 Information base for sizing
recommendations and server
consolidation
Acceptance:
 FxM is accepted as an objective
measuring instrument which
reflects “end user reality” and is
used to support SLA monitoring
and reporting
34
System Monitoring
 Joint Team of Admins
 IBM Tivoli
 Oracle GRID Control
 Quest Foglight Experience Monitor
 Interface Monitoring
35
Interface Monitoring
 The Telekom Enterprise
Service Bus
36
Interface Monitoring
 Evaluation of auditing
messages
 Monitor traffic and response
times on interfaces
37
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
38
Reporting
 OA Mirror (operative analysis)
 RO Mirror (read only)
(Diagram) RO Mirror DB: a read-only physical standby, fed from the CRM RAC database by log shipment (archive and redo logs).
(Diagram) OA Mirror DB: fed from the CRM RAC database through a stage DB via Streams capture and propagation processes and 16 apply processes; CDC capture and CDC apply (8) feed the data warehouse; CRM OA reports run against the OA mirror.
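For orientation only, a greatly simplified sketch of the capture-and-propagation side of such a Streams setup (schema, queue and database names are hypothetical; the real configuration additionally involves instantiation, the apply processes on the stage/OA side, the CDC feeds to the warehouse and the log-shipped physical standby):

-- Capture-side sketch on the CRM RAC database (hypothetical names)
BEGIN
  -- Queue that buffers the captured changes
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
    queue_table => 'strm_capture_qt',
    queue_name  => 'strm_capture_q');

  -- Capture DML changes for the application schema
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name     => 'SIEBEL',
    streams_type    => 'capture',
    streams_name    => 'crm_capture',
    queue_name      => 'strm_capture_q',
    include_dml     => TRUE,
    include_ddl     => FALSE,
    source_database => 'CRMRAC');

  -- Propagate the changes to a queue on the stage database
  DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
    schema_name            => 'SIEBEL',
    streams_name           => 'crm_to_stage',
    source_queue_name      => 'strm_capture_q',
    destination_queue_name => 'strm_apply_q@STAGEDB',
    include_dml            => TRUE,
    include_ddl            => FALSE,
    source_database        => 'CRMRAC');
END;
/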
39
CRM technical achievements
 Reduced Response times
 Increased Automation
 Faster time to market and increased flexibility for new products
 Less development, more configuration
 Service Oriented Architecture is implemented
 Host platform retired
 OLAP database with realtime mirroring based on Streams technology
40
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
41
System Interfaces - ESB
42
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
43
Lessons Learned
 Get a 360° view of the system
 Test with a full-blown database as early as possible
 Test the infrastructure with worst-case scenarios
 Test the system infrastructure carefully before roll-out
 Test your migration with real data
 Always keep an eye on system and application performance
 Avoid mixing batch and OLTP load
 Keep commit sizes low in batch jobs (see the sketch after this list)
 Keep work on a single node when possible
 Be prepared for long-running queries
 Keep tables with frequently changing content slim
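A hedged PL/SQL sketch of the "keep commit sizes low" advice for batch jobs (table and column names are made up; the point is the periodic commit rather than one huge transaction):

DECLARE
  CURSOR c_work IS
    SELECT row_id FROM batch_work_items WHERE processed = 'N';
  v_done PLS_INTEGER := 0;
BEGIN
  FOR r IN c_work LOOP
    UPDATE batch_work_items
       SET processed = 'Y'
     WHERE row_id = r.row_id;
    v_done := v_done + 1;
    IF MOD(v_done, 1000) = 0 THEN
      COMMIT;   -- small, frequent commits keep undo, locks and redo bursts small
    END IF;
  END LOOP;
  COMMIT;
END;
/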
44
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
Agenda
45
Future Plans
 Additional customer groups will be integrated and served through the new
platform
 Small business customers
 Medium business customers
 Additional product types will be integrated to process the complete portfolio
 Internet access and value added services
 Next generation network services
 Service Architecture will be enhanced
 Additional services will be externalized
 Further automation improvements
 Continuous systems consolidation and renewal of IT architecture
 Upgrades to Oracle 11g and Siebel 8.x
46
 Introductions
 Project Overview & Business Drivers
 Scope and Usage of the system
 Technical Footprint - Handling 25,000 Users
 Architecture & Infrastructure
 Session Management
 RAC Performance
 System Monitoring
 Reporting
 System Interfaces
 Lessons Learned
 Future plans
 CRM-T Infrastructure details
Infrastructure in detail
47
CRM-T Infrastructure details
Web Server and Reporting
48
CRM-T Infrastructure details
Quest Monitoring
49
CRM-T Infrastructure details
Object manager and Batch Server
50
CRM-T Infrastructure details
EAI Server and ESB
51
CRM-T Infrastructure details
Database
52
CRM-T Infrastructure details
Storage
53

Editor's Notes

  • #7: Scalability – in our old environment we were facing scalability limitations on the database. With the new RAC architecture we can scale up almost without any limitation.
  • #11: Throughout the whole country we support all our call centers plus our stores (T-Shops) as well as our partners. They are all connected to the central CRM system, which itself is connected to various legacy systems. The data centers are 150 km away from each other; they are located in Bielefeld and Krefeld(?). There are 26 legacy systems directly connected to the CRM-T system. The SOA T-ESB handles several million messages every day.
  • #12: Facts of the OH / T-ESB: up to 11 TB (6 TB of Siebel data), up to 10,000 orders per hour, 20 call centers within Germany, 7,000 T-Shop users, 20,000 concurrent users on the system. So where is this massive amount of workload coming from?
  • #16: EAI servers are clustered. The EAI servers are spread out over virtual servers; in case of a failover the CPUs are shared between the servers, so there are still 8 CPUs, but for 2 servers.
  • #17: Our RAC infrastructure does not consolidate applications; it is purely for scalability and availability. Heavy I/O rates are absorbed by solid state disks; several critical, frequently changing tables are stored there. Naming of the FC switches. Streams: one RAC node sends into the stage DB; synchronization is below 1 minute (Juergen/Thysen regarding apply per day).
  • #19: Managing sessions on the RAC. Session handling: in Siebel we see up to 20,000 concurrent users plus hundreds of sessions from other components like EAI, batch, etc. This huge number of sessions heavily influenced our RAC cluster stability due to memory shortages (each session consumed up to 15 MB). Two possibilities were available to solve this problem: Oracle Multi-Threaded Server, which pools the users in the database – there we did not have much experience; and Siebel connection pooling, which pools Siebel users into a single connection – there we had lots of experience from our old, migrated system. So we chose Siebel connection pooling, as it was the easier solution to implement: just changing the component parameters to a ratio of 1 to 5, so one database connection handles 5 Siebel user sessions. For other components we still use dedicated sessions, as compared to user sessions they do not have a think time. Session management: all sessions are connected via a single database service, where we define the nodes, the failover strategy and the session distribution over the nodes. A single service has several disadvantages; this is why we have planned, but not yet implemented, the use of multiple services in order to separate the different components from each other so that we can manage them independently. This will allow us to prioritize the different components – for example, batch jobs get higher priority during the night, while call center users get the highest priority from 7 am till 5 pm. We will also be able to split the user sessions over different nodes of the RAC, so we can separate call center agents from everything else. They will then have dedicated nodes (CPU, memory, …), and only in case of a failure will other nodes be used, where the resources are shared with other users or components. Resource profiles: finally, we are currently using profiles to implement resource limitations. During the rollout we were facing several bad SQL statements which influenced the overall performance of the system. Because Siebel handles the loss of a session by reconnecting and sending the statement again, we were not able to effectively stop long-running SQL statements. To solve this problem we used Oracle resource profiles to limit the CPU per call, so that long-running queries disappeared as a problem.
  • #23: One big aspect from the beginning was the sizing of the RAC cluster. This was hard to do because not many RAC environments are used purely for scalability, not many RAC environments exist that handle that many OLTP users, not many RAC environments of this size exist for a Siebel enterprise, and the internal Siebel architecture was not defined in detail. So there were not many comparable examples. The tests were done on a system with 1/20 of the real DB size. Now, after the rollout, we are much smarter. The current SGA target in production is 50 GB (System Global Area – data cache); the current PGA target in production is 32 GB (Process Global Area – sorts, …); current memory utilization is about 60% (240 GB of memory). Memory was one of the key factors, not only for performance: too little memory led to RAC instability due to AIX memory paging, the nodes were then not reachable for the cluster software, and this ended up in node evictions. Even though we have not really tested the load on a single server, we would calculate 10% to 20% more physical memory for a RAC environment in the future. (V1009: changed the order of 'more' and 'less' nodes.) How many nodes should we have in the RAC environment? Oracle RAC gives infrastructure architects the possibility to choose between 2 and 64/128 nodes – another freedom where decisions need to be made. Similar to the amount of memory, the number of nodes had to be an estimate during the sizing exercises. The necessary number of CPUs, based on the expected load and combined with the chosen hardware, gives the minimum number of servers. For sure the RAC is extensible with new nodes; however, performance will not always improve with a higher number of nodes, as this can increase the interconnect traffic – e.g. DB triggers may fire on any node, so data blocks get shipped across the cluster. More nodes provide better availability: if one node fails, fewer resources are lost and the impact of each failure (MTBF, upgrades) is smaller. However, more nodes increase the management overhead as well as the work in terms of workload distribution management (which service should run where). For us, 4 nodes were chosen due to load and hardware restrictions – everything else came from lessons learned. Interconnect: as the interconnect is the heart of the cluster, we now understand that we have to have a dedicated infrastructure, as we found that even sharing the switches was a possible reason for node evictions. Even if the bandwidth is more than sufficient and has not been a problem so far, the possible latency has to be watched. (V1009:) Failover interfaces / channel bundling should be considered in order to reach the necessary availability and throughput.
  • #24: (V1009:) In the beginning we thought about a stretch cluster, but a workshop with Oracle clarified our limitations. These limitations are hard facts for the current hardware, as the latency becomes the bottleneck: the recommended distance between cluster nodes should not exceed 20 km. As our data centers are much further away from each other, we have not created a stretch cluster and address the disaster scenarios via backups. (V1010:) Disaster recovery: even though SOX conformity will be a goal for the future, we will not be able to use a stretch cluster to fulfil the requirements, as the nodes would need to be too far away from each other. Therefore other methodologies would be necessary – disk mirroring, which we currently use, or Data Guard with a standby database. (V1009:) Consider rolling upgrades, so that the downtime for installing patches can be minimized; it is important to have a strategy for this up front. (V1009:) Consider using partitioning – for sure this is helpful to tune SQL statements and reduce the amount of data to be handled, but also to keep data blocks on as few nodes as possible, especially if they are changed frequently from several nodes (for example by database triggers). (V1009: moved database services into this slide as it fits better.) Session management – as discussed previously, this is an important feature. Our future design contains more than 20 services for the different Siebel components. Use DB services as early as possible: this allows the load balancer to spread out the load without influencing different resources and gives especially the DBAs the freedom to move workload around within the cluster – separate components on demand, limit the provided CPU resources, or control availability at session granularity. For example, our call center users are the most important agents, so they should have the highest throughput and therefore get dedicated resources, while our batch components may get only limited resources during the day. Without services it is hard to achieve these goals. This is also an aspect of availability: we can survive if EAI components have only half of the DB CPU power in case of a failure, but call center users should still have full resources. So services allow us to manage performance and availability. (V1009:) Watch out for architectural bottlenecks – consider the increased concurrency. Watch out for bottlenecks in the architecture like database sequences, which need to be used concurrently by all nodes, as well as centralized tables (like the Siebel S_SSA_ID table, which is used in Siebel to generate the unique ROW_IDs, or other centrally used internal tables – S_SRM_REQUEST, S_ESCL_REQ). Reduce the use of such central points, tune them if that is sufficient, or move them to the fastest hardware disks, as we did – solid state disks.
  • #25: (V1009: partitioning has been moved to the top.) Consider partitioning for performance: next to services, partitioning can be a technique to separate data and work on it on as few nodes as possible. Tables which are filled by database triggers could be candidates for node-wise partitioning in order to keep their data blocks on a single node. (V1009: renamed 'high frequently changing DB objects' into 'watch out for frequently changing DB objects'.) Frequently changing DB objects: check such objects regarding the number of records in a block, as the block is the smallest unit the database works with. Such objects are usually candidates for partitioning, or for decreasing the number of records per data block (e.g. 4K or 2K tablespaces). Based on the Siebel components we have chosen and our Siebel component architecture, some tables became hot spots and the most demanding objects within the cluster (S_SRV_REQ – filled by database triggers, mostly for batch and asynchronous requests; S_SRM_REQUEST – asynchronous component requests, mostly for EAI). Siebel in general puts all tables and indexes in 8K blocks, as this has been the Oracle standard for years. As most of these objects are temporary storage and several rows share a single block, these blocks are shipped between nodes all the time. A first step to prevent this from becoming a bottleneck was to use 2K tablespaces instead of the standard 8K (we use 16K in general to minimize disk I/O); this ships less data via the interconnect. Frequently changing DB objects are usually candidates for partitioning or for smaller tablespaces (4K or 2K); objects of this kind usually create heavy interconnect traffic, as they may be manipulated by several nodes at the same time. Interconnect – the heart of the RAC cluster; it should be physically separated from everything else, and we have seen that it can lead to trouble if the communication on it is delayed. Concurrency – compared to other systems the concurrency increases, so tables with high concurrency need to be adapted (INITRANS table parameter) and sequence caches need to be increased. Session management – use DB services as early as possible; this allows the load balancer to spread out the load without influencing different resources. Use of Oracle features – make yourself aware of all the new features even if they are not well supported or described by the software in use, e.g. workload management, ASM, ASSM, different tablespace sizes, backup strategies, etc. Follow best practices – there are several RAC, AIX-and-RAC as well as application-and-RAC whitepapers (e.g. the Siebel–RAC whitepapers) which helped us to configure the system the right way up front.
  • #27: As we have already seen in the architecture, we use different types of monitors to get an almost 360° view of the system. However, the most important key factor was again a joint team of Siebel and Oracle database administrators sitting together, because as a team they have the complete overview of the system at the same time. They monitor all components from an application point of view: arising problems and symptoms can be identified immediately, the causes analyzed, and possible resolutions defined and executed without any communication delays. Another benefit is that the team members learn from each other and can therefore better recognize, categorize and analyze system behavior. This was already a success factor in the old project and became a success factor again in the new project.
  • #28: Same notes as slide #27.
  • #29: About the monitoring tools: First, we use IBM Tivoli to monitor all types of system resource usage (CPU, memory, disk I/O, ...). This covers all IBM AIX platforms: the different Siebel servers, the database servers and the web servers. Any change is immediately visible on the screens, which helps the team to identify root causes. In addition to the visual information, monitors watch the systems and send messages to the assigned, scheduled team members so that they can react if necessary.
  • #30: Same notes as slide #29.
  • #31: For the 360° view of the system it is essential to have very detailed and timely information about the database internals. To cover this requirement, Oracle Grid Control is used for the RAC environment. It allows our DBAs and Siebel-specific DBAs to get a just-in-time view of what is going on within the RAC environment without monitoring each node on its own. The important information (locks, user sessions, wait events, ...) can all be monitored on a couple of screens, and the easy-to-use GUI makes it possible to dig deep enough into each area via simple drill-downs. However, simple command-line scripts are still used to retrieve more detailed information (see the sketch after this note).
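    The scripts themselves are not included in the presentation; as a minimal sketch, queries of this kind against the cluster-wide GV$ views give such a quick overview (the grouping and the chosen statistics are illustrative):

      -- user sessions per instance, service and current wait event
      SELECT s.inst_id,
             s.service_name,
             s.event,
             COUNT(*) AS sessions
      FROM   gv$session s
      WHERE  s.type = 'USER'
      GROUP  BY s.inst_id, s.service_name, s.event
      ORDER  BY s.inst_id, sessions DESC;

      -- global cache blocks received per instance (rough interconnect load)
      SELECT inst_id, name, value
      FROM   gv$sysstat
      WHERE  name IN ('gc cr blocks received', 'gc current blocks received')
      ORDER  BY inst_id, name;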
  • #32: Same notes as slide #31.
  • #33: Next to the real-time information, statistical information is gathered to get a more forward-looking view; it is also used to improve the system from a longer-term perspective. The Quest Foglight Experience Monitor is used to analyze the screen and view visits and the corresponding user retention time on them. This gives us feedback on where we need to improve the system to raise the service level.
  • #34: Same notes as slide #33.
  • #35: Further tools: Our internal use-case robot, the ANITA client, helps us to control the service level agreements. It works like a robot and executes defined use cases at a specified interval. The output is a response time for each use case and each step, so that we become aware early if the system is heading towards a reduced service level. The clients are located in the call centers so that we get realistic figures. Our real-time data warehouse is used to monitor activity queues and to provide information to the call center management: Oracle Streams pushes changes into a mirror database in almost real time, so the analysts can see and react immediately (see the sketch after this note). Siebel management views are also used to get a more detailed view into the running Siebel Enterprise.
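    The Streams configuration itself is not shown in the slides; the following is only a rough sketch of the source side, assuming an illustrative Streams administrator schema, queue and table (the propagation to and the apply on the mirror database are configured analogously with the same package):

      BEGIN
        -- queue for the captured changes (run as the Streams administrator)
        DBMS_STREAMS_ADM.SET_UP_QUEUE(
          queue_table => 'strmadmin.crm_q_tab',
          queue_name  => 'strmadmin.crm_q');

        -- capture DML on a hot activity table and stage it into the queue
        DBMS_STREAMS_ADM.ADD_TABLE_RULES(
          table_name   => 'siebel.s_evt_act',   -- illustrative table choice
          streams_type => 'capture',
          streams_name => 'crm_capture',
          queue_name   => 'strmadmin.crm_q',
          include_dml  => TRUE,
          include_ddl  => FALSE);
      END;
      /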
  • #36: Same notes as slide #35.
  • #37: Same notes as slide #35.
  • #39: As we have already seen in the architecture, we use different types of monitors to get an almost 360° view of the system. Have a joint team of Siebel administrators and Oracle DB administrators; this is a key factor for success, because the team has the complete overview of the system at the same time. Arising problems and symptoms can be identified immediately in the joint team, the causes analyzed, and possible resolutions defined and executed without any communication delays. Another benefit is that the teams learn from each other and can therefore better categorize and analyze symptoms. This was already a success factor in the old project and became a success factor again in the new project. About the monitoring tools: First, we use IBM Tivoli to get a just-in-time overview of the system resource usage (CPU, memory, disk I/O, ...) on all IBM AIX platforms: the Siebel servers, the database servers and the web servers. Second, we use Oracle Grid Control for the RAC environment; it allows our DBAs and Siebel-specific DBAs to get a just-in-time view of what is going on within the RAC environment without monitoring each node on its own. ANITA clients, our self-built end-user robot, monitor the service level agreements (SLAs) based on selected use cases that are executed from various call centers, measuring the real response times. Real-time DWH: with Oracle technologies such as Oracle Streams we feed our data warehouse with the data needed to monitor the business processes, e.g. activity backlogs and order states.
  • #40: Same notes as slide #39.
  • #42: Same notes as slide #39.
  • #44: Monitor the system 360°: Try to bring the different views of the system together (OS resource usage + system activities + number of sessions + ...), build a shared Siebel + RAC administration and monitoring team, and monitor the activity on the key tables and components. This allows the administration team to react correctly instead of searching for solutions when the problem occurs, e.g. stopping long-running SQL statements and monitoring the system actively.
    Test the infrastructure carefully, including worst-case scenarios: The bigger the system gets, the more important this becomes. Test also with vanilla systems so that the tests can start very early. Many patches necessary for stability and performance were found during the first couple of days with full load on the system.
    Avoid mixing batch and OLTP load: Batch and OLTP use different features of the database (parallel queries, commit sizes, ...). During the "get well" phase we still had a mix of both types, and this caused a lot of headaches for the DBAs.
    Keep commit sizes low: From time to time we run batch jobs, and from time to time they do not finish in time, so we need to cancel the job. The rollback of such a transaction is very heavy and can bring the system into trouble (see the sketch after this note).
    Keep work on single nodes: Use all the available techniques such as services, partitioning, user segregation, ...
    Be prepared for long-running queries: Even with many test runs, not all long-running queries will be found. Improve the tests with (1) full-blown database tests, because sometimes small tables can create big trouble, and (2) manual dummy-user tests, to cover non-standard use cases.
    Slim down tables with frequently changing content: Tables that create interconnect traffic are usually candidates to be moved into partitions and small-block tablespaces.
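    The batch jobs themselves are project-specific and not part of the slides; the following is only an illustrative PL/SQL pattern for the "keep commit sizes low" point, using a hypothetical work table batch_work_queue. Rows are processed in bounded batches with a commit after each batch, so that cancelling the job only rolls back the batch currently in flight:

      DECLARE
        CURSOR c_work IS
          SELECT rowid AS rid
          FROM   batch_work_queue               -- hypothetical work table
          WHERE  processed_flag = 'N';
        TYPE t_rid_tab IS TABLE OF ROWID;
        l_rids  t_rid_tab;
        c_batch CONSTANT PLS_INTEGER := 1000;   -- small commit size
      BEGIN
        OPEN c_work;
        LOOP
          FETCH c_work BULK COLLECT INTO l_rids LIMIT c_batch;
          EXIT WHEN l_rids.COUNT = 0;

          FORALL i IN 1 .. l_rids.COUNT
            UPDATE batch_work_queue
            SET    processed_flag = 'Y'
            WHERE  rowid = l_rids(i);

          COMMIT;  -- at most c_batch rows ever need to be rolled back
        END LOOP;
        CLOSE c_work;
        -- (fetching across commits accepts a small ORA-01555 risk, which is
        --  acceptable for a sketch like this)
      END;
      /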
  • #46: Separate users into groups that reflect their needs (user segmentation).