WebSphere runtime performance tuning
© Copyright IBM Corporation 2010
Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.4.1
Unit objectives
After completing this unit, you should be able to:
• Identify the areas of the WebSphere runtime environment that need to
be tuned
• Identify and tune parameters associated with threading and
concurrency
• Identify and tune parameters associated with database connectivity,
EJBs, and dynamic caching
• Identify and tune parameters associated with WebSphere messaging
• Implement performance best practices for WebSphere clusters
• Explain performance considerations associated with using 64-bit
WebSphere
Topics
• Operating systems and the Java virtual machine (JVM)
• Threading and concurrency
• Database connectivity, EJBs, dynamic caching
• WebSphere messaging (service integration)
• Clustering
• 64-bit performance
Request flow and possible bottlenecks
[Diagram: a browser request travels across the Internet to the HTTP server and its plug-in, then over HTTP/S to the application server (web container, EJB container, messaging engine), and on to the database over JDBC. Possible bottlenecks along the path include:]
• Host operating system
• JVM heap size and GC pause times
• Web container (WC) thread pool
• ORB thread pool and EJB cache
• JCA thread pools and data source (DS) connection pools
• Messaging engine: message store, queue buffers, JMS queues
• HTTP server and plug-in load balancing (LB) option
• Network
• Database provider and JDBC connection pools
Tuning parameter hot list
• Review hardware and software requirements.
• Install the current refresh pack, fix pack, and the recommended interim
fixes.
• Verify network interconnections and hardware configuration.
• Tune the operating system.
• Set the minimum and maximum Java virtual machine (JVM) heap
sizes.
• Set the garbage collection level appropriately.
• Use type 4 (pure Java) JDBC driver.
• Tune WebSphere Application Server data sources and connection
pools.
• Enable the pass by reference option.
• Ensure that the transaction log is assigned to a fast disk.
• Disable functions that are not required.
• Review the application design.
Topic: Operating systems and the Java virtual machine (JVM)
Operating systems (1 of 2)
• Refer to documentation for the operating system hosting WebSphere
for details regarding recommended tuning approaches
• Tuning for the following operating systems is covered in the
WebSphere Application Server V7.0 Information Center
– Windows
– Linux
– AIX
– Solaris
– HP-UX
• Refer to the relevant information center article for tuning details
Operating systems (2 of 2)
Example settings and variables for Linux
• TCP/IP timeouts:
– timeout_timewait parameter
– TCP_KEEPALIVE_INTERVAL
– TCP_KEEPALIVE_PROBES
• Linux file descriptors (ulimit)
– Specifies the number of open files that are supported
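As a rough sketch of where these Linux settings live (the values below are placeholders for illustration, not recommendations), the keepalive and timeout parameters map to sysctl keys and the file descriptor limit to ulimit:

```
# /etc/sysctl.conf -- illustrative TCP keepalive and timeout settings
net.ipv4.tcp_keepalive_intvl = 15    # seconds between keepalive probes
net.ipv4.tcp_keepalive_probes = 5    # unanswered probes before the connection drops
net.ipv4.tcp_fin_timeout = 30        # shorten the FIN-WAIT-2 timeout

# Shell: raise the open file descriptor limit for the WebSphere user
ulimit -n 8192
```

Apply sysctl.conf changes with `sysctl -p`; the appropriate values depend on your workload and kernel version.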
Java virtual machine (JVM)
• Key JVM tuning parameters
– Heap size
• The optimal size of the Java heap should enable the JVM to operate at ~50%
live occupancy
– GC policy
• The gencon garbage collection policy yields the highest levels of throughput
and the shortest garbage collection pause times
• Test other GC policies: optthruput and optavgpause
– Generation sizes when running the generational garbage collector
• Nursery and tenured generational areas of the heap
– Tune nursery size so that the nursery is 1/4 to 1/2 the overall size of the
heap
– Tune tenured size (and nursery size) so that the tenured generation is large
enough to hold application caching
– Refer to the IBM JVM Diagnostic Guide for complete details
JVM configuration (1 of 2)
• Administrative console JVM configuration tab
JVM configuration (2 of 2)
• Administrative console JVM configuration tab
• The Generic JVM arguments window is where you can specify:
– GC policy (-Xgcpolicy)
– Use several garbage collection threads (-Xgcthreads)
– Disable class garbage collection (-Xnoclassgc)
– Disable explicit garbage collection (-Xdisableexplicitgc)
– Many more
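Putting these together, the Generic JVM arguments field might contain something like the following sketch (the sizes are placeholders that must come from your own verbose GC analysis):

```
-Xms1024m -Xmx2048m -Xmn512m -Xgcpolicy:gencon -Xgcthreads4 -verbose:gc
```

Here -Xmn sets the nursery to one quarter of the 2 GB maximum heap, consistent with the 1/4 to 1/2 guidance above, and -verbose:gc collects the GC data needed to refine the settings.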
Topic: Threading and concurrency
Threading and concurrency: Web server (1 of 3)
A single Java EE application frequently supports thousands of concurrent clients.
• Consider using multiple web server instances if an application needs to support more concurrent clients than the number of threads supported by the host OS
The web server may refuse a session if a sudden spike of incoming requests occurs.
• Increasing the ListenBacklog and StartServers parameters may reduce or eliminate this behavior
– The ListenBacklog parameter configures the maximum number of pending connections
– The StartServers parameter configures the number of IBM HTTP Server processes to start initially
Avoid frequent creation and destruction of client threads or processes as
the number of users changes
• Configure the parameters:
– MinSpareServers
– MaxSpareServers
Threading and concurrency: Web server (2 of 3)
• Minimize logging to the web server’s access log
• Consider using random plug-in load balancing instead of weighted
round-robin for WebSphere clusters
– May provide more even distribution across the cluster when multiple processes
are sending requests
• Consider reducing the plug-in retry interval to application servers
– The retry interval value specifies the length of time to wait before trying to connect to a server that has been marked temporarily unavailable.
– If the interval is too high, an available server might not be utilized.
• Concurrency tuning at web servers depends on the expected peak
number of simultaneous TCP connections
• At any given time, the number of requests currently being processed equals the number of simultaneous connections.
• Monitor mod_status reports to determine the maximum number of
connections that must be handled.
• Configure via directives in the httpd.conf file
Threading and concurrency: Web server (3 of 3)
• Web server tuning recommendations
– ThreadsPerChild — default (25 on many operating systems)
– MaxClients — maximum number of simultaneous connections, rounded up to an even multiple of ThreadsPerChild
– StartServers — 2
– MinSpareThreads — the greater of 25 or 10% of MaxClients
– MaxSpareThreads — equal to MaxClients in cases where the level of web
server resource requirements is not a concern when the web server is idle (high
performance configurations)
– ServerLimit — MaxClients divided by ThreadsPerChild
– ThreadLimit — equal to ThreadsPerChild
• Refer to your web server documentation for more details
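Applying the recommendations above, a worker-MPM httpd.conf fragment might look like this sketch (the figure of 600 peak connections is hypothetical; derive your own from mod_status monitoring):

```
# httpd.conf (worker MPM) -- illustrative values for ~600 peak connections
ThreadLimit       25
ServerLimit       24     # MaxClients / ThreadsPerChild
StartServers       2
MaxClients       600     # peak simultaneous connections, multiple of ThreadsPerChild
MinSpareThreads   60     # greater of 25 or 10% of MaxClients
MaxSpareThreads  600     # equal to MaxClients for high-performance configurations
ThreadsPerChild   25
```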
Threading and concurrency: Application server
• An HTTP request typically arrives at the application server and is dispatched to a web container thread, where it remains for its entire processing.
• Thus, the size of the web container thread pool is a critical tuning
parameter
– A pool of 50 to 75 web container threads is typically sufficient for efficient dispatching of work to CPUs
• General application server thread pools
– Web container thread pool (HTTP requests)
– ORB thread pool (RMI/IIOP requests to EJBs)
– Work manager thread pools (asynchronous beans)
• Messaging thread pools
– Message listener thread pool
– Default thread pool
• J2C and JCA thread pools
– Default thread pool
– User-defined thread pools
Threading and concurrency: Server thread pools
Application servers > server_name > Thread pools
Message listeners and activation specifications (1 of 2)
• Message-driven beans (MDBs) accessing MQ queues are assigned
listeners and connection factories.
– Listeners and connection factories should be tuned for the concurrency required
to achieve application throughput requirements
– Tuning parameter: connection factory’s connection pool parameters
• MDBs based on JCA are assigned activation specifications
– Activation specifications should be tuned for the concurrency required to achieve
application throughput requirements
– Tuning parameter: Maximum concurrent MDB invocations per
endpoint (default 10)
Message listeners and activation specifications (2 of 2)
• Listeners and activation specifications configure the consumer side of
concurrency.
[Diagram: a producer sends messages to consumers through activation specifications; the JCA resource adapter dispatches the work onto a shared thread pool.]
• The provider side of concurrency is the thread pool.
• The provider side is shared among the set of listeners and activation
specifications.
• The concurrency configuration of all listeners or activation specs needs
to be considered when tuning the thread pool provider.
– Consider all thread pool consumers in aggregate when tuning the providing
thread pool
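The aggregate check described above reduces to simple arithmetic. A sketch (the endpoint names and concurrency numbers are hypothetical):

```python
# Sum the maximum concurrency of every activation spec or listener that
# shares one thread pool, and compare it to the pool's maximum size.
activation_specs = {"orders": 10, "invoices": 10, "audit": 5}  # max concurrent invocations each

total_demand = sum(activation_specs.values())   # aggregate consumer demand
pool_max_size = 20                              # maximum size of the providing thread pool

# If demand exceeds the pool, some consumers cannot reach their configured
# concurrency, and message throughput suffers.
shortfall = max(0, total_demand - pool_max_size)
print(total_demand, shortfall)  # 25 5
```

In this hypothetical case the pool would need at least 25 threads for all three endpoints to run at their configured concurrency.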
Topic: Database connectivity, EJBs, and dynamic caching
Database connectivity (1 of 2)
• One of the most important things to tune in the area of database connectivity is the number of database connections
• There is a producer-consumer relationship for database connections:
– WebSphere data sources are the consumer
– Database providers are the producer of database connections
[Diagram: a JDBC data source (consumer) draws connections from a pool supplied by the database server (producer).]
• The number of database connections offered at the database provider
must be sufficient to handle the maximum number of database
connections requested from JDBC data sources
Database connectivity (2 of 2)
• Connection pool configuration
• Most important tuning parameters
– Maximum connections
– Minimum connections
• General rule for tuning database connections:
– One database connection per thread that requires a database connection
– The number of database connections should be close to the maximum number of threads that can run on the application server
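Following the one-connection-per-thread rule above, a back-of-the-envelope sizing might look like this (the thread pool sizes are hypothetical):

```python
# Estimate data source pool bounds from the threads that use the database
# (hypothetical thread pool sizes for one application server).
web_container_threads = 50
orb_threads = 10
mdb_threads = 10

# Rule of thumb: roughly one connection per thread that needs one.
max_connections = web_container_threads + orb_threads + mdb_threads
min_connections = max_connections // 2  # keep a warm base of pooled connections

print(max_connections, min_connections)  # 70 35
```

The database provider must then be configured to accept at least this many connections from each application server in the cell.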
Databases
• General database provider tuning recommendations
– Keep database statistics up-to-date using the tools that are part of the database
product
– Place database log files on high speed disk subsystems
– Place logs on a separate disk device from tablespace containers
– Maintain table indexing recommended by the tools that are a part of the
database product
– Size log files appropriately for the level of load against the databases
– Refer to performance documentation for the database product being used for a
comprehensive set of tuning recommendations
• Consider using DB2 Performance Expert Extended Insight (PEEI) for
DB2 databases
– Provides end-to-end monitoring for your WebSphere-DB2 environment
Enterprise JavaBeans (EJBs): Cache tuning
• Cache size = maximum number of concurrent active instances
– (Entity beans required per transaction) x (maximum expected concurrent transactions)
– Add the maximum number of active stateless session bean instances
– Tuning parameters: cache size and cleanup interval
• Use large EJB caches if you have sufficient heap
– Benchmark configurations routinely size caches at 30 K entries or more
– Monitor to find the optimal size
• If the cache fills up, the container tries to passivate based on LRU
(least recently used)
– Passivation triggers disk I/O — slow
– Tivoli Performance Viewer shows a very high rate of the ejbStores method
being called
• Remember that some EJBs may be long lived
– Stateful session beans
• Use Tivoli Performance Viewer to monitor EJBs
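The sizing formula on this slide can be worked through numerically (all inputs are hypothetical):

```python
# EJB cache size = (entity beans per transaction) x (max concurrent transactions)
#                  + (max active stateless session bean instances)
entity_beans_per_txn = 5
max_concurrent_txns = 200
max_stateless_instances = 50

cache_size = entity_beans_per_txn * max_concurrent_txns + max_stateless_instances
print(cache_size)  # 1050
```

A result like this would then be rounded up and validated with Tivoli Performance Viewer to confirm that passivation (ejbStores activity) stays low.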
ORB pass by reference (1 of 2)
• The Object Request Broker (ORB) Pass by reference option determines whether pass by reference or pass by value semantics are used when handling parameter objects involved in an EJB request.
• Pass by reference option can be found in the administrative
console at Application Servers > server_name > Container
Services > ORB service
– Enable this property with caution because unexpected behavior can occur
• By default this option is disabled and a copy of each parameter object
is made and passed to the invoked EJB method
• This is considerably more expensive than passing a simple reference
to the existing parameter object.
ORB pass by reference (2 of 2)
• This option only provides a benefit when the EJB client (servlet) and
invoked EJB are located within the same classloader.
– This requirement means that both the EJB client and EJB must be deployed in
the same EAR file and running on the same application server instance.
– If the EJB client and EJB modules are mapped to different application server
instances (often referred to as split-tier), the EJBs must be invoked remotely
using pass by value semantics
• Since the DayTrader application contains both the web and EJB
modules in the same EAR file and both are deployed to the same
application server instance, the ORB Pass by reference option
can be used to realize a performance gain.
Performance benefit of ORB pass by reference
• This option is beneficial for the DayTrader application
[Chart: percent improvement in DayTrader throughput with ORB pass by reference enabled compared to disabled.]
Dynamic caching (1 of 3)
• To configure the dynamic cache service, click Servers > Server Types
> WebSphere application servers > server_name > Container
services > Dynamic cache service
• The dynamic cache service improves performance by caching the
output of
– Servlets
– Commands (Java object)
– JavaServer Pages (JSP)
• Dynamic caching features include
– Cache replication among clusters
– Cache disk offload
– Edge-side include caching
– External caching, which is the ability to control caches outside of the application
server — for example, the web server
Dynamic caching (2 of 3)
• Consider dynamic caching during application design
– Should be part of design rather than retro-fitting later
• Organize page layout and design for caching
– Servlet and JSP caching to hold repetitive, expensive page elements
– Stateful elements versus general use elements
• API caching via distributed map services
– Leverage dynamic caching services within the application
• Leverage the caching structure
– Multiple internal caches supported beginning in WebSphere 6.0
– Split objects into different caches
– Distribute cache data more efficiently
Dynamic caching (3 of 3)
• Two types of caching function: policy-based and API-based
• Policy based (defined via the cachespec.xml file)
– Servlet, JSP
– Commands (Java object)
– Web service
– Web service client
• API based
– Distributed map
– Cacheable servlet
– Cacheable command
• Distribution features
– Distribute cache contents to peer instances
– Push content to HTTP servers and other edge components
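A minimal policy-based entry in cachespec.xml might look like the following sketch (the JSP name and request parameter are hypothetical):

```xml
<cache>
  <cache-entry>
    <class>servlet</class>
    <name>/quote.jsp</name>
    <cache-id>
      <component id="symbol" type="parameter">
        <required>true</required>
      </component>
      <timeout>60</timeout>
    </cache-id>
  </cache-entry>
</cache>
```

The component element builds the cache ID from the symbol request parameter, and the timeout invalidates cached entries after 60 seconds.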
Performance benefit of servlet caching
• DayTrader benefits from servlet caching
[Chart: DayTrader throughput with servlet caching enabled compared to disabled.]
Topic: WebSphere messaging (service integration)
WebSphere messaging concept view
• Applications connect to a bus.
• Bus destinations are
– Queues
– Topics
• An application that sends messages attaches to a destination as a producer.
• An application that receives messages attaches to a destination as a consumer.
[Diagram: a cell containing a bus; messaging engines (ME 1, ME 2, ME 3) run in servers 1 to 4, some of which form cluster A. ME = messaging engine]
WebSphere messaging: Reliability level
WebSphere messaging has multiple levels of reliability, listed here from lowest to highest (performance decreases as reliability increases):
• BEST_EFFORT_NONPERSISTENT
– Messages are never written to disk
– Messages are thrown away if the memory cache overruns
• EXPRESS_NONPERSISTENT
– Messages are written asynchronously to persistent storage if the memory cache overruns, but are not kept over server restarts
– No acknowledgement that the ME has received the message
• RELIABLE_NONPERSISTENT
– Same as EXPRESS_NONPERSISTENT, except that the client code waits for a low-level acknowledgement message before returning to the application with an OK or not-OK response
• RELIABLE_PERSISTENT
– Messages are written asynchronously to persistent storage during normal processing, and stay persisted over server restarts
– If the server fails, messages might be lost if they are only held in the cache at the time of failure
• ASSURED_PERSISTENT
– Highest degree of reliability, where assured delivery is supported
Performance comparison of message reliability levels
• Persistent versus non-persistent
WebSphere messaging data buffers (1 of 3)
[Diagram: the messaging engine holds two buffers, a discardable data buffer and a cached data buffer, whose entries consist of a message control structure and the message itself; the cached data buffer is backed by the data store through JDBC.]
• Each messaging engine has two data buffers to store messages
– Discardable data buffer
– Cached data buffer
• Messages are discarded from the data buffers using FIFO
– Messages discarded from the cached data buffer are still available from the database
– Messages discarded from the discardable data buffer are lost
– The discardable data buffer must be sized according to the expected level of load
WebSphere messaging data buffers (2 of 3)
• The default size for both the cached data buffer and the discardable
data buffer is 320 KB
– Establish an estimate for the average message size to determine the number of
messages that fill the data buffers — size according to the desired maximum
number of messages
– The default size of the buffers is likely to be too small
• Discarded messages from the discardable data buffer throw an
exception
– Look for com.ibm.ws.sib.msgstore.OutOfCacheSpace in
SystemOut.log
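The buffer sizing described above reduces to simple arithmetic (the average message size and target depth below are hypothetical):

```python
# How many messages fit in the default 320 KB buffer, and how large a buffer
# would hold a desired backlog (hypothetical average message size)?
default_buffer_bytes = 320 * 1024
avg_message_bytes = 2 * 1024          # estimated average message size

messages_in_default = default_buffer_bytes // avg_message_bytes   # only 160 messages

desired_max_messages = 5000
required_buffer_bytes = desired_max_messages * avg_message_bytes  # about 10 MB

print(messages_in_default, required_buffer_bytes)
```

Even with small 2 KB messages, the default buffer holds only 160 of them, which illustrates why the default is often too small under load.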
WebSphere messaging data buffers (3 of 3)
• Discarded messages from the cached data buffer do not throw an
exception
– But messages might need to be retrieved from the database, hurting
performance
– Actively monitor PMI data
• PMI: Performance Modules > SIB Service > SIB Messaging
Engines > Messaging Engine > Storage Management > Cache
– CacheStoredDiscardCount (for cache data buffer)
– CacheNotStoredDiscardCount (for discardable data buffer)
WebSphere messaging data store options
• The data store serves as a persistent repository for messages and operating information
• File-based data store (default)
– Messages are persisted to the file system
– Can perform as fast as a remote database
– Performance is improved when a fast disk such as RAID is used
• High availability considerations for the file-based data store
– Place file on highly available, shared drive
– Use IBM File System Locking Protocol Test for verification
• Remote database data store
– Messages are persisted to a remote database
– Frees up cycles for the application server JVM process that were previously
used to manage the file-based stores
– Can use high performance, production level database servers (DB2, Oracle)
Topic: Clustering
Balancing workloads with clusters
• Clusters are sets of servers that are managed together and participate
in workload management.
– They run the same Java EE applications for load distribution
• Clusters enable enterprise applications to scale beyond the amount of
throughput capable of being achieved with a single application server.
• Servers that are members of a cluster can be on different host machines, unlike servers that are part of the same node, which must be located on the same host machine.
• A cell can have no clusters, one cluster, or multiple clusters.
Clustering: Vertical scaling
• Vertical scaling
– Provides process-level failover
– May improve throughput capacity
[Diagram: an HTTP server, with the plug-in and its configuration, routes requests to cluster members 1 and 2, which run under a node agent on host computer A and form a cluster.]
Clustering: Horizontal scaling
• Horizontal scaling
– Provides hardware-level failover
– Improves throughput
[Diagram: an HTTP server with the plug-in routes requests to a cluster whose members run on different hosts: cluster member 1 under a node agent on host computer A, and cluster member 2 under a node agent on host computer B.]
Clustering: Combined
• Horizontal and vertical scaling
– Improved capacity and failover
– Improved throughput
[Diagram: an HTTP server with the plug-in routes requests to a cluster of four members: cluster members 1 and 2 under a node agent on computer A, and cluster members 3 and 4 under a node agent on computer B.]
Clustering: Multiple processors
• A single JVM is capable of driving multiple processors
– There are examples where a single JVM was able to drive a 64-way
machine, and other examples where the throughput fell off after only 4
processors
– The optimal number of CPUs depends on the environment and can vary
greatly
– To determine what number works best requires performance testing and
analysis
Clustering best practices (1 of 2)
The best values for your environment depend on your configurations
and the applications you are running. Here is some general
guidance:
• The optimal number of physical processors for a set of JVMs is
about four
– For example, on a 12-way box: three JVMs can use four CPUs apiece
• Avoid large numbers of JVMs in a single logical partition
– Each JVM may require a significant amount of CPU capacity (over one
physical processor of CPU capacity)
– Large numbers of JVMs that require significant amounts of CPU capacity
drive the number of processors per logical partition beyond the optimal
number of processors per JVM
• Consider using horizontal scaling instead of vertical scaling in
cases where large numbers of physical processors are required
Clustering best practices (2 of 2)
• Horizontal scaling — adding new logical partitions with new physical
CPUs
– Scales linearly in most cases
– Note: the new logical partitions host new WebSphere instances
• Vertical scaling — adding WebSphere instances and new physical
CPUs in existing logical partitions
– Does not scale linearly at high numbers of physical processors (over 8)
• Consider tuning -Xgcthreads (the number of helper threads spawned
by the JVM to perform garbage collection) if the number of logical
processors assigned to the host logical partition is high (over 8)
– The optimal number of garbage collection threads is about six
Topic: 64-bit performance
64-bit WebSphere Application Server (1 of 2)
• As of V7, a 64-bit version of WebSphere Application Server is available
on almost all supported hardware and operating system platforms
• Allows the JVM to grow well beyond 32-bit process size boundaries
• Applications that experience greatest benefit:
– Memory constrained
• Extra memory supports better caching strategy
• Avoids expensive queries
– Computationally expensive code
• Security algorithms
• May be required for cases that need a very large heap or a large
number of threads
64-bit WebSphere Application Server (2 of 2)
• Stay with 32-bit WebSphere if the hosted application operates
efficiently in a heap that is 1.8 GB (1800 MB) or smaller.
– JVMs automatically size the heap to operate at 50% live size occupancy (within
the max heap size parameter) because 50% live size occupancy is a good
tradeoff between time in garbage collection and time between garbage
collections
– A JVM will size the heap at 1.8 GB automatically if the “size of live” is 900 MB
– Collect and graph verbose GC to determine the “size of live” of an application
running under expected and peak loads
– Stay with the 32-bit JVM if the “size of live” is 900 MB or less
• Consider moving to the 64-bit JVM if the “size of live” is larger than
900 MB and if additional caching improves the performance of the
application
• Note: “size of live” refers to the amount of heap occupied by live objects.
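The decision rule above can be expressed as a small function (the 900 MB threshold follows from a 1.8 GB heap at 50% live occupancy):

```python
def suggested_heap_mb(live_size_mb):
    """Heap sized so the JVM runs at roughly 50% live occupancy."""
    return 2 * live_size_mb

def needs_64_bit(live_size_mb, threshold_mb=900):
    """True when the live set exceeds what a 1.8 GB 32-bit heap handles well."""
    return live_size_mb > threshold_mb

print(suggested_heap_mb(900), needs_64_bit(900), needs_64_bit(1200))
# 1800 False True
```

An application with a measured live set of 900 MB would get a 1.8 GB heap and stay on 32-bit; at 1.2 GB of live data, the 64-bit JVM becomes worth considering.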
Moving transaction logs and file store to a fast disk
• Because disk I/O operations are costly, storing log files on fast disks, such as a RAID array, can greatly improve performance.
• The Transaction log directory can be set in the administrative
console at Servers > Application Servers > server_name >
Container Services > Transaction Service.
• The File store log directory can be specified during the creation of a
SIBus member using either
– The -logDirectory option in the AdminTask addSIBusMember command
– The administration console SIBus member creation panels
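As a sketch of the scripting route (the bus, node, server, and directory names are hypothetical), the wsadmin Jython call might look like:

```
# Run inside wsadmin: wsadmin -lang jython
AdminTask.addSIBusMember('[-bus MyBus -node node01 -server server1 '
                         '-logDirectory /fastdisk/filestore/log]')
AdminConfig.save()
```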
Unit summary
Having completed this unit, you should be able to:
• Identify the areas of the WebSphere runtime environment that need to
be tuned
• Identify and tune parameters associated with threading and
concurrency
• Identify and tune parameters associated with database connectivity,
EJBs, and dynamic caching
• Identify and tune parameters associated with WebSphere messaging
• Implement performance best practices for WebSphere clusters
• Explain performance considerations associated with using 64-bit
WebSphere
Checkpoint
1. True or False: MinSpareServers and MaxSpareServers are web
server tuning parameters.
2. True or False: The maximum connections parameter specifies the
maximum number of physical connections that you can create in a
data source connection pool.
3. True or False: The dynamic cache caches JIT-compiled Java
methods.
4. True or False: As the WebSphere messaging level of reliability
increases, message processing performance also increases.
Checkpoint solution
1. True
2. True
3. False: The dynamic cache service improves performance by caching
the output of servlets, commands, and JSPs.
4. False: The higher levels of messaging reliability cause a decrease in
general message processing performance.