Alfresco Day Zero Configuration Guide v0.14
Document History
0.8   2010-05-06  Peter Monks  Reviewed changes and accepted / rejected as appropriate.
0.9   2010-06-14  Peter Monks  Added links to Environment Validation tool. Removed (now redundant) appendices.
0.10  2010-08-17  Peter Monks  Added note about clustering and db.pool.max. Added section on hibernate.jdbc.fetch_size.
0.11  2010-08-25  Peter Monks  Updated virtual file servers thread pool configuration for v3.2+.
0.12  2010-11-22  Peter Monks  Added note on JIT compiler exclusions. Added recommendation for db.pool.idle. Added notes on DB2. Added recommendation for in-transaction indexing. Added recommendation for quota calculation.
0.13  2010-11-23  Peter Monks  Added recommendation for db.pool.validate.query.
0.14  2010-12-16  Peter Monks  Added recommendation for JVM stack size. Added recommendation for index.recovery.mode.
Table of Contents
INTRODUCTION
  DOCUMENT PURPOSE
  INTENDED AUDIENCE
  GLOSSARY
ARCHITECTURE VALIDATION
  SUPPORTED STACKS FOR ALFRESCO
  HARDWARE
    I/O
    CPU
  DATABASE
    Maintenance and tuning
  OPERATING SYSTEM
  JAVA VIRTUAL MACHINE
ENVIRONMENT VALIDATION
DAY ZERO CONFIGURATION
  JVM TUNING
    Increase JVM heap
    Reduce JVM stack
    Remove JIT exclusions
  SET DIR.ROOT TO ABSOLUTE PATH
  ENABLE AUTOMATIC SEARCH INDEX RECOVERY
  DATABASE CONNECTION POOL
    Maximum Size
    Idle Size
    Validation Query
  DATABASE FETCH SIZE
  IN-TRANSACTION FULL TEXT INDEXING (OPTIONAL)
  QUOTA CALCULATIONS (OPTIONAL)
  APPLICATION SERVER WORKER THREAD POOL (OPTIONAL)
  VIRTUAL FILE SERVER (VFS) WORKER THREAD POOL (OPTIONAL)
  SHAREPOINT PROTOCOL WORKER THREAD POOL (OPTIONAL)
  JODCONVERTER-BASED OPENOFFICE INTEGRATION
  CONFIGURE OTHER THIRD PARTY APPLICATIONS
Introduction
Document purpose
By default, Alfresco's configuration is optimized for single-user evaluation of Alfresco. This
configuration minimizes resource usage at the expense of scalability (particularly scalability in
the presence of large concurrent traffic volumes). Therefore, for any other use of Alfresco
(including but not limited to: QA, performance / scalability testing, production, production mirror,
disaster recovery), Alfresco strongly recommends that additional configuration be performed.
This document describes the universal configuration steps that should be taken to achieve this,
regardless of the specific Alfresco use case, and before Alfresco is started for the first time. It
does not, however, describe the full breadth of Alfresco configuration options that can be used to
scale Alfresco in use-case-specific ways. Those options are described in detail elsewhere (for
example, in the product documentation and the knowledge base).
This document is currently focused on Alfresco 3.3 installations, although many of the
recommendations can be applied to earlier versions as well (provided the associated Supported
Stack is used, rather than the 3.3 Supported Stack).
Intended audience
This document is intended for developers, system administrators, and anyone who is tasked
with installing an Alfresco instance, regardless of the intended use of that instance (evaluation,
development, test / QA, production).
Glossary
The following table describes the terms that are used within this document, each of which has a
specific meaning within the context of Alfresco:
TERM  DEFINITION
DBA   Database Administrator: someone who has been trained and certified to administer a specific relational database product. Note: relational databases vary greatly in their capabilities, so it is critical that any DBA be experienced with exactly the database product you are intending to use for Alfresco.
I/O   Input/Output: in this document refers to I/O performed by Alfresco to some external software or device (such as the network or a disk subsystem).
OS    Operating System
CPU   Central Processing Unit
VFS   Virtual File Server: specifically the functionality in Alfresco that provides access to the repository via CIFS, FTP, NFS and WebDAV
SMTP  Simple Mail Transfer Protocol: a widely used protocol for sending email
IMAP  Internet Message Access Protocol: a more modern protocol used for interacting with email servers
JVM   Java Virtual Machine
Architecture validation
This section describes the steps required to validate the architecture to ensure that it meets the
prerequisites for an Alfresco installation.
The following summary shows the steps that are required to validate the architecture:
1. Check the supported stacks list.
2. Optimize the hardware settings.
3. Validate the database.
4. Validate the Operating System.
5. Validate and tune the JVM.
Hardware
This section describes how to validate your I/O subsystems and CPU.
I/O
One of the primary determinants of Alfresco's performance is I/O. Optimize the following, in
priority order:
1. I/O to the relational database Alfresco is configured to use.
2. I/O to the disk subsystem on which the Lucene indexes are stored.
3. I/O to the disk subsystem on which the content is stored.
In each case, the goal is to minimize the latency (response time) between Alfresco and the
storage system, while also maximizing bandwidth.
Low latency is particularly important for database I/O, and one rudimentary test of this is to ping
the database server from the Alfresco server. Round trip times greater than 1 ms indicate a
sub-optimal network topology or configuration that will adversely impact Alfresco performance.
“Jitter” (highly variable round trip times) is also of concern, as it will increase the variability of
Alfresco's performance. The standard deviation of round trip times should be less than 0.1 ms.
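The two thresholds above can be applied mechanically to a set of ping measurements. The following is a minimal sketch, not part of any Alfresco tooling: the function name check_rtt is hypothetical, the round trip times are assumed to have been collected by hand (for example, copied from ping output), and using the mean as the summary of "round trip time" is an assumption.

```python
import statistics


def check_rtt(samples_ms, max_mean_ms=1.0, max_stdev_ms=0.1):
    """Summarize ping round trip times (in milliseconds).

    Returns (mean, stdev, ok), where ok is True only when the mean is at most
    max_mean_ms and the sample standard deviation is below max_stdev_ms.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to estimate jitter")
    mean = statistics.mean(samples_ms)
    stdev = statistics.stdev(samples_ms)  # sample standard deviation
    ok = mean <= max_mean_ms and stdev < max_stdev_ms
    return mean, stdev, ok


if __name__ == "__main__":
    # Hypothetical measurements from the Alfresco server to the database server.
    mean, stdev, ok = check_rtt([0.40, 0.42, 0.38, 0.41])
    print(f"mean={mean:.3f} ms stdev={stdev:.3f} ms ok={ok}")
```

A larger number of samples gives a more trustworthy standard deviation; a handful of pings is only a rough first check.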
CPU
Alfresco will function correctly on virtually all modern 32-bit and 64-bit CPUs. For
production use, however, Alfresco recommends a clock speed greater than 2.5 GHz to ensure
reasonable response times for the end user.
Although it is not strictly necessary, a 64-bit architecture is also recommended, primarily
because it allows the JVM to use more memory (RAM) than a 32-bit architecture.
Note: CPU clock speed is of particular concern for the Sun UltraSPARC architecture, as
some current UltraSPARC based servers ship with CPUs that have clock speeds as low
as 900 MHz, well below what is required for adequate Alfresco performance. If you intend
to use Sun servers for hosting Alfresco, please ensure that all CPUs have a clock speed
of at least 2.5 GHz.
At the time of writing, this implies that:
• an X or M class Sun server is required, with careful CPU selection to ensure 2.5 GHz (or
better) clock speed
• T class servers should not be used, as they do not support CPUs faster than
approximately 2 GHz
Alfresco is unable to provide specific guidance on Sun server classes, models
or configurations, so you should talk with your Sun reseller to confirm that minimum CPU clock
speed recommendations will be met.
Database
Disclaimer: Alfresco is unable to provide specialized support for maintaining or tuning your
relational database. You MUST have an experienced, certified DBA on staff to support
your Alfresco installation(s). [1]
[1] Typically this will not be a full time role once the database is configured and tuned and automated maintenance processes are in place. However, an experienced, certified DBA is required to get to this point.
[2] Unless your DBA recommends otherwise, Alfresco suggests performing this maintenance daily.
[3] Note: Relying on your database's automated statistics gathering mechanism may not be optimal. Consult an experienced, certified DBA for your database to confirm this.
Operating System
You should ensure that your chosen OS has been officially certified for use with Alfresco (refer
to the Supported Stacks list for details).
Alfresco is not sensitive to changes to the OS configuration, beyond the impact on I/O
performance (see the I/O section above). That said, it is recommended that a 64-bit OS be used if the
hardware (CPU, and so on) is 64-bit capable.
[4] http://dev.mysql.com/doc/refman/5.1/en/analyze-table.html
[5] http://www.postgresql.org/docs/8.4/static/maintenance.html
[6] http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/stats.htm#PFGRF003
[7] http://technet.microsoft.com/en-us/library/ms188388.aspx
[8] http://technet.microsoft.com/en-us/library/ms187348.aspx
[9] http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.cmd.doc/doc/r0001971.html
[10] http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.cmd.doc/doc/r0001980.html
Java Virtual Machine
For information on configuring and tuning the JVM, refer to the product documentation or the
following wiki page:
http://wiki.alfresco.com/wiki/JVM_Tuning
Note that Alfresco requires an official Sun 1.6 JDK (or IBM JDK, if using WebSphere). Other
JVMs (earlier versions, Harmony, gcj, JRockit, HP, and so on) are NOT supported and are
known to cause issues in various parts of the product.
Alfresco recommends using a 64-bit JVM if the underlying platform (OS and hardware) is 64-bit
capable.
Environment validation
The following environment-specific items must be validated prior to installing Alfresco. Note that
Alfresco now provides an Environment Validation tool that can validate most of the following
requirements. This tool is available at:
https://network.alfresco.com/?f=default&o=workspace://SpacesStore/f98ad411-510d-444f-8166-432a66fe172a
1. Validate that the hostname of the server is resolvable in DNS. [11]
2. Validate that the user Alfresco will run as can open sufficient file descriptors (4096 or more).
3. Validate that the ports on which Alfresco listens are available [12]:
o FTP: TCP 21 [13]
o SMTP: TCP 25 [14]
o SMB / NetBT: UDP 137,138, TCP 139,445 [15]
o IMAP: TCP 143 [16]
o SharePoint Protocol: TCP 7070 [17]
o Tomcat Administration: TCP 8005
o HTTP: TCP 8080
o RMI: TCP 50500
4. Validate that the installed JVM is Sun version 1.6.
5. Validate that the directory in which the JVM is installed does not contain spaces.
6. Validate that the directory in which Alfresco is installed does not contain spaces.
7. Validate that the directory Alfresco will use for the repository (typically called “alf_data”) is
both readable and writeable by the OS user that the Alfresco process will run as.
8. Validate that you can connect to the database as the Alfresco database user, from the
Alfresco server. [18]
[11] This is required if Alfresco is going to be configured in a cluster.
[12] Note: the ports listed here are the defaults. If you're planning on reconfiguring Alfresco to use different ports, or are enabling additional protocols (such as HTTPS, SMTP, IMAP or NFS) you should update this list with those port numbers.
[13] On Unix-like OSes that offer so-called “privileged ports”, Alfresco will normally be unable to bind to this port unless run as the root user (which is not recommended). In this case, even if this port is available, Alfresco will still fail to bind to it. For FTP services, however, this is a non-fatal error: Alfresco's FTP functionality will simply be disabled in the repository.
[14] SMTP is not enabled by default.
[15] On Unix-like OSes that offer so-called “privileged ports”, Alfresco will normally be unable to bind to this port unless run as the root user (which is not recommended). In this case, even if this port is available, Alfresco will still fail to bind to it. For CIFS services, however, this is a non-fatal error: Alfresco's CIFS functionality will simply be disabled in the repository.
[16] IMAP is not enabled by default.
[17] Some of the Alfresco bundles (specifically the WAR, EAR and Tomcat bundles) don't ship with the SharePoint Protocol enabled by default. If you're using one of these bundles you can ignore this port until/unless you install support for the SharePoint Protocol.
[18] Note: this will require installation of the database vendor's “client tools” on the Alfresco server.
9. Validate that the character encoding for the Alfresco database is UTF-8.
10. (MySQL only) Validate that the default storage engine for the database server that Alfresco
will use is InnoDB. [19]
11. Validate that the following third-party software is installed at the correct versions:
o OpenOffice v3.1 or newer
o ImageMagick v6.2 or newer
12. (RHEL and Solaris only) Validate that OpenOffice is able to run in headless mode.
Refer to the appendices in this document for OS and database-specific commands that can be
used to perform these validations.
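Two of the checks in the list above (the file descriptor limit and the availability of the default TCP ports) can be scripted. The following is a rough sketch only and is not the Environment Validation tool: the function names are hypothetical, the port table is copied from the list above, UDP ports are not checked, and the file descriptor check relies on the Unix-only resource module.

```python
import socket

# Default TCP ports taken from the list above (UDP 137/138 are not checked here).
DEFAULT_TCP_PORTS = {
    "FTP": 21,
    "SMTP": 25,
    "SMB / NetBT": 139,
    "SMB": 445,
    "IMAP": 143,
    "SharePoint Protocol": 7070,
    "Tomcat Administration": 8005,
    "HTTP": 8080,
    "RMI": 50500,
}


def tcp_port_is_free(port, host="0.0.0.0"):
    """Return True if this process can currently bind the given TCP port.

    On Unix-like systems a non-root user cannot bind privileged ports (below
    1024), so those are reported as unavailable even when nothing is listening.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def open_file_limit_ok(minimum=4096):
    """Return True if the soft limit on open files is at least `minimum` (Unix only)."""
    import resource  # imported lazily because the module does not exist on Windows

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    for name, port in DEFAULT_TCP_PORTS.items():
        state = "available" if tcp_port_is_free(port) else "NOT available"
        print(f"{name}: TCP {port} is {state}")
    print("file descriptor limit ok:", open_file_limit_ok())
```

Run the script as the OS user that Alfresco will run as, since both the descriptor limit and the ability to bind privileged ports depend on that user.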
[19] Not required as of Alfresco 3.3.
Day Zero configuration
JVM tuning
Note: the following recommendations are the bare minimum reconfiguration required by
Alfresco, but further tuning of the JVM may be necessary depending on your use of Alfresco.
Refer to the product documentation or the following wiki page:
http://wiki.alfresco.com/wiki/JVM_Tuning
With the exception of total heap size, it is not recommended to blindly set any JVM parameter
without first analyzing the running JVM and experimentally verifying that the change
improves the behavior of Alfresco for your use case.
JVM tuning is highly specific to the environment and use case, and it is trivially easy to
destroy the JVM's inherent reliability and scalability with uninformed changes to the JVM
settings.
Reduce JVM stack
If, as a result of reducing the JVM stack size, you start seeing java.lang.StackOverflowError
exceptions in the Alfresco log, you may increase the stack size in 128k increments until the
exceptions disappear.
[20] ${ALFRESCO_HOME}/alfresco.sh or %ALFRESCO_HOME%\alfresco.bat in versions up to and including 3.3SP2, ${ALFRESCO_HOME}/tomcat/scripts/ctl.sh or %ALFRESCO_HOME%\tomcat\scripts\ctl.bat in versions 3.3SP3 and above that use Tomcat.
these classes, something that is no longer relevant now that Alfresco only supports JDK 1.6 and
above.
Double check that these JIT exclusions are commented out in the startup script, as follows (note
the leading comment symbol):
# Following only needed for Sun JVMs before to 1.5 update 8
#export JAVA_OPTS="${JAVA_OPTS} -XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\$1,doBody -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeIndexes -XX:CompileCommand=exclude,org/alfresco/repo/search/impl/lucene/index/IndexInfo\$Merger,mergeDeletions"
On Windows, the “rem” command should be used in place of the Unix-shell “#” comment
symbol.
Note: newer versions of Alfresco (3.3+) no longer include this option in the startup script, so
don't be surprised if it is not present.
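The check described above can be automated by scanning the startup script for CompileCommand exclusions that are still active. This is a minimal sketch under simple assumptions: the function name active_jit_exclusions is hypothetical, a line counts as commented out only if it starts with "#" (Unix shell) or "rem" (Windows batch), and the script text is passed in as a string rather than read from a specific path.

```python
def active_jit_exclusions(script_text):
    """Return the lines of a startup script that still enable JIT exclusions."""
    active = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if "CompileCommand=exclude" not in stripped:
            continue
        first_word = stripped.split(None, 1)[0].lower()
        # '#' comments out a Unix shell line; 'rem' does the same in a batch file.
        if stripped.startswith("#") or first_word == "rem":
            continue
        active.append(stripped)
    return active


if __name__ == "__main__":
    sample = (
        "# Following only needed for Sun JVMs before to 1.5 update 8\n"
        '#export JAVA_OPTS="${JAVA_OPTS} '
        '-XX:CompileCommand=exclude,org/apache/lucene/index/IndexReader\\$1,doBody"\n'
    )
    print(active_jit_exclusions(sample))
```

An empty result means every exclusion found is commented out (or none are present, as in newer versions of the script).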
Set dir.root to absolute path
It is strongly recommended that you always set the dir.root property to an absolute file system
path before starting Alfresco for the first time. This ensures that no matter how the Alfresco
instance is started, it will always find the directories where content has previously been written.
With Tomcat, this property is found in:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties [21]
If you do not set dir.root to an absolute path, you may see a “CONTENT INTEGRITY ERROR”
message in the alfresco.log file during a second or subsequent startup of the server.
Other than requiring an absolute path, Alfresco has no specific requirements for where this
directory resides or what it is called. You should optimize the location of the file system portion
of the Alfresco repository to maximize I/O performance (as mentioned in the I/O section).
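Whether dir.root is absolute can be verified before the first startup with a small script. The sketch below is illustrative only: the helper names are hypothetical, the parser handles plain key=value lines and ignores Java properties escaping and line continuations, and the example paths are placeholders rather than recommended locations.

```python
import ntpath
import posixpath


def read_property(properties_text, key):
    """Return the last value assigned to `key` in simple key=value text, or None."""
    value = None
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


def dir_root_is_absolute(properties_text):
    """Return True if dir.root is set to an absolute Unix or Windows path."""
    value = read_property(properties_text, "dir.root")
    if value is None:
        return False
    return posixpath.isabs(value) or ntpath.isabs(value)


if __name__ == "__main__":
    # Placeholder content standing in for alfresco-global.properties.
    print(dir_root_is_absolute("dir.root=/srv/alfresco/alf_data\n"))
    print(dir_root_is_absolute("dir.root=./alf_data\n"))
```

In practice the text would be read from the alfresco-global.properties file named above.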
[21] As of Alfresco Enterprise 3.2.0. In earlier versions this property is found in ${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-repository.properties
[22] As of Alfresco Enterprise 3.2r. This number may change in future versions.
[23] Tomcat 6.0, for example, allows up to 200 concurrent HTTP requests by default.
[24] In a cluster each node has its own independent database connection pool. You must configure sufficient database connections for all of the Alfresco cluster nodes to be able to connect simultaneously.
saturates its own connection pools. Do not forget to factor in cluster nodes (which can each use
up to 275 database connections) as well as connections required by other applications that are
using the same database server as Alfresco.
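The sizing arithmetic implied above is a simple sum. The sketch below is an illustration, not an official formula: the function name is hypothetical, the figure of 275 connections per node is the one quoted above, and the number of connections needed by other applications is something you must supply yourself.

```python
def required_db_connections(cluster_nodes, per_node_max=275, other_apps=0):
    """Estimate the connection limit the database server must allow.

    cluster_nodes: number of Alfresco nodes sharing the database
    per_node_max:  connections a single node can use (275 per the text above)
    other_apps:    connections reserved for other applications on the same server
    """
    if cluster_nodes < 1:
        raise ValueError("at least one Alfresco node is required")
    if other_apps < 0:
        raise ValueError("other_apps cannot be negative")
    return cluster_nodes * per_node_max + other_apps


if __name__ == "__main__":
    # Hypothetical example: a three node cluster plus 50 connections for other applications.
    print(required_db_connections(3, other_apps=50))
```

Your DBA may want additional headroom on top of this figure for administrative connections.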
The precise mechanism for reconfiguring your database's connection limit depends on the
relational database product you are using; your DBA should be able to readily configure this.
Idle Size
By default, each Alfresco instance will, when idle, reduce the size of the database connection
pool to no more than 8 open connections at any time, in order to minimize resource usage in
both the JVM and the database.
While appropriate for evaluation and individual developer environments, this setting is not
appropriate for any kind of multi-user or high traffic installation, including but not limited to QA,
performance / scalability test, production mirror and production environments.
For these environments Alfresco recommends disabling the idle connection reclamation logic in
the database connection pool, by adding the db.pool.idle property to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties
Validation Query
By default Alfresco does not periodically validate each database connection retrieved from the
database connection pool. Validating connections is, however, very important for long running
Alfresco servers, since there are various ways database connections can unexpectedly be
closed (for example by transient network glitches and database server timeouts).
Enabling periodic validation of database connections involves adding the
db.pool.validate.query property to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco-global.properties
and setting it to one of the following values, depending on the database that is in use:
MySQL /* PING */
You may add this property anywhere in the file, although for clarity you should place it
immediately after the other database properties.
[25] Notably Oracle.
Important note: this setting globally disables quota calculations – the functionality is
completely disabled in this installation of Alfresco. For that reason this setting should not be
used if there is any requirement to use content quotas in this Alfresco instance.
It can, however, be turned back on at a later date with no side effects (beyond the expected
impact on Alfresco performance).
[26] More information is available at http://tomcat.apache.org/tomcat-6.0-doc/config/index.html.
[27] More information is available at http://wiki.alfresco.com/wiki/File_Server_Configuration#Advanced_Server_Configuration
to:
${ALFRESCO_HOME}/tomcat/shared/classes/alfresco/extension/custom-file-servers-context.xml
Remove all of the <bean> definitions except for the bean with the id “fileServerConfiguration”.
Add the following property block to the “fileServerConfiguration” bean:
<property name="coreServerConfigBean" ref="coreServerConfig" />
[28] http://www.artofsolving.com/opensource/jodconverter
[29] Refer to the product documentation.
[30] http://wiki.alfresco.com/wiki/ImageMagick_Configuration
[31] http://wiki.alfresco.com/wiki/Installing_Alfresco_components#Installing_SWFTools
Please take careful note that the first property points to the directory into which
ImageMagick is installed, whereas the second property points to the pdf2swf executable file.