APACHE 2.0
A Look Under the Hood
CHUUG, June 2002
by Cliff Woolley
jwoolley@[Link]
Introduction
Assumptions
The problems with the design of 1.3
How Apache 2.0 addresses them
Assumptions
You are somewhat familiar with
configuring Apache 1.3
You understand the concept of
Apache modules
The problems with 1.3
Non-standard configuration scripts
Porting to new and unusual platforms is
difficult
Doesn't scale well
Modules can't interact in particularly
interesting ways
How Apache 2.0 addresses these
design problems
Configuration now uses GNU autoconf
The Apache Portable Runtime (APR)
Multi-Processing Modules (MPMs)
I/O filtering
Hooks
The build environment:
Using GNU autoconf
No more APACI: now it's the real thing
No more [Link]: everything
uses ./configure arguments (hint: look at
[Link])
Autoconf's feature tests are nice from a
developer's perspective
The build environment:
The source tree layout
Modules categorized by function, not just
lumped together
Platform-specific files hidden away
Vendors can add their own module
directories
The Apache Portable Runtime
Platform Abstraction
Resource Management
Consistency, consistency, consistency
APR: Platform abstraction
Feature tests
Native OS-specific data structures hidden
behind a consistent interface
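A minimal sketch of the idea, not APR's actual code: a compile-time test picks the native API, and an opaque handle keeps the OS-specific type away from callers. All demo_* names are hypothetical, and a real build would test a macro generated by ./configure rather than _WIN32 directly.

```c
#include <stdlib.h>

/* Compile-time test: choose the native API and type for this platform. */
#if defined(_WIN32)
#include <windows.h>
typedef DWORD demo_native_pid;
#define DEMO_GETPID() GetCurrentProcessId()
#else
#include <sys/types.h>
#include <unistd.h>
typedef pid_t demo_native_pid;
#define DEMO_GETPID() getpid()
#endif

/* Hypothetical opaque handle. In a real library the struct body would live
 * in a private source file so callers never touch the native field. */
typedef struct demo_proc demo_proc;

struct demo_proc {
    demo_native_pid pid;   /* OS-specific type, hidden from callers */
};

/* Returns a handle for the current process, or NULL if allocation fails. */
demo_proc *demo_proc_current(void)
{
    demo_proc *proc = malloc(sizeof(*proc));
    if (proc != NULL) {
        proc->pid = DEMO_GETPID();
    }
    return proc;
}

/* The same accessor on every platform, whatever the native type is. */
long demo_proc_id(const demo_proc *proc)
{
    return (long)proc->pid;
}

void demo_proc_free(demo_proc *proc)
{
    free(proc);
}
```

Callers only ever see demo_proc and the three functions, so the #if block is the single place that changes per platform.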
APR: Resource management
Memory allocation handled for you
Resource lifetimes arranged into a tree
that's easy to prune
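The tree of lifetimes can be sketched roughly as follows. This is a toy model under stated assumptions, not APR's pool implementation: the demo_* names are hypothetical, and each allocation is tracked with a plain malloc for simplicity. Destroying a pool frees everything allocated from it and from all of its descendants.

```c
#include <stdlib.h>

/* One tracked allocation (toy model: one malloc per request). */
typedef struct demo_alloc {
    void *mem;
    struct demo_alloc *next;
} demo_alloc;

/* Hypothetical pool: knows its parent, its children, and its allocations. */
typedef struct demo_pool demo_pool;
struct demo_pool {
    demo_pool *parent;
    demo_pool *first_child;
    demo_pool *next_sibling;
    demo_alloc *allocs;
};

int demo_live_pools = 0;   /* bookkeeping for the sketch only */

demo_pool *demo_pool_create(demo_pool *parent)
{
    demo_pool *pool = calloc(1, sizeof(*pool));
    if (pool == NULL)
        return NULL;
    pool->parent = parent;
    if (parent != NULL) {
        /* link in at the front of the parent's child list */
        pool->next_sibling = parent->first_child;
        parent->first_child = pool;
    }
    demo_live_pools++;
    return pool;
}

/* Allocate from a pool; the caller never frees this memory directly. */
void *demo_palloc(demo_pool *pool, size_t size)
{
    demo_alloc *a = malloc(sizeof(*a));
    if (a == NULL)
        return NULL;
    a->mem = malloc(size);
    if (a->mem == NULL) {
        free(a);
        return NULL;
    }
    a->next = pool->allocs;
    pool->allocs = a;
    return a->mem;
}

/* Prune a subtree: children first, then this pool's own allocations. */
void demo_pool_destroy(demo_pool *pool)
{
    demo_pool **link;

    while (pool->first_child != NULL)
        demo_pool_destroy(pool->first_child);   /* each child unlinks itself */

    while (pool->allocs != NULL) {
        demo_alloc *a = pool->allocs;
        pool->allocs = a->next;
        free(a->mem);
        free(a);
    }

    if (pool->parent != NULL) {
        /* unlink from the parent's child list */
        link = &pool->parent->first_child;
        while (*link != pool)
            link = &(*link)->next_sibling;
        *link = pool->next_sibling;
    }
    demo_live_pools--;
    free(pool);
}
```

A per-request pool created under a longer-lived pool can be destroyed at the end of the request without tracking any individual allocation.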
APR: All about consistency
Consistent interface to the operating system
Consistent resource handling
Naming convention! (i.e., be ready for
renames)
Multi-Processing Modules
What are they?
How do you configure them?
Which one is best?
MPMs defined
A module that is specialized for managing
the process/thread model used by Apache
on a particular platform
Each has its own target OS and scalability
goals
MPM configuration
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule prefork.c>
StartServers         5
MinSpareServers      5
MaxSpareServers     10
MaxClients         150
MaxRequestsPerChild  0
</IfModule>
MPM configuration
# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule worker.c>
StartServers         2
MaxClients         150
MinSpareThreads     25
MaxSpareThreads     75
ThreadsPerChild     25
MaxRequestsPerChild  0
</IfModule>
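A small sketch of how the worker numbers relate, assuming only what the comments above say: ThreadsPerChild is constant per process and MaxClients caps simultaneous connections. The helper names are hypothetical and are not part of Apache.

```c
/* Hypothetical helper: threads available right after startup. */
int demo_worker_initial_threads(int start_servers, int threads_per_child)
{
    return start_servers * threads_per_child;
}

/* Hypothetical helper: how many processes are needed to reach MaxClients
 * when every process runs a constant number of threads. */
int demo_worker_max_processes(int max_clients, int threads_per_child)
{
    if (threads_per_child <= 0)
        return 0;   /* guard against a nonsensical setting */
    return max_clients / threads_per_child;
}
```

With the values shown (StartServers 2, ThreadsPerChild 25, MaxClients 150), that gives 50 threads at startup and at most 6 server processes.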
MPMs: How to choose
Benchmark!! (but don't trust ab)
Consider RAM usage vs. performance,
etc.
Other tunability factors too, but this is the
big one
Filtered I/O
Bucket Brigades (my specialty :)
Input Filters
Output Filters
Bucket Brigades
A convenient abstract data type
What do they look like?
How are they used?
What good are they?
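To give a rough picture of "what they look like": a brigade is a list of buckets, and each bucket refers to a chunk of data without copying it. The sketch below is a simplified stand-in with hypothetical demo_* names and a singly linked list; it is not the APR bucket API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A bucket refers to a chunk of data; it does not own or copy it. */
typedef struct demo_bucket {
    const char *data;
    size_t len;
    struct demo_bucket *next;
} demo_bucket;

/* A brigade is an ordered list of buckets. */
typedef struct demo_brigade {
    demo_bucket *head;
    demo_bucket *tail;
} demo_brigade;

/* Append a bucket that points at data; returns 0 on success, -1 on failure. */
int demo_brigade_append(demo_brigade *bb, const char *data, size_t len)
{
    demo_bucket *b = malloc(sizeof(*b));
    if (b == NULL)
        return -1;
    b->data = data;
    b->len = len;
    b->next = NULL;
    if (bb->tail != NULL)
        bb->tail->next = b;
    else
        bb->head = b;
    bb->tail = b;
    return 0;
}

/* Total number of bytes referenced by the brigade. */
size_t demo_brigade_length(const demo_brigade *bb)
{
    size_t total = 0;
    const demo_bucket *b;
    for (b = bb->head; b != NULL; b = b->next)
        total += b->len;
    return total;
}

/* Copy the brigade's contents into one buffer; returns bytes written. */
size_t demo_brigade_flatten(const demo_brigade *bb, char *out, size_t cap)
{
    size_t used = 0;
    const demo_bucket *b;
    for (b = bb->head; b != NULL && used < cap; b = b->next) {
        size_t n = b->len;
        if (n > cap - used)
            n = cap - used;   /* truncate to the space that is left */
        memcpy(out + used, b->data, n);
        used += n;
    }
    return used;
}

/* Free the buckets (not the data they point at). */
void demo_brigade_clear(demo_brigade *bb)
{
    while (bb->head != NULL) {
        demo_bucket *b = bb->head;
        bb->head = b->next;
        free(b);
    }
    bb->tail = NULL;
}
```

Because buckets only point at data, pieces from different sources can be chained and passed along without copying until someone actually needs a flat buffer.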
Input filtering
Data is pulled from the client through the
input filters
Each filter transforms the data it hands
back to its caller in some way
Order is assigned at the beginning of each
request
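The pull model can be sketched like this, using plain buffers instead of brigades to keep it short. The demo_* names are hypothetical: the caller asks the outermost filter for data, and each filter asks the next one down until the source is reached.

```c
#include <stddef.h>
#include <string.h>

typedef struct demo_in_filter demo_in_filter;
typedef size_t (*demo_read_fn)(demo_in_filter *f, char *buf, size_t cap);

struct demo_in_filter {
    demo_read_fn read;      /* called by whoever wants data from this filter */
    demo_in_filter *next;   /* the filter one step closer to the client */
    const char *src;        /* remaining input; used only by the source */
};

/* Bottom of the chain: hands back up to cap bytes of the raw input. */
size_t demo_source_read(demo_in_filter *f, char *buf, size_t cap)
{
    size_t n = strlen(f->src);
    if (n > cap)
        n = cap;
    memcpy(buf, f->src, n);
    f->src += n;
    return n;
}

/* A transforming filter: pulls from the next filter and drops '\r' bytes. */
size_t demo_strip_cr_read(demo_in_filter *f, char *buf, size_t cap)
{
    size_t n, i, out = 0;

    /* Keep pulling until there is something to return or input runs out. */
    while (out == 0 && (n = f->next->read(f->next, buf, cap)) > 0) {
        for (i = 0; i < n; i++) {
            if (buf[i] != '\r')
                buf[out++] = buf[i];
        }
    }
    return out;   /* 0 means end of input */
}
```

The order of the chain is fixed by how the next pointers are linked, which in this sketch happens once before any data is read.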
Output filtering
The most common form: interesting
things happen when old-style handlers
get converted into output filters
Data is pushed to the client through the
output filters
Again, each filter transforms the data that
passes through it
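The push model is the mirror image. In this sketch (hypothetical demo_* names, plain buffers rather than brigades), the content generator writes into the first filter, and each filter transforms the data and pushes it on to the next until it reaches the sink that stands in for the client.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_out_filter demo_out_filter;
typedef int (*demo_write_fn)(demo_out_filter *f, const char *data, size_t len);

struct demo_out_filter {
    demo_write_fn write;     /* called by whoever is upstream of this filter */
    demo_out_filter *next;   /* the filter one step closer to the client */
    char *sink;              /* output buffer; used only by the last filter */
    size_t sink_len;
    size_t sink_cap;
};

/* End of the chain: stands in for the network; returns 0 on success. */
int demo_sink_write(demo_out_filter *f, const char *data, size_t len)
{
    if (len > f->sink_cap - f->sink_len)
        return -1;   /* out of space */
    memcpy(f->sink + f->sink_len, data, len);
    f->sink_len += len;
    return 0;
}

/* A transforming filter: upper-cases the data and pushes it downstream. */
int demo_upper_write(demo_out_filter *f, const char *data, size_t len)
{
    char tmp[256];

    while (len > 0) {
        size_t i, n = len < sizeof(tmp) ? len : sizeof(tmp);
        int rc;

        for (i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)data[i]);
        rc = f->next->write(f->next, tmp, n);
        if (rc != 0)
            return rc;   /* propagate downstream errors */
        data += n;
        len -= n;
    }
    return 0;
}
```

A handler converted into an output filter takes the shape of demo_upper_write: it no longer has to produce the whole response itself, only transform whatever passes through.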
Apache modules
The module structure itself has changed. The 1.3 version:
module MODULE_VAR_EXPORT foo_module = {
    STANDARD_MODULE_STUFF,
    foo_init_Module,            /* module initializer                   */
    foo_config_perdir_create,   /* create per-dir config structures     */
    foo_config_perdir_merge,    /* merge per-dir config structures      */
    foo_config_server_create,   /* create per-server config structures  */
    foo_config_server_merge,    /* merge per-server config structures   */
    foo_config_cmds,            /* table of config file commands        */
    foo_config_handler,         /* [#8] MIME-typed-dispatched handlers  */
    foo_hook_Translate,         /* [#1] URI to filename translation     */
    foo_hook_Auth,              /* [#4] validate user id from request   */
    foo_hook_UserCheck,         /* [#5] check if the user is ok _here_  */
    foo_hook_Access,            /* [#3] check access by host address    */
    NULL,                       /* [#6] determine MIME type             */
    foo_hook_Fixup,             /* [#7] pre-run fixups                  */
    NULL,                       /* [#9] log a transaction               */
    NULL,                       /* [#2] header parser                   */
    foo_init_Child,             /* child_init                           */
    NULL,                       /* child_exit                           */
    foo_hook_ReadReq            /* [#0] post read-request               */
};
Apache modules
The module structure itself has changed. The 2.0 version:
module AP_MODULE_DECLARE_DATA foo_module = {
    STANDARD20_MODULE_STUFF,
    foo_config_perdir_create,   /* create per-dir config structures     */
    foo_config_perdir_merge,    /* merge per-dir config structures      */
    foo_config_server_create,   /* create per-server config structures  */
    foo_config_server_merge,    /* merge per-server config structures   */
    foo_config_cmds,            /* table of configuration directives    */
    foo_register_hooks          /* register hooks                       */
};
Apache modules
What happened to all the other functions?
How does a module register interest in
one of those functions?
Hooks
A new, more flexible replacement for most
of the module struct's phases
Order is runtime-selectable (mostly)
Any module can register its own hooks:
this allows a whole new level of inter-module cooperation
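A rough sketch of the mechanism, not Apache's implementation: a hook is a list of registered functions kept sorted by a requested order, and running the hook calls each one in turn. All demo_* names and the order constants are hypothetical stand-ins for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_HOOKS 16

/* Hypothetical ordering hints and return codes. */
enum { DEMO_HOOK_FIRST = 0, DEMO_HOOK_MIDDLE = 10, DEMO_HOOK_LAST = 20 };
enum { DEMO_OK = 0, DEMO_DECLINED = -1 };

/* Each hooked function appends to a trace so the call order is visible. */
typedef int (*demo_hook_fn)(char *trace);

typedef struct {
    demo_hook_fn fn;
    int order;
} demo_hook_entry;

static demo_hook_entry demo_fixups[DEMO_MAX_HOOKS];
static int demo_fixup_count;

/* Register fn for the "fixups" hook; returns 0, or -1 if the table is full. */
int demo_hook_fixups(demo_hook_fn fn, int order)
{
    int i;

    if (demo_fixup_count == DEMO_MAX_HOOKS)
        return -1;
    /* Insertion sort by order; ties keep their registration order. */
    i = demo_fixup_count;
    while (i > 0 && demo_fixups[i - 1].order > order) {
        demo_fixups[i] = demo_fixups[i - 1];
        i--;
    }
    demo_fixups[i].fn = fn;
    demo_fixups[i].order = order;
    demo_fixup_count++;
    return 0;
}

/* Run every registered function; stop early only on an error result. */
int demo_run_fixups(char *trace)
{
    int i;

    for (i = 0; i < demo_fixup_count; i++) {
        int rc = demo_fixups[i].fn(trace);
        if (rc != DEMO_OK && rc != DEMO_DECLINED)
            return rc;
    }
    return DEMO_OK;
}

/* Three pretend modules, each contributing one function to the hook. */
int demo_mod_a_fixup(char *trace) { strcat(trace, "a"); return DEMO_OK; }
int demo_mod_b_fixup(char *trace) { strcat(trace, "b"); return DEMO_DECLINED; }
int demo_mod_c_fixup(char *trace) { strcat(trace, "c"); return DEMO_OK; }
```

The point of the design is that the call order comes from what each module asks for at registration time, not from a fixed slot in a structure or the order in which modules were compiled in.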
Hooks: example
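static void register_hooks(apr_pool_t *p)
{
    /* export functions that other modules can look up at runtime */
    APR_REGISTER_OPTIONAL_FN(ap_ssi_get_tag_and_value);
    APR_REGISTER_OPTIONAL_FN(ap_ssi_parse_string);
    APR_REGISTER_OPTIONAL_FN(ap_register_include_handler);

    /* ask to run ahead of other modules' post_config hooks */
    ap_hook_post_config(include_post_config, NULL, NULL,
                        APR_HOOK_REALLY_FIRST);
    /* ask to run late in the fixups phase */
    ap_hook_fixups(include_fixup, NULL, NULL,
                   APR_HOOK_LAST);
    /* register the output filter under the name "INCLUDES" */
    ap_register_output_filter("INCLUDES", includes_filter,
                              AP_FTYPE_RESOURCE);
}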
Hooks: example cont.
static int include_post_config(apr_pool_t *p, apr_pool_t *plog,
                               apr_pool_t *ptemp, server_rec *s)
{
    include_hash = apr_hash_make(p);

    /* NULL if no module has registered this optional function */
    ssi_pfn_register =
        APR_RETRIEVE_OPTIONAL_FN(ap_register_include_handler);
    if (ssi_pfn_register) {
        ssi_pfn_register("if", handle_if);
        ssi_pfn_register("set", handle_set);
        ssi_pfn_register("else", handle_else);
    }
    return OK;
}
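The register/retrieve pattern in the example can be sketched as a name-to-function table. This is an illustration with hypothetical demo_* names, not how the APR macros are implemented: a provider publishes a function under a name, and a consumer looks it up and gets NULL if the provider is not present.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_OPTIONAL 8

/* Generic function pointer; callers cast back to the real type before use. */
typedef void (*demo_generic_fn)(void);

static struct {
    const char *name;
    demo_generic_fn fn;
} demo_optional[DEMO_MAX_OPTIONAL];
static int demo_optional_count;

/* Provider side: publish fn under name; returns 0, or -1 if the table is full. */
int demo_register_optional_fn(const char *name, demo_generic_fn fn)
{
    if (demo_optional_count == DEMO_MAX_OPTIONAL)
        return -1;
    demo_optional[demo_optional_count].name = name;
    demo_optional[demo_optional_count].fn = fn;
    demo_optional_count++;
    return 0;
}

/* Consumer side: look up by name; NULL means nobody provides it. */
demo_generic_fn demo_retrieve_optional_fn(const char *name)
{
    int i;

    for (i = 0; i < demo_optional_count; i++) {
        if (strcmp(demo_optional[i].name, name) == 0)
            return demo_optional[i].fn;
    }
    return NULL;
}

/* A function some pretend module chooses to export. */
typedef int (*demo_add_t)(int a, int b);
int demo_add(int a, int b) { return a + b; }
```

Because the lookup happens at runtime, the consumer has no link-time dependency on the provider and can simply skip the feature when the lookup returns NULL, which is what the if (ssi_pfn_register) check above does.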
Conclusion
What will I get when upgrading to Apache
2.0?
What won't I get (yet)?
Future directions
Questions?