This document summarizes a project implementing load balancing with HAproxy and fault tolerance with Linux Heartbeat. It discusses configuring HAproxy for load balancing across backend Apache web servers. Linux Heartbeat is configured to provide failover protection by monitoring the health of load balancers and transferring a virtual IP address if the primary fails. The setup virtualizes servers in VirtualBox and tests performance using Jmeter under increasing load.
Overview of load balancing for efficiency in networks, focusing on software solutions like HAproxy, which enhances scalability and fault tolerance.
HAproxy, a high availability proxy, efficiently distributes traffic using various algorithms, enhancing server protection and performance. Implementation of a load-balanced web server cluster using VirtualBox; details on configurations for HAproxy and Heartbeat, optimized for traffic distribution.
Jmeter testing performance of the load-balanced system vs. standalone Apache2, with metrics on requests, average response times, and deviations. Software load balancing is an effective, low-cost alternative for traffic handling; highlights findings, reliability issues, and future enhancement plans.
Bibliography cited, along with configuration files for HAproxy and Heartbeat, noting important settings used throughout the project.
1. Introduction
Load balancing is a technique to distribute loads across a network in order to maximize the
efficiency and use of servers. Load balancer technology is split between hardware and software
implementations. The main subject of this paper will be an applied project implementing a load
balancer with a web-server back end. The focus of the applied project is on software load balancing,
specifically with the load balancing application HAproxy. The reasons for choosing a software
implementation are feasibility as all servers in the implementation can be virtualized, and relevance as
load balancing hardware is gradually being phased out in favour of software due to costs. The design of
the applied portion was intended to provide a system that is scalable, fault tolerant, inexpensive to
implement in a production setting, and much more capable of handling large amounts of traffic when
compared to a basic web-server. Any organization that intends to provide web services will benefit
from this system's ability to handle spikes in web traffic.
Section two of this paper will introduce HAproxy, what it is and how it works. This section will
also detail how HAproxy was configured in the applied project. Section three will explain Linux
Heartbeat, where it was used in the project, and why it was needed. Section four will explain in detail
the setup of the load balanced web server system in this project. Section five will introduce
another program called Jmeter, an application used for testing web servers' load handling capabilities.
This section will detail how Jmeter is used to test the load capabilities of servers, and how I developed
my test plan. Section six discusses the conclusions that I have drawn from testing. In section seven I
list my contributions and what I accomplished, as well as the future work I would like to carry out on
this project.
2. HAproxy
HAproxy stands for high availability proxy, and is designed for distributing TCP and HTTP
traffic across multiple computing resources in a network. Load balancing is intended to help high
traffic sites and sites that experience routine spikes in request traffic. HAproxy provides intelligent load
distribution for some of the largest websites on the internet, including adult entertainment site Youporn,
which processes over 300,000 requests per second [1]. HAproxy is designed to do three things, and to
do them quickly and reliably: process incoming request traffic, distribute that traffic to the servers
in the back end, and query those back-end servers' health.
HAproxy comes with many distributions of Linux, and is also available in the apt and yum
package repositories. When installing HAproxy it is generally accepted practice to run it in a
chroot jail, and by default this is how the package manager will set it up. A chroot jail makes the
directory in which the application runs appear to that application as the root directory. In other words, “It
allows a process to see only a single subtree of a system” [2]. This isolation from the rest of the system
provides another level of security, since if HAproxy itself is compromised, it is difficult for the intruder
to access other resources or directories on the computer. It is important that the application inside
the chroot jail does not run as root, as a process with root privileges can easily escape the jail.
To configure HAproxy you can edit the configuration file, by default located in
'/etc/haproxy/haproxy.cfg' (see Appendix I). HAproxy should install itself in a chroot jail, which can be
verified by checking that the global section of the configuration file contains a line such as 'chroot
/var/lib/haproxy'. Under the defaults section, mode can be either TCP or HTTP depending on the
application. It is necessary to specify a front-end name, bind a port for HAproxy to listen on, and set a
default back-end to direct traffic towards. Finally, the most important part is to specify a back-end so
that the redirected traffic has a destination. You must specify a back-end name, ensuring that it matches the
default back-end named in the front-end section. Next, you should specify which algorithm you will use to
decide how the HTTP requests are distributed across your back-end. There are several task-specific algorithms
available in HAproxy, mostly depending on the expected average length of connections to your server.
The mode must be specified, and must match the value entered in the defaults section. Finally, the
back-end servers are listed by server name, IP, port, and whether to query the status of the server.
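The configuration steps described above can be sketched in code. The following Python function is only an illustration under stated assumptions, not the project's actual configuration: it renders a skeleton haproxy.cfg so that the back-end named by the front-end and the back-end section itself always agree. The names http_in, web_cluster, web1 and web2, and the two server IP addresses, are hypothetical.

```python
def render_haproxy_cfg(frontend, backend, listen_port, servers,
                       mode="http", algorithm="roundrobin"):
    """Return the text of a skeleton haproxy.cfg.

    servers is a list of (name, ip, port) tuples. Every name and
    address is supplied by the caller; none is taken from the
    project's real configuration.
    """
    lines = [
        "global",
        "    chroot /var/lib/haproxy",    # run HAproxy inside the chroot jail
        "",
        "defaults",
        f"    mode {mode}",               # tcp or http
        "",
        f"frontend {frontend}",
        f"    bind *:{listen_port}",      # port HAproxy listens on
        f"    default_backend {backend}",
        "",
        f"backend {backend}",             # same name as default_backend above
        f"    mode {mode}",               # agrees with the defaults section
        f"    balance {algorithm}",       # distribution algorithm
    ]
    for name, ip, port in servers:
        # "check" asks HAproxy to query the health of this server
        lines.append(f"    server {name} {ip}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical front-end, back-end and server values, for illustration only.
example_cfg = render_haproxy_cfg(
    "http_in", "web_cluster", 80,
    [("web1", "192.168.0.104", 80), ("web2", "192.168.0.105", 80)],
)
print(example_cfg)
```

Generating both sections from the same backend argument is one way to avoid the most common mistake described above, a default back-end name that does not match any back-end section.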
3. Linux Heartbeat
Heartbeat is open source software used in server clusters. It can be configured to send UDP
broadcast, multicast or unicast heartbeats in a two-node configuration. This means that heartbeat will
send packets back and forth between server nodes periodically, to query the health of the server. The
transmission of these packets can be seen in the following diagram, between load balancing nodes
192.168.0.102 and 192.168.0.103 on lines 1 through 4.
Illustration 1: Wireshark Packet Analysis demonstrating Linux Heartbeat
Heartbeat can also be used to implement fail-over protection, which was the case in this
project's implementation. A virtual IP address was created by Heartbeat and was taken over by the
primary server in the cluster. For as long as Heartbeat queried the primary server and found it to be
running, the primary load balancer would retain the virtual IP address. If the primary load balancer
were to fail, Heartbeat would wait the length of time specified in the configuration file before retrying
the primary load balancer. If it was still not working at this point, Heartbeat would assign the virtual IP
address to the backup load balancing server in the cluster. This is how fail-over protection was
achieved in this project. Heartbeat will continue to monitor the health of the non-functioning load
balancer, and if it were to come back online the virtual IP address would revert to it. This is done
automatically and is completely invisible to the user. This fail-over protection has the additional benefit
of allowing the systems administrator to take down one of the load balancers for maintenance, without
ever having to disrupt service to the web server.
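The fail-over behaviour described above can be modelled in a few lines. This is a simplified sketch, not Heartbeat's implementation: it assumes time advances in one-second ticks and that the only input is whether a heartbeat from the primary arrived in each tick. The function name and the short three-second deadtime in the example are chosen purely for illustration.

```python
def vip_owner_timeline(primary_alive, deadtime, auto_failback=True):
    """Return which node holds the virtual IP at each one-second tick.

    primary_alive is a list of booleans, one per second, saying whether
    a heartbeat from the primary was received during that second.
    """
    owner = "primary"
    silent_for = 0              # seconds since the last heartbeat from the primary
    timeline = []
    for alive in primary_alive:
        if alive:
            silent_for = 0
            if owner == "backup" and auto_failback:
                owner = "primary"   # primary recovered: virtual IP reverts to it
        else:
            silent_for += 1
            if owner == "primary" and silent_for >= deadtime:
                owner = "backup"    # primary declared dead: fail over
        timeline.append(owner)
    return timeline


# Primary is healthy for 2 s, silent for 4 s, then healthy again.
heartbeats = [True, True, False, False, False, False, True, True]
timeline = vip_owner_timeline(heartbeats, deadtime=3)
print(timeline)
```

In this model the backup holds the virtual IP only between the moment the deadtime expires and the first heartbeat received from the recovered primary.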
Heartbeat requires three files (see Appendix II) to be configured prior to operation: ha.cf, the
main configuration file; haresources, the resource configuration file; and authkeys, the authentication
file used to authenticate heartbeat packets between nodes.
4. Setup
To implement the load balanced web server cluster I decided I would have to virtualize all
servers in my implementation using VirtualBox. The two load balancing nodes are running HAproxy
v1.4.24 and Linux Heartbeat, while the back end servers are running Apache2 Web Server v2.4.7. Each
server is running Ubuntu Server 14.04.3 LTS with 512 MB memory. By default, this version of Ubuntu
comes with HAproxy, Linux Heartbeat, and Apache.
HAproxy is set to handle HTTP traffic coming in on port 80. It has been installed in a chroot jail
in /var/lib/haproxy. HAproxy has been set to send its logs to syslog through /dev/log. I chose the round-robin
algorithm in this implementation to distribute traffic, which rotates through servers in the back-end.
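Round-robin rotation is simple enough to show directly. The sketch below is a plain rotation that ignores server weights and health checks, so it is a simplification of what HAproxy actually does; the three back-end names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin(servers, n_requests):
    """Assign n_requests to servers in strict rotation."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(n_requests)]


backend = ["web1", "web2", "web3"]      # hypothetical back-end names
assignments = round_robin(backend, 9)
per_server = Counter(assignments)       # requests handled by each server
print(assignments)
print(per_server)
```

Because each server is visited once per rotation, the request counts per server never differ by more than one.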
Heartbeat was configured to broadcast over the wireless interface on UDP port 694, the default port.
Both nodes must be specified in the ha.cf configuration file (see Appendix II), and the virtual IP
address to be used must be specified in haresources. The secondary node takes over automatically when
the primary fails; auto_failback is set to on so that the virtual IP address automatically returns to
the primary once it recovers. Deadtime has been set to 30 seconds, meaning that if no heartbeat is
received from the primary for 30 seconds, it is declared dead and fail-over is triggered. The authkeys file must be
identical across both nodes, because it allows the nodes to authenticate each other. If the files differ,
Heartbeat will not be able to verify that a heartbeat packet has come from the correct node.
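To make the timing settings concrete, the sketch below reads the active directives of an ha.cf file and performs two small checks. The parser is a hypothetical helper written for this illustration, not part of Heartbeat, and it assumes times are given in whole seconds (the file format also allows values such as 1500ms, which this sketch does not handle). The sample text repeats the uncommented lines from Appendix II.

```python
def parse_ha_cf(text):
    """Collect the active (non-comment) directives of an ha.cf file.

    Returns a dict mapping each directive name to a list of argument
    strings, because directives such as "node" may appear repeatedly.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)            # directive name, then arguments
        args = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(args)
    return directives


# Active settings as they appear in Appendix II.
sample = """
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0 # Linux
ucast eth0 192.168.0.103
auto_failback on
node Balancer1
node Balancer2
"""

conf = parse_ha_cf(sample)
keepalive = int(conf["keepalive"][0])
deadtime = int(conf["deadtime"][0])
initdead = int(conf["initdead"][0])

# The sample file's comments advise that initdead be at least twice deadtime.
initdead_ok = initdead >= 2 * deadtime
# Roughly how many consecutive heartbeats are missed before a node is declared dead.
missed_before_dead = deadtime // keepalive
print(initdead_ok, missed_before_dead)
```

With a heartbeat every 2 seconds and a deadtime of 30 seconds, about fifteen consecutive heartbeats must be missed before fail-over, which makes the setup tolerant of brief network interruptions at the cost of a slower reaction to a real failure.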
Apache has been left in its default configuration as this was suitable for this project's purposes.
It is set to serve up a static web-page containing information identifying the server it came from. The
HTML file is located in /var/www/index.
The following is a network diagram of the finalized implementation:
Illustration 2: Network Diagram of the Load Balanced Server Cluster
Finally, I enabled HAproxy's stats page (see Appendix I) in order to verify the correct operation
of the load balancers and to get a graphical representation of how the system was operating. It is easy to
verify that the round robin algorithm is indeed distributing traffic evenly across the back end servers. A
sample of the stats page after several HTTP requests is as follows:
Illustration 3: HAproxy Statistics Page
5. Jmeter and Testing
Jmeter is an open source application developed specifically for testing the performance of
web sites and web applications. It allows site administrators to develop a test plan to measure and
quantify the performance of their website or applications. I created a test plan to measure the
performance of Apache2 web server and compare the performance with the load balanced web server
cluster in the project implementation. To test my system in Jmeter, I installed Jmeter on a second
machine running Ubuntu 14.04, and connected it to the same network as the load balanced server
system. I created a new test plan and added multiple thread groups. The number of threads corresponds
to the number of users. You must specify a sampler, in this case an HTTP request. This allows you to
specify the IP address, port, and path of the web site under test.
For each system I began at 500 users, each sending an HTTP request 5 times. I
increased the number of users in increments of 500 until I reached 3000, giving between 2500 and
15000 requests per run. I then compared the average,
median, and deviation of load times of each system. The results were as follows:
Illustration 4: Jmeter being used to load test Apache2 Web Server

HAproxy load balancing with Apache2 back end
Requests     2500    5000    7500   10000   12500   15000
Average       212     330     394    1166     806     815
Median         15      26      71     177     178     143
Deviation     463     805     958    2593    2013    2193

Apache2
Requests     2500    5000    7500   10000   12500   15000
Average       532    1876    2575    2885    4014    3500
Median        104     674    1030     868    1163    1028
Deviation     978    2532    3690    7221    7336    6517
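The two sets of results can be compared run by run with a short calculation. The sketch below uses only the averages reported in the tables and the test plan's 5 requests per user; the ratio it computes is one possible way to summarise the gain and is not a metric produced by Jmeter.

```python
requests_per_user = 5
users = [500, 1000, 1500, 2000, 2500, 3000]
requests = [u * requests_per_user for u in users]    # total requests per run

# Average response times copied from the results tables.
haproxy_avg = [212, 330, 394, 1166, 806, 815]
apache_avg = [532, 1876, 2575, 2885, 4014, 3500]

# How many times longer the single Apache2 server took on average.
ratios = [a / h for a, h in zip(apache_avg, haproxy_avg)]

for n, r in zip(requests, ratios):
    print(f"{n:>6} requests: Apache2 average is {r:.2f}x the load balanced average")
```

On these figures the single Apache2 server's average response time is more than twice that of the load balanced cluster in every run, although the ratio does not grow steadily with load.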
6. Conclusions
It is clear from the results in this implementation that software load balancing is an effective and
less expensive alternative to hardware. It would especially be suitable as a fast intermediate solution in
situations where large scale web services had gone down and needed to be brought back up quickly.
From the results generated by Jmeter it is clear that using the load balancing technology in conjunction
with multiple web servers provided some performance gain over using a single web server. The results
that were seen in this testing were not as dramatic as had been expected, and may imply that there are
sources of error in the test set up. In all test cases the deviation exceeded the average response time,
and this may indicate that the numbers are unreliable.
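The observation that the deviation exceeds the average in every run can be checked directly from the reported figures. The sketch below divides each deviation by the corresponding average; it assumes the deviation reported by Jmeter is a standard deviation, in which case this quotient is the coefficient of variation.

```python
# (average, deviation) pairs per run, copied from the results tables.
haproxy = [(212, 463), (330, 805), (394, 958), (1166, 2593), (806, 2013), (815, 2193)]
apache = [(532, 978), (1876, 2532), (2575, 3690), (2885, 7221), (4014, 7336), (3500, 6517)]


def dispersion(runs):
    """Return deviation divided by average for each run."""
    return [dev / avg for avg, dev in runs]


haproxy_cv = dispersion(haproxy)
apache_cv = dispersion(apache)

# True when every run has a deviation larger than its average.
all_exceed = all(cv > 1 for cv in haproxy_cv + apache_cv)
print(all_exceed)
```

A quotient above 1, together with medians far below the averages, is consistent with a small number of very slow requests dominating each run.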
There are several sources that could have introduced errors into the data generated from the
load testing. I strongly suspect that networking issues caused the deviation to be so high during testing.
It was noticed that at times, HTTP requests would hang until they had almost timed out, but had not
taken quite enough time to be dropped. This happened both with the Apache2 web server and
with the load balanced web cluster. In the screen capture in Illustration 5, a group of requests that
appear to have timed out can be seen, and they have pushed up the average response time and deviation.
Illustration 5: Jmeter screen capture depicting deviation in average response time
7. Contributions and Additional Work
What I contributed in this project was to demonstrate that software load balancing with
HAproxy is a viable technique for handling heavy web traffic. I was able to demonstrate traffic being
distributed across multiple servers, and to quantify the gains provided by even a small load balanced
system over a single web server. The server system in the project also met the stated goal of providing
fault tolerance in that there was no single point of failure which would take the entire system out of
operation. I see this project being of benefit to smaller businesses that require a more robust system
than a single web server can provide, while at the same time being more cost effective than hardware
load balancing.
Going forward with this project, I would like to improve on the testing. I believe with
improvements I can eliminate the source of error that was causing the deviation to exceed the average. I
would also like to get logging working using syslog on an external server.
Appendix II – Heartbeat Configuration Files
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other cases, they are
# respectively effective. If deterring the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug
#
#
# File to write other messages to
#
logfile /var/log/ha-log
#
#
# Facility to use for syslog()/logger
#
logfacility local0
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 2
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120
#
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694
#
# Baud rate for serial ports...
#
#baud 19200
#
# serial serialportname ...
#serial /dev/ttyS0 # Linux
#serial /dev/cuaa0 # FreeBSD
#serial /dev/cuad0 # FreeBSD 6.x
#serial /dev/cua/a # Solaris
#
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this affects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
ucast eth0 192.168.0.103
#
#
# About boolean values...
#
# Any of the following case-insensitive values will work for true:
# true, on, yes, y, 1
# Any of the following case-insensitive values will work for false:
# false, off, no, n, 0
#
#
#
# auto_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails, or
# an administrator intervenes.
#
# The possible values for auto_failback are:
# on - enable automatic failbacks
# off - disable automatic failbacks
# legacy - enable automatic failbacks in systems
# where all nodes do not yet support
# the auto_failback option.
#
# auto_failback "on" and "off" are backwards compatible with the old
# "nice_failback on" setting.
#
# See the FAQ for information on how to convert
# from "legacy" to "on" without a flash cut.
# (i.e., using a "rolling upgrade" process)
#
# The default value for auto_failback is "legacy", which
# will issue a warning at startup. So, make sure you put
# an auto_failback directive in your ha.cf file.
# (note: auto_failback can be any boolean or "legacy")
#
auto_failback on
#
#
# Basic STONITH support
# Using this directive assumes that there is one stonith
# device in the cluster. Parameters to this device are
# read from a configuration file. The format of this line is:
#
# stonith <stonith_type> <configfile>
#
# NOTE: it is up to you to maintain this file on each node in the
# cluster!
#
#stonith baytech /etc/ha.d/conf/stonith.baytech
#
# STONITH support
# You can configure multiple stonith devices using this directive.
# The format of the line is:
# stonith_host <hostfrom> <stonith_type> <params...>
# <hostfrom> is the machine the stonith device is attached
# to or * to mean it is accessible from any host.
# <stonith_type> is the type of stonith device (a list of
# supported drives is in /usr/lib/stonith.)
# <params...> are driver specific parameters. To see the
# format for a particular device, run:
# stonith -l -t <stonith_type>
#
#
# Note that if you put your stonith device access information in
# here, and you make this file publically readable, you're asking
# for a denial of service attack ;-)
#
# To get a list of supported stonith devices, run
# stonith -L
# For detailed information on which stonith devices are supported
# and their detailed configuration options, run this command:
# stonith -h
#
#stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3 rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
#
# Watchdog is the watchdog timer. If our own heart doesn't beat for
# a minute, then our machine will reboot.
# NOTE: If you are using the software watchdog, you very likely
# wish to load the module with the parameter "nowayout=0" or
# compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
# an orderly shutdown of heartbeat will trigger a reboot, which is
# very likely NOT what you want.
#
#watchdog /dev/watchdog
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node Balancer1
node Balancer2
#
# Less common options...
#
# Treats 10.10.10.254 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
#ping 10.10.10.254
#
# Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
# called group1. If either 10.10.10.254 or 10.10.10.253 are up
# then group1 is up
# Used together with ipfail below...
#
#ping_group group1 10.10.10.254 10.10.10.253
#
# HBA ping directive for Fiber Channel
# Treats fc-card-name as pseudo-cluster-member
# used with ipfail below ...
#
# You can obtain HBAAPI from http://hbaapi.sourceforge.net. You need
# to get the library specific to your HBA directly from the vendor
# To install HBAAPI stuff, all You need to do is to compile the common
# part you obtained from the sourceforge. This will produce libHBAAPI.so
# which you need to copy to /usr/lib. You need also copy hbaapi.h to
# /usr/include.
#
# The fc-card-name is the name obtained from the hbaapitest program
# that is part of the hbaapi package. Running hbaapitest will produce
# a verbose output. One of the first lines is similar to:
# Apapter number 0 is named: qlogic-qla2200-0
# Here fc-card-name is qlogic-qla2200-0.
#
#hbaping fc-card-name
#
#
# Processes started and stopped with heartbeat. Restarted unless
# they exit with rc=100
#
#respawn userid /path/name/to/run
#respawn hacluster /usr/lib/heartbeat/ipfail
#
# Access control for client api
# default is no access
#
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=haclient uid=hacluster
###########################
#
# Unusual options.
#
###########################
#
# hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
#
# deadping - dead time for ping nodes
#deadping 30
#
# hbgenmethod - Heartbeat generation number creation method
# Normally these are stored on disk and incremented as needed.
#hbgenmethod time
#
# realtime - enable/disable realtime execution (high priority, etc.)
# defaults to on
#realtime off
#
# debug - set debug level
# defaults to zero
#debug 1
#
# API Authentication - replaces the fifo-permissions-based system of the past
#
#
# You can put a uid list and/or a gid list.
# If you put both, then a process is authorized if it qualifies under either
# the uid list, or under the gid list.
#
# The groupname "default" has special meaning. If it is specified, then
# this will be used for authorizing groupless clients, and any client groups
# not otherwise specified.
#
# There is a subtle exception to this. "default" will never be used in the
# following cases (actual default auth directives noted in brackets)
# ipfail (uid=HA_CCMUSER)
# ccm (uid=HA_CCMUSER)
# ping (gid=HA_APIGROUP)
# cl_status (gid=HA_APIGROUP)
#
# This is done to avoid creating a gaping security hole and matches the most
# likely desired configuration.
#
#apiauth ipfail uid=hacluster
#apiauth ccm uid=hacluster
#apiauth cms uid=hacluster
#apiauth ping gid=haclient uid=alanr,root
#apiauth default gid=haclient
# message format in the wire, it can be classic or netstring,
# default: classic
#msgfmt classic/netstring
# Do we use logging daemon?
# If logging daemon is used, logfile/debugfile/logfacility in this file
# are not meaningful any longer. You should check the config file for logging
# daemon (the default is /etc/logd.cf)
# more information can be found in the man page.
# Setting use_logd to "yes" is recommended
#
# use_logd yes/no
#
# the interval we reconnect to logging daemon if the previous connection failed
# default: 60 seconds
#conn_logd_time 60
#
#
# Configure compression module
# It could be zlib or bz2, depending on whether you have the corresponding
# library in the system.
#compression bz2
#
# Configure compression threshold
# This value determines the threshold to compress a message,
# e.g. if the threshold is 1, then any message with size greater than 1 KB
# will be compressed, the default is 2 (KB)