Pgcluster
Load Balancing Using PGCLUSTER
Introduction
PGCluster is an open source PostgreSQL tool for
1) Replicating data between masters; the replication is
synchronous (i.e. there is no delay in duplicating the data).
2) Balancing SELECT operations between the master servers.
3) High availability of service.
Load Balancer :
1) The load balancer receives queries from the web server and sends
each query to the DB server with the lowest load rate.
2) The load factor is calculated from the number of sessions currently open.
3) The load balancer checks for problems while communicating with
a cluster DB. When a problem with a cluster DB is detected, the load balancer
separates that cluster DB from the system.
Config File : pglb.conf
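The "lowest load rate" selection in steps 1) and 2) is essentially least-sessions scheduling. A minimal shell illustration (hypothetical host names and session counts, not PGCluster's actual implementation):

```shell
# host:sessions pairs; the balancer would track these per cluster DB
servers="DB1:3 DB2:1 DB3:5"
# sort numerically on the session count and take the least-loaded host
best=$(printf '%s\n' $servers | sort -t: -k2 -n | head -1 | cut -d: -f1)
echo "$best"
# → DB2
```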
Replication Server :
1) The replication server forwards the query received from one cluster DB to
every other DB server in the cluster.
2) The replication server checks for problems while communicating with
a cluster DB. When a problem is detected, the replication
server separates that cluster DB from subsequent replications.
3) When returning a separated cluster DB to the system, or when adding
a new cluster DB, the replication server synchronizes data to that cluster DB.
Config File : pgreplicate.conf
Cluster DB Servers :
1) A cluster DB server can receive queries from the load balancer,
the replicator, or directly from the web server.
2) If a DB server receives a query from the load balancer or from
the web server, it sends the query to the replicator for duplicating the data.
Config Files : postgresql.conf, pg_hba.conf, cluster.conf
Installation
Prerequisites
wget \
http://pgfoundry.org/frs/download.php/1208/pgcluster-1.7.0rc2.tar.gz
tar -zxvf pgcluster-1.x.x.tar.gz
chown -R postgres:postgres pgcluster-1.x.x
cd pgcluster-1.x.x
mkdir /usr/local/pgcluster
chown -R postgres:postgres /usr/local/pgcluster
Compiling
cd /usr/local/src/postgresql-8.x.x
tar -zxvf /tmp/pgcluster-1.x.x-patch.tar.gz
patch -p1 < pgcluster-1.x.x-patch
After patching, build and install as for a normal PostgreSQL source tree
(./configure --prefix=/usr/local/pgcluster && make && make install).
Database Initialization
su - postgres
mkdir Data
/usr/local/pgcluster/bin/initdb -D Data
Configuration
Configuration involves editing the configuration files of all three servers.
You can get the sample conf files from these paths :
[Pgcluster source path]/src/pgcluster/pgrp/pgreplicate.conf.sample
[Pgcluster source path]/src/pgcluster/pglb/pglb.conf.sample
[DataDirectory]/cluster.conf
Editing The Config File for Load Balancer (pglb.conf) :
3) It may or may not contain log info specifying where the load balancer process output is logged.
Example
<Log_File_Info>
<File_Name> /tmp/pglb.log </File_Name>
<File_Size> 1M </File_Size>
<Rotate> 3 </Rotate>
</Log_File_Info>
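For reference, a minimal pglb.conf combines the list of cluster DB servers with the port on which the load balancer accepts client connections. The tag names below are taken from the pglb.conf.sample shipped with the 1.7 source; verify them against your copy (a Receive_Port of 5431 matches the port used in the tests later in this document).

<Cluster_Server_Info>
<Host_Name> DB1 </Host_Name>
<Port> 5432 </Port>
<Max_Connect> 32 </Max_Connect>
</Cluster_Server_Info>
<Host_Name> LB </Host_Name>
<Receive_Port> 5431 </Receive_Port>
<Max_Cluster_Num> 128 </Max_Cluster_Num>
<Use_Connection_Pooling> no </Use_Connection_Pooling>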
Editing The Config File for Replicator (pgreplicate.conf) :
1) The pgreplicate.conf file should contain the info about each DB server involved in the cluster and
the port to be used during the recovery process.
Example :
<Cluster_Server_Info>
<Host_Name> DB1 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7001 </Recovery_Port>
</Cluster_Server_Info>
Editing cluster.conf :
1) It should contain the host and port on which the replication server is running.
Example :
<Replicate_Server_Info>
<Host_Name> RP </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
2) It should contain the host on which the DB server will run, the recovery port to be
used, and the paths where rsync and pg_dump are located, for recovery purposes.
Example :
<Host_Name> DB1 </Host_Name>
<Recovery_Port> 7001 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<Pg_Dump_Path> /usr/local/pgsql/bin/pg_dump </Pg_Dump_Path>
<When_Stand_Alone> read_write </When_Stand_Alone>
3) It may optionally contain info about which databases and tables are not to be replicated.
Example :
<Not_Replicate_Info>
<DB_Name> testDB </DB_Name>
<Table_Name> test_table </Table_Name>
</Not_Replicate_Info>
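Putting the three pieces together, a minimal cluster.conf for host DB1 might look like the following (values reuse the examples above; verify against the cluster.conf sample in the data directory):

<Replicate_Server_Info>
<Host_Name> RP </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Host_Name> DB1 </Host_Name>
<Recovery_Port> 7001 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<Pg_Dump_Path> /usr/local/pgsql/bin/pg_dump </Pg_Dump_Path>
<When_Stand_Alone> read_write </When_Stand_Alone>
<Not_Replicate_Info>
<DB_Name> testDB </DB_Name>
<Table_Name> test_table </Table_Name>
</Not_Replicate_Info>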
Starting & Stopping Servers
Starting The Servers : Start the servers in the order given below
Start all DB servers
su - postgres
/usr/local/pgcluster/bin/postmaster -D Data &
Start Replicator Server :
Move pglb.conf and pgreplicate.conf to a directory named etc under the current working directory
/usr/local/pgcluster/bin/pgreplicate -lnv -D etc -W etc &
Start Load Balancer Server
/usr/local/pgcluster/bin/pglb -lnv -D etc -W etc &
Stopping The Servers : Stop the servers in the order given below
Stop Load Balancer :
/usr/local/pgcluster/bin/pglb -lnv -D etc -W etc stop &
Stop Replicator Server :
/usr/local/pgcluster/bin/pgreplicate -lnv -D etc -W etc stop &
Stop DB Servers :
/usr/local/pgcluster/bin/pg_ctl -D Data stop
pgreplicate [-D path_of_config_file] [-W path_of_work_files] [-U login_user]
[-l][-n][-v][-h][stop]
config file default path: /var/lib/pgsql/data/pgreplicate.conf
-l: print error logs in the log file.
-n: do not run in daemon mode.
-v: debug mode; requires the '-n' flag.
-h: print this help.
stop: stop pgreplicate
Steps Involved :
1) Run the test script given below.
2) It will insert records through one DB server, and the data will be replicated to the other DB servers.
3) To check whether the data got replicated, run the below commands
pgcluster1.7/bin/psql -h DB1 -c "select count(1) from test" testdb
pgcluster1.7/bin/psql -h DB2 -c "select count(1) from test" testdb
pgcluster1.7/bin/psql -h DB3 -c "select count(1) from test" testdb
4) If all the above commands return the same row count, the test was successful.
Testing HA & Recovery
Test Script :
i=1
while [ $i -le 100 ]
do
pgcluster1.7/bin/psql -p 5431 \
-c "insert into test values ($i,'from LB')" testdb
i=`expr $i + 1`
done
The above script inserts 100 records into the table 'test' consecutively. In the middle of this
execution, each server will be brought down and then started in
recovery mode in order to test RECOVERY.
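The counting idiom in the script (i=`expr $i + 1`) predates POSIX $(( )) arithmetic; a sketch of the same loop with builtin arithmetic, printing the SQL instead of invoking psql (5 iterations here for brevity):

```shell
i=1
while [ $i -le 5 ]          # the real test runs to 100
do
  echo "insert into test values ($i,'from LB')"
  i=$((i + 1))              # builtin arithmetic instead of expr
done
```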
Steps Involved :
Syntax :
Initialize mode:
pgbench -i [-h hostname] [-p port] [-s scaling_factor]
[-U login] [-P password] [-d] [dbname]
This will initialize the database with four tables (branches, tellers, accounts and history) populated with records.
The amount of data loaded is controlled by the scaling_factor.
Benchmark mode:
pgbench [-h hostname] [-p port] [-c nclients] [-t ntransactions]
[-s scaling_factor] [-n No vacuuming]
[-C Establish connection for each transaction]
[-v Do vacuuming before testing] [-S SELECT only]
[-N Do not update "branches" and "tellers"] [-f filename]
[-l log test operation] [-U login] [-P password] [-d] [dbname]
2) Initialize pgbench from the LB server using the command
pgbench -i -s 3 -p 5431 DB
The above command will populate the table accounts with 300,000 records (100,000 per unit of
scaling factor), tellers with 30, branches with 3, and history with 0 records on all DB servers.
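Standard pgbench sizing is branches = s, tellers = 10 * s, accounts = 100,000 * s, so -s 3 yields 300,000 account rows. The arithmetic can be checked in the shell:

```shell
s=3  # pgbench scaling factor from the initialization command above
echo "branches=$s tellers=$((10 * s)) accounts=$((100000 * s))"
# → branches=3 tellers=30 accounts=300000
```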
3) Test the LB with the following command
pgbench -c 200 -t 4 -S -p 5431
The output will be
starting vacuum...end.
transaction type: SELECT only
scaling factor: 3
number of clients: 200
number of transactions per client: 4
number of transactions actually processed: 800/800
tps = 98.775200 (including connections establishing)
tps = 119.329624 (excluding connections establishing)
tps : Transactions Per Second
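For repeated runs it is convenient to pull the tps figure out of the pgbench output automatically; the statistics script below relies on the same idea. A self-contained sketch against the sample output shown above:

```shell
# Sample pgbench output lines (taken from the run above)
out="tps = 98.775200 (including connections establishing)
tps = 119.329624 (excluding connections establishing)"
# First tps line = throughput including connection establishment
tps=$(printf '%s\n' "$out" | awk -F= '/^tps/ {print $2; exit}' | awk '{print $1}')
echo "$tps"
# → 98.775200
```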
4) Repeat the test with different values of clients and transactions against the cluster DB and against a
standalone DB server, and compare the outputs. You can also use the following script for this
t=4
echo "<HTML><BODY><H2>STATISTICS</H2><TABLE border=1><TR><TH>No. Of Clients</TH><TH>TPS</TH></TR>" > /tmp/load.html
for i in 50 70 100 130 150 180 200 220 245
do
echo $i
pgcluster1.7/bin/pgbench -p 5431 -c $i -t $t -S -C > /tmp/tmp
Cn=`grep '^number of clients' /tmp/tmp | awk 'BEGIN{FS=":"}{print $2}'`
TPS=`grep '^tps' /tmp/tmp | awk 'BEGIN{FS="="}{print $2}' | head -1 | awk '{print $1}'`
echo "<TR><TD>$Cn</TD><TD>$TPS</TD></TR>" >> /tmp/load.html
done
echo "</TABLE></BODY></HTML>" >> /tmp/load.html