
Pgcluster

PGCluster is an open source tool that provides: 1) synchronous data replication between master database servers; 2) load balancing of read operations across multiple master servers; 3) high availability of the database service. It consists of load balancers, database cluster servers, and replicator servers. Configuration files must be edited for each type of server. To test replication, data is inserted into tables and checked across servers. High availability is tested by bringing servers down during an insertion test.

Uploaded by

api-3856948
Copyright
© Attribution Non-Commercial (BY-NC)

High Availability & Load Balancing Using PGCluster

Introduction
PGCluster is an open source PostgreSQL tool for
1) Replicating data between masters. The replication is
synchronous (i.e. there is no delay in duplicating the data).
2) Balancing SELECT operations between the master servers.
3) High availability of service.

PGCluster is compatible with Linux, Solaris and FreeBSD.
It is not compatible with Windows.
Content

Architecture

Installation

Configuration

Starting and Stopping the servers

Testing
Architecture
PGCluster consists of the following servers:
1) Load Balancer
2) DB Cluster Servers
3) Replicator

Load Balancer :
1) The load balancer (LB) receives each query from the web server and sends
it to the DB server that has the lowest load.
2) The load of a DB server is measured by the number of sessions currently in progress.
3) The load balancer checks for problems whenever it communicates with a
Cluster DB. When a problem is detected, the load balancer detaches that
Cluster DB from the system.
Config File : pglb.conf
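
The selection rule described for the load balancer (send the query to the DB server with the fewest sessions in progress) can be illustrated with a small shell sketch. This is not PGCluster's own code: the helper name pick_least_loaded, the "host sessions" input format and the session counts are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of PGCluster): reads "host active_sessions"
# pairs on stdin and prints the host with the fewest active sessions.
# On a tie, the first host listed wins.
pick_least_loaded() {
    awk 'NR == 1 || $2 + 0 < min { min = $2 + 0; host = $1 }
         END { if (NR > 0) print host }'
}

# Example with made-up session counts for DB1..DB3.
printf '%s\n' 'DB1 12' 'DB2 4' 'DB3 9' | pick_least_loaded
```

With the made-up counts above, DB2 has the fewest sessions and is the one printed.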
Replication Server :
1) The replication server forwards each query it receives from a Cluster DB
to every other DB server in the cluster.
2) The replication server checks for problems whenever it communicates with
a Cluster DB. When a problem is detected, the replication server excludes
that Cluster DB from subsequent replications.
3) When a detached Cluster DB is returned to the system, or when a new
Cluster DB is added, the replication server synchronizes the data to that Cluster DB.
Config File : pgreplicate.conf
Cluster DB Servers :
1) The cluster DB servers can receive queries from the load balancer,
the replicator or the web server.
2) If a DB server receives a query from the load balancer or from
the web server, it sends the query to the replicator so that the data is duplicated.
Config Files : postgresql.conf, pg_hba.conf, cluster.conf
Installation
Prerequisites

wget \
http://pgfoundry.org/frs/download.php/1208/pgcluster-1.7.0rc2.tar.gz
tar -zxvf pgcluster-1.x.x.tar.gz
chown -R postgres:postgres pgcluster-1.x.x
cd pgcluster-1.x.x
mkdir /usr/local/pgcluster
chown -R postgres:postgres /usr/local/pgcluster

Compiling

./configure --prefix=/usr/local/pgcluster --enable-thread-safety


make
make install
Patching 

cd /usr/local/src/postgresql-8.x.x
tar -zxvf /tmp/pgcluster-1.x.x-patch.tar.gz
patch -p1 < pgcluster-1.x.x-patch

Database Initialization

su - postgres
mkdir Data
/usr/local/pgcluster/bin/initdb -D Data
Configuration
Configuration involves editing the configuration files of all three server types.
Sample configuration files can be found at the following paths:
[Pgcluster source path]/src/pgcluster/pgrp/pgreplicate.conf.sample
[Pgcluster source path]/src/pgcluster/pglb/pglb.conf.sample
[DataDirectory]/cluster.conf

Editing The Config File for Load Balancer (pglb.conf) :

1) The pglb.conf file should contain an entry for each DB server involved in the cluster, together with
the maximum number of sessions that may be linked to that DB server. This value needs to be less than
the max_connections setting of the DB server.
Example :
<Cluster_Server_Info>
<Host_Name> DB1 </Host_Name>
<Port> 5432 </Port>
<Max_Connect> 100 </Max_Connect>
</Cluster_Server_Info>
2) It should state the host and port on which the load balancer runs, the maximum number of
DB servers that can be involved in the cluster, and the port to be used for the recovery process.
Example :
<Host_Name> LB </Host_Name>
<Backend_Socket_Dir> /tmp </Backend_Socket_Dir>
<Receive_Port> 5431 </Receive_Port>
<Recovery_Port> 6001 </Recovery_Port>
<Max_Cluster_Num> 128 </Max_Cluster_Num>

3) Optionally, it can contain log settings that say where the load balancer process output is logged.
Example :
<Log_File_Info>
<File_Name> /tmp/pglb.log </File_Name>
<File_Size> 1M </File_Size>
<Rotate> 3 </Rotate>
</Log_File_Info>
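
The rule in item 1, that each Max_Connect value must stay below the DB server's maximum connection count, can be checked mechanically. The sketch below is an illustration only: check_max_connect is a hypothetical helper, it assumes each Max_Connect tag sits on one line as in the example above, and the limit of 128 is a made-up value.

```shell
#!/bin/sh
# Hypothetical check (not part of PGCluster): reads pglb.conf-style text on
# stdin and fails if any <Max_Connect> value is not below the given limit.
# Assumes each <Max_Connect> ... </Max_Connect> pair is on a single line.
check_max_connect() {
    limit=$1
    sed -n 's/.*<Max_Connect>[[:space:]]*\([0-9][0-9]*\)[[:space:]]*<\/Max_Connect>.*/\1/p' |
    awk -v limit="$limit" '
        BEGIN { bad = 0 }
        $1 + 0 >= limit + 0 {
            bad = 1
            print "Max_Connect " $1 " is not below the limit " limit
        }
        END { exit bad }'
}

# Example: a Max_Connect of 100 against an assumed server limit of 128.
if printf '%s\n' '<Max_Connect> 100 </Max_Connect>' | check_max_connect 128; then
    echo "ok"
fi
```

In practice the function would be fed the real file, for example `check_max_connect 128 < etc/pglb.conf`, with the limit taken from the DB server's postgresql.conf.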
Editing The Config File for Replicator (pgreplicate.conf) :
1) The pgreplicate.conf file should contain an entry for each DB server involved in the cluster and
the port to be used during the recovery process.
Example :
<Cluster_Server_Info>
<Host_Name> DB1 </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7001 </Recovery_Port>
</Cluster_Server_Info>

2) It should contain an entry for the load balancer.
Example :
<LoadBalance_Server_Info>
<Host_Name> LB </Host_Name>
<Recovery_Port> 6001 </Recovery_Port>
</LoadBalance_Server_Info>
3) It should state the host and port on which the replicator server runs and the port to be
used for the recovery process.
Example :
<Host_Name> </Host_Name>
<Replication_Port> 8001 </Replication_Port>
<Recovery_Port> 8101 </Recovery_Port>
<Response_Mode> reliable </Response_Mode>
<Use_Replication_Log> yes </Use_Replication_Log>
4) Optionally, it can contain log settings.

Editing The Config Files for DB Servers (postgresql.conf, pg_hba.conf, cluster.conf) :

Editing postgresql.conf :
1) Make the DB server listen for the replication server and the LB server by setting
listen_addresses to '*'.
2) Set the port on which the DB server runs (default: 5432).
3) Set the maximum number of client connections the DB server allows (default: max_connections = 100).
Editing pg_hba.conf :
1) Add an entry that allows the replication server and the load balancer to contact the DB server.
Example :
# TYPE DATABASE USER CIDR-ADDRESS METHOD
host all all 192.168.0.1/24 trust
host all all 192.168.0.7/24 trust

Editing cluster.conf :
1) It should state the host and port on which the replication server is running.
Example :
<Replicate_Server_Info>
<Host_Name> RP </Host_Name>
<Port> 8001 </Port>
<Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
2) It should state the host on which the DB server runs, the recovery port to be
used, and the paths where rsync and pg_dump are located for recovery purposes.
Example :
<Host_Name> DB1 </Host_Name>
<Recovery_Port> 7001 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<Rsync_Compress> yes </Rsync_Compress>
<Pg_Dump_Path> /usr/local/pgsql/bin/pg_dump </Pg_Dump_Path>
<When_Stand_Alone> read_write </When_Stand_Alone>

3) Optionally, it can list the databases and tables that are not to be replicated.
Example :
<Not_Replicate_Info>
<DB_Name> testDB </DB_Name>
<Table_Name> test_table </Table_Name>
</Not_Replicate_Info>
Starting & Stopping Servers
Starting The Servers : Start the servers in the order given below.
Start all DB servers :
su - postgres
/usr/local/pgcluster/bin/postmaster -D Data &
Start Replicator Server :
Move pglb.conf and pgreplicate.conf into a directory named etc under the current working directory.
/usr/local/pgcluster/bin/pgreplicate -lnv -D etc -W etc &
Start Load Balancer Server
/usr/local/pgcluster/bin/pglb -lnv -D etc -W etc &

Stopping The Servers : Stop the servers in the order given below.
Stop Load Balancer :
/usr/local/pgcluster/bin/pglb -lnv -D etc -W etc stop &
Stop Replicator Server :
/usr/local/pgcluster/bin/pgreplicate -lnv -D etc -W etc stop &
Stop DB Servers :
/usr/local/pgcluster/bin/pg_ctl -D Data stop
pgreplicate [-D path_of_config_file] [-W path_of_work_files] [-U login user][-l][-n][-v][-h][stop]
config file default path: /var/lib/pgsql/data/pgreplicate.conf
-l: print error logs in the log file.
-n: don't run in daemon mode.
-v: debug mode. need '-n' flag
-h: print this help
stop: stop pgreplicate

pglb [-D path_of_config_file] [-W path_of_work_files] [-n][-v][-h] [stop | restart]
config file default path: /var/lib/pgsql/data/pglb.conf
-l: print error logs in the log file.
-n: don't run in daemon mode.
-v: debug mode. need '-n' flag
-h: print this help
stop: stop pglb
restart: restart pglb
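
The start order (DB servers, replicator, load balancer) and the stop order (load balancer, replicator, DB servers) are the reverse of each other. The dry-run sketch below only prints the commands from this section in the stated order and does not execute anything. The function name print_plan is an assumption, and the paths are the ones used in the examples above.

```shell
#!/bin/sh
# Dry run only: prints the commands from this section in the stated order
# instead of executing them. BIN, "Data" and "etc" are the paths used above.
BIN=/usr/local/pgcluster/bin

# print_plan is a hypothetical helper name, not a PGCluster command.
print_plan() {
    case "$1" in
        start)
            echo "$BIN/postmaster -D Data &"
            echo "$BIN/pgreplicate -lnv -D etc -W etc &"
            echo "$BIN/pglb -lnv -D etc -W etc &"
            ;;
        stop)
            echo "$BIN/pglb -lnv -D etc -W etc stop"
            echo "$BIN/pgreplicate -lnv -D etc -W etc stop"
            echo "$BIN/pg_ctl -D Data stop"
            ;;
        *)
            echo "usage: print_plan start|stop" >&2
            return 1
            ;;
    esac
}

print_plan start
print_plan stop
```

Printing the plan first makes it easy to review the order before running the commands by hand on each host.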
Testing Multi-Master Replication
Test Script :
i=1
while [ $i -le 81 ]
do
pgcluster1.7/bin/psql -p 5432 -h DB1 \
-c "insert into test values ($i,'from DB1')" testdb
i=`expr $i + 1`
pgcluster1.7/bin/psql -p 5432 -h DB2 \
-c "insert into test values ($i,'from DB2')" testdb
i=`expr $i + 1`
pgcluster1.7/bin/psql -p 5432 -h DB3 \
-c "insert into test values ($i,'from DB3')" testdb
i=`expr $i + 1`
done
The above script inserts records into the table test on each DB server in turn. Every insertion made on one
DB server must be replicated to the other DB servers.

Steps Involved :
1) Run the above test script.
2) It inserts the records on the DB servers, and the data is replicated to the other DB servers.
3) To check whether the data was replicated, run the commands below:
pgcluster1.7/bin/psql -h DB1 -c "select count(1) from test" testdb
pgcluster1.7/bin/psql -h DB2 -c "select count(1) from test" testdb
pgcluster1.7/bin/psql -h DB3 -c "select count(1) from test" testdb
4) If all of the above commands return a row count of 81, the test is successful.
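
The final check in the steps above (every server reports the same, expected row count) can be sketched as a small helper. The function name all_counts_match is an assumption, and the counts are passed in by hand here so the sketch runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every count given after the first
# argument equals the expected value given as the first argument.
all_counts_match() {
    expected=$1
    shift
    [ "$#" -gt 0 ] || return 1
    for count in "$@"; do
        [ "$count" -eq "$expected" ] || return 1
    done
    return 0
}

# In a real run each count could come from something like:
#   psql -At -h DB1 -c "select count(1) from test" testdb
# Here the three counts are written out by hand for illustration.
if all_counts_match 81 81 81 81; then
    echo "replication OK"
else
    echo "row counts differ"
fi
```

The psql options -A (unaligned output) and -t (tuples only) make the command print just the number, which is what the helper expects.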
Testing HA & Recovery
Test Script :
i=1
while [ $i -le 100 ]
do
pgcluster1.7/bin/psql -p 5431 \
-c "insert into test values ($i,'from LB')" testdb
i=`expr $i + 1`
done

The above script inserts 100 records into the table 'test' consecutively through the load balancer (port 5431).
In the middle of this execution, a DB server is brought down and then restarted in
recovery mode in order to test recovery.
Steps Involved :
1) Run the above script.
2) While the script is running, bring down one of the DB servers by killing its process (PID).
3) The script should keep running and inserting records on the other DB servers even after one of the DB
servers has been killed.
4) The script exits after inserting 100 records into the table test. Once it has exited, check the row
count of the table on each DB server; they should all have the same count.
5) Bring the DB server back up in RECOVERY mode with the command below:
pgcluster1.7/bin/postmaster -D <DATA_DIR> -U &
6) The above command recovers the DB server with pg_dump.
7) After starting in recovery mode, the DB server comes back in sync with the other DB servers by
replicating the data from its neighbouring DB servers.
8) The data on all the DB servers should be the same.
Testing Load Balancer (Using pgbench)
Steps Involved
1) Install pgbench
cd [Pgcluster source path]/contrib/pgbench/
make
make install
pgbench will be installed in the base directory(/usr/local/pgcluster)

Syntax : 
Initialize mode:
pgbench -i [-h hostname] [-p port] [-s scaling_factor]
[-U login] [-P password][-d][dbname]
This initializes the database with four tables (branches, tellers, accounts and history) and populates them.
The scaling_factor controls how much data is loaded.
pgbench [-h hostname] [-p port][-c nclients][-t ntransactions]
[-s scaling_factor][-n No vacuuming]
[-C Establish connection for each transaction]
[-v Do vacuuming before testing ][-S Select]
[-N Do not update "branches" and "tellers". ][-f filename]
[-l log test operation ][-U login][-P password][-d][dbname]

2) Initialize pgbench from the LB server using the command
pgbench -i -s 3 -p 5431 DB
The above command initializes the accounts table with 300,000 records, tellers with 30,
branches with 3 and history with 0 on all DB servers.

3) Test the LB with the following command
pgbench -c 200 -t 4 -S -p 5431
The output will be
starting vacuum...end.
transaction type: SELECT only
scaling factor: 3
number of clients: 200
number of transactions per client: 4
number of transactions actually processed: 800/800
tps = 98.775200 (including connections establishing)
tps = 119.329624 (excluding connections establishing)
tps: transactions per second

4) Repeat the test with different numbers of clients and transactions, both against the cluster and against a
standalone DB server, and compare the outputs. The following script can be used to do this:
t=4
echo "<HTML><BODY><H2>STATISTICS</H2><TABLE border=1> <TR><TH>No.Of Clients</TH><TH>TPS</TH></TR>" > /tmp/load.html
for i in 50 70 100 130 150 180 200 220 245
do
echo $i
pgcluster1.7/bin/pgbench -p 5431 -c $i -t $t -S -C > /tmp/tmp
# number of clients as reported by pgbench
Cn=`grep -w '^number of clients' /tmp/tmp | awk 'BEGIN{FS=":"}{print $2}'`
# first tps line (including connections establishing)
TPS=`grep -w '^tps' /tmp/tmp | awk 'BEGIN{FS="="}{print $2}' | head -1 | awk '{print $1}'`
echo "<TR><TD>$Cn</TD><TD>$TPS</TD></TR>" >> /tmp/load.html
done
echo "</TABLE></BODY></HTML>" >> /tmp/load.html
