Open In App

How to Setup a PostgreSQL Database Cluster

Last Updated : 10 Oct, 2024
Comments
Improve
Suggest changes
Like Article
Like
Report

A PostgreSQL database cluster refers to a collection of databases managed by a single instance of the PostgreSQL server. Setting up a PostgreSQL cluster is an essential task for organizing multiple databases and achieving high availability, scalability, and load balancing.

Whether we are working with multiple nodes or just a single server with many databases, understanding how to set up a PostgreSQL cluster is key to effective data management. This article will discuss setting up a PostgreSQL cluster, including installation, configuration, and basic cluster management.

How to Setup a PostgreSQL Database Cluster

These steps install the PostgreSQL server and the additional tools necessary for managing and enhancing PostgreSQL Database Cluster. By enabling the service, it ensures that PostgreSQL is ready to accept connections upon system startup.

Step 1: Installing PostgreSQL

Before setting up a PostgreSQL cluster, we need to install PostgreSQL on our system. Here’s how to do it on different operating systems. If PostgreSQL is not yet installed on our system, we can install it with the following commands.

On Ubuntu/Debian

sudo apt update
sudo apt install postgresql postgresql-contrib

On CentOS/RHEL

sudo yum install postgresql-server postgresql-contrib

Initializing the Database Cluster

Once the installation is complete, initialize the database cluster by running. This ensures that the PostgreSQL service starts and runs automatically on boot

sudo systemctl start postgresql
sudo systemctl enable postgresql

Step 2: Understanding PostgreSQL Cluster Components

Understanding the components of a PostgreSQL cluster helps us to manage databases effectively. The server is responsible for processing requests, while the data directory is where all database files are stored. When you set up PostgreSQL, we are essentially creating a cluster.

A PostgreSQL cluster consists of:

  • PostgreSQL Server: The database engine that manages and processes SQL queries.
  • Multiple Databases: A single PostgreSQL instance can manage several databases.
  • Data Directory: The folder where PostgreSQL stores data files for all the databases.

Default Data Directory:

This data directory is created automatically during installation, but we can create additional clusters if necessary

  • On Ubuntu: /var/lib/postgresql/<version>/main
  • On CentOS: /var/lib/pgsql/<version>/data

Step 3: Initializing a New PostgreSQL Cluster

If we want to create a new PostgreSQL cluster (besides the default one), you need to initialize it manually. The initdb command is used to initialize a new cluster.

Query:

sudo mkdir /newdata
sudo chown postgres:postgres /newdata
sudo -u postgres initdb -D /newdata

Explanation:

In this example, the new cluster will store its data in the /newdata directory. Make sure PostgreSQL has permission to access this directory by setting the owner to the postgres user.

Step 4: Starting the PostgreSQL Cluster

The pg_ctl command starts the PostgreSQL server with the specified data directory. Using psql, we can connect to the PostgreSQL prompt and manage databases within the new cluster.

sudo -u postgres pg_ctl -D /newdata start

Connecting to the New Cluster:

We can connect to the new cluster using the psql tool. This opens the PostgreSQL prompt where we can start managing the databases inside the new cluster.

psql -d postgres

Step 5: Managing Multiple PostgreSQL Clusters

PostgreSQL allows us to manage multiple clusters on a single machine, each with its own data directory and configurations. If we want to manage multiple clusters, we can use the pg_lsclusters and pg_ctlcluster commands (specific to Ubuntu/Debian systems).

List All Clusters:

The pg_lsclusters command lists all the PostgreSQL clusters on the machine, showing their status.

sudo pg_lsclusters

Start a Specific Cluster:

This is useful if you have more than one PostgreSQL version or multiple clusters with different configurations.

sudo pg_ctlcluster <version> <cluster-name> start

Step 6: Setting Up Streaming Replication (Optional)

For high availability and scalability, we can set up streaming replication within the PostgreSQL cluster. This involves setting up a primary server (master) and one or more standby servers (replicas). Here’s a basic overview of setting up streaming replication.

1. Configure the Primary Server:

Edit the postgresql.conf file on the primary server:

sudo nano /var/lib/pgsql/data/postgresql.conf

Uncomment and set the following parameters:

wal_level = replica
max_wal_senders = 3

Next, edit the pg_hba.conf file to allow replication connections from the standby server:

host replication all <standby-ip-address>/32 md5

2. Backup the Primary Server Data:

Use pg_basebackup to create a base backup of the primary server data, which the standby will use to replicate:

sudo -u postgres pg_basebackup -h <primary-server-ip> -D /var/lib/pgsql/data -P -U replication -Fp -Xs -R

3. Start the Standby Server:

Start the standby server, and it will begin streaming changes from the primary server:

sudo systemctl start postgresql

Step 7: Basic PostgreSQL Cluster Commands

These commands are essential for managing the PostgreSQL service. Restarting and stopping the service allows for maintenance and configuration changes, while checking the status ensures the service is running correctly.

Restart PostgreSQL Cluster:

sudo systemctl restart postgresql

Stop PostgreSQL Cluster:

sudo systemctl stop postgresql

Checking the Cluster Status:

sudo systemctl status postgresql

Conclusion

Setting up a PostgreSQL database cluster is important for managing multiple databases and enhancing availability and performance. Whether we are managing a single server or multiple nodes, PostgreSQL provides flexible clustering options, including multi-database management and replication.


Next Article

Similar Reads