How to Setup a PostgreSQL Database Cluster
Last Updated :
10 Oct, 2024
A PostgreSQL database cluster refers to a collection of databases managed by a single instance of the PostgreSQL server. Setting up a PostgreSQL cluster is an essential task for organizing multiple databases and achieving high availability, scalability, and load balancing.
Whether we are working with multiple nodes or just a single server with many databases, understanding how to set up a PostgreSQL cluster is key to effective data management. This article will discuss setting up a PostgreSQL cluster, including installation, configuration, and basic cluster management.
How to Setup a PostgreSQL Database Cluster
These steps install the PostgreSQL server and the additional tools necessary for managing and enhancing PostgreSQL Database Cluster. By enabling the service, it ensures that PostgreSQL is ready to accept connections upon system startup.
Step 1: Installing PostgreSQL
Before setting up a PostgreSQL cluster, we need to install PostgreSQL on our system. Here’s how to do it on different operating systems. If PostgreSQL is not yet installed on our system, we can install it with the following commands.
On Ubuntu/Debian
sudo apt update
sudo apt install postgresql postgresql-contrib
On CentOS/RHEL
sudo yum install postgresql-server postgresql-contrib
Initializing the Database Cluster
Once the installation is complete, initialize the database cluster by running. This ensures that the PostgreSQL service starts and runs automatically on boot
sudo systemctl start postgresql
sudo systemctl enable postgresql
Step 2: Understanding PostgreSQL Cluster Components
Understanding the components of a PostgreSQL cluster helps us to manage databases effectively. The server is responsible for processing requests, while the data directory is where all database files are stored. When you set up PostgreSQL, we are essentially creating a cluster.
A PostgreSQL cluster consists of:
- PostgreSQL Server: The database engine that manages and processes SQL queries.
- Multiple Databases: A single PostgreSQL instance can manage several databases.
- Data Directory: The folder where PostgreSQL stores data files for all the databases.
Default Data Directory:
This data directory is created automatically during installation, but we can create additional clusters if necessary
- On Ubuntu: /var/lib/postgresql/<version>/main
- On CentOS: /var/lib/pgsql/<version>/data
Step 3: Initializing a New PostgreSQL Cluster
If we want to create a new PostgreSQL cluster (besides the default one), you need to initialize it manually. The initdb command is used to initialize a new cluster.
Query:
sudo mkdir /newdata
sudo chown postgres:postgres /newdata
sudo -u postgres initdb -D /newdata
Explanation:
In this example, the new cluster will store its data in the /newdata directory. Make sure PostgreSQL has permission to access this directory by setting the owner to the postgres user.
Step 4: Starting the PostgreSQL Cluster
The pg_ctl
command starts the PostgreSQL server with the specified data directory. Using psql
, we can connect to the PostgreSQL prompt and manage databases within the new cluster.
sudo -u postgres pg_ctl -D /newdata start
Connecting to the New Cluster:
We can connect to the new cluster using the psql tool. This opens the PostgreSQL prompt where we can start managing the databases inside the new cluster.
psql -d postgres
Step 5: Managing Multiple PostgreSQL Clusters
PostgreSQL allows us to manage multiple clusters on a single machine, each with its own data directory and configurations. If we want to manage multiple clusters, we can use the pg_lsclusters and pg_ctlcluster commands (specific to Ubuntu/Debian systems).
List All Clusters:
The pg_lsclusters
command lists all the PostgreSQL clusters on the machine, showing their status.
sudo pg_lsclusters
Start a Specific Cluster:
This is useful if you have more than one PostgreSQL version or multiple clusters with different configurations.
sudo pg_ctlcluster <version> <cluster-name> start
Step 6: Setting Up Streaming Replication (Optional)
For high availability and scalability, we can set up streaming replication within the PostgreSQL cluster. This involves setting up a primary server (master) and one or more standby servers (replicas). Here’s a basic overview of setting up streaming replication.
1. Configure the Primary Server:
Edit the postgresql.conf file on the primary server:
sudo nano /var/lib/pgsql/data/postgresql.conf
Uncomment and set the following parameters:
wal_level = replica
max_wal_senders = 3
Next, edit the pg_hba.conf file to allow replication connections from the standby server:
host replication all <standby-ip-address>/32 md5
2. Backup the Primary Server Data:
Use pg_basebackup to create a base backup of the primary server data, which the standby will use to replicate:
sudo -u postgres pg_basebackup -h <primary-server-ip> -D /var/lib/pgsql/data -P -U replication -Fp -Xs -R
3. Start the Standby Server:
Start the standby server, and it will begin streaming changes from the primary server:
sudo systemctl start postgresql
Step 7: Basic PostgreSQL Cluster Commands
These commands are essential for managing the PostgreSQL service. Restarting and stopping the service allows for maintenance and configuration changes, while checking the status ensures the service is running correctly.
Restart PostgreSQL Cluster:
sudo systemctl restart postgresql
Stop PostgreSQL Cluster:
sudo systemctl stop postgresql
Checking the Cluster Status:
sudo systemctl status postgresql
Conclusion
Setting up a PostgreSQL database cluster is important for managing multiple databases and enhancing availability and performance. Whether we are managing a single server or multiple nodes, PostgreSQL provides flexible clustering options, including multi-database management and replication.
Similar Reads
PostgreSQL ALTER DATABASE
The PostgreSQL ALTER DATABASE statement is a powerful tool used for modifying an existing database.  The features of a database, once created can be changed using the ALTER DATABASE statement. Let's explore the syntax of PostgreSQL ALTER DATABASE, the various actions that can be performed, and prac
3 min read
How to set up a PostgreSQL Database with Podman
Podman is a tool that developers and system administrators are using more and more to manage and deploy their software stacks as the need for containerized applications grows. We will look at how to use Podman to set up and maintain a PostgreSQL database in this tutorial. The overview of PostgreSQL
7 min read
PostgreSQL - Size of a Database
Efficient database management is essential for ensuring optimal performance in PostgreSQL. One critical aspect of this is monitoring the database size to manage storage and plan for scaling. PostgreSQL offers powerful built-in functions like pg_database_size() to calculate the size of a specific dat
4 min read
How to Dump and Restore PostgreSQL Database?
PostgreSQL remains among the most efficient and widely applied open-source relational database management systems. It provides the superior function of saving, configuring, and extracting information most effectively. In the process of migrating data, creating backups, or transferring databases betw
6 min read
Python PostgreSQL - Create Database
In this article, we will discuss how to create database in PostgreSQL using pysopg2 in Python. CREATE DATABASE is one of the Data Definition Language ( DDL ) statements supported by the PostgreSQL Database Management System. It is used to create database in PostgreSQL. Database name should be always
1 min read
PostgreSQL - Create Database
Creating a database in PostgreSQL is an important task for developers and database administrators to manage data effectively. PostgreSQL provides multiple ways to create a database, catering to different user preferences, whether through the command-line interface or using a graphical interface like
5 min read
PostgreSQL - Backup Database
All commercial firms and corporations are never 100% free of data threats. Being wary of scenarios like data corruption, host or network failures or even deliberate storage facility destruction is extremely important. Therefore, backing up data is a critical activity that every firm depends on to en
12 min read
PostgreSQL - Restore Database
Restoring a PostgreSQL database is an essential process for data recovery, especially when dealing with data loss, corruption, or unintentional errors. The ability to efficiently restore a database ensures that an organization can maintain data integrity, availability, and consistency. In this artic
5 min read
How to Export PostgreSQL Database Without Data Using SQL?
When we are working with the PostgreSQL database, there are multiple times we need to export the database structure. This approach is useful when we create a skeleton database or migrate the schema changes for different environments or systems. In this article, we will explore the process of exporti
3 min read
PostgreSQL - Show Databases
In PostgreSQL, viewing a list of all databases on a server requires specific commands, as it doesnât support a direct SHOW DATABASES statement like MySQL. Instead, you can use the \l or \l+ commands in psql or query the pg_database view to display all databases. In this article, we will guide us thr
3 min read