Incremental Backups in PostgreSQL 17

 The old adage that a DBA or SRE is only as good as their last backup is true.  PostgreSQL 17 added the ability to combine multiple incremental backups with a full backup to provide a complete data dictionary to recover a failed system. It is very easy to use.

This is a quick example of using incrementals. I recommend watching this video by Robert Haas for more details and some discussion of backup strategies.

Step 1 - Enable WALL Summerization


The incremental backups require Write Ahead Log Summarization. It is likely not on your instance by default. But it is easy to enable.

demo=# alter system set summerize_wal = 'on';
ALTER SYSTEM
demo=# select pg_reload_conf();
pg_reload_conf 
----------------
 t
(1 row)

Step 2 - Full Backup

The examples below are much simpler than you will run into in real life.  The scenario is that we make a full backup on Sunday and do incrementals the rest of the week. Wednesday through Saturday are omitted below. 

stoker@ThinkPad:~$ mkdir full monday tuesday

stoker@ThinkPad:~$ pg_basebackup -D full --verbose
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 6/12000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_21051"
pg_basebackup: write-ahead log end point: 6/12000158
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
stoker@ThinkPad:~$ 


Step 3 - First Incremental (Monday)

On Monday, the first incremental backup is created. The manifest, with all the WAL information on what has changed, needed to be referenced.

stoker@ThinkPad:~$ pg_basebackup -D monday -i ./full/backup_manifest --verbose
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 6/14000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_22703"
pg_basebackup: write-ahead log end point: 6/14000120
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed


Step 4 - Second Incremental (Tuesday)

On Tuesday, another incremental backup is created.  

We could have just recorded the changes since the last incremental backup by referring to the backup manifest in the Monday directory. In that case, to completely restore the data, we need to combine the full backup and both the Monday and Tuesday incremental backups. That is a very valid way of operating. If the data does not churn a great deal, this can save time making incrementals. 

My personal preference is to base incrementals on full backups.  This means only the full and Tuesday backups must be used for complete restoration. This reduction in logical steps may seem trivial, but tempting a Murphy's Law Ticket is something I avoid after too many restorations that are too early in the morning after catastrophic system failures. 

stoker@ThinkPad:~$ pg_basebackup -D tuesday -i ./full/backup_manifest --verbose
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 6/16000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_22870"
pg_basebackup: write-ahead log end point: 6/16000120
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

Step 5 - Combine Full, Monday, and Tuesday

Sometime after the Tuesday incremental backup, something horrible happens to your instance (probably someone with too many instance privileges made a DELETE query without a WHERE clause (AGAIN!?!?)). It is simple to create a new home for the data and combine the Full and Tuesday backup data into that new home.

stoker@ThinkPad:~$ mkdir newdatadir
stoker@ThinkPad:~$ /usr/lib/postgresql/17/bin/pg_combinebackup full tuesday -o newdatadir
stoker@ThinkPad:~$ 

Step 6 - Restart


Make sure you 'chmod 700' your new datadir and that your postgres account is the owner. Stop the running instance. You're now ready to use the restored data directory. Point pg_ctl at the new datadir, and you are done.

Summary


Incremental backups will make your life easier, and this feature is a big reason for those not on PostgreSQL 17 to upgrade. 


Comments

Popular posts from this blog

Can Artificial Intelligence Created Better Tables Than You?

Saving The Output From psql

PostgreSQL's COPY and \COPY