Work with PostgreSQL database WAL log files

Datastream reads changes from a PostgreSQL source through the database's write-ahead log (WAL), which is its transaction log. The log is stored in WAL files on the database server. Each WAL record represents a single change to the data in one of the database's tables.

Set configuration parameters for PostgreSQL WAL files

We recommend that you apply the following configuration settings to your PostgreSQL database:

  • max_slot_wal_keep_size: set this parameter (available only in PostgreSQL 13 and later) to limit the amount of WAL storage that replication slots can retain. This is particularly important for long-running transactions, which in extreme cases can cause WAL files to fill the entire disk and crash the database.

  • statement_timeout: set this parameter to a value of your choice to reduce latency caused by long-running transactions. You can also use statement_timeout as an alternative precautionary measure for databases that don't support max_slot_wal_keep_size.

  • wal_sender_timeout: set this parameter to 0 (to disable the timeout) or to a value greater than or equal to 10 minutes.
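
The recommendations above can be expressed as a small check. The following Python sketch is illustrative only: the function name review_wal_settings and the input format are assumptions, not part of Datastream or PostgreSQL. It assumes the values are integers in the base units PostgreSQL uses for these parameters (MB for max_slot_wal_keep_size, milliseconds for both timeouts).

```python
from typing import Dict, List

TEN_MINUTES_MS = 10 * 60 * 1000  # wal_sender_timeout threshold from the list above


def review_wal_settings(settings: Dict[str, int], server_major_version: int) -> List[str]:
    """Return one warning per setting that departs from the recommendations.

    Assumed units: max_slot_wal_keep_size in MB, both timeouts in milliseconds.
    A missing key is treated as the PostgreSQL default.
    """
    warnings: List[str] = []

    # max_slot_wal_keep_size exists only in PostgreSQL 13 and later;
    # -1 (the default) lets replication slots retain an unlimited amount of WAL.
    if server_major_version >= 13 and settings.get("max_slot_wal_keep_size", -1) == -1:
        warnings.append("max_slot_wal_keep_size is unlimited")

    # statement_timeout = 0 (the default) disables the timeout.
    if settings.get("statement_timeout", 0) == 0:
        warnings.append("statement_timeout is disabled")

    # wal_sender_timeout should be 0 (disabled) or at least 10 minutes;
    # the PostgreSQL default is 60 seconds.
    sender_timeout = settings.get("wal_sender_timeout", 60_000)
    if sender_timeout != 0 and sender_timeout < TEN_MINUTES_MS:
        warnings.append("wal_sender_timeout is below 10 minutes")

    return warnings
```

An empty list means that all three settings match the recommendations. The check only flags a disabled statement_timeout; choosing a suitable value remains up to you.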

If you plan to create more than 10 streams, or if the number of planned streams plus the number of logical replication slots used by other resources exceeds 10, make sure to modify the following parameters:

  • max_replication_slots: increase the value of this parameter to cover the number of replication slots that your database needs (you need 1 replication slot per stream). You can set max_replication_slots only at server start.

  • max_wal_senders: increase the value of this parameter so that it's greater than the value of the max_replication_slots parameter. You can set max_wal_senders only at server start.
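
As a rough sizing aid, the two rules above (one replication slot per stream, and max_wal_senders greater than max_replication_slots) can be sketched in Python. The function name minimum_replication_limits is hypothetical, and the result is a lower bound rather than a recommended value; you might want extra headroom.

```python
from typing import Dict


def minimum_replication_limits(planned_streams: int, other_slots: int = 0) -> Dict[str, int]:
    """Return the smallest parameter values that satisfy the two rules above.

    planned_streams: number of streams you plan to create (1 slot each).
    other_slots: logical replication slots used by other resources.
    """
    if planned_streams < 0 or other_slots < 0:
        raise ValueError("counts must be non-negative")

    # One replication slot per stream, plus slots already used elsewhere.
    slots = planned_streams + other_slots

    # max_wal_senders must be greater than max_replication_slots.
    return {"max_replication_slots": slots, "max_wal_senders": slots + 1}
```

For example, with 12 planned streams and 3 slots used by other resources, the function returns 15 for max_replication_slots and 16 for max_wal_senders.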

Optimize WAL log files

To avoid high stream latency and rapid growth of WAL files when replicating data from a PostgreSQL source, consider taking the following precautions:

  • Avoid large, long-running operations because they can significantly increase the size of your WAL files.
  • Use UNLOGGED or TEMPORARY tables during batch operations.
  • Check your WAL configuration and consider reducing the checkpoint frequency. For more information, see WAL Configuration in the PostgreSQL documentation.
  • Check for large DELETE operations and consider replacing them with TRUNCATE operations. Doing this can significantly reduce the amount of WAL data. However, be cautious, because Datastream doesn't replicate TRUNCATE operations.

What's next