Categories
SysOps

How to perform PostgreSQL point-in-time recovery

This blog post is an update to the earlier PostgreSQL entries in which I will show you how to use continuous Write Ahead Log archiving to take advantage of the point-in-time recovery.

Table of contents

Initial notes

I will use Debian Jessie and PostgreSQL 9.4 database server.

debian:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.2 (jessie)
Release:        8.2
Codename:       jessie
postgres@debian:~$ psql -A -t -c "select version()"
PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit

Database configuration

Create archive/wal, archive/base, archive/repo and archive/temp directories inside postgres home directory.

debian:~$ sudo su - postgres
postgres@debian:~$ mkdir -p archive/{wal,base,repo,temp}

Set proper permissions on the created directories.

postgres@debian:~$ chmod -R 0700 archive
archive/wal directory will be used to store Write Ahead Log file segments continuously, archive/base will contain basic database backup, archive/repo will serve as an old backup repository, archive/temp will be used only temporarily.

Specify logging information required for Write Ahead Log archiving.

debian:~$ sudo sed -i -e "s/^#wal_level = minimal/wal_level = archive/" /etc/postgresql/9.4/main/postgresql.conf

Enable archive mode on the database server.

debian:~$ sudo sed -i -e "s/^#archive_mode = off/archive_mode = on/" /etc/postgresql/9.4/main/postgresql.conf

Define shell command that will be used to store each completed Write Ahead Log file segment.

debian:~$ sudo sed -i -e "s/^#archive_command = ''/archive_command = 'test ! -f \/var\/lib\/postgresql\/archive\/wal\/%f.xz \&\& xz < %p > \/var\/lib\/postgresql\/archive\/wal\/%f.xz'/" /etc/postgresql/9.4/main/postgresql.conf
%f parameter will be expanded to file name, %p parameter will be expanded to pathname relative to the cluster data directory.

Restart PostgreSQL service to apply changes.

postgres@debian:~$ pg_ctlcluster 9.4 main restart

Create initial database backup

I will use standard file system utilities instead pg_basebackup as this utility is available since PostgreSQL 9.1 and not before that. Repeat this procedure every week or month to keep backups off-line and speed up the recovery process. You can use rsync to keep backups on different servers.

Backup currently archived Write Ahead Log file segments to clean the archive/wal directory.

postgres@debian:~$ mv /var/lib/postgresql/archive/wal/* /var/lib/postgresql/archive/temp/
postgres@debian:~$ tar --remove-files -cf /var/lib/postgresql/archive/repo/wal_$(hostname)_$(date +%Y-%m-%d_%H%M).tar /var/lib/postgresql/archive/temp/*

Backup currently archived base directory to clean the archive/base directory.

postgres@debian:~$ tar --remove-files -cfJ /var/lib/postgresql/archive/repo/base_$(hostname)_$(date +%Y-%m-%d_%H%M).txz /var/lib/postgresql/archive/base/*

Create a checkpoint in the database and write a backup label file to start backup mode.

postgres@debian:~$ psql -A -t -c "select pg_start_backup('Backup created $(date +%d/%m/%Y %H:%M) at $(hostname)')"
0/5C000028

Copy database to defined earlier archive/base location.

postgres@debian:~$ cp -pr /var/lib/postgresql/9.4/main/* /var/lib/postgresql/archive/base/

Stop backup mode and perform the automatic switch to the next Write Ahead Log segment.

postgres@debian:~$ psql  -A -t -c "select pg_stop_backup()"
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
0/60003E20

Remove obsolete postmaster.opts and postmaster.pid files from backup.

postgres@debian:~$ rm /var/lib/postgresql/archive/base/postmaster.*

Remove Write Ahead Log file segments stored inside base backup as these files are very likely outdated, and we have already gathered current files inside archive/wal directory.

postgres@debian:~$ rm -r /var/lib/postgresql/archive/base/pg_xlog/*

Now you have complete backup inside /var/lib/postgresql/archive/base/ directory and current Write Ahead Log compressed file segments inside /var/lib/postgresql/archive/wal/ directory.

The recovery process – restore to the most recent state

Stop the database server.

postgres@debian:~$ pg_ctlcluster 9.4 main stop

Backup current database files.

postgres@debian:~$ mv /var/lib/postgresql/9.4/main /var/lib/postgresql/9.4/main.before_recovery

Copy archived database files.

postgres@debian:~$ cp -rp /var/lib/postgresql/archive/base /var/lib/postgresql/9.4/main

Create a simple recovery file to restore every available Write Ahead Log file segment.

postgres@debian:~$ cat << EOF | tee /var/lib/postgresql/9.4/main/recovery.conf
# provide missing wal file segments
restore_command = 'xz --decompress < /var/lib/postgresql/archive/wal/%f.xz > %p'
EOF
This solution introduces recovery.conf recovery configuration file as Write Ahead Log file segments are compressed and stored in a different location.

Start the database server to perform recovery.

postgres@debian:~$ pg_ctlcluster 9.4 main start

Monitor recovery process.

postgres@debian:~$ tail /var/log/postgresql/postgresql-9.4-main.log
2016-01-22 16:35:56 CET [6622-1] LOG:  database system was interrupted; last known up at 2016-01-22 16:13:48 CET
2016-01-22 16:35:56 CET [6622-2] LOG:  starting archive recovery
2016-01-22 16:35:56 CET [6622-3] LOG:  restored log file "00000001000000000000005C" from archive
2016-01-22 16:35:56 CET [6622-4] LOG:  redo starts at 0/5C000090
2016-01-22 16:35:56 CET [6622-5] LOG:  restored log file "00000001000000000000005D" from archive
2016-01-22 16:35:56 CET [6622-6] LOG:  restored log file "00000001000000000000005E" from archive
2016-01-22 16:35:57 CET [6622-7] LOG:  restored log file "00000001000000000000005F" from archive
2016-01-22 16:35:57 CET [6622-8] LOG:  restored log file "000000010000000000000060" from archive
2016-01-22 16:35:57 CET [6622-9] LOG:  consistent recovery state reached at 0/60003E20
2016-01-22 16:35:57 CET [6622-10] LOG:  restored log file "000000010000000000000061" from archive
2016-01-22 16:35:57 CET [6622-11] LOG:  restored log file "000000010000000000000062" from archive
2016-01-22 16:35:58 CET [6622-12] LOG:  restored log file "000000010000000000000063" from archive
2016-01-22 16:35:58 CET [6622-13] LOG:  restored log file "000000010000000000000064" from archive
sh: 1: cannot open /var/lib/postgresql/archive/wal/000000010000000000000065.xz: No such file
2016-01-22 16:35:58 CET [6622-14] LOG:  redo done at 0/64003E78
2016-01-22 16:35:58 CET [6622-15] LOG:  last completed transaction was at log time 2016-01-22 16:22:20.237615+01
2016-01-22 16:35:59 CET [6622-16] LOG:  restored log file "000000010000000000000064" from archive
sh: 1: cannot open /var/lib/postgresql/archive/wal/00000002.history.xz: No such file
2016-01-22 16:35:59 CET [6622-17] LOG:  selected new timeline ID: 2
sh: 1: cannot open /var/lib/postgresql/archive/wal/00000001.history.xz: No such file
2016-01-22 16:36:00 CET [6622-18] LOG:  archive recovery complete
2016-01-22 16:36:01 CET [6622-19] LOG:  MultiXact member wraparound protections are now enabled

The recovery process – restore to the specific point in time

Stop the database server.

postgres@debian:~$ pg_ctlcluster 9.4 main stop

Backup current database files.

postgres@debian:~$ mv /var/lib/postgresql/9.4/main /var/lib/postgresql/9.4/main.before_recovery

Copy archived database files.

postgres@debian:~$ cp -rp /var/lib/postgresql/archive/base /var/lib/postgresql/9.4/main

Create recovery file to restore the database to the point in time specified by recovery_target_time directive.

postgres@debian:~$ cat << EOF | tee /var/lib/postgresql/9.4/main/recovery.conf
# provide missing wal file segments
restore_command = 'xz --decompress < /var/lib/postgresql/archive/wal/%f.xz > %p'

# recovery target time
recovery_target_time='2016-01-22 23:05:10'

# end recovery after reaching target time
pause_at_recovery_target=false        # 9.4
# recovery_target_action=promote      # 9.5

# apply commits older than target time
recovery_target_inclusive = false
EOF

Start the database server to perform recovery.

postgres@debian:~$ pg_ctlcluster 9.4 main start

Monitor recovery process.

postgres@debian:~$ tail /var/log/postgresql/postgresql-9.4-main.log
2016-01-22 23:06:55 CET [8062-1] LOG:  database system was interrupted; last known up at 2016-01-22 23:03:42 CET
2016-01-22 23:06:55 CET [8062-2] LOG:  starting point-in-time recovery to 2016-01-22 23:05:10+01
2016-01-22 23:06:55 CET [8062-3] LOG:  restored log file "000000010000000000000003" from archive
2016-01-22 23:06:55 CET [8062-4] LOG:  redo starts at 0/3000090
2016-01-22 23:06:55 CET [8062-5] LOG:  restored log file "000000010000000000000004" from archive
2016-01-22 23:06:55 CET [8062-6] LOG:  consistent recovery state reached at 0/4000050
2016-01-22 23:06:55 CET [8062-7] LOG:  restored log file "000000010000000000000005" from archive
2016-01-22 23:06:55 CET [8062-8] LOG:  redo done at 0/5010A20
2016-01-22 23:06:55 CET [8062-9] LOG:  last completed transaction was at log time 2016-01-22 23:05:01.436339+01
2016-01-22 23:06:56 CET [8062-10] LOG:  restored log file "000000010000000000000005" from archive
2016-01-22 23:06:56 CET [8062-11] LOG:  selected new timeline ID: 2
2016-01-22 23:06:57 CET [8062-12] LOG:  archive recovery complete
2016-01-22 23:06:57 CET [8062-13] LOG:  MultiXact member wraparound protections are now enabled

The recovery process – restore to the named restore point

Create a named restore point using the following function.

postgres@debian:~$ psql -A -t -c "select pg_create_restore_point('Restore point before important operation');"
0/50000F8
This solution is available since PostgreSQL 9.1.

Stop the database server.

postgres@debian:~$ pg_ctlcluster 9.4 main stop

Backup current database files.

postgres@debian:~$ mv /var/lib/postgresql/9.4/main /var/lib/postgresql/9.4/main.before_recovery

Copy archived database files.

postgres@debian:~$ cp -rp /var/lib/postgresql/archive/base /var/lib/postgresql/9.4/main

Create recovery file to restore the database the named restore point specified by recovery_target_name directive.

postgres@debian:~$ cat << EOF | tee /var/lib/postgresql/9.4/main/recovery.conf
# provide missing wal file segments
restore_command = 'xz --decompress < /var/lib/postgresql/archive/wal/%f.xz > %p'

# recovery target name
recovery_target_name='Restore point before important operation'
EOF

Start the database server to perform recovery.

postgres@debian:~$ pg_ctlcluster 9.4 main start

Monitor recovery process.

postgres@debian:~$ tail -f /var/log/postgresql/postgresql-9.4-main.log
2016-01-23 02:03:58 CET [10929-1] LOG:  database system was interrupted; last known up at 2016-01-23 01:54:31 CET
2016-01-23 02:03:58 CET [10929-2] LOG:  creating missing WAL directory "pg_xlog/archive_status"
2016-01-23 02:03:58 CET [10929-3] LOG:  starting point-in-time recovery to "Restore point before important operation"
2016-01-23 02:03:58 CET [10929-4] LOG:  restored log file "000000010000000000000002" from archive
2016-01-23 02:03:58 CET [10929-5] LOG:  redo starts at 0/2000090
2016-01-23 02:03:58 CET [10929-6] LOG:  consistent recovery state reached at 0/20000B8
2016-01-23 02:03:58 CET [10929-7] LOG:  restored log file "000000010000000000000003" from archive
2016-01-23 02:03:58 CET [10929-8] LOG:  restored log file "000000010000000000000004" from archive
2016-01-23 02:03:58 CET [10929-9] LOG:  restored log file "000000010000000000000005" from archive
2016-01-23 02:03:58 CET [10929-10] LOG:  recovery stopping at restore point "Restore point before important operation", time 2016-01-23 01:55:42.981348+01
2016-01-23 02:03:58 CET [10929-11] LOG:  redo done at 0/5000090
2016-01-23 02:03:58 CET [10929-12] LOG:  last completed transaction was at log time 2016-01-23 01:55:21.700315+01
2016-01-23 02:03:58 CET [10929-13] LOG:  selected new timeline ID: 2
sh: 1: cannot open /var/lib/postgresql/archive/wal/00000001.history.xz: No such file
2016-01-23 02:04:00 CET [10929-14] LOG:  archive recovery complete
2016-01-23 02:04:00 CET [10929-15] LOG:  MultiXact member wraparound protections are now enabled

The recovery process – restore to the specific point in time using hot standby mode

This solution requires hot_standby operation mode, so you need to specify the proper format for archived Write Ahead Log file segments.

Change logging information from archive to hot_standby mode.

debian:~$ sudo sed -i -e "s/^wal_level = archive/wal_level = hot_standby/" /etc/postgresql/9.4/main/postgresql.conf

Enable hot_standby mode to allow queries during the recovery process.

debian:~$ sudo sed -i -e "s/^#hot_standby = off/hot_standby = on/" /etc/postgresql/9.4/main/postgresql.conf

Restart database server.

postgres@debian:~$ pg_ctlcluster 9.4 main restart
Perform new backup at this moment as you need to have Write Ahead Log file segments using hot_standby logging information.

Stop the database server.

postgres@debian:~$ pg_ctlcluster 9.4 main stop

Backup current database files.

postgres@debian:~$ mv /var/lib/postgresql/9.4/main /var/lib/postgresql/9.4/main.before_recovery

Copy archived database files.

postgres@debian:~$ cp -rp /var/lib/postgresql/archive/base /var/lib/postgresql/9.4/main

Create a recovery file to restore the database to a specified point in time.

postgres@debian:~$ cat << EOF | tee /var/lib/postgresql/9.4/main/recovery.conf
# provide missing wal file segments
restore_command = 'xz --decompress < /var/lib/postgresql/archive/wal/%f.xz > %p'

# recovery target time
recovery_target_time='2016-01-22 23:05:10'

# end recovery after reaching target time
pause_at_recovery_target=true         # 9.4
# recovery_target_action=pause        # 9.5

# apply commits older than or equal to the target time
recovery_target_inclusive = true
EOF

Start the database server to perform recovery.

postgres@debian:~$ pg_ctlcluster 9.4 main start
Query database to verify the desired state as it is working in read-only mode.

You can easily verify that database is in recovery mode.

postgres@debian:~$ psql  -A -t -c "select  pg_is_xlog_replay_paused()"
t

Update recovery target time to a new value to prepare for the next recovery step.

postgres@debian:~$ cat << EOF | tee /var/lib/postgresql/9.4/main/recovery.conf
# provide missing wal file segments
restore_command = 'xz --decompress < /var/lib/postgresql/archive/wal/%f.xz > %p'

# recovery target time
recovery_target_time='2016-01-23 00:05:00'

# pause recovery after reaching target time
pause_at_recovery_target=true         # 9.4
# recovery_target_action=pause        # 9.5

# apply commits older than or equal to the target time
recovery_target_inclusive = true
EOF

Restart the database server to continue recovery.

postgres@debian:~$ pg_ctlcluster 9.4 main restart
Repeat previous steps (recovery target time update and database restart) till you reach the desired database state.

Restore normal operations after reaching the desired database state.

postgres@debian:~$ psql -A -t -c "select pg_xlog_replay_resume()"

Monitor recovery process.

postgres@debian:~$ tail /var/log/postgresql/postgresql-9.4-main.log
2016-01-23 00:30:48 CET [10271-1] LOG:  database system was interrupted; last known up at 2016-01-22 23:58:03 CET
2016-01-23 00:30:48 CET [10271-2] LOG:  creating missing WAL directory "pg_xlog/archive_status"
2016-01-23 00:30:48 CET [10271-3] LOG:  starting point-in-time recovery to 2016-01-22 23:57:00+01
2016-01-23 00:30:48 CET [10271-4] LOG:  restored log file "000000010000000000000002" from archive
2016-01-23 00:30:48 CET [10271-5] LOG:  redo starts at 0/2000028
2016-01-23 00:30:48 CET [10271-6] LOG:  consistent recovery state reached at 0/2000128
2016-01-23 00:30:48 CET [10270-1] LOG:  database system is ready to accept read only connections
2016-01-23 00:30:48 CET [10271-7] LOG:  restored log file "000000010000000000000003" from archive
2016-01-23 00:30:48 CET [10271-8] LOG:  recovery stopping before commit of transaction 685, time 2016-01-22 23:58:48.268537+01
2016-01-23 00:30:48 CET [10271-9] LOG:  recovery has paused
2016-01-23 00:30:48 CET [10271-10] HINT:  Execute pg_xlog_replay_resume() to continue.
2016-01-23 00:31:15 CET [10270-2] LOG:  received smart shutdown request
2016-01-23 00:31:15 CET [10274-1] LOG:  shutting down
2016-01-23 00:31:15 CET [10274-2] LOG:  database system is shut down
2016-01-23 00:31:16 CET [10322-1] LOG:  database system was shut down in recovery at 2016-01-23 00:31:15 CET
2016-01-23 00:31:16 CET [10322-2] LOG:  starting point-in-time recovery to 2016-01-22 23:58:00+01
2016-01-23 00:31:16 CET [10322-3] LOG:  restored log file "000000010000000000000002" from archive
2016-01-23 00:31:16 CET [10322-4] LOG:  redo starts at 0/2000028
2016-01-23 00:31:16 CET [10322-5] LOG:  restored log file "000000010000000000000003" from archive
2016-01-23 00:31:16 CET [10322-6] LOG:  consistent recovery state reached at 0/300F8B0
2016-01-23 00:31:16 CET [10321-1] LOG:  database system is ready to accept read only connections
2016-01-23 00:31:16 CET [10322-7] LOG:  recovery stopping before commit of transaction 685, time 2016-01-22 23:58:48.268537+01
2016-01-23 00:31:16 CET [10322-8] LOG:  recovery has paused
2016-01-23 00:31:16 CET [10322-9] HINT:  Execute pg_xlog_replay_resume() to continue.
2016-01-23 00:31:52 CET [10321-2] LOG:  received smart shutdown request
2016-01-23 00:31:52 CET [10325-1] LOG:  shutting down
2016-01-23 00:31:52 CET [10325-2] LOG:  database system is shut down
2016-01-23 00:31:53 CET [10361-1] LOG:  database system was shut down in recovery at 2016-01-23 00:31:52 CET
2016-01-23 00:31:53 CET [10361-2] LOG:  starting point-in-time recovery to 2016-01-22 23:59:00+01
2016-01-23 00:31:53 CET [10361-3] LOG:  restored log file "000000010000000000000002" from archive
2016-01-23 00:31:53 CET [10361-4] LOG:  redo starts at 0/2000028
2016-01-23 00:31:53 CET [10361-5] LOG:  restored log file "000000010000000000000003" from archive
2016-01-23 00:31:53 CET [10361-6] LOG:  consistent recovery state reached at 0/300F8B0
2016-01-23 00:31:53 CET [10360-1] LOG:  database system is ready to accept read only connections
2016-01-23 00:31:53 CET [10361-7] LOG:  recovery stopping before commit of transaction 686, time 2016-01-23 00:02:33.477804+01
2016-01-23 00:31:53 CET [10361-8] LOG:  recovery has paused
2016-01-23 00:31:53 CET [10361-9] HINT:  Execute pg_xlog_replay_resume() to continue.
2016-01-23 00:32:14 CET [10361-10] LOG:  redo done at 0/3010A28
2016-01-23 00:32:14 CET [10361-11] LOG:  last completed transaction was at log time 2016-01-22 23:58:48.268537+01
2016-01-23 00:32:14 CET [10361-12] LOG:  restored log file "00000002.history" from archive
2016-01-23 00:32:14 CET [10361-13] LOG:  restored log file "00000003.history" from archive
2016-01-23 00:32:14 CET [10361-14] LOG:  restored log file "00000004.history" from archive
2016-01-23 00:32:14 CET [10361-15] LOG:  restored log file "00000005.history" from archive
2016-01-23 00:32:14 CET [10361-16] LOG:  restored log file "00000006.history" from archive
2016-01-23 00:32:14 CET [10361-17] LOG:  restored log file "00000007.history" from archive
sh: 1: cannot open /var/lib/postgresql/archive/wal/00000008.history.xz: No such file
2016-01-23 00:32:14 CET [10361-18] LOG:  selected new timeline ID: 8
sh: 1: cannot open /var/lib/postgresql/archive/wal/00000001.history.xz: No such file
2016-01-23 00:32:14 CET [10361-19] LOG:  archive recovery complete
2016-01-23 00:32:14 CET [10361-20] LOG:  MultiXact member wraparound protections are now enabled
2016-01-23 00:32:14 CET [10360-2] LOG:  database system is ready to accept connections
2016-01-23 00:32:14 CET [10413-1] LOG:  autovacuum launcher started

Additional notes

There is more than that, but now you can go one step further and inspect documentation.

References

PostgreSQL Documentation → Backup and Restore


PostgreSQL Documentation → Backup and Restore → Continuous Archiving and Point-in-Time Recovery (PITR) → Making a Base Backup Using the Low Level API

PostgreSQL Documentation → Recovery Configuration → Archive recovery settings

PostgreSQL Documentation → Recovery Configuration → Recovery Target Settings