25.3. Continuous Archiving and Point-in-Time Recovery (PITR)
25.3.1. Setting Up WAL Archiving
25.3.2. Making a Base Backup
25.3.3. Making an Incremental Backup
25.3.4. Making a Base Backup Using the Low Level API
25.3.5. Recovering Using a Continuous Archive Backup
25.3.6. Timelines
25.3.7. Tips and Examples
At all times, PostgreSQL maintains a write-ahead log (WAL) in the
pg_wal/ subdirectory of the cluster's data directory. The log records
every change made to the database's data files. This log exists
primarily for crash-safety purposes: if the system crashes, the
database can be restored to consistency by “replaying” the log entries
made since the last checkpoint. However, the existence of the log makes
it possible to use a third strategy for backing up databases: we can
combine a file-system-level backup with backup of the WAL files. If
recovery is needed, we restore the file system backup and then replay
from the backed-up WAL files to bring the system to a current state.
This approach is more complex to administer than either of the previous
approaches, but it has some significant benefits:
* We do not need a perfectly consistent file system backup as the
  starting point. Any internal inconsistency in the backup will be
  corrected by log replay (this is not significantly different from
  what happens during crash recovery). So we do not need a file
  system snapshot capability, just tar or a similar archiving tool.
* Since we can combine an indefinitely long sequence of WAL files for
  replay, continuous backup can be achieved simply by continuing to
  archive the WAL files. This is particularly valuable for large
  databases, where it might not be convenient to take a full backup
  frequently.
* It is not necessary to replay the WAL entries all the way to the
  end. We could stop the replay at any point and have a consistent
  snapshot of the database as it was at that time. Thus, this
  technique supports point-in-time recovery: it is possible to
  restore the database to its state at any time since your base
  backup was taken.
* If we continuously feed the series of WAL files to another machine
  that has been loaded with the same base backup file, we have a warm
  standby system: at any point we can bring up the second machine and
  it will have a nearly-current copy of the database.
pg_dump and pg_dumpall do not produce file-system-level backups and
cannot be used as part of a continuous-archiving solution. Such dumps
are logical and do not contain enough information to be used by WAL
replay.
As with the plain file-system-backup technique, this method can only
support restoration of an entire database cluster, not a subset. Also,
it requires a lot of archival storage: the base backup might be bulky,
and a busy system will generate many megabytes of WAL traffic that have
to be archived. Still, it is the preferred backup technique in many
situations where high reliability is needed.
To recover successfully using continuous archiving (also called “online
backup” by many database vendors), you need a continuous sequence of
archived WAL files that extends back at least as far as the start time
of your backup. So to get started, you should set up and test your
procedure for archiving WAL files before you take your first base
backup. Accordingly, we first discuss the mechanics of archiving WAL
files.
25.3.1. Setting Up WAL Archiving
In an abstract sense, a running PostgreSQL system produces an
indefinitely long sequence of WAL records. The system physically
divides this sequence into WAL segment files, which are normally 16MB
apiece (although the segment size can be altered during initdb). The
segment files are given numeric names that reflect their position in
the abstract WAL sequence. When not using WAL archiving, the system
normally creates just a few segment files and then “recycles” them by
renaming no-longer-needed segment files to higher segment numbers. It's
assumed that segment files whose contents precede the last checkpoint
are no longer of interest and can be recycled.
When archiving WAL data, we need to capture the contents of each
segment file once it is filled, and save that data somewhere before the
segment file is recycled for reuse. Depending on the application and
the available hardware, there could be many different ways of “saving
the data somewhere”: we could copy the segment files to an NFS-mounted
directory on another machine, write them onto a tape drive (ensuring
that you have a way of identifying the original name of each file),
batch them together and burn them onto CDs, or something else entirely.
To provide the database administrator with flexibility, PostgreSQL
tries not to make any assumptions about how the archiving will be done.
Instead, PostgreSQL lets the administrator specify a shell command or
an archive library to be executed to copy a completed segment file to
wherever it needs to go. This could be as simple as a shell command
that uses cp, or it could invoke a complex C function — it's all up to
you.
To enable WAL archiving, set the wal_level configuration parameter to
replica or higher, set archive_mode to on, and specify the shell
command to use in the archive_command configuration parameter or the
library to use in the archive_library configuration parameter. In
practice these settings will always be placed in the postgresql.conf
file.
In archive_command, %p is replaced by the path name of the file to
archive, while %f is replaced by only the file name. (The path name is
relative to the current working directory, i.e., the cluster's data
directory.) Use %% if you need to embed an actual % character in the
command. The simplest useful command is something like:

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'  # Unix
archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
which will copy archivable WAL segments to the directory
/mnt/server/archivedir. (This is an example, not a recommendation, and
might not work on all platforms.) After the %p and %f parameters have
been replaced, the actual command executed might look like this:

test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065

A similar command will be generated for each new file to be archived.
The archive command will be executed under the ownership of the same
user that the PostgreSQL server is running as. Since the series of WAL
files being archived contains effectively everything in your database,
you will want to be sure that the archived data is protected from
prying eyes; for example, archive into a directory that does not have
group or world read access.
It is important that the archive command return zero exit status if and
only if it succeeds. Upon getting a zero result, PostgreSQL will assume
that the file has been successfully archived, and will remove or
recycle it. However, a nonzero status tells PostgreSQL that the file
was not archived; it will try again periodically until it succeeds.
Another way to archive is to use a custom archive module as the
archive_library. Since such modules are written in C, creating your own
may require considerably more effort than writing a shell command.
However, archive modules can be more performant than archiving via
shell, and they will have access to many useful server resources. For
more information about archive modules, see Chapter 49.
When the archive command is terminated by a signal (other than SIGTERM,
which is used as part of a server shutdown) or an error by the shell
with an exit status greater than 125 (such as command not found), or if
the archive function emits an ERROR or FATAL, the archiver process
aborts and gets restarted by the postmaster. In such cases, the failure
is not reported in pg_stat_archiver.
Archive commands and libraries should generally be designed to refuse
to overwrite any pre-existing archive file. This is an important safety
feature to preserve the integrity of your archive in case of
administrator error (such as sending the output of two different
servers to the same archive directory). It is advisable to test your
proposed archive library to ensure that it does not overwrite an
existing file.
In rare cases, PostgreSQL may attempt to re-archive a WAL file that was
previously archived. For example, if the system crashes before the
server makes a durable record of archival success, the server will
attempt to archive the file again after restarting (provided archiving
is still enabled). When an archive command or library encounters a
pre-existing file, it should return a zero status or true,
respectively, if the WAL file has identical contents to the
pre-existing archive and the pre-existing archive is fully persisted to
storage. If a pre-existing file contains different contents than the
WAL file being archived, the archive command or library must return a
nonzero status or false, respectively.
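These rules can be sketched as a small shell wrapper. This is a
hypothetical script, not part of PostgreSQL; the demonstration below
uses temporary directories standing in for pg_wal/ and the archive:

```shell
#!/bin/sh
# Hypothetical archive_command wrapper: a retry succeeds only if the
# pre-existing archive file is identical to the segment being archived.
# It could be configured as:
#   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
archive_wal() {
    src=$1; fname=$2; dest="$ARCHIVE_DIR/$fname"
    if [ -f "$dest" ]; then
        cmp -s "$src" "$dest"   # zero status only for identical contents
        return $?
    fi
    cp "$src" "$dest"
}

# Demonstration with stand-in directories:
ARCHIVE_DIR=$(mktemp -d); WALDIR=$(mktemp -d)
seg=000000010000000000000001
printf 'wal-data' > "$WALDIR/$seg"
archive_wal "$WALDIR/$seg" "$seg" && echo first-archive-ok
archive_wal "$WALDIR/$seg" "$seg" && echo identical-retry-ok
printf 'other' > "$WALDIR/$seg"
archive_wal "$WALDIR/$seg" "$seg" || echo conflict-rejected
```

Note that this sketch does not ensure the archived file is fully
persisted to storage; a production command would also need to fsync the
file (and its directory) before reporting success.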
The example command above for Unix avoids overwriting a pre-existing
archive by including a separate test step. On some Unix platforms, cp
has switches such as -i that can be used to do the same thing less
verbosely, but you should not rely on these without verifying that the
right exit status is returned. (In particular, GNU cp will return
status zero when -i is used and the target file already exists, which
is not the desired behavior.)
While designing your archiving setup, consider what will happen if the
archive command or library fails repeatedly because some aspect
requires operator intervention or the archive runs out of space. For
example, this could occur if you write to tape without an autochanger;
when the tape fills, nothing further can be archived until the tape is
swapped. You should ensure that any error condition or request to a
human operator is reported appropriately so that the situation can be
resolved reasonably quickly. The pg_wal/ directory will continue to
fill with WAL segment files until the situation is resolved. (If the
file system containing pg_wal/ fills up, PostgreSQL will do a PANIC
shutdown. No committed transactions will be lost, but the database will
remain offline until you free some space.)
The speed of the archive command or library is unimportant as long as
it can keep up with the average rate at which your server generates WAL
data. Normal operation continues even if the archiving process falls a
little behind. If archiving falls significantly behind, this will
increase the amount of data that would be lost in the event of a
disaster. It will also mean that the pg_wal/ directory will contain
large numbers of not-yet-archived segment files, which could eventually
exceed available disk space. You are advised to monitor the archiving
process to ensure that it is working as you intend.
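One simple way to watch the backlog is to count the .ready status files
under pg_wal/archive_status/, which mark segments not yet archived (the
pg_stat_archiver view reports related statistics via SQL). The sketch
below demonstrates the idea against a stand-in directory rather than a
live cluster:

```shell
#!/bin/sh
# Count segments awaiting archival. In practice, point STATUS at
# $PGDATA/pg_wal/archive_status; a temporary directory stands in here.
STATUS=$(mktemp -d)
: > "$STATUS/000000010000000000000001.ready"
: > "$STATUS/000000010000000000000002.ready"
: > "$STATUS/000000010000000000000003.done"   # already archived
pending=$(ls "$STATUS" | grep -c '\.ready$')
echo "$pending segments awaiting archive"
```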
In writing your archive command or library, you should assume that the
file names to be archived can be up to 64 characters long and can
contain any combination of ASCII letters, digits, and dots. It is not
necessary to preserve the original relative path (%p) but it is
necessary to preserve the file name (%f).
Note that although WAL archiving will allow you to restore any
modifications made to the data in your PostgreSQL database, it will not
restore changes made to configuration files (that is, postgresql.conf,
pg_hba.conf and pg_ident.conf), since those are edited manually rather
than through SQL operations. You might wish to keep the configuration
files in a location that will be backed up by your regular file system
backup procedures. See Section 19.2 for how to relocate the
configuration files.
The archive command or function is only invoked on completed WAL
segments. Hence, if your server generates little WAL traffic (or has
slack periods where it does so), there could be a long delay between
the completion of a transaction and its safe recording in archive
storage. To put a limit on how old unarchived data can be, you can set
archive_timeout to force the server to switch to a new WAL segment file
at least that often. Note that files that are archived early due to a
forced switch are still the same length as completely full files. It is
therefore unwise to set a very short archive_timeout — it will bloat
your archive storage. archive_timeout settings of a minute or so are
usually reasonable.
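For example, a one-minute cap on the age of unarchived data could be
configured as follows (the value is illustrative):

```
archive_timeout = 60    # force a segment switch at least every 60 seconds
```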
Also, you can force a segment switch manually with pg_switch_wal if you
want to ensure that a just-finished transaction is archived as soon as
possible. Other utility functions related to WAL management are listed
When wal_level is minimal some SQL commands are optimized to avoid WAL
logging, as described in Section 14.4.7. If archiving or streaming
replication were turned on during execution of one of these statements,
WAL would not contain enough information for archive recovery. (Crash
recovery is unaffected.) For this reason, wal_level can only be changed
at server start. However, archive_command and archive_library can be
changed with a configuration file reload. If you are archiving via
shell and wish to temporarily stop archiving, one way to do it is to
set archive_command to the empty string (''). This will cause WAL files
to accumulate in pg_wal/ until a working archive_command is
re-established.
25.3.2. Making a Base Backup
The easiest way to perform a base backup is to use the pg_basebackup
tool. It can create a base backup either as regular files or as a tar
archive. If more flexibility than pg_basebackup can provide is
required, you can also make a base backup using the low level API (see
Section 25.3.4).
It is not necessary to be concerned about the amount of time it takes
to make a base backup. However, if you normally run the server with
full_page_writes disabled, you might notice a drop in performance while
the backup runs since full_page_writes is effectively forced on during
backup mode.
To make use of the backup, you will need to keep all the WAL segment
files generated during and after the file system backup. To aid you in
doing this, the base backup process creates a backup history file that
is immediately stored into the WAL archive area. This file is named
after the first WAL segment file that you need for the file system
backup. For example, if the starting WAL file is
0000000100001234000055CD the backup history file will be named
something like 0000000100001234000055CD.007C9330.backup. (The second
part of the file name stands for an exact position within the WAL file,
and can ordinarily be ignored.) Once you have safely archived the file
system backup and the WAL segment files used during the backup (as
specified in the backup history file), all archived WAL segments with
numerically lesser names are no longer needed to recover the file
system backup and can be deleted. However, you should consider keeping
several backup sets to be absolutely certain that you can recover your
data.
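Because WAL segment file names are fixed-width hexadecimal,
lexicographic comparison matches their numeric order, so pruning can be
sketched as below. The directory and segment names are stand-ins for
illustration; in practice the bundled pg_archivecleanup tool does this
job properly:

```shell
#!/bin/sh
# Remove archived segments older than the start segment of the newest
# base backup, as recorded in the backup history file's name.
ARCHIVE=$(mktemp -d)
for seg in 00000001000012340000559F 0000000100001234000055CD \
           0000000100001234000055D0; do
    : > "$ARCHIVE/$seg"
done
: > "$ARCHIVE/0000000100001234000055CD.007C9330.backup"
start=0000000100001234000055CD    # taken from the history file name
for f in "$ARCHIVE"/*; do
    name=$(basename "$f")
    case $name in *.backup) continue;; esac   # keep history files
    if expr "$name" \< "$start" >/dev/null; then
        rm "$f"                               # older than the backup start
    fi
done
ls "$ARCHIVE"
```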
The backup history file is just a small text file. It contains the
label string you gave to pg_basebackup, as well as the starting and
ending times and WAL segments of the backup. If you used the label to
identify the associated dump file, then the archived history file is
enough to tell you which dump file to restore.
Since you have to keep around all the archived WAL files back to your
last base backup, the interval between base backups should usually be
chosen based on how much storage you want to expend on archived WAL
files. You should also consider how long you are prepared to spend
recovering, if recovery should be necessary — the system will have to
replay all those WAL segments, and that could take a while if it has
been a long time since the last base backup.
25.3.3. Making an Incremental Backup
You can use pg_basebackup to take an incremental backup by specifying
the --incremental option. You must supply, as an argument to
--incremental, the backup manifest of an earlier backup from the same
server. In the resulting backup, non-relation files will be included in
their entirety, but some relation files may be replaced by smaller
incremental files which contain only the blocks which have been changed
since the earlier backup and enough metadata to reconstruct the current
version of the file.
To figure out which blocks need to be backed up, the server uses WAL
summaries, which are stored in the data directory, inside the directory
pg_wal/summaries. If the required summary files are not present, an
attempt to take an incremental backup will fail. The summaries present
in this directory must cover all LSNs from the start LSN of the prior
backup to the start LSN of the current backup. Since the server looks
for WAL summaries just after establishing the start LSN of the current
backup, the necessary summary files probably won't be instantly present
on disk, but the server will wait for any missing files to show up.
This also helps if the WAL summarization process has fallen behind.
However, if the necessary files have already been removed, or if the
WAL summarizer doesn't catch up quickly enough, the incremental backup
will fail.
When restoring an incremental backup, it will be necessary to have not
only the incremental backup itself but also all earlier backups that
are required to supply the blocks omitted from the incremental backup.
See pg_combinebackup for further information about this requirement.
Note that there are restrictions on the use of pg_combinebackup when
the checksum status of the cluster has been changed; see
pg_combinebackup limitations.
Note that all of the requirements for making use of a full backup also
apply to an incremental backup. For instance, you still need all of the
WAL segment files generated during and after the file system backup,
and any relevant WAL history files. And you still need to create a
recovery.signal (or standby.signal) and perform recovery, as described
in Section 25.3.5. The requirement to have earlier backups available at
restore time and to use pg_combinebackup is an additional requirement
on top of everything else. Keep in mind that PostgreSQL has no built-in
mechanism to figure out which backups are still needed as a basis for
restoring later incremental backups. You must keep track of the
relationships between your full and incremental backups on your own,
and be certain not to remove earlier backups if they might be needed
when restoring later incremental backups.
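The overall workflow might look like this hypothetical command
sequence. The paths and directory layout are assumptions, and a running
server with archiving configured is required, so this is shown only as
a sketch:

```
pg_basebackup -D /backups/full                                    # full backup
pg_basebackup -D /backups/incr1 \
    --incremental=/backups/full/backup_manifest                   # incremental
pg_combinebackup -o /restore/datadir /backups/full /backups/incr1 # at restore
```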
Incremental backups typically only make sense for relatively large
databases where a significant portion of the data does not change, or
only changes slowly. For a small database, it's simpler to ignore the
existence of incremental backups and simply take full backups, which
are simpler to manage. For a large database all of which is heavily
modified, incremental backups won't be much smaller than full backups.
An incremental backup is only possible if replay would begin from a
later checkpoint than for the previous backup upon which it depends. If
you take the incremental backup on the primary, this condition is
always satisfied, because each backup triggers a new checkpoint. On a
standby, replay begins from the most recent restartpoint. Therefore, an
incremental backup of a standby server can fail if there has been very
little activity since the previous backup, since no new restartpoint
might have been created.
25.3.4. Making a Base Backup Using the Low Level API
Instead of taking a full or incremental base backup using
pg_basebackup, you can take a base backup using the low-level API. This
procedure contains a few more steps than the pg_basebackup method, but
is relatively simple. It is very important that these steps are
executed in sequence, and that the success of a step is verified before
proceeding to the next step.
Multiple backups can be run concurrently (both those started using this
backup API and those started using pg_basebackup).
1. Ensure that WAL archiving is enabled and working.
2. Connect to the server (it does not matter which database) as a user
   with rights to run pg_backup_start (superuser, or a user who has
   been granted EXECUTE on the function) and issue the command:

   SELECT pg_backup_start(label => 'label', fast => false);

   where label is any string you want to use to uniquely identify this
   backup operation. The connection calling pg_backup_start must be
   maintained until the end of the backup, or the backup will be
   automatically aborted.

   Online backups are always started at the beginning of a checkpoint.
   By default, pg_backup_start will wait for the next regularly
   scheduled checkpoint to complete, which may take a long time (see
   the configuration parameters checkpoint_timeout and
   checkpoint_completion_target). This is usually preferable as it
   minimizes the impact on the running system. If you want to start
   the backup as soon as possible, pass true as the second parameter
   to pg_backup_start and it will request an immediate checkpoint,
   which will finish as fast as possible using as much I/O as
   available.
3. Perform the backup, using any convenient file-system-backup tool
   such as tar or cpio (not pg_dump or pg_dumpall). It is neither
   necessary nor desirable to stop normal operation of the database
   while you do this. See Section 25.3.4.1 for things to consider
   during this backup.
4. In the same connection as before, issue the command:

   SELECT * FROM pg_backup_stop(wait_for_archive => true);

   This terminates backup mode. On a primary, it also performs an
   automatic switch to the next WAL segment. On a standby, it is not
   possible to automatically switch WAL segments, so you may wish to
   run pg_switch_wal on the primary to perform a manual switch. The
   reason for the switch is to arrange for the last WAL segment file
   written during the backup interval to be ready to archive.

   pg_backup_stop will return one row with three values. The second of
   these fields should be written to a file named backup_label in the
   root directory of the backup. The third field should be written to
   a file named tablespace_map unless the field is empty. These files
   are vital to the backup working and must be written byte for byte
   without modification, which may require opening the file in binary
   mode.
5. Once the WAL segment files active during the backup are archived,
   you are done. The file identified by pg_backup_stop's first return
   value is the last segment that is required to form a complete set
   of backup files. On a primary, if archive_mode is enabled and the
   wait_for_archive parameter is true, pg_backup_stop does not return
   until the last segment has been archived. On a standby,
   archive_mode must be set to always in order for pg_backup_stop to
   wait. Archiving of these files happens automatically since you have
   already configured archive_command or archive_library. In most
   cases this happens quickly, but you are advised to monitor your
   archive system to ensure there are no delays. If the archive
   process has fallen behind because of failures of the archive
   command or library, it will keep retrying until the archive
   succeeds and the backup is complete. If you wish to place a time
   limit on the execution of pg_backup_stop, set an appropriate
   statement_timeout value, but note that if pg_backup_stop terminates
   because of this your backup may not be valid.

   If the backup process monitors and ensures that all WAL segment
   files required for the backup are successfully archived then the
   wait_for_archive parameter (which defaults to true) can be set to
   false to have pg_backup_stop return as soon as the stop backup
   record is written to the WAL. By default, pg_backup_stop will wait
   until all WAL has been archived, which can take some time. This
   option must be used with caution: if WAL archiving is not monitored
   correctly then the backup might not include all of the WAL files
   and will therefore be incomplete and not able to be restored.
25.3.4.1. Backing Up the Data Directory
Some file system backup tools emit warnings or errors if the files they
are trying to copy change while the copy proceeds. When taking a base
backup of an active database, this situation is normal and not an
error. However, you need to ensure that you can distinguish complaints
of this sort from real errors. For example, some versions of rsync
return a separate exit code for “vanished source files”, and you can
write a driver script to accept this exit code as a non-error case.
Also, some versions of GNU tar return an error code indistinguishable
from a fatal error if a file was truncated while tar was copying it.
Fortunately, GNU tar versions 1.16 and later exit with 1 if a file was
changed during the backup, and 2 for other errors. With GNU tar version
1.23 and later, you can use the warning options
--warning=no-file-changed --warning=no-file-removed to hide the related
warning messages.
Be certain that your backup includes all of the files under the
database cluster directory (e.g., /usr/local/pgsql/data). If you are
using tablespaces that do not reside underneath this directory, be
careful to include them as well (and be sure that your backup archives
symbolic links as links, otherwise the restore will corrupt your
tablespaces).
You should, however, omit from the backup the files within the
cluster's pg_wal/ subdirectory. This slight adjustment is worthwhile
because it reduces the risk of mistakes when restoring. This is easy to
arrange if pg_wal/ is a symbolic link pointing to someplace outside the
cluster directory, which is a common setup anyway for performance
reasons. You might also want to exclude postmaster.pid and
postmaster.opts, which record information about the running postmaster,
not about the postmaster which will eventually use this backup. (These
files can confuse pg_ctl.)
It is often a good idea to also omit from the backup the files within
the cluster's pg_replslot/ directory, so that replication slots that
exist on the primary do not become part of the backup. Otherwise, the
subsequent use of the backup to create a standby may result in
indefinite retention of WAL files on the standby, and possibly bloat on
the primary if hot standby feedback is enabled, because the clients
that are using those replication slots will still be connecting to and
updating the slots on the primary, not the standby. Even if the backup
is only intended for use in creating a new primary, copying the
replication slots isn't expected to be particularly useful, since the
contents of those slots will likely be badly out of date by the time
the new primary comes on line.
The contents of the directories pg_dynshmem/, pg_notify/, pg_serial/,
pg_snapshots/, pg_stat_tmp/, and pg_subtrans/ (but not the directories
themselves) can be omitted from the backup as they will be initialized
on postmaster startup.

Any file or directory beginning with pgsql_tmp can be omitted from the
backup. These files are removed on postmaster start and the directories
will be recreated as needed.

pg_internal.init files can be omitted from the backup whenever a file
of that name is found. These files contain relation cache data that is
always rebuilt when recovering.
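Taken together, the exclusions above can be sketched as a single tar
invocation (GNU tar is assumed; the demonstration builds a miniature
stand-in data directory rather than backing up a real cluster):

```shell
#!/bin/sh
# Archive a (stand-in) data directory while skipping transient files.
PGDATA=$(mktemp -d); OUT=$(mktemp -d)
mkdir -p "$PGDATA/base/1" "$PGDATA/pg_wal" "$PGDATA/global"
: > "$PGDATA/base/1/1234"                          # a relation file: keep
: > "$PGDATA/pg_wal/000000010000000000000001"      # WAL segment: skip
: > "$PGDATA/postmaster.pid"                       # runtime state: skip
: > "$PGDATA/global/pg_internal.init"              # relcache data: skip
tar -cf "$OUT/base.tar" \
    --exclude='pg_wal/*' --exclude='postmaster.pid' \
    --exclude='postmaster.opts' --exclude='pg_replslot/*' \
    --exclude='pgsql_tmp*' --exclude='pg_internal.init' \
    -C "$PGDATA" .
tar -tf "$OUT/base.tar" | sort
```

Note that pg_wal/ itself stays in the archive (only its contents are
excluded), which matches the advice above.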
The backup label file includes the label string you gave to
pg_backup_start, as well as the time at which pg_backup_start was run,
and the name of the starting WAL file. In case of confusion it is
therefore possible to look inside a backup file and determine exactly
which backup session the dump file came from. The tablespace map file
includes the symbolic link names as they exist in the directory
pg_tblspc/ and the full path of each symbolic link. These files are not
merely for your information; their presence and contents are critical
to the proper operation of the system's recovery process.
It is also possible to make a backup while the server is stopped. In
this case, you obviously cannot use pg_backup_start or pg_backup_stop,
and you will therefore be left to your own devices to keep track of
which backup is which and how far back the associated WAL files go. It
is generally better to follow the continuous archiving procedure above.
25.3.5. Recovering Using a Continuous Archive Backup
Okay, the worst has happened and you need to recover from your backup.
Here is the procedure:
1. Stop the server, if it's running.
2. If you have the space to do so, copy the whole cluster data
   directory and any tablespaces to a temporary location in case you
   need them later. Note that this precaution will require that you
   have enough free space on your system to hold two copies of your
   existing database. If you do not have enough space, you should at
   least save the contents of the cluster's pg_wal subdirectory, as it
   might contain WAL files which were not archived before the system
   went down.
3. Remove all existing files and subdirectories under the cluster data
   directory and under the root directories of any tablespaces you are
   using.
4. If you're restoring a full backup, you can restore the database
   files directly into the target directories. Be sure that they are
   restored with the right ownership (the database system user, not
   root!) and with the right permissions. If you are using
   tablespaces, you should verify that the symbolic links in
   pg_tblspc/ were correctly restored.
5. If you're restoring an incremental backup, you'll need to restore
   the incremental backup and all earlier backups upon which it
   directly or indirectly depends to the machine where you are
   performing the restore. These backups will need to be placed in
   separate directories, not the target directories where you want the
   running server to end up. Once this is done, use pg_combinebackup
   to pull data from the full backup and all of the subsequent
   incremental backups and write out a synthetic full backup to the
   target directories. As above, verify that permissions and
   tablespace links are correct.
6. Remove any files present in pg_wal/; these came from the file
   system backup and are therefore probably obsolete rather than
   current. If you didn't archive pg_wal/ at all, then recreate it
   with proper permissions, being careful to ensure that you
   re-establish it as a symbolic link if you had it set up that way
   before.
7. If you have unarchived WAL segment files that you saved in step 2,
   copy them into pg_wal/. (It is best to copy them, not move them, so
   you still have the unmodified files if a problem occurs and you
   have to start over.)
8. Set recovery configuration settings in postgresql.conf (see
Section 19.5.5) and create a file recovery.signal in the cluster
data directory. You might also want to temporarily modify
pg_hba.conf to prevent ordinary users from connecting until you are
sure the recovery was successful.
9. Start the server. The server will go into recovery mode and proceed
to read through the archived WAL files it needs. Should the
recovery be terminated because of an external error, the server can
simply be restarted and it will continue recovery. Upon completion
of the recovery process, the server will remove recovery.signal (to
prevent accidentally re-entering recovery mode later) and then
commence normal database operations.
10. Inspect the contents of the database to ensure you have recovered
to the desired state. If not, return to step 1. If all is well,
allow your users to connect by restoring pg_hba.conf to normal.
The key part of all this is to set up a recovery configuration that
describes how you want to recover and how far the recovery should run.
The one thing that you absolutely must specify is the restore_command,
which tells PostgreSQL how to retrieve archived WAL file segments. Like
the archive_command, this is a shell command string. It can contain %f,
which is replaced by the name of the desired WAL file, and %p, which is
replaced by the path name to copy the WAL file to. (The path name is
relative to the current working directory, i.e., the cluster's data
directory.) Write %% if you need to embed an actual % character in the
command. The simplest useful command is something like:
restore_command = 'cp /mnt/server/archivedir/%f %p'
which will copy previously archived WAL segments from the directory
/mnt/server/archivedir. Of course, you can use something much more
complicated, perhaps even a shell script that requests the operator to
mount an appropriate tape.
It is important that the command return nonzero exit status on failure.
The command will be called requesting files that are not present in the
archive; it must return nonzero when so asked. This is not an error
condition. An exception is that if the command was terminated by a
signal (other than SIGTERM, which is used as part of a database server
shutdown) or an error by the shell (such as command not found), then
recovery will abort and the server will not start up.
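That requirement can be captured in a small wrapper. This is only a
sketch: the function name, the archive path, and the ARCHIVEDIR
override are illustrative, not part of PostgreSQL.

```shell
# Hypothetical helper for restore_command; a one-line script could
# invoke it as: restore_wal "%f" "%p"
restore_wal() {
    f="$1"; p="$2"
    archivedir="${ARCHIVEDIR:-/mnt/server/archivedir}"  # assumed location
    if [ -f "$archivedir/$f" ]; then
        cp "$archivedir/$f" "$p"   # deliver the requested segment
    else
        return 1                   # absent file: expected, not an error
    fi
}
```

Quietly returning 1 for an absent file is what lets the server detect
the normal end of archive replay; only termination by a signal or a
shell-level failure aborts recovery.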
Not all of the requested files will be WAL segment files; you should
also expect requests for files with a suffix of .history. Also be aware
that the base name of the %p path will be different from %f; do not
expect them to be interchangeable.
WAL segments that cannot be found in the archive will be sought in
pg_wal/; this allows use of recent un-archived segments. However,
segments that are available from the archive will be used in preference
to files in pg_wal/.
Normally, recovery will proceed through all available WAL segments,
thereby restoring the database to the current point in time (or as
close as possible given the available WAL segments). Therefore, a
normal recovery will end with a “file not found” message, the exact
text of the error message depending upon your choice of
restore_command. You may also see an error message at the start of
recovery for a file named something like 00000001.history. This is also
normal and does not indicate a problem in simple recovery situations;
see Section 25.3.6 for discussion.
If you want to recover to some previous point in time (say, right
before the junior DBA dropped your main transaction table), just
specify the required stopping point. You can specify the stop point,
known as the “recovery target”, either by date/time, named restore
point or by completion of a specific transaction ID. As of this writing
only the date/time and named restore point options are very usable,
since there are no tools to help you identify with any accuracy which
transaction ID to use.
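For instance, to stop replay just before a mistake made at 5:15PM (the
timestamp and archive path here are purely illustrative), the recovery
settings in postgresql.conf might be:

```
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2025-01-07 17:14:00'
```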
The stop point must be after the ending time of the base backup, i.e.,
the end time of pg_backup_stop. You cannot use a base backup to recover
to a time when that backup was in progress. (To recover to such a time,
you must go back to your previous base backup and roll forward from
there.)
If recovery finds corrupted WAL data, recovery will halt at that point
and the server will not start. In such a case the recovery process
could be re-run from the beginning, specifying a “recovery target”
before the point of corruption so that recovery can complete normally.
If recovery fails for an external reason, such as a system crash or if
the WAL archive has become inaccessible, then the recovery can simply
be restarted and it will restart almost from where it failed. Recovery
restart works much like checkpointing in normal operation: the server
periodically forces all its state to disk, and then updates the
pg_control file to indicate that the already-processed WAL data need
not be scanned again.
25.3.6. Timelines #
The ability to restore the database to a previous point in time creates
some complexities that are akin to science-fiction stories about time
travel and parallel universes. For example, in the original history of
the database, suppose you dropped a critical table at 5:15PM on Tuesday
evening, but didn't realize your mistake until Wednesday noon. Unfazed,
you get out your backup, restore to the point-in-time 5:14PM Tuesday
evening, and are up and running. In this history of the database
universe, you never dropped the table. But suppose you later realize
this wasn't such a great idea, and would like to return to sometime
Wednesday morning in the original history. You won't be able to if,
while your database was up-and-running, it overwrote some of the WAL
segment files that led up to the time you now wish you could get back
to. Thus, to avoid this, you need to distinguish the series of WAL
records generated after you've done a point-in-time recovery from those
that were generated in the original database history.
To deal with this problem, PostgreSQL has a notion of timelines.
Whenever an archive recovery completes, a new timeline is created to
identify the series of WAL records generated after that recovery. The
timeline ID number is part of WAL segment file names so a new timeline
does not overwrite the WAL data generated by previous timelines. For
example, in the WAL file name 0000000100001234000055CD, the leading
00000001 is the timeline ID in hexadecimal. (Note that in other
contexts, such as server log messages, timeline IDs are usually printed
in decimal.)
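The timeline ID can be peeled off a segment name with ordinary shell
tools; a small illustration, using the file name from the example
above:

```shell
f=0000000100001234000055CD
tli_hex=$(echo "$f" | cut -c1-8)     # first 8 hex digits: timeline ID
tli_dec=$(printf '%d' "0x$tli_hex")  # decimal form, as seen in log messages
echo "$tli_hex -> $tli_dec"          # prints: 00000001 -> 1
```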
It is in fact possible to archive many different timelines. While that
might seem like a useless feature, it's often a lifesaver. Consider the
situation where you aren't quite sure what point-in-time to recover to,
and so have to do several point-in-time recoveries by trial and error
until you find the best place to branch off from the old history.
Without timelines this process would soon generate an unmanageable
mess. With timelines, you can recover to any prior state, including
states in timeline branches that you abandoned earlier.
Every time a new timeline is created, PostgreSQL creates a “timeline
history” file that shows which timeline it branched off from and when.
These history files are necessary to allow the system to pick the right
WAL segment files when recovering from an archive that contains
multiple timelines. Therefore, they are archived into the WAL archive
area just like WAL segment files. The history files are just small text
files, so it's cheap and appropriate to keep them around indefinitely
(unlike the segment files which are large). You can, if you like, add
comments to a history file to record your own notes about how and why
this particular timeline was created. Such comments will be especially
valuable when you have a thicket of different timelines as a result of
various recovery attempts.
The default behavior of recovery is to recover to the latest timeline
found in the archive. If you wish to recover to the timeline that was
current when the base backup was taken or into a specific child
timeline (that is, you want to return to some state that was itself
generated after a recovery attempt), you need to specify current or the
target timeline ID in recovery_target_timeline. You cannot recover into
timelines that branched off earlier than the base backup.
25.3.7. Tips and Examples #
Some tips for configuring continuous archiving are given here.
25.3.7.1. Standalone Hot Backups #
It is possible to use PostgreSQL's backup facilities to produce
standalone hot backups. These are backups that cannot be used for
point-in-time recovery, yet are typically much faster to back up and
restore than pg_dump dumps. (They are also much larger than pg_dump
dumps, so in some cases the speed advantage might be negated.)
As with base backups, the easiest way to produce a standalone hot
backup is to use the pg_basebackup tool. If you include the -X
parameter when calling it, all the write-ahead log required to use the
backup will be included in the backup automatically, and no special
action is required to restore the backup.
25.3.7.2. Compressed Archive Logs #
If archive storage size is a concern, you can use gzip to compress the
archive files:
archive_command = 'gzip < %p > /mnt/server/archivedir/%f.gz'
You will then need to use gunzip during recovery:
restore_command = 'gunzip < /mnt/server/archivedir/%f.gz > %p'
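The two commands form an exact round trip; a quick sanity check of
that, using a throwaway file in place of a real WAL segment (the paths
here are stand-ins for the real archive directory and %p/%f values):

```shell
archivedir=$(mktemp -d)                      # stand-in for /mnt/server/archivedir
seg=$(mktemp)                                # stand-in for the %p source file
head -c 1024 /dev/zero > "$seg"              # fake segment contents
gzip < "$seg" > "$archivedir/seg.gz"         # what archive_command runs
gunzip < "$archivedir/seg.gz" > "$seg.out"   # what restore_command runs
cmp -s "$seg" "$seg.out" && echo identical   # prints: identical
```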
25.3.7.3. archive_command Scripts #
Many people choose to use scripts to define their archive_command, so
that their postgresql.conf entry looks very simple:
archive_command = 'local_backup_script.sh "%p" "%f"'
Using a separate script file is advisable any time you want to use more
than a single command in the archiving process. This allows all
complexity to be managed within the script, which can be written in a
popular scripting language such as bash or perl.
Examples of requirements that might be solved within a script include:
* Copying data to secure off-site data storage
* Batching WAL files so that they are transferred every three hours,
rather than one at a time
* Interfacing with other backup and recovery software
* Interfacing with monitoring software to report errors
When using an archive_command script, it's desirable to enable
logging_collector. Any messages written to stderr from the script will
then appear in the database server log, allowing complex configurations
to be diagnosed easily if they fail.
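As a sketch of what such a script's core might look like — the function
name, destination directory, and ARCHIVEDIR override are all
illustrative, not a PostgreSQL interface:

```shell
# Hypothetical body for local_backup_script.sh, invoked as:
#   archive_wal "%p" "%f"
archive_wal() {
    p="$1"; f="$2"
    dest="${ARCHIVEDIR:-/mnt/server/archivedir}"   # assumed archive location
    # Copy under a temporary name first, so a reader of the archive
    # never sees a half-written segment.
    if cp "$p" "$dest/$f.tmp" && mv "$dest/$f.tmp" "$dest/$f"; then
        return 0
    fi
    echo "archiving of $f failed" >&2   # stderr reaches the server log
    return 1
}
```

A nonzero exit makes the server hold the segment and retry archiving it
later, so the script must not report success unless the copy really
completed.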
25.3.8. Caveats #
At this writing, there are several limitations of the continuous
archiving technique. These will probably be fixed in future releases:
* If a CREATE DATABASE command is executed while a base backup is
being taken, and then the template database that the CREATE
DATABASE copied is modified while the base backup is still in
progress, it is possible that recovery will cause those
modifications to be propagated into the created database as well.
This is of course undesirable. To avoid this risk, it is best not
to modify any template databases while taking a base backup.
* CREATE TABLESPACE commands are WAL-logged with the literal absolute
path, and will therefore be replayed as tablespace creations with
the same absolute path. This might be undesirable if the WAL is
being replayed on a different machine. It can be dangerous even if
the WAL is being replayed on the same machine, but into a new data
directory: the replay will still overwrite the contents of the
original tablespace. To avoid potential gotchas of this sort, the
best practice is to take a new base backup after creating or
dropping tablespaces.
It should also be noted that the default WAL format is fairly bulky
since it includes many disk page snapshots. These page snapshots are
designed to support crash recovery, since we might need to fix
partially-written disk pages. Depending on your system hardware and
software, the risk of partial writes might be small enough to ignore,
in which case you can significantly reduce the total volume of archived
WAL files by turning off page snapshots using the full_page_writes
parameter. (Read the notes and warnings in Chapter 28 before you do
so.) Turning off page snapshots does not prevent use of the WAL for
PITR operations. An area for future development is to compress archived
WAL data by removing unnecessary page copies even when full_page_writes
is on. In the meantime, administrators might wish to reduce the number
of page snapshots included in WAL by increasing the checkpoint interval
parameters as much as feasible.