   1
   2 pg_basebackup
   3
   4    pg_basebackup — take a base backup of a PostgreSQL cluster
   5
   6 Synopsis
   7
   8    pg_basebackup [option...]
   9
  10 Description
  11
  12    pg_basebackup is used to take a base backup of a running PostgreSQL
  13    database cluster. The backup is taken without affecting other clients
  14    of the database, and can be used both for point-in-time recovery (see
  15    Section 25.3) and as the starting point for a log-shipping or
  16    streaming-replication standby server (see Section 26.2).
  17
  18    pg_basebackup can take a full or incremental base backup of the
  19    database. When used to take a full backup, it makes an exact copy of
  20    the database cluster's files. When used to take an incremental backup,
  21    some files that would have been part of a full backup may be replaced
  22    with incremental versions of the same files, containing only those
  23    blocks that have been modified since the reference backup. An
  24    incremental backup cannot be used directly; instead, pg_combinebackup
  25    must first be used to combine it with the previous backups upon which
  26    it depends. See Section 25.3.3 for more information about incremental
  27    backups, and Section 25.3.5 for steps to recover from a backup.
  28
  29    In any mode, pg_basebackup makes sure the server is put into and out of
  30    backup mode automatically. Backups are always taken of the entire
  31    database cluster; it is not possible to back up individual databases or
  32    database objects. For selective backups, another tool such as pg_dump
  33    must be used.
  34
  35    The backup is made over a regular PostgreSQL connection that uses the
  36    replication protocol. The connection must be made with a user ID that
  37    has REPLICATION permissions (see Section 21.2) or is a superuser, and
  38    pg_hba.conf must permit the replication connection. The server must
  39    also be configured with max_wal_senders set high enough to provide at
  40    least one walsender for the backup plus one for WAL streaming (if
  41    used).
  42
   43    Multiple pg_basebackup instances can run at the same time, but it
  44    is usually better from a performance point of view to take only one
  45    backup, and copy the result.
  46
  47    pg_basebackup can make a base backup from not only a primary server but
  48    also a standby. To take a backup from a standby, set up the standby so
  49    that it can accept replication connections (that is, set
  50    max_wal_senders and hot_standby, and configure its pg_hba.conf
  51    appropriately). You will also need to enable full_page_writes on the
  52    primary.
  53
  54    Note that there are some limitations in taking a backup from a standby:
  55      * The backup history file is not created in the database cluster
  56        backed up.
  57      * pg_basebackup cannot force the standby to switch to a new WAL file
  58        at the end of backup. When you are using -X none, if write activity
  59        on the primary is low, pg_basebackup may need to wait a long time
  60        for the last WAL file required for the backup to be switched and
  61        archived. In this case, it may be useful to run pg_switch_wal on
  62        the primary in order to trigger an immediate WAL file switch.
  63      * If the standby is promoted to be primary during backup, the backup
  64        fails.
  65      * All WAL records required for the backup must contain sufficient
  66        full-page writes, which requires you to enable full_page_writes on
  67        the primary.
  68
  69    Whenever pg_basebackup is taking a base backup, the server's
  70    pg_stat_progress_basebackup view will report the progress of the
  71    backup. See Section 27.4.6 for details.
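As a rough illustration of how the progress view might be consumed, the sketch below turns two byte counts into a percentage. It assumes the counts come from the view's backup_streamed and backup_total columns, for example via psql -At -F ' ' -c "SELECT backup_streamed, backup_total FROM pg_stat_progress_basebackup". The function name backup_percent is hypothetical, and an empty second argument stands in for a NULL backup_total.

```shell
# Compute an approximate completion percentage from two byte counts.
# Arguments: backup_streamed backup_total (backup_total may be empty, i.e. NULL).
backup_percent() {
    streamed=$1
    total=$2
    # Without a size estimate there is nothing to divide by.
    if [ -z "$total" ] || [ "$total" -eq 0 ]; then
        echo "unknown"
        return 0
    fi
    # Integer arithmetic; the result can exceed 100 because the
    # estimate is taken before the backup and the cluster may grow.
    echo "$((streamed * 100 / total))%"
}
```

The result is only as accurate as the server's estimate, so values above 100% are possible and are not treated as errors here.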
  72
  73 Options
  74
  75    The following command-line options control the location and format of
  76    the output:
  77
  78    -D directory
  79           --pgdata=directory
  80           Sets the target directory to write the output to. pg_basebackup
  81           will create this directory (and any missing parent directories)
  82           if it does not exist. If it already exists, it must be empty.
  83
  84           When the backup is in tar format, the target directory may be
  85           specified as - (dash), causing the tar file to be written to
  86           stdout.
  87
  88           This option is required.
  89
  90    -F format
  91           --format=format
  92           Selects the format for the output. format can be one of the
  93           following:
  94
  95         p
  96                 plain
  97                 Write the output as plain files, with the same layout as
  98                 the source server's data directory and tablespaces. When
  99                 the cluster has no additional tablespaces, the whole
 100                 database will be placed in the target directory. If the
 101                 cluster contains additional tablespaces, the main data
 102                 directory will be placed in the target directory, but all
 103                 other tablespaces will be placed in the same absolute path
 104                 as they have on the source server. (See
 105                 --tablespace-mapping to change that.)
 106
 107                 This is the default format.
 108
 109         t
 110                 tar
 111                 Write the output as tar files in the target directory. The
 112                 main data directory's contents will be written to a file
 113                 named base.tar, and each other tablespace will be written
 114                 to a separate tar file named after that tablespace's OID.
 115
 116                 If the target directory is specified as - (dash), the tar
 117                 contents will be written to standard output, suitable for
 118                 piping to (for example) gzip. This is only allowed if the
 119                 cluster has no additional tablespaces and WAL streaming is
 120                 not used.
 121
 122    -i old_manifest_file
 123           --incremental=old_manifest_file
 124           Performs an incremental backup. The backup manifest for the
 125           reference backup must be provided, and will be uploaded to the
 126           server, which will respond by sending the requested incremental
 127           backup.
 128
 129    -R
 130           --write-recovery-conf
 131           Creates a standby.signal file and appends connection settings to
 132           the postgresql.auto.conf file in the target directory (or within
 133           the base archive file when using tar format). This eases setting
 134           up a standby server using the results of the backup.
 135
 136           The postgresql.auto.conf file will record the connection
 137           settings and, if specified, the replication slot that
 138           pg_basebackup is using, so that streaming replication and
 139           logical replication slot synchronization will use the same
 140           settings later on. The dbname will be recorded only if the
 141           dbname was specified explicitly in the connection string or
 142           environment variable.
 143
 144    -t target
 145           --target=target
 146           Instructs the server where to place the base backup. The default
 147           target is client, which specifies that the backup should be sent
 148           to the machine where pg_basebackup is running. If the target is
 149           instead set to server:/some/path, the backup will be stored on
 150           the machine where the server is running in the /some/path
 151           directory. Storing a backup on the server requires superuser
 152           privileges or having privileges of the pg_write_server_files
 153           role. If the target is set to blackhole, the contents are
 154           discarded and not stored anywhere. This should only be used for
 155           testing purposes, as you will not end up with an actual backup.
 156
 157           Since WAL streaming is implemented by pg_basebackup rather than
 158           by the server, this option cannot be used together with
 159           -Xstream. Since that is the default, when this option is
 160           specified, you must also specify either -Xfetch or -Xnone.
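The restriction above can be checked in a wrapper script before pg_basebackup is invoked. The following is a minimal sketch, not part of pg_basebackup itself; the function name wal_method_ok_for_target is hypothetical, and an empty first argument is assumed to mean that --target was not given.

```shell
# Return 0 if the WAL method may be combined with the given --target value.
# An empty target means the --target option was not specified at all.
wal_method_ok_for_target() {
    target=$1
    method=$2
    # Without --target, every WAL method is allowed.
    if [ -z "$target" ]; then
        return 0
    fi
    case $method in
        s|stream) return 1 ;;   # streaming is done by the client, so it is rejected
        *)        return 0 ;;   # fetch and none remain available
    esac
}
```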
 161
 162    -T olddir=newdir
 163           --tablespace-mapping=olddir=newdir
 164           Relocates the tablespace in directory olddir to newdir during
 165           the backup. To be effective, olddir must exactly match the path
 166           specification of the tablespace as it is defined on the source
 167           server. (But it is not an error if there is no tablespace in
 168           olddir on the source server.) Meanwhile newdir is a directory in
 169           the receiving host's filesystem. As with the main target
 170           directory, newdir need not exist already, but if it does exist
 171           it must be empty. Both olddir and newdir must be absolute paths.
 172           If either path needs to contain an equal sign (=), precede that
 173           with a backslash. This option can be specified multiple times
 174           for multiple tablespaces.
 175
 176           If a tablespace is relocated in this way, the symbolic links
 177           inside the main data directory are updated to point to the new
 178           location. So the new data directory is ready to be used for a
 179           new server instance with all tablespaces in the updated
 180           locations.
 181
 182           Currently, this option only works with plain output format; it
 183           is ignored if tar format is selected.
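The escaping rule for equal signs is easy to get wrong by hand, so a script might build the option value with a small helper. This is a sketch under stated assumptions: the function names escape_equals and tablespace_mapping are hypothetical, and only the two rules described above (absolute paths, backslash before each equal sign) are applied.

```shell
# Backslash-escape every '=' in a path.
escape_equals() {
    printf '%s\n' "$1" | sed 's/=/\\=/g'
}

# Build the value for one --tablespace-mapping option.
tablespace_mapping() {
    olddir=$1
    newdir=$2
    # Both directories must be absolute paths.
    case $olddir in
        /*) ;;
        *) echo "olddir must be absolute: $olddir" >&2; return 1 ;;
    esac
    case $newdir in
        /*) ;;
        *) echo "newdir must be absolute: $newdir" >&2; return 1 ;;
    esac
    printf '%s=%s\n' "$(escape_equals "$olddir")" "$(escape_equals "$newdir")"
}
```

A call such as pg_basebackup -D backup/data -T "$(tablespace_mapping /opt/ts "$(pwd)/backup/ts")" would then pass the escaped mapping.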
 184
 185    --waldir=waldir
 186           Sets the directory to write WAL (write-ahead log) files to. By
 187           default WAL files will be placed in the pg_wal subdirectory of
 188           the target directory, but this option can be used to place them
 189           elsewhere. waldir must be an absolute path. As with the main
 190           target directory, waldir need not exist already, but if it does
 191           exist it must be empty. This option can only be specified when
 192           the backup is in plain format.
 193
 194    -X method
 195           --wal-method=method
 196           Includes the required WAL (write-ahead log) files in the backup.
 197           This will include all write-ahead logs generated during the
 198           backup. Unless the method none is specified, it is possible to
 199           start a postmaster in the target directory without the need to
 200           consult the WAL archive, thus making the output a completely
 201           standalone backup.
 202
 203           The following methods for collecting the write-ahead logs are
 204           supported:
 205
 206         n
 207                 none
 208                 Don't include write-ahead logs in the backup.
 209
 210         f
 211                 fetch
 212                 The write-ahead log files are collected at the end of the
 213                 backup. Therefore, it is necessary for the source server's
 214                 wal_keep_size parameter to be set high enough that the
 215                 required log data is not removed before the end of the
 216                 backup. If the required log data has been recycled before
 217                 it's time to transfer it, the backup will fail and be
 218                 unusable.
 219
 220                 When tar format is used, the write-ahead log files will be
 221                 included in the base.tar file.
 222
 223         s
 224                 stream
 225                 Stream write-ahead log data while the backup is being
 226                 taken. This method will open a second connection to the
 227                 server and start streaming the write-ahead log in parallel
 228                 while running the backup. Therefore, it will require two
  229                 replication connections, not just one. As long as the
 230                 client can keep up with the write-ahead log data, using
 231                 this method requires no extra write-ahead logs to be saved
 232                 on the source server.
 233
 234                 When tar format is used, the write-ahead log files will be
 235                 written to a separate file named pg_wal.tar (if the server
 236                 is a version earlier than 10, the file will be named
 237                 pg_xlog.tar).
 238
 239                 This value is the default.
 240
 241    -z
 242           --gzip
 243           Enables gzip compression of tar file output, with the default
 244           compression level. Compression is only available when using the
 245           tar format, and the suffix .gz will automatically be added to
 246           all tar filenames.
 247
 248    -Z level
 249           -Z [{client|server}-]method[:detail]
 250           --compress=level
 251           --compress=[{client|server}-]method[:detail]
 252           Requests compression of the backup. If client or server is
 253           included, it specifies where the compression is to be performed.
 254           Compressing on the server will reduce transfer bandwidth but
 255           will increase server CPU consumption. The default is client
 256           except when --target is used. In that case, the backup is not
 257           being sent to the client, so only server compression is
 258           sensible. When -Xstream, which is the default, is used,
 259           server-side compression will not be applied to the WAL. To
 260           compress the WAL, use client-side compression, or specify
 261           -Xfetch.
 262
  263           The compression method can be set to gzip, lz4, zstd, none (for
  264           no compression), or an integer (no compression if 0, gzip if
 265           greater than 0). A compression detail string can optionally be
 266           specified. If the detail string is an integer, it specifies the
 267           compression level. Otherwise, it should be a comma-separated
 268           list of items, each of the form keyword or keyword=value.
 269           Currently, the supported keywords are level, long, and workers.
 270           The detail string cannot be used when the compression method is
 271           specified as a plain integer.
 272
 273           If no compression level is specified, the default compression
 274           level will be used. If only a level is specified without
 275           mentioning an algorithm, gzip compression will be used if the
 276           level is greater than 0, and no compression will be used if the
 277           level is 0.
 278
 279           When the tar format is used with gzip, lz4, or zstd, the suffix
 280           .gz, .lz4, or .zst, respectively, will be automatically added to
 281           all tar filenames. When the plain format is used, client-side
 282           compression may not be specified, but it is still possible to
 283           request server-side compression. If this is done, the server
 284           will compress the backup for transmission, and the client will
 285           decompress and extract it.
 286
 287           When this option is used in combination with -Xstream,
 288           pg_wal.tar will be compressed using gzip if client-side gzip
 289           compression is selected, but will not be compressed if any other
 290           compression algorithm is selected, or if server-side compression
 291           is selected.
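A script that needs to locate the resulting archives can derive the file name suffix from the --compress value. The sketch below only mirrors the naming rules stated above for tar-format output; the function name compress_suffix is hypothetical, and it does not validate the detail string or model the special handling of pg_wal.tar.

```shell
# Print the tar file name suffix implied by a --compress value.
compress_suffix() {
    spec=$1
    spec=${spec#client-}      # drop an optional location prefix
    spec=${spec#server-}
    method=${spec%%:*}        # drop an optional :detail string
    case $method in
        gzip) echo ".gz" ;;
        lz4)  echo ".lz4" ;;
        zstd) echo ".zst" ;;
        none) echo "" ;;
        ''|*[!0-9]*)
            echo "unrecognized compression method: $method" >&2
            return 1 ;;
        *)
            # A plain integer: 0 means no compression, anything larger means gzip.
            if [ "$method" -eq 0 ]; then echo ""; else echo ".gz"; fi ;;
    esac
}
```

For example, the main archive produced with --compress=server-zstd:level=3 would be expected at base.tar followed by the printed suffix.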
 292
 293    The following command-line options control the generation of the backup
 294    and the invocation of the program:
 295
 296    -c {fast|spread}
 297           --checkpoint={fast|spread}
 298           Sets checkpoint mode to fast (immediate) or spread (the default)
 299           (see Section 25.3.4).
 300
 301    -C
 302           --create-slot
 303           Specifies that the replication slot named by the --slot option
 304           should be created before starting the backup. An error is raised
 305           if the slot already exists.
 306
 307    -l label
 308           --label=label
 309           Sets the label for the backup. If none is specified, a default
 310           value of “pg_basebackup base backup” will be used.
 311
 312    -n
 313           --no-clean
 314           By default, when pg_basebackup aborts with an error, it removes
 315           any directories it might have created before discovering that it
 316           cannot finish the job (for example, the target directory and
 317           write-ahead log directory). This option inhibits tidying-up and
 318           is thus useful for debugging.
 319
 320           Note that tablespace directories are not cleaned up either way.
 321
 322    -N
 323           --no-sync
 324           By default, pg_basebackup will wait for all files to be written
 325           safely to disk. This option causes pg_basebackup to return
 326           without waiting, which is faster, but means that a subsequent
 327           operating system crash can leave the base backup corrupt.
 328           Generally, this option is useful for testing but should not be
 329           used when creating a production installation.
 330
 331    -P
 332           --progress
 333           Enables progress reporting. Turning this on will deliver an
 334           approximate progress report during the backup. Since the
 335           database may change during the backup, this is only an
 336           approximation and may not end at exactly 100%. In particular,
  337           when WAL is included in the backup, the total amount of data
 338           cannot be estimated in advance, and in this case the estimated
 339           target size will increase once it passes the total estimate
 340           without WAL.
 341
 342    -r rate
 343           --max-rate=rate
 344           Sets the maximum transfer rate at which data is collected from
 345           the source server. This can be useful to limit the impact of
 346           pg_basebackup on the server. Values are in kilobytes per second.
 347           Use a suffix of M to indicate megabytes per second. A suffix of
 348           k is also accepted, and has no effect. Valid values are between
 349           32 kilobytes per second and 1024 megabytes per second.
 350
 351           This option always affects transfer of the data directory.
 352           Transfer of WAL files is only affected if the collection method
 353           is fetch.
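The unit and range rules for --max-rate can be sketched as a small validation helper. This is an illustration only: the function name max_rate_kb is hypothetical, and as a simplifying assumption it accepts only whole numbers without leading zeros, which may be stricter than pg_basebackup's own parsing.

```shell
# Convert a --max-rate value to kilobytes per second and check its range.
max_rate_kb() {
    value=$1
    case $value in
        *M) number=${value%M}; factor=1024 ;;   # megabytes per second
        *k) number=${value%k}; factor=1 ;;      # explicit kilobytes, no effect
        *)  number=$value;     factor=1 ;;      # bare number: kilobytes per second
    esac
    # Assumption: only plain decimal integers are handled in this sketch.
    case $number in
        ''|*[!0-9]*|0?*)
            echo "invalid rate: $value" >&2
            return 1 ;;
    esac
    rate=$((number * factor))
    # Valid range: 32 kB/s up to 1024 MB/s (1048576 kB/s).
    if [ "$rate" -lt 32 ] || [ "$rate" -gt 1048576 ]; then
        echo "rate out of range: $value" >&2
        return 1
    fi
    echo "$rate"
}
```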
 354
 355    -S slotname
 356           --slot=slotname
 357           This option can only be used together with -X stream. It causes
 358           WAL streaming to use the specified replication slot. If the base
 359           backup is intended to be used as a streaming-replication standby
 360           using a replication slot, the standby should then use the same
 361           replication slot name as primary_slot_name. This ensures that
 362           the primary server does not remove any necessary WAL data in the
 363           time between the end of the base backup and the start of
 364           streaming replication on the new standby.
 365
 366           The specified replication slot has to exist unless the option -C
 367           is also used.
 368
 369           If this option is not specified and the server supports
 370           temporary replication slots (version 10 and later), then a
 371           temporary replication slot is automatically used for WAL
 372           streaming.
 373
 374    --sync-method=method
 375           When set to fsync, which is the default, pg_basebackup will
 376           recursively open and synchronize all files in the backup
 377           directory. When the plain format is used, the search for files
 378           will follow symbolic links for the WAL directory and each
 379           configured tablespace.
 380
 381           On Linux, syncfs may be used instead to ask the operating system
 382           to synchronize the whole file system that contains the backup
 383           directory. When the plain format is used, pg_basebackup will
 384           also synchronize the file systems that contain the WAL files and
 385           each tablespace. See recovery_init_sync_method for information
 386           about the caveats to be aware of when using syncfs.
 387
 388           This option has no effect when --no-sync is used.
 389
 390    -v
 391           --verbose
  392           Enables verbose mode. pg_basebackup reports additional steps
  393           during startup and shutdown, and shows the exact file name that
  394           is currently being processed if progress reporting is also
  395           enabled.
 396
 397    --manifest-checksums=algorithm
 398           Specifies the checksum algorithm that should be applied to each
 399           file included in the backup manifest. Currently, the available
 400           algorithms are NONE, CRC32C, SHA224, SHA256, SHA384, and SHA512.
 401           The default is CRC32C.
 402
 403           If NONE is selected, the backup manifest will not contain any
 404           checksums. Otherwise, it will contain a checksum of each file in
 405           the backup using the specified algorithm. In addition, the
 406           manifest will always contain a SHA256 checksum of its own
 407           contents. The SHA algorithms are significantly more
 408           CPU-intensive than CRC32C, so selecting one of them may increase
 409           the time required to complete the backup.
 410
 411           Using a SHA hash function provides a cryptographically secure
 412           digest of each file for users who wish to verify that the backup
 413           has not been tampered with, while the CRC-32C algorithm provides
 414           a checksum that is much faster to calculate; it is good at
 415           catching errors due to accidental changes but is not resistant
 416           to malicious modifications. Note that, to be useful against an
 417           adversary who has access to the backup, the backup manifest
 418           would need to be stored securely elsewhere or otherwise verified
 419           not to have been modified since the backup was taken.
 420
 421           pg_verifybackup can be used to check the integrity of a backup
 422           against the backup manifest.
 423
 424    --manifest-force-encode
 425           Forces all filenames in the backup manifest to be hex-encoded.
 426           If this option is not specified, only non-UTF8 filenames are
 427           hex-encoded. This option is mostly intended to test that tools
 428           which read a backup manifest file properly handle this case.
 429
 430    --no-estimate-size
 431           Prevents the server from estimating the total amount of backup
 432           data that will be streamed, resulting in the backup_total column
 433           in the pg_stat_progress_basebackup view always being NULL.
 434
 435           Without this option, the backup will start by enumerating the
 436           size of the entire database, and then go back and send the
 437           actual contents. This may make the backup take slightly longer,
 438           and in particular it will take longer before the first data is
  439           sent. This option is useful for skipping the estimation step when
  440           it takes too long.
 441
 442           This option is not allowed when using --progress.
 443
 444    --no-manifest
 445           Disables generation of a backup manifest. If this option is not
 446           specified, the server will generate and send a backup manifest
 447           which can be verified using pg_verifybackup. The manifest is a
 448           list of every file present in the backup with the exception of
 449           any WAL files that may be included. It also stores the size,
 450           last modification time, and an optional checksum for each file.
 451
 452    --no-slot
 453           Prevents the creation of a temporary replication slot for the
 454           backup.
 455
 456           By default, if log streaming is selected but no slot name is
 457           given with the -S option, then a temporary replication slot is
 458           created (if supported by the source server).
 459
 460           The main purpose of this option is to allow taking a base backup
 461           when the server has no free replication slots. Using a
 462           replication slot is almost always preferred, because it prevents
 463           needed WAL from being removed by the server during the backup.
 464
 465    --no-verify-checksums
 466           Disables verification of checksums, if they are enabled on the
 467           server the base backup is taken from.
 468
 469           By default, checksums are verified and checksum failures will
 470           result in a non-zero exit status. However, the base backup will
 471           not be removed in such a case, as if the --no-clean option had
 472           been used. Checksum verification failures will also be reported
 473           in the pg_stat_database view.
 474
 475    The following command-line options control the connection to the source
 476    server:
 477
 478    -d connstr
 479           --dbname=connstr
 480           Specifies parameters used to connect to the server, as a
 481           connection string; these will override any conflicting command
 482           line options.
 483
 484           This option is called --dbname for consistency with other client
 485           applications, but because pg_basebackup doesn't connect to any
 486           particular database in the cluster, any database name included
 487           in the connection string will be ignored by the server. However,
 488           a database name supplied that way overrides the default database
 489           name (replication) for purposes of looking up the replication
 490           connection's password in ~/.pgpass. Similarly, middleware or
 491           proxies used in connecting to PostgreSQL might utilize the name
 492           for purposes such as connection routing. The database name can
 493           also be used by logical replication slot synchronization.
 494
 495    -h host
 496           --host=host
 497           Specifies the host name of the machine on which the server is
 498           running. If the value begins with a slash, it is used as the
 499           directory for a Unix domain socket. The default is taken from
 500           the PGHOST environment variable, if set, else a Unix domain
 501           socket connection is attempted.
 502
 503    -p port
 504           --port=port
 505           Specifies the TCP port or local Unix domain socket file
 506           extension on which the server is listening for connections.
 507           Defaults to the PGPORT environment variable, if set, or a
 508           compiled-in default.
 509
 510    -s interval
 511           --status-interval=interval
 512           Specifies the number of seconds between status packets sent back
 513           to the source server. Smaller values allow more accurate
 514           monitoring of backup progress from the server. A value of zero
 515           disables periodic status updates completely, although an update
 516           will still be sent when requested by the server, to avoid
 517           timeout-based disconnects. The default value is 10 seconds.
 518
 519    -U username
 520           --username=username
 521           Specifies the user name to connect as.
 522
 523    -w
 524           --no-password
 525           Prevents issuing a password prompt. If the server requires
 526           password authentication and a password is not available by other
 527           means such as a .pgpass file, the connection attempt will fail.
 528           This option can be useful in batch jobs and scripts where no
 529           user is present to enter a password.
 530
 531    -W
 532           --password
 533           Forces pg_basebackup to prompt for a password before connecting
 534           to the source server.
 535
 536           This option is never essential, since pg_basebackup will
 537           automatically prompt for a password if the server demands
 538           password authentication. However, pg_basebackup will waste a
 539           connection attempt finding out that the server wants a password.
 540           In some cases it is worth typing -W to avoid the extra
 541           connection attempt.
 542
 543    Other options are also available:
 544
 545    -V
 546           --version
 547           Prints the pg_basebackup version and exits.
 548
 549    -?
 550           --help
 551           Shows help about pg_basebackup command line arguments, and
 552           exits.
 553
 554 Environment
 555
 556    This utility, like most other PostgreSQL utilities, uses the
 557    environment variables supported by libpq (see Section 32.15).
 558
 559    The environment variable PG_COLOR specifies whether to use color in
 560    diagnostic messages. Possible values are always, auto and never.
 561
 562 Notes
 563
 564    At the beginning of the backup, a checkpoint needs to be performed on
 565    the source server. This can take some time (especially if the option
 566    --checkpoint=fast is not used), during which pg_basebackup will appear
 567    to be idle.
 568
 569    The backup will include all files in the data directory and
 570    tablespaces, including the configuration files and any additional files
 571    placed in the directory by third parties, except certain temporary
 572    files managed by PostgreSQL and operating system files. But only
 573    regular files and directories are copied, except that symbolic links
 574    used for tablespaces are preserved. Symbolic links pointing to certain
 575    directories known to PostgreSQL are copied as empty directories. Other
 576    symbolic links and special device files are skipped. See Section 54.4
 577    for the precise details.
 578
 579    In plain format, tablespaces will be backed up to the same path they
 580    have on the source server, unless the option --tablespace-mapping is
 581    used. Without this option, running a plain format base backup on the
 582    same host as the server will not work if tablespaces are in use,
 583    because the backup would have to be written to the same directory
 584    locations as the original tablespaces.
 585
 586    When tar format is used, it is the user's responsibility to unpack each
 587    tar file before starting a PostgreSQL server that uses the data. If
 588    there are additional tablespaces, the tar files for them need to be
 589    unpacked in the correct locations. In this case the symbolic links for
 590    those tablespaces will be created by the server according to the
 591    contents of the tablespace_map file that is included in the base.tar
 592    file.
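The unpacking step for tablespaces could be planned with a helper like the one below. It is a sketch resting on two assumptions that should be checked against a real backup: that each line of tablespace_map has the form "OID path", and that the archives are uncompressed and named OID.tar. The function name tablespace_unpack_plan is hypothetical; it only lists archive and destination pairs and extracts nothing.

```shell
# List "archive<TAB>destination" pairs for the tablespace archives
# of a tar-format backup.
tablespace_unpack_plan() {
    backupdir=$1     # directory holding base.tar and the <OID>.tar files
    map=$2           # tablespace_map file extracted from base.tar
    while read -r oid path; do
        [ -n "$oid" ] || continue      # skip blank lines
        printf '%s\t%s\n' "$backupdir/$oid.tar" "$path"
    done < "$map"
}
```

Each printed pair could then be fed to tar -xf with -C pointing at the destination, after base.tar itself has been unpacked into the new data directory.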
 593
 594    pg_basebackup works with servers of the same or older major version,
 595    down to 9.1. However, WAL streaming mode (-X stream) only works with
 596    server version 9.3 and later, the tar format (--format=tar) only works
 597    with server version 9.5 and later, and incremental backup
 598    (--incremental) only works with server version 17 and later.
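The version thresholds above can be encoded in a pre-flight check. The sketch assumes the server version is available in the numeric form reported by the server_version_num setting (for example 90300 for 9.3 or 170002 for 17.2); the function name server_supports and its feature keywords are hypothetical.

```shell
# Return 0 if a server with the given server_version_num supports the feature.
server_supports() {
    feature=$1
    vernum=$2
    case $feature in
        backup)      min=90100 ;;    # any base backup: 9.1 and later
        stream)      min=90300 ;;    # -X stream: 9.3 and later
        tar)         min=90500 ;;    # --format=tar: 9.5 and later
        incremental) min=170000 ;;   # --incremental: 17 and later
        *)
            echo "unknown feature: $feature" >&2
            return 2 ;;
    esac
    [ "$vernum" -ge "$min" ]
}
```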
 599
 600    pg_basebackup will preserve group permissions for data files if group
 601    permissions are enabled on the source cluster.
 602
 603 Examples
 604
 605    To create a base backup of the server at mydbserver and store it in the
 606    local directory /usr/local/pgsql/data:
 607 $ pg_basebackup -h mydbserver -D /usr/local/pgsql/data
 608
 609    To create a backup of the local server with one compressed tar file for
 610    each tablespace, and store it in the directory backup, showing a
 611    progress report while running:
 612 $ pg_basebackup -D backup -Ft -z -P
 613
 614    To create a backup of a single-tablespace local database and compress
 615    this with bzip2:
 616 $ pg_basebackup -D - -Ft -X fetch | bzip2 > backup.tar.bz2
 617
 618    (This command will fail if there are multiple tablespaces in the
 619    database.)
 620
 621    To create a backup of a local database where the tablespace in /opt/ts
 622    is relocated to ./backup/ts:
 623 $ pg_basebackup -D backup/data -T /opt/ts=$(pwd)/backup/ts
 624
 625    To create a backup of the local server with one tar file for each
 626    tablespace compressed with gzip at level 9, stored in the directory
 627    backup:
 628 $ pg_basebackup -D backup -Ft --compress=gzip:9
 629
 630 See Also
 631
 632    pg_dump, Section 27.4.6