25.2. File System Level Backup

   An alternative backup strategy is to directly copy the files that
   PostgreSQL uses to store the data in the database; Section 18.2
   explains where these files are located. You can use whatever method you
   prefer for doing file system backups; for example:
tar -cf backup.tar /usr/local/pgsql/data

   There are two restrictions, however, which make this method
   impractical, or at least inferior to the pg_dump method:
    1. The database server must be shut down in order to get a usable
       backup. Half-way measures such as disallowing all connections will
       not work (in part because tar and similar tools do not take an
       atomic snapshot of the state of the file system, but also because
       of internal buffering within the server). Information about
       stopping the server can be found in Section 18.5. Needless to say,
       you also need to shut down the server before restoring the data.
    2. If you have dug into the details of the file system layout of the
       database, you might be tempted to try to back up or restore only
       certain individual tables or databases from their respective files
       or directories. This will not work because the information
       contained in these files is not usable without the commit log
       files, pg_xact/*, which contain the commit status of all
       transactions. A table file is only usable with this information. Of
       course it is also impossible to restore only a table and the
       associated pg_xact data because that would render all other tables
       in the database cluster useless. So file system backups only work
       for complete backup and restoration of an entire database cluster.

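   The two restrictions above amount to a simple sequence: stop the
   server, copy the entire data directory, start the server again. A
   minimal sketch follows; cold_backup is a hypothetical helper name and
   both arguments are placeholders, not anything the manual prescribes.

```shell
# Sketch of a cold (offline) file system backup of an entire cluster.
# cold_backup and its arguments are hypothetical, for illustration only.
cold_backup() {
    pgdata=$1    # data directory, e.g. /usr/local/pgsql/data
    archive=$2   # tar file to create (must live outside the data directory)

    # Stop the server and wait until shutdown has completed.
    pg_ctl stop -D "$pgdata" -m fast -w || return 1

    # Copy the whole data directory, never individual tables or databases.
    tar -cf "$archive" "$pgdata"
    rc=$?

    # Restart the server whether or not tar succeeded.
    pg_ctl start -D "$pgdata" -w || return 1
    return "$rc"
}
```

   For example, cold_backup /usr/local/pgsql/data backup.tar produces
   the same archive as the tar command shown earlier, but only after
   the server has been stopped.
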
   An alternative file-system backup approach is to make a “consistent
   snapshot” of the data directory, if the file system supports that
   functionality (and you are willing to trust that it is implemented
   correctly). The typical procedure is to make a “frozen snapshot” of the
   volume containing the database, then copy the whole data directory (not
   just parts, see above) from the snapshot to a backup device, then
   release the frozen snapshot. This will work even while the database
   server is running. However, a backup created in this way saves the
   database files in a state as if the database server had not been
   properly shut down; therefore, when you start the database server on
   the backed-up data, it will think the previous server instance crashed
   and will replay the WAL. This is not a problem; just be aware of it
   (and be sure to include the WAL files in your backup). You can perform
   a CHECKPOINT before taking the snapshot to reduce recovery time.

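   The snapshot procedure can be sketched as follows. The manual does
   not name a snapshot mechanism, so this sketch assumes LVM. It also
   assumes that the logical volume holds nothing but the data directory
   (WAL included) and that psql can connect as a role allowed to run
   CHECKPOINT. The function name, the snapshot name pgsnap, and the 1G
   copy-on-write size are all hypothetical. Some file systems need
   extra mount options to mount a snapshot read-only.

```shell
# Sketch of a snapshot-based backup, assuming an LVM logical volume.
# snapshot_backup, "pgsnap" and the 1G size are illustrative assumptions.
snapshot_backup() {
    origin=$1       # logical volume with the data directory, e.g. /dev/vg0/pgdata
    mountpoint=$2   # empty directory on which to mount the snapshot
    archive=$3      # tar file to create

    # Optional: checkpoint first so that less WAL must be replayed on restore.
    psql -c 'CHECKPOINT' || return 1

    # Create the frozen snapshot of the volume.
    lvcreate --snapshot --name pgsnap --size 1G "$origin" || return 1
    snap="$(dirname "$origin")/pgsnap"

    # Copy everything from the snapshot, including the WAL files.
    mount -o ro "$snap" "$mountpoint" &&
        tar -cf "$archive" -C "$mountpoint" .
    rc=$?

    # Release the snapshot whether or not the copy succeeded.
    umount "$mountpoint" 2>/dev/null
    lvremove -f "$snap"
    return "$rc"
}
```
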
   If your database is spread across multiple file systems, there might
   not be any way to obtain exactly-simultaneous frozen snapshots of all
   the volumes. For example, if your data files and WAL are on
   different disks, or if tablespaces are on different file systems, it
   might not be possible to use snapshot backup because the snapshots must
   be simultaneous. Read your file system documentation very carefully
   before trusting the consistent-snapshot technique in such situations.

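   Before relying on a single snapshot, it may help to check whether
   every relevant path really sits on one file system. The sketch below
   compares the mount point that df -P reports for each path;
   same_filesystem is a hypothetical helper, and it assumes mount
   points contain no whitespace.

```shell
# Sketch: succeed only if all given paths are on the same mounted file system.
# same_filesystem is a hypothetical helper; pass it the data directory, the
# WAL directory, and every tablespace location.
same_filesystem() {
    first=""
    for path in "$@"; do
        # With -P, the mount point is field 6 of the second output line.
        fs=$(df -P "$path" | awk 'NR == 2 { print $6 }')
        [ -n "$fs" ] || return 2          # df could not examine the path
        if [ -z "$first" ]; then
            first=$fs
        elif [ "$fs" != "$first" ]; then
            echo "$path is on $fs, not on $first" >&2
            return 1
        fi
    done
    return 0
}
```

   A non-zero result means a single snapshot cannot cover the cluster,
   and one of the options described next is needed.
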
   If simultaneous snapshots are not possible, one option is to shut down
   the database server long enough to establish all the frozen snapshots.
   Another option is to perform a continuous archiving base backup
   (Section 25.3.2) because such backups are immune to file system changes
   during the backup. This requires enabling continuous archiving just
   during the backup process; restore is done using continuous archive
   recovery (Section 25.3.5).

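   One way to take such a base backup is the pg_basebackup tool. The
   text above only points to Section 25.3.2, so treat the choice of tool
   and options here as an assumption. In this sketch the WAL generated
   during the backup is streamed into the backup itself rather than
   taken from an archive. base_backup is a hypothetical wrapper, and
   connection settings are assumed to come from the environment.

```shell
# Sketch: base backup that tolerates file changes during the copy.
# base_backup is a hypothetical wrapper around pg_basebackup.
base_backup() {
    dest=$1   # empty or nonexistent directory that receives the backup

    # -X stream copies the WAL written during the backup along with the
    # data files; --checkpoint=fast avoids waiting for a scheduled checkpoint.
    pg_basebackup -D "$dest" -X stream --checkpoint=fast
}
```
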
   Another option is to use rsync to perform a file system backup. This is
   done by first running rsync while the database server is running, then
   shutting down the database server long enough to do an rsync
   --checksum. (--checksum is necessary because rsync only has file
   modification-time granularity of one second.) The second rsync will be
   quicker than the first, because it has relatively little data to
   transfer, and the end result will be consistent because the server was
   down. This method allows a file system backup to be performed with
   minimal downtime.

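   The two-pass rsync method can be sketched as follows. rsync_backup is
   a hypothetical helper. Using --delete, so that files removed between
   the passes do not linger in the copy, is an added assumption, as is
   tolerating exit code 24 on the first pass.

```shell
# Sketch of the two-pass rsync backup; rsync_backup is a hypothetical helper.
rsync_backup() {
    pgdata=$1   # data directory
    dest=$2     # backup destination directory

    # Pass 1: bulk copy while the server runs; the result is not yet consistent.
    # Exit code 24 (source files vanished mid-transfer) is tolerated here.
    rsync -a --delete "$pgdata/" "$dest/"
    rc=$?
    [ "$rc" -eq 0 ] || [ "$rc" -eq 24 ] || return "$rc"

    pg_ctl stop -D "$pgdata" -m fast -w || return 1

    # Pass 2: server is down; compare file contents, not just size and time.
    rsync -a --delete --checksum "$pgdata/" "$dest/"
    rc=$?

    # Restart the server whether or not the second pass succeeded.
    pg_ctl start -D "$pgdata" -w || return 1
    return "$rc"
}
```
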
   Note that a file system backup will typically be larger than an SQL
   dump. (pg_dump does not need to dump the contents of indexes, for
   example, just the commands to recreate them.) However, taking a file
   system backup might be faster.