begriffs open source - ai-pg/blob - full-docs/src/sgml/html/creating-cluster.html

   1 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
   2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>18.2. Creating a Database Cluster</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="postgres-user.html" title="18.1. The PostgreSQL User Account" /><link rel="next" href="server-start.html" title="18.3. Starting the Database Server" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">18.2. Creating a Database Cluster</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="postgres-user.html" title="18.1. The PostgreSQL User Account">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 18. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 18. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-start.html" title="18.3. Starting the Database Server">Next</a></td></tr></table><hr /></div><div class="sect1" id="CREATING-CLUSTER"><div class="titlepage"><div><div><h2 class="title" style="clear: both">18.2. Creating a Database Cluster <a href="#CREATING-CLUSTER" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-MOUNT-POINTS">18.2.1. Use of Secondary File Systems</a></span></dt><dt><span class="sect2"><a href="creating-cluster.html#CREATING-CLUSTER-FILESYSTEM">18.2.2. File Systems</a></span></dt></dl></div><a id="id-1.6.5.5.2" class="indexterm"></a><a id="id-1.6.5.5.3" class="indexterm"></a><p>
   3    Before you can do anything, you must initialize a database storage
   4    area on disk. We call this a <em class="firstterm">database cluster</em>.
   5    (The <acronym class="acronym">SQL</acronym> standard uses the term catalog cluster.) A
   6    database cluster is a collection of databases that is managed by a
   7    single instance of a running database server. After initialization, a
   8    database cluster will contain a database named <code class="literal">postgres</code>,
   9    which is meant as a default database for use by utilities, users and third
  10    party applications.  The database server itself does not require the
  11    <code class="literal">postgres</code> database to exist, but many external utility
  12    programs assume it exists.  There are two more databases created within
  13    each cluster during initialization, named <code class="literal">template1</code>
  14    and <code class="literal">template0</code>.  As the names suggest, these will be
  15    used as templates for subsequently-created databases; they should not be
  16    used for actual work.  (See <a class="xref" href="managing-databases.html" title="Chapter 22. Managing Databases">Chapter 22</a> for
  17    information about creating new databases within a cluster.)
  18   </p><p>
  19    In file system terms, a database cluster is a single directory
  20    under which all data will be stored. We call this the <em class="firstterm">data
  21    directory</em> or <em class="firstterm">data area</em>. It is
  22    completely up to you where you choose to store your data.  There is no
  23    default, although locations such as
  24    <code class="filename">/usr/local/pgsql/data</code> or
  25    <code class="filename">/var/lib/pgsql/data</code> are popular.
  26    The data directory must be initialized before being used, using the program
  27    <a class="xref" href="app-initdb.html" title="initdb"><span class="refentrytitle"><span class="application">initdb</span></span></a><a id="id-1.6.5.5.5.6" class="indexterm"></a>
  28    which is installed with <span class="productname">PostgreSQL</span>.
  29   </p><p>
  30    If you are using a pre-packaged version
  31    of <span class="productname">PostgreSQL</span>, it may well have a specific
  32    convention for where to place the data directory, and it may also
  33    provide a script for creating the data directory.  In that case you
  34    should use that script in preference to
  35    running <code class="command">initdb</code> directly.
  36    Consult the package-level documentation for details.
  37   </p><p>
  38    To initialize a database cluster manually,
  39    run <code class="command">initdb</code> and specify the desired
  40    file system location of the database cluster with the
  41    <code class="option">-D</code> option, for example:
  42 </p><pre class="screen">
  43 <code class="prompt">$</code> <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
  44 </pre><p>
  45    Note that you must execute this command while logged into the
  46    <span class="productname">PostgreSQL</span> user account, which is
  47    described in the previous section.
  48   </p><div class="tip"><h3 class="title">Tip</h3><p>
  49     As an alternative to the <code class="option">-D</code> option, you can set
  50     the environment variable <code class="envar">PGDATA</code>.
  51     <a id="id-1.6.5.5.8.1.3" class="indexterm"></a>
  52    </p></div><p>
  53    Alternatively, you can run <code class="command">initdb</code> via
  54    the <a class="xref" href="app-pg-ctl.html" title="pg_ctl"><span class="refentrytitle"><span class="application">pg_ctl</span></span></a>
  55    program<a id="id-1.6.5.5.9.3" class="indexterm"></a> like so:
  56 </p><pre class="screen">
  57 <code class="prompt">$</code> <strong class="userinput"><code>pg_ctl -D /usr/local/pgsql/data initdb</code></strong>
  58 </pre><p>
  59    This may be more intuitive if you are
  60    using <code class="command">pg_ctl</code> for starting and stopping the
  61    server (see <a class="xref" href="server-start.html" title="18.3. Starting the Database Server">Section 18.3</a>), so
  62    that <code class="command">pg_ctl</code> would be the sole command you use
  63    for managing the database server instance.
  64   </p><p>
  65    <code class="command">initdb</code> will attempt to create the directory you
  66    specify if it does not already exist.  Of course, this will fail if
  67    <code class="command">initdb</code> does not have permissions to write in the
  68    parent directory.  It's generally recommendable that the
  69    <span class="productname">PostgreSQL</span> user own not just the data
  70    directory but its parent directory as well, so that this should not
  71    be a problem.  If the desired parent directory doesn't exist either,
  72    you will need to create it first, using root privileges if the
  73    grandparent directory isn't writable.  So the process might look
  74    like this:
  75 </p><pre class="screen">
  76 root# <strong class="userinput"><code>mkdir /usr/local/pgsql</code></strong>
  77 root# <strong class="userinput"><code>chown postgres /usr/local/pgsql</code></strong>
  78 root# <strong class="userinput"><code>su postgres</code></strong>
  79 postgres$ <strong class="userinput"><code>initdb -D /usr/local/pgsql/data</code></strong>
  80 </pre><p>
  81   </p><p>
  82    <code class="command">initdb</code> will refuse to run if the data directory
  83    exists and already contains files; this is to prevent accidentally
  84    overwriting an existing installation.
  85   </p><p>
  86    Because the data directory contains all the data stored in the
  87    database, it is essential that it be secured from unauthorized
  88    access. <code class="command">initdb</code> therefore revokes access
  89    permissions from everyone but the
  90    <span class="productname">PostgreSQL</span> user, and optionally, group.
  91    Group access, when enabled, is read-only.  This allows an unprivileged
  92    user in the same group as the cluster owner to take a backup of the
  93    cluster data or perform other operations that only require read access.
  94   </p><p>
  95    Note that enabling or disabling group access on an existing cluster requires
  96    the cluster to be shut down and the appropriate mode to be set on all
  97    directories and files before restarting
  98    <span class="productname">PostgreSQL</span>.  Otherwise, a mix of modes might
  99    exist in the data directory.  For clusters that allow access only by the
 100    owner, the appropriate modes are <code class="literal">0700</code> for directories
 101    and <code class="literal">0600</code> for files.  For clusters that also allow
 102    reads by the group, the appropriate modes are <code class="literal">0750</code>
 103    for directories and <code class="literal">0640</code> for files.
 104   </p><p>
 105    However, while the directory contents are secure, the default
 106    client authentication setup allows any local user to connect to the
 107    database and even become the database superuser. If you do not
 108    trust other local users, we recommend you use one of
 109    <code class="command">initdb</code>'s <code class="option">-W</code>, <code class="option">--pwprompt</code>
 110    or <code class="option">--pwfile</code> options to assign a password to the
 111    database superuser.<a id="id-1.6.5.5.14.5" class="indexterm"></a>
 112    Also, specify <code class="option">-A scram-sha-256</code>
 113    so that the default <code class="literal">trust</code> authentication
 114    mode is not used; or modify the generated <code class="filename">pg_hba.conf</code>
 115    file after running <code class="command">initdb</code>, but
 116    <span class="emphasis"><em>before</em></span> you start the server for the first time. (Other
 117    reasonable approaches include using <code class="literal">peer</code> authentication
 118    or file system permissions to restrict connections. See <a class="xref" href="client-authentication.html" title="Chapter 20. Client Authentication">Chapter 20</a> for more information.)
 119   </p><p>
 120    <code class="command">initdb</code> also initializes the default
 121    locale<a id="id-1.6.5.5.15.2" class="indexterm"></a> for the database cluster.
 122    Normally, it will just take the locale settings in the environment
 123    and apply them to the initialized database.  It is possible to
 124    specify a different locale for the database; more information about
 125    that can be found in <a class="xref" href="locale.html" title="23.1. Locale Support">Section 23.1</a>.  The default sort order used
 126    within the particular database cluster is set by
 127    <code class="command">initdb</code>, and while you can create new databases using
 128    different sort order, the order used in the template databases that initdb
 129    creates cannot be changed without dropping and recreating them.
 130    There is also a performance impact for using locales
 131    other than <code class="literal">C</code> or <code class="literal">POSIX</code>. Therefore, it is
 132    important to make this choice correctly the first time.
 133   </p><p>
 134    <code class="command">initdb</code> also sets the default character set encoding
 135    for the database cluster.  Normally this should be chosen to match the
 136    locale setting.  For details see <a class="xref" href="multibyte.html" title="23.3. Character Set Support">Section 23.3</a>.
 137   </p><p>
 138    Non-<code class="literal">C</code> and non-<code class="literal">POSIX</code> locales rely on the
 139    operating system's collation library for character set ordering.
 140    This controls the ordering of keys stored in indexes.  For this reason,
 141    a cluster cannot switch to an incompatible collation library version,
 142    either through snapshot restore, binary streaming replication, a
 143    different operating system, or an operating system upgrade.
 144   </p><div class="sect2" id="CREATING-CLUSTER-MOUNT-POINTS"><div class="titlepage"><div><div><h3 class="title">18.2.1. Use of Secondary File Systems <a href="#CREATING-CLUSTER-MOUNT-POINTS" class="id_link">#</a></h3></div></div></div><a id="id-1.6.5.5.18.2" class="indexterm"></a><p>
 145     Many installations create their database clusters on file systems
 146     (volumes) other than the machine's <span class="quote">“<span class="quote">root</span>”</span> volume.  If you
 147     choose to do this, it is not advisable to try to use the secondary
 148     volume's topmost directory (mount point) as the data directory.
 149     Best practice is to create a directory within the mount-point
 150     directory that is owned by the <span class="productname">PostgreSQL</span>
 151     user, and then create the data directory within that.  This avoids
 152     permissions problems, particularly for operations such
 153     as <span class="application">pg_upgrade</span>, and it also ensures clean failures if
 154     the secondary volume is taken offline.
 155    </p></div><div class="sect2" id="CREATING-CLUSTER-FILESYSTEM"><div class="titlepage"><div><div><h3 class="title">18.2.2. File Systems <a href="#CREATING-CLUSTER-FILESYSTEM" class="id_link">#</a></h3></div></div></div><p>
 156     Generally, any file system with POSIX semantics can be used for
 157     PostgreSQL.  Users prefer different file systems for a variety of reasons,
 158     including vendor support, performance, and familiarity.  Experience
 159     suggests that, all other things being equal, one should not expect major
 160     performance or behavior changes merely from switching file systems or
 161     making minor file system configuration changes.
 162    </p><div class="sect3" id="CREATING-CLUSTER-NFS"><div class="titlepage"><div><div><h4 class="title">18.2.2.1. NFS <a href="#CREATING-CLUSTER-NFS" class="id_link">#</a></h4></div></div></div><a id="id-1.6.5.5.19.3.2" class="indexterm"></a><p>
 163      It is possible to use an <acronym class="acronym">NFS</acronym> file system for storing
 164      the <span class="productname">PostgreSQL</span> data directory.
 165      <span class="productname">PostgreSQL</span> does nothing special for
 166      <acronym class="acronym">NFS</acronym> file systems, meaning it assumes
 167      <acronym class="acronym">NFS</acronym> behaves exactly like locally-connected drives.
 168      <span class="productname">PostgreSQL</span> does not use any functionality that
 169      is known to have nonstandard behavior on <acronym class="acronym">NFS</acronym>, such as
 170      file locking.
 171     </p><p>
 172      The only firm requirement for using <acronym class="acronym">NFS</acronym> with
 173      <span class="productname">PostgreSQL</span> is that the file system is mounted
 174      using the <code class="literal">hard</code> option.  With the
 175      <code class="literal">hard</code> option, processes can <span class="quote">“<span class="quote">hang</span>”</span>
 176      indefinitely if there are network problems, so this configuration will
 177      require a careful monitoring setup.  The <code class="literal">soft</code> option
 178      will interrupt system calls in case of network problems, but
 179      <span class="productname">PostgreSQL</span> will not repeat system calls
 180      interrupted in this way, so any such interruption will result in an I/O
 181      error being reported.
 182     </p><p>
 183      It is not necessary to use the <code class="literal">sync</code> mount option.  The
 184      behavior of the <code class="literal">async</code> option is sufficient, since
 185      <span class="productname">PostgreSQL</span> issues <code class="literal">fsync</code>
 186      calls at appropriate times to flush the write caches.  (This is analogous
 187      to how it works on a local file system.)  However, it is strongly
 188      recommended to use the <code class="literal">sync</code> export option on the NFS
 189      <span class="emphasis"><em>server</em></span> on systems where it exists (mainly Linux).
 190      Otherwise, an <code class="literal">fsync</code> or equivalent on the NFS client is
 191      not actually guaranteed to reach permanent storage on the server, which
 192      could cause corruption similar to running with the parameter <a class="xref" href="runtime-config-wal.html#GUC-FSYNC">fsync</a> off.  The defaults of these mount and export
 193      options differ between vendors and versions, so it is recommended to
 194      check and perhaps specify them explicitly in any case to avoid any
 195      ambiguity.
 196     </p><p>
 197      In some cases, an external storage product can be accessed either via NFS
 198      or a lower-level protocol such as iSCSI.  In the latter case, the storage
 199      appears as a block device and any available file system can be created on
 200      it.  That approach might relieve the DBA from having to deal with some of
 201      the idiosyncrasies of NFS, but of course the complexity of managing
 202      remote storage then happens at other levels.
 203     </p></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="postgres-user.html" title="18.1. The PostgreSQL User Account">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html" title="Chapter 18. Server Setup and Operation">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-start.html" title="18.3. Starting the Database Server">Next</a></td></tr><tr><td width="40%" align="left" valign="top">18.1. The <span class="productname">PostgreSQL</span> User Account </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 18.3. Starting the Database Server</td></tr></table></div></body></html>