CLUSTER

   CLUSTER - cluster a table according to an index

Synopsis

CLUSTER [ ( option [, ...] ) ] [ table_name [ USING index_name ] ]

where option can be one of:

    VERBOSE [ boolean ]

Description

   CLUSTER instructs PostgreSQL to cluster the table specified by
   table_name based on the index specified by index_name. The index must
   already have been defined on table_name.

   When a table is clustered, it is physically reordered based on the
   index information. Clustering is a one-time operation: when the table
   is subsequently updated, the changes are not clustered. That is, no
   attempt is made to store new or updated rows according to their index
   order. (If one wishes, one can periodically recluster by issuing the
   command again. Also, setting the table's fillfactor storage parameter
   to less than 100% can aid in preserving cluster ordering during
   updates, since updated rows are kept on the same page if enough space
   is available there.)
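
   As a rough illustration of the fillfactor suggestion, the following
   sketch lowers the fill factor on the employees table used in the
   Examples section. The value 90 is an arbitrary assumption, not a
   recommendation. Changing the parameter does not modify existing pages
   by itself, so the sketch follows it with a CLUSTER, which rewrites the
   table.

```sql
-- Assumed value: leave roughly 10% of each page free for updated rows.
ALTER TABLE employees SET (fillfactor = 90);

-- Rewrite the table so that its pages are packed using the new setting.
CLUSTER employees USING employees_ind;
```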

   When a table is clustered, PostgreSQL remembers which index it was
   clustered by. The form CLUSTER table_name reclusters the table using
   the same index as before. You can also use the CLUSTER or SET WITHOUT
   CLUSTER forms of ALTER TABLE to set the index to be used for future
   cluster operations, or to clear any previous setting.
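
   The ALTER TABLE forms mentioned above might be used as in the
   following sketch, which reuses the employees table and employees_ind
   index from the Examples section.

```sql
-- Mark employees_ind as the index for future CLUSTER operations.
-- This records the choice only; it does not recluster the table.
ALTER TABLE employees CLUSTER ON employees_ind;

-- Later, recluster using the remembered index.
CLUSTER employees;

-- Clear the setting, so that CLUSTER without a table name skips this table.
ALTER TABLE employees SET WITHOUT CLUSTER;
```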

   CLUSTER without a table_name reclusters all the previously clustered
   tables in the current database that the calling user has privileges
   for. This form of CLUSTER cannot be executed inside a transaction
   block.

   When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired
   on it. This prevents any other database operations (both reads and
   writes) from operating on the table until the CLUSTER is finished.
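
   Because the lock request can itself queue behind other sessions, one
   possible precaution is to bound the wait with the lock_timeout
   setting. This is only a sketch: the 5 second limit is an assumed
   value, and the lock, once granted, is still held until the
   transaction ends.

```sql
BEGIN;
-- Assumed limit: give up if the lock is not granted within 5 seconds.
SET LOCAL lock_timeout = '5s';
CLUSTER employees USING employees_ind;
COMMIT;
```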

Parameters

   table_name
          The name (possibly schema-qualified) of a table.

   index_name
          The name of an index.

   VERBOSE
          Prints a progress report at INFO level as each table is
          clustered.

   boolean
          Specifies whether the selected option should be turned on or
          off. You can write TRUE, ON, or 1 to enable the option, and
          FALSE, OFF, or 0 to disable it. The boolean value can also be
          omitted, in which case TRUE is assumed.
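
   The spellings of the VERBOSE option can be sketched as follows. The
   statements assume that employees has already been clustered once, so
   that the index can be omitted.

```sql
-- Equivalent ways of enabling VERBOSE.
CLUSTER (VERBOSE) employees;
CLUSTER (VERBOSE TRUE) employees;
CLUSTER (VERBOSE ON) employees;

-- Explicitly disabled; no progress report is printed.
CLUSTER (VERBOSE FALSE) employees;
```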

Notes

   To cluster a table, one must have the MAINTAIN privilege on the table.
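
   Granting that privilege might look like the following sketch, where
   maintenance_role is a hypothetical role name chosen for illustration.

```sql
-- maintenance_role is an assumed role name, not part of the original text.
GRANT MAINTAIN ON TABLE employees TO maintenance_role;
```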

   In cases where you are accessing single rows randomly within a table,
   the actual order of the data in the table is unimportant. However, if
   you tend to access some data more than others, and there is an index
   that groups them together, you will benefit from using CLUSTER. If you
   are requesting a range of indexed values from a table, or a single
   indexed value that has multiple rows that match, CLUSTER will help
   because once the index identifies the table page for the first row that
   matches, all other rows that match are probably already on the same
   table page, and so you save disk accesses and speed up the query.

   CLUSTER can re-sort the table using either an index scan on the
   specified index, or (if the index is a b-tree) a sequential scan
   followed by sorting. It will attempt to choose the method that will be
   faster, based on planner cost parameters and available statistical
   information.

   While CLUSTER is running, the search_path is temporarily changed to
   pg_catalog, pg_temp.

   When an index scan is used, a temporary copy of the table is created
   that contains the table data in the index order. Temporary copies of
   each index on the table are created as well. Therefore, you need free
   space on disk at least equal to the sum of the table size and the index
   sizes.
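
   To get a rough idea of that space requirement ahead of time, a query
   along these lines could be used. It is a sketch that relies on the
   standard size functions and reports current on-disk sizes, which may
   differ somewhat from the sizes of the rewritten copies.

```sql
-- Table size (including TOAST), size of all indexes, and their sum.
SELECT pg_size_pretty(pg_table_size('employees'))          AS table_size,
       pg_size_pretty(pg_indexes_size('employees'))        AS index_size,
       pg_size_pretty(pg_total_relation_size('employees')) AS table_plus_indexes;
```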

   When a sequential scan and sort is used, a temporary sort file is also
   created, so that the peak temporary space requirement is as much as
   double the table size, plus the index sizes. This method is often
   faster than the index scan method, but if the disk space requirement is
   intolerable, you can disable this choice by temporarily setting
   enable_sort to off.
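
   One way to make that setting temporary is to scope it to a single
   transaction, as in this sketch. SET LOCAL reverts automatically at
   commit or rollback.

```sql
BEGIN;
-- Discourage the sequential-scan-and-sort method for this transaction only.
SET LOCAL enable_sort = off;
CLUSTER employees USING employees_ind;
COMMIT;
```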

   It is advisable to set maintenance_work_mem to a reasonably large value
   (but not more than the amount of RAM you can dedicate to the CLUSTER
   operation) before clustering.
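
   For example, the setting could be raised for the current session
   only. The 1GB figure below is an assumption for illustration; choose
   a value that fits the memory actually available.

```sql
-- Assumed value; size it to the RAM you can dedicate to this operation.
SET maintenance_work_mem = '1GB';
CLUSTER employees USING employees_ind;
-- Return to the configured default for the rest of the session.
RESET maintenance_work_mem;
```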

   Because the planner records statistics about the ordering of tables, it
   is advisable to run ANALYZE on the newly clustered table. Otherwise,
   the planner might make poor choices of query plans.
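
   A minimal sketch of that sequence follows. The optional query at the
   end reads the ordering statistic from pg_stats; the schema name
   public is an assumption.

```sql
CLUSTER employees USING employees_ind;
ANALYZE employees;

-- Optional check: a correlation near 1 or -1 means the physical row order
-- closely follows that column's values.
SELECT attname, correlation
FROM pg_stats
WHERE schemaname = 'public'   -- assumed schema
  AND tablename = 'employees';
```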

   Because CLUSTER remembers which indexes are clustered, you can cluster
   each desired table manually the first time, then set up a periodic
   maintenance script that executes CLUSTER without any parameters, so
   that those tables are periodically reclustered.
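
   Before relying on such a script, it can help to see which tables it
   would touch. The following sketch lists the indexes currently marked
   as clustered in the system catalog pg_index.

```sql
-- Tables and indexes that CLUSTER without parameters would consider.
SELECT i.indrelid::regclass   AS table_name,
       i.indexrelid::regclass AS clustered_index
FROM pg_index AS i
WHERE i.indisclustered
ORDER BY i.indrelid::regclass::text;
```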

   Each backend running CLUSTER will report its progress in the
   pg_stat_progress_cluster view. See Section 27.4.2 for details.
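
   From another session, progress could be polled with a query such as
   the following sketch, which selects a subset of the view's columns.

```sql
-- One row per backend currently running CLUSTER (or VACUUM FULL).
SELECT pid,
       relid::regclass AS table_name,
       command,
       phase,
       heap_tuples_scanned,
       heap_tuples_written
FROM pg_stat_progress_cluster;
```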

   Clustering a partitioned table clusters each of its partitions using
   the corresponding partition of the specified partitioned index. When
   clustering a partitioned table, the index may not be omitted. CLUSTER
   on a partitioned table cannot be executed inside a transaction block.
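
   The partitioned case might look like the sketch below. The table
   measurements, its columns, and the index name are hypothetical and
   exist only for illustration.

```sql
-- Hypothetical partitioned table with a single partition.
CREATE TABLE measurements (
    logdate date NOT NULL,
    reading integer
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2024 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- Creates a partitioned index with a matching index on each partition.
CREATE INDEX measurements_logdate_idx ON measurements (logdate);

-- The index must be named; run this outside a transaction block.
CLUSTER measurements USING measurements_logdate_idx;
```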

Examples

   Cluster the table employees on the basis of its index employees_ind:
CLUSTER employees USING employees_ind;

   Cluster the employees table using the same index that was used before:
CLUSTER employees;

   Cluster all tables in the database that have previously been clustered:
CLUSTER;

Compatibility

   There is no CLUSTER statement in the SQL standard.

   The following syntax was used before PostgreSQL 17 and is still
   supported:
CLUSTER [ VERBOSE ] [ table_name [ USING index_name ] ]

   The following syntax was used before PostgreSQL 8.3 and is still
   supported:
CLUSTER index_name ON table_name

See Also

   clusterdb, Section 27.4.2