4 ANALYZE — collect statistics about a database
8 ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
10 where option can be one of:
13 SKIP_LOCKED [ boolean ]
14 BUFFER_USAGE_LIMIT size
16 and table_and_columns is:
18 [ ONLY ] table_name [ * ] [ ( column_name [, ...] ) ]
22 ANALYZE collects statistics about the contents of tables in the
23 database, and stores the results in the pg_statistic system catalog.
24 Subsequently, the query planner uses these statistics to help determine
25 the most efficient execution plans for queries.
27 Without a table_and_columns list, ANALYZE processes every table and
28 materialized view in the current database that the current user has
29 permission to analyze. With a list, ANALYZE processes only those
30 table(s). It is further possible to give a list of column names for a
31 table, in which case only the statistics for those columns are
37 Enables display of progress messages at INFO level.
40 Specifies that ANALYZE should not wait for any conflicting locks
41 to be released when beginning work on a relation: if a relation
42 cannot be locked immediately without waiting, the relation is
43 skipped. Note that even with this option, ANALYZE may still
44 block when opening the relation's indexes or when acquiring
45 sample rows from partitions, table inheritance children, and
46 some types of foreign tables. Also, while ANALYZE ordinarily
47 processes all partitions of specified partitioned tables, this
48 option will cause ANALYZE to skip all partitions if there is a
49 conflicting lock on the partitioned table.
52 Specifies the Buffer Access Strategy ring buffer size for
53 ANALYZE. This size is used to calculate the number of shared
54 buffers which will be reused as part of this strategy. 0
55 disables use of a Buffer Access Strategy. When this option is
56 not specified, ANALYZE uses the value from
57 vacuum_buffer_usage_limit. Higher settings can allow ANALYZE to
58 run more quickly, but having too large a setting may cause too
59 many other useful pages to be evicted from shared buffers. The
60 minimum value is 128 kB and the maximum value is 16 GB.
63 Specifies whether the selected option should be turned on or
64 off. You can write TRUE, ON, or 1 to enable the option, and
65 FALSE, OFF, or 0 to disable it. The boolean value can also be
66 omitted, in which case TRUE is assumed.
69 Specifies an amount of memory in kilobytes. Sizes may also be
70 specified as a string containing the numerical size followed by
71 any one of the following memory units: B (bytes), kB
72 (kilobytes), MB (megabytes), GB (gigabytes), or TB (terabytes).
75 The name (possibly schema-qualified) of a specific table to
76 analyze. If omitted, all regular tables, partitioned tables, and
77 materialized views in the current database are analyzed (but not
78 foreign tables). If ONLY is specified before the table name,
79 only that table is analyzed. If ONLY is not specified, the table
80 and all its inheritance child tables or partitions (if any) are
81 analyzed. Optionally, * can be specified after the table name to
82 explicitly indicate that inheritance child tables (or
83 partitions) are to be analyzed.
86 The name of a specific column to analyze. Defaults to all
91 When VERBOSE is specified, ANALYZE emits progress messages to indicate
92 which table is currently being processed. Various statistics about the
93 tables are printed as well.
97 To analyze a table, one must ordinarily have the MAINTAIN privilege on
98 the table. However, database owners are allowed to analyze all tables
99 in their databases, except shared catalogs. ANALYZE will skip over any
100 tables that the calling user does not have permission to analyze.
102 Foreign tables are analyzed only when explicitly selected. Not all
103 foreign data wrappers support ANALYZE. If the table's wrapper does not
104 support ANALYZE, the command prints a warning and does nothing.
106 In the default PostgreSQL configuration, the autovacuum daemon (see
107 Section 24.1.6) takes care of automatic analyzing of tables when they
108 are first loaded with data, and as they change throughout regular
109 operation. When autovacuum is disabled, it is a good idea to run
110 ANALYZE periodically, or just after making major changes in the
111 contents of a table. Accurate statistics will help the planner to
112 choose the most appropriate query plan, and thereby improve the speed
113 of query processing. A common strategy for read-mostly databases is to
114 run VACUUM and ANALYZE once a day during a low-usage time of day. (This
115 will not be sufficient if there is heavy update activity.)
117 While ANALYZE is running, the search_path is temporarily changed to
120 ANALYZE requires only a read lock on the target table, so it can run in
121 parallel with other non-DDL activity on the table.
123 The statistics collected by ANALYZE usually include a list of some of
124 the most common values in each column and a histogram showing the
125 approximate data distribution in each column. One or both of these can
126 be omitted if ANALYZE deems them uninteresting (for example, in a
127 unique-key column, there are no common values) or if the column data
128 type does not support the appropriate operators. There is more
129 information about the statistics in Chapter 24.
131 For large tables, ANALYZE takes a random sample of the table contents,
132 rather than examining every row. This allows even very large tables to
133 be analyzed in a small amount of time. Note, however, that the
134 statistics are only approximate, and will change slightly each time
135 ANALYZE is run, even if the actual table contents did not change. This
136 might result in small changes in the planner's estimated costs shown by
137 EXPLAIN. In rare situations, this non-determinism will cause the
138 planner's choices of query plans to change after ANALYZE is run. To
139 avoid this, raise the amount of statistics collected by ANALYZE, as
142 The extent of analysis can be controlled by adjusting the
143 default_statistics_target configuration variable, or on a
144 column-by-column basis by setting the per-column statistics target with
145 ALTER TABLE ... ALTER COLUMN ... SET STATISTICS. The target value sets
146 the maximum number of entries in the most-common-value list and the
147 maximum number of bins in the histogram. The default target value is
148 100, but this can be adjusted up or down to trade off accuracy of
149 planner estimates against the time taken for ANALYZE and the amount of
150 space occupied in pg_statistic. In particular, setting the statistics
151 target to zero disables collection of statistics for that column. It
152 might be useful to do that for columns that are never used as part of
153 the WHERE, GROUP BY, or ORDER BY clauses of queries, since the planner
154 will have no use for statistics on such columns.
156 The largest statistics target among the columns being analyzed
157 determines the number of table rows sampled to prepare the statistics.
158 Increasing the target causes a proportional increase in the time and
159 space needed to do ANALYZE.
161 One of the values estimated by ANALYZE is the number of distinct values
162 that appear in each column. Because only a subset of the rows are
163 examined, this estimate can sometimes be quite inaccurate, even with
164 the largest possible statistics target. If this inaccuracy leads to bad
165 query plans, a more accurate value can be determined manually and then
166 installed with ALTER TABLE ... ALTER COLUMN ... SET (n_distinct = ...).
168 If the table being analyzed has inheritance children, ANALYZE gathers
169 two sets of statistics: one on the rows of the parent table only, and a
170 second including rows of both the parent table and all of its children.
171 This second set of statistics is needed when planning queries that
172 process the inheritance tree as a whole. The autovacuum daemon,
173 however, will only consider inserts or updates on the parent table
174 itself when deciding whether to trigger an automatic analyze for that
175 table. If that table is rarely inserted into or updated, the
176 inheritance statistics will not be up to date unless you run ANALYZE
177 manually. By default, ANALYZE will also recursively collect and update
178 the statistics for each inheritance child table. The ONLY keyword may
179 be used to disable this.
181 For partitioned tables, ANALYZE gathers statistics by sampling rows
182 from all partitions. By default, ANALYZE will also recursively collect
183 and update the statistics for each partition. The ONLY keyword may be
184 used to disable this.
186 The autovacuum daemon does not process partitioned tables, nor does it
187 process inheritance parents if only the children are ever modified. It
188 is usually necessary to periodically run a manual ANALYZE to keep the
189 statistics of the table hierarchy up to date.
191 If any child tables or partitions are foreign tables whose foreign data
192 wrappers do not support ANALYZE, those tables are ignored while
193 gathering inheritance statistics.
195 If the table being analyzed is completely empty, ANALYZE will not
196 record new statistics for that table. Any existing statistics will be
199 Each backend running ANALYZE will report its progress in the
200 pg_stat_progress_analyze view. See Section 27.4.1 for details.
204 There is no ANALYZE statement in the SQL standard.
206 The following syntax was used before PostgreSQL version 11 and is still
208 ANALYZE [ VERBOSE ] [ table_and_columns [, ...] ]
212 VACUUM, vacuumdb, Section 19.10.2, Section 24.1.6, Section 27.4.1