begriffs open source - ai-pg/blob - full-docs/txt/sql-explain.txt

   1
   2 EXPLAIN
   3
   4    EXPLAIN — show the execution plan of a statement
   5
   6 Synopsis
   7
   8 EXPLAIN [ ( option [, ...] ) ] statement
   9
  10 where option can be one of:
  11
  12     ANALYZE [ boolean ]
  13     VERBOSE [ boolean ]
  14     COSTS [ boolean ]
  15     SETTINGS [ boolean ]
  16     GENERIC_PLAN [ boolean ]
  17     BUFFERS [ boolean ]
  18     SERIALIZE [ { NONE | TEXT | BINARY } ]
  19     WAL [ boolean ]
  20     TIMING [ boolean ]
  21     SUMMARY [ boolean ]
  22     MEMORY [ boolean ]
  23     FORMAT { TEXT | XML | JSON | YAML }
  24
  25 Description
  26
  27    This command displays the execution plan that the PostgreSQL planner
  28    generates for the supplied statement. The execution plan shows how the
  29    table(s) referenced by the statement will be scanned — by plain
  30    sequential scan, index scan, etc. — and if multiple tables are
  31    referenced, what join algorithms will be used to bring together the
  32    required rows from each input table.
  33
  34    The most critical part of the display is the estimated statement
  35    execution cost, which is the planner's guess at how long it will take
  36    to run the statement (measured in cost units that are arbitrary, but
  37    conventionally mean disk page fetches). Actually two numbers are shown:
  38    the start-up cost before the first row can be returned, and the total
  39    cost to return all the rows. For most queries the total cost is what
  40    matters, but in contexts such as a subquery in EXISTS, the planner will
  41    choose the smallest start-up cost instead of the smallest total cost
  42    (since the executor will stop after getting one row, anyway). Also, if
  43    you limit the number of rows to return with a LIMIT clause, the planner
  44    makes an appropriate interpolation between the endpoint costs to
  45    estimate which plan is really the cheapest.
  46
  47    The ANALYZE option causes the statement to be actually executed, not
  48    only planned. Then actual run time statistics are added to the display,
  49    including the total elapsed time expended within each plan node (in
  50    milliseconds) and the total number of rows it actually returned. This
  51    is useful for seeing whether the planner's estimates are close to
  52    reality.
  53
  54 Important
  55
  56    Keep in mind that the statement is actually executed when the ANALYZE
  57    option is used. Although EXPLAIN will discard any output that a SELECT
  58    would return, other side effects of the statement will happen as usual.
  59    If you wish to use EXPLAIN ANALYZE on an INSERT, UPDATE, DELETE, MERGE,
  60    CREATE TABLE AS, or EXECUTE statement without letting the command
  61    affect your data, use this approach:
  62 BEGIN;
  63 EXPLAIN ANALYZE ...;
  64 ROLLBACK;
  65
  66 Parameters
  67
  68    ANALYZE
  69           Carry out the command and show actual run times and other
  70           statistics. This parameter defaults to FALSE.
  71
  72    VERBOSE
  73           Display additional information regarding the plan. Specifically,
  74           include the output column list for each node in the plan tree,
  75           schema-qualify table and function names, always label variables
  76           in expressions with their range table alias, and always print
  77           the name of each trigger for which statistics are displayed. The
  78           query identifier will also be displayed if one has been
  79           computed, see compute_query_id for more details. This parameter
  80           defaults to FALSE.
  81
  82    COSTS
  83           Include information on the estimated startup and total cost of
  84           each plan node, as well as the estimated number of rows and the
  85           estimated width of each row. This parameter defaults to TRUE.
  86
  87    SETTINGS
  88           Include information on configuration parameters. Specifically,
  89           include options affecting query planning with value different
  90           from the built-in default value. This parameter defaults to
  91           FALSE.
  92
  93    GENERIC_PLAN
  94           Allow the statement to contain parameter placeholders like $1,
  95           and generate a generic plan that does not depend on the values
  96           of those parameters. See PREPARE for details about generic plans
  97           and the types of statement that support parameters. This
  98           parameter cannot be used together with ANALYZE. It defaults to
  99           FALSE.
 100
 101    BUFFERS
 102           Include information on buffer usage. Specifically, include the
 103           number of shared blocks hit, read, dirtied, and written, the
 104           number of local blocks hit, read, dirtied, and written, the
 105           number of temp blocks read and written, and the time spent
 106           reading and writing data file blocks, local blocks and temporary
 107           file blocks (in milliseconds) if track_io_timing is enabled. A
 108           hit means that a read was avoided because the block was found
 109           already in cache when needed. Shared blocks contain data from
 110           regular tables and indexes; local blocks contain data from
 111           temporary tables and indexes; while temporary blocks contain
 112           short-term working data used in sorts, hashes, Materialize plan
 113           nodes, and similar cases. The number of blocks dirtied indicates
 114           the number of previously unmodified blocks that were changed by
 115           this query; while the number of blocks written indicates the
 116           number of previously-dirtied blocks evicted from cache by this
 117           backend during query processing. The number of blocks shown for
 118           an upper-level node includes those used by all its child nodes.
 119           In text format, only non-zero values are printed. Buffers
 120           information is automatically included when ANALYZE is used.
 121
 122    SERIALIZE
 123           Include information on the cost of serializing the query's
 124           output data, that is converting it to text or binary format to
 125           send to the client. This can be a significant part of the time
 126           required for regular execution of the query, if the datatype
 127           output functions are expensive or if TOASTed values must be
 128           fetched from out-of-line storage. EXPLAIN's default behavior,
 129           SERIALIZE NONE, does not perform these conversions. If SERIALIZE
 130           TEXT or SERIALIZE BINARY is specified, the appropriate
 131           conversions are performed, and the time spent doing so is
 132           measured (unless TIMING OFF is specified). If the BUFFERS option
 133           is also specified, then any buffer accesses involved in the
 134           conversions are counted too. In no case, however, will EXPLAIN
 135           actually send the resulting data to the client; hence network
 136           transmission costs cannot be investigated this way.
 137           Serialization may only be enabled when ANALYZE is also enabled.
 138           If SERIALIZE is written without an argument, TEXT is assumed.
 139
 140    WAL
 141           Include information on WAL record generation. Specifically,
 142           include the number of records, number of full page images (fpi),
 143           the amount of WAL generated in bytes and the number of times the
 144           WAL buffers became full. In text format, only non-zero values
 145           are printed. This parameter may only be used when ANALYZE is
 146           also enabled. It defaults to FALSE.
 147
 148    TIMING
 149           Include actual startup time and time spent in each node in the
 150           output. The overhead of repeatedly reading the system clock can
 151           slow down the query significantly on some systems, so it may be
 152           useful to set this parameter to FALSE when only actual row
 153           counts, and not exact times, are needed. Run time of the entire
 154           statement is always measured, even when node-level timing is
 155           turned off with this option. This parameter may only be used
 156           when ANALYZE is also enabled. It defaults to TRUE.
 157
 158    SUMMARY
 159           Include summary information (e.g., totaled timing information)
 160           after the query plan. Summary information is included by default
 161           when ANALYZE is used but otherwise is not included by default,
 162           but can be enabled using this option. Planning time in EXPLAIN
 163           EXECUTE includes the time required to fetch the plan from the
 164           cache and the time required for re-planning, if necessary.
 165
 166    MEMORY
 167           Include information on memory consumption by the query planning
 168           phase. Specifically, include the precise amount of storage used
 169           by planner in-memory structures, as well as total memory
 170           considering allocation overhead. This parameter defaults to
 171           FALSE.
 172
 173    FORMAT
 174           Specify the output format, which can be TEXT, XML, JSON, or
 175           YAML. Non-text output contains the same information as the text
 176           output format, but is easier for programs to parse. This
 177           parameter defaults to TEXT.
 178
 179    boolean
 180           Specifies whether the selected option should be turned on or
 181           off. You can write TRUE, ON, or 1 to enable the option, and
 182           FALSE, OFF, or 0 to disable it. The boolean value can also be
 183           omitted, in which case TRUE is assumed.
 184
 185    statement
 186           Any SELECT, INSERT, UPDATE, DELETE, MERGE, VALUES, EXECUTE,
 187           DECLARE, CREATE TABLE AS, or CREATE MATERIALIZED VIEW AS
 188           statement, whose execution plan you wish to see.
 189
 190 Outputs
 191
 192    The command's result is a textual description of the plan selected for
 193    the statement, optionally annotated with execution statistics.
 194    Section 14.1 describes the information provided.
 195
 196 Notes
 197
 198    In order to allow the PostgreSQL query planner to make reasonably
 199    informed decisions when optimizing queries, the pg_statistic data
 200    should be up-to-date for all tables used in the query. Normally the
 201    autovacuum daemon will take care of that automatically. But if a table
 202    has recently had substantial changes in its contents, you might need to
 203    do a manual ANALYZE rather than wait for autovacuum to catch up with
 204    the changes.
 205
 206    In order to measure the run-time cost of each node in the execution
 207    plan, the current implementation of EXPLAIN ANALYZE adds profiling
 208    overhead to query execution. As a result, running EXPLAIN ANALYZE on a
 209    query can sometimes take significantly longer than executing the query
 210    normally. The amount of overhead depends on the nature of the query, as
 211    well as the platform being used. The worst case occurs for plan nodes
 212    that in themselves require very little time per execution, and on
 213    machines that have relatively slow operating system calls for obtaining
 214    the time of day.
 215
 216 Examples
 217
 218    To show the plan for a simple query on a table with a single integer
 219    column and 10000 rows:
 220 EXPLAIN SELECT * FROM foo;
 221
 222                        QUERY PLAN
 223 ---------------------------------------------------------
 224  Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4)
 225 (1 row)
 226
 227    Here is the same query, with JSON output formatting:
 228 EXPLAIN (FORMAT JSON) SELECT * FROM foo;
 229            QUERY PLAN
 230 --------------------------------
 231  [                             +
 232    {                           +
 233      "Plan": {                 +
 234        "Node Type": "Seq Scan",+
 235        "Relation Name": "foo", +
 236        "Alias": "foo",         +
 237        "Startup Cost": 0.00,   +
 238        "Total Cost": 155.00,   +
 239        "Plan Rows": 10000,     +
 240        "Plan Width": 4         +
 241      }                         +
 242    }                           +
 243  ]
 244 (1 row)
 245
 246    If there is an index and we use a query with an indexable WHERE
 247    condition, EXPLAIN might show a different plan:
 248 EXPLAIN SELECT * FROM foo WHERE i = 4;
 249
 250                          QUERY PLAN
 251 --------------------------------------------------------------
 252  Index Scan using fi on foo  (cost=0.00..5.98 rows=1 width=4)
 253    Index Cond: (i = 4)
 254 (2 rows)
 255
 256    Here is the same query, but in YAML format:
 257 EXPLAIN (FORMAT YAML) SELECT * FROM foo WHERE i='4';
 258           QUERY PLAN
 259 -------------------------------
 260  - Plan:                      +
 261      Node Type: "Index Scan"  +
 262      Scan Direction: "Forward"+
 263      Index Name: "fi"         +
 264      Relation Name: "foo"     +
 265      Alias: "foo"             +
 266      Startup Cost: 0.00       +
 267      Total Cost: 5.98         +
 268      Plan Rows: 1             +
 269      Plan Width: 4            +
 270      Index Cond: "(i = 4)"
 271 (1 row)
 272
 273    XML format is left as an exercise for the reader.
 274
 275    Here is the same plan with cost estimates suppressed:
 276 EXPLAIN (COSTS FALSE) SELECT * FROM foo WHERE i = 4;
 277
 278         QUERY PLAN
 279 ----------------------------
 280  Index Scan using fi on foo
 281    Index Cond: (i = 4)
 282 (2 rows)
 283
 284    Here is an example of a query plan for a query using an aggregate
 285    function:
 286 EXPLAIN SELECT sum(i) FROM foo WHERE i < 10;
 287
 288                              QUERY PLAN
 289 ---------------------------------------------------------------------
 290  Aggregate  (cost=23.93..23.93 rows=1 width=4)
 291    ->  Index Scan using fi on foo  (cost=0.00..23.92 rows=6 width=4)
 292          Index Cond: (i < 10)
 293 (3 rows)
 294
 295    Here is an example of using EXPLAIN EXECUTE to display the execution
 296    plan for a prepared query:
 297 PREPARE query(int, int) AS SELECT sum(bar) FROM test
 298     WHERE id > $1 AND id < $2
 299     GROUP BY foo;
 300
 301 EXPLAIN ANALYZE EXECUTE query(100, 200);
 302
 303                                                        QUERY PLAN
 304 -------------------------------------------------------------------------------
 305 ------------------------------------------
 306  HashAggregate  (cost=10.77..10.87 rows=10 width=12) (actual time=0.043..0.044 r
 307 ows=10.00 loops=1)
 308    Group Key: foo
 309    Batches: 1  Memory Usage: 24kB
 310    Buffers: shared hit=4
 311    ->  Index Scan using test_pkey on test  (cost=0.29..10.27 rows=99 width=8) (a
 312 ctual time=0.009..0.025 rows=99.00 loops=1)
 313          Index Cond: ((id > 100) AND (id < 200))
 314          Index Searches: 1
 315          Buffers: shared hit=4
 316  Planning Time: 0.244 ms
 317  Execution Time: 0.073 ms
 318 (10 rows)
 319
 320    Of course, the specific numbers shown here depend on the actual
 321    contents of the tables involved. Also note that the numbers, and even
 322    the selected query strategy, might vary between PostgreSQL releases due
 323    to planner improvements. In addition, the ANALYZE command uses random
 324    sampling to estimate data statistics; therefore, it is possible for
 325    cost estimates to change after a fresh run of ANALYZE, even if the
 326    actual distribution of data in the table has not changed.
 327
 328    Notice that the previous example showed a “custom” plan for the
 329    specific parameter values given in EXECUTE. We might also wish to see
 330    the generic plan for a parameterized query, which can be done with
 331    GENERIC_PLAN:
 332 EXPLAIN (GENERIC_PLAN)
 333   SELECT sum(bar) FROM test
 334     WHERE id > $1 AND id < $2
 335     GROUP BY foo;
 336
 337                                   QUERY PLAN
 338 -------------------------------------------------------------------------------
 339  HashAggregate  (cost=26.79..26.89 rows=10 width=12)
 340    Group Key: foo
 341    ->  Index Scan using test_pkey on test  (cost=0.29..24.29 rows=500 width=8)
 342          Index Cond: ((id > $1) AND (id < $2))
 343 (4 rows)
 344
 345    In this case the parser correctly inferred that $1 and $2 should have
 346    the same data type as id, so the lack of parameter type information
 347    from PREPARE was not a problem. In other cases it might be necessary to
 348    explicitly specify types for the parameter symbols, which can be done
 349    by casting them, for example:
 350 EXPLAIN (GENERIC_PLAN)
 351   SELECT sum(bar) FROM test
 352     WHERE id > $1::integer AND id < $2::integer
 353     GROUP BY foo;
 354
 355 Compatibility
 356
 357    There is no EXPLAIN statement defined in the SQL standard.
 358
 359    The following syntax was used before PostgreSQL version 9.0 and is
 360    still supported:
 361 EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 362
 363    Note that in this syntax, the options must be specified in exactly the
 364    order shown.
 365
 366 See Also
 367
 368    ANALYZE