begriffs open source - ai-pg/blob - full-docs/txt/runtime-config-query.txt

   1
   2 19.7. Query Planning #
   3
   4    19.7.1. Planner Method Configuration
   5    19.7.2. Planner Cost Constants
   6    19.7.3. Genetic Query Optimizer
   7    19.7.4. Other Planner Options
   8
   9 19.7.1. Planner Method Configuration #
  10
  11    These configuration parameters provide a crude method of influencing
  12    the query plans chosen by the query optimizer. If the default plan
  13    chosen by the optimizer for a particular query is not optimal, a
  14    temporary solution is to use one of these configuration parameters to
  15    force the optimizer to choose a different plan. Better ways to improve
  16    the quality of the plans chosen by the optimizer include adjusting the
  17    planner cost constants (see Section 19.7.2), running ANALYZE manually,
  18    increasing the value of the default_statistics_target configuration
  19    parameter, and increasing the amount of statistics collected for
  20    specific columns using ALTER TABLE SET STATISTICS.
  21
  22    enable_async_append (boolean) #
  23           Enables or disables the query planner's use of async-aware
  24           append plan types. The default is on.
  25
  26    enable_bitmapscan (boolean) #
  27           Enables or disables the query planner's use of bitmap-scan plan
  28           types. The default is on.
  29
  30    enable_distinct_reordering (boolean) #
  31           Enables or disables the query planner's ability to reorder
  32           DISTINCT keys to match the input path's pathkeys. The default is
  33           on.
  34
  35    enable_gathermerge (boolean) #
  36           Enables or disables the query planner's use of gather merge plan
  37           types. The default is on.
  38
  39    enable_group_by_reordering (boolean) #
  40           Controls if the query planner will produce a plan which will
  41           provide GROUP BY keys sorted in the order of keys of a child
  42           node of the plan, such as an index scan. When disabled, the
  43           query planner will produce a plan with GROUP BY keys only sorted
  44           to match the ORDER BY clause, if any. When enabled, the planner
  45           will try to produce a more efficient plan. The default value is
  46           on.
  47
  48    enable_hashagg (boolean) #
  49           Enables or disables the query planner's use of hashed
  50           aggregation plan types. The default is on.
  51
  52    enable_hashjoin (boolean) #
  53           Enables or disables the query planner's use of hash-join plan
  54           types. The default is on.
  55
  56    enable_incremental_sort (boolean) #
  57           Enables or disables the query planner's use of incremental sort
  58           steps. The default is on.
  59
  60    enable_indexscan (boolean) #
  61           Enables or disables the query planner's use of index-scan and
  62           index-only-scan plan types. The default is on. Also see
  63           enable_indexonlyscan.
  64
  65    enable_indexonlyscan (boolean) #
  66           Enables or disables the query planner's use of index-only-scan
  67           plan types (see Section 11.9). The default is on. The
  68           enable_indexscan setting must also be enabled to have the query
  69           planner consider index-only-scans.
  70
  71    enable_material (boolean) #
  72           Enables or disables the query planner's use of materialization.
  73           It is impossible to suppress materialization entirely, but
  74           turning this variable off prevents the planner from inserting
  75           materialize nodes except in cases where it is required for
  76           correctness. The default is on.
  77
  78    enable_memoize (boolean) #
  79           Enables or disables the query planner's use of memoize plans for
  80           caching results from parameterized scans inside nested-loop
  81           joins. This plan type allows scans to the underlying plans to be
  82           skipped when the results for the current parameters are already
  83           in the cache. Less commonly looked up results may be evicted
  84           from the cache when more space is required for new entries. The
  85           default is on.
  86
  87    enable_mergejoin (boolean) #
  88           Enables or disables the query planner's use of merge-join plan
  89           types. The default is on.
  90
  91    enable_nestloop (boolean) #
  92           Enables or disables the query planner's use of nested-loop join
  93           plans. It is impossible to suppress nested-loop joins entirely,
  94           but turning this variable off discourages the planner from using
  95           one if there are other methods available. The default is on.
  96
  97    enable_parallel_append (boolean) #
  98           Enables or disables the query planner's use of parallel-aware
  99           append plan types. The default is on.
 100
 101    enable_parallel_hash (boolean) #
 102           Enables or disables the query planner's use of hash-join plan
 103           types with parallel hash. Has no effect if hash-join plans are
 104           not also enabled. The default is on.
 105
 106    enable_partition_pruning (boolean) #
 107           Enables or disables the query planner's ability to eliminate a
 108           partitioned table's partitions from query plans. This also
 109           controls the planner's ability to generate query plans which
 110           allow the query executor to remove (ignore) partitions during
 111           query execution. The default is on. See Section 5.12.4 for
 112           details.
 113
 114    enable_partitionwise_join (boolean) #
 115           Enables or disables the query planner's use of partitionwise
 116           join, which allows a join between partitioned tables to be
 117           performed by joining the matching partitions. Partitionwise join
 118           currently applies only when the join conditions include all the
 119           partition keys, which must be of the same data type and have
 120           one-to-one matching sets of child partitions. With this setting
 121           enabled, the number of nodes whose memory usage is restricted by
 122           work_mem appearing in the final plan can increase linearly
 123           according to the number of partitions being scanned. This can
 124           result in a large increase in overall memory consumption during
 125           the execution of the query. Query planning also becomes
 126           significantly more expensive in terms of memory and CPU. The
 127           default value is off.
 128
 129    enable_partitionwise_aggregate (boolean) #
 130           Enables or disables the query planner's use of partitionwise
 131           grouping or aggregation, which allows grouping or aggregation on
 132           partitioned tables to be performed separately for each
 133           partition. If the GROUP BY clause does not include the partition
 134           keys, only partial aggregation can be performed on a
 135           per-partition basis, and finalization must be performed later.
 136           With this setting enabled, the number of nodes whose memory
 137           usage is restricted by work_mem appearing in the final plan can
 138           increase linearly according to the number of partitions being
 139           scanned. This can result in a large increase in overall memory
 140           consumption during the execution of the query. Query planning
 141           also becomes significantly more expensive in terms of memory and
 142           CPU. The default value is off.
 143
 144    enable_presorted_aggregate (boolean) #
 145           Controls if the query planner will produce a plan which will
 146           provide rows which are presorted in the order required for the
 147           query's ORDER BY / DISTINCT aggregate functions. When disabled,
 148           the query planner will produce a plan which will always require
 149           the executor to perform a sort before performing aggregation of
 150           each aggregate function containing an ORDER BY or DISTINCT
 151           clause. When enabled, the planner will try to produce a more
 152           efficient plan which provides input to the aggregate functions
 153           which is presorted in the order they require for aggregation.
 154           The default value is on.
 155
 156    enable_self_join_elimination (boolean) #
 157           Enables or disables the query planner's optimization which
 158           analyses the query tree and replaces self joins with
 159           semantically equivalent single scans. Takes into consideration
 160           only plain tables. The default is on.
 161
 162    enable_seqscan (boolean) #
 163           Enables or disables the query planner's use of sequential scan
 164           plan types. It is impossible to suppress sequential scans
 165           entirely, but turning this variable off discourages the planner
 166           from using one if there are other methods available. The default
 167           is on.
 168
 169    enable_sort (boolean) #
 170           Enables or disables the query planner's use of explicit sort
 171           steps. It is impossible to suppress explicit sorts entirely, but
 172           turning this variable off discourages the planner from using one
 173           if there are other methods available. The default is on.
 174
 175    enable_tidscan (boolean) #
 176           Enables or disables the query planner's use of TID scan plan
 177           types. The default is on.
 178
 179 19.7.2. Planner Cost Constants #
 180
 181    The cost variables described in this section are measured on an
 182    arbitrary scale. Only their relative values matter, hence scaling them
 183    all up or down by the same factor will result in no change in the
 184    planner's choices. By default, these cost variables are based on the
 185    cost of sequential page fetches; that is, seq_page_cost is
 186    conventionally set to 1.0 and the other cost variables are set with
 187    reference to that. But you can use a different scale if you prefer,
 188    such as actual execution times in milliseconds on a particular machine.
 189
 190 Note
 191
 192    Unfortunately, there is no well-defined method for determining ideal
 193    values for the cost variables. They are best treated as averages over
 194    the entire mix of queries that a particular installation will receive.
 195    This means that changing them on the basis of just a few experiments is
 196    very risky.
 197
 198    seq_page_cost (floating point) #
 199           Sets the planner's estimate of the cost of a disk page fetch
 200           that is part of a series of sequential fetches. The default is
 201           1.0. This value can be overridden for tables and indexes in a
 202           particular tablespace by setting the tablespace parameter of the
 203           same name (see ALTER TABLESPACE).
 204
 205    random_page_cost (floating point) #
 206           Sets the planner's estimate of the cost of a
 207           non-sequentially-fetched disk page. The default is 4.0. This
 208           value can be overridden for tables and indexes in a particular
 209           tablespace by setting the tablespace parameter of the same name
 210           (see ALTER TABLESPACE).
 211
 212           Reducing this value relative to seq_page_cost will cause the
 213           system to prefer index scans; raising it will make index scans
 214           look relatively more expensive. You can raise or lower both
 215           values together to change the importance of disk I/O costs
 216           relative to CPU costs, which are described by the following
 217           parameters.
 218
 219           Random access to mechanical disk storage is normally much more
 220           expensive than four times sequential access. However, a lower
 221           default is used (4.0) because the majority of random accesses to
 222           disk, such as indexed reads, are assumed to be in cache. The
 223           default value can be thought of as modeling random access as 40
 224           times slower than sequential, while expecting 90% of random
 225           reads to be cached.
 226
 227           If you believe a 90% cache rate is an incorrect assumption for
 228           your workload, you can increase random_page_cost to better
 229           reflect the true cost of random storage reads. Correspondingly,
 230           if your data is likely to be completely in cache, such as when
 231           the database is smaller than the total server memory, decreasing
 232           random_page_cost can be appropriate. Storage that has a low
 233           random read cost relative to sequential, e.g., solid-state
 234           drives, might also be better modeled with a lower value for
 235           random_page_cost, e.g., 1.1.
 236
 237 Tip
 238
 239           Although the system will let you set random_page_cost to less
 240           than seq_page_cost, it is not physically sensible to do so.
 241           However, setting them equal makes sense if the database is
 242           entirely cached in RAM, since in that case there is no penalty
 243           for touching pages out of sequence. Also, in a heavily-cached
 244           database you should lower both values relative to the CPU
 245           parameters, since the cost of fetching a page already in RAM is
 246           much smaller than it would normally be.
 247
 248    cpu_tuple_cost (floating point) #
 249           Sets the planner's estimate of the cost of processing each row
 250           during a query. The default is 0.01.
 251
 252    cpu_index_tuple_cost (floating point) #
 253           Sets the planner's estimate of the cost of processing each index
 254           entry during an index scan. The default is 0.005.
 255
 256    cpu_operator_cost (floating point) #
 257           Sets the planner's estimate of the cost of processing each
 258           operator or function executed during a query. The default is
 259           0.0025.
 260
 261    parallel_setup_cost (floating point) #
 262           Sets the planner's estimate of the cost of launching parallel
 263           worker processes. The default is 1000.
 264
 265    parallel_tuple_cost (floating point) #
 266           Sets the planner's estimate of the cost of transferring one
 267           tuple from a parallel worker process to another process. The
 268           default is 0.1.
 269
 270    min_parallel_table_scan_size (integer) #
 271           Sets the minimum amount of table data that must be scanned in
 272           order for a parallel scan to be considered. For a parallel
 273           sequential scan, the amount of table data scanned is always
 274           equal to the size of the table, but when indexes are used the
 275           amount of table data scanned will normally be less. If this
 276           value is specified without units, it is taken as blocks, that is
 277           BLCKSZ bytes, typically 8kB. The default is 8 megabytes (8MB).
 278
 279    min_parallel_index_scan_size (integer) #
 280           Sets the minimum amount of index data that must be scanned in
 281           order for a parallel scan to be considered. Note that a parallel
 282           index scan typically won't touch the entire index; it is the
 283           number of pages which the planner believes will actually be
 284           touched by the scan which is relevant. This parameter is also
 285           used to decide whether a particular index can participate in a
 286           parallel vacuum. See VACUUM. If this value is specified without
 287           units, it is taken as blocks, that is BLCKSZ bytes, typically
 288           8kB. The default is 512 kilobytes (512kB).
 289
 290    effective_cache_size (integer) #
 291           Sets the planner's assumption about the effective size of the
 292           disk cache that is available to a single query. This is factored
 293           into estimates of the cost of using an index; a higher value
 294           makes it more likely index scans will be used, a lower value
 295           makes it more likely sequential scans will be used. When setting
 296           this parameter you should consider both PostgreSQL's shared
 297           buffers and the portion of the kernel's disk cache that will be
 298           used for PostgreSQL data files, though some data might exist in
 299           both places. Also, take into account the expected number of
 300           concurrent queries on different tables, since they will have to
 301           share the available space. This parameter has no effect on the
 302           size of shared memory allocated by PostgreSQL, nor does it
 303           reserve kernel disk cache; it is used only for estimation
 304           purposes. The system also does not assume data remains in the
 305           disk cache between queries. If this value is specified without
 306           units, it is taken as blocks, that is BLCKSZ bytes, typically
 307           8kB. The default is 4 gigabytes (4GB). (If BLCKSZ is not 8kB,
 308           the default value scales proportionally to it.)
 309
 310    jit_above_cost (floating point) #
 311           Sets the query cost above which JIT compilation is activated, if
 312           enabled (see Chapter 30). Performing JIT costs planning time but
 313           can accelerate query execution. Setting this to -1 disables JIT
 314           compilation. The default is 100000.
 315
 316    jit_inline_above_cost (floating point) #
 317           Sets the query cost above which JIT compilation attempts to
 318           inline functions and operators. Inlining adds planning time, but
 319           can improve execution speed. It is not meaningful to set this to
 320           less than jit_above_cost. Setting this to -1 disables inlining.
 321           The default is 500000.
 322
 323    jit_optimize_above_cost (floating point) #
 324           Sets the query cost above which JIT compilation applies
 325           expensive optimizations. Such optimization adds planning time,
 326           but can improve execution speed. It is not meaningful to set
 327           this to less than jit_above_cost, and it is unlikely to be
 328           beneficial to set it to more than jit_inline_above_cost. Setting
 329           this to -1 disables expensive optimizations. The default is
 330           500000.
 331
 332 19.7.3. Genetic Query Optimizer #
 333
 334    The genetic query optimizer (GEQO) is an algorithm that does query
 335    planning using heuristic searching. This reduces planning time for
 336    complex queries (those joining many relations), at the cost of
 337    producing plans that are sometimes inferior to those found by the
 338    normal exhaustive-search algorithm. For more information see
 339    Chapter 61.
 340
 341    geqo (boolean) #
 342           Enables or disables genetic query optimization. This is on by
 343           default. It is usually best not to turn it off in production;
 344           the geqo_threshold variable provides more granular control of
 345           GEQO.
 346
 347    geqo_threshold (integer) #
 348           Use genetic query optimization to plan queries with at least
 349           this many FROM items involved. (Note that a FULL OUTER JOIN
 350           construct counts as only one FROM item.) The default is 12. For
 351           simpler queries it is usually best to use the regular,
 352           exhaustive-search planner, but for queries with many tables the
 353           exhaustive search takes too long, often longer than the penalty
 354           of executing a suboptimal plan. Thus, a threshold on the size of
 355           the query is a convenient way to manage use of GEQO.
 356
 357    geqo_effort (integer) #
 358           Controls the trade-off between planning time and query plan
 359           quality in GEQO. This variable must be an integer in the range
 360           from 1 to 10. The default value is five. Larger values increase
 361           the time spent doing query planning, but also increase the
 362           likelihood that an efficient query plan will be chosen.
 363
 364           geqo_effort doesn't actually do anything directly; it is only
 365           used to compute the default values for the other variables that
 366           influence GEQO behavior (described below). If you prefer, you
 367           can set the other parameters by hand instead.
 368
 369    geqo_pool_size (integer) #
 370           Controls the pool size used by GEQO, that is the number of
 371           individuals in the genetic population. It must be at least two,
 372           and useful values are typically 100 to 1000. If it is set to
 373           zero (the default setting) then a suitable value is chosen based
 374           on geqo_effort and the number of tables in the query.
 375
 376    geqo_generations (integer) #
 377           Controls the number of generations used by GEQO, that is the
 378           number of iterations of the algorithm. It must be at least one,
 379           and useful values are in the same range as the pool size. If it
 380           is set to zero (the default setting) then a suitable value is
 381           chosen based on geqo_pool_size.
 382
 383    geqo_selection_bias (floating point) #
 384           Controls the selection bias used by GEQO. The selection bias is
 385           the selective pressure within the population. Values can be from
 386           1.50 to 2.00; the latter is the default.
 387
 388    geqo_seed (floating point) #
 389           Controls the initial value of the random number generator used
 390           by GEQO to select random paths through the join order search
 391           space. The value can range from zero (the default) to one.
 392           Varying the value changes the set of join paths explored, and
 393           may result in a better or worse best path being found.
 394
 395 19.7.4. Other Planner Options #
 396
 397    default_statistics_target (integer) #
 398           Sets the default statistics target for table columns without a
 399           column-specific target set via ALTER TABLE SET STATISTICS.
 400           Larger values increase the time needed to do ANALYZE, but might
 401           improve the quality of the planner's estimates. The default is
 402           100. For more information on the use of statistics by the
 403           PostgreSQL query planner, refer to Section 14.2.
 404
 405    constraint_exclusion (enum) #
 406           Controls the query planner's use of table constraints to
 407           optimize queries. The allowed values of constraint_exclusion are
 408           on (examine constraints for all tables), off (never examine
 409           constraints), and partition (examine constraints only for
 410           inheritance child tables and UNION ALL subqueries). partition is
 411           the default setting. It is often used with traditional
 412           inheritance trees to improve performance.
 413
 414           When this parameter allows it for a particular table, the
 415           planner compares query conditions with the table's CHECK
 416           constraints, and omits scanning tables for which the conditions
 417           contradict the constraints. For example:
 418
 419 CREATE TABLE parent(key integer, ...);
 420 CREATE TABLE child1000(check (key between 1000 and 1999)) INHERITS(parent);
 421 CREATE TABLE child2000(check (key between 2000 and 2999)) INHERITS(parent);
 422 ...
 423 SELECT * FROM parent WHERE key = 2400;
 424
 425           With constraint exclusion enabled, this SELECT will not scan
 426           child1000 at all, improving performance.
 427
 428           Currently, constraint exclusion is enabled by default only for
 429           cases that are often used to implement table partitioning via
 430           inheritance trees. Turning it on for all tables imposes extra
 431           planning overhead that is quite noticeable on simple queries,
 432           and most often will yield no benefit for simple queries. If you
 433           have no tables that are partitioned using traditional
 434           inheritance, you might prefer to turn it off entirely. (Note
 435           that the equivalent feature for partitioned tables is controlled
 436           by a separate parameter, enable_partition_pruning.)
 437
 438           Refer to Section 5.12.5 for more information on using constraint
 439           exclusion to implement partitioning.
 440
 441    cursor_tuple_fraction (floating point) #
 442           Sets the planner's estimate of the fraction of a cursor's rows
 443           that will be retrieved. The default is 0.1. Smaller values of
 444           this setting bias the planner towards using “fast start” plans
 445           for cursors, which will retrieve the first few rows quickly
 446           while perhaps taking a long time to fetch all rows. Larger
 447           values put more emphasis on the total estimated time. At the
 448           maximum setting of 1.0, cursors are planned exactly like regular
 449           queries, considering only the total estimated time and not how
 450           soon the first rows might be delivered.
 451
 452    from_collapse_limit (integer) #
 453           The planner will merge sub-queries into upper queries if the
 454           resulting FROM list would have no more than this many items.
 455           Smaller values reduce planning time but might yield inferior
 456           query plans. The default is eight. For more information see
 457           Section 14.3.
 458
 459           Setting this value to geqo_threshold or more may trigger use of
 460           the GEQO planner, resulting in non-optimal plans. See
 461           Section 19.7.3.
 462
 463    jit (boolean) #
 464           Determines whether JIT compilation may be used by PostgreSQL, if
 465           available (see Chapter 30). The default is on.
 466
 467    join_collapse_limit (integer) #
 468           The planner will rewrite explicit JOIN constructs (except FULL
 469           JOINs) into lists of FROM items whenever a list of no more than
 470           this many items would result. Smaller values reduce planning
 471           time but might yield inferior query plans.
 472
 473           By default, this variable is set the same as
 474           from_collapse_limit, which is appropriate for most uses. Setting
 475           it to 1 prevents any reordering of explicit JOINs. Thus, the
 476           explicit join order specified in the query will be the actual
 477           order in which the relations are joined. Because the query
 478           planner does not always choose the optimal join order, advanced
 479           users can elect to temporarily set this variable to 1, and then
 480           specify the join order they desire explicitly. For more
 481           information see Section 14.3.
 482
 483           Setting this value to geqo_threshold or more may trigger use of
 484           the GEQO planner, resulting in non-optimal plans. See
 485           Section 19.7.3.
 486
 487    plan_cache_mode (enum) #
 488           Prepared statements (either explicitly prepared or implicitly
 489           generated, for example by PL/pgSQL) can be executed using custom
 490           or generic plans. Custom plans are made afresh for each
 491           execution using its specific set of parameter values, while
 492           generic plans do not rely on the parameter values and can be
 493           re-used across executions. Thus, use of a generic plan saves
 494           planning time, but if the ideal plan depends strongly on the
 495           parameter values then a generic plan may be inefficient. The
 496           choice between these options is normally made automatically, but
 497           it can be overridden with plan_cache_mode. The allowed values
 498           are auto (the default), force_custom_plan and
 499           force_generic_plan. This setting is considered when a cached
 500           plan is to be executed, not when it is prepared. For more
 501           information see PREPARE.
 502
 503    recursive_worktable_factor (floating point) #
 504           Sets the planner's estimate of the average size of the working
 505           table of a recursive query, as a multiple of the estimated size
 506           of the initial non-recursive term of the query. This helps the
 507           planner choose the most appropriate method for joining the
 508           working table to the query's other tables. The default value is
 509           10.0. A smaller value such as 1.0 can be helpful when the
 510           recursion has low “fan-out” from one step to the next, as for
 511           example in shortest-path queries. Graph analytics queries may
 512           benefit from larger-than-default values.