begriffs open source - ai-pg/blob - full-docs/txt/indexes-examine.txt

   1
   2 11.12. Examining Index Usage #
   3
   4    Although indexes in PostgreSQL do not need maintenance or tuning, it is
   5    still important to check which indexes are actually used by the
   6    real-life query workload. Examining index usage for an individual query
   7    is done with the EXPLAIN command; its application for this purpose is
   8    illustrated in Section 14.1. It is also possible to gather overall
   9    statistics about index usage in a running server, as described in
  10    Section 27.2.
  11
  12    It is difficult to formulate a general procedure for determining which
  13    indexes to create. There are a number of typical cases that have been
  14    shown in the examples throughout the previous sections. A good deal of
  15    experimentation is often necessary. The rest of this section gives some
  16    tips for that:
  17      * Always run ANALYZE first. This command collects statistics about
  18        the distribution of the values in the table. This information is
  19        required to estimate the number of rows returned by a query, which
  20        is needed by the planner to assign realistic costs to each possible
  21        query plan. In absence of any real statistics, some default values
  22        are assumed, which are almost certain to be inaccurate. Examining
  23        an application's index usage without having run ANALYZE is
  24        therefore a lost cause. See Section 24.1.3 and Section 24.1.6 for
  25        more information.
  26      * Use real data for experimentation. Using test data for setting up
  27        indexes will tell you what indexes you need for the test data, but
  28        that is all.
  29        It is especially fatal to use very small test data sets. While
  30        selecting 1000 out of 100000 rows could be a candidate for an
  31        index, selecting 1 out of 100 rows will hardly be, because the 100
  32        rows probably fit within a single disk page, and there is no plan
  33        that can beat sequentially fetching 1 disk page.
  34        Also be careful when making up test data, which is often
  35        unavoidable when the application is not yet in production. Values
  36        that are very similar, completely random, or inserted in sorted
  37        order will skew the statistics away from the distribution that real
  38        data would have.
  39      * When indexes are not used, it can be useful for testing to force
  40        their use. There are run-time parameters that can turn off various
  41        plan types (see Section 19.7.1). For instance, turning off
  42        sequential scans (enable_seqscan) and nested-loop joins
  43        (enable_nestloop), which are the most basic plans, will force the
  44        system to use a different plan. If the system still chooses a
  45        sequential scan or nested-loop join then there is probably a more
  46        fundamental reason why the index is not being used; for example,
  47        the query condition does not match the index. (What kind of query
  48        can use what kind of index is explained in the previous sections.)
  49      * If forcing index usage does use the index, then there are two
  50        possibilities: Either the system is right and using the index is
  51        indeed not appropriate, or the cost estimates of the query plans
  52        are not reflecting reality. So you should time your query with and
  53        without indexes. The EXPLAIN ANALYZE command can be useful here.
  54      * If it turns out that the cost estimates are wrong, there are,
  55        again, two possibilities. The total cost is computed from the
  56        per-row costs of each plan node times the selectivity estimate of
  57        the plan node. The costs estimated for the plan nodes can be
  58        adjusted via run-time parameters (described in Section 19.7.2). An
  59        inaccurate selectivity estimate is due to insufficient statistics.
  60        It might be possible to improve this by tuning the
  61        statistics-gathering parameters (see ALTER TABLE).
  62        If you do not succeed in adjusting the costs to be more
  63        appropriate, then you might have to resort to forcing index usage
  64        explicitly. You might also want to contact the PostgreSQL
  65        developers to examine the issue.