11.1. Introduction

   Suppose we have a table similar to this:
CREATE TABLE test1 (
    id integer,
    content varchar
);

   and the application issues many queries of the form:
SELECT content FROM test1 WHERE id = constant;

   With no advance preparation, the system would have to scan the entire
   test1 table, row by row, to find all matching entries. If there are
   many rows in test1 and only a few rows (perhaps zero or one) that would
   be returned by such a query, this is clearly an inefficient method. But
   if the system has been instructed to maintain an index on the id
   column, it can use a more efficient method for locating matching rows.
   For instance, it might only have to walk a few levels deep into a
   search tree.

   A similar approach is used in most non-fiction books: terms and
   concepts that are frequently looked up by readers are collected in an
   alphabetic index at the end of the book. The interested reader can scan
   the index relatively quickly and flip to the appropriate page(s),
   rather than having to read the entire book to find the material of
   interest. Just as it is the task of the author to anticipate the items
   that readers are likely to look up, it is the task of the database
   programmer to foresee which indexes will be useful.

   The following command can be used to create an index on the id column,
   as discussed:
CREATE INDEX test1_id_index ON test1 (id);

   The name test1_id_index can be chosen freely, but you should pick
   something that enables you to remember later what the index was for.

   To remove an index, use the DROP INDEX command. Indexes can be added to
   and removed from tables at any time.

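   As a minimal sketch, the index created above could be removed as
   follows. The IF EXISTS variant is optional; it reports a notice
   instead of an error when no index of that name exists.

```sql
-- Remove the index created earlier in this section.
DROP INDEX test1_id_index;

-- Variant that does not fail if the index is already gone.
DROP INDEX IF EXISTS test1_id_index;
```

   Dropping an index leaves the table's data untouched; only the lookup
   structure is removed.
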
   Once an index is created, no further intervention is required: the
   system will update the index when the table is modified, and it will
   use the index in queries when it thinks doing so would be more
   efficient than a sequential table scan. But you might have to run the
   ANALYZE command regularly to update statistics to allow the query
   planner to make educated decisions. See Chapter 14 for information
   about how to find out whether an index is used and when and why the
   planner might choose not to use an index.

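   The following sketch shows one way to check whether the planner uses
   the index. The row count and the sample values produced with
   generate_series are arbitrary assumptions made for illustration. The
   plan that EXPLAIN prints depends on the data, the statistics, and the
   server configuration, so no particular output is implied here.

```sql
-- Fill the example table with illustrative data (arbitrary row count).
INSERT INTO test1 (id, content)
SELECT n, 'row ' || n
FROM generate_series(1, 100000) AS n;

-- Make sure the index from this section exists.
CREATE INDEX IF NOT EXISTS test1_id_index ON test1 (id);

-- Refresh the planner statistics for the table.
ANALYZE test1;

-- Show the plan the planner chooses for the lookup.
EXPLAIN SELECT content FROM test1 WHERE id = 42;
```

   If the plan names test1_id_index, for example in an index scan or
   bitmap scan node, the index is being used. A Seq Scan node on test1
   means the planner judged a full table scan to be cheaper.
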
   Indexes can also benefit UPDATE and DELETE commands with search
   conditions. Indexes can moreover be used in join searches. Thus, an
   index defined on a column that is part of a join condition can also
   significantly speed up queries with joins.

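   A short sketch of both cases follows. The table test2 and its columns
   are hypothetical and exist only for this example. Whether the index is
   actually used remains the planner's decision.

```sql
-- Both statements have a search condition on the indexed column,
-- so the planner may use test1_id_index to find the target rows.
UPDATE test1 SET content = 'updated' WHERE id = 42;
DELETE FROM test1 WHERE id = 43;

-- Hypothetical second table, introduced only for this example.
CREATE TABLE test2 (
    test1_id integer,
    note varchar
);

-- The join condition compares test2.test1_id with the indexed column
-- test1.id, so the index on test1 (id) is a candidate for the join.
SELECT t1.content, t2.note
FROM test2 AS t2
JOIN test1 AS t1 ON t1.id = t2.test1_id;
```
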
   In general, PostgreSQL indexes can be used to optimize queries that
   contain one or more WHERE or JOIN clauses of the form
indexed-column indexable-operator comparison-value

   Here, the indexed-column is whatever column or expression the index has
   been defined on. The indexable-operator is an operator that is a member
   of the index's operator class for the indexed column. (More details
   about that appear below.) And the comparison-value can be any
   expression that is not volatile and does not reference the index's
   table.

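   The three parts of such a clause can be illustrated as follows. This
   sketch assumes the default B-tree index on test1 (id) from earlier in
   this section; the expression index and its name are assumptions added
   for the example.

```sql
-- With the default B-tree index on test1 (id), comparison operators
-- such as =, <, and >= are indexable.
SELECT content FROM test1 WHERE id = 42;
SELECT content FROM test1 WHERE id >= 100 AND id < 200;

-- The indexed "column" can also be an expression. The index name
-- below is an assumption made for this example.
CREATE INDEX test1_lower_content_index ON test1 (lower(content));

-- This clause matches the indexed expression exactly.
SELECT id FROM test1 WHERE lower(content) = 'some value';

-- Not an indexable clause: random() is volatile, so the right-hand
-- side does not qualify as a comparison-value.
SELECT content FROM test1 WHERE id = (random() * 100)::integer;
```
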
   In some cases the query planner can extract an indexable clause of this
   form from another SQL construct. A simple example is that if the
   original clause was
comparison-value operator indexed-column

   then it can be flipped around into indexable form if the original
   operator has a commutator operator that is a member of the index's
   operator class.

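   As an illustration, assuming the B-tree index on test1 (id): the
   operator = is its own commutator, and the commutator of > is <. Both
   belong to the integer B-tree operator class, so the first two queries
   below can be treated like the last two.

```sql
-- Written with the indexed column on the right-hand side...
SELECT content FROM test1 WHERE 42 = id;
SELECT content FROM test1 WHERE 100 > id;

-- ...these can be treated like the flipped, indexable forms:
SELECT content FROM test1 WHERE id = 42;
SELECT content FROM test1 WHERE id < 100;
```
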
   Creating an index on a large table can take a long time. By default,
   PostgreSQL allows reads (SELECT statements) to occur on the table in
   parallel with index creation, but writes (INSERT, UPDATE, DELETE) are
   blocked until the index build is finished. In production environments
   this is often unacceptable. It is possible to allow writes to occur in
   parallel with index creation, but there are several caveats to be aware
   of. For more information, see Building Indexes Concurrently.

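   A minimal sketch of the non-blocking variant, assuming the index
   test1_id_index does not exist yet:

```sql
-- Build the index without blocking writes to test1.
-- This form cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY test1_id_index ON test1 (id);
```

   A concurrent build generally takes longer than a regular one. If it
   fails, it can leave behind an invalid index that should be dropped
   before retrying.
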
   After an index is created, the system has to keep it synchronized with
   the table. This adds overhead to data manipulation operations. Indexes
   can also prevent the creation of heap-only tuples. Therefore indexes
   that are seldom or never used in queries should be removed.
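
   One way to look for removal candidates is the pg_stat_user_indexes
   statistics view. The query below is a sketch, not a complete audit:
   the counters only cover the period since statistics were last reset,
   and they reflect activity on the server where the query is run.

```sql
-- List indexes on user tables that have not been scanned since the
-- statistics were last reset, largest first.
SELECT schemaname,
       relname       AS table_name,
       indexrelname  AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

   An index that backs a primary key or unique constraint still enforces
   that constraint even when its scan count is zero, so it should not be
   dropped on the basis of this query alone.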