   1
   2 63.5. Index Uniqueness Checks
   3
   4    PostgreSQL enforces SQL uniqueness constraints using unique indexes,
   5    which are indexes that disallow multiple entries with identical keys.
   6    An access method that supports this feature sets amcanunique true. (At
   7    present, only b-tree supports it.) Columns listed in the INCLUDE clause
   8    are not considered when enforcing uniqueness.
   9
  10    Because of MVCC, it is always necessary to allow duplicate entries to
  11    exist physically in an index: the entries might refer to successive
  12    versions of a single logical row. The behavior we actually want to
  13    enforce is that no MVCC snapshot could include two rows with equal
  14    index keys. This breaks down into the following cases that must be
  15    checked when inserting a new row into a unique index:
  16      * If a conflicting valid row has been deleted by the current
  17        transaction, it's okay. (In particular, since an UPDATE always
  18        deletes the old row version before inserting the new version, this
  19        will allow an UPDATE on a row without changing the key.)
  20      * If a conflicting row has been inserted by an as-yet-uncommitted
  21        transaction, the would-be inserter must wait to see if that
  22        transaction commits. If it rolls back then there is no conflict. If
  23        it commits without deleting the conflicting row again, there is a
  24        uniqueness violation. (In practice we just wait for the other
  25        transaction to end and then redo the visibility check in toto.)
  26      * Similarly, if a conflicting valid row has been deleted by an
  27        as-yet-uncommitted transaction, the would-be inserter must wait for
  28        that transaction to commit or abort, and then repeat the test.
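
The cases above can be read as a small decision table over the state of the transaction that inserted the conflicting row and the state of the transaction (if any) that deleted it. The following is a toy Python model of that table, not PostgreSQL code: the names TxnState, Verdict, and classify_conflict are hypothetical. The outcomes for an aborted deleter and a committed deleter are not spelled out in the text and are inferred here from the stated goal that no snapshot may see two rows with equal keys.

```python
from enum import Enum
from typing import Optional


class TxnState(Enum):
    """State of a transaction as seen by the would-be inserter (toy model)."""
    CURRENT = "current transaction"
    COMMITTED = "committed"
    ABORTED = "aborted"
    IN_PROGRESS = "other transaction, not yet ended"


class Verdict(Enum):
    NO_CONFLICT = "no conflict"
    WAIT = "wait for the other transaction, then repeat the test"
    VIOLATION = "uniqueness violation"


def classify_conflict(inserter: TxnState,
                      deleter: Optional[TxnState] = None) -> Verdict:
    """Classify one existing row version whose key equals the new key.

    inserter: state of the transaction that created the conflicting row.
    deleter:  state of the transaction that deleted it, or None if the
              row has not been deleted.
    """
    # The conflicting row was never really there (inferred from
    # "if it rolls back then there is no conflict").
    if inserter is TxnState.ABORTED:
        return Verdict.NO_CONFLICT
    # Inserted by an as-yet-uncommitted transaction: wait and retest.
    if inserter is TxnState.IN_PROGRESS:
        return Verdict.WAIT

    # From here on the conflicting row is valid (committed, or ours).
    if deleter is None or deleter is TxnState.ABORTED:
        # Assumption: an aborted delete leaves the row in place.
        return Verdict.VIOLATION
    # Deleted by an as-yet-uncommitted transaction: wait and retest.
    if deleter is TxnState.IN_PROGRESS:
        return Verdict.WAIT
    # Deleted by the current transaction (for example the old version in
    # an UPDATE), or, by assumption, by a committed one: it's okay.
    return Verdict.NO_CONFLICT
```

A caller that receives WAIT would block until the other transaction ends and then call the function again with the updated state, which mirrors the "redo the visibility check in toto" remark above.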
  29
  30    Furthermore, immediately before reporting a uniqueness violation
  31    according to the above rules, the access method must recheck the
  32    liveness of the row being inserted. If it is committed dead then no
  33    violation should be reported. (This case cannot occur during the
  34    ordinary scenario of inserting a row that's just been created by the
  35    current transaction. It can happen during CREATE UNIQUE INDEX
  36    CONCURRENTLY, however.)
  37
  38    We require the index access method to apply these tests itself, which
  39    means that it must reach into the heap to check the commit status of
  40    any row that is shown to have a duplicate key according to the index
  41    contents. This is without a doubt ugly and non-modular, but it saves
  42    redundant work: if we did a separate probe then the index lookup for a
  43    conflicting row would be essentially repeated while finding the place
  44    to insert the new row's index entry. What's more, there is no obvious
  45    way to avoid race conditions unless the conflict check is an integral
  46    part of insertion of the new index entry.
  47
  48    If the unique constraint is deferrable, there is additional complexity:
  49    we need to be able to insert an index entry for a new row, but defer
  50    any uniqueness-violation error until end of statement or even later. To
  51    avoid unnecessary repeat searches of the index, the index access method
  52    should do a preliminary uniqueness check during the initial insertion.
  53    If this shows that there is definitely no conflicting live tuple, we
  54    are done. Otherwise, we schedule a recheck to occur when it is time to
  55    enforce the constraint. If, at the time of the recheck, both the
  56    inserted tuple and some other tuple with the same key are live, then
  57    the error must be reported. (Note that for this purpose, “live”
  58    actually means “any tuple in the index entry's HOT chain is live”.) To
  59    implement this, the aminsert function is passed a checkUnique parameter
  60    having one of the following values:
  61      * UNIQUE_CHECK_NO indicates that no uniqueness checking should be
  62        done (this is not a unique index).
  63      * UNIQUE_CHECK_YES indicates that this is a non-deferrable unique
  64        index, and the uniqueness check must be done immediately, as
  65        described above.
  66      * UNIQUE_CHECK_PARTIAL indicates that the unique constraint is
  67        deferrable. PostgreSQL will use this mode to insert each row's
  68        index entry. The access method must allow duplicate entries into
  69        the index, and report any potential duplicates by returning false
  70        from aminsert. For each row for which false is returned, a deferred
  71        recheck will be scheduled.
  72        The access method must identify any rows which might violate the
  73        unique constraint, but it is not an error for it to report false
  74        positives. This allows the check to be done without waiting for
  75        other transactions to finish; conflicts reported here are not
  76        treated as errors and will be rechecked later, by which time they
  77        may no longer be conflicts.
  78      * UNIQUE_CHECK_EXISTING indicates that this is a deferred recheck of
  79        a row that was reported as a potential uniqueness violation.
  80        Although this is implemented by calling aminsert, the access method
  81        must not insert a new index entry in this case. The index entry is
  82        already present. Rather, the access method must check to see if
  83        there is another live index entry. If so, and if the target row is
   84        also still live, report an error.
  85        It is recommended that in a UNIQUE_CHECK_EXISTING call, the access
  86        method further verify that the target row actually does have an
   87        existing entry in the index, and report an error if not. This is a
  88        good idea because the index tuple values passed to aminsert will
  89        have been recomputed. If the index definition involves functions
  90        that are not really immutable, we might be checking the wrong area
  91        of the index. Checking that the target row is found in the recheck
  92        verifies that we are scanning for the same tuple values as were
  93        used in the original insertion.
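
To make the four modes concrete, here is a minimal in-memory sketch in Python. It is an illustration under simplifying assumptions, not the b-tree implementation. ToyIndex, UniqueCheck, and UniqueViolation are hypothetical names. Liveness is a plain set of row ids rather than heap visibility or HOT chains. Nothing waits on other transactions. The return value is only meaningful in the PARTIAL case, and the toy returns True in the other modes.

```python
from enum import Enum, auto


class UniqueCheck(Enum):
    NO = auto()        # stands in for UNIQUE_CHECK_NO
    YES = auto()       # stands in for UNIQUE_CHECK_YES
    PARTIAL = auto()   # stands in for UNIQUE_CHECK_PARTIAL
    EXISTING = auto()  # stands in for UNIQUE_CHECK_EXISTING


class UniqueViolation(Exception):
    """Raised when two live rows share a key (toy model)."""


class ToyIndex:
    def __init__(self):
        self.entries = []   # list of (key, row_id); duplicates may coexist
        self.live = set()   # row ids currently treated as live

    def _other_live_entry(self, key, row_id):
        """True if some other live row has an entry with the same key."""
        return any(k == key and r != row_id and r in self.live
                   for k, r in self.entries)

    def aminsert(self, key, row_id, check):
        if check is UniqueCheck.EXISTING:
            # Deferred recheck: never insert, the entry is already there.
            if (key, row_id) not in self.entries:
                # Recommended sanity check: the recomputed key should
                # lead back to the target row's own entry.
                raise RuntimeError("target row has no entry for this key")
            if row_id in self.live and self._other_live_entry(key, row_id):
                raise UniqueViolation(key)
            return True

        conflict = (check is not UniqueCheck.NO
                    and self._other_live_entry(key, row_id))
        if conflict and check is UniqueCheck.YES:
            # Non-deferrable: report the violation immediately.
            raise UniqueViolation(key)

        # NO, YES without conflict, and PARTIAL all insert the entry.
        self.entries.append((key, row_id))
        self.live.add(row_id)
        # In PARTIAL mode, False means "potential duplicate, recheck later".
        return not conflict
```

In this model, a caller handling a deferrable constraint would insert each row with UniqueCheck.PARTIAL, remember every (key, row_id) for which False came back, and call aminsert again with UniqueCheck.EXISTING when the constraint is enforced. If the competing row has left the live set by then, the recheck passes without error.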