63.4. Index Locking Considerations

Index access methods must handle concurrent updates of the index by
multiple processes. The core PostgreSQL system obtains AccessShareLock
on the index during an index scan, and RowExclusiveLock when updating
the index (including plain VACUUM). Since these lock types do not
conflict, the access method is responsible for handling any
fine-grained locking it might need. An ACCESS EXCLUSIVE lock on the
index as a whole will be taken only during index creation, destruction,
or REINDEX (SHARE UPDATE EXCLUSIVE is taken instead with CONCURRENTLY).
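
The lock interactions described above can be sketched as a small
conflict table. This is only an illustrative model in Python, not
PostgreSQL code: the names CONFLICTS and conflicts() are hypothetical,
and the table is restricted to the four lock modes named in this
section. The conflict entries follow the table-level lock conflict
rules given in the PostgreSQL documentation on explicit locking.

```python
# Toy conflict table for the four lock modes named in this section.
# The Python names here are illustrative only (not PostgreSQL APIs).
ACCESS_SHARE = "AccessShareLock"
ROW_EXCLUSIVE = "RowExclusiveLock"
SHARE_UPDATE_EXCLUSIVE = "ShareUpdateExclusiveLock"
ACCESS_EXCLUSIVE = "AccessExclusiveLock"

# For each held mode: the set of requested modes it conflicts with,
# restricted to the four modes above.
CONFLICTS = {
    ACCESS_SHARE: {ACCESS_EXCLUSIVE},
    ROW_EXCLUSIVE: {ACCESS_EXCLUSIVE},
    SHARE_UPDATE_EXCLUSIVE: {SHARE_UPDATE_EXCLUSIVE, ACCESS_EXCLUSIVE},
    ACCESS_EXCLUSIVE: {
        ACCESS_SHARE,
        ROW_EXCLUSIVE,
        SHARE_UPDATE_EXCLUSIVE,
        ACCESS_EXCLUSIVE,
    },
}


def conflicts(held, requested):
    """Return True if a lock in mode `requested` must wait for `held`."""
    return requested in CONFLICTS[held]
```

In this model a scan (AccessShareLock) and an update (RowExclusiveLock)
never wait for each other, which is why the access method has to
supply its own finer-grained locking.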

Building an index type that supports concurrent updates usually
requires extensive and subtle analysis of the required behavior. For
the b-tree and hash index types, you can read about the design
decisions involved in src/backend/access/nbtree/README and
src/backend/access/hash/README.

Aside from the index's own internal consistency requirements,
concurrent updates raise consistency issues between the parent
table (the heap) and the index. Because PostgreSQL separates accesses
and updates of the heap from those of the index, there are windows in
which the index might be inconsistent with the heap. We handle this
problem with the following rules:
  * A new heap entry is made before making its index entries.
    (Therefore a concurrent index scan is likely to fail to see the
    heap entry. This is okay because the index reader would be
    uninterested in an uncommitted row anyway. But see Section 63.5.)
  * When a heap entry is to be deleted (by VACUUM), all its index
    entries must be removed first.
  * An index scan must maintain a pin on the index page holding the
    item last returned by amgettuple, and ambulkdelete cannot delete
    entries from pages that are pinned by other backends. The need for
    this rule is explained below.

Without the third rule, it is possible for an index reader to see an
index entry just before it is removed by VACUUM, and then to arrive at
the corresponding heap entry after that was removed by VACUUM. This
creates no serious problems if that item number is still unused when
the reader reaches it, since an empty item slot will be ignored by
heap_fetch(). But what if a third backend has already re-used the item
slot for something else? When using an MVCC-compliant snapshot, there
is no problem because the new occupant of the slot is certain to be too
new to pass the snapshot test. However, with a non-MVCC-compliant
snapshot (such as SnapshotAny), it would be possible to accept and
return a row that does not in fact match the scan keys. We could defend
against this scenario by requiring the scan keys to be rechecked
against the heap row in all cases, but that is too expensive. Instead,
we use a pin on an index page as a proxy to indicate that the reader
might still be “in flight” from the index entry to the matching heap
entry. Making ambulkdelete block on such a pin ensures that VACUUM
cannot delete the heap entry before the reader is done with it. This
solution costs little in run time, and adds blocking overhead only in
the rare cases where there actually is a conflict.
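
The race described above can be replayed step by step in a toy model.
This is a minimal sketch in Python under simplifying assumptions, not
PostgreSQL code: IndexPage, vacuum() and scenario() are hypothetical
names, the interleaving is fixed rather than truly concurrent, and a
VACUUM that would have to wait on a pin is modeled as simply returning
False. The reader behaves like a scan with a non-MVCC-compliant
snapshot: it accepts whatever it finds in the heap slot.

```python
class IndexPage:
    """Hypothetical index page: entries plus a pin count."""

    def __init__(self):
        self.entries = {}   # key -> heap slot number
        self.pins = 0


def vacuum(page, heap, key, honor_pins):
    """Remove the index entry, then the heap row.

    Returns False (nothing removed) when the pin rule is honored and
    the page is pinned; a real VACUUM would wait instead.
    """
    if honor_pins and page.pins > 0:
        return False
    slot = page.entries.pop(key)   # index entry first
    heap[slot] = None              # then the heap slot becomes unused
    return True


def scenario(honor_pins):
    heap = ["apple"]               # slot 0 holds a deletable row
    page = IndexPage()
    page.entries["apple"] = 0

    # Reader: finds the index entry and keeps the page pinned.
    slot = page.entries["apple"]
    page.pins += 1

    # VACUUM runs while the reader is "in flight".
    vacuumed = vacuum(page, heap, "apple", honor_pins)

    # A third backend re-uses the slot if it has become free.
    if heap[slot] is None:
        heap[slot] = "banana"

    # Reader: fetches the heap slot, accepting anything it finds.
    row = heap[slot]
    page.pins -= 1
    return row, vacuumed
```

With the pin rule honored, the scan for "apple" returns "apple";
without it, the same scan returns "banana", a row that does not match
the scan key.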

This solution requires that index scans be “synchronous”: we have to
fetch each heap tuple immediately after scanning the corresponding
index entry. This is expensive for a number of reasons. An
“asynchronous” scan in which we collect many TIDs from the index, and
only visit the heap tuples sometime later, requires much less index
locking overhead and can allow a more efficient heap access pattern.
Per the above analysis, we must use the synchronous approach for
non-MVCC-compliant snapshots, but an asynchronous scan is workable for
a query using an MVCC snapshot.
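
An asynchronous scan of the kind described above might be sketched as
follows. This is a simplified Python model, not PostgreSQL code:
async_scan() is a hypothetical name, TIDs are (block, offset) pairs,
and the visibility test is reduced to comparing a row's inserting
transaction ID with a single snapshot value, which is far simpler than
a real MVCC visibility check. The point is only that a slot re-used
after the TIDs were collected holds a row too new to pass the snapshot
test.

```python
def async_scan(index_tids, heap, snapshot_xid):
    """Collect TIDs first, then visit the heap in physical order.

    index_tids:   iterable of (block, offset) pairs from the index
    heap:         dict mapping (block, offset) -> {"xmin", "value"}
    snapshot_xid: rows with xmin >= this are treated as too new
                  (a deliberate simplification of MVCC visibility)
    """
    tids = sorted(index_tids)          # block order, then offset
    result = []
    for tid in tids:
        row = heap.get(tid)
        if row is None:
            continue                   # unused slot: ignore it
        if row["xmin"] < snapshot_xid: # snapshot test filters re-used slots
            result.append((tid, row["value"]))
    return result
```

Sorting the TIDs before the heap visit is what allows the more
efficient heap access pattern; the snapshot test is what makes the
delayed visit safe in this model.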

In an amgetbitmap index scan, the access method does not keep an index
pin on any of the returned tuples. Therefore it is only safe to use
such scans with MVCC-compliant snapshots.

When the ampredlocks flag is not set, any scan using that index access
method within a serializable transaction will acquire a nonblocking
predicate lock on the full index. This will generate a read-write
conflict with the insert of any tuple into that index by a concurrent
serializable transaction. If certain patterns of read-write conflicts
are detected among a set of concurrent serializable transactions, one
of those transactions may be canceled to protect data integrity. When
the flag is set, it indicates that the index access method implements
finer-grained predicate locking, which will tend to reduce the
frequency of such transaction cancellations.
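
The effect of the flag on conflict frequency can be illustrated with a
toy model. The sketch below is Python, not PostgreSQL code:
predicate_lock_target() and rw_conflict() are hypothetical names, and
page-level granularity is assumed purely for illustration, since the
text above says only that the locking is finer-grained.

```python
def predicate_lock_target(ampredlocks, page_no):
    """What a scan that read `page_no` would predicate-lock.

    Without the flag the whole index is locked; with it, this model
    assumes (for illustration) a lock on just the page that was read.
    """
    return ("page", page_no) if ampredlocks else ("relation",)


def rw_conflict(lock_target, insert_page):
    """Does an insert landing on `insert_page` conflict with the lock?"""
    if lock_target == ("relation",):
        return True                    # any insert into the index conflicts
    return lock_target == ("page", insert_page)
```

In this model a scan of page 3 conflicts with an insert on page 9 only
when the flag is not set, which is the sense in which finer-grained
locking reduces read-write conflicts and hence cancellations.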