1 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>11.9. Index-Only Scans and Covering Indexes</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="indexes-partial.html" title="11.8. Partial Indexes" /><link rel="next" href="indexes-opclass.html" title="11.10. Operator Classes and Operator Families" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">11.9. Index-Only Scans and Covering Indexes</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="indexes-partial.html" title="11.8. Partial Indexes">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="indexes.html" title="Chapter 11. Indexes">Up</a></td><th width="60%" align="center">Chapter 11. Indexes</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="indexes-opclass.html" title="11.10. Operator Classes and Operator Families">Next</a></td></tr></table><hr /></div><div class="sect1" id="INDEXES-INDEX-ONLY-SCANS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">11.9. Index-Only Scans and Covering Indexes <a href="#INDEXES-INDEX-ONLY-SCANS" class="id_link">#</a></h2></div></div></div><a id="id-1.5.10.12.2" class="indexterm"></a><a id="id-1.5.10.12.3" class="indexterm"></a><a id="id-1.5.10.12.4" class="indexterm"></a><a id="id-1.5.10.12.5" class="indexterm"></a><p>
3 All indexes in <span class="productname">PostgreSQL</span>
4 are <em class="firstterm">secondary</em> indexes, meaning that each index is
5 stored separately from the table's main data area (which is called the
6 table's <em class="firstterm">heap</em>
7 in <span class="productname">PostgreSQL</span> terminology). This means that
8 in an ordinary index scan, each row retrieval requires fetching data from
9 both the index and the heap. Furthermore, while the index entries that
10 match a given indexable <code class="literal">WHERE</code> condition are usually
11 close together in the index, the table rows they reference might be
12 anywhere in the heap. The heap-access portion of an index scan thus
13 involves a lot of random access into the heap, which can be slow,
14 particularly on traditional rotating media. (As described in
15 <a class="xref" href="indexes-bitmap-scans.html" title="11.5. Combining Multiple Indexes">Section 11.5</a>, bitmap scans try to alleviate
16 this cost by doing the heap accesses in sorted order, but that only goes
19 To solve this performance problem, <span class="productname">PostgreSQL</span>
20 supports <em class="firstterm">index-only scans</em>, which can answer
21 queries from an index alone without any heap access. The basic idea is
22 to return values directly out of each index entry instead of consulting
23 the associated heap entry. There are two fundamental restrictions on
24 when this method can be used:
26 </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
27 The index type must support index-only scans. B-tree indexes always
28 do. GiST and SP-GiST indexes support index-only scans for some
29 operator classes but not others. Other index types have no support.
30 The underlying requirement is that the index must physically store, or
31 else be able to reconstruct, the original data value for each index
32 entry. As a counterexample, GIN indexes cannot support index-only
33 scans because each index entry typically holds only part of the
35 </p></li><li class="listitem"><p>
36 The query must reference only columns stored in the index. For
37 example, given an index on columns <code class="literal">x</code>
38 and <code class="literal">y</code> of a table that also has a
39 column <code class="literal">z</code>, these queries could use index-only scans:
40 </p><pre class="programlisting">
41 SELECT x, y FROM tab WHERE x = 'key';
42 SELECT x FROM tab WHERE x = 'key' AND y < 42;
44 but these queries could not:
45 </p><pre class="programlisting">
46 SELECT x, z FROM tab WHERE x = 'key';
47 SELECT x FROM tab WHERE x = 'key' AND z < 42;
49 (Expression indexes and partial indexes complicate this rule,
51 </p></li></ol></div><p>
53 If these two fundamental requirements are met, then all the data values
54 required by the query are available from the index, so an index-only scan
55 is physically possible. But there is an additional requirement for any
56 table scan in <span class="productname">PostgreSQL</span>: it must verify that
57 each retrieved row be <span class="quote">“<span class="quote">visible</span>”</span> to the query's MVCC
58 snapshot, as discussed in <a class="xref" href="mvcc.html" title="Chapter 13. Concurrency Control">Chapter 13</a>. Visibility information
59 is not stored in index entries, only in heap entries; so at first glance
60 it would seem that every row retrieval would require a heap access
61 anyway. And this is indeed the case, if the table row has been modified
62 recently. However, for seldom-changing data there is a way around this
63 problem. <span class="productname">PostgreSQL</span> tracks, for each page in
64 a table's heap, whether all rows stored in that page are old enough to be
65 visible to all current and future transactions. This information is
66 stored in a bit in the table's <em class="firstterm">visibility map</em>. An
67 index-only scan, after finding a candidate index entry, checks the
68 visibility map bit for the corresponding heap page. If it's set, the row
69 is known visible and so the data can be returned with no further work.
70 If it's not set, the heap entry must be visited to find out whether it's
71 visible, so no performance advantage is gained over a standard index
72 scan. Even in the successful case, this approach trades visibility map
73 accesses for heap accesses; but since the visibility map is four orders
74 of magnitude smaller than the heap it describes, far less physical I/O is
75 needed to access it. In most situations the visibility map remains
76 cached in memory all the time.
78 In short, while an index-only scan is possible given the two fundamental
79 requirements, it will be a win only if a significant fraction of the
80 table's heap pages have their all-visible map bits set. But tables in
81 which a large fraction of the rows are unchanging are common enough to
82 make this type of scan very useful in practice.
84 <a id="id-1.5.10.12.10.1" class="indexterm"></a>
85 To make effective use of the index-only scan feature, you might choose to
86 create a <em class="firstterm">covering index</em>, which is an index
87 specifically designed to include the columns needed by a particular
88 type of query that you run frequently. Since queries typically need to
89 retrieve more columns than just the ones they search
90 on, <span class="productname">PostgreSQL</span> allows you to create an index
91 in which some columns are just <span class="quote">“<span class="quote">payload</span>”</span> and are not part
92 of the search key. This is done by adding an <code class="literal">INCLUDE</code>
93 clause listing the extra columns. For example, if you commonly run
95 </p><pre class="programlisting">
96 SELECT y FROM tab WHERE x = 'key';
98 the traditional approach to speeding up such queries would be to create
99 an index on <code class="literal">x</code> only. However, an index defined as
100 </p><pre class="programlisting">
101 CREATE INDEX tab_x_y ON tab(x) INCLUDE (y);
103 could handle these queries as index-only scans,
104 because <code class="literal">y</code> can be obtained from the index without
107 Because column <code class="literal">y</code> is not part of the index's search
108 key, it does not have to be of a data type that the index can handle;
109 it's merely stored in the index and is not interpreted by the index
110 machinery. Also, if the index is a unique index, that is
111 </p><pre class="programlisting">
112 CREATE UNIQUE INDEX tab_x_y ON tab(x) INCLUDE (y);
114 the uniqueness condition applies to just column <code class="literal">x</code>,
115 not to the combination of <code class="literal">x</code> and <code class="literal">y</code>.
116 (An <code class="literal">INCLUDE</code> clause can also be written
117 in <code class="literal">UNIQUE</code> and <code class="literal">PRIMARY KEY</code>
118 constraints, providing alternative syntax for setting up an index like
121 It's wise to be conservative about adding non-key payload columns to an
122 index, especially wide columns. If an index tuple exceeds the
123 maximum size allowed for the index type, data insertion will fail.
124 In any case, non-key columns duplicate data from the index's table
125 and bloat the size of the index, thus potentially slowing searches.
126 And remember that there is little point in including payload columns in an
127 index unless the table changes slowly enough that an index-only scan is
128 likely to not need to access the heap. If the heap tuple must be visited
129 anyway, it costs nothing more to get the column's value from there.
130 Other restrictions are that expressions are not currently supported as
131 included columns, and that only B-tree, GiST and SP-GiST indexes currently
132 support included columns.
134 Before <span class="productname">PostgreSQL</span> had
135 the <code class="literal">INCLUDE</code> feature, people sometimes made covering
136 indexes by writing the payload columns as ordinary index columns,
138 </p><pre class="programlisting">
139 CREATE INDEX tab_x_y ON tab(x, y);
141 even though they had no intention of ever using <code class="literal">y</code> as
142 part of a <code class="literal">WHERE</code> clause. This works fine as long as
143 the extra columns are trailing columns; making them be leading columns is
144 unwise for the reasons explained in <a class="xref" href="indexes-multicolumn.html" title="11.3. Multicolumn Indexes">Section 11.3</a>.
145 However, this method doesn't support the case where you want the index to
146 enforce uniqueness on the key column(s).
148 <em class="firstterm">Suffix truncation</em> always removes non-key
149 columns from upper B-Tree levels. As payload columns, they are
150 never used to guide index scans. The truncation process also
151 removes one or more trailing key column(s) when the remaining
152 prefix of key column(s) happens to be sufficient to describe tuples
153 on the lowest B-Tree level. In practice, covering indexes without
154 an <code class="literal">INCLUDE</code> clause often avoid storing columns
155 that are effectively payload in the upper levels. However,
156 explicitly defining payload columns as non-key columns
157 <span class="emphasis"><em>reliably</em></span> keeps the tuples in upper levels
160 In principle, index-only scans can be used with expression indexes.
161 For example, given an index on <code class="literal">f(x)</code>
162 where <code class="literal">x</code> is a table column, it should be possible to
164 </p><pre class="programlisting">
165 SELECT f(x) FROM tab WHERE f(x) < 1;
167 as an index-only scan; and this is very attractive
168 if <code class="literal">f()</code> is an expensive-to-compute function.
169 However, <span class="productname">PostgreSQL</span>'s planner is currently not
170 very smart about such cases. It considers a query to be potentially
171 executable by index-only scan only when all <span class="emphasis"><em>columns</em></span>
172 needed by the query are available from the index. In this
173 example, <code class="literal">x</code> is not needed except in the
174 context <code class="literal">f(x)</code>, but the planner does not notice that and
175 concludes that an index-only scan is not possible. If an index-only scan
176 seems sufficiently worthwhile, this can be worked around by
177 adding <code class="literal">x</code> as an included column, for example
178 </p><pre class="programlisting">
179 CREATE INDEX tab_f_x ON tab (f(x)) INCLUDE (x);
181 An additional caveat, if the goal is to avoid
182 recalculating <code class="literal">f(x)</code>, is that the planner won't
183 necessarily match uses of <code class="literal">f(x)</code> that aren't in
184 indexable <code class="literal">WHERE</code> clauses to the index column. It will
185 usually get this right in simple queries such as shown above, but not in
186 queries that involve joins. These deficiencies may be remedied in future
187 versions of <span class="productname">PostgreSQL</span>.
189 Partial indexes also have interesting interactions with index-only scans.
190 Consider the partial index shown in <a class="xref" href="indexes-partial.html#INDEXES-PARTIAL-EX3" title="Example 11.3. Setting up a Partial Unique Index">Example 11.3</a>:
191 </p><pre class="programlisting">
192 CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
195 In principle, we could do an index-only scan on this index to satisfy a
197 </p><pre class="programlisting">
198 SELECT target FROM tests WHERE subject = 'some-subject' AND success;
200 But there's a problem: the <code class="literal">WHERE</code> clause refers
201 to <code class="literal">success</code> which is not available as a result column
202 of the index. Nonetheless, an index-only scan is possible because the
203 plan does not need to recheck that part of the <code class="literal">WHERE</code>
204 clause at run time: all entries found in the index necessarily
205 have <code class="literal">success = true</code> so this need not be explicitly
206 checked in the plan. <span class="productname">PostgreSQL</span> versions 9.6
207 and later will recognize such cases and allow index-only scans to be
208 generated, but older versions will not.
209 </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="indexes-partial.html" title="11.8. Partial Indexes">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="indexes.html" title="Chapter 11. Indexes">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="indexes-opclass.html" title="11.10. Operator Classes and Operator Families">Next</a></td></tr><tr><td width="40%" align="left" valign="top">11.8. Partial Indexes </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 18.0 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 11.10. Operator Classes and Operator Families</td></tr></table></div></body></html>