2 F.23. pageinspect — low-level inspection of database pages #
4 F.23.1. General Functions
6 F.23.3. B-Tree Functions
10 F.23.7. Hash Functions
12 The pageinspect module provides functions that allow you to inspect the
13 contents of database pages at a low level, which is useful for
14 debugging purposes. All of these functions may be used only by
17 F.23.1. General Functions #
19 get_raw_page(relname text, fork text, blkno bigint) returns bytea
20 get_raw_page reads the specified block of the named relation and
21 returns a copy as a bytea value. This allows a single
22 time-consistent copy of the block to be obtained. fork should be
23 'main' for the main data fork, 'fsm' for the free space map,
24 'vm' for the visibility map, or 'init' for the initialization
27 get_raw_page(relname text, blkno bigint) returns bytea
28 A shorthand version of get_raw_page, for reading from the main
29 fork. Equivalent to get_raw_page(relname, 'main', blkno)
31 page_header(page bytea) returns record
32 page_header shows fields that are common to all PostgreSQL heap
35 A page image obtained with get_raw_page should be passed as
36 argument. For example:
38 test=# SELECT * FROM page_header(get_raw_page('pg_class', 0));
39 lsn | checksum | flags | lower | upper | special | pagesize | version |
41 -----------+----------+--------+-------+-------+---------+----------+---------+-
43 0/24A1B50 | 0 | 1 | 232 | 368 | 8192 | 8192 | 4 |
46 The returned columns correspond to the fields in the
47 PageHeaderData struct. See src/include/storage/bufpage.h for
50 The checksum field is the checksum stored in the page, which
51 might be incorrect if the page is somehow corrupted. If data
52 checksums are disabled for this instance, then the value stored
55 page_checksum(page bytea, blkno bigint) returns smallint
56 page_checksum computes the checksum for the page, as if it was
57 located at the given block.
59 A page image obtained with get_raw_page should be passed as
60 argument. For example:
62 test=# SELECT page_checksum(get_raw_page('pg_class', 0), 0);
67 Note that the checksum depends on the block number, so matching
68 block numbers should be passed (except when doing esoteric
71 The checksum computed with this function can be compared with
72 the checksum result field of the function page_header. If data
73 checksums are enabled for this instance, then the two values
76 fsm_page_contents(page bytea) returns text
77 fsm_page_contents shows the internal node structure of an FSM
80 test=# SELECT fsm_page_contents(get_raw_page('pg_class', 'fsm', 0));
82 The output is a multiline string, with one line per node in the
83 binary tree within the page. Only those nodes that are not zero
84 are printed. The so-called "next" pointer, which points to the
85 next slot to be returned from the page, is also printed.
87 See src/backend/storage/freespace/README for more information on
88 the structure of an FSM page.
90 F.23.2. Heap Functions #
92 heap_page_items(page bytea) returns setof record
93 heap_page_items shows all line pointers on a heap page. For
94 those line pointers that are in use, tuple headers as well as
95 tuple raw data are also shown. All tuples are shown, whether or
96 not the tuples were visible to an MVCC snapshot at the time the
99 A heap page image obtained with get_raw_page should be passed as
100 argument. For example:
102 test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
104 See src/include/storage/itemid.h and
105 src/include/access/htup_details.h for explanations of the fields
108 The heap_tuple_infomask_flags function can be used to unpack the
109 flag bits of t_infomask and t_infomask2 for heap tuples.
111 tuple_data_split(rel_oid oid, t_data bytea, t_infomask integer,
112 t_infomask2 integer, t_bits text [, do_detoast bool]) returns
114 tuple_data_split splits tuple data into attributes in the same
115 way as backend internals.
117 test=# SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask, t_infom
118 ask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
120 This function should be called with the same arguments as the
121 return attributes of heap_page_items.
123 If do_detoast is true, attributes will be detoasted as needed.
124 Default value is false.
126 heap_page_item_attrs(page bytea, rel_oid regclass [, do_detoast bool])
128 heap_page_item_attrs is equivalent to heap_page_items except
129 that it returns tuple raw data as an array of attributes that
130 can optionally be detoasted by do_detoast which is false by
133 A heap page image obtained with get_raw_page should be passed as
134 argument. For example:
136 test=# SELECT * FROM heap_page_item_attrs(get_raw_page('pg_class', 0), 'pg_class
139 heap_tuple_infomask_flags(t_infomask integer, t_infomask2 integer)
141 heap_tuple_infomask_flags decodes the t_infomask and t_infomask2
142 returned by heap_page_items into a human-readable set of arrays
143 made of flag names, with one column for all the flags and one
144 column for combined flags. For example:
146 test=# SELECT t_ctid, raw_flags, combined_flags
147 FROM heap_page_items(get_raw_page('pg_class', 0)),
148 LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2)
149 WHERE t_infomask IS NOT NULL OR t_infomask2 IS NOT NULL;
151 This function should be called with the same arguments as the
152 return attributes of heap_page_items.
154 Combined flags are displayed for source-level macros that take
155 into account the value of more than one raw bit, such as
158 See src/include/access/htup_details.h for explanations of the
161 F.23.3. B-Tree Functions #
163 bt_metap(relname text) returns record
164 bt_metap returns information about a B-tree index's metapage.
167 test=# SELECT * FROM bt_metap('pg_cast_oid_index');
168 -[ RECORD 1 ]-------------+-------
175 last_cleanup_num_delpages | 0
176 last_cleanup_num_tuples | 230
179 bt_page_stats(relname text, blkno bigint) returns record
180 bt_page_stats returns summary information about a data page of a
181 B-tree index. For example:
183 test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
197 bt_multi_page_stats(relname text, blkno bigint, blk_count bigint)
199 bt_multi_page_stats returns the same information as
200 bt_page_stats, but does so for each page of the range of pages
201 beginning at blkno and extending for blk_count pages. If
202 blk_count is negative, all pages from blkno to the end of the
203 index are reported on. For example:
205 test=# SELECT * FROM bt_multi_page_stats('pg_proc_oid_index', 5, 2);
231 bt_page_items(relname text, blkno bigint) returns setof record
232 bt_page_items returns detailed information about all of the
233 items on a B-tree index page. For example:
235 test=# SELECT itemoffset, ctid, itemlen, nulls, vars, data, dead, htid, tids[0:2
237 FROM bt_page_items('tenk2_hundred', 5);
238 itemoffset | ctid | itemlen | nulls | vars | data | dea
240 ------------+-----------+---------+-------+------+-------------------------+----
241 --+--------+---------------------
242 1 | (16,1) | 16 | f | f | 30 00 00 00 00 00 00 00 |
244 2 | (16,8292) | 616 | f | f | 24 00 00 00 00 00 00 00 | f
245 | (1,6) | {"(1,6)","(10,22)"}
246 3 | (16,8292) | 616 | f | f | 25 00 00 00 00 00 00 00 | f
247 | (1,18) | {"(1,18)","(4,22)"}
248 4 | (16,8292) | 616 | f | f | 26 00 00 00 00 00 00 00 | f
249 | (4,18) | {"(4,18)","(6,17)"}
250 5 | (16,8292) | 616 | f | f | 27 00 00 00 00 00 00 00 | f
251 | (1,2) | {"(1,2)","(1,19)"}
252 6 | (16,8292) | 616 | f | f | 28 00 00 00 00 00 00 00 | f
253 | (2,24) | {"(2,24)","(4,11)"}
254 7 | (16,8292) | 616 | f | f | 29 00 00 00 00 00 00 00 | f
255 | (2,17) | {"(2,17)","(11,2)"}
256 8 | (16,8292) | 616 | f | f | 2a 00 00 00 00 00 00 00 | f
257 | (0,25) | {"(0,25)","(3,20)"}
258 9 | (16,8292) | 616 | f | f | 2b 00 00 00 00 00 00 00 | f
259 | (0,10) | {"(0,10)","(0,14)"}
260 10 | (16,8292) | 616 | f | f | 2c 00 00 00 00 00 00 00 | f
261 | (1,3) | {"(1,3)","(3,9)"}
262 11 | (16,8292) | 616 | f | f | 2d 00 00 00 00 00 00 00 | f
263 | (6,28) | {"(6,28)","(11,1)"}
264 12 | (16,8292) | 616 | f | f | 2e 00 00 00 00 00 00 00 | f
265 | (0,27) | {"(0,27)","(1,13)"}
266 13 | (16,8292) | 616 | f | f | 2f 00 00 00 00 00 00 00 | f
267 | (4,17) | {"(4,17)","(4,21)"}
270 This is a B-tree leaf page. All tuples that point to the table
271 happen to be posting list tuples (all of which store a total of
272 100 6 byte TIDs). There is also a “high key” tuple at itemoffset
273 number 1. ctid is used to store encoded information about each
274 tuple in this example, though leaf page tuples often store a
275 heap TID directly in the ctid field instead. tids is the list of
276 TIDs stored as a posting list.
278 In an internal page (not shown), the block number part of ctid
279 is a “downlink”, which is a block number of another page in the
280 index itself. The offset part (the second number) of ctid stores
281 encoded information about the tuple, such as the number of
282 columns present (suffix truncation may have removed unneeded
283 suffix columns). Truncated columns are treated as having the
284 value “minus infinity”.
286 htid shows a heap TID for the tuple, regardless of the
287 underlying tuple representation. This value may match ctid, or
288 may be decoded from the alternative representations used by
289 posting list tuples and tuples from internal pages. Tuples in
290 internal pages usually have the implementation level heap TID
291 column truncated away, which is represented as a NULL htid
294 Note that the first item on any non-rightmost page (any page
295 with a non-zero value in the btpo_next field) is the page's
296 “high key”, meaning its data serves as an upper bound on all
297 items appearing on the page, while its ctid field does not point
298 to another block. Also, on internal pages, the first real data
299 item (the first item that is not a high key) reliably has every
300 column truncated away, leaving no actual value in its data
301 field. Such an item does have a valid downlink in its ctid
304 For more details about the structure of B-tree indexes, see
305 Section 65.1.4.1. For more details about deduplication and
306 posting lists, see Section 65.1.4.3.
308 bt_page_items(page bytea) returns setof record
309 It is also possible to pass a page to bt_page_items as a bytea
310 value. A page image obtained with get_raw_page should be passed
311 as argument. So the last example could also be rewritten like
314 test=# SELECT itemoffset, ctid, itemlen, nulls, vars, data, dead, htid, tids[0:2
316 FROM bt_page_items(get_raw_page('tenk2_hundred', 5));
317 itemoffset | ctid | itemlen | nulls | vars | data | dea
319 ------------+-----------+---------+-------+------+-------------------------+----
320 --+--------+---------------------
321 1 | (16,1) | 16 | f | f | 30 00 00 00 00 00 00 00 |
323 2 | (16,8292) | 616 | f | f | 24 00 00 00 00 00 00 00 | f
324 | (1,6) | {"(1,6)","(10,22)"}
325 3 | (16,8292) | 616 | f | f | 25 00 00 00 00 00 00 00 | f
326 | (1,18) | {"(1,18)","(4,22)"}
327 4 | (16,8292) | 616 | f | f | 26 00 00 00 00 00 00 00 | f
328 | (4,18) | {"(4,18)","(6,17)"}
329 5 | (16,8292) | 616 | f | f | 27 00 00 00 00 00 00 00 | f
330 | (1,2) | {"(1,2)","(1,19)"}
331 6 | (16,8292) | 616 | f | f | 28 00 00 00 00 00 00 00 | f
332 | (2,24) | {"(2,24)","(4,11)"}
333 7 | (16,8292) | 616 | f | f | 29 00 00 00 00 00 00 00 | f
334 | (2,17) | {"(2,17)","(11,2)"}
335 8 | (16,8292) | 616 | f | f | 2a 00 00 00 00 00 00 00 | f
336 | (0,25) | {"(0,25)","(3,20)"}
337 9 | (16,8292) | 616 | f | f | 2b 00 00 00 00 00 00 00 | f
338 | (0,10) | {"(0,10)","(0,14)"}
339 10 | (16,8292) | 616 | f | f | 2c 00 00 00 00 00 00 00 | f
340 | (1,3) | {"(1,3)","(3,9)"}
341 11 | (16,8292) | 616 | f | f | 2d 00 00 00 00 00 00 00 | f
342 | (6,28) | {"(6,28)","(11,1)"}
343 12 | (16,8292) | 616 | f | f | 2e 00 00 00 00 00 00 00 | f
344 | (0,27) | {"(0,27)","(1,13)"}
345 13 | (16,8292) | 616 | f | f | 2f 00 00 00 00 00 00 00 | f
346 | (4,17) | {"(4,17)","(4,21)"}
349 All the other details are the same as explained in the previous
352 F.23.4. BRIN Functions #
354 brin_page_type(page bytea) returns text
355 brin_page_type returns the page type of the given BRIN index
356 page, or throws an error if the page is not a valid BRIN page.
359 test=# SELECT brin_page_type(get_raw_page('brinidx', 0));
364 brin_metapage_info(page bytea) returns record
365 brin_metapage_info returns assorted information about a BRIN
366 index metapage. For example:
368 test=# SELECT * FROM brin_metapage_info(get_raw_page('brinidx', 0));
369 magic | version | pagesperrange | lastrevmappage
370 ------------+---------+---------------+----------------
371 0xA8109CFA | 1 | 4 | 2
373 brin_revmap_data(page bytea) returns setof tid
374 brin_revmap_data returns the list of tuple identifiers in a BRIN
375 index range map page. For example:
377 test=# SELECT * FROM brin_revmap_data(get_raw_page('brinidx', 2)) LIMIT 5;
386 brin_page_items(page bytea, index oid) returns setof record
387 brin_page_items returns the data stored in the BRIN data page.
390 test=# SELECT * FROM brin_page_items(get_raw_page('brinidx', 5),
392 ORDER BY blknum, attnum LIMIT 6;
393 itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | empty | v
395 ------------+--------+--------+----------+----------+-------------+-------+-----
397 137 | 0 | 1 | t | f | f | f |
398 137 | 0 | 2 | f | f | f | f | {1 .
400 138 | 4 | 1 | t | f | f | f |
401 138 | 4 | 2 | f | f | f | f | {89
403 139 | 8 | 1 | t | f | f | f |
404 139 | 8 | 2 | f | f | f | f | {177
407 The returned columns correspond to the fields in the
408 BrinMemTuple and BrinValues structs. See
409 src/include/access/brin_tuple.h for details.
411 F.23.5. GIN Functions #
413 gin_metapage_info(page bytea) returns record
414 gin_metapage_info returns information about a GIN index
415 metapage. For example:
417 test=# SELECT * FROM gin_metapage_info(get_raw_page('gin_index', 0));
418 -[ RECORD 1 ]----+-----------
419 pending_head | 4294967295
420 pending_tail | 4294967295
430 gin_page_opaque_info(page bytea) returns record
431 gin_page_opaque_info returns information about a GIN index
432 opaque area, like the page type. For example:
434 test=# SELECT * FROM gin_page_opaque_info(get_raw_page('gin_index', 2));
435 rightlink | maxoff | flags
436 -----------+--------+------------------------
437 5 | 0 | {data,leaf,compressed}
440 gin_leafpage_items(page bytea) returns setof record
441 gin_leafpage_items returns information about the data stored in
442 a compressed GIN leaf page. For example:
444 test=# SELECT first_tid, nbytes, tids[0:5] AS some_tids
445 FROM gin_leafpage_items(get_raw_page('gin_test_idx', 2));
446 first_tid | nbytes | some_tids
447 -----------+--------+----------------------------------------------------------
448 (8,41) | 244 | {"(8,41)","(8,43)","(8,44)","(8,45)","(8,46)"}
449 (10,45) | 248 | {"(10,45)","(10,46)","(10,47)","(10,48)","(10,49)"}
450 (12,52) | 248 | {"(12,52)","(12,53)","(12,54)","(12,55)","(12,56)"}
451 (14,59) | 320 | {"(14,59)","(14,60)","(14,61)","(14,62)","(14,63)"}
452 (167,16) | 376 | {"(167,16)","(167,17)","(167,18)","(167,19)","(167,20)"}
453 (170,30) | 376 | {"(170,30)","(170,31)","(170,32)","(170,33)","(170,34)"}
454 (173,44) | 197 | {"(173,44)","(173,45)","(173,46)","(173,47)","(173,48)"}
457 F.23.6. GiST Functions #
459 gist_page_opaque_info(page bytea) returns record
460 gist_page_opaque_info returns information from a GiST index
461 page's opaque area, such as the NSN, rightlink and page type.
464 test=# SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
465 lsn | nsn | rightlink | flags
466 -----+-----+-----------+--------
467 0/1 | 0/0 | 1 | {leaf}
470 gist_page_items(page bytea, index_oid regclass) returns setof record
471 gist_page_items returns information about the data stored in a
472 page of a GiST index. For example:
474 test=# SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gis
476 itemoffset | ctid | itemlen | dead | keys
477 ------------+-----------+---------+------+-------------------------------
478 1 | (1,65535) | 40 | f | (p)=("(185,185),(1,1)")
479 2 | (2,65535) | 40 | f | (p)=("(370,370),(186,186)")
480 3 | (3,65535) | 40 | f | (p)=("(555,555),(371,371)")
481 4 | (4,65535) | 40 | f | (p)=("(740,740),(556,556)")
482 5 | (5,65535) | 40 | f | (p)=("(870,870),(741,741)")
483 6 | (6,65535) | 40 | f | (p)=("(1000,1000),(871,871)")
486 gist_page_items_bytea(page bytea) returns setof record
487 Same as gist_page_items, but returns the key data as a raw bytea
488 blob. Since it does not attempt to decode the key, it does not
489 need to know which index is involved. For example:
491 test=# SELECT * FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
492 itemoffset | ctid | itemlen | dead |
494 ------------+-----------+---------+------+--------------------------------------
495 ----------------------------------------------
496 1 | (1,65535) | 40 | f | \x00000100ffff28000000000000c06440000
497 0000000c06440000000000000f03f000000000000f03f
498 2 | (2,65535) | 40 | f | \x00000200ffff28000000000000c07440000
499 0000000c074400000000000e064400000000000e06440
500 3 | (3,65535) | 40 | f | \x00000300ffff28000000000000207f40000
501 0000000207f400000000000d074400000000000d07440
502 4 | (4,65535) | 40 | f | \x00000400ffff28000000000000c08440000
503 0000000c084400000000000307f400000000000307f40
504 5 | (5,65535) | 40 | f | \x00000500ffff28000000000000f08940000
505 0000000f089400000000000c884400000000000c88440
506 6 | (6,65535) | 40 | f | \x00000600ffff28000000000000208f40000
507 0000000208f400000000000f889400000000000f88940
508 7 | (7,65535) | 40 | f | \x00000700ffff28000000000000408f40000
509 0000000408f400000000000288f400000000000288f40
512 F.23.7. Hash Functions #
514 hash_page_type(page bytea) returns text
515 hash_page_type returns page type of the given HASH index page.
518 test=# SELECT hash_page_type(get_raw_page('con_hash_index', 0));
523 hash_page_stats(page bytea) returns setof record
524 hash_page_stats returns information about a bucket or overflow
525 page of a HASH index. For example:
527 test=# SELECT * FROM hash_page_stats(get_raw_page('con_hash_index', 1));
528 -[ RECORD 1 ]---+-----------
533 hasho_prevblkno | 4096
534 hasho_nextblkno | 8474
537 hasho_page_id | 65408
539 hash_page_items(page bytea) returns setof record
540 hash_page_items returns information about the data stored in a
541 bucket or overflow page of a HASH index page. For example:
543 test=# SELECT * FROM hash_page_items(get_raw_page('con_hash_index', 1)) LIMIT 5;
544 itemoffset | ctid | data
545 ------------+-----------+------------
546 1 | (899,77) | 1053474816
547 2 | (897,29) | 1053474816
548 3 | (894,207) | 1053474816
549 4 | (892,159) | 1053474816
550 5 | (890,111) | 1053474816
552 hash_bitmap_info(index oid, blkno bigint) returns record
553 hash_bitmap_info shows the status of a bit in the bitmap page
554 for a particular overflow page of HASH index. For example:
556 test=# SELECT * FROM hash_bitmap_info('con_hash_index', 2052);
557 bitmapblkno | bitmapbit | bitstatus
558 -------------+-----------+-----------
561 hash_metapage_info(page bytea) returns record
562 hash_metapage_info returns information stored in the meta page
563 of a HASH index. For example:
565 test=# SELECT magic, version, ntuples, ffactor, bsize, bmsize, bmshift,
566 test-# maxbucket, highmask, lowmask, ovflpoint, firstfree, nmaps, procid,
567 test-# regexp_replace(spares::text, '(,0)*}', '}') as spares,
568 test-# regexp_replace(mapp::text, '(,0)*}', '}') as mapp
569 test-# FROM hash_metapage_info(get_raw_page('con_hash_index', 0));
570 -[ RECORD 1 ]------------------------------------------------------------------
586 spares | {0,0,0,0,0,0,1,1,1,1,1,1,1,1,3,4,4,4,45,55,58,59,508,567,628,704,11