5 65.5.2. Built-in Operator Classes
10 BRIN stands for Block Range Index. BRIN is designed for handling very
11 large tables in which certain columns have some natural correlation
12 with their physical location within the table.
14 BRIN works in terms of block ranges (or “page ranges”). A block range
15 is a group of pages that are physically adjacent in the table; for each
16 block range, some summary info is stored by the index. For example, a
17 table storing a store's sale orders might have a date column on which
18 each order was placed, and most of the time the entries for earlier
19 orders will appear earlier in the table as well; a table storing a ZIP
20 code column might have all codes for a city grouped together naturally.
22 BRIN indexes can satisfy queries via regular bitmap index scans, and
23 will return all tuples in all pages within each range if the summary
24 info stored by the index is consistent with the query conditions. The
25 query executor is in charge of rechecking these tuples and discarding
26 those that do not match the query conditions — in other words, these
27 indexes are lossy. Because a BRIN index is very small, scanning the
28 index adds little overhead compared to a sequential scan, but may avoid
29 scanning large parts of the table that are known not to contain
32 The specific data that a BRIN index will store, as well as the specific
33 queries that the index will be able to satisfy, depend on the operator
34 class selected for each column of the index. Data types having a linear
35 sort order can have operator classes that store the minimum and maximum
36 value within each block range, for instance; geometrical types might
37 store the bounding box for all the objects in the block range.
39 The size of the block range is determined at index creation time by the
40 pages_per_range storage parameter. The number of index entries will be
41 equal to the size of the relation in pages divided by the selected
42 value for pages_per_range. Therefore, the smaller the number, the
43 larger the index becomes (because of the need to store more index
44 entries), but at the same time the summary data stored can be more
45 precise and more data blocks can be skipped during an index scan.
47 65.5.1.1. Index Maintenance #
49 At the time of creation, all existing heap pages are scanned and a
50 summary index tuple is created for each range, including the
51 possibly-incomplete range at the end. As new pages are filled with
52 data, page ranges that are already summarized will cause the summary
53 information to be updated with data from the new tuples. When a new
54 page is created that does not fall within the last summarized range,
55 the range that the new page belongs to does not automatically acquire a
56 summary tuple; those tuples remain unsummarized until a summarization
57 run is invoked later, creating the initial summary for that range.
59 There are several ways to trigger the initial summarization of a page
60 range. If the table is vacuumed, either manually or by autovacuum, all
61 existing unsummarized page ranges are summarized. Also, if the index's
62 autosummarize parameter is enabled, which it isn't by default, whenever
63 autovacuum runs in that database, summarization will occur for all
64 unsummarized page ranges that have been filled, regardless of whether
65 the table itself is processed by autovacuum; see below.
67 Lastly, the following functions can be used (while these functions run,
68 search_path is temporarily changed to pg_catalog, pg_temp):
69 brin_summarize_new_values(regclass) which summarizes all unsummarized
71 brin_summarize_range(regclass, bigint) which summarizes only the range
72 containing the given page, if it is unsummarized.
74 When autosummarization is enabled, a request is sent to autovacuum to
75 execute a targeted summarization for a block range when an insertion is
76 detected for the first item of the first page of the next block range,
77 to be fulfilled the next time an autovacuum worker finishes running in
78 the same database. If the request queue is full, the request is not
79 recorded and a message is sent to the server log:
80 LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
83 When this happens, the range will remain unsummarized until the next
84 regular vacuum run on the table, or one of the functions mentioned
87 Conversely, a range can be de-summarized using the
88 brin_desummarize_range(regclass, bigint) function, which is useful when
89 the index tuple is no longer a very good representation because the
90 existing values have changed. See Section 9.28.8 for details.
92 65.5.2. Built-in Operator Classes #
94 The core PostgreSQL distribution includes the BRIN operator classes
97 The minmax operator classes store the minimum and the maximum values
98 appearing in the indexed column within the range. The inclusion
99 operator classes store a value which includes the values in the indexed
100 column within the range. The bloom operator classes build a Bloom
101 filter for all values in the range. The minmax-multi operator classes
102 store multiple minimum and maximum values, representing values
103 appearing in the indexed column within the range.
105 Table 65.4. Built-in BRIN Operator Classes
106 Name Indexable Operators
107 bit_minmax_ops = (bit,bit)
112 box_inclusion_ops @> (box,point)
125 bpchar_bloom_ops = (character,character)
126 bpchar_minmax_ops = (character,character)
127 < (character,character)
128 <= (character,character)
129 > (character,character)
130 >= (character,character)
131 bytea_bloom_ops = (bytea,bytea)
132 bytea_minmax_ops = (bytea,bytea)
137 char_bloom_ops = ("char","char")
138 char_minmax_ops = ("char","char")
143 date_bloom_ops = (date,date)
144 date_minmax_ops = (date,date)
149 date_minmax_multi_ops = (date,date)
154 float4_bloom_ops = (float4,float4)
155 float4_minmax_ops = (float4,float4)
160 float4_minmax_multi_ops = (float4,float4)
165 float8_bloom_ops = (float8,float8)
166 float8_minmax_ops = (float8,float8)
171 float8_minmax_multi_ops = (float8,float8)
176 inet_inclusion_ops << (inet,inet)
182 inet_bloom_ops = (inet,inet)
183 inet_minmax_ops = (inet,inet)
188 inet_minmax_multi_ops = (inet,inet)
193 int2_bloom_ops = (int2,int2)
194 int2_minmax_ops = (int2,int2)
199 int2_minmax_multi_ops = (int2,int2)
204 int4_bloom_ops = (int4,int4)
205 int4_minmax_ops = (int4,int4)
210 int4_minmax_multi_ops = (int4,int4)
215 int8_bloom_ops = (bigint,bigint)
216 int8_minmax_ops = (bigint,bigint)
221 int8_minmax_multi_ops = (bigint,bigint)
226 interval_bloom_ops = (interval,interval)
227 interval_minmax_ops = (interval,interval)
228 < (interval,interval)
229 <= (interval,interval)
230 > (interval,interval)
231 >= (interval,interval)
232 interval_minmax_multi_ops = (interval,interval)
233 < (interval,interval)
234 <= (interval,interval)
235 > (interval,interval)
236 >= (interval,interval)
237 macaddr_bloom_ops = (macaddr,macaddr)
238 macaddr_minmax_ops = (macaddr,macaddr)
243 macaddr_minmax_multi_ops = (macaddr,macaddr)
248 macaddr8_bloom_ops = (macaddr8,macaddr8)
249 macaddr8_minmax_ops = (macaddr8,macaddr8)
250 < (macaddr8,macaddr8)
251 <= (macaddr8,macaddr8)
252 > (macaddr8,macaddr8)
253 >= (macaddr8,macaddr8)
254 macaddr8_minmax_multi_ops = (macaddr8,macaddr8)
255 < (macaddr8,macaddr8)
256 <= (macaddr8,macaddr8)
257 > (macaddr8,macaddr8)
258 >= (macaddr8,macaddr8)
259 name_bloom_ops = (name,name)
260 name_minmax_ops = (name,name)
265 numeric_bloom_ops = (numeric,numeric)
266 numeric_minmax_ops = (numeric,numeric)
271 numeric_minmax_multi_ops = (numeric,numeric)
276 oid_bloom_ops = (oid,oid)
277 oid_minmax_ops = (oid,oid)
282 oid_minmax_multi_ops = (oid,oid)
287 pg_lsn_bloom_ops = (pg_lsn,pg_lsn)
288 pg_lsn_minmax_ops = (pg_lsn,pg_lsn)
293 pg_lsn_minmax_multi_ops = (pg_lsn,pg_lsn)
298 range_inclusion_ops = (anyrange,anyrange)
299 < (anyrange,anyrange)
300 <= (anyrange,anyrange)
301 >= (anyrange,anyrange)
302 > (anyrange,anyrange)
303 && (anyrange,anyrange)
304 @> (anyrange,anyelement)
305 @> (anyrange,anyrange)
306 <@ (anyrange,anyrange)
307 << (anyrange,anyrange)
308 >> (anyrange,anyrange)
309 &< (anyrange,anyrange)
310 &> (anyrange,anyrange)
311 -|- (anyrange,anyrange)
312 text_bloom_ops = (text,text)
313 text_minmax_ops = (text,text)
318 tid_bloom_ops = (tid,tid)
319 tid_minmax_ops = (tid,tid)
324 tid_minmax_multi_ops = (tid,tid)
329 timestamp_bloom_ops = (timestamp,timestamp)
330 timestamp_minmax_ops = (timestamp,timestamp)
331 < (timestamp,timestamp)
332 <= (timestamp,timestamp)
333 > (timestamp,timestamp)
334 >= (timestamp,timestamp)
335 timestamp_minmax_multi_ops = (timestamp,timestamp)
336 < (timestamp,timestamp)
337 <= (timestamp,timestamp)
338 > (timestamp,timestamp)
339 >= (timestamp,timestamp)
340 timestamptz_bloom_ops = (timestamptz,timestamptz)
341 timestamptz_minmax_ops = (timestamptz,timestamptz)
342 < (timestamptz,timestamptz)
343 <= (timestamptz,timestamptz)
344 > (timestamptz,timestamptz)
345 >= (timestamptz,timestamptz)
346 timestamptz_minmax_multi_ops = (timestamptz,timestamptz)
347 < (timestamptz,timestamptz)
348 <= (timestamptz,timestamptz)
349 > (timestamptz,timestamptz)
350 >= (timestamptz,timestamptz)
351 time_bloom_ops = (time,time)
352 time_minmax_ops = (time,time)
357 time_minmax_multi_ops = (time,time)
362 timetz_bloom_ops = (timetz,timetz)
363 timetz_minmax_ops = (timetz,timetz)
368 timetz_minmax_multi_ops = (timetz,timetz)
373 uuid_bloom_ops = (uuid,uuid)
374 uuid_minmax_ops = (uuid,uuid)
379 uuid_minmax_multi_ops = (uuid,uuid)
384 varbit_minmax_ops = (varbit,varbit)
390 65.5.2.1. Operator Class Parameters #
392 Some of the built-in operator classes allow specifying parameters
393 affecting behavior of the operator class. Each operator class has its
394 own set of allowed parameters. Only the bloom and minmax-multi operator
395 classes allow specifying parameters:
397 bloom operator classes accept these parameters:
400 Defines the estimated number of distinct non-null values in the
401 block range, used by BRIN bloom indexes for sizing of the Bloom
402 filter. It behaves similarly to n_distinct option for ALTER
403 TABLE. When set to a positive value, each block range is assumed
404 to contain this number of distinct non-null values. When set to
405 a negative value, which must be greater than or equal to -1, the
406 number of distinct non-null values is assumed to grow linearly
407 with the maximum possible number of tuples in the block range
408 (about 290 rows per block). The default value is -0.1, and the
409 minimum number of distinct non-null values is 16.
412 Defines the desired false positive rate used by BRIN bloom
413 indexes for sizing of the Bloom filter. The values must be
414 between 0.0001 and 0.25. The default value is 0.01, which is 1%
417 minmax-multi operator classes accept these parameters:
420 Defines the maximum number of values stored by BRIN minmax
421 indexes to summarize a block range. Each value may represent
422 either a point, or a boundary of an interval. Values must be
423 between 8 and 256, and the default value is 32.
425 65.5.3. Extensibility #
427 The BRIN interface has a high level of abstraction, requiring the
428 access method implementer only to implement the semantics of the data
429 type being accessed. The BRIN layer itself takes care of concurrency,
430 logging and searching the index structure.
432 All it takes to get a BRIN access method working is to implement a few
433 user-defined methods, which define the behavior of summary values
434 stored in the index and the way they interact with scan keys. In short,
435 BRIN combines extensibility with generality, code reuse, and a clean
438 There are four methods that an operator class for BRIN must provide:
440 BrinOpcInfo *opcInfo(Oid type_oid)
441 Returns internal information about the indexed columns' summary
442 data. The return value must point to a palloc'd BrinOpcInfo,
443 which has this definition:
445 typedef struct BrinOpcInfo
447 /* Number of columns stored in an index column of this opclass */
450 /* Opaque pointer for the opclass' private use */
453 /* Type cache entries of the stored columns */
454 TypeCacheEntry *oi_typcache[FLEXIBLE_ARRAY_MEMBER];
457 BrinOpcInfo.oi_opaque can be used by the operator class routines
458 to pass information between support functions during an index
461 bool consistent(BrinDesc *bdesc, BrinValues *column, ScanKey *keys, int
463 Returns whether all the ScanKey entries are consistent with the
464 given indexed values for a range. The attribute number to use is
465 passed as part of the scan key. Multiple scan keys for the same
466 attribute may be passed at once; the number of entries is
467 determined by the nkeys parameter.
469 bool consistent(BrinDesc *bdesc, BrinValues *column, ScanKey key)
470 Returns whether the ScanKey is consistent with the given indexed
471 values for a range. The attribute number to use is passed as
472 part of the scan key. This is an older backward-compatible
473 variant of the consistent function.
475 bool addValue(BrinDesc *bdesc, BrinValues *column, Datum newval, bool
477 Given an index tuple and an indexed value, modifies the
478 indicated attribute of the tuple so that it additionally
479 represents the new value. If any modification was done to the
480 tuple, true is returned.
482 bool unionTuples(BrinDesc *bdesc, BrinValues *a, BrinValues *b)
483 Consolidates two index tuples. Given two index tuples, modifies
484 the indicated attribute of the first of them so that it
485 represents both tuples. The second tuple is not modified.
487 An operator class for BRIN can optionally specify the following method:
489 void options(local_relopts *relopts)
490 Defines a set of user-visible parameters that control operator
493 The options function is passed a pointer to a local_relopts
494 struct, which needs to be filled with a set of operator class
495 specific options. The options can be accessed from other support
496 functions using the PG_HAS_OPCLASS_OPTIONS() and
497 PG_GET_OPCLASS_OPTIONS() macros.
499 Since both key extraction of indexed values and representation
500 of the key in BRIN are flexible, they may depend on
501 user-specified parameters.
503 The core distribution includes support for four types of operator
504 classes: minmax, minmax-multi, inclusion and bloom. Operator class
505 definitions using them are shipped for in-core data types as
506 appropriate. Additional operator classes can be defined by the user for
507 other data types using equivalent definitions, without having to write
508 any source code; appropriate catalog entries being declared is enough.
509 Note that assumptions about the semantics of operator strategies are
510 embedded in the support functions' source code.
512 Operator classes that implement completely different semantics are also
513 possible, provided implementations of the four main support functions
514 described above are written. Note that backwards compatibility across
515 major releases is not guaranteed: for example, additional support
516 functions might be required in later releases.
518 To write an operator class for a data type that implements a totally
519 ordered set, it is possible to use the minmax support functions
520 alongside the corresponding operators, as shown in Table 65.5. All
521 operator class members (functions and operators) are mandatory.
523 Table 65.5. Function and Support Numbers for Minmax Operator Classes
524 Operator class member Object
525 Support Function 1 internal function brin_minmax_opcinfo()
526 Support Function 2 internal function brin_minmax_add_value()
527 Support Function 3 internal function brin_minmax_consistent()
528 Support Function 4 internal function brin_minmax_union()
529 Operator Strategy 1 operator less-than
530 Operator Strategy 2 operator less-than-or-equal-to
531 Operator Strategy 3 operator equal-to
532 Operator Strategy 4 operator greater-than-or-equal-to
533 Operator Strategy 5 operator greater-than
535 To write an operator class for a complex data type which has values
536 included within another type, it's possible to use the inclusion
537 support functions alongside the corresponding operators, as shown in
538 Table 65.6. It requires only a single additional function, which can be
539 written in any language. More functions can be defined for additional
540 functionality. All operators are optional. Some operators require other
541 operators, as shown as dependencies on the table.
543 Table 65.6. Function and Support Numbers for Inclusion Operator Classes
544 Operator class member Object Dependency
545 Support Function 1 internal function brin_inclusion_opcinfo()
546 Support Function 2 internal function brin_inclusion_add_value()
547 Support Function 3 internal function brin_inclusion_consistent()
548 Support Function 4 internal function brin_inclusion_union()
549 Support Function 11 function to merge two elements
550 Support Function 12 optional function to check whether two elements are
552 Support Function 13 optional function to check if an element is
553 contained within another
554 Support Function 14 optional function to check whether an element is
556 Operator Strategy 1 operator left-of Operator Strategy 4
557 Operator Strategy 2 operator does-not-extend-to-the-right-of Operator
559 Operator Strategy 3 operator overlaps
560 Operator Strategy 4 operator does-not-extend-to-the-left-of Operator
562 Operator Strategy 5 operator right-of Operator Strategy 2
563 Operator Strategy 6, 18 operator same-as-or-equal-to Operator Strategy
565 Operator Strategy 7, 16, 24, 25 operator contains-or-equal-to
566 Operator Strategy 8, 26, 27 operator is-contained-by-or-equal-to
568 Operator Strategy 9 operator does-not-extend-above Operator Strategy 11
569 Operator Strategy 10 operator is-below Operator Strategy 12
570 Operator Strategy 11 operator is-above Operator Strategy 9
571 Operator Strategy 12 operator does-not-extend-below Operator Strategy
573 Operator Strategy 20 operator less-than Operator Strategy 5
574 Operator Strategy 21 operator less-than-or-equal-to Operator Strategy 5
575 Operator Strategy 22 operator greater-than Operator Strategy 1
576 Operator Strategy 23 operator greater-than-or-equal-to Operator
579 Support function numbers 1 through 10 are reserved for the BRIN
580 internal functions, so the SQL level functions start with number 11.
581 Support function number 11 is the main function required to build the
582 index. It should accept two arguments with the same data type as the
583 operator class, and return the union of them. The inclusion operator
584 class can store union values with different data types if it is defined
585 with the STORAGE parameter. The return value of the union function
586 should match the STORAGE data type.
588 Support function numbers 12 and 14 are provided to support
589 irregularities of built-in data types. Function number 12 is used to
590 support network addresses from different families which are not
591 mergeable. Function number 14 is used to support empty ranges. Function
592 number 13 is an optional but recommended one, which allows the new
593 value to be checked before it is passed to the union function. As the
594 BRIN framework can shortcut some operations when the union is not
595 changed, using this function can improve index performance.
597 To write an operator class for a data type that implements only an
598 equality operator and supports hashing, it is possible to use the bloom
599 support procedures alongside the corresponding operators, as shown in
600 Table 65.7. All operator class members (procedures and operators) are
603 Table 65.7. Procedure and Support Numbers for Bloom Operator Classes
604 Operator class member Object
605 Support Procedure 1 internal function brin_bloom_opcinfo()
606 Support Procedure 2 internal function brin_bloom_add_value()
607 Support Procedure 3 internal function brin_bloom_consistent()
608 Support Procedure 4 internal function brin_bloom_union()
609 Support Procedure 5 internal function brin_bloom_options()
610 Support Procedure 11 function to compute hash of an element
611 Operator Strategy 1 operator equal-to
613 Support procedure numbers 1-10 are reserved for the BRIN internal
614 functions, so the SQL level functions start with number 11. Support
615 function number 11 is the main function required to build the index. It
616 should accept one argument with the same data type as the operator
617 class, and return a hash of the value.
619 The minmax-multi operator class is also intended for data types
620 implementing a totally ordered set, and may be seen as a simple
621 extension of the minmax operator class. While minmax operator class
622 summarizes values from each block range into a single contiguous
623 interval, minmax-multi allows summarization into multiple smaller
624 intervals to improve handling of outlier values. It is possible to use
625 the minmax-multi support procedures alongside the corresponding
626 operators, as shown in Table 65.8. All operator class members
627 (procedures and operators) are mandatory.
629 Table 65.8. Procedure and Support Numbers for minmax-multi Operator
631 Operator class member Object
632 Support Procedure 1 internal function brin_minmax_multi_opcinfo()
633 Support Procedure 2 internal function brin_minmax_multi_add_value()
634 Support Procedure 3 internal function brin_minmax_multi_consistent()
635 Support Procedure 4 internal function brin_minmax_multi_union()
636 Support Procedure 5 internal function brin_minmax_multi_options()
637 Support Procedure 11 function to compute distance between two values
639 Operator Strategy 1 operator less-than
640 Operator Strategy 2 operator less-than-or-equal-to
641 Operator Strategy 3 operator equal-to
642 Operator Strategy 4 operator greater-than-or-equal-to
643 Operator Strategy 5 operator greater-than
645 Both minmax and inclusion operator classes support cross-data-type
646 operators, though with these the dependencies become more complicated.
647 The minmax operator class requires a full set of operators to be
648 defined with both arguments having the same data type. It allows
649 additional data types to be supported by defining extra sets of
650 operators. Inclusion operator class operator strategies are dependent
651 on another operator strategy as shown in Table 65.6, or the same
652 operator strategy as themselves. They require the dependency operator
653 to be defined with the STORAGE data type as the left-hand-side argument
654 and the other supported data type to be the right-hand-side argument of
655 the supported operator. See float4_minmax_ops as an example of minmax,
656 and box_inclusion_ops as an example of inclusion.