2 65.3. SP-GiST Indexes #
5 65.3.2. Built-in Operator Classes
10 65.3.1. Introduction #
12 SP-GiST is an abbreviation for space-partitioned GiST. SP-GiST supports
13 partitioned search trees, which facilitate development of a wide range
14 of different non-balanced data structures, such as quad-trees, k-d
15 trees, and radix trees (tries). The common feature of these structures
16 is that they repeatedly divide the search space into partitions that
17 need not be of equal size. Searches that are well matched to the
18 partitioning rule can be very fast.
20 These popular data structures were originally developed for in-memory
21 usage. In main memory, they are usually designed as a set of
22 dynamically allocated nodes linked by pointers. This is not suitable
23 for direct storing on disk, since these chains of pointers can be
24 rather long which would require too many disk accesses. In contrast,
25 disk-based data structures should have a high fanout to minimize I/O.
26 The challenge addressed by SP-GiST is to map search tree nodes to disk
27 pages in such a way that a search need access only a few disk pages,
28 even if it traverses many nodes.
30 Like GiST, SP-GiST is meant to allow the development of custom data
31 types with the appropriate access methods, by an expert in the domain
32 of the data type, rather than a database expert.
34 Some of the information here is derived from Purdue University's
35 SP-GiST Indexing Project web site. The SP-GiST implementation in
36 PostgreSQL is primarily maintained by Teodor Sigaev and Oleg Bartunov,
37 and there is more information on their web site.
39 65.3.2. Built-in Operator Classes #
41 The core PostgreSQL distribution includes the SP-GiST operator classes
44 Table 65.2. Built-in SP-GiST Operator Classes
45 Name Indexable Operators Ordering Operators
46 box_ops << (box,box) <-> (box,point)
58 inet_ops << (inet,inet)
69 kd_point_ops |>> (point,point) <-> (point,point)
75 poly_ops << (polygon,polygon) <-> (polygon,point)
87 quad_point_ops |>> (point,point) <-> (point,point)
93 range_ops = (anyrange,anyrange)
94 && (anyrange,anyrange)
95 @> (anyrange,anyelement)
96 @> (anyrange,anyrange)
97 <@ (anyrange,anyrange)
98 << (anyrange,anyrange)
99 >> (anyrange,anyrange)
100 &< (anyrange,anyrange)
101 &> (anyrange,anyrange)
102 -|- (anyrange,anyrange)
103 text_ops = (text,text)
114 Of the two operator classes for type point, quad_point_ops is the
115 default. kd_point_ops supports the same operators but uses a different
116 index data structure that may offer better performance in some
119 The quad_point_ops, kd_point_ops and poly_ops operator classes support
120 the <-> ordering operator, which enables the k-nearest neighbor (k-NN)
121 search over indexed point or polygon data sets.
123 65.3.3. Extensibility #
125 SP-GiST offers an interface with a high level of abstraction, requiring
126 the access method developer to implement only methods specific to a
127 given data type. The SP-GiST core is responsible for efficient disk
128 mapping and searching the tree structure. It also takes care of
129 concurrency and logging considerations.
131 Leaf tuples of an SP-GiST tree usually contain values of the same data
132 type as the indexed column, although it is also possible for them to
133 contain lossy representations of the indexed column. Leaf tuples stored
134 at the root level will directly represent the original indexed data
135 value, but leaf tuples at lower levels might contain only a partial
136 value, such as a suffix. In that case the operator class support
137 functions must be able to reconstruct the original value using
138 information accumulated from the inner tuples that are passed through
139 to reach the leaf level.
141 When an SP-GiST index is created with INCLUDE columns, the values of
142 those columns are also stored in leaf tuples. The INCLUDE columns are
143 of no concern to the SP-GiST operator class, so they are not discussed
146 Inner tuples are more complex, since they are branching points in the
147 search tree. Each inner tuple contains a set of one or more nodes,
148 which represent groups of similar leaf values. A node contains a
149 downlink that leads either to another, lower-level inner tuple, or to a
150 short list of leaf tuples that all lie on the same index page. Each
151 node normally has a label that describes it; for example, in a radix
152 tree the node label could be the next character of the string value.
153 (Alternatively, an operator class can omit the node labels, if it works
154 with a fixed set of nodes for all inner tuples; see Section 65.3.4.2.)
155 Optionally, an inner tuple can have a prefix value that describes all
156 its members. In a radix tree this could be the common prefix of the
157 represented strings. The prefix value is not necessarily really a
158 prefix, but can be any data needed by the operator class; for example,
159 in a quad-tree it can store the central point that the four quadrants
160 are measured with respect to. A quad-tree inner tuple would then also
161 contain four nodes corresponding to the quadrants around this central
164 Some tree algorithms require knowledge of level (or depth) of the
165 current tuple, so the SP-GiST core provides the possibility for
166 operator classes to manage level counting while descending the tree.
167 There is also support for incrementally reconstructing the represented
168 value when that is needed, and for passing down additional data (called
169 traverse values) during a tree descent.
173 The SP-GiST core code takes care of null entries. Although SP-GiST
174 indexes do store entries for nulls in indexed columns, this is hidden
175 from the index operator class code: no null index entries or search
176 conditions will ever be passed to the operator class methods. (It is
177 assumed that SP-GiST operators are strict and so cannot succeed for
178 null values.) Null values are therefore not discussed further here.
180 There are five user-defined methods that an index operator class for
181 SP-GiST must provide, and two are optional. All five mandatory methods
182 follow the convention of accepting two internal arguments, the first of
183 which is a pointer to a C struct containing input values for the
184 support method, while the second argument is a pointer to a C struct
185 where output values must be placed. Four of the mandatory methods just
186 return void, since all their results appear in the output struct; but
187 leaf_consistent returns a boolean result. The methods must not modify
188 any fields of their input structs. In all cases, the output struct is
189 initialized to zeroes before calling the user-defined method. The
190 optional sixth method compress accepts a datum to be indexed as the
191 only argument and returns a value suitable for physical storage in a
192 leaf tuple. The optional seventh method options accepts an internal
193 pointer to a C struct, where opclass-specific parameters should be
194 placed, and returns void.
196 The five mandatory user-defined methods are:
199 Returns static information about the index implementation,
200 including the data type OIDs of the prefix and node label data
203 The SQL declaration of the function must look like this:
205 CREATE FUNCTION my_config(internal, internal) RETURNS void ...
207 The first argument is a pointer to a spgConfigIn C struct,
208 containing input data for the function. The second argument is a
209 pointer to a spgConfigOut C struct, which the function must fill
212 typedef struct spgConfigIn
214 Oid attType; /* Data type to be indexed */
217 typedef struct spgConfigOut
219 Oid prefixType; /* Data type of inner-tuple prefixes */
220 Oid labelType; /* Data type of inner-tuple node labels */
221 Oid leafType; /* Data type of leaf-tuple values */
222 bool canReturnData; /* Opclass can reconstruct original data */
223 bool longValuesOK; /* Opclass can cope with values > 1 page */
226 attType is passed in order to support polymorphic index operator
227 classes; for ordinary fixed-data-type operator classes, it will
228 always have the same value and so can be ignored.
230 For operator classes that do not use prefixes, prefixType can be
231 set to VOIDOID. Likewise, for operator classes that do not use
232 node labels, labelType can be set to VOIDOID. canReturnData
233 should be set true if the operator class is capable of
234 reconstructing the originally-supplied index value. longValuesOK
235 should be set true only when the attType is of variable length
236 and the operator class is capable of segmenting long values by
237 repeated suffixing (see Section 65.3.4.1).
239 leafType should match the index storage type defined by the
240 operator class's opckeytype catalog entry. (Note that opckeytype
241 can be zero, implying the storage type is the same as the
242 operator class's input type, which is the most common
243 situation.) For reasons of backward compatibility, the config
244 method can set leafType to some other value, and that value will
245 be used; but this is deprecated since the index contents are
246 then incorrectly identified in the catalogs. Also, it's
247 permissible to leave leafType uninitialized (zero); that is
248 interpreted as meaning the index storage type derived from
251 When attType and leafType are different, the optional method
252 compress must be provided. Method compress is responsible for
253 transformation of datums to be indexed from attType to leafType.
256 Chooses a method for inserting a new value into an inner tuple.
258 The SQL declaration of the function must look like this:
260 CREATE FUNCTION my_choose(internal, internal) RETURNS void ...
262 The first argument is a pointer to a spgChooseIn C struct,
263 containing input data for the function. The second argument is a
264 pointer to a spgChooseOut C struct, which the function must fill
267 typedef struct spgChooseIn
269 Datum datum; /* original datum to be indexed */
270 Datum leafDatum; /* current datum to be stored at leaf */
271 int level; /* current level (counting from zero) */
273 /* Data from current inner tuple */
274 bool allTheSame; /* tuple is marked all-the-same? */
275 bool hasPrefix; /* tuple has a prefix? */
276 Datum prefixDatum; /* if so, the prefix value */
277 int nNodes; /* number of nodes in the inner tuple */
278 Datum *nodeLabels; /* node label values (NULL if none) */
281 typedef enum spgChooseResultType
283 spgMatchNode = 1, /* descend into existing node */
284 spgAddNode, /* add a node to the inner tuple */
285 spgSplitTuple /* split inner tuple (change its prefix) */
286 } spgChooseResultType;
288 typedef struct spgChooseOut
290 spgChooseResultType resultType; /* action code, see above */
293 struct /* results for spgMatchNode */
295 int nodeN; /* descend to this node (index from 0) */
296 int levelAdd; /* increment level by this much */
297 Datum restDatum; /* new leaf datum */
299 struct /* results for spgAddNode */
301 Datum nodeLabel; /* new node's label */
302 int nodeN; /* where to insert it (index from 0) */
304 struct /* results for spgSplitTuple */
306 /* Info to form new upper-level inner tuple with one child tuple */
307 bool prefixHasPrefix; /* tuple should have a prefix? */
308 Datum prefixPrefixDatum; /* if so, its value */
309 int prefixNNodes; /* number of nodes */
310 Datum *prefixNodeLabels; /* their labels (or NULL for
312 int childNodeN; /* which node gets child tuple */
314 /* Info to form new lower-level inner tuple with all old nodes */
315 bool postfixHasPrefix; /* tuple should have a prefix? */
316 Datum postfixPrefixDatum; /* if so, its value */
321 datum is the original datum of spgConfigIn.attType type that was
322 to be inserted into the index. leafDatum is a value of
323 spgConfigOut.leafType type, which is initially a result of
324 method compress applied to datum when method compress is
325 provided, or the same value as datum otherwise. leafDatum can
326 change at lower levels of the tree if the choose or picksplit
327 methods change it. When the insertion search reaches a leaf
328 page, the current value of leafDatum is what will be stored in
329 the newly created leaf tuple. level is the current inner tuple's
330 level, starting at zero for the root level. allTheSame is true
331 if the current inner tuple is marked as containing multiple
332 equivalent nodes (see Section 65.3.4.3). hasPrefix is true if
333 the current inner tuple contains a prefix; if so, prefixDatum is
334 its value. nNodes is the number of child nodes contained in the
335 inner tuple, and nodeLabels is an array of their label values,
336 or NULL if there are no labels.
338 The choose function can determine either that the new value
339 matches one of the existing child nodes, or that a new child
340 node must be added, or that the new value is inconsistent with
341 the tuple prefix and so the inner tuple must be split to create
342 a less restrictive prefix.
344 If the new value matches one of the existing child nodes, set
345 resultType to spgMatchNode. Set nodeN to the index (from zero)
346 of that node in the node array. Set levelAdd to the increment in
347 level caused by descending through that node, or leave it as
348 zero if the operator class does not use levels. Set restDatum to
349 equal leafDatum if the operator class does not modify datums
350 from one level to the next, or otherwise set it to the modified
351 value to be used as leafDatum at the next level.
353 If a new child node must be added, set resultType to spgAddNode.
354 Set nodeLabel to the label to be used for the new node, and set
355 nodeN to the index (from zero) at which to insert the node in
356 the node array. After the node has been added, the choose
357 function will be called again with the modified inner tuple;
358 that call should result in an spgMatchNode result.
360 If the new value is inconsistent with the tuple prefix, set
361 resultType to spgSplitTuple. This action moves all the existing
362 nodes into a new lower-level inner tuple, and replaces the
363 existing inner tuple with a tuple having a single downlink
364 pointing to the new lower-level inner tuple. Set prefixHasPrefix
365 to indicate whether the new upper tuple should have a prefix,
366 and if so set prefixPrefixDatum to the prefix value. This new
367 prefix value must be sufficiently less restrictive than the
368 original to accept the new value to be indexed. Set prefixNNodes
369 to the number of nodes needed in the new tuple, and set
370 prefixNodeLabels to a palloc'd array holding their labels, or to
371 NULL if node labels are not required. Note that the total size
372 of the new upper tuple must be no more than the total size of
373 the tuple it is replacing; this constrains the lengths of the
374 new prefix and new labels. Set childNodeN to the index (from
375 zero) of the node that will downlink to the new lower-level
376 inner tuple. Set postfixHasPrefix to indicate whether the new
377 lower-level inner tuple should have a prefix, and if so set
378 postfixPrefixDatum to the prefix value. The combination of these
379 two prefixes and the downlink node's label (if any) must have
380 the same meaning as the original prefix, because there is no
381 opportunity to alter the node labels that are moved to the new
382 lower-level tuple, nor to change any child index entries. After
383 the node has been split, the choose function will be called
384 again with the replacement inner tuple. That call may return an
385 spgAddNode result, if no suitable node was created by the
386 spgSplitTuple action. Eventually choose must return spgMatchNode
387 to allow the insertion to descend to the next level.
390 Decides how to create a new inner tuple over a set of leaf
393 The SQL declaration of the function must look like this:
395 CREATE FUNCTION my_picksplit(internal, internal) RETURNS void ...
397 The first argument is a pointer to a spgPickSplitIn C struct,
398 containing input data for the function. The second argument is a
399 pointer to a spgPickSplitOut C struct, which the function must
400 fill with result data.
402 typedef struct spgPickSplitIn
404 int nTuples; /* number of leaf tuples */
405 Datum *datums; /* their datums (array of length nTuples) */
406 int level; /* current level (counting from zero) */
409 typedef struct spgPickSplitOut
411 bool hasPrefix; /* new inner tuple should have a prefix? */
412 Datum prefixDatum; /* if so, its value */
414 int nNodes; /* number of nodes for new inner tuple */
415 Datum *nodeLabels; /* their labels (or NULL for no labels) */
417 int *mapTuplesToNodes; /* node index for each leaf tuple */
418 Datum *leafTupleDatums; /* datum to store in each new leaf tuple */
421 nTuples is the number of leaf tuples provided. datums is an
422 array of their datum values of spgConfigOut.leafType type. level
423 is the current level that all the leaf tuples share, which will
424 become the level of the new inner tuple.
426 Set hasPrefix to indicate whether the new inner tuple should
427 have a prefix, and if so set prefixDatum to the prefix value.
428 Set nNodes to indicate the number of nodes that the new inner
429 tuple will contain, and set nodeLabels to an array of their
430 label values, or to NULL if node labels are not required. Set
431 mapTuplesToNodes to an array that gives the index (from zero) of
432 the node that each leaf tuple should be assigned to. Set
433 leafTupleDatums to an array of the values to be stored in the
434 new leaf tuples (these will be the same as the input datums if
435 the operator class does not modify datums from one level to the
436 next). Note that the picksplit function is responsible for
437 palloc'ing the nodeLabels, mapTuplesToNodes and leafTupleDatums
440 If more than one leaf tuple is supplied, it is expected that the
441 picksplit function will classify them into more than one node;
442 otherwise it is not possible to split the leaf tuples across
443 multiple pages, which is the ultimate purpose of this operation.
444 Therefore, if the picksplit function ends up placing all the
445 leaf tuples in the same node, the core SP-GiST code will
446 override that decision and generate an inner tuple in which the
447 leaf tuples are assigned at random to several
448 identically-labeled nodes. Such a tuple is marked allTheSame to
449 signify that this has happened. The choose and inner_consistent
450 functions must take suitable care with such inner tuples. See
451 Section 65.3.4.3 for more information.
453 picksplit can be applied to a single leaf tuple only in the case
454 that the config function set longValuesOK to true and a
455 larger-than-a-page input value has been supplied. In this case
456 the point of the operation is to strip off a prefix and produce
457 a new, shorter leaf datum value. The call will be repeated until
458 a leaf datum short enough to fit on a page has been produced.
459 See Section 65.3.4.1 for more information.
462 Returns set of nodes (branches) to follow during tree search.
464 The SQL declaration of the function must look like this:
466 CREATE FUNCTION my_inner_consistent(internal, internal) RETURNS void ...
468 The first argument is a pointer to a spgInnerConsistentIn C
469 struct, containing input data for the function. The second
470 argument is a pointer to a spgInnerConsistentOut C struct, which
471 the function must fill with result data.
473 typedef struct spgInnerConsistentIn
475 ScanKey scankeys; /* array of operators and comparison values */
476 ScanKey orderbys; /* array of ordering operators and comparison
478 int nkeys; /* length of scankeys array */
479 int norderbys; /* length of orderbys array */
481 Datum reconstructedValue; /* value reconstructed at parent */
482 void *traversalValue; /* opclass-specific traverse value */
483 MemoryContext traversalMemoryContext; /* put new traverse values here */
484 int level; /* current level (counting from zero) */
485 bool returnData; /* original data must be returned? */
487 /* Data from current inner tuple */
488 bool allTheSame; /* tuple is marked all-the-same? */
489 bool hasPrefix; /* tuple has a prefix? */
490 Datum prefixDatum; /* if so, the prefix value */
491 int nNodes; /* number of nodes in the inner tuple */
492 Datum *nodeLabels; /* node label values (NULL if none) */
493 } spgInnerConsistentIn;
495 typedef struct spgInnerConsistentOut
497 int nNodes; /* number of child nodes to be visited */
498 int *nodeNumbers; /* their indexes in the node array */
499 int *levelAdds; /* increment level by this much for each */
500 Datum *reconstructedValues; /* associated reconstructed values */
501 void **traversalValues; /* opclass-specific traverse values */
502 double **distances; /* associated distances */
503 } spgInnerConsistentOut;
505 The array scankeys, of length nkeys, describes the index search
506 condition(s). These conditions are combined with AND — only
507 index entries that satisfy all of them are interesting. (Note
508 that nkeys = 0 implies that all index entries satisfy the
509 query.) Usually the consistent function only cares about the
510 sk_strategy and sk_argument fields of each array entry, which
511 respectively give the indexable operator and comparison value.
512 In particular it is not necessary to check sk_flags to see if
513 the comparison value is NULL, because the SP-GiST core code will
514 filter out such conditions. The array orderbys, of length
515 norderbys, describes ordering operators (if any) in the same
516 manner. reconstructedValue is the value reconstructed for the
517 parent tuple; it is (Datum) 0 at the root level or if the
518 inner_consistent function did not provide a value at the parent
519 level. traversalValue is a pointer to any traverse data passed
520 down from the previous call of inner_consistent on the parent
521 index tuple, or NULL at the root level. traversalMemoryContext
522 is the memory context in which to store output traverse values
523 (see below). level is the current inner tuple's level, starting
524 at zero for the root level. returnData is true if reconstructed
525 data is required for this query; this will only be so if the
526 config function asserted canReturnData. allTheSame is true if
527 the current inner tuple is marked “all-the-same”; in this case
528 all the nodes have the same label (if any) and so either all or
529 none of them match the query (see Section 65.3.4.3). hasPrefix
530 is true if the current inner tuple contains a prefix; if so,
531 prefixDatum is its value. nNodes is the number of child nodes
532 contained in the inner tuple, and nodeLabels is an array of
533 their label values, or NULL if the nodes do not have labels.
535 nNodes must be set to the number of child nodes that need to be
536 visited by the search, and nodeNumbers must be set to an array
537 of their indexes. If the operator class keeps track of levels,
538 set levelAdds to an array of the level increments required when
539 descending to each node to be visited. (Often these increments
540 will be the same for all the nodes, but that's not necessarily
541 so, so an array is used.) If value reconstruction is needed, set
542 reconstructedValues to an array of the values reconstructed for
543 each child node to be visited; otherwise, leave
544 reconstructedValues as NULL. The reconstructed values are
545 assumed to be of type spgConfigOut.leafType. (However, since the
546 core system will do nothing with them except possibly copy them,
547 it is sufficient for them to have the same typlen and typbyval
548 properties as leafType.) If ordered search is performed, set
549 distances to an array of distance values according to orderbys
550 array (nodes with lowest distances will be processed first).
551 Leave it NULL otherwise. If it is desired to pass down
552 additional out-of-band information (“traverse values”) to lower
553 levels of the tree search, set traversalValues to an array of
554 the appropriate traverse values, one for each child node to be
555 visited; otherwise, leave traversalValues as NULL. Note that the
556 inner_consistent function is responsible for palloc'ing the
557 nodeNumbers, levelAdds, distances, reconstructedValues, and
558 traversalValues arrays in the current memory context. However,
559 any output traverse values pointed to by the traversalValues
560 array should be allocated in traversalMemoryContext. Each
561 traverse value must be a single palloc'd chunk.
564 Returns true if a leaf tuple satisfies a query.
566 The SQL declaration of the function must look like this:
568 CREATE FUNCTION my_leaf_consistent(internal, internal) RETURNS bool ...
570 The first argument is a pointer to a spgLeafConsistentIn C
571 struct, containing input data for the function. The second
572 argument is a pointer to a spgLeafConsistentOut C struct, which
573 the function must fill with result data.
575 typedef struct spgLeafConsistentIn
577 ScanKey scankeys; /* array of operators and comparison values */
578 ScanKey orderbys; /* array of ordering operators and comparison
580 int nkeys; /* length of scankeys array */
581 int norderbys; /* length of orderbys array */
583 Datum reconstructedValue; /* value reconstructed at parent */
584 void *traversalValue; /* opclass-specific traverse value */
585 int level; /* current level (counting from zero) */
586 bool returnData; /* original data must be returned? */
588 Datum leafDatum; /* datum in leaf tuple */
589 } spgLeafConsistentIn;
591 typedef struct spgLeafConsistentOut
593 Datum leafValue; /* reconstructed original data, if any */
594 bool recheck; /* set true if operator must be rechecked */
595 bool recheckDistances; /* set true if distances must be rechecked */
596 double *distances; /* associated distances */
597 } spgLeafConsistentOut;
599 The array scankeys, of length nkeys, describes the index search
600 condition(s). These conditions are combined with AND — only
601 index entries that satisfy all of them satisfy the query. (Note
602 that nkeys = 0 implies that all index entries satisfy the
603 query.) Usually the consistent function only cares about the
604 sk_strategy and sk_argument fields of each array entry, which
605 respectively give the indexable operator and comparison value.
606 In particular it is not necessary to check sk_flags to see if
607 the comparison value is NULL, because the SP-GiST core code will
608 filter out such conditions. The array orderbys, of length
609 norderbys, describes the ordering operators in the same manner.
610 reconstructedValue is the value reconstructed for the parent
611 tuple; it is (Datum) 0 at the root level or if the
612 inner_consistent function did not provide a value at the parent
613 level. traversalValue is a pointer to any traverse data passed
614 down from the previous call of inner_consistent on the parent
615 index tuple, or NULL at the root level. level is the current
616 leaf tuple's level, starting at zero for the root level.
617 returnData is true if reconstructed data is required for this
618 query; this will only be so if the config function asserted
619 canReturnData. leafDatum is the key value of
620 spgConfigOut.leafType stored in the current leaf tuple.
622 The function must return true if the leaf tuple matches the
623 query, or false if not. In the true case, if returnData is true
624 then leafValue must be set to the value (of type
625 spgConfigIn.attType) originally supplied to be indexed for this
626 leaf tuple. Also, recheck may be set to true if the match is
627 uncertain and so the operator(s) must be re-applied to the
628 actual heap tuple to verify the match. If ordered search is
629 performed, set distances to an array of distance values
630 according to orderbys array. Leave it NULL otherwise. If at
631 least one of returned distances is not exact, set
632 recheckDistances to true. In this case, the executor will
633 calculate the exact distances after fetching the tuple from the
634 heap, and will reorder the tuples if needed.
636 The optional user-defined methods are:
638 Datum compress(Datum in)
639 Converts a data item into a format suitable for physical storage
640 in a leaf tuple of the index. It accepts a value of type
641 spgConfigIn.attType and returns a value of type
642 spgConfigOut.leafType. The output value must not contain an
643 out-of-line TOAST pointer.
645 Note: the compress method is only applied to values to be
646 stored. The consistent methods receive query scankeys unchanged,
647 without transformation using compress.
650 Defines a set of user-visible parameters that control operator
653 The SQL declaration of the function must look like this:
655 CREATE OR REPLACE FUNCTION my_options(internal)
660 The function is passed a pointer to a local_relopts struct,
661 which needs to be filled with a set of operator class specific
662 options. The options can be accessed from other support
663 functions using the PG_HAS_OPCLASS_OPTIONS() and
664 PG_GET_OPCLASS_OPTIONS() macros.
666 Since the representation of the key in SP-GiST is flexible, it
667 may depend on user-specified parameters.
669 All the SP-GiST support methods are normally called in a short-lived
670 memory context; that is, CurrentMemoryContext will be reset after
671 processing of each tuple. It is therefore not very important to worry
672 about pfree'ing everything you palloc. (The config method is an
673 exception: it should try to avoid leaking memory. But usually the
674 config method need do nothing but assign constants into the passed
677 If the indexed column is of a collatable data type, the index collation
678 will be passed to all the support methods, using the standard
679 PG_GET_COLLATION() mechanism.
681 65.3.4. Implementation #
683 This section covers implementation details and other tricks that are
684 useful for implementers of SP-GiST operator classes to know.
686 65.3.4.1. SP-GiST Limits #
688 Individual leaf tuples and inner tuples must fit on a single index page
689 (8kB by default). Therefore, when indexing values of variable-length
690 data types, long values can only be supported by methods such as radix
691 trees, in which each level of the tree includes a prefix that is short
692 enough to fit on a page, and the final leaf level includes a suffix
693 also short enough to fit on a page. The operator class should set
694 longValuesOK to true only if it is prepared to arrange for this to
695 happen. Otherwise, the SP-GiST core will reject any request to index a
696 value that is too large to fit on an index page.
698 Likewise, it is the operator class's responsibility that inner tuples
699 do not grow too large to fit on an index page; this limits the number
700 of child nodes that can be used in one inner tuple, as well as the
701 maximum size of a prefix value.
703 Another limitation is that when an inner tuple's node points to a set
704 of leaf tuples, those tuples must all be in the same index page. (This
705 is a design decision to reduce seeking and save space in the links that
706 chain such tuples together.) If the set of leaf tuples grows too large
707 for a page, a split is performed and an intermediate inner tuple is
708 inserted. For this to fix the problem, the new inner tuple must divide
709 the set of leaf values into more than one node group. If the operator
710 class's picksplit function fails to do that, the SP-GiST core resorts
711 to extraordinary measures described in Section 65.3.4.3.
713 When longValuesOK is true, it is expected that successive levels of the
714 SP-GiST tree will absorb more and more information into the prefixes
715 and node labels of the inner tuples, making the required leaf datum
716 smaller and smaller, so that eventually it will fit on a page. To
717 prevent bugs in operator classes from causing infinite insertion loops,
718 the SP-GiST core will raise an error if the leaf datum does not become
719 any smaller within ten cycles of choose method calls.
721 65.3.4.2. SP-GiST Without Node Labels #
723 Some tree algorithms use a fixed set of nodes for each inner tuple; for
724 example, in a quad-tree there are always exactly four nodes
725 corresponding to the four quadrants around the inner tuple's centroid
726 point. In such a case the code typically works with the nodes by
727 number, and there is no need for explicit node labels. To suppress node
728 labels (and thereby save some space), the picksplit function can return
729 NULL for the nodeLabels array, and likewise the choose function can
730 return NULL for the prefixNodeLabels array during a spgSplitTuple
731 action. This will in turn result in nodeLabels being NULL during
732 subsequent calls to choose and inner_consistent. In principle, node
733 labels could be used for some inner tuples and omitted for others in
736 When working with an inner tuple having unlabeled nodes, it is an error
737 for choose to return spgAddNode, since the set of nodes is supposed to
738 be fixed in such cases.
740 65.3.4.3. “All-the-Same” Inner Tuples #
742 The SP-GiST core can override the results of the operator class's
743 picksplit function when picksplit fails to divide the supplied leaf
744 values into at least two node categories. When this happens, the new
745 inner tuple is created with multiple nodes that each have the same
746 label (if any) that picksplit gave to the one node it did use, and the
747 leaf values are divided at random among these equivalent nodes. The
748 allTheSame flag is set on the inner tuple to warn the choose and
749 inner_consistent functions that the tuple does not have the node set
750 that they might otherwise expect.
752 When dealing with an allTheSame tuple, a choose result of spgMatchNode
753 is interpreted to mean that the new value can be assigned to any of the
754 equivalent nodes; the core code will ignore the supplied nodeN value
755 and descend into one of the nodes at random (so as to keep the tree
756 balanced). It is an error for choose to return spgAddNode, since that
757 would make the nodes not all equivalent; the spgSplitTuple action must
758 be used if the value to be inserted doesn't match the existing nodes.
760 When dealing with an allTheSame tuple, the inner_consistent function
761 should return either all or none of the nodes as targets for continuing
762 the index search, since they are all equivalent. This may or may not
763 require any special-case code, depending on how much the
764 inner_consistent function normally assumes about the meaning of the
769 The PostgreSQL source distribution includes several examples of index
770 operator classes for SP-GiST, as described in Table 65.2. Look into
771 src/backend/access/spgist/ and src/backend/utils/adt/ to see the code.