begriffs open source - ai-pg/blob - full-docs/txt/index-functions.txt

   1
   2 63.2. Index Access Method Functions #
   3
   4    The index construction and maintenance functions that an index access
   5    method must provide in IndexAmRoutine are:
   6
   7 IndexBuildResult *
   8 ambuild (Relation heapRelation,
   9          Relation indexRelation,
  10          IndexInfo *indexInfo);
  11
  12    Build a new index. The index relation has been physically created, but
  13    is empty. It must be filled in with whatever fixed data the access
  14    method requires, plus entries for all tuples already existing in the
  15    table. Ordinarily the ambuild function will call
  16    table_index_build_scan() to scan the table for existing tuples and
  17    compute the keys that need to be inserted into the index. The function
  18    must return a palloc'd struct containing statistics about the new
  19    index. The amcanbuildparallel flag indicates whether the access method
  20    supports parallel index builds. When set to true, the system will
  21    attempt to allocate parallel workers for the build. Access methods
  22    supporting only non-parallel index builds should leave this flag set to
  23    false.
  24
  25 void
  26 ambuildempty (Relation indexRelation);
  27
  28    Build an empty index, and write it to the initialization fork
  29    (INIT_FORKNUM) of the given relation. This method is called only for
  30    unlogged indexes; the empty index written to the initialization fork
  31    will be copied over the main relation fork on each server restart.
  32
  33 bool
  34 aminsert (Relation indexRelation,
  35           Datum *values,
  36           bool *isnull,
  37           ItemPointer heap_tid,
  38           Relation heapRelation,
  39           IndexUniqueCheck checkUnique,
  40           bool indexUnchanged,
  41           IndexInfo *indexInfo);
  42
  43    Insert a new tuple into an existing index. The values and isnull arrays
  44    give the key values to be indexed, and heap_tid is the TID to be
  45    indexed. If the access method supports unique indexes (its amcanunique
  46    flag is true) then checkUnique indicates the type of uniqueness check
  47    to perform. This varies depending on whether the unique constraint is
  48    deferrable; see Section 63.5 for details. Normally the access method
  49    only needs the heapRelation parameter when performing uniqueness
  50    checking (since then it will have to look into the heap to verify tuple
  51    liveness).
  52
  53    The indexUnchanged Boolean value gives a hint about the nature of the
  54    tuple to be indexed. When it is true, the tuple is a duplicate of some
  55    existing tuple in the index. The new tuple is a logically unchanged
  56    successor MVCC tuple version. This happens when an UPDATE takes place
  57    that does not modify any columns covered by the index, but nevertheless
  58    requires a new version in the index. The index AM may use this hint to
  59    decide to apply bottom-up index deletion in parts of the index where
  60    many versions of the same logical row accumulate. Note that updating a
  61    non-key column or a column that only appears in a partial index
  62    predicate does not affect the value of indexUnchanged. The core code
  63    determines each tuple's indexUnchanged value using a low overhead
  64    approach that allows both false positives and false negatives. Index
  65    AMs must not treat indexUnchanged as an authoritative source of
  66    information about tuple visibility or versioning.
  67
  68    The function's Boolean result value is significant only when
  69    checkUnique is UNIQUE_CHECK_PARTIAL. In this case a true result means
  70    the new entry is known unique, whereas false means it might be
  71    non-unique (and a deferred uniqueness check must be scheduled). For
  72    other cases a constant false result is recommended.
  73
  74    Some indexes might not index all tuples. If the tuple is not to be
  75    indexed, aminsert should just return without doing anything.
  76
  77    If the index AM wishes to cache data across successive index insertions
  78    within an SQL statement, it can allocate space in indexInfo->ii_Context
  79    and store a pointer to the data in indexInfo->ii_AmCache (which will be
  80    NULL initially). If resources other than memory have to be released
  81    after index insertions, aminsertcleanup may be provided, which will be
  82    called before the memory is released.
  83
  84 void
  85 aminsertcleanup (Relation indexRelation,
  86                  IndexInfo *indexInfo);
  87
  88    Clean up state that was maintained across successive inserts in
  89    indexInfo->ii_AmCache. This is useful if the data requires additional
  90    cleanup steps (e.g., releasing pinned buffers), and simply releasing
  91    the memory is not sufficient.
  92
  93 IndexBulkDeleteResult *
  94 ambulkdelete (IndexVacuumInfo *info,
  95               IndexBulkDeleteResult *stats,
  96               IndexBulkDeleteCallback callback,
  97               void *callback_state);
  98
  99    Delete tuple(s) from the index. This is a “bulk delete” operation that
 100    is intended to be implemented by scanning the whole index and checking
 101    each entry to see if it should be deleted. The passed-in callback
 102    function must be called, in the style callback(TID, callback_state)
 103    returns bool, to determine whether any particular index entry, as
 104    identified by its referenced TID, is to be deleted. Must return either
 105    NULL or a palloc'd struct containing statistics about the effects of
 106    the deletion operation. It is OK to return NULL if no information needs
 107    to be passed on to amvacuumcleanup.
 108
 109    Because of limited maintenance_work_mem, ambulkdelete might need to be
 110    called more than once when many tuples are to be deleted. The stats
 111    argument is the result of the previous call for this index (it is NULL
 112    for the first call within a VACUUM operation). This allows the AM to
 113    accumulate statistics across the whole operation. Typically,
 114    ambulkdelete will modify and return the same struct if the passed stats
 115    is not null.
 116
 117 IndexBulkDeleteResult *
 118 amvacuumcleanup (IndexVacuumInfo *info,
 119                  IndexBulkDeleteResult *stats);
 120
 121    Clean up after a VACUUM operation (zero or more ambulkdelete calls).
 122    This does not have to do anything beyond returning index statistics,
 123    but it might perform bulk cleanup such as reclaiming empty index pages.
 124    stats is whatever the last ambulkdelete call returned, or NULL if
 125    ambulkdelete was not called because no tuples needed to be deleted. If
 126    the result is not NULL it must be a palloc'd struct. The statistics it
 127    contains will be used to update pg_class, and will be reported by
 128    VACUUM if VERBOSE is given. It is OK to return NULL if the index was
 129    not changed at all during the VACUUM operation, but otherwise correct
 130    stats should be returned.
 131
 132    amvacuumcleanup will also be called at completion of an ANALYZE
 133    operation. In this case stats is always NULL and any return value will
 134    be ignored. This case can be distinguished by checking
 135    info->analyze_only. It is recommended that the access method do nothing
 136    except post-insert cleanup in such a call, and that only in an
 137    autovacuum worker process.
 138
 139 bool
 140 amcanreturn (Relation indexRelation, int attno);
 141
 142    Check whether the index can support index-only scans on the given
 143    column, by returning the column's original indexed value. The attribute
 144    number is 1-based, i.e., the first column's attno is 1. Returns true if
 145    supported, else false. This function should always return true for
 146    included columns (if those are supported), since there's little point
 147    in an included column that can't be retrieved. If the access method
 148    does not support index-only scans at all, the amcanreturn field in its
 149    IndexAmRoutine struct can be set to NULL.
 150
 151 void
 152 amcostestimate (PlannerInfo *root,
 153                 IndexPath *path,
 154                 double loop_count,
 155                 Cost *indexStartupCost,
 156                 Cost *indexTotalCost,
 157                 Selectivity *indexSelectivity,
 158                 double *indexCorrelation,
 159                 double *indexPages);
 160
 161    Estimate the costs of an index scan. This function is described fully
 162    in Section 63.6, below.
 163
 164 int
 165 amgettreeheight (Relation rel);
 166
 167    Compute the height of a tree-shaped index. This information is supplied
 168    to the amcostestimate function in path->indexinfo->tree_height and can
 169    be used to support the cost estimation. The result is not used anywhere
 170    else, so this function can actually be used to compute any kind of data
 171    (that fits into an integer) about the index that the cost estimation
 172    function might want to know. If the computation is expensive, it could
 173    be useful to cache the result as part of RelationData.rd_amcache.
 174
 175 bytea *
 176 amoptions (ArrayType *reloptions,
 177            bool validate);
 178
 179    Parse and validate the reloptions array for an index. This is called
 180    only when a non-null reloptions array exists for the index. reloptions
 181    is a text array containing entries of the form name=value. The function
 182    should construct a bytea value, which will be copied into the
 183    rd_options field of the index's relcache entry. The data contents of
 184    the bytea value are open for the access method to define; most of the
 185    standard access methods use struct StdRdOptions. When validate is true,
 186    the function should report a suitable error message if any of the
 187    options are unrecognized or have invalid values; when validate is
 188    false, invalid entries should be silently ignored. (validate is false
 189    when loading options already stored in pg_catalog; an invalid entry
 190    could only be found if the access method has changed its rules for
 191    options, and in that case ignoring obsolete entries is appropriate.) It
 192    is OK to return NULL if default behavior is wanted.
 193
 194 bool
 195 amproperty (Oid index_oid, int attno,
 196             IndexAMProperty prop, const char *propname,
 197             bool *res, bool *isnull);
 198
 199    The amproperty method allows index access methods to override the
 200    default behavior of pg_index_column_has_property and related functions.
 201    If the access method does not have any special behavior for index
 202    property inquiries, the amproperty field in its IndexAmRoutine struct
 203    can be set to NULL. Otherwise, the amproperty method will be called
 204    with index_oid and attno both zero for pg_indexam_has_property calls,
 205    or with index_oid valid and attno zero for pg_index_has_property calls,
 206    or with index_oid valid and attno greater than zero for
 207    pg_index_column_has_property calls. prop is an enum value identifying
 208    the property being tested, while propname is the original property name
 209    string. If the core code does not recognize the property name then prop
 210    is AMPROP_UNKNOWN. Access methods can define custom property names by
 211    checking propname for a match (use pg_strcasecmp to match, for
 212    consistency with the core code); for names known to the core code, it's
 213    better to inspect prop. If the amproperty method returns true then it
 214    has determined the property test result: it must set *res to the
 215    Boolean value to return, or set *isnull to true to return a NULL. (Both
 216    of the referenced variables are initialized to false before the call.)
 217    If the amproperty method returns false then the core code will proceed
 218    with its normal logic for determining the property test result.
 219
 220    Access methods that support ordering operators should implement
 221    AMPROP_DISTANCE_ORDERABLE property testing, as the core code does not
 222    know how to do that and will return NULL. It may also be advantageous
 223    to implement AMPROP_RETURNABLE testing, if that can be done more
 224    cheaply than by opening the index and calling amcanreturn, which is the
 225    core code's default behavior. The default behavior should be
 226    satisfactory for all other standard properties.
 227
 228 char *
 229 ambuildphasename (int64 phasenum);
 230
 231    Return the textual name of the given build phase number. The phase
 232    numbers are those reported during an index build via the
 233    pgstat_progress_update_param interface. The phase names are then
 234    exposed in the pg_stat_progress_create_index view.
 235
 236 bool
 237 amvalidate (Oid opclassoid);
 238
 239    Validate the catalog entries for the specified operator class, so far
 240    as the access method can reasonably do that. For example, this might
 241    include testing that all required support functions are provided. The
 242    amvalidate function must return false if the opclass is invalid.
 243    Problems should be reported with ereport messages, typically at INFO
 244    level.
 245
 246 void
 247 amadjustmembers (Oid opfamilyoid,
 248                  Oid opclassoid,
 249                  List *operators,
 250                  List *functions);
 251
 252    Validate proposed new operator and function members of an operator
 253    family, so far as the access method can reasonably do that, and set
 254    their dependency types if the default is not satisfactory. This is
 255    called during CREATE OPERATOR CLASS and during ALTER OPERATOR FAMILY
 256    ADD; in the latter case opclassoid is InvalidOid. The List arguments
 257    are lists of OpFamilyMember structs, as defined in amapi.h. Tests done
 258    by this function will typically be a subset of those performed by
 259    amvalidate, since amadjustmembers cannot assume that it is seeing a
 260    complete set of members. For example, it would be reasonable to check
 261    the signature of a support function, but not to check whether all
 262    required support functions are provided. Any problems can be reported
 263    by throwing an error. The dependency-related fields of the
 264    OpFamilyMember structs are initialized by the core code to create hard
 265    dependencies on the opclass if this is CREATE OPERATOR CLASS, or soft
 266    dependencies on the opfamily if this is ALTER OPERATOR FAMILY ADD.
 267    amadjustmembers can adjust these fields if some other behavior is more
 268    appropriate. For example, GIN, GiST, and SP-GiST always set operator
 269    members to have soft dependencies on the opfamily, since the connection
 270    between an operator and an opclass is relatively weak in these index
 271    types; so it is reasonable to allow operator members to be added and
 272    removed freely. Optional support functions are typically also given
 273    soft dependencies, so that they can be removed if necessary.
 274
 275    The purpose of an index, of course, is to support scans for tuples
 276    matching an indexable WHERE condition, often called a qualifier or scan
 277    key. The semantics of index scanning are described more fully in
 278    Section 63.3, below. An index access method can support “plain” index
 279    scans, “bitmap” index scans, or both. The scan-related functions that
 280    an index access method must or may provide are:
 281
 282 IndexScanDesc
 283 ambeginscan (Relation indexRelation,
 284              int nkeys,
 285              int norderbys);
 286
 287    Prepare for an index scan. The nkeys and norderbys parameters indicate
 288    the number of quals and ordering operators that will be used in the
 289    scan; these may be useful for space allocation purposes. Note that the
 290    actual values of the scan keys aren't provided yet. The result must be
 291    a palloc'd struct. For implementation reasons the index access method
 292    must create this struct by calling RelationGetIndexScan(). In most
 293    cases ambeginscan does little beyond making that call and perhaps
 294    acquiring locks; the interesting parts of index-scan startup are in
 295    amrescan.
 296
 297 void
 298 amrescan (IndexScanDesc scan,
 299           ScanKey keys,
 300           int nkeys,
 301           ScanKey orderbys,
 302           int norderbys);
 303
 304    Start or restart an index scan, possibly with new scan keys. (To
 305    restart using previously-passed keys, NULL is passed for keys and/or
 306    orderbys.) Note that it is not allowed for the number of keys or
 307    order-by operators to be larger than what was passed to ambeginscan. In
 308    practice the restart feature is used when a new outer tuple is selected
 309    by a nested-loop join and so a new key comparison value is needed, but
 310    the scan key structure remains the same.
 311
 312 bool
 313 amgettuple (IndexScanDesc scan,
 314             ScanDirection direction);
 315
 316    Fetch the next tuple in the given scan, moving in the given direction
 317    (forward or backward in the index). Returns true if a tuple was
 318    obtained, false if no matching tuples remain. In the true case the
 319    tuple TID is stored into the scan structure. Note that “success” means
 320    only that the index contains an entry that matches the scan keys, not
 321    that the tuple necessarily still exists in the heap or will pass the
 322    caller's snapshot test. On success, amgettuple must also set
 323    scan->xs_recheck to true or false. False means it is certain that the
 324    index entry matches the scan keys. True means this is not certain, and
 325    the conditions represented by the scan keys must be rechecked against
 326    the heap tuple after fetching it. This provision supports “lossy” index
 327    operators. Note that rechecking will extend only to the scan
 328    conditions; a partial index predicate (if any) is never rechecked by
 329    amgettuple callers.
 330
 331    If the index supports index-only scans (i.e., amcanreturn returns true
 332    for any of its columns), then on success the AM must also check
 333    scan->xs_want_itup, and if that is true it must return the originally
 334    indexed data for the index entry. Columns for which amcanreturn returns
 335    false can be returned as nulls. The data can be returned in the form of
 336    an IndexTuple pointer stored at scan->xs_itup, with tuple descriptor
 337    scan->xs_itupdesc; or in the form of a HeapTuple pointer stored at
 338    scan->xs_hitup, with tuple descriptor scan->xs_hitupdesc. (The latter
 339    format should be used when reconstructing data that might possibly not
 340    fit into an IndexTuple.) In either case, management of the data
 341    referenced by the pointer is the access method's responsibility. The
 342    data must remain good at least until the next amgettuple, amrescan, or
 343    amendscan call for the scan.
 344
 345    The amgettuple function need only be provided if the access method
 346    supports “plain” index scans. If it doesn't, the amgettuple field in
 347    its IndexAmRoutine struct must be set to NULL.
 348
 349 int64
 350 amgetbitmap (IndexScanDesc scan,
 351              TIDBitmap *tbm);
 352
 353    Fetch all tuples in the given scan and add them to the caller-supplied
 354    TIDBitmap (that is, OR the set of tuple IDs into whatever set is
 355    already in the bitmap). The number of tuples fetched is returned (this
 356    might be just an approximate count, for instance some AMs do not detect
 357    duplicates). While inserting tuple IDs into the bitmap, amgetbitmap can
 358    indicate that rechecking of the scan conditions is required for
 359    specific tuple IDs. This is analogous to the xs_recheck output
 360    parameter of amgettuple. Note: in the current implementation, support
 361    for this feature is conflated with support for lossy storage of the
 362    bitmap itself, and therefore callers recheck both the scan conditions
 363    and the partial index predicate (if any) for recheckable tuples. That
 364    might not always be true, however. amgetbitmap and amgettuple cannot be
 365    used in the same index scan; there are other restrictions too when
 366    using amgetbitmap, as explained in Section 63.3.
 367
 368    The amgetbitmap function need only be provided if the access method
 369    supports “bitmap” index scans. If it doesn't, the amgetbitmap field in
 370    its IndexAmRoutine struct must be set to NULL.
 371
 372 void
 373 amendscan (IndexScanDesc scan);
 374
 375    End a scan and release resources. The scan struct itself should not be
 376    freed, but any locks or pins taken internally by the access method must
 377    be released, as well as any other memory allocated by ambeginscan and
 378    other scan-related functions.
 379
 380 void
 381 ammarkpos (IndexScanDesc scan);
 382
 383    Mark current scan position. The access method need only support one
 384    remembered scan position per scan.
 385
 386    The ammarkpos function need only be provided if the access method
 387    supports ordered scans. If it doesn't, the ammarkpos field in its
 388    IndexAmRoutine struct may be set to NULL.
 389
 390 void
 391 amrestrpos (IndexScanDesc scan);
 392
 393    Restore the scan to the most recently marked position.
 394
 395    The amrestrpos function need only be provided if the access method
 396    supports ordered scans. If it doesn't, the amrestrpos field in its
 397    IndexAmRoutine struct may be set to NULL.
 398
 399    In addition to supporting ordinary index scans, some types of index may
 400    wish to support parallel index scans, which allow multiple backends to
 401    cooperate in performing an index scan. The index access method should
 402    arrange things so that each cooperating process returns a subset of the
 403    tuples that would be performed by an ordinary, non-parallel index scan,
 404    but in such a way that the union of those subsets is equal to the set
 405    of tuples that would be returned by an ordinary, non-parallel index
 406    scan. Furthermore, while there need not be any global ordering of
 407    tuples returned by a parallel scan, the ordering of that subset of
 408    tuples returned within each cooperating backend must match the
 409    requested ordering. The following functions may be implemented to
 410    support parallel index scans:
 411
 412 Size
 413 amestimateparallelscan (Relation indexRelation,
 414                         int nkeys,
 415                         int norderbys);
 416
 417    Estimate and return the number of bytes of dynamic shared memory which
 418    the access method will be needed to perform a parallel scan. (This
 419    number is in addition to, not in lieu of, the amount of space needed
 420    for AM-independent data in ParallelIndexScanDescData.)
 421
 422    The nkeys and norderbys parameters indicate the number of quals and
 423    ordering operators that will be used in the scan; the same values will
 424    be passed to amrescan. Note that the actual values of the scan keys
 425    aren't provided yet.
 426
 427    It is not necessary to implement this function for access methods which
 428    do not support parallel scans or for which the number of additional
 429    bytes of storage required is zero.
 430
 431 void
 432 aminitparallelscan (void *target);
 433
 434    This function will be called to initialize dynamic shared memory at the
 435    beginning of a parallel scan. target will point to at least the number
 436    of bytes previously returned by amestimateparallelscan, and this
 437    function may use that amount of space to store whatever data it wishes.
 438
 439    It is not necessary to implement this function for access methods which
 440    do not support parallel scans or in cases where the shared memory space
 441    required needs no initialization.
 442
 443 void
 444 amparallelrescan (IndexScanDesc scan);
 445
 446    This function, if implemented, will be called when a parallel index
 447    scan must be restarted. It should reset any shared state set up by
 448    aminitparallelscan such that the scan will be restarted from the
 449    beginning.
 450
 451 CompareType
 452 amtranslatestrategy (StrategyNumber strategy, Oid opfamily, Oid opcintype);
 453
 454 StrategyNumber
 455 amtranslatecmptype (CompareType cmptype, Oid opfamily, Oid opcintype);
 456
 457    These functions, if implemented, will be called by the planner and
 458    executor to convert between fixed CompareType values and the specific
 459    strategy numbers used by the access method. These functions can be
 460    implemented by access methods that implement functionality similar to
 461    the built-in btree or hash access methods, and by implementing these
 462    translations, the system can learn about the semantics of the access
 463    method's operations and can use them in place of btree or hash indexes
 464    in various places. If the functionality of the access method is not
 465    similar to those built-in access methods, these functions do not need
 466    to be implemented. If the functions are not implemented, the access
 467    method will be ignored for certain planner and executor decisions, but
 468    is otherwise fully functional.