   1
   2 58.4. Foreign Data Wrapper Query Planning
   3
   4    The FDW callback functions GetForeignRelSize, GetForeignPaths,
   5    GetForeignPlan, PlanForeignModify, GetForeignJoinPaths,
   6    GetForeignUpperPaths, and PlanDirectModify must fit into the workings
   7    of the PostgreSQL planner. Here are some notes about what they must do.
   8
   9    The information in root and baserel can be used to reduce the amount of
  10    information that has to be fetched from the foreign table (and
  11    therefore reduce the cost). baserel->baserestrictinfo is particularly
  12    interesting, as it contains restriction quals (WHERE clauses) that
  13    should be used to filter the rows to be fetched. (The FDW itself is not
  14    required to enforce these quals, as the core executor can check them
  15    instead.) baserel->reltarget->exprs can be used to determine which
  16    columns need to be fetched; but note that it only lists columns that
  17    have to be emitted by the ForeignScan plan node, not columns that are
  18    used in qual evaluation but not output by the query.
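
The caveat about baserel->reltarget->exprs can be illustrated with a small standalone model. This is a sketch in plain C, not PostgreSQL code: ToyColumnSet, toy_column_set, and toy_columns_to_fetch are hypothetical names, and a real FDW would inspect planner structures rather than a bitmask. The idea is only that the columns to fetch are the union of the columns the scan must emit and the columns referenced by quals that will be checked locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: bit i is set when column number i is needed. */
typedef uint32_t ToyColumnSet;

/* Build a set from an array of column numbers (each must be below 32). */
static ToyColumnSet
toy_column_set(const int *cols, size_t ncols)
{
    ToyColumnSet set = 0;

    for (size_t i = 0; i < ncols; i++)
    {
        assert(cols[i] >= 0 && cols[i] < 32);
        set |= (ToyColumnSet) 1 << cols[i];
    }
    return set;
}

/*
 * Columns the remote side must return: those emitted by the scan (the role
 * played by reltarget->exprs) plus those needed only by locally checked quals.
 */
static ToyColumnSet
toy_columns_to_fetch(ToyColumnSet emitted, ToyColumnSet local_qual_cols)
{
    return emitted | local_qual_cols;
}
```

For a query shaped like SELECT a FROM ft WHERE c > 10, with the qual left to the local executor, the emitted set holds only a, yet c must be fetched as well or the qual cannot be evaluated.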
  19
  20    Various private fields are available for the FDW planning functions to
  21    keep information in. Generally, whatever you store in FDW private
  22    fields should be palloc'd, so that it will be reclaimed at the end of
  23    planning.
  24
  25    baserel->fdw_private is a void pointer that is available for FDW
  26    planning functions to store information relevant to the particular
  27    foreign table. The core planner does not touch it except to initialize
  28    it to NULL when the RelOptInfo node is created. It is useful for
  29    passing information forward from GetForeignRelSize to GetForeignPaths
  30    and/or from GetForeignPaths to GetForeignPlan, thereby avoiding
  31    recalculation.
  32
  33    GetForeignPaths can identify the meaning of different access paths by
  34    storing private information in the fdw_private field of ForeignPath
  35    nodes. fdw_private is declared as a List pointer, but could actually
  36    contain anything since the core planner does not touch it. However,
  37    best practice is to use a representation that's dumpable by
  38    nodeToString, for use with debugging support available in the backend.
  39
  40    GetForeignPlan can examine the fdw_private field of the selected
  41    ForeignPath node, and can generate fdw_exprs and fdw_private lists to
  42    be placed in the ForeignScan plan node, where they will be available at
  43    execution time. Both of these lists must be represented in a form that
  44    copyObject knows how to copy. The fdw_private list has no other
  45    restrictions and is not interpreted by the core backend in any way. The
  46    fdw_exprs list, if not NIL, is expected to contain expression trees
  47    that are intended to be executed at run time. These trees will undergo
  48    post-processing by the planner to make them fully executable.
  49
  50    In GetForeignPlan, generally the passed-in target list can be copied
  51    into the plan node as-is. The passed scan_clauses list contains the
  52    same clauses as baserel->baserestrictinfo, but may be re-ordered for
  53    better execution efficiency. In simple cases the FDW can just strip
  54    RestrictInfo nodes from the scan_clauses list (using
  55    extract_actual_clauses) and put all the clauses into the plan node's
  56    qual list, which means that all the clauses will be checked by the
  57    executor at run time. More complex FDWs may be able to check some of
  58    the clauses internally, in which case those clauses can be removed from
  59    the plan node's qual list so that the executor doesn't waste time
  60    rechecking them.
  61
  62    As an example, the FDW might identify some restriction clauses of the
  63    form foreign_variable = sub_expression, which it determines can be
  64    executed on the remote server given the locally-evaluated value of the
  65    sub_expression. The actual identification of such a clause should
  66    happen during GetForeignPaths, since it would affect the cost estimate
  67    for the path. The path's fdw_private field would probably include a
  68    pointer to the identified clause's RestrictInfo node. Then
  69    GetForeignPlan would remove that clause from scan_clauses, but add the
  70    sub_expression to fdw_exprs to ensure that it gets massaged into
  71    executable form. It would probably also put control information into
  72    the plan node's fdw_private field to tell the execution functions what
  73    to do at run time. The query transmitted to the remote server would
  74    involve something like WHERE foreign_variable = $1, with the parameter
  75    value obtained at run time from evaluation of the fdw_exprs expression
  76    tree.
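
The division of labor in this example can be sketched with a standalone toy model. It is plain C rather than PostgreSQL code: ToyClause, ToyPlan, and toy_build_plan are hypothetical names, clauses are plain strings instead of expression trees, and the remote_ok flag stands in for whatever decision the FDW made while building paths. Clauses judged safe to run remotely become numbered parameters in the remote WHERE text and contribute their sub-expression to a list playing the role of fdw_exprs; all other clauses stay in the list the local executor checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical stand-in for a clause "column = sub_expression". */
typedef struct ToyClause
{
    const char *column;     /* foreign column name */
    const char *subexpr;    /* text of the locally evaluated sub-expression */
    bool        remote_ok;  /* decided earlier, at path-building time */
} ToyClause;

/* Hypothetical stand-in for what plan creation hands to the executor. */
typedef struct ToyPlan
{
    char             remote_where[256];     /* e.g. "WHERE a = $1" */
    const char      *param_exprs[TOY_MAX];  /* plays the role of fdw_exprs */
    size_t           nparams;
    const ToyClause *local_quals[TOY_MAX];  /* left for the local executor */
    size_t           nlocal;
} ToyPlan;

static void
toy_build_plan(const ToyClause *clauses, size_t n, ToyPlan *plan)
{
    size_t used = 0;

    memset(plan, 0, sizeof(*plan));
    for (size_t i = 0; i < n; i++)
    {
        const ToyClause *c = &clauses[i];

        if (c->remote_ok && plan->nparams < TOY_MAX)
        {
            /* Ship the comparison; the value arrives as parameter $k. */
            int w = snprintf(plan->remote_where + used,
                             sizeof(plan->remote_where) - used,
                             "%s%s = $%zu",
                             plan->nparams == 0 ? "WHERE " : " AND ",
                             c->column, plan->nparams + 1);

            assert(w > 0 && (size_t) w < sizeof(plan->remote_where) - used);
            used += (size_t) w;
            plan->param_exprs[plan->nparams++] = c->subexpr;
        }
        else
        {
            /* Not shippable: keep it in the locally checked qual list. */
            assert(plan->nlocal < TOY_MAX);
            plan->local_quals[plan->nlocal++] = c;
        }
    }
}
```

With three clauses of which the first and third are marked remote_ok, the model produces the text WHERE id = $1 AND kind = $2 for columns id and kind, two parameter expressions, and one local qual. The model deliberately omits the recheck requirement for clauses removed from the qual list.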
  77
  78    Any clauses removed from the plan node's qual list must instead be
  79    added to fdw_recheck_quals or rechecked by RecheckForeignScan in order
  80    to ensure correct behavior at the READ COMMITTED isolation level. When
  81    a concurrent update occurs for some other table involved in the query,
  82    the executor may need to verify that all of the original quals are
  83    still satisfied for the tuple, possibly against a different set of
  84    parameter values. Using fdw_recheck_quals is typically easier than
  85    implementing checks inside RecheckForeignScan, but this method will be
  86    insufficient when outer joins have been pushed down, since the join
  87    tuples in that case might have some fields go to NULL without rejecting
  88    the tuple entirely.
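
Why a remotely checked clause still needs a local recheck can be seen in a small standalone model. This is plain C, not PostgreSQL code, and ToyRow, toy_remote_scan, and toy_recheck_qual are hypothetical names. The remote scan filters on foreign_variable = $1 using the parameter value known when the rows were fetched; if a concurrent update elsewhere in the query changes that value, the already-fetched row has to be tested again against the new value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a row held by the remote server. */
typedef struct ToyRow
{
    int foreign_variable;
} ToyRow;

/* Models the remote side applying "foreign_variable = $1" during the scan. */
static size_t
toy_remote_scan(const ToyRow *table, size_t nrows, int param, ToyRow *out)
{
    size_t nout = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (table[i].foreign_variable == param)
            out[nout++] = table[i];
    }
    return nout;
}

/*
 * Models the local recheck: the same qual, evaluated against a row that was
 * already fetched, using whatever parameter value is current now.
 */
static bool
toy_recheck_qual(const ToyRow *row, int current_param)
{
    return row->foreign_variable == current_param;
}
```

A row fetched with the parameter equal to 5 passes the recheck while the parameter is still 5 and is rejected once it has become 6. If the clause had been dropped without any recheck, the stale row would be returned.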
  89
  90    Another ForeignScan field that can be filled by FDWs is fdw_scan_tlist,
  91    which describes the tuples returned by the FDW for this plan node. For
  92    simple foreign table scans this can be set to NIL, implying that the
  93    returned tuples have the row type declared for the foreign table. A
  94    non-NIL value must be a target list (list of TargetEntrys) containing
  95    Vars and/or expressions representing the returned columns. This might
  96    be used, for example, to show that the FDW has omitted some columns
  97    that it noticed won't be needed for the query. Also, if the FDW can
  98    compute expressions used by the query more cheaply than can be done
  99    locally, it could add those expressions to fdw_scan_tlist. Note that
 100    join plans (created from paths made by GetForeignJoinPaths) must always
 101    supply fdw_scan_tlist to describe the set of columns they will return.
 102
 103    The FDW should always construct at least one path that depends only on
 104    the table's restriction clauses. In join queries, it might also choose
 105    to construct path(s) that depend on join clauses, for example
 106    foreign_variable = local_variable. Such clauses will not be found in
 107    baserel->baserestrictinfo but must be sought in the relation's join
 108    lists. A path using such a clause is called a “parameterized path”. It
 109    must identify the other relations used in the selected join clause(s)
 110    with a suitable value of param_info; use get_baserel_parampathinfo to
 111    compute that value. In GetForeignPlan, the local_variable portion of
 112    the join clause would be added to fdw_exprs, and then at run time the
 113    case works the same as for an ordinary restriction clause.
 114
 115    If an FDW supports remote joins, GetForeignJoinPaths should produce
 116    ForeignPaths for potential remote joins in much the same way as
 117    GetForeignPaths works for base tables. Information about the intended
 118    join can be passed forward to GetForeignPlan in the same ways described
 119    above. However, baserestrictinfo is not relevant for join relations;
 120    instead, the relevant join clauses for a particular join are passed to
 121    GetForeignJoinPaths as a separate parameter (extra->restrictlist).
 122
 123    An FDW might additionally support direct execution of some plan actions
 124    that are above the level of scans and joins, such as grouping or
 125    aggregation. To offer such options, the FDW should generate paths and
 126    insert them into the appropriate upper relation. For example, a path
 127    representing remote aggregation should be inserted into the
 128    UPPERREL_GROUP_AGG relation, using add_path. This path will be compared
 129    on a cost basis with local aggregation performed by reading a simple
 130    scan path for the foreign relation (note that such a path must also be
 131    supplied, else there will be an error at plan time). If the
 132    remote-aggregation path wins, which it usually would, it will be
 133    converted into a plan in the usual way, by calling GetForeignPlan. The
 134    recommended place to generate such paths is in the GetForeignUpperPaths
 135    callback function, which is called for each upper relation (i.e., each
 136    post-scan/join processing step), if all the base relations of the query
 137    come from the same FDW.
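
The cost comparison described above can be sketched with a standalone toy model. It is plain C, not PostgreSQL code: ToyPath, toy_local_agg, toy_remote_agg, and toy_cheapest are hypothetical names, the cost formulas are invented for illustration, and the selection looks only at total cost, whereas the planner's add_path weighs further properties. The model shows why remote aggregation usually wins: it transfers one row per group instead of one row per input row.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a path with its estimated costs. */
typedef struct ToyPath
{
    const char *label;
    double      startup_cost;
    double      total_cost;
} ToyPath;

/* Local aggregation: every row crosses the network, then is aggregated. */
static ToyPath
toy_local_agg(double rows, double fetch_per_row, double agg_per_row)
{
    ToyPath p = {"local aggregate over foreign scan", 0.0,
                 rows * (fetch_per_row + agg_per_row)};

    return p;
}

/* Remote aggregation: pay a startup charge, then transfer one row per group. */
static ToyPath
toy_remote_agg(double groups, double fetch_per_row, double remote_startup)
{
    ToyPath p = {"remote aggregate", remote_startup,
                 remote_startup + groups * fetch_per_row};

    return p;
}

/* Return the path with the lowest total cost; ties keep the earlier path. */
static const ToyPath *
toy_cheapest(const ToyPath *paths, size_t npaths)
{
    const ToyPath *best = NULL;

    for (size_t i = 0; i < npaths; i++)
    {
        if (best == NULL || paths[i].total_cost < best->total_cost)
            best = &paths[i];
    }
    return best;
}
```

With 100000 input rows collapsing into 10 groups, the remote path is far cheaper in this model. When the number of groups approaches the number of rows, the remote startup charge can tip the choice back to local aggregation, which is why both paths need to be available.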
 138
 139    PlanForeignModify and the other callbacks described in Section 58.2.4
 140    are designed around the assumption that the foreign relation will be
 141    scanned in the usual way and then individual row updates will be driven
 142    by a local ModifyTable plan node. This approach is necessary for the
 143    general case where an update requires reading local tables as well as
 144    foreign tables. However, if the operation could be executed entirely by
 145    the foreign server, the FDW could generate a path representing that and
 146    insert it into the UPPERREL_FINAL upper relation, where it would
 147    compete against the ModifyTable approach. This approach could also be
 148    used to implement remote SELECT FOR UPDATE, rather than using the row
 149    locking callbacks described in Section 58.2.6. Keep in mind that a path
 150    inserted into UPPERREL_FINAL is responsible for implementing all
 151    behavior of the query.
 152
 153    When planning an UPDATE or DELETE, PlanForeignModify and
 154    PlanDirectModify can look up the RelOptInfo struct for the foreign
 155    table and make use of the baserel->fdw_private data previously created
 156    by the scan-planning functions. However, in INSERT the target table is
 157    not scanned so there is no RelOptInfo for it. The List returned by
 158    PlanForeignModify has the same restrictions as the fdw_private list of
 159    a ForeignScan plan node; that is, it must contain only structures that
 160    copyObject knows how to copy.
 161
 162    INSERT with an ON CONFLICT clause does not support specifying the
 163    conflict target, as unique constraints or exclusion constraints on
 164    remote tables are not locally known. This in turn implies that ON
 165    CONFLICT DO UPDATE is not supported, since the specification is
 166    mandatory there.