begriffs open source - ai-pg/blob - full-docs/txt/plpgsql-implementation.txt

   1
   2 41.11. PL/pgSQL under the Hood #
   3
   4    41.11.1. Variable Substitution
   5    41.11.2. Plan Caching
   6
   7    This section discusses some implementation details that are frequently
   8    important for PL/pgSQL users to know.
   9
  10 41.11.1. Variable Substitution #
  11
  12    SQL statements and expressions within a PL/pgSQL function can refer to
  13    variables and parameters of the function. Behind the scenes, PL/pgSQL
  14    substitutes query parameters for such references. Query parameters will
  15    only be substituted in places where they are syntactically permissible.
  16    As an extreme case, consider this example of poor programming style:
  17 INSERT INTO foo (foo) VALUES (foo(foo));
  18
  19    The first occurrence of foo must syntactically be a table name, so it
  20    will not be substituted, even if the function has a variable named foo.
  21    The second occurrence must be the name of a column of that table, so it
  22    will not be substituted either. Likewise the third occurrence must be a
  23    function name, so it also will not be substituted for. Only the last
  24    occurrence is a candidate to be a reference to a variable of the
  25    PL/pgSQL function.
  26
  27    Another way to understand this is that variable substitution can only
  28    insert data values into an SQL command; it cannot dynamically change
  29    which database objects are referenced by the command. (If you want to
  30    do that, you must build a command string dynamically, as explained in
  31    Section 41.5.4.)
  32
  33    Since the names of variables are syntactically no different from the
  34    names of table columns, there can be ambiguity in statements that also
  35    refer to tables: is a given name meant to refer to a table column, or a
  36    variable? Let's change the previous example to
  37 INSERT INTO dest (col) SELECT foo + bar FROM src;
  38
  39    Here, dest and src must be table names, and col must be a column of
  40    dest, but foo and bar might reasonably be either variables of the
  41    function or columns of src.
  42
  43    By default, PL/pgSQL will report an error if a name in an SQL statement
  44    could refer to either a variable or a table column. You can fix such a
  45    problem by renaming the variable or column, or by qualifying the
  46    ambiguous reference, or by telling PL/pgSQL which interpretation to
  47    prefer.
  48
  49    The simplest solution is to rename the variable or column. A common
  50    coding rule is to use a different naming convention for PL/pgSQL
  51    variables than you use for column names. For example, if you
  52    consistently name function variables v_something while none of your
  53    column names start with v_, no conflicts will occur.
  54
  55    Alternatively you can qualify ambiguous references to make them clear.
  56    In the above example, src.foo would be an unambiguous reference to the
  57    table column. To create an unambiguous reference to a variable, declare
  58    it in a labeled block and use the block's label (see Section 41.2). For
  59    example,
  60 <<block>>
  61 DECLARE
  62     foo int;
  63 BEGIN
  64     foo := ...;
  65     INSERT INTO dest (col) SELECT block.foo + bar FROM src;
  66
  67    Here block.foo means the variable even if there is a column foo in src.
  68    Function parameters, as well as special variables such as FOUND, can be
  69    qualified by the function's name, because they are implicitly declared
  70    in an outer block labeled with the function's name.
  71
  72    Sometimes it is impractical to fix all the ambiguous references in a
  73    large body of PL/pgSQL code. In such cases you can specify that
  74    PL/pgSQL should resolve ambiguous references as the variable (which is
  75    compatible with PL/pgSQL's behavior before PostgreSQL 9.0), or as the
  76    table column (which is compatible with some other systems such as
  77    Oracle).
  78
  79    To change this behavior on a system-wide basis, set the configuration
  80    parameter plpgsql.variable_conflict to one of error, use_variable, or
  81    use_column (where error is the factory default). This parameter affects
  82    subsequent compilations of statements in PL/pgSQL functions, but not
  83    statements already compiled in the current session. Because changing
  84    this setting can cause unexpected changes in the behavior of PL/pgSQL
  85    functions, it can only be changed by a superuser.
  86
  87    You can also set the behavior on a function-by-function basis, by
  88    inserting one of these special commands at the start of the function
  89    text:
  90 #variable_conflict error
  91 #variable_conflict use_variable
  92 #variable_conflict use_column
  93
  94    These commands affect only the function they are written in, and
  95    override the setting of plpgsql.variable_conflict. An example is
  96 CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$
  97     #variable_conflict use_variable
  98     DECLARE
  99         curtime timestamp := now();
 100     BEGIN
 101         UPDATE users SET last_modified = curtime, comment = comment
 102           WHERE users.id = id;
 103     END;
 104 $$ LANGUAGE plpgsql;
 105
 106    In the UPDATE command, curtime, comment, and id will refer to the
 107    function's variable and parameters whether or not users has columns of
 108    those names. Notice that we had to qualify the reference to users.id in
 109    the WHERE clause to make it refer to the table column. But we did not
 110    have to qualify the reference to comment as a target in the UPDATE
 111    list, because syntactically that must be a column of users. We could
 112    write the same function without depending on the variable_conflict
 113    setting in this way:
 114 CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$
 115     <<fn>>
 116     DECLARE
 117         curtime timestamp := now();
 118     BEGIN
 119         UPDATE users SET last_modified = fn.curtime, comment = stamp_user.commen
 120 t
 121           WHERE users.id = stamp_user.id;
 122     END;
 123 $$ LANGUAGE plpgsql;
 124
 125    Variable substitution does not happen in a command string given to
 126    EXECUTE or one of its variants. If you need to insert a varying value
 127    into such a command, do so as part of constructing the string value, or
 128    use USING, as illustrated in Section 41.5.4.
 129
 130    Variable substitution currently works only in SELECT, INSERT, UPDATE,
 131    DELETE, and commands containing one of these (such as EXPLAIN and
 132    CREATE TABLE ... AS SELECT), because the main SQL engine allows query
 133    parameters only in these commands. To use a non-constant name or value
 134    in other statement types (generically called utility statements), you
 135    must construct the utility statement as a string and EXECUTE it.
 136
 137 41.11.2. Plan Caching #
 138
 139    The PL/pgSQL interpreter parses the function's source text and produces
 140    an internal binary instruction tree the first time the function is
 141    called (within each session). The instruction tree fully translates the
 142    PL/pgSQL statement structure, but individual SQL expressions and SQL
 143    commands used in the function are not translated immediately.
 144
 145    As each expression and SQL command is first executed in the function,
 146    the PL/pgSQL interpreter parses and analyzes the command to create a
 147    prepared statement, using the SPI manager's SPI_prepare function.
 148    Subsequent visits to that expression or command reuse the prepared
 149    statement. Thus, a function with conditional code paths that are seldom
 150    visited will never incur the overhead of analyzing those commands that
 151    are never executed within the current session. A disadvantage is that
 152    errors in a specific expression or command cannot be detected until
 153    that part of the function is reached in execution. (Trivial syntax
 154    errors will be detected during the initial parsing pass, but anything
 155    deeper will not be detected until execution.)
 156
 157    PL/pgSQL (or more precisely, the SPI manager) can furthermore attempt
 158    to cache the execution plan associated with any particular prepared
 159    statement. If a cached plan is not used, then a fresh execution plan is
 160    generated on each visit to the statement, and the current parameter
 161    values (that is, PL/pgSQL variable values) can be used to optimize the
 162    selected plan. If the statement has no parameters, or is executed many
 163    times, the SPI manager will consider creating a generic plan that is
 164    not dependent on specific parameter values, and caching that for
 165    re-use. Typically this will happen only if the execution plan is not
 166    very sensitive to the values of the PL/pgSQL variables referenced in
 167    it. If it is, generating a plan each time is a net win. See PREPARE for
 168    more information about the behavior of prepared statements.
 169
 170    Because PL/pgSQL saves prepared statements and sometimes execution
 171    plans in this way, SQL commands that appear directly in a PL/pgSQL
 172    function must refer to the same tables and columns on every execution;
 173    that is, you cannot use a parameter as the name of a table or column in
 174    an SQL command. To get around this restriction, you can construct
 175    dynamic commands using the PL/pgSQL EXECUTE statement — at the price of
 176    performing new parse analysis and constructing a new execution plan on
 177    every execution.
 178
 179    The mutable nature of record variables presents another problem in this
 180    connection. When fields of a record variable are used in expressions or
 181    statements, the data types of the fields must not change from one call
 182    of the function to the next, since each expression will be analyzed
 183    using the data type that is present when the expression is first
 184    reached. EXECUTE can be used to get around this problem when necessary.
 185
 186    If the same function is used as a trigger for more than one table,
 187    PL/pgSQL prepares and caches statements independently for each such
 188    table — that is, there is a cache for each trigger function and table
 189    combination, not just for each function. This alleviates some of the
 190    problems with varying data types; for instance, a trigger function will
 191    be able to work successfully with a column named key even if it happens
 192    to have different types in different tables.
 193
 194    Likewise, functions having polymorphic argument types have a separate
 195    statement cache for each combination of actual argument types they have
 196    been invoked for, so that data type differences do not cause unexpected
 197    failures.
 198
 199    Statement caching can sometimes have surprising effects on the
 200    interpretation of time-sensitive values. For example there is a
 201    difference between what these two functions do:
 202 CREATE FUNCTION logfunc1(logtxt text) RETURNS void AS $$
 203     BEGIN
 204         INSERT INTO logtable VALUES (logtxt, 'now');
 205     END;
 206 $$ LANGUAGE plpgsql;
 207
 208    and:
 209 CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$
 210     DECLARE
 211         curtime timestamp;
 212     BEGIN
 213         curtime := 'now';
 214         INSERT INTO logtable VALUES (logtxt, curtime);
 215     END;
 216 $$ LANGUAGE plpgsql;
 217
 218    In the case of logfunc1, the PostgreSQL main parser knows when
 219    analyzing the INSERT that the string 'now' should be interpreted as
 220    timestamp, because the target column of logtable is of that type. Thus,
 221    'now' will be converted to a timestamp constant when the INSERT is
 222    analyzed, and then used in all invocations of logfunc1 during the
 223    lifetime of the session. Needless to say, this isn't what the
 224    programmer wanted. A better idea is to use the now() or
 225    current_timestamp function.
 226
 227    In the case of logfunc2, the PostgreSQL main parser does not know what
 228    type 'now' should become and therefore it returns a data value of type
 229    text containing the string now. During the ensuing assignment to the
 230    local variable curtime, the PL/pgSQL interpreter casts this string to
 231    the timestamp type by calling the textout and timestamp_in functions
 232    for the conversion. So, the computed time stamp is updated on each
 233    execution as the programmer expects. Even though this happens to work
 234    as expected, it's not terribly efficient, so use of the now() function
 235    would still be a better idea.