
   1
   2 36.10. C-Language Functions #
   3
   4    36.10.1. Dynamic Loading
   5    36.10.2. Base Types in C-Language Functions
   6    36.10.3. Version 1 Calling Conventions
   7    36.10.4. Writing Code
   8    36.10.5. Compiling and Linking Dynamically-Loaded Functions
   9    36.10.6. Server API and ABI Stability Guidance
  10    36.10.7. Composite-Type Arguments
  11    36.10.8. Returning Rows (Composite Types)
  12    36.10.9. Returning Sets
  13    36.10.10. Polymorphic Arguments and Return Types
  14    36.10.11. Shared Memory
  15    36.10.12. LWLocks
  16    36.10.13. Custom Wait Events
  17    36.10.14. Injection Points
  18    36.10.15. Custom Cumulative Statistics
  19    36.10.16. Using C++ for Extensibility
  20
  21    User-defined functions can be written in C (or a language that can be
  22    made compatible with C, such as C++). Such functions are compiled into
  23    dynamically loadable objects (also called shared libraries) and are
  24    loaded by the server on demand. The dynamic loading feature is what
  25    distinguishes “C language” functions from “internal” functions — the
  26    actual coding conventions are essentially the same for both. (Hence,
  27    the standard internal function library is a rich source of coding
  28    examples for user-defined C functions.)
  29
  30    Currently only one calling convention is used for C functions (“version
  31    1”). Support for that calling convention is indicated by writing a
  32    PG_FUNCTION_INFO_V1() macro call for the function, as illustrated
  33    below.
  34
  35 36.10.1. Dynamic Loading #
  36
  37    The first time a user-defined function in a particular loadable object
  38    file is called in a session, the dynamic loader loads that object file
  39    into memory so that the function can be called. The CREATE FUNCTION for
  40    a user-defined C function must therefore specify two pieces of
  41    information for the function: the name of the loadable object file, and
  42    the C name (link symbol) of the specific function to call within that
  43    object file. If the C name is not explicitly specified then it is
  44    assumed to be the same as the SQL function name.
  45
  46    The following algorithm is used to locate the shared object file based
  47    on the name given in the CREATE FUNCTION command:
  48     1. If the name is an absolute path, the given file is loaded.
  49     2. If the name starts with the string $libdir, that part is replaced
  50        by the PostgreSQL package library directory name, which is
  51        determined at build time.
  52     3. If the name does not contain a directory part, the file is searched
  53        for in the path specified by the configuration variable
  54        dynamic_library_path.
  55     4. Otherwise (the file was not found in the path, or it contains a
  56        non-absolute directory part), the dynamic loader will try to take
  57        the name as given, which will most likely fail. (It is unreliable
  58        to depend on the current working directory.)
  59
  60    If this sequence does not work, the platform-specific shared library
  61    file name extension (often .so) is appended to the given name and this
  62    sequence is tried again. If that fails as well, the load will fail.
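
   As a rough illustration of the four steps above, the following
   standalone sketch classifies a library name the way the list describes.
   It is a toy model, not the server's implementation: the names NameKind,
   classify_library_name, and substitute_libdir are hypothetical, '/' is
   assumed to be the directory separator, and the dynamic_library_path
   search and the retry with the platform extension are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* How a library name given to CREATE FUNCTION would be treated (toy model). */
typedef enum
{
    NAME_ABSOLUTE,      /* step 1: the given file is loaded */
    NAME_LIBDIR,        /* step 2: $libdir is substituted */
    NAME_SEARCH_PATH,   /* step 3: searched for in dynamic_library_path */
    NAME_AS_GIVEN       /* step 4: non-absolute directory part */
} NameKind;

NameKind
classify_library_name(const char *name)
{
    assert(name != NULL);

    if (name[0] == '/')
        return NAME_ABSOLUTE;
    /* Simplification: any name beginning with "$libdir" counts. */
    if (strncmp(name, "$libdir", strlen("$libdir")) == 0)
        return NAME_LIBDIR;
    if (strchr(name, '/') == NULL)
        return NAME_SEARCH_PATH;
    return NAME_AS_GIVEN;
}

/*
 * Replace a leading "$libdir" with pkglibdir.  Returns 0 on success,
 * or -1 if the result does not fit into the output buffer.
 */
int
substitute_libdir(const char *name, const char *pkglibdir,
                  char *out, size_t outsize)
{
    const char *rest;
    int         n;

    assert(classify_library_name(name) == NAME_LIBDIR);
    rest = name + strlen("$libdir");
    n = snprintf(out, outsize, "%s%s", pkglibdir, rest);
    return (n >= 0 && (size_t) n < outsize) ? 0 : -1;
}
```

   With this model, 'funcs' falls under step 3, while 'sub/funcs' falls
   under step 4 and would depend on the current working directory.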
  63
  64    It is recommended to locate shared libraries either relative to $libdir
  65    or through the dynamic library path. This simplifies version upgrades
  66    if the new installation is at a different location. The actual
  67    directory that $libdir stands for can be found out with the command
  68    pg_config --pkglibdir.
  69
  70    The user ID the PostgreSQL server runs as must be able to traverse the
  71    path to the file you intend to load. Making the file or a higher-level
  72    directory not readable and/or not executable by the postgres user is a
  73    common mistake.
  74
  75    In any case, the file name that is given in the CREATE FUNCTION command
  76    is recorded literally in the system catalogs, so if the file needs to
  77    be loaded again the same procedure is applied.
  78
  79 Note
  80
  81    PostgreSQL will not compile a C function automatically. The object file
  82    must be compiled before it is referenced in a CREATE FUNCTION command.
  83    See Section 36.10.5 for additional information.
  84
  85    To ensure that a dynamically loaded object file is not loaded into an
  86    incompatible server, PostgreSQL checks that the file contains a “magic
  87    block” with the appropriate contents. This allows the server to detect
  88    obvious incompatibilities, such as code compiled for a different major
  89    version of PostgreSQL. To include a magic block, write this in one (and
  90    only one) of the module source files, after having included the header
  91    fmgr.h:
  92 PG_MODULE_MAGIC;
  93
  94    or
  95 PG_MODULE_MAGIC_EXT(parameters);
  96
  97    The PG_MODULE_MAGIC_EXT variant allows the specification of additional
  98    information about the module; currently, a name and/or a version string
   99    can be added. (More fields might be allowed in the future.) Write something
 100    like this:
 101 PG_MODULE_MAGIC_EXT(
 102     .name = "my_module_name",
 103     .version = "1.2.3"
 104 );
 105
 106    Subsequently the name and version can be examined via the
 107    pg_get_loaded_modules() function. The meaning of the version string is
 108    not restricted by PostgreSQL, but use of semantic versioning rules is
 109    recommended.
 110
 111    After it is used for the first time, a dynamically loaded object file
 112    is retained in memory. Future calls in the same session to the
 113    function(s) in that file will only incur the small overhead of a symbol
 114    table lookup. If you need to force a reload of an object file, for
 115    example after recompiling it, begin a fresh session.
 116
 117    Optionally, a dynamically loaded file can contain an initialization
 118    function. If the file includes a function named _PG_init, that function
 119    will be called immediately after loading the file. The function
 120    receives no parameters and should return void. There is presently no
 121    way to unload a dynamically loaded file.
 122
 123 36.10.2. Base Types in C-Language Functions #
 124
 125    To know how to write C-language functions, you need to know how
 126    PostgreSQL internally represents base data types and how they can be
 127    passed to and from functions. Internally, PostgreSQL regards a base
 128    type as a “blob of memory”. The user-defined functions that you define
 129    over a type in turn define the way that PostgreSQL can operate on it.
 130    That is, PostgreSQL will only store and retrieve the data from disk and
 131    use your user-defined functions to input, process, and output the data.
 132
 133    Base types can have one of three internal formats:
 134      * pass by value, fixed-length
 135      * pass by reference, fixed-length
 136      * pass by reference, variable-length
 137
 138    By-value types can only be 1, 2, or 4 bytes in length (also 8 bytes, if
 139    sizeof(Datum) is 8 on your machine). You should be careful to define
 140    your types such that they will be the same size (in bytes) on all
 141    architectures. For example, the long type is dangerous because it is 4
  142    bytes on some machines and 8 bytes on others, whereas the int type is 4
 143    bytes on most Unix machines. A reasonable implementation of the int4
 144    type on Unix machines might be:
 145 /* 4-byte integer, passed by value */
 146 typedef int int4;
 147
 148    (The actual PostgreSQL C code calls this type int32, because it is a
 149    convention in C that intXX means XX bits. Note therefore also that the
 150    C type int8 is 1 byte in size. The SQL type int8 is called int64 in C.
 151    See also Table 36.2.)
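
   The size rule for by-value types can be written down as a small
   standalone check. This is only a sketch: my_int32, my_int64, and
   can_pass_by_value are hypothetical names, and the datum_size parameter
   stands in for sizeof(Datum) on the target machine. The compile-time
   assertions (C11) show one way to make sure a type has the same size on
   every architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width types have the same size everywhere, unlike long. */
typedef int32_t my_int32;
typedef int64_t my_int64;

/* Fail the build, rather than misbehave at run time, if a size is wrong. */
_Static_assert(sizeof(my_int32) == 4, "my_int32 must be 4 bytes");
_Static_assert(sizeof(my_int64) == 8, "my_int64 must be 8 bytes");

/*
 * Can a fixed-length type of type_len bytes be passed by value?
 * datum_size models sizeof(Datum).
 */
bool
can_pass_by_value(size_t type_len, size_t datum_size)
{
    switch (type_len)
    {
        case 1:
        case 2:
        case 4:
            return true;
        case 8:
            return datum_size == 8;
        default:
            return false;
    }
}
```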
 152
 153    On the other hand, fixed-length types of any size can be passed
 154    by-reference. For example, here is a sample implementation of a
 155    PostgreSQL type:
 156 /* 16-byte structure, passed by reference */
 157 typedef struct
 158 {
 159     double  x, y;
 160 } Point;
 161
 162    Only pointers to such types can be used when passing them in and out of
 163    PostgreSQL functions. To return a value of such a type, allocate the
 164    right amount of memory with palloc, fill in the allocated memory, and
 165    return a pointer to it. (Also, if you just want to return the same
 166    value as one of your input arguments that's of the same data type, you
 167    can skip the extra palloc and just return the pointer to the input
 168    value.)
 169
 170    Finally, all variable-length types must also be passed by reference.
 171    All variable-length types must begin with an opaque length field of
 172    exactly 4 bytes, which will be set by SET_VARSIZE; never set this field
 173    directly! All data to be stored within that type must be located in the
 174    memory immediately following that length field. The length field
 175    contains the total length of the structure, that is, it includes the
 176    size of the length field itself.
 177
 178    Another important point is to avoid leaving any uninitialized bits
 179    within data type values; for example, take care to zero out any
 180    alignment padding bytes that might be present in structs. Without this,
 181    logically-equivalent constants of your data type might be seen as
 182    unequal by the planner, leading to inefficient (though not incorrect)
 183    plans.
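
   The effect of zeroing padding can be demonstrated outside the server.
   In the sketch below, TaggedValue is a hypothetical type that on most
   platforms has padding between its two fields. Clearing the whole
   structure before assigning the fields makes two logically equal values
   compare equal byte for byte. In server code, palloc0 or memset on
   palloc'd memory would play the role of the memset call shown here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type: most ABIs pad after "tag" so that "value" is aligned. */
typedef struct
{
    char    tag;
    double  value;
} TaggedValue;

/* Fill *out, zeroing every byte first so that padding is deterministic. */
void
tagged_value_init(TaggedValue *out, char tag, double value)
{
    assert(out != NULL);
    memset(out, 0, sizeof(TaggedValue));
    out->tag = tag;
    out->value = value;
}

/* Bitwise comparison over the whole structure, padding included. */
int
tagged_value_bitwise_equal(const TaggedValue *a, const TaggedValue *b)
{
    return memcmp(a, b, sizeof(TaggedValue)) == 0;
}
```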
 184
 185 Warning
 186
 187    Never modify the contents of a pass-by-reference input value. If you do
 188    so you are likely to corrupt on-disk data, since the pointer you are
 189    given might point directly into a disk buffer. The sole exception to
 190    this rule is explained in Section 36.12.
 191
 192    As an example, we can define the type text as follows:
 193 typedef struct {
 194     int32 length;
 195     char data[FLEXIBLE_ARRAY_MEMBER];
 196 } text;
 197
 198    The [FLEXIBLE_ARRAY_MEMBER] notation means that the actual length of
 199    the data part is not specified by this declaration.
 200
 201    When manipulating variable-length types, we must be careful to allocate
 202    the correct amount of memory and set the length field correctly. For
 203    example, if we wanted to store 40 bytes in a text structure, we might
 204    use a code fragment like this:
 205 #include "postgres.h"
 206 ...
 207 char buffer[40]; /* our source data */
 208 ...
 209 text *destination = (text *) palloc(VARHDRSZ + 40);
 210 SET_VARSIZE(destination, VARHDRSZ + 40);
 211 memcpy(destination->data, buffer, 40);
 212 ...
 213
 214
 215    VARHDRSZ is the same as sizeof(int32), but it's considered good style
 216    to use the macro VARHDRSZ to refer to the size of the overhead for a
 217    variable-length type. Also, the length field must be set using the
 218    SET_VARSIZE macro, not by simple assignment.
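
   The arithmetic behind the fragment above can be tried in a standalone
   program. The following is a deliberately simplified model, not the
   server's representation: ToyVarlena, TOY_HDRSZ, and the two helper
   functions are hypothetical, malloc stands in for palloc, and the length
   is stored as a plain integer, whereas the real header is opaque and
   must only be written through SET_VARSIZE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Plays the role of VARHDRSZ in this toy model. */
#define TOY_HDRSZ ((uint32_t) sizeof(uint32_t))

typedef struct
{
    uint32_t total_len;     /* header size plus payload size */
    char     data[];        /* payload follows the header directly */
} ToyVarlena;

/* Allocate a value holding payload_len bytes copied from src. */
ToyVarlena *
toy_varlena_make(const void *src, uint32_t payload_len)
{
    ToyVarlena *v = malloc(TOY_HDRSZ + payload_len);

    if (v == NULL)
        return NULL;
    v->total_len = TOY_HDRSZ + payload_len;     /* includes the header */
    if (payload_len > 0)
        memcpy(v->data, src, payload_len);
    return v;
}

/* Payload size is the total length minus the header. */
uint32_t
toy_varlena_payload_len(const ToyVarlena *v)
{
    assert(v != NULL && v->total_len >= TOY_HDRSZ);
    return v->total_len - TOY_HDRSZ;
}
```

   Storing 40 bytes therefore yields a total length of 44 in this model,
   matching the VARHDRSZ + 40 computed in the fragment above.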
 219
 220    Table 36.2 shows the C types corresponding to many of the built-in SQL
 221    data types of PostgreSQL. The “Defined In” column gives the header file
 222    that needs to be included to get the type definition. (The actual
 223    definition might be in a different file that is included by the listed
 224    file. It is recommended that users stick to the defined interface.)
 225    Note that you should always include postgres.h first in any source file
 226    of server code, because it declares a number of things that you will
 227    need anyway, and because including other headers first can cause
 228    portability issues.
 229
 230    Table 36.2. Equivalent C Types for Built-in SQL Types
 231            SQL Type             C Type                 Defined In
 232    boolean                   bool          postgres.h (maybe compiler built-in)
 233    box                       BOX*          utils/geo_decls.h
 234    bytea                     bytea*        postgres.h
 235    "char"                    char          (compiler built-in)
 236    character                 BpChar*       postgres.h
 237    cid                       CommandId     postgres.h
 238    date                      DateADT       utils/date.h
 239    float4 (real)             float4        postgres.h
 240    float8 (double precision) float8        postgres.h
 241    int2 (smallint)           int16         postgres.h
 242    int4 (integer)            int32         postgres.h
 243    int8 (bigint)             int64         postgres.h
 244    interval                  Interval*     datatype/timestamp.h
 245    lseg                      LSEG*         utils/geo_decls.h
 246    name                      Name          postgres.h
 247    numeric                   Numeric       utils/numeric.h
 248    oid                       Oid           postgres.h
 249    oidvector                 oidvector*    postgres.h
 250    path                      PATH*         utils/geo_decls.h
 251    point                     POINT*        utils/geo_decls.h
 252    regproc                   RegProcedure  postgres.h
 253    text                      text*         postgres.h
 254    tid                       ItemPointer   storage/itemptr.h
 255    time                      TimeADT       utils/date.h
 256    time with time zone       TimeTzADT     utils/date.h
 257    timestamp                 Timestamp     datatype/timestamp.h
 258    timestamp with time zone  TimestampTz   datatype/timestamp.h
 259    varchar                   VarChar*      postgres.h
 260    xid                       TransactionId postgres.h
 261
 262    Now that we've gone over all of the possible structures for base types,
 263    we can show some examples of real functions.
 264
 265 36.10.3. Version 1 Calling Conventions #
 266
 267    The version-1 calling convention relies on macros to suppress most of
 268    the complexity of passing arguments and results. The C declaration of a
 269    version-1 function is always:
 270 Datum funcname(PG_FUNCTION_ARGS)
 271
 272    In addition, the macro call:
 273 PG_FUNCTION_INFO_V1(funcname);
 274
 275    must appear in the same source file. (Conventionally, it's written just
 276    before the function itself.) This macro call is not needed for
 277    internal-language functions, since PostgreSQL assumes that all internal
 278    functions use the version-1 convention. It is, however, required for
 279    dynamically-loaded functions.
 280
 281    In a version-1 function, each actual argument is fetched using a
 282    PG_GETARG_xxx() macro that corresponds to the argument's data type. (In
  283    non-strict functions, each argument must first be checked for null
  284    using PG_ARGISNULL(); see below.) The result is returned
 285    using a PG_RETURN_xxx() macro for the return type. PG_GETARG_xxx()
 286    takes as its argument the number of the function argument to fetch,
 287    where the count starts at 0. PG_RETURN_xxx() takes as its argument the
 288    actual value to return.
 289
 290    To call another version-1 function, you can use
 291    DirectFunctionCalln(func, arg1, ..., argn). This is particularly useful
 292    when you want to call functions defined in the standard internal
 293    library, by using an interface similar to their SQL signature.
 294
 295    These convenience functions and similar ones can be found in fmgr.h.
  296    The DirectFunctionCalln family expects a C function name as its first
 297    argument. There are also OidFunctionCalln which take the OID of the
 298    target function, and some other variants. All of these expect the
 299    function's arguments to be supplied as Datums, and likewise they return
 300    Datum. Note that neither arguments nor result are allowed to be NULL
 301    when using these convenience functions.
 302
 303    For example, to call the starts_with(text, text) function from C, you
 304    can search through the catalog and find out that its C implementation
 305    is the Datum text_starts_with(PG_FUNCTION_ARGS) function. Typically you
 306    would use DirectFunctionCall2(text_starts_with, ...) to call such a
 307    function. However, starts_with(text, text) requires collation
 308    information, so it will fail with “could not determine which collation
 309    to use for string comparison” if called that way. Instead you must use
 310    DirectFunctionCall2Coll(text_starts_with, ...) and provide the desired
 311    collation, which typically is just passed through from
 312    PG_GET_COLLATION(), as shown in the example below.
 313
 314    fmgr.h also supplies macros that facilitate conversions between C types
  315    and Datum. For example, to turn a Datum into text*, you can use
 316    DatumGetTextPP(X). While some types have macros named like
 317    TypeGetDatum(X) for the reverse conversion, text* does not; it's
 318    sufficient to use the generic macro PointerGetDatum(X) for that. If
 319    your extension defines additional types, it is usually convenient to
 320    define similar macros for your types too.
 321
 322    Here are some examples using the version-1 calling convention:
 323 #include "postgres.h"
 324 #include <string.h>
 325 #include "fmgr.h"
 326 #include "utils/geo_decls.h"
 327 #include "varatt.h"
 328
 329 PG_MODULE_MAGIC;
 330
 331 /* by value */
 332
 333 PG_FUNCTION_INFO_V1(add_one);
 334
 335 Datum
 336 add_one(PG_FUNCTION_ARGS)
 337 {
 338     int32   arg = PG_GETARG_INT32(0);
 339
 340     PG_RETURN_INT32(arg + 1);
 341 }
 342
 343 /* by reference, fixed length */
 344
 345 PG_FUNCTION_INFO_V1(add_one_float8);
 346
 347 Datum
 348 add_one_float8(PG_FUNCTION_ARGS)
 349 {
 350     /* The macros for FLOAT8 hide its pass-by-reference nature. */
 351     float8   arg = PG_GETARG_FLOAT8(0);
 352
 353     PG_RETURN_FLOAT8(arg + 1.0);
 354 }
 355
 356 PG_FUNCTION_INFO_V1(makepoint);
 357
 358 Datum
 359 makepoint(PG_FUNCTION_ARGS)
 360 {
 361     /* Here, the pass-by-reference nature of Point is not hidden. */
 362     Point     *pointx = PG_GETARG_POINT_P(0);
 363     Point     *pointy = PG_GETARG_POINT_P(1);
 364     Point     *new_point = (Point *) palloc(sizeof(Point));
 365
 366     new_point->x = pointx->x;
 367     new_point->y = pointy->y;
 368
 369     PG_RETURN_POINT_P(new_point);
 370 }
 371
 372 /* by reference, variable length */
 373
 374 PG_FUNCTION_INFO_V1(copytext);
 375
 376 Datum
 377 copytext(PG_FUNCTION_ARGS)
 378 {
 379     text     *t = PG_GETARG_TEXT_PP(0);
 380
 381     /*
 382      * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
 383      * VARHDRSZ or VARHDRSZ_SHORT of its header.  Construct the copy with a
 384      * full-length header.
 385      */
 386     text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
 387     SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
 388
 389     /*
 390      * VARDATA is a pointer to the data region of the new struct.  The source
 391      * could be a short datum, so retrieve its data through VARDATA_ANY.
 392      */
 393     memcpy(VARDATA(new_t),          /* destination */
 394            VARDATA_ANY(t),          /* source */
 395            VARSIZE_ANY_EXHDR(t));   /* how many bytes */
 396     PG_RETURN_TEXT_P(new_t);
 397 }
 398
 399 PG_FUNCTION_INFO_V1(concat_text);
 400
 401 Datum
 402 concat_text(PG_FUNCTION_ARGS)
 403 {
 404     text  *arg1 = PG_GETARG_TEXT_PP(0);
 405     text  *arg2 = PG_GETARG_TEXT_PP(1);
 406     int32 arg1_size = VARSIZE_ANY_EXHDR(arg1);
 407     int32 arg2_size = VARSIZE_ANY_EXHDR(arg2);
 408     int32 new_text_size = arg1_size + arg2_size + VARHDRSZ;
 409     text *new_text = (text *) palloc(new_text_size);
 410
 411     SET_VARSIZE(new_text, new_text_size);
 412     memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
 413     memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
 414     PG_RETURN_TEXT_P(new_text);
 415 }
 416
 417 /* A wrapper around starts_with(text, text) */
 418
 419 PG_FUNCTION_INFO_V1(t_starts_with);
 420
 421 Datum
 422 t_starts_with(PG_FUNCTION_ARGS)
 423 {
 424     text       *t1 = PG_GETARG_TEXT_PP(0);
 425     text       *t2 = PG_GETARG_TEXT_PP(1);
 426     Oid         collid = PG_GET_COLLATION();
 427     bool        result;
 428
 429     result = DatumGetBool(DirectFunctionCall2Coll(text_starts_with,
 430                                                   collid,
 431                                                   PointerGetDatum(t1),
 432                                                   PointerGetDatum(t2)));
 433     PG_RETURN_BOOL(result);
 434 }
 435
 436
 437    Supposing that the above code has been prepared in file funcs.c and
 438    compiled into a shared object, we could define the functions to
 439    PostgreSQL with commands like this:
 440 CREATE FUNCTION add_one(integer) RETURNS integer
 441      AS 'DIRECTORY/funcs', 'add_one'
 442      LANGUAGE C STRICT;
 443
 444 -- note overloading of SQL function name "add_one"
 445 CREATE FUNCTION add_one(double precision) RETURNS double precision
 446      AS 'DIRECTORY/funcs', 'add_one_float8'
 447      LANGUAGE C STRICT;
 448
 449 CREATE FUNCTION makepoint(point, point) RETURNS point
 450      AS 'DIRECTORY/funcs', 'makepoint'
 451      LANGUAGE C STRICT;
 452
 453 CREATE FUNCTION copytext(text) RETURNS text
 454      AS 'DIRECTORY/funcs', 'copytext'
 455      LANGUAGE C STRICT;
 456
 457 CREATE FUNCTION concat_text(text, text) RETURNS text
 458      AS 'DIRECTORY/funcs', 'concat_text'
 459      LANGUAGE C STRICT;
 460
 461 CREATE FUNCTION t_starts_with(text, text) RETURNS boolean
 462      AS 'DIRECTORY/funcs', 't_starts_with'
 463      LANGUAGE C STRICT;
 464
 465    Here, DIRECTORY stands for the directory of the shared library file
 466    (for instance the PostgreSQL tutorial directory, which contains the
 467    code for the examples used in this section). (Better style would be to
 468    use just 'funcs' in the AS clause, after having added DIRECTORY to the
 469    search path. In any case, we can omit the system-specific extension for
 470    a shared library, commonly .so.)
 471
 472    Notice that we have specified the functions as “strict”, meaning that
 473    the system should automatically assume a null result if any input value
 474    is null. By doing this, we avoid having to check for null inputs in the
 475    function code. Without this, we'd have to check for null values
 476    explicitly, using PG_ARGISNULL().
 477
 478    The macro PG_ARGISNULL(n) allows a function to test whether each input
 479    is null. (Of course, doing this is only necessary in functions not
 480    declared “strict”.) As with the PG_GETARG_xxx() macros, the input
 481    arguments are counted beginning at zero. Note that one should refrain
 482    from executing PG_GETARG_xxx() until one has verified that the argument
 483    isn't null. To return a null result, execute PG_RETURN_NULL(); this
 484    works in both strict and nonstrict functions.
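
   The division of labor between a strict declaration and an explicit
   null check can be pictured with a toy model. The sketch below does not
   use the server's data structures: ToyCallInfo, ToyArg, and the toy_
   functions are hypothetical and only loosely mirror what the
   PG_ARGISNULL(), PG_GETARG_xxx(), and PG_RETURN_NULL() macros do on the
   real call record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ARGS 4

typedef struct
{
    int32_t value;
    bool    isnull;
} ToyArg;

typedef struct
{
    int     nargs;
    bool    isnull;                 /* set by the callee to return null */
    ToyArg  args[TOY_MAX_ARGS];     /* arguments, counted from zero */
} ToyCallInfo;

/* Non-strict style: test for null before touching the argument value. */
int32_t
toy_add_one_nonstrict(ToyCallInfo *fcinfo)
{
    assert(fcinfo->nargs == 1);
    if (fcinfo->args[0].isnull)
    {
        fcinfo->isnull = true;      /* analogous to PG_RETURN_NULL() */
        return 0;
    }
    return fcinfo->args[0].value + 1;
}

/* Strict style: the body may assume that no argument is null. */
int32_t
toy_add_one_strict(ToyCallInfo *fcinfo)
{
    return fcinfo->args[0].value + 1;
}

/* What the caller does for a strict function: skip the call on any null. */
int32_t
toy_call_strict(int32_t (*fn) (ToyCallInfo *), ToyCallInfo *fcinfo)
{
    for (int i = 0; i < fcinfo->nargs; i++)
    {
        if (fcinfo->args[i].isnull)
        {
            fcinfo->isnull = true;
            return 0;
        }
    }
    fcinfo->isnull = false;
    return fn(fcinfo);
}
```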
 485
 486    At first glance, the version-1 coding conventions might appear to be
 487    just pointless obscurantism, compared to using plain C calling
 488    conventions. They do however allow us to deal with NULLable
 489    arguments/return values, and “toasted” (compressed or out-of-line)
 490    values.
 491
 492    Other options provided by the version-1 interface are two variants of
 493    the PG_GETARG_xxx() macros. The first of these, PG_GETARG_xxx_COPY(),
 494    guarantees to return a copy of the specified argument that is safe for
 495    writing into. (The normal macros will sometimes return a pointer to a
 496    value that is physically stored in a table, which must not be written
 497    to. Using the PG_GETARG_xxx_COPY() macros guarantees a writable
 498    result.) The second variant consists of the PG_GETARG_xxx_SLICE()
 499    macros which take three arguments. The first is the number of the
 500    function argument (as above). The second and third are the offset and
 501    length of the segment to be returned. Offsets are counted from zero,
 502    and a negative length requests that the remainder of the value be
 503    returned. These macros provide more efficient access to parts of large
 504    values in the case where they have storage type “external”. (The
 505    storage type of a column can be specified using ALTER TABLE tablename
 506    ALTER COLUMN colname SET STORAGE storagetype. storagetype is one of
 507    plain, external, extended, or main.)
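
   The offset and length rules for the slice macros can be modeled with a
   small standalone function. This is a sketch under assumptions:
   toy_slice_bounds is a hypothetical name, and clamping a request that
   runs past the end of the value is a choice made for this model, not a
   statement about what the server does in every case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compute the byte range [*start, *start + *len) selected by a slice
 * request on a value of total_len bytes.  Offsets count from zero, and
 * a negative length asks for everything from the offset to the end.
 */
void
toy_slice_bounds(size_t total_len, size_t offset, long length,
                 size_t *start, size_t *len)
{
    assert(start != NULL && len != NULL);

    if (offset >= total_len)
    {
        /* Nothing left to return (assumption of this model). */
        *start = total_len;
        *len = 0;
        return;
    }

    *start = offset;
    if (length < 0 || (size_t) length > total_len - offset)
        *len = total_len - offset;  /* remainder of the value */
    else
        *len = (size_t) length;
}
```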
 508
 509    Finally, the version-1 function call conventions make it possible to
 510    return set results (Section 36.10.9) and implement trigger functions
 511    (Chapter 37) and procedural-language call handlers (Chapter 57). For
 512    more details see src/backend/utils/fmgr/README in the source
 513    distribution.
 514
 515 36.10.4. Writing Code #
 516
 517    Before we turn to the more advanced topics, we should discuss some
 518    coding rules for PostgreSQL C-language functions. While it might be
 519    possible to load functions written in languages other than C into
 520    PostgreSQL, this is usually difficult (when it is possible at all)
  521    because other languages, such as C++, FORTRAN, or Pascal, often do not
 522    follow the same calling convention as C. That is, other languages do
 523    not pass argument and return values between functions in the same way.
 524    For this reason, we will assume that your C-language functions are
 525    actually written in C.
 526
 527    The basic rules for writing and building C functions are as follows:
 528      * Use pg_config --includedir-server to find out where the PostgreSQL
 529        server header files are installed on your system (or the system
 530        that your users will be running on).
 531      * Compiling and linking your code so that it can be dynamically
 532        loaded into PostgreSQL always requires special flags. See
 533        Section 36.10.5 for a detailed explanation of how to do it for your
 534        particular operating system.
 535      * Remember to define a “magic block” for your shared library, as
 536        described in Section 36.10.1.
 537      * When allocating memory, use the PostgreSQL functions palloc and
 538        pfree instead of the corresponding C library functions malloc and
 539        free. The memory allocated by palloc will be freed automatically at
 540        the end of each transaction, preventing memory leaks.
 541      * Always zero the bytes of your structures using memset (or allocate
 542        them with palloc0 in the first place). Even if you assign to each
 543        field of your structure, there might be alignment padding (holes in
  544        the structure) that contains garbage values. Without this zeroing, it's
 545        difficult to support hash indexes or hash joins, as you must pick
 546        out only the significant bits of your data structure to compute a
 547        hash. The planner also sometimes relies on comparing constants via
 548        bitwise equality, so you can get undesirable planning results if
 549        logically-equivalent values aren't bitwise equal.
 550      * Most of the internal PostgreSQL types are declared in postgres.h,
 551        while the function manager interfaces (PG_FUNCTION_ARGS, etc.) are
 552        in fmgr.h, so you will need to include at least these two files.
 553        For portability reasons it's best to include postgres.h first,
 554        before any other system or user header files. Including postgres.h
 555        will also include elog.h and palloc.h for you.
 556      * Symbol names defined within object files must not conflict with
 557        each other or with symbols defined in the PostgreSQL server
 558        executable. You will have to rename your functions or variables if
 559        you get error messages to this effect.
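
   The remark above that memory obtained from palloc is released
   automatically can be pictured with a toy allocator. The sketch below is
   not how PostgreSQL manages memory; ToyContext, toy_alloc, and toy_reset
   are hypothetical names. It only illustrates the idea that allocations
   are tracked by an owner and released together, so individual calls to
   a free function are not required to avoid leaks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed before each allocation; the union keeps payloads aligned. */
typedef union ToyChunk
{
    union ToyChunk *next;
    max_align_t     align;
} ToyChunk;

/* Owner of a group of allocations that are released together. */
typedef struct
{
    ToyChunk   *head;
    size_t      live;       /* number of allocations not yet released */
} ToyContext;

/* Allocate size bytes and remember the chunk in the context. */
void *
toy_alloc(ToyContext *cx, size_t size)
{
    ToyChunk   *c = malloc(sizeof(ToyChunk) + size);

    if (c == NULL)
        return NULL;
    c->next = cx->head;
    cx->head = c;
    cx->live++;
    return c + 1;           /* payload starts right after the header */
}

/* Release everything allocated in the context in one sweep. */
void
toy_reset(ToyContext *cx)
{
    while (cx->head != NULL)
    {
        ToyChunk   *next = cx->head->next;

        free(cx->head);
        cx->head = next;
    }
    cx->live = 0;
}
```

   In this picture, the end of a transaction corresponds to a call to
   toy_reset by the system, which is why memory obtained with malloc
   would not be covered by the same guarantee.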
 560
 561 36.10.5. Compiling and Linking Dynamically-Loaded Functions #
 562
 563    Before you are able to use your PostgreSQL extension functions written
 564    in C, they must be compiled and linked in a special way to produce a
 565    file that can be dynamically loaded by the server. To be precise, a
 566    shared library needs to be created.
 567
 568    For information beyond what is contained in this section you should
 569    read the documentation of your operating system, in particular the
 570    manual pages for the C compiler, cc, and the link editor, ld. In
 571    addition, the PostgreSQL source code contains several working examples
 572    in the contrib directory. If you rely on these examples you will make
 573    your modules dependent on the availability of the PostgreSQL source
 574    code, however.
 575
 576    Creating shared libraries is generally analogous to linking
 577    executables: first the source files are compiled into object files,
 578    then the object files are linked together. The object files need to be
 579    created as position-independent code (PIC), which conceptually means
 580    that they can be placed at an arbitrary location in memory when they
 581    are loaded by the executable. (Object files intended for executables
 582    are usually not compiled that way.) The command to link a shared
 583    library contains special flags to distinguish it from linking an
 584    executable (at least in theory — on some systems the practice is much
 585    uglier).
 586
 587    In the following examples we assume that your source code is in a file
 588    foo.c and we will create a shared library foo.so. The intermediate
 589    object file will be called foo.o unless otherwise noted. A shared
 590    library can contain more than one object file, but we only use one
 591    here.
 592
 593    FreeBSD
 594           The compiler flag to create PIC is -fPIC. To create shared
 595           libraries the compiler flag is -shared.
 596
 597 cc -fPIC -c foo.c
 598 cc -shared -o foo.so foo.o
 599
  600           This is applicable as of version 13.0 of FreeBSD; older versions
  601           used the gcc compiler.
 602
 603    Linux
 604           The compiler flag to create PIC is -fPIC. The compiler flag to
 605           create a shared library is -shared. A complete example looks
 606           like this:
 607
 608 cc -fPIC -c foo.c
 609 cc -shared -o foo.so foo.o
 610
 611    macOS
 612           Here is an example. It assumes the developer tools are
 613           installed.
 614
 615 cc -c foo.c
 616 cc -bundle -flat_namespace -undefined suppress -o foo.so foo.o
 617
 618    NetBSD
 619           The compiler flag to create PIC is -fPIC. For ELF systems, the
 620           compiler with the flag -shared is used to link shared libraries.
 621           On the older non-ELF systems, ld -Bshareable is used.
 622
 623 gcc -fPIC -c foo.c
 624 gcc -shared -o foo.so foo.o
 625
 626    OpenBSD
 627           The compiler flag to create PIC is -fPIC. ld -Bshareable is used
 628           to link shared libraries.
 629
 630 gcc -fPIC -c foo.c
 631 ld -Bshareable -o foo.so foo.o
 632
 633    Solaris
 634           The compiler flag to create PIC is -KPIC with the Sun compiler
 635           and -fPIC with GCC. To link shared libraries, the compiler
 636           option is -G with either compiler or alternatively -shared with
 637           GCC.
 638
 639 cc -KPIC -c foo.c
 640 cc -G -o foo.so foo.o
 641
 642           or
 643
 644 gcc -fPIC -c foo.c
 645 gcc -G -o foo.so foo.o
 646
 647 Tip
 648
 649    If this is too complicated for you, you should consider using GNU
 650    Libtool, which hides the platform differences behind a uniform
 651    interface.
 652
 653    The resulting shared library file can then be loaded into PostgreSQL.
 654    When specifying the file name to the CREATE FUNCTION command, one must
 655    give it the name of the shared library file, not the intermediate
 656    object file. Note that the system's standard shared-library extension
 657    (usually .so or .sl) can be omitted from the CREATE FUNCTION command,
 658    and normally should be omitted for best portability.
 659
 660    Refer back to Section 36.10.1 about where the server expects to find
 661    the shared library files.
 662
 663 36.10.6. Server API and ABI Stability Guidance #
 664
 665    This section contains guidance to authors of extensions and other
 666    server plugins about API and ABI stability in the PostgreSQL server.
 667
 668 36.10.6.1. General #
 669
 670    The PostgreSQL server contains several well-demarcated APIs for server
 671    plugins, such as the function manager (fmgr, described in this
 672    chapter), SPI (Chapter 45), and various hooks specifically designed for
 673    extensions. These interfaces are carefully managed for long-term
 674    stability and compatibility. However, the entire set of global
 675    functions and variables in the server effectively constitutes the
 676    publicly usable API, and most of it was not designed with extensibility
 677    and long-term stability in mind.
 678
 679    Therefore, while taking advantage of these interfaces is valid, the
 680    further one strays from the well-trodden path, the likelier one is to
 681    encounter API or ABI compatibility issues at some point.
 682    Extension authors are encouraged to provide feedback about their
 683    requirements, so that over time, as new use patterns arise, certain
 684    interfaces can be considered more stabilized or new, better-designed
 685    interfaces can be added.
 686
 687 36.10.6.2. API Compatibility #
 688
 689    The API, or application programming interface, is the interface used at
 690    compile time.
 691
 692 36.10.6.2.1. Major Versions #
 693
 694    There is no promise of API compatibility between PostgreSQL major
 695    versions. Extension code therefore might require source code changes to
 696    work with multiple major versions. These can usually be managed with
 697    preprocessor conditions such as #if PG_VERSION_NUM >= 160000.
 698    Sophisticated extensions that use interfaces beyond the well-demarcated
 699    ones usually require a few such changes for each major server version.
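
As a hedged illustration of such preprocessor conditions, the sketch below compiles one of two code paths depending on PG_VERSION_NUM. In a real extension the macro is supplied by the server headers; the fallback definition and the helper name myext_api_generation are assumptions made only so that the fragment builds on its own.

```c
/*
 * In a real extension, PG_VERSION_NUM comes from the server headers.
 * The fallback below is an assumption so this sketch compiles standalone;
 * 160002 stands for version 16.2.
 */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 160002
#endif

/* Hypothetical helper: reports which branch was compiled in. */
static const char *
myext_api_generation(void)
{
#if PG_VERSION_NUM >= 160000
    /* code written against the interfaces of version 16 or later */
    return "16+";
#else
    /* fallback for older major versions */
    return "pre-16";
#endif
}
```

Keeping each version-specific difference inside one small helper like this tends to confine the conditionals to a few places instead of scattering them through the extension.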
 700
 701 36.10.6.2.2. Minor Versions #
 702
 703    PostgreSQL makes an effort to avoid server API breaks in minor
 704    releases. In general, extension code that compiles and works with a
 705    minor release should also compile and work with any other minor release
 706    of the same major version, past or future.
 707
 708    When a change is required, it will be carefully managed, taking the
 709    requirements of extensions into account. Such changes will be
 710    communicated in the release notes (Appendix E).
 711
 712 36.10.6.3. ABI Compatibility #
 713
 714    The ABI, or application binary interface, is the interface used at run
 715    time.
 716
 717 36.10.6.3.1. Major Versions #
 718
 719    Servers of different major versions have intentionally incompatible
 720    ABIs. Extensions that use server APIs must therefore be re-compiled for
 721    each major release. The inclusion of PG_MODULE_MAGIC (see
 722    Section 36.10.1) ensures that code compiled for one major version will
 723    be rejected by other major versions.
 724
 725 36.10.6.3.2. Minor Versions #
 726
 727    PostgreSQL makes an effort to avoid server ABI breaks in minor
 728    releases. In general, an extension compiled against any minor release
 729    should work with any other minor release of the same major version,
 730    past or future.
 731
 732    When a change is required, PostgreSQL will choose the least invasive
 733    change possible, for example by squeezing a new field into padding
 734    space or appending it to the end of a struct. These sorts of changes
 735    should not impact extensions unless they use very unusual code
 736    patterns.
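
The effect of these two techniques on struct layout can be sketched in standalone C. The struct names below are hypothetical and do not correspond to any server structure; the sizes mentioned in the comments assume a typical ABI in which int64_t fields are aligned to at least four bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct as first shipped in a major version. */
typedef struct DemoStateV1
{
    int32_t     id;
    char        kind;       /* typically followed by 3 bytes of padding */
    int64_t     total;
} DemoStateV1;

/* A later minor release squeezes a new field into the padding ... */
typedef struct DemoStateV2
{
    int32_t     id;
    char        kind;
    char        new_flag;   /* occupies former padding */
    int64_t     total;
} DemoStateV2;

/* ... or appends one to the end of the struct. */
typedef struct DemoStateV3
{
    int32_t     id;
    char        kind;
    int64_t     total;
    int32_t     new_count;  /* appended after all existing fields */
} DemoStateV3;

/* True if every pre-existing field kept its offset in both variants. */
static bool
demo_existing_offsets_unchanged(void)
{
    return offsetof(DemoStateV2, id) == offsetof(DemoStateV1, id) &&
           offsetof(DemoStateV2, kind) == offsetof(DemoStateV1, kind) &&
           offsetof(DemoStateV2, total) == offsetof(DemoStateV1, total) &&
           offsetof(DemoStateV3, id) == offsetof(DemoStateV1, id) &&
           offsetof(DemoStateV3, kind) == offsetof(DemoStateV1, kind) &&
           offsetof(DemoStateV3, total) == offsetof(DemoStateV1, total);
}

/* True if using the padding left the overall struct size unchanged. */
static bool
demo_padding_reuse_keeps_size(void)
{
    return sizeof(DemoStateV2) == sizeof(DemoStateV1);
}
```

Code that only reads or writes the pre-existing fields through a pointer is unaffected in this model. Appending a field does change sizeof, so code that allocates or copies such a struct by size itself is one example of a pattern that could still be affected.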
 737
 738    In rare cases, however, even such non-invasive changes may be
 739    impractical or impossible. In such an event, the change will be
 740    carefully managed, taking the requirements of extensions into account.
 741    Such changes will also be documented in the release notes (Appendix E).
 742
 743    Note, however, that many parts of the server are not designed or
 744    maintained as publicly-consumable APIs (and that, in most cases, the
 745    actual boundary is also not well-defined). If urgent needs arise,
 746    changes in those parts will naturally be made with less consideration
 747    for extension code than changes in well-defined and widely used
 748    interfaces.
 749
 750    Also, because such changes are not detected automatically, the
 751    compatibility described above is not a guarantee; historically, however,
 752    such breaking changes have been extremely rare.
 753
 754 36.10.7. Composite-Type Arguments #
 755
 756    Composite types do not have a fixed layout like C structures. Instances
 757    of a composite type can contain null fields. In addition, composite
 758    types that are part of an inheritance hierarchy can have different
 759    fields than other members of the same inheritance hierarchy. Therefore,
 760    PostgreSQL provides a function interface for accessing fields of
 761    composite types from C.
 762
 763    Suppose we want to write a function to answer the query:
 764 SELECT name, c_overpaid(emp, 1500) AS overpaid
 765     FROM emp
 766     WHERE name = 'Bill' OR name = 'Sam';
 767
 768    Using the version-1 calling conventions, we can define c_overpaid as:
 769 #include "postgres.h"
 770 #include "executor/executor.h"  /* for GetAttributeByName() */
 771
 772 PG_MODULE_MAGIC;
 773
 774 PG_FUNCTION_INFO_V1(c_overpaid);
 775
 776 Datum
 777 c_overpaid(PG_FUNCTION_ARGS)
 778 {
 779     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
 780     int32            limit = PG_GETARG_INT32(1);
 781     bool isnull;
 782     Datum salary;
 783
 784     salary = GetAttributeByName(t, "salary", &isnull);
 785     if (isnull)
 786         PG_RETURN_BOOL(false);
 787     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
 788
 789     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
 790 }
 791
 792
 793    GetAttributeByName is the PostgreSQL system function that returns
 794    attributes out of the specified row. It has three arguments: the
 795    argument of type HeapTupleHeader passed into the function, the name of
 796    the desired attribute, and a return parameter that tells whether the
 797    attribute is null. GetAttributeByName returns a Datum value that you
 798    can convert to the proper data type by using the appropriate
 799    DatumGetXXX() function. Note that the return value is meaningless if
 800    the null flag is set; always check the null flag before trying to do
 801    anything with the result.
 802
 803    There is also GetAttributeByNum, which selects the target attribute by
 804    column number instead of name.
 805
 806    The following command declares the function c_overpaid in SQL:
 807 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
 808     AS 'DIRECTORY/funcs', 'c_overpaid'
 809     LANGUAGE C STRICT;
 810
 811    Notice we have used STRICT so that we did not have to check whether the
 812    input arguments were NULL.
 813
 814 36.10.8. Returning Rows (Composite Types) #
 815
 816    To return a row or composite-type value from a C-language function, you
 817    can use a special API that provides macros and functions to hide most
 818    of the complexity of building composite data types. To use this API,
 819    the source file must include:
 820 #include "funcapi.h"
 821
 822    There are two ways you can build a composite data value (henceforth a
 823    “tuple”): you can build it from an array of Datum values, or from an
 824    array of C strings that can be passed to the input conversion functions
 825    of the tuple's column data types. In either case, you first need to
 826    obtain or construct a TupleDesc descriptor for the tuple structure.
 827    When working with Datums, you pass the TupleDesc to BlessTupleDesc, and
 828    then call heap_form_tuple for each row. When working with C strings,
 829    you pass the TupleDesc to TupleDescGetAttInMetadata, and then call
 830    BuildTupleFromCStrings for each row. In the case of a function
 831    returning a set of tuples, the setup steps can all be done once during
 832    the first call of the function.
 833
 834    Several helper functions are available for setting up the needed
 835    TupleDesc. The recommended way to do this in most functions returning
 836    composite values is to call:
 837 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
 838                                    Oid *resultTypeId,
 839                                    TupleDesc *resultTupleDesc)
 840
 841    passing the same fcinfo struct passed to the calling function itself.
 842    (This of course requires that you use the version-1 calling
 843    conventions.) resultTypeId can be specified as NULL or as the address
 844    of a local variable to receive the function's result type OID.
 845    resultTupleDesc should be the address of a local TupleDesc variable.
 846    Check that the result is TYPEFUNC_COMPOSITE; if so, resultTupleDesc has
 847    been filled with the needed TupleDesc. (If it is not, you can report an
 848    error along the lines of “function returning record called in context
 849    that cannot accept type record”.)
 850
 851 Tip
 852
 853    get_call_result_type can resolve the actual type of a polymorphic
 854    function result; so it is useful in functions that return scalar
 855    polymorphic results, not only functions that return composites. The
 856    resultTypeId output is primarily useful for functions returning
 857    polymorphic scalars.
 858
 859 Note
 860
 861    get_call_result_type has a sibling get_expr_result_type, which can be
 862    used to resolve the expected output type for a function call
 863    represented by an expression tree. This can be used when trying to
 864    determine the result type from outside the function itself. There is
 865    also get_func_result_type, which can be used when only the function's
 866    OID is available. However, these functions are not able to deal with
 867    functions declared to return record, and get_func_result_type cannot
 868    resolve polymorphic types, so you should preferentially use
 869    get_call_result_type.
 870
 871    Older, now-deprecated functions for obtaining TupleDescs are:
 872 TupleDesc RelationNameGetTupleDesc(const char *relname)
 873
 874    to get a TupleDesc for the row type of a named relation, and:
 875 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
 876
 877    to get a TupleDesc based on a type OID. This can be used to get a
 878    TupleDesc for a base or composite type. It will not work for a function
 879    that returns record, however, and it cannot resolve polymorphic types.
 880
 881    Once you have a TupleDesc, call:
 882 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
 883
 884    if you plan to work with Datums, or:
 885 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
 886
 887    if you plan to work with C strings. If you are writing a set-returning
 888    function, you can save the results of these functions in the
 889    FuncCallContext structure; use the tuple_desc or attinmeta field,
 890    respectively.
 891
 892    When working with Datums, use:
 893 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
 894
 895    to build a HeapTuple given user data in Datum form.
 896
 897    When working with C strings, use:
 898 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
 899
 900    to build a HeapTuple given user data in C string form. values is an
 901    array of C strings, one for each attribute of the return row. Each C
 902    string should be in the form expected by the input function of the
 903    attribute data type. In order to return a null value for one of the
 904    attributes, the corresponding pointer in the values array should be set
 905    to NULL. This function will need to be called again for each row you
 906    return.
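
The shape of the values array can be sketched in standalone C. This is only a model of the data layout: it uses malloc where an extension would use palloc, the helper name demo_make_values is hypothetical, and the resulting array is what would be handed to BuildTupleFromCStrings for a three-column row whose second column is null.

```c
#include <stdio.h>
#include <stdlib.h>

#define DEMO_NATTS 3

/*
 * Build one row's worth of C strings for three integer columns.
 * A NULL pointer in the array marks a null attribute.
 * Returns NULL if memory cannot be allocated.
 */
static char **
demo_make_values(int f1, int f3)
{
    char      **values = calloc(DEMO_NATTS, sizeof(char *));

    if (values == NULL)
        return NULL;

    /* 16 bytes is enough for any 32-bit integer printed with %d */
    values[0] = malloc(16);
    values[2] = malloc(16);
    if (values[0] == NULL || values[2] == NULL)
    {
        free(values[0]);
        free(values[2]);
        free(values);
        return NULL;
    }

    snprintf(values[0], 16, "%d", f1);
    values[1] = NULL;           /* second attribute is null */
    snprintf(values[2], 16, "%d", f3);

    return values;
}
```

Each non-null string must be text that the column type's input function accepts, which for integer columns is simply the decimal representation produced here.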
 907
 908    Once you have built a tuple to return from your function, it must be
 909    converted into a Datum. Use:
 910 HeapTupleGetDatum(HeapTuple tuple)
 911
 912    to convert a HeapTuple into a valid Datum. This Datum can be returned
 913    directly if you intend to return just a single row, or it can be used
 914    as the current return value in a set-returning function.
 915
 916    An example appears in the next section.
 917
 918 36.10.9. Returning Sets #
 919
 920    C-language functions have two options for returning sets (multiple
 921    rows). In one method, called ValuePerCall mode, a set-returning
 922    function is called repeatedly (passing the same arguments each time)
 923    and it returns one new row on each call, until it has no more rows to
 924    return and signals that by returning NULL. The set-returning function
 925    (SRF) must therefore save enough state across calls to remember what it
 926    was doing and return the correct next item on each call. In the other
 927    method, called Materialize mode, an SRF fills and returns a tuplestore
 928    object containing its entire result; then only one call occurs for the
 929    whole result, and no inter-call state is needed.
 930
 931    When using ValuePerCall mode, it is important to remember that the
 932    query is not guaranteed to be run to completion; that is, due to
 933    options such as LIMIT, the executor might stop making calls to the
 934    set-returning function before all rows have been fetched. This means it
 935    is not safe to perform cleanup activities in the last call, because
 936    that might not ever happen. It's recommended to use Materialize mode
 937    for functions that need access to external resources, such as file
 938    descriptors.
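
The hazard can be modeled in a few lines of standalone C. This is a simplified sketch and not server code: DemoCallState stands in for the state an SRF keeps across calls, and demo_fetch plays the role of an executor that stops after a given number of rows, as LIMIT would. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the state an SRF keeps across calls. */
typedef struct DemoCallState
{
    uint64_t    call_cntr;      /* rows returned so far */
    uint64_t    max_calls;      /* total rows available */
    bool        cleanup_ran;    /* set only in the final "done" call */
} DemoCallState;

/*
 * One ValuePerCall-style step: returns true and stores a row in *row,
 * or returns false once the set is exhausted.
 */
static bool
demo_next_row(DemoCallState *state, int *row)
{
    if (state->call_cntr < state->max_calls)
    {
        *row = (int) (state->call_cntr * 10);
        state->call_cntr++;
        return true;
    }

    /* cleanup placed in the last call: this may never be reached */
    state->cleanup_ran = true;
    return false;
}

/* Hypothetical caller that stops after "limit" rows, like LIMIT would. */
static uint64_t
demo_fetch(DemoCallState *state, uint64_t limit)
{
    uint64_t    fetched = 0;
    int         row;

    while (fetched < limit && demo_next_row(state, &row))
        fetched++;

    return fetched;
}
```

With five rows available and a limit of two, demo_fetch returns after two rows and cleanup_ran stays false, because the final call that would have performed the cleanup is never made.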
 939
 940    The remainder of this section documents a set of helper macros that are
 941    commonly used (though not required to be used) for SRFs using
 942    ValuePerCall mode. Additional details about Materialize mode can be
 943    found in src/backend/utils/fmgr/README. Also, the contrib modules in
 944    the PostgreSQL source distribution contain many examples of SRFs using
 945    both ValuePerCall and Materialize mode.
 946
 947    To use the ValuePerCall support macros described here, include
 948    funcapi.h. These macros work with a structure FuncCallContext that
 949    contains the state that needs to be saved across calls. Within the
 950    calling SRF, fcinfo->flinfo->fn_extra is used to hold a pointer to
 951    FuncCallContext across calls. The macros automatically fill that field
 952    on first use, and expect to find the same pointer there on subsequent
 953    uses.
 954 typedef struct FuncCallContext
 955 {
 956     /*
 957      * Number of times we've been called before
 958      *
 959      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
 960      * incremented for you every time SRF_RETURN_NEXT() is called.
 961      */
 962     uint64 call_cntr;
 963
 964     /*
 965      * OPTIONAL maximum number of calls
 966      *
 967      * max_calls is here for convenience only and setting it is optional.
 968      * If not set, you must provide alternative means to know when the
 969      * function is done.
 970      */
 971     uint64 max_calls;
 972
 973     /*
 974      * OPTIONAL pointer to miscellaneous user-provided context information
 975      *
 976      * user_fctx is for use as a pointer to your own data to retain
 977      * arbitrary context information between calls of your function.
 978      */
 979     void *user_fctx;
 980
 981     /*
 982      * OPTIONAL pointer to struct containing attribute type input metadata
 983      *
 984      * attinmeta is for use when returning tuples (i.e., composite data types)
 985      * and is not used when returning base data types. It is only needed
 986      * if you intend to use BuildTupleFromCStrings() to create the return
 987      * tuple.
 988      */
 989     AttInMetadata *attinmeta;
 990
 991     /*
 992      * memory context used for structures that must live for multiple calls
 993      *
 994      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
 995      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
 996      * context for any memory that is to be reused across multiple calls
 997      * of the SRF.
 998      */
 999     MemoryContext multi_call_memory_ctx;
1000
1001     /*
1002      * OPTIONAL pointer to struct containing tuple description
1003      *
1004      * tuple_desc is for use when returning tuples (i.e., composite data types)
1005      * and is only needed if you are going to build the tuples with
1006      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
1007      * the TupleDesc pointer stored here should usually have been run through
1008      * BlessTupleDesc() first.
1009      */
1010     TupleDesc tuple_desc;
1011
1012 } FuncCallContext;
1013
1014    The macros to be used by an SRF using this infrastructure are:
1015 SRF_IS_FIRSTCALL()
1016
1017    Use this to determine if your function is being called for the first or
1018    a subsequent time. On the first call (only), call:
1019 SRF_FIRSTCALL_INIT()
1020
1021    to initialize the FuncCallContext. On every function call, including
1022    the first, call:
1023 SRF_PERCALL_SETUP()
1024
1025    to set up for using the FuncCallContext.
1026
1027    If your function has data to return in the current call, use:
1028 SRF_RETURN_NEXT(funcctx, result)
1029
1030    to return it to the caller. (result must be of type Datum, either a
1031    single value or a tuple prepared as described above.) Finally, when
1032    your function is finished returning data, use:
1033 SRF_RETURN_DONE(funcctx)
1034
1035    to clean up and end the SRF.
1036
1037    The memory context that is current when the SRF is called is a
1038    transient context that will be cleared between calls. This means that
1039    you do not need to call pfree on everything you allocated using palloc;
1040    it will go away anyway. However, if you want to allocate any data
1041    structures to live across calls, you need to put them somewhere else.
1042    The memory context referenced by multi_call_memory_ctx is a suitable
1043    location for any data that needs to survive until the SRF is finished
1044    running. In most cases, this means that you should switch into
1045    multi_call_memory_ctx while doing the first-call setup. Use
1046    funcctx->user_fctx to hold a pointer to any such cross-call data
1047    structures. (Data you allocate in multi_call_memory_ctx will go away
1048    automatically when the query ends, so it is not necessary to free that
1049    data manually, either.)
1050
1051 Warning
1052
1053    While the actual arguments to the function remain unchanged between
1054    calls, if you detoast the argument values (which is normally done
1055    transparently by the PG_GETARG_xxx macro) in the transient context then
1056    the detoasted copies will be freed on each cycle. Accordingly, if you
1057    keep references to such values in your user_fctx, you must either copy
1058    them into the multi_call_memory_ctx after detoasting, or ensure that
1059    you detoast the values only in that context.
1060
1061    A complete pseudo-code example looks like the following:
1062 Datum
1063 my_set_returning_function(PG_FUNCTION_ARGS)
1064 {
1065     FuncCallContext  *funcctx;
1066     Datum             result;
1067     /* further declarations as needed */
1068
1069     if (SRF_IS_FIRSTCALL())
1070     {
1071         MemoryContext oldcontext;
1072
1073         funcctx = SRF_FIRSTCALL_INIT();
1074         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
1075         /* One-time setup code appears here: */
1076         /* ... user code ... */
1077         /* if returning composite: */
1078         /*     build TupleDesc, and perhaps AttInMetadata */
1079         /* endif returning composite */
1080         /* ... user code ... */
1081         MemoryContextSwitchTo(oldcontext);
1082     }
1083
1084     /* Each-time setup code appears here: */
1085     /* ... user code ... */
1086     funcctx = SRF_PERCALL_SETUP();
1087     /* ... user code ... */
1088
1089     /* this is just one way we might test whether we are done: */
1090     if (funcctx->call_cntr < funcctx->max_calls)
1091     {
1092         /* Here we want to return another item: */
1093         /* ... user code ... */
1094         /* ... obtain result Datum ... */
1095         SRF_RETURN_NEXT(funcctx, result);
1096     }
1097     else
1098     {
1099         /* Here we are done returning items, so just report that fact. */
1100         /* (Resist the temptation to put cleanup code here.) */
1101         SRF_RETURN_DONE(funcctx);
1102     }
1103 }
1104
1105    A complete example of a simple SRF returning a composite type looks
1106    like:
1107 PG_FUNCTION_INFO_V1(retcomposite);
1108
1109 Datum
1110 retcomposite(PG_FUNCTION_ARGS)
1111 {
1112     FuncCallContext     *funcctx;
1113     int                  call_cntr;
1114     int                  max_calls;
1115     TupleDesc            tupdesc;
1116     AttInMetadata       *attinmeta;
1117
1118     /* stuff done only on the first call of the function */
1119     if (SRF_IS_FIRSTCALL())
1120     {
1121         MemoryContext   oldcontext;
1122
1123         /* create a function context for cross-call persistence */
1124         funcctx = SRF_FIRSTCALL_INIT();
1125
1126         /* switch to memory context appropriate for multiple function calls */
1127         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
1128
1129         /* total number of tuples to be returned */
1130         funcctx->max_calls = PG_GETARG_INT32(0);
1131
1132         /* Build a tuple descriptor for our result type */
1133         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
1134             ereport(ERROR,
1135                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
1136                      errmsg("function returning record called in context "
1137                             "that cannot accept type record")));
1138
1139         /*
1140          * generate attribute metadata needed later to produce tuples from raw
1141          * C strings
1142          */
1143         attinmeta = TupleDescGetAttInMetadata(tupdesc);
1144         funcctx->attinmeta = attinmeta;
1145
1146         MemoryContextSwitchTo(oldcontext);
1147     }
1148
1149     /* stuff done on every call of the function */
1150     funcctx = SRF_PERCALL_SETUP();
1151
1152     call_cntr = funcctx->call_cntr;
1153     max_calls = funcctx->max_calls;
1154     attinmeta = funcctx->attinmeta;
1155
1156     if (call_cntr < max_calls)    /* do when there is more left to send */
1157     {
1158         char       **values;
1159         HeapTuple    tuple;
1160         Datum        result;
1161
1162         /*
1163          * Prepare a values array for building the returned tuple.
1164          * This should be an array of C strings which will
1165          * be processed later by the type input functions.
1166          */
1167         values = (char **) palloc(3 * sizeof(char *));
1168         values[0] = (char *) palloc(16 * sizeof(char));
1169         values[1] = (char *) palloc(16 * sizeof(char));
1170         values[2] = (char *) palloc(16 * sizeof(char));
1171
1172         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
1173         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
1174         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
1175
1176         /* build a tuple */
1177         tuple = BuildTupleFromCStrings(attinmeta, values);
1178
1179         /* make the tuple into a datum */
1180         result = HeapTupleGetDatum(tuple);
1181
1182         /* clean up (this is not really necessary) */
1183         pfree(values[0]);
1184         pfree(values[1]);
1185         pfree(values[2]);
1186         pfree(values);
1187
1188         SRF_RETURN_NEXT(funcctx, result);
1189     }
1190     else    /* do when there is no more left */
1191     {
1192         SRF_RETURN_DONE(funcctx);
1193     }
1194 }
1195
1196
1197    One way to declare this function in SQL is:
1198 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
1199
1200 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
1201     RETURNS SETOF __retcomposite
1202     AS 'filename', 'retcomposite'
1203     LANGUAGE C IMMUTABLE STRICT;
1204
1205    A different way is to use OUT parameters:
1206 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
1207     OUT f1 integer, OUT f2 integer, OUT f3 integer)
1208     RETURNS SETOF record
1209     AS 'filename', 'retcomposite'
1210     LANGUAGE C IMMUTABLE STRICT;
1211
1212    Notice that in this method the output type of the function is formally
1213    an anonymous record type.
1214
1215 36.10.10. Polymorphic Arguments and Return Types #
1216
1217    C-language functions can be declared to accept and return the
1218    polymorphic types described in Section 36.2.5. When a function's
1219    arguments or return types are defined as polymorphic types, the
1220    function author cannot know in advance what data type it will be called
1221    with, or need to return. There are two routines provided in fmgr.h to
1222    allow a version-1 C function to discover the actual data types of its
1223    arguments and the type it is expected to return. The routines are
1224    called get_fn_expr_rettype(FmgrInfo *flinfo) and
1225    get_fn_expr_argtype(FmgrInfo *flinfo, int argnum). They return the
1226    result or argument type OID, or InvalidOid if the information is not
1227    available. The structure flinfo is normally accessed as fcinfo->flinfo.
1228    The parameter argnum is zero based. get_call_result_type can also be
1229    used as an alternative to get_fn_expr_rettype. There is also
1230    get_fn_expr_variadic, which can be used to find out whether variadic
1231    arguments have been merged into an array. This is primarily useful for
1232    VARIADIC "any" functions, since such merging will always have occurred
1233    for variadic functions taking ordinary array types.
1234
1235    For example, suppose we want to write a function to accept a single
1236    element of any type, and return a one-dimensional array of that type:
1237 PG_FUNCTION_INFO_V1(make_array);
1238 Datum
1239 make_array(PG_FUNCTION_ARGS)
1240 {
1241     ArrayType  *result;
1242     Oid         element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
1243     Datum       element;
1244     bool        isnull;
1245     int16       typlen;
1246     bool        typbyval;
1247     char        typalign;
1248     int         ndims;
1249     int         dims[MAXDIM];
1250     int         lbs[MAXDIM];
1251
1252     if (!OidIsValid(element_type))
1253         elog(ERROR, "could not determine data type of input");
1254
1255     /* get the provided element, being careful in case it's NULL */
1256     isnull = PG_ARGISNULL(0);
1257     if (isnull)
1258         element = (Datum) 0;
1259     else
1260         element = PG_GETARG_DATUM(0);
1261
1262     /* we have one dimension */
1263     ndims = 1;
1264     /* and one element */
1265     dims[0] = 1;
1266     /* and lower bound is 1 */
1267     lbs[0] = 1;
1268
1269     /* get required info about the element type */
1270     get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
1271
1272     /* now build the array */
1273     result = construct_md_array(&element, &isnull, ndims, dims, lbs,
1274                                 element_type, typlen, typbyval, typalign);
1275
1276     PG_RETURN_ARRAYTYPE_P(result);
1277 }
1278
1279    The following command declares the function make_array in SQL:
1280 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
1281     AS 'DIRECTORY/funcs', 'make_array'
1282     LANGUAGE C IMMUTABLE;
1283
1284    There is a variant of polymorphism that is only available to C-language
1285    functions: they can be declared to take parameters of type "any". (Note
1286    that this type name must be double-quoted, since it's also an SQL
1287    reserved word.) This works like anyelement except that it does not
1288    constrain different "any" arguments to be the same type, nor do they
1289    help determine the function's result type. A C-language function can
1290    also declare its final parameter to be VARIADIC "any". This will match
1291    one or more actual arguments of any type (not necessarily the same
1292    type). These arguments will not be gathered into an array as happens
1293    with normal variadic functions; they will just be passed to the
1294    function separately. The PG_NARGS() macro and the methods described
1295    above must be used to determine the number of actual arguments and
1296    their types when using this feature. Also, users of such a function
1297    might wish to use the VARIADIC keyword in their function call, with the
1298    expectation that the function would treat the array elements as
1299    separate arguments. The function itself must implement that behavior if
1300    wanted, after using get_fn_expr_variadic to detect that the actual
1301    argument was marked with VARIADIC.
1302
1303 36.10.11. Shared Memory #
1304
1305 36.10.11.1. Requesting Shared Memory at Startup #
1306
1307    Add-ins can reserve shared memory on server startup. To do so, the
1308    add-in's shared library must be preloaded by specifying it in
1309    shared_preload_libraries. The shared library should also register a
1310    shmem_request_hook in its _PG_init function. This shmem_request_hook
1311    can reserve shared memory by calling:
1312 void RequestAddinShmemSpace(Size size)
1313
1314    Each backend should obtain a pointer to the reserved shared memory by
1315    calling:
1316 void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
1317
1318    If this function sets foundPtr to false, the caller should proceed to
1319    initialize the contents of the reserved shared memory. If foundPtr is
1320    set to true, the shared memory was already initialized by another
1321    backend, and the caller need not initialize further.
1322
1323    To avoid race conditions, each backend should use the LWLock
1324    AddinShmemInitLock when initializing its allocation of shared memory,
1325    as shown here:
1326 static mystruct *ptr = NULL;
1327 bool        found;
1328
1329 LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
1330 ptr = ShmemInitStruct("my struct name", size, &found);
1331 if (!found)
1332 {
1333     ... initialize contents of shared memory ...
1334     ptr->locks = GetNamedLWLockTranche("my tranche name");
1335 }
1336 LWLockRelease(AddinShmemInitLock);
1337
1338    shmem_startup_hook provides a convenient place for the initialization
1339    code, but it is not strictly required that all such code be placed in
1340    this hook. On Windows (and anywhere else where EXEC_BACKEND is
1341    defined), each backend executes the registered shmem_startup_hook
1342    shortly after it attaches to shared memory, so add-ins should still
1343    acquire AddinShmemInitLock within this hook, as shown in the example
1344    above. On other platforms, only the postmaster process executes the
1345    shmem_startup_hook, and each backend automatically inherits the
1346    pointers to shared memory.
1347
1348    An example of a shmem_request_hook and shmem_startup_hook can be found
1349    in contrib/pg_stat_statements/pg_stat_statements.c in the PostgreSQL
1350    source tree.
1351
1352 36.10.11.2. Requesting Shared Memory After Startup #
1353
1354    There is another, more flexible method of reserving shared memory that
1355    can be done after server startup and outside a shmem_request_hook. To
1356    do so, each backend that will use the shared memory should obtain a
1357    pointer to it by calling:
1358 void *GetNamedDSMSegment(const char *name, size_t size,
1359                          void (*init_callback) (void *ptr),
1360                          bool *found)
1361
1362    If a dynamic shared memory segment with the given name does not yet
1363    exist, this function will allocate it and initialize it with the
1364    provided init_callback callback function. If the segment has already
1365    been allocated and initialized by another backend, this function simply
1366    attaches the existing dynamic shared memory segment to the current
1367    backend.
1368
1369    Unlike shared memory reserved at server startup, there is no need to
1370    acquire AddinShmemInitLock or otherwise take action to avoid race
1371    conditions when reserving shared memory with GetNamedDSMSegment. This
1372    function ensures that only one backend allocates and initializes the
1373    segment and that all other backends receive a pointer to the fully
1374    allocated and initialized segment.
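
        The find-or-create contract described above can be modeled in a
        few lines of plain C. The following is only a single-process
        sketch, not PostgreSQL code: demo_get_named_segment, the
        DemoSegment registry, and DemoState are hypothetical stand-ins,
        and the locking that the real function performs internally is
        left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a registry of named segments. */
#define DEMO_MAX_SEGMENTS 8
#define DEMO_NAME_LEN 64

typedef struct DemoSegment
{
    char        name[DEMO_NAME_LEN];
    void       *addr;
} DemoSegment;

static DemoSegment demo_registry[DEMO_MAX_SEGMENTS];
static int  demo_nsegments = 0;

/*
 * Models the contract only: create and initialize the segment on first
 * use, otherwise return the existing one and set *found to true.
 * Returns NULL if the registry is full, the name is too long, or the
 * allocation fails.
 */
static void *
demo_get_named_segment(const char *name, size_t size,
                       void (*init_callback) (void *ptr),
                       bool *found)
{
    void       *addr;

    assert(name != NULL && found != NULL);
    *found = false;

    for (int i = 0; i < demo_nsegments; i++)
    {
        if (strcmp(demo_registry[i].name, name) == 0)
        {
            *found = true;      /* already created: attach only */
            return demo_registry[i].addr;
        }
    }

    if (demo_nsegments == DEMO_MAX_SEGMENTS || strlen(name) >= DEMO_NAME_LEN)
        return NULL;

    addr = calloc(1, size);
    if (addr == NULL)
        return NULL;
    if (init_callback != NULL)
        init_callback(addr);    /* runs once per segment name */

    strcpy(demo_registry[demo_nsegments].name, name);
    demo_registry[demo_nsegments].addr = addr;
    demo_nsegments++;
    return addr;
}

/* Example payload and its one-time initializer. */
typedef struct DemoState
{
    int         counter;
} DemoState;

static int  demo_init_calls = 0;

static void
demo_init_state(void *ptr)
{
    DemoState  *state = ptr;

    state->counter = 100;
    demo_init_calls++;
}
```

        Calling demo_get_named_segment twice with the same name returns
        the same pointer; the second call sets found to true and does not
        run the initializer again.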
1375
1376    A complete usage example of GetNamedDSMSegment can be found in
1377    src/test/modules/test_dsm_registry/test_dsm_registry.c in the
1378    PostgreSQL source tree.
1379
1380 36.10.12. LWLocks #
1381
1382 36.10.12.1. Requesting LWLocks at Startup #
1383
1384    Add-ins can reserve LWLocks on server startup. As with shared memory
1385    reserved at server startup, the add-in's shared library must be
1386    preloaded by specifying it in shared_preload_libraries, and the shared
1387    library should register a shmem_request_hook in its _PG_init function.
1388    This shmem_request_hook can reserve LWLocks by calling:
1389 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
1390
1391    This ensures that an array of num_lwlocks LWLocks is available under
1392    the name tranche_name. A pointer to this array can be obtained by
1393    calling:
1394 LWLockPadded *GetNamedLWLockTranche(const char *tranche_name)
1395
1396 36.10.12.2. Requesting LWLocks After Startup #
1397
1398    There is another, more flexible method of obtaining LWLocks that can be
1399    done after server startup and outside a shmem_request_hook. To do so,
1400    first allocate a tranche_id by calling:
1401 int LWLockNewTrancheId(void)
1402
1403    Next, initialize each LWLock, passing the new tranche_id as an
1404    argument:
1405 void LWLockInitialize(LWLock *lock, int tranche_id)
1406
1407    Similar to shared memory, each backend should ensure that only one
1408    process allocates a new tranche_id and initializes each new LWLock. One
1409    way to do this is to only call these functions in your shared memory
1410    initialization code with the AddinShmemInitLock held exclusively. If
1411    using GetNamedDSMSegment, calling these functions in the init_callback
1412    callback function is sufficient to avoid race conditions.
1413
1414    Finally, each backend using the tranche_id should associate it with a
1415    tranche_name by calling:
1416 void LWLockRegisterTranche(int tranche_id, const char *tranche_name)
1417
1418    A complete usage example of LWLockNewTrancheId, LWLockInitialize, and
1419    LWLockRegisterTranche can be found in contrib/pg_prewarm/autoprewarm.c
1420    in the PostgreSQL source tree.
1421
1422 36.10.13. Custom Wait Events #
1423
1424    Add-ins can define custom wait events under the wait event type
1425    Extension by calling:
1426 uint32 WaitEventExtensionNew(const char *wait_event_name)
1427
1428    The wait event is associated with a user-facing custom string. An
1429    example can be found in src/test/modules/worker_spi in the PostgreSQL
1430    source tree.
1431
1432    Custom wait events can be viewed in pg_stat_activity:
1433 =# SELECT wait_event_type, wait_event FROM pg_stat_activity
1434      WHERE backend_type ~ 'worker_spi';
1435  wait_event_type |  wait_event
1436 -----------------+---------------
1437  Extension       | WorkerSpiMain
1438 (1 row)
1439
1440 36.10.14. Injection Points #
1441
1442    An injection point with a given name is declared using the macro:
1443 INJECTION_POINT(name, arg);
1444
1445    There are a few injection points already declared at strategic points
1446    within the server code. After adding a new injection point the code
1447    needs to be compiled in order for that injection point to be available
1448    in the binary. Add-ins written in C-language can declare injection
1449    points in their own code using the same macro. The injection point
1450    names should use lower-case characters, with terms separated by dashes.
1451    arg is an optional argument value given to the callback at run-time.
1452
1453    Executing an injection point can require allocating a small amount of
1454    memory, which can fail. If you need to have an injection point in a
1455    critical section where dynamic allocations are not allowed, you can use
1456    a two-step approach with the following macros:
1457 INJECTION_POINT_LOAD(name);
1458 INJECTION_POINT_CACHED(name, arg);
1459
1460    Before entering the critical section, call INJECTION_POINT_LOAD. It
1461    checks the shared memory state, and loads the callback into
1462    backend-private memory if it is active. Inside the critical section,
1463    use INJECTION_POINT_CACHED to execute the callback.
1464
1465    Add-ins can attach callbacks to an already-declared injection point by
1466    calling:
1467 extern void InjectionPointAttach(const char *name,
1468                                  const char *library,
1469                                  const char *function,
1470                                  const void *private_data,
1471                                  int private_data_size);
1472
1473    name is the name of the injection point, which when reached during
1474    execution will execute the function loaded from library. private_data
1475    is a private area of data of size private_data_size given as argument
1476    to the callback when executed.
1477
1478    Here is an example of a callback of type InjectionPointCallback:
1479 static void
1480 custom_injection_callback(const char *name,
1481                           const void *private_data,
1482                           void *arg)
1483 {
1484     uint32 wait_event_info = WaitEventInjectionPointNew(name);
1485
1486     pgstat_report_wait_start(wait_event_info);
1487     elog(NOTICE, "%s: executed custom callback", name);
1488     pgstat_report_wait_end();
1489 }
1490
1491    This callback prints a message to the server error log with severity
1492    NOTICE, but callbacks may implement more complex logic.
1493
1494    An alternative way to define the action to take when an injection point
1495    is reached is to add the testing code alongside the normal source code.
1496    This can be useful if the action e.g. depends on local variables that
1497    are not accessible to loaded modules. The IS_INJECTION_POINT_ATTACHED
1498    macro can then be used to check if an injection point is attached, for
1499    example:
1500 #ifdef USE_INJECTION_POINTS
1501 if (IS_INJECTION_POINT_ATTACHED("before-foobar"))
1502 {
1503     /* change a local variable if injection point is attached */
1504     local_var = 123;
1505
1506     /* also execute the callback */
1507     INJECTION_POINT_CACHED("before-foobar", NULL);
1508 }
1509 #endif
1510
1511    Note that the callback attached to the injection point will not be
1512    executed by the IS_INJECTION_POINT_ATTACHED macro. If you want to
1513    execute the callback, you must also call INJECTION_POINT_CACHED like in
1514    the above example.
1515
1516    Optionally, it is possible to detach an injection point by calling:
1517 extern bool InjectionPointDetach(const char *name);
1518
1519    On success, true is returned, false otherwise.
1520
1521    A callback attached to an injection point is available across all the
1522    backends including the backends started after InjectionPointAttach is
1523    called. It remains attached while the server is running or until the
1524    injection point is detached using InjectionPointDetach.
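
        The attach, check, run, and detach life cycle described in this
        section can be modeled with a small in-process table. This is a
        simplified sketch rather than the server's implementation: every
        demo_ name is hypothetical, callbacks are stored as function
        pointers instead of library and function names, and attaching
        reports failure through its return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_POINTS 16

/* Same shape as the callback shown above: name, private data, run-time arg. */
typedef void (*DemoInjectionCallback) (const char *name,
                                       const void *private_data,
                                       void *arg);

typedef struct DemoInjectionEntry
{
    const char *name;           /* NULL marks a free slot; caller owns string */
    DemoInjectionCallback callback;
    const void *private_data;
} DemoInjectionEntry;

static DemoInjectionEntry demo_points[DEMO_MAX_POINTS];

static DemoInjectionEntry *
demo_find(const char *name)
{
    for (int i = 0; i < DEMO_MAX_POINTS; i++)
    {
        if (demo_points[i].name != NULL &&
            strcmp(demo_points[i].name, name) == 0)
            return &demo_points[i];
    }
    return NULL;
}

/* Attach a callback; false if already attached or the table is full. */
static bool
demo_attach(const char *name, DemoInjectionCallback callback,
            const void *private_data)
{
    assert(name != NULL && callback != NULL);

    if (demo_find(name) != NULL)
        return false;
    for (int i = 0; i < DEMO_MAX_POINTS; i++)
    {
        if (demo_points[i].name == NULL)
        {
            demo_points[i].name = name;
            demo_points[i].callback = callback;
            demo_points[i].private_data = private_data;
            return true;
        }
    }
    return false;
}

/* Checking for attachment never runs the callback. */
static bool
demo_is_attached(const char *name)
{
    return demo_find(name) != NULL;
}

/* Reaching a point runs its callback if attached; otherwise it is a no-op. */
static void
demo_run(const char *name, void *arg)
{
    DemoInjectionEntry *entry = demo_find(name);

    if (entry != NULL)
        entry->callback(name, entry->private_data, arg);
}

/* Detach; true on success, false if nothing was attached. */
static bool
demo_detach(const char *name)
{
    DemoInjectionEntry *entry = demo_find(name);

    if (entry == NULL)
        return false;
    entry->name = NULL;
    return true;
}

/* Example callback: counts how many times it ran. */
static int  demo_hits = 0;

static void
demo_count_callback(const char *name, const void *private_data, void *arg)
{
    (void) name;
    (void) private_data;
    (void) arg;
    demo_hits++;
}
```

        In this model, demo_run on an unattached name does nothing,
        demo_is_attached leaves the hit counter unchanged, and a second
        demo_detach on the same name returns false.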
1525
1526    An example can be found in src/test/modules/injection_points in the
1527    PostgreSQL source tree.
1528
1529    Enabling injection points requires --enable-injection-points with
1530    configure or -Dinjection_points=true with Meson.
1531
1532 36.10.15. Custom Cumulative Statistics #
1533
1534    It is possible for add-ins written in C-language to use custom types of
1535    cumulative statistics registered in the Cumulative Statistics System.
1536
1537    First, define a PgStat_KindInfo that includes all the information
1538    related to the custom type registered. For example:
1539 static const PgStat_KindInfo custom_stats = {
1540     .name = "custom_stats",
1541     .fixed_amount = false,
1542     .shared_size = sizeof(PgStatShared_Custom),
1543     .shared_data_off = offsetof(PgStatShared_Custom, stats),
1544     .shared_data_len = sizeof(((PgStatShared_Custom *) 0)->stats),
1545     .pending_size = sizeof(PgStat_StatCustomEntry),
1546 };
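
        The shared_data_off and shared_data_len fields in the example
        above are plain offsetof and sizeof computations over the shared
        entry struct. The sketch below shows the same arithmetic on
        hypothetical stand-in types (DemoCommonHeader, DemoStatsEntry,
        DemoShared) and how a generic routine that knows only the offset
        and length could operate on the statistics payload without
        touching the header. It illustrates the idiom and is not the
        server's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the common header and the stats payload. */
typedef struct DemoCommonHeader
{
    unsigned int magic;
    int         lock_placeholder;
} DemoCommonHeader;

typedef struct DemoStatsEntry
{
    long        numcalls;
    long        numerrors;
} DemoStatsEntry;

/* Shared entry layout: header first, statistics data after it. */
typedef struct DemoShared
{
    DemoCommonHeader header;
    DemoStatsEntry stats;
} DemoShared;

/* Where the payload starts inside the shared entry, and how long it is. */
static const size_t demo_data_off = offsetof(DemoShared, stats);
static const size_t demo_data_len = sizeof(((DemoShared *) 0)->stats);

/* Zero only the statistics payload, leaving the header untouched. */
static void
demo_reset_stats(DemoShared *shared)
{
    assert(shared != NULL);
    memset((char *) shared + demo_data_off, 0, demo_data_len);
}
```

        The expression sizeof(((DemoShared *) 0)->stats) never
        dereferences the null pointer, because the operand of sizeof is
        not evaluated; it only names the member whose size is wanted.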
1547
1548    Then, each backend that needs to use this custom type needs to register
1549    it with pgstat_register_kind and a unique ID used to store the entries
1550    related to this type of statistics:
1551 extern PgStat_Kind pgstat_register_kind(PgStat_Kind kind,
1552                                         const PgStat_KindInfo *kind_info);
1553
1554    While developing a new extension, use PGSTAT_KIND_EXPERIMENTAL for
1555    kind. When you are ready to release the extension to users, reserve a
1556    kind ID on the Custom Cumulative Statistics page of the PostgreSQL wiki.
1557
1558    The details of the API for PgStat_KindInfo can be found in
1559    src/include/utils/pgstat_internal.h.
1560
1561    The type of statistics registered is associated with a name and a
1562    unique ID shared across the server in shared memory. Each backend using
1563    a custom type of statistics maintains a local cache storing the
1564    information of each custom PgStat_KindInfo.
1565
1566    Place the extension module implementing the custom cumulative
1567    statistics type in shared_preload_libraries so that it will be loaded
1568    early during PostgreSQL startup.
1569
1570    An example describing how to register and use custom statistics can be
1571    found in src/test/modules/injection_points.
1572
1573 36.10.16. Using C++ for Extensibility #
1574
1575    Although the PostgreSQL backend is written in C, it is possible to
1576    write extensions in C++ if these guidelines are followed:
1577      * All functions accessed by the backend must present a C interface to
1578        the backend; these C functions can then call C++ functions. For
1579        example, extern "C" linkage is required for backend-accessed
1580        functions. This is also necessary for any functions that are passed
1581        as pointers between the backend and C++ code.
1582      * Free memory using the appropriate deallocation method. For example,
1583        most backend memory is allocated using palloc(), so use pfree() to
1584        free it. Using C++ delete in such cases will fail.
1585      * Prevent exceptions from propagating into the C code (use a
1586        catch-all block at the top level of all extern "C" functions). This
1587        is necessary even if the C++ code does not explicitly throw any
1588        exceptions, because events like out-of-memory can still throw
1589        exceptions. Any exceptions must be caught and appropriate errors
1590        passed back to the C interface. If possible, compile C++ with
1591        -fno-exceptions to eliminate exceptions entirely; in such cases,
1592        you must check for failures in your C++ code, e.g., check for NULL
1593        returned by new().
1594      * If calling backend functions from C++ code, be sure that the C++
1595        call stack contains only plain old data structures (POD). This is
1596        necessary because backend errors generate a distant longjmp() that
1597        does not properly unwind a C++ call stack with non-POD objects.
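
        The hazard named in the last guideline can be reproduced in plain
        C with setjmp and longjmp. In this sketch, all demo_ names are
        hypothetical; the function that jumps stands in for a backend
        call that raises an error, and the skipped increment stands in
        for cleanup that a C++ destructor would otherwise perform.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf demo_error_context;
static int  demo_cleanups_run = 0;

/* Stand-in for a backend function that reports an error by jumping out. */
static void
demo_backend_call_that_fails(void)
{
    longjmp(demo_error_context, 1);
}

/* Stand-in for extension code that needs cleanup after the call. */
static void
demo_extension_code(void)
{
    demo_backend_call_that_fails();
    demo_cleanups_run++;        /* never reached: the jump skips it */
}

/* Returns 1 if the call was aborted by an error jump, else 0. */
static int
demo_run_with_error_handler(void)
{
    if (setjmp(demo_error_context) != 0)
        return 1;               /* control resumes here after longjmp */
    demo_extension_code();
    return 0;
}
```

        After demo_run_with_error_handler returns 1, demo_cleanups_run is
        still 0: control went straight from the failing call back to the
        setjmp site, and nothing in the intermediate frame was given a
        chance to clean up.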
1598
1599    In summary, it is best to place C++ code behind a wall of extern "C"
1600    functions that interface to the backend, and avoid exception, memory,
1601    and call stack leakage.