begriffs open source - ai-pg/blob - full-docs/txt/tablefunc.txt

   1
   2 F.43. tablefunc — functions that return tables (crosstab and others) #
   3
   4    F.43.1. Functions Provided
   5    F.43.2. Author
   6
   7    The tablefunc module includes various functions that return tables
   8    (that is, multiple rows). These functions are useful both in their own
   9    right and as examples of how to write C functions that return multiple
  10    rows.
  11
  12    This module is considered “trusted”, that is, it can be installed by
  13    non-superusers who have CREATE privilege on the current database.
  14
  15 F.43.1. Functions Provided #
  16
  17    Table F.33 summarizes the functions provided by the tablefunc module.
  18
  19    Table F.33. tablefunc Functions
  20
  21    Function
  22
  23    Description
  24
  25    normal_rand ( numvals integer, mean float8, stddev float8 ) → setof
  26    float8
  27
  28    Produces a set of normally distributed random values.
  29
  30    crosstab ( sql text ) → setof record
  31
  32    Produces a “pivot table” containing row names plus N value columns,
  33    where N is determined by the row type specified in the calling query.
  34
  35    crosstabN ( sql text ) → setof table_crosstab_N
  36
  37    Produces a “pivot table” containing row names plus N value columns.
  38    crosstab2, crosstab3, and crosstab4 are predefined, but you can create
  39    additional crosstabN functions as described below.
  40
  41    crosstab ( source_sql text, category_sql text ) → setof record
  42
  43    Produces a “pivot table” with the value columns specified by a second
  44    query.
  45
  46    crosstab ( sql text, N integer ) → setof record
  47
  48    Obsolete version of crosstab(text). The parameter N is now ignored,
  49    since the number of value columns is always determined by the calling
  50    query.
  51
  52    connectby ( relname text, keyid_fld text, parent_keyid_fld text [,
  53    orderby_fld text ], start_with text, max_depth integer [, branch_delim
  54    text ] ) → setof record
  55
  56    Produces a representation of a hierarchical tree structure.
  57
  58 F.43.1.1. normal_rand #
  59
  60 normal_rand(int numvals, float8 mean, float8 stddev) returns setof float8
  61
  62    normal_rand produces a set of normally distributed random values
  63    (Gaussian distribution).
  64
  65    numvals is the number of values to be returned from the function. mean
  66    is the mean of the normal distribution of values and stddev is the
  67    standard deviation of the normal distribution of values.
  68
  69    For example, this call requests 1000 values with a mean of 5 and a
  70    standard deviation of 3:
  71 test=# SELECT * FROM normal_rand(1000, 5, 3);
  72      normal_rand
  73 ----------------------
  74      1.56556322244898
  75      9.10040991424657
  76      5.36957140345079
  77    -0.369151492880995
  78     0.283600703686639
  79        .
  80        .
  81        .
  82      4.82992125404908
  83      9.71308014517282
  84      2.49639286969028
  85 (1000 rows)
  86
  87 F.43.1.2. crosstab(text) #
  88
  89 crosstab(text sql)
  90 crosstab(text sql, int N)
  91
  92    The crosstab function is used to produce “pivot” displays, wherein data
  93    is listed across the page rather than down. For example, we might have
  94    data like
  95 row1    val11
  96 row1    val12
  97 row1    val13
  98 ...
  99 row2    val21
 100 row2    val22
 101 row2    val23
 102 ...
 103
 104    which we wish to display like
 105 row1    val11   val12   val13   ...
 106 row2    val21   val22   val23   ...
 107 ...
 108
 109    The crosstab function takes a text parameter that is an SQL query
 110    producing raw data formatted in the first way, and produces a table
 111    formatted in the second way.
 112
 113    The sql parameter is an SQL statement that produces the source set of
 114    data. This statement must return one row_name column, one category
 115    column, and one value column. N is an obsolete parameter, ignored if
 116    supplied (formerly this had to match the number of output value
 117    columns, but now that is determined by the calling query).
 118
 119    For example, the provided query might produce a set something like:
 120  row_name    cat    value
 121 ----------+-------+-------
 122   row1      cat1    val1
 123   row1      cat2    val2
 124   row1      cat3    val3
 125   row1      cat4    val4
 126   row2      cat1    val5
 127   row2      cat2    val6
 128   row2      cat3    val7
 129   row2      cat4    val8
 130
 131    The crosstab function is declared to return setof record, so the actual
 132    names and types of the output columns must be defined in the FROM
 133    clause of the calling SELECT statement, for example:
 134 SELECT * FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 t
 135 ext);
 136
 137    This example produces a set something like:
 138            <== value  columns  ==>
 139  row_name   category_1   category_2
 140 ----------+------------+------------
 141   row1        val1         val2
 142   row2        val5         val6
 143
 144    The FROM clause must define the output as one row_name column (of the
 145    same data type as the first result column of the SQL query) followed by
 146    N value columns (all of the same data type as the third result column
 147    of the SQL query). You can set up as many output value columns as you
 148    wish. The names of the output columns are up to you.
 149
 150    The crosstab function produces one output row for each consecutive
 151    group of input rows with the same row_name value. It fills the output
 152    value columns, left to right, with the value fields from these rows. If
 153    there are fewer rows in a group than there are output value columns,
 154    the extra output columns are filled with nulls; if there are more rows,
 155    the extra input rows are skipped.
 156
 157    In practice the SQL query should always specify ORDER BY 1,2 to ensure
 158    that the input rows are properly ordered, that is, values with the same
 159    row_name are brought together and correctly ordered within the row.
 160    Notice that crosstab itself does not pay any attention to the second
 161    column of the query result; it's just there to be ordered by, to
 162    control the order in which the third-column values appear across the
 163    page.
 164
 165    Here is a complete example:
 166 CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);
 167 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');
 168 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');
 169 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');
 170 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');
 171 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');
 172 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');
 173 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');
 174 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');
 175
 176 SELECT *
 177 FROM crosstab(
 178   'select rowid, attribute, value
 179    from ct
 180    where attribute = ''att2'' or attribute = ''att3''
 181    order by 1,2')
 182 AS ct(row_name text, category_1 text, category_2 text, category_3 text);
 183
 184  row_name | category_1 | category_2 | category_3
 185 ----------+------------+------------+------------
 186  test1    | val2       | val3       |
 187  test2    | val6       | val7       |
 188 (2 rows)
 189
 190    You can avoid always having to write out a FROM clause to define the
 191    output columns, by setting up a custom crosstab function that has the
 192    desired output row type wired into its definition. This is described in
 193    the next section. Another possibility is to embed the required FROM
 194    clause in a view definition.
 195
 196 Note
 197
 198    See also the \crosstabview command in psql, which provides
 199    functionality similar to crosstab().
 200
 201 F.43.1.3. crosstabN(text) #
 202
 203 crosstabN(text sql)
 204
 205    The crosstabN functions are examples of how to set up custom wrappers
 206    for the general crosstab function, so that you need not write out
 207    column names and types in the calling SELECT query. The tablefunc
 208    module includes crosstab2, crosstab3, and crosstab4, whose output row
 209    types are defined as
 210 CREATE TYPE tablefunc_crosstab_N AS (
 211     row_name TEXT,
 212     category_1 TEXT,
 213     category_2 TEXT,
 214         .
 215         .
 216         .
 217     category_N TEXT
 218 );
 219
 220    Thus, these functions can be used directly when the input query
 221    produces row_name and value columns of type text, and you want 2, 3, or
 222    4 output values columns. In all other ways they behave exactly as
 223    described above for the general crosstab function.
 224
 225    For instance, the example given in the previous section would also work
 226    as
 227 SELECT *
 228 FROM crosstab3(
 229   'select rowid, attribute, value
 230    from ct
 231    where attribute = ''att2'' or attribute = ''att3''
 232    order by 1,2');
 233
 234    These functions are provided mostly for illustration purposes. You can
 235    create your own return types and functions based on the underlying
 236    crosstab() function. There are two ways to do it:
 237      * Create a composite type describing the desired output columns,
 238        similar to the examples in contrib/tablefunc/tablefunc--1.0.sql.
 239        Then define a unique function name accepting one text parameter and
 240        returning setof your_type_name, but linking to the same underlying
 241        crosstab C function. For example, if your source data produces row
 242        names that are text, and values that are float8, and you want 5
 243        value columns:
 244 CREATE TYPE my_crosstab_float8_5_cols AS (
 245     my_row_name text,
 246     my_category_1 float8,
 247     my_category_2 float8,
 248     my_category_3 float8,
 249     my_category_4 float8,
 250     my_category_5 float8
 251 );
 252
 253 CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(text)
 254     RETURNS setof my_crosstab_float8_5_cols
 255     AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
 256
 257      * Use OUT parameters to define the return type implicitly. The same
 258        example could also be done this way:
 259 CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(
 260     IN text,
 261     OUT my_row_name text,
 262     OUT my_category_1 float8,
 263     OUT my_category_2 float8,
 264     OUT my_category_3 float8,
 265     OUT my_category_4 float8,
 266     OUT my_category_5 float8)
 267   RETURNS setof record
 268   AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
 269
 270 F.43.1.4. crosstab(text, text) #
 271
 272 crosstab(text source_sql, text category_sql)
 273
 274    The main limitation of the single-parameter form of crosstab is that it
 275    treats all values in a group alike, inserting each value into the first
 276    available column. If you want the value columns to correspond to
 277    specific categories of data, and some groups might not have data for
 278    some of the categories, that doesn't work well. The two-parameter form
 279    of crosstab handles this case by providing an explicit list of the
 280    categories corresponding to the output columns.
 281
 282    source_sql is an SQL statement that produces the source set of data.
 283    This statement must return one row_name column, one category column,
 284    and one value column. It may also have one or more “extra” columns. The
 285    row_name column must be first. The category and value columns must be
 286    the last two columns, in that order. Any columns between row_name and
 287    category are treated as “extra”. The “extra” columns are expected to be
 288    the same for all rows with the same row_name value.
 289
 290    For example, source_sql might produce a set something like:
 291 SELECT row_name, extra_col, cat, value FROM foo ORDER BY 1;
 292
 293  row_name    extra_col   cat    value
 294 ----------+------------+-----+---------
 295   row1         extra1    cat1    val1
 296   row1         extra1    cat2    val2
 297   row1         extra1    cat4    val4
 298   row2         extra2    cat1    val5
 299   row2         extra2    cat2    val6
 300   row2         extra2    cat3    val7
 301   row2         extra2    cat4    val8
 302
 303    category_sql is an SQL statement that produces the set of categories.
 304    This statement must return only one column. It must produce at least
 305    one row, or an error will be generated. Also, it must not produce
 306    duplicate values, or an error will be generated. category_sql might be
 307    something like:
 308 SELECT DISTINCT cat FROM foo ORDER BY 1;
 309     cat
 310   -------
 311     cat1
 312     cat2
 313     cat3
 314     cat4
 315
 316    The crosstab function is declared to return setof record, so the actual
 317    names and types of the output columns must be defined in the FROM
 318    clause of the calling SELECT statement, for example:
 319 SELECT * FROM crosstab('...', '...')
 320     AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text)
 321 ;
 322
 323    This will produce a result something like:
 324                   <==  value  columns   ==>
 325 row_name   extra   cat1   cat2   cat3   cat4
 326 ---------+-------+------+------+------+------
 327   row1     extra1  val1   val2          val4
 328   row2     extra2  val5   val6   val7   val8
 329
 330    The FROM clause must define the proper number of output columns of the
 331    proper data types. If there are N columns in the source_sql query's
 332    result, the first N-2 of them must match up with the first N-2 output
 333    columns. The remaining output columns must have the type of the last
 334    column of the source_sql query's result, and there must be exactly as
 335    many of them as there are rows in the category_sql query's result.
 336
 337    The crosstab function produces one output row for each consecutive
 338    group of input rows with the same row_name value. The output row_name
 339    column, plus any “extra” columns, are copied from the first row of the
 340    group. The output value columns are filled with the value fields from
 341    rows having matching category values. If a row's category does not
 342    match any output of the category_sql query, its value is ignored.
 343    Output columns whose matching category is not present in any input row
 344    of the group are filled with nulls.
 345
 346    In practice the source_sql query should always specify ORDER BY 1 to
 347    ensure that values with the same row_name are brought together.
 348    However, ordering of the categories within a group is not important.
 349    Also, it is essential to be sure that the order of the category_sql
 350    query's output matches the specified output column order.
 351
 352    Here are two complete examples:
 353 create table sales(year int, month int, qty int);
 354 insert into sales values(2007, 1, 1000);
 355 insert into sales values(2007, 2, 1500);
 356 insert into sales values(2007, 7, 500);
 357 insert into sales values(2007, 11, 1500);
 358 insert into sales values(2007, 12, 2000);
 359 insert into sales values(2008, 1, 1000);
 360
 361 select * from crosstab(
 362   'select year, month, qty from sales order by 1',
 363   'select m from generate_series(1,12) m'
 364 ) as (
 365   year int,
 366   "Jan" int,
 367   "Feb" int,
 368   "Mar" int,
 369   "Apr" int,
 370   "May" int,
 371   "Jun" int,
 372   "Jul" int,
 373   "Aug" int,
 374   "Sep" int,
 375   "Oct" int,
 376   "Nov" int,
 377   "Dec" int
 378 );
 379  year | Jan  | Feb  | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov  | Dec
 380 ------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+----
 381 --
 382  2007 | 1000 | 1500 |     |     |     |     | 500 |     |     |     | 1500 | 200
 383 0
 384  2008 | 1000 |      |     |     |     |     |     |     |     |     |      |
 385 (2 rows)
 386
 387 CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
 388 INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
 389 INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
 390 INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
 391 INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
 392 INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
 393 INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003')
 394 ;
 395 INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
 396
 397 SELECT * FROM crosstab
 398 (
 399   'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
 400   'SELECT DISTINCT attribute FROM cth ORDER BY 1'
 401 )
 402 AS
 403 (
 404        rowid text,
 405        rowdt timestamp,
 406        temperature int4,
 407        test_result text,
 408        test_startdate timestamp,
 409        volts float8
 410 );
 411  rowid |          rowdt           | temperature | test_result |      test_startd
 412 ate      | volts
 413 -------+--------------------------+-------------+-------------+-----------------
 414 ---------+--------
 415  test1 | Sat Mar 01 00:00:00 2003 |          42 | PASS        |
 416          | 2.6987
 417  test2 | Sun Mar 02 00:00:00 2003 |          53 | FAIL        | Sat Mar 01 00:00
 418 :00 2003 | 3.1234
 419 (2 rows)
 420
 421    You can create predefined functions to avoid having to write out the
 422    result column names and types in each query. See the examples in the
 423    previous section. The underlying C function for this form of crosstab
 424    is named crosstab_hash.
 425
 426 F.43.1.5. connectby #
 427
 428 connectby(text relname, text keyid_fld, text parent_keyid_fld
 429           [, text orderby_fld ], text start_with, int max_depth
 430           [, text branch_delim ])
 431
 432    The connectby function produces a display of hierarchical data that is
 433    stored in a table. The table must have a key field that uniquely
 434    identifies rows, and a parent-key field that references the parent (if
 435    any) of each row. connectby can display the sub-tree descending from
 436    any row.
 437
 438    Table F.34 explains the parameters.
 439
 440    Table F.34. connectby Parameters
 441       Parameter                           Description
 442    relname          Name of the source relation
 443    keyid_fld        Name of the key field
 444    parent_keyid_fld Name of the parent-key field
 445    orderby_fld      Name of the field to order siblings by (optional)
 446    start_with       Key value of the row to start at
 447    max_depth        Maximum depth to descend to, or zero for unlimited depth
 448    branch_delim     String to separate keys with in branch output (optional)
 449
 450    The key and parent-key fields can be any data type, but they must be
 451    the same type. Note that the start_with value must be entered as a text
 452    string, regardless of the type of the key field.
 453
 454    The connectby function is declared to return setof record, so the
 455    actual names and types of the output columns must be defined in the
 456    FROM clause of the calling SELECT statement, for example:
 457 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
 458 , 0, '~')
 459     AS t(keyid text, parent_keyid text, level int, branch text, pos int);
 460
 461    The first two output columns are used for the current row's key and its
 462    parent row's key; they must match the type of the table's key field.
 463    The third output column is the depth in the tree and must be of type
 464    integer. If a branch_delim parameter was given, the next output column
 465    is the branch display and must be of type text. Finally, if an
 466    orderby_fld parameter was given, the last output column is a serial
 467    number, and must be of type integer.
 468
 469    The “branch” output column shows the path of keys taken to reach the
 470    current row. The keys are separated by the specified branch_delim
 471    string. If no branch display is wanted, omit both the branch_delim
 472    parameter and the branch column in the output column list.
 473
 474    If the ordering of siblings of the same parent is important, include
 475    the orderby_fld parameter to specify which field to order siblings by.
 476    This field can be of any sortable data type. The output column list
 477    must include a final integer serial-number column, if and only if
 478    orderby_fld is specified.
 479
 480    The parameters representing table and field names are copied as-is into
 481    the SQL queries that connectby generates internally. Therefore, include
 482    double quotes if the names are mixed-case or contain special
 483    characters. You may also need to schema-qualify the table name.
 484
 485    In large tables, performance will be poor unless there is an index on
 486    the parent-key field.
 487
 488    It is important that the branch_delim string not appear in any key
 489    values, else connectby may incorrectly report an infinite-recursion
 490    error. Note that if branch_delim is not provided, a default value of ~
 491    is used for recursion detection purposes.
 492
 493    Here is an example:
 494 CREATE TABLE connectby_tree(keyid text, parent_keyid text, pos int);
 495
 496 INSERT INTO connectby_tree VALUES('row1',NULL, 0);
 497 INSERT INTO connectby_tree VALUES('row2','row1', 0);
 498 INSERT INTO connectby_tree VALUES('row3','row1', 0);
 499 INSERT INTO connectby_tree VALUES('row4','row2', 1);
 500 INSERT INTO connectby_tree VALUES('row5','row2', 0);
 501 INSERT INTO connectby_tree VALUES('row6','row4', 0);
 502 INSERT INTO connectby_tree VALUES('row7','row3', 0);
 503 INSERT INTO connectby_tree VALUES('row8','row6', 0);
 504 INSERT INTO connectby_tree VALUES('row9','row5', 0);
 505
 506 -- with branch, without orderby_fld (order of results is not guaranteed)
 507 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~
 508 ')
 509  AS t(keyid text, parent_keyid text, level int, branch text);
 510  keyid | parent_keyid | level |       branch
 511 -------+--------------+-------+---------------------
 512  row2  |              |     0 | row2
 513  row4  | row2         |     1 | row2~row4
 514  row6  | row4         |     2 | row2~row4~row6
 515  row8  | row6         |     3 | row2~row4~row6~row8
 516  row5  | row2         |     1 | row2~row5
 517  row9  | row5         |     2 | row2~row5~row9
 518 (6 rows)
 519
 520 -- without branch, without orderby_fld (order of results is not guaranteed)
 521 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)
 522  AS t(keyid text, parent_keyid text, level int);
 523  keyid | parent_keyid | level
 524 -------+--------------+-------
 525  row2  |              |     0
 526  row4  | row2         |     1
 527  row6  | row4         |     2
 528  row8  | row6         |     3
 529  row5  | row2         |     1
 530  row9  | row5         |     2
 531 (6 rows)
 532
 533 -- with branch, with orderby_fld (notice that row5 comes before row4)
 534 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
 535 , 0, '~')
 536  AS t(keyid text, parent_keyid text, level int, branch text, pos int);
 537  keyid | parent_keyid | level |       branch        | pos
 538 -------+--------------+-------+---------------------+-----
 539  row2  |              |     0 | row2                |   1
 540  row5  | row2         |     1 | row2~row5           |   2
 541  row9  | row5         |     2 | row2~row5~row9      |   3
 542  row4  | row2         |     1 | row2~row4           |   4
 543  row6  | row4         |     2 | row2~row4~row6      |   5
 544  row8  | row6         |     3 | row2~row4~row6~row8 |   6
 545 (6 rows)
 546
 547 -- without branch, with orderby_fld (notice that row5 comes before row4)
 548 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
 549 , 0)
 550  AS t(keyid text, parent_keyid text, level int, pos int);
 551  keyid | parent_keyid | level | pos
 552 -------+--------------+-------+-----
 553  row2  |              |     0 |   1
 554  row5  | row2         |     1 |   2
 555  row9  | row5         |     2 |   3
 556  row4  | row2         |     1 |   4
 557  row6  | row4         |     2 |   5
 558  row8  | row6         |     3 |   6
 559 (6 rows)
 560
 561 F.43.2. Author #
 562
 563    Joe Conway