2 F.43. tablefunc — functions that return tables (crosstab and others) #
4 F.43.1. Functions Provided
7 The tablefunc module includes various functions that return tables
8 (that is, multiple rows). These functions are useful both in their own
9 right and as examples of how to write C functions that return multiple
12 This module is considered “trusted”, that is, it can be installed by
13 non-superusers who have CREATE privilege on the current database.
15 F.43.1. Functions Provided #
17 Table F.33 summarizes the functions provided by the tablefunc module.
19 Table F.33. tablefunc Functions
25 normal_rand ( numvals integer, mean float8, stddev float8 ) → setof
28 Produces a set of normally distributed random values.
30 crosstab ( sql text ) → setof record
32 Produces a “pivot table” containing row names plus N value columns,
33 where N is determined by the row type specified in the calling query.
35 crosstabN ( sql text ) → setof table_crosstab_N
37 Produces a “pivot table” containing row names plus N value columns.
38 crosstab2, crosstab3, and crosstab4 are predefined, but you can create
39 additional crosstabN functions as described below.
41 crosstab ( source_sql text, category_sql text ) → setof record
43 Produces a “pivot table” with the value columns specified by a second
46 crosstab ( sql text, N integer ) → setof record
48 Obsolete version of crosstab(text). The parameter N is now ignored,
49 since the number of value columns is always determined by the calling
52 connectby ( relname text, keyid_fld text, parent_keyid_fld text [,
53 orderby_fld text ], start_with text, max_depth integer [, branch_delim
54 text ] ) → setof record
56 Produces a representation of a hierarchical tree structure.
58 F.43.1.1. normal_rand #
60 normal_rand(int numvals, float8 mean, float8 stddev) returns setof float8
62 normal_rand produces a set of normally distributed random values
63 (Gaussian distribution).
65 numvals is the number of values to be returned from the function. mean
66 is the mean of the normal distribution of values and stddev is the
67 standard deviation of the normal distribution of values.
69 For example, this call requests 1000 values with a mean of 5 and a
70 standard deviation of 3:
71 test=# SELECT * FROM normal_rand(1000, 5, 3);
73 ----------------------
87 F.43.1.2. crosstab(text) #
90 crosstab(text sql, int N)
92 The crosstab function is used to produce “pivot” displays, wherein data
93 is listed across the page rather than down. For example, we might have
104 which we wish to display like
105 row1 val11 val12 val13 ...
106 row2 val21 val22 val23 ...
109 The crosstab function takes a text parameter that is an SQL query
110 producing raw data formatted in the first way, and produces a table
111 formatted in the second way.
113 The sql parameter is an SQL statement that produces the source set of
114 data. This statement must return one row_name column, one category
115 column, and one value column. N is an obsolete parameter, ignored if
116 supplied (formerly this had to match the number of output value
117 columns, but now that is determined by the calling query).
119 For example, the provided query might produce a set something like:
121 ----------+-------+-------
131 The crosstab function is declared to return setof record, so the actual
132 names and types of the output columns must be defined in the FROM
133 clause of the calling SELECT statement, for example:
134 SELECT * FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 t
137 This example produces a set something like:
138 <== value columns ==>
139 row_name category_1 category_2
140 ----------+------------+------------
144 The FROM clause must define the output as one row_name column (of the
145 same data type as the first result column of the SQL query) followed by
146 N value columns (all of the same data type as the third result column
147 of the SQL query). You can set up as many output value columns as you
148 wish. The names of the output columns are up to you.
150 The crosstab function produces one output row for each consecutive
151 group of input rows with the same row_name value. It fills the output
152 value columns, left to right, with the value fields from these rows. If
153 there are fewer rows in a group than there are output value columns,
154 the extra output columns are filled with nulls; if there are more rows,
155 the extra input rows are skipped.
157 In practice the SQL query should always specify ORDER BY 1,2 to ensure
158 that the input rows are properly ordered, that is, values with the same
159 row_name are brought together and correctly ordered within the row.
160 Notice that crosstab itself does not pay any attention to the second
161 column of the query result; it's just there to be ordered by, to
162 control the order in which the third-column values appear across the
165 Here is a complete example:
166 CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);
167 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');
168 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');
169 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');
170 INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');
171 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');
172 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');
173 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');
174 INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');
178 'select rowid, attribute, value
180 where attribute = ''att2'' or attribute = ''att3''
182 AS ct(row_name text, category_1 text, category_2 text, category_3 text);
184 row_name | category_1 | category_2 | category_3
185 ----------+------------+------------+------------
186 test1 | val2 | val3 |
187 test2 | val6 | val7 |
190 You can avoid always having to write out a FROM clause to define the
191 output columns, by setting up a custom crosstab function that has the
192 desired output row type wired into its definition. This is described in
193 the next section. Another possibility is to embed the required FROM
194 clause in a view definition.
198 See also the \crosstabview command in psql, which provides
199 functionality similar to crosstab().
201 F.43.1.3. crosstabN(text) #
205 The crosstabN functions are examples of how to set up custom wrappers
206 for the general crosstab function, so that you need not write out
207 column names and types in the calling SELECT query. The tablefunc
208 module includes crosstab2, crosstab3, and crosstab4, whose output row
210 CREATE TYPE tablefunc_crosstab_N AS (
220 Thus, these functions can be used directly when the input query
221 produces row_name and value columns of type text, and you want 2, 3, or
222 4 output values columns. In all other ways they behave exactly as
223 described above for the general crosstab function.
225 For instance, the example given in the previous section would also work
229 'select rowid, attribute, value
231 where attribute = ''att2'' or attribute = ''att3''
234 These functions are provided mostly for illustration purposes. You can
235 create your own return types and functions based on the underlying
236 crosstab() function. There are two ways to do it:
237 * Create a composite type describing the desired output columns,
238 similar to the examples in contrib/tablefunc/tablefunc--1.0.sql.
239 Then define a unique function name accepting one text parameter and
240 returning setof your_type_name, but linking to the same underlying
241 crosstab C function. For example, if your source data produces row
242 names that are text, and values that are float8, and you want 5
244 CREATE TYPE my_crosstab_float8_5_cols AS (
246 my_category_1 float8,
247 my_category_2 float8,
248 my_category_3 float8,
249 my_category_4 float8,
253 CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(text)
254 RETURNS setof my_crosstab_float8_5_cols
255 AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
257 * Use OUT parameters to define the return type implicitly. The same
258 example could also be done this way:
259 CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(
261 OUT my_row_name text,
262 OUT my_category_1 float8,
263 OUT my_category_2 float8,
264 OUT my_category_3 float8,
265 OUT my_category_4 float8,
266 OUT my_category_5 float8)
268 AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
270 F.43.1.4. crosstab(text, text) #
272 crosstab(text source_sql, text category_sql)
274 The main limitation of the single-parameter form of crosstab is that it
275 treats all values in a group alike, inserting each value into the first
276 available column. If you want the value columns to correspond to
277 specific categories of data, and some groups might not have data for
278 some of the categories, that doesn't work well. The two-parameter form
279 of crosstab handles this case by providing an explicit list of the
280 categories corresponding to the output columns.
282 source_sql is an SQL statement that produces the source set of data.
283 This statement must return one row_name column, one category column,
284 and one value column. It may also have one or more “extra” columns. The
285 row_name column must be first. The category and value columns must be
286 the last two columns, in that order. Any columns between row_name and
287 category are treated as “extra”. The “extra” columns are expected to be
288 the same for all rows with the same row_name value.
290 For example, source_sql might produce a set something like:
291 SELECT row_name, extra_col, cat, value FROM foo ORDER BY 1;
293 row_name extra_col cat value
294 ----------+------------+-----+---------
295 row1 extra1 cat1 val1
296 row1 extra1 cat2 val2
297 row1 extra1 cat4 val4
298 row2 extra2 cat1 val5
299 row2 extra2 cat2 val6
300 row2 extra2 cat3 val7
301 row2 extra2 cat4 val8
303 category_sql is an SQL statement that produces the set of categories.
304 This statement must return only one column. It must produce at least
305 one row, or an error will be generated. Also, it must not produce
306 duplicate values, or an error will be generated. category_sql might be
308 SELECT DISTINCT cat FROM foo ORDER BY 1;
316 The crosstab function is declared to return setof record, so the actual
317 names and types of the output columns must be defined in the FROM
318 clause of the calling SELECT statement, for example:
319 SELECT * FROM crosstab('...', '...')
320 AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text)
323 This will produce a result something like:
324 <== value columns ==>
325 row_name extra cat1 cat2 cat3 cat4
326 ---------+-------+------+------+------+------
327 row1 extra1 val1 val2 val4
328 row2 extra2 val5 val6 val7 val8
330 The FROM clause must define the proper number of output columns of the
331 proper data types. If there are N columns in the source_sql query's
332 result, the first N-2 of them must match up with the first N-2 output
333 columns. The remaining output columns must have the type of the last
334 column of the source_sql query's result, and there must be exactly as
335 many of them as there are rows in the category_sql query's result.
337 The crosstab function produces one output row for each consecutive
338 group of input rows with the same row_name value. The output row_name
339 column, plus any “extra” columns, are copied from the first row of the
340 group. The output value columns are filled with the value fields from
341 rows having matching category values. If a row's category does not
342 match any output of the category_sql query, its value is ignored.
343 Output columns whose matching category is not present in any input row
344 of the group are filled with nulls.
346 In practice the source_sql query should always specify ORDER BY 1 to
347 ensure that values with the same row_name are brought together.
348 However, ordering of the categories within a group is not important.
349 Also, it is essential to be sure that the order of the category_sql
350 query's output matches the specified output column order.
352 Here are two complete examples:
353 create table sales(year int, month int, qty int);
354 insert into sales values(2007, 1, 1000);
355 insert into sales values(2007, 2, 1500);
356 insert into sales values(2007, 7, 500);
357 insert into sales values(2007, 11, 1500);
358 insert into sales values(2007, 12, 2000);
359 insert into sales values(2008, 1, 1000);
361 select * from crosstab(
362 'select year, month, qty from sales order by 1',
363 'select m from generate_series(1,12) m'
379 year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
380 ------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+----
382 2007 | 1000 | 1500 | | | | | 500 | | | | 1500 | 200
384 2008 | 1000 | | | | | | | | | | |
387 CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
388 INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
389 INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
390 INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
391 INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
392 INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
393 INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003')
395 INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
397 SELECT * FROM crosstab
399 'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
400 'SELECT DISTINCT attribute FROM cth ORDER BY 1'
408 test_startdate timestamp,
411 rowid | rowdt | temperature | test_result | test_startd
413 -------+--------------------------+-------------+-------------+-----------------
415 test1 | Sat Mar 01 00:00:00 2003 | 42 | PASS |
417 test2 | Sun Mar 02 00:00:00 2003 | 53 | FAIL | Sat Mar 01 00:00
421 You can create predefined functions to avoid having to write out the
422 result column names and types in each query. See the examples in the
423 previous section. The underlying C function for this form of crosstab
424 is named crosstab_hash.
426 F.43.1.5. connectby #
428 connectby(text relname, text keyid_fld, text parent_keyid_fld
429 [, text orderby_fld ], text start_with, int max_depth
430 [, text branch_delim ])
432 The connectby function produces a display of hierarchical data that is
433 stored in a table. The table must have a key field that uniquely
434 identifies rows, and a parent-key field that references the parent (if
435 any) of each row. connectby can display the sub-tree descending from
438 Table F.34 explains the parameters.
440 Table F.34. connectby Parameters
441 Parameter Description
442 relname Name of the source relation
443 keyid_fld Name of the key field
444 parent_keyid_fld Name of the parent-key field
445 orderby_fld Name of the field to order siblings by (optional)
446 start_with Key value of the row to start at
447 max_depth Maximum depth to descend to, or zero for unlimited depth
448 branch_delim String to separate keys with in branch output (optional)
450 The key and parent-key fields can be any data type, but they must be
451 the same type. Note that the start_with value must be entered as a text
452 string, regardless of the type of the key field.
454 The connectby function is declared to return setof record, so the
455 actual names and types of the output columns must be defined in the
456 FROM clause of the calling SELECT statement, for example:
457 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
459 AS t(keyid text, parent_keyid text, level int, branch text, pos int);
461 The first two output columns are used for the current row's key and its
462 parent row's key; they must match the type of the table's key field.
463 The third output column is the depth in the tree and must be of type
464 integer. If a branch_delim parameter was given, the next output column
465 is the branch display and must be of type text. Finally, if an
466 orderby_fld parameter was given, the last output column is a serial
467 number, and must be of type integer.
469 The “branch” output column shows the path of keys taken to reach the
470 current row. The keys are separated by the specified branch_delim
471 string. If no branch display is wanted, omit both the branch_delim
472 parameter and the branch column in the output column list.
474 If the ordering of siblings of the same parent is important, include
475 the orderby_fld parameter to specify which field to order siblings by.
476 This field can be of any sortable data type. The output column list
477 must include a final integer serial-number column, if and only if
478 orderby_fld is specified.
480 The parameters representing table and field names are copied as-is into
481 the SQL queries that connectby generates internally. Therefore, include
482 double quotes if the names are mixed-case or contain special
483 characters. You may also need to schema-qualify the table name.
485 In large tables, performance will be poor unless there is an index on
486 the parent-key field.
488 It is important that the branch_delim string not appear in any key
489 values, else connectby may incorrectly report an infinite-recursion
490 error. Note that if branch_delim is not provided, a default value of ~
491 is used for recursion detection purposes.
494 CREATE TABLE connectby_tree(keyid text, parent_keyid text, pos int);
496 INSERT INTO connectby_tree VALUES('row1',NULL, 0);
497 INSERT INTO connectby_tree VALUES('row2','row1', 0);
498 INSERT INTO connectby_tree VALUES('row3','row1', 0);
499 INSERT INTO connectby_tree VALUES('row4','row2', 1);
500 INSERT INTO connectby_tree VALUES('row5','row2', 0);
501 INSERT INTO connectby_tree VALUES('row6','row4', 0);
502 INSERT INTO connectby_tree VALUES('row7','row3', 0);
503 INSERT INTO connectby_tree VALUES('row8','row6', 0);
504 INSERT INTO connectby_tree VALUES('row9','row5', 0);
506 -- with branch, without orderby_fld (order of results is not guaranteed)
507 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~
509 AS t(keyid text, parent_keyid text, level int, branch text);
510 keyid | parent_keyid | level | branch
511 -------+--------------+-------+---------------------
513 row4 | row2 | 1 | row2~row4
514 row6 | row4 | 2 | row2~row4~row6
515 row8 | row6 | 3 | row2~row4~row6~row8
516 row5 | row2 | 1 | row2~row5
517 row9 | row5 | 2 | row2~row5~row9
520 -- without branch, without orderby_fld (order of results is not guaranteed)
521 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)
522 AS t(keyid text, parent_keyid text, level int);
523 keyid | parent_keyid | level
524 -------+--------------+-------
533 -- with branch, with orderby_fld (notice that row5 comes before row4)
534 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
536 AS t(keyid text, parent_keyid text, level int, branch text, pos int);
537 keyid | parent_keyid | level | branch | pos
538 -------+--------------+-------+---------------------+-----
539 row2 | | 0 | row2 | 1
540 row5 | row2 | 1 | row2~row5 | 2
541 row9 | row5 | 2 | row2~row5~row9 | 3
542 row4 | row2 | 1 | row2~row4 | 4
543 row6 | row4 | 2 | row2~row4~row6 | 5
544 row8 | row6 | 3 | row2~row4~row6~row8 | 6
547 -- without branch, with orderby_fld (notice that row5 comes before row4)
548 SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2'
550 AS t(keyid text, parent_keyid text, level int, pos int);
551 keyid | parent_keyid | level | pos
552 -------+--------------+-------+-----