2 9.21. Aggregate Functions #
4 Aggregate functions compute a single result from a set of input values.
5 The built-in general-purpose aggregate functions are listed in
6 Table 9.62 while statistical aggregates are in Table 9.63. The built-in
7 within-group ordered-set aggregate functions are listed in Table 9.64
8 while the built-in within-group hypothetical-set ones are in
9 Table 9.65. Grouping operations, which are closely related to aggregate
10 functions, are listed in Table 9.66. The special syntax considerations
11 for aggregate functions are explained in Section 4.2.7. Consult
12 Section 2.7 for additional introductory information.
14 Aggregate functions that support Partial Mode are eligible to
15 participate in various optimizations, such as parallel aggregation.
17 While all aggregates below accept an optional ORDER BY clause (as
18 outlined in Section 4.2.7), the clause has only been added to
19 aggregates whose output is affected by ordering.
21 Table 9.62. General-Purpose Aggregate Functions
28 any_value ( anyelement ) → same as input type
30 Returns an arbitrary value from the non-null input values.
33 array_agg ( anynonarray ORDER BY input_sort_columns ) → anyarray
35 Collects all the input values, including nulls, into an array.
38 array_agg ( anyarray ORDER BY input_sort_columns ) → anyarray
40 Concatenates all the input arrays into an array of one higher
41 dimension. (The inputs must all have the same dimensionality, and
42 cannot be empty or null.)
45 avg ( smallint ) → numeric
47 avg ( integer ) → numeric
49 avg ( bigint ) → numeric
51 avg ( numeric ) → numeric
53 avg ( real ) → double precision
55 avg ( double precision ) → double precision
57 avg ( interval ) → interval
59 Computes the average (arithmetic mean) of all the non-null input
63 bit_and ( smallint ) → smallint
65 bit_and ( integer ) → integer
67 bit_and ( bigint ) → bigint
71 Computes the bitwise AND of all non-null input values.
74 bit_or ( smallint ) → smallint
76 bit_or ( integer ) → integer
78 bit_or ( bigint ) → bigint
82 Computes the bitwise OR of all non-null input values.
85 bit_xor ( smallint ) → smallint
87 bit_xor ( integer ) → integer
89 bit_xor ( bigint ) → bigint
93 Computes the bitwise exclusive OR of all non-null input values. Can be
94 useful as a checksum for an unordered set of values.
97 bool_and ( boolean ) → boolean
99 Returns true if all non-null input values are true, otherwise false.
102 bool_or ( boolean ) → boolean
104 Returns true if any non-null input value is true, otherwise false.
109 Computes the number of input rows.
112 count ( "any" ) → bigint
114 Computes the number of input rows in which the input value is not null.
117 every ( boolean ) → boolean
119 This is the SQL standard's equivalent to bool_and.
122 json_agg ( anyelement ORDER BY input_sort_columns ) → json
124 jsonb_agg ( anyelement ORDER BY input_sort_columns ) → jsonb
126 Collects all the input values, including nulls, into a JSON array.
127 Values are converted to JSON as per to_json or to_jsonb.
130 json_agg_strict ( anyelement ) → json
132 jsonb_agg_strict ( anyelement ) → jsonb
134 Collects all the input values, skipping nulls, into a JSON array.
135 Values are converted to JSON as per to_json or to_jsonb.
138 json_arrayagg ( [ value_expression ] [ ORDER BY sort_expression ] [ {
139 NULL | ABSENT } ON NULL ] [ RETURNING data_type [ FORMAT JSON [
142 Behaves in the same way as json_array but as an aggregate function so
143 it only takes one value_expression parameter. If ABSENT ON NULL is
144 specified, any NULL values are omitted. If ORDER BY is specified, the
145 elements will appear in the array in that order rather than in the
148 SELECT json_arrayagg(v) FROM (VALUES(2),(1)) t(v) → [2, 1]
151 json_objectagg ( [ { key_expression { VALUE | ':' } value_expression }
152 ] [ { NULL | ABSENT } ON NULL ] [ { WITH | WITHOUT } UNIQUE [ KEYS ] ]
153 [ RETURNING data_type [ FORMAT JSON [ ENCODING UTF8 ] ] ])
155 Behaves like json_object, but as an aggregate function, so it only
156 takes one key_expression and one value_expression parameter.
158 SELECT json_objectagg(k:v) FROM (VALUES
159 ('a'::text,current_date),('b',current_date + 1)) AS t(k,v) → { "a" :
160 "2022-05-10", "b" : "2022-05-11" }
163 json_object_agg ( key "any", value "any" ORDER BY input_sort_columns )
166 jsonb_object_agg ( key "any", value "any" ORDER BY input_sort_columns )
169 Collects all the key/value pairs into a JSON object. Key arguments are
170 coerced to text; value arguments are converted as per to_json or
171 to_jsonb. Values can be null, but keys cannot.
174 json_object_agg_strict ( key "any", value "any" ) → json
176 jsonb_object_agg_strict ( key "any", value "any" ) → jsonb
178 Collects all the key/value pairs into a JSON object. Key arguments are
179 coerced to text; value arguments are converted as per to_json or
180 to_jsonb. The key can not be null. If the value is null then the entry
184 json_object_agg_unique ( key "any", value "any" ) → json
186 jsonb_object_agg_unique ( key "any", value "any" ) → jsonb
188 Collects all the key/value pairs into a JSON object. Key arguments are
189 coerced to text; value arguments are converted as per to_json or
190 to_jsonb. Values can be null, but keys cannot. If there is a duplicate
191 key an error is thrown.
194 json_object_agg_unique_strict ( key "any", value "any" ) → json
196 jsonb_object_agg_unique_strict ( key "any", value "any" ) → jsonb
198 Collects all the key/value pairs into a JSON object. Key arguments are
199 coerced to text; value arguments are converted as per to_json or
200 to_jsonb. The key can not be null. If the value is null then the entry
201 is skipped. If there is a duplicate key an error is thrown.
204 max ( see text ) → same as input type
206 Computes the maximum of the non-null input values. Available for any
207 numeric, string, date/time, or enum type, as well as bytea, inet,
208 interval, money, oid, pg_lsn, tid, xid8, and also arrays and composite
209 types containing sortable data types.
212 min ( see text ) → same as input type
214 Computes the minimum of the non-null input values. Available for any
215 numeric, string, date/time, or enum type, as well as bytea, inet,
216 interval, money, oid, pg_lsn, tid, xid8, and also arrays and composite
217 types containing sortable data types.
220 range_agg ( value anyrange ) → anymultirange
222 range_agg ( value anymultirange ) → anymultirange
224 Computes the union of the non-null input values.
227 range_intersect_agg ( value anyrange ) → anyrange
229 range_intersect_agg ( value anymultirange ) → anymultirange
231 Computes the intersection of the non-null input values.
234 string_agg ( value text, delimiter text ) → text
236 string_agg ( value bytea, delimiter bytea ORDER BY input_sort_columns )
239 Concatenates the non-null input values into a string. Each value after
240 the first is preceded by the corresponding delimiter (if it's not
244 sum ( smallint ) → bigint
246 sum ( integer ) → bigint
248 sum ( bigint ) → numeric
250 sum ( numeric ) → numeric
254 sum ( double precision ) → double precision
256 sum ( interval ) → interval
258 sum ( money ) → money
260 Computes the sum of the non-null input values.
263 xmlagg ( xml ORDER BY input_sort_columns ) → xml
265 Concatenates the non-null XML input values (see Section 9.15.1.8).
268 It should be noted that except for count, these functions return a null
269 value when no rows are selected. In particular, sum of no rows returns
270 null, not zero as one might expect, and array_agg returns null rather
271 than an empty array when there are no input rows. The coalesce function
272 can be used to substitute zero or an empty array for null when
275 The aggregate functions array_agg, json_agg, jsonb_agg,
276 json_agg_strict, jsonb_agg_strict, json_object_agg, jsonb_object_agg,
277 json_object_agg_strict, jsonb_object_agg_strict,
278 json_object_agg_unique, jsonb_object_agg_unique,
279 json_object_agg_unique_strict, jsonb_object_agg_unique_strict,
280 string_agg, and xmlagg, as well as similar user-defined aggregate
281 functions, produce meaningfully different result values depending on
282 the order of the input values. This ordering is unspecified by default,
283 but can be controlled by writing an ORDER BY clause within the
284 aggregate call, as shown in Section 4.2.7. Alternatively, supplying the
285 input values from a sorted subquery will usually work. For example:
286 SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
288 Beware that this approach can fail if the outer query level contains
289 additional processing, such as a join, because that might cause the
290 subquery's output to be reordered before the aggregate is computed.
294 The boolean aggregates bool_and and bool_or correspond to the standard
295 SQL aggregates every and any or some. PostgreSQL supports every, but
296 not any or some, because there is an ambiguity built into the standard
298 SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;
300 Here ANY can be considered either as introducing a subquery, or as
301 being an aggregate function, if the subquery returns one row with a
302 Boolean value. Thus the standard name cannot be given to these
307 Users accustomed to working with other SQL database management systems
308 might be disappointed by the performance of the count aggregate when it
309 is applied to the entire table. A query like:
310 SELECT count(*) FROM sometable;
312 will require effort proportional to the size of the table: PostgreSQL
313 will need to scan either the entire table or the entirety of an index
314 that includes all rows in the table.
316 Table 9.63 shows aggregate functions typically used in statistical
317 analysis. (These are separated out merely to avoid cluttering the
318 listing of more-commonly-used aggregates.) Functions shown as accepting
319 numeric_type are available for all the types smallint, integer, bigint,
320 numeric, real, and double precision. Where the description mentions N,
321 it means the number of input rows for which all the input expressions
322 are non-null. In all cases, null is returned if the computation is
323 meaningless, for example when N is zero.
325 Table 9.63. Aggregate Functions for Statistics
332 corr ( Y double precision, X double precision ) → double precision
334 Computes the correlation coefficient.
337 covar_pop ( Y double precision, X double precision ) → double precision
339 Computes the population covariance.
342 covar_samp ( Y double precision, X double precision ) → double
345 Computes the sample covariance.
348 regr_avgx ( Y double precision, X double precision ) → double precision
350 Computes the average of the independent variable, sum(X)/N.
353 regr_avgy ( Y double precision, X double precision ) → double precision
355 Computes the average of the dependent variable, sum(Y)/N.
358 regr_count ( Y double precision, X double precision ) → bigint
360 Computes the number of rows in which both inputs are non-null.
363 regr_intercept ( Y double precision, X double precision ) → double
366 Computes the y-intercept of the least-squares-fit linear equation
367 determined by the (X, Y) pairs.
370 regr_r2 ( Y double precision, X double precision ) → double precision
372 Computes the square of the correlation coefficient.
375 regr_slope ( Y double precision, X double precision ) → double
378 Computes the slope of the least-squares-fit linear equation determined
382 regr_sxx ( Y double precision, X double precision ) → double precision
384 Computes the “sum of squares” of the independent variable, sum(X^2) -
388 regr_sxy ( Y double precision, X double precision ) → double precision
390 Computes the “sum of products” of independent times dependent
391 variables, sum(X*Y) - sum(X) * sum(Y)/N.
394 regr_syy ( Y double precision, X double precision ) → double precision
396 Computes the “sum of squares” of the dependent variable, sum(Y^2) -
400 stddev ( numeric_type ) → double precision for real or double
401 precision, otherwise numeric
403 This is a historical alias for stddev_samp.
406 stddev_pop ( numeric_type ) → double precision for real or double
407 precision, otherwise numeric
409 Computes the population standard deviation of the input values.
412 stddev_samp ( numeric_type ) → double precision for real or double
413 precision, otherwise numeric
415 Computes the sample standard deviation of the input values.
418 variance ( numeric_type ) → double precision for real or double
419 precision, otherwise numeric
421 This is a historical alias for var_samp.
424 var_pop ( numeric_type ) → double precision for real or double
425 precision, otherwise numeric
427 Computes the population variance of the input values (square of the
428 population standard deviation).
431 var_samp ( numeric_type ) → double precision for real or double
432 precision, otherwise numeric
434 Computes the sample variance of the input values (square of the sample
438 Table 9.64 shows some aggregate functions that use the ordered-set
439 aggregate syntax. These functions are sometimes referred to as “inverse
440 distribution” functions. Their aggregated input is introduced by ORDER
441 BY, and they may also take a direct argument that is not aggregated,
442 but is computed only once. All these functions ignore null values in
443 their aggregated input. For those that take a fraction parameter, the
444 fraction value must be between 0 and 1; an error is thrown if not.
445 However, a null fraction value simply produces a null result.
447 Table 9.64. Ordered-Set Aggregate Functions
454 mode () WITHIN GROUP ( ORDER BY anyelement ) → anyelement
456 Computes the mode, the most frequent value of the aggregated argument
457 (arbitrarily choosing the first one if there are multiple
458 equally-frequent values). The aggregated argument must be of a sortable
462 percentile_cont ( fraction double precision ) WITHIN GROUP ( ORDER BY
463 double precision ) → double precision
465 percentile_cont ( fraction double precision ) WITHIN GROUP ( ORDER BY
466 interval ) → interval
468 Computes the continuous percentile, a value corresponding to the
469 specified fraction within the ordered set of aggregated argument
470 values. This will interpolate between adjacent input items if needed.
473 percentile_cont ( fractions double precision[] ) WITHIN GROUP ( ORDER
474 BY double precision ) → double precision[]
476 percentile_cont ( fractions double precision[] ) WITHIN GROUP ( ORDER
477 BY interval ) → interval[]
479 Computes multiple continuous percentiles. The result is an array of the
480 same dimensions as the fractions parameter, with each non-null element
481 replaced by the (possibly interpolated) value corresponding to that
485 percentile_disc ( fraction double precision ) WITHIN GROUP ( ORDER BY
486 anyelement ) → anyelement
488 Computes the discrete percentile, the first value within the ordered
489 set of aggregated argument values whose position in the ordering equals
490 or exceeds the specified fraction. The aggregated argument must be of a
494 percentile_disc ( fractions double precision[] ) WITHIN GROUP ( ORDER
495 BY anyelement ) → anyarray
497 Computes multiple discrete percentiles. The result is an array of the
498 same dimensions as the fractions parameter, with each non-null element
499 replaced by the input value corresponding to that percentile. The
500 aggregated argument must be of a sortable type.
503 Each of the “hypothetical-set” aggregates listed in Table 9.65 is
504 associated with a window function of the same name defined in
505 Section 9.22. In each case, the aggregate's result is the value that
506 the associated window function would have returned for the
507 “hypothetical” row constructed from args, if such a row had been added
508 to the sorted group of rows represented by the sorted_args. For each of
509 these functions, the list of direct arguments given in args must match
510 the number and types of the aggregated arguments given in sorted_args.
511 Unlike most built-in aggregates, these aggregates are not strict, that
512 is they do not drop input rows containing nulls. Null values sort
513 according to the rule specified in the ORDER BY clause.
515 Table 9.65. Hypothetical-Set Aggregate Functions
522 rank ( args ) WITHIN GROUP ( ORDER BY sorted_args ) → bigint
524 Computes the rank of the hypothetical row, with gaps; that is, the row
525 number of the first row in its peer group.
528 dense_rank ( args ) WITHIN GROUP ( ORDER BY sorted_args ) → bigint
530 Computes the rank of the hypothetical row, without gaps; this function
531 effectively counts peer groups.
534 percent_rank ( args ) WITHIN GROUP ( ORDER BY sorted_args ) → double
537 Computes the relative rank of the hypothetical row, that is (rank - 1)
538 / (total rows - 1). The value thus ranges from 0 to 1 inclusive.
541 cume_dist ( args ) WITHIN GROUP ( ORDER BY sorted_args ) → double
544 Computes the cumulative distribution, that is (number of rows preceding
545 or peers with hypothetical row) / (total rows). The value thus ranges
549 Table 9.66. Grouping Operations
555 GROUPING ( group_by_expression(s) ) → integer
557 Returns a bit mask indicating which GROUP BY expressions are not
558 included in the current grouping set. Bits are assigned with the
559 rightmost argument corresponding to the least-significant bit; each bit
560 is 0 if the corresponding expression is included in the grouping
561 criteria of the grouping set generating the current result row, and 1
562 if it is not included.
564 The grouping operations shown in Table 9.66 are used in conjunction
565 with grouping sets (see Section 7.2.4) to distinguish result rows. The
566 arguments to the GROUPING function are not actually evaluated, but they
567 must exactly match expressions given in the GROUP BY clause of the
568 associated query level. For example:
569 => SELECT * FROM items_sold;
571 -------+-------+-------
578 => SELECT make, model, GROUPING(make,model), sum(sales) FROM items_sold GROUP BY
580 make | model | grouping | sum
581 -------+-------+----------+-----
591 Here, the grouping value 0 in the first four rows shows that those have
592 been grouped normally, over both the grouping columns. The value 1
593 indicates that model was not grouped by in the next-to-last two rows,
594 and the value 3 indicates that neither make nor model was grouped by in
595 the last row (which therefore is an aggregate over all the input rows).