## 9.21. Aggregate Functions

*Aggregate functions* compute a single result from a set of input values. The built-in general-purpose aggregate functions are listed in [Table 9.57](functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE) while statistical aggregates are in [Table 9.58](functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE). The built-in within-group ordered-set aggregate functions are listed in [Table 9.59](functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE) while the built-in within-group hypothetical-set ones are in [Table 9.60](functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE). Grouping operations, which are closely related to aggregate functions, are listed in [Table 9.61](functions-aggregate.html#FUNCTIONS-GROUPING-TABLE). The special syntax considerations for aggregate functions are explained in [Section 4.2.7](sql-expressions.html#SYNTAX-AGGREGATES). Consult [Section 2.7](tutorial-agg.html) for additional introductory information.

 Aggregate functions that support *Partial Mode* are eligible to participate in various optimizations, such as parallel aggregation.

**Table 9.57. General-Purpose Aggregate Functions**

|                                                                                                                                                                                       Function<br/><br/> Description                                                                                                                                                                                       |Partial Mode|
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
|                                                                                                                                          []() `array_agg` ( `anynonarray` ) → `anyarray`<br/><br/> Collects all the input values, including nulls, into an array.                                                                                                                                          |     No     |
|                                                                                                `array_agg` ( `anyarray` ) → `anyarray`<br/><br/> Concatenates all the input arrays into an array of one higher dimension. (The inputs must all have the same dimensionality, and cannot be empty or null.)                                                                                                 |     No     |
|[]() []() `avg` ( `smallint` ) → `numeric`<br/><br/>`avg` ( `integer` ) → `numeric`<br/><br/>`avg` ( `bigint` ) → `numeric`<br/><br/>`avg` ( `numeric` ) → `numeric`<br/><br/>`avg` ( `real` ) → `double precision`<br/><br/>`avg` ( `double precision` ) → `double precision`<br/><br/>`avg` ( `interval` ) → `interval`<br/><br/> Computes the average (arithmetic mean) of all the non-null input values.|    Yes     |
|                                                                                  []() `bit_and` ( `smallint` ) → `smallint`<br/><br/>`bit_and` ( `integer` ) → `integer`<br/><br/>`bit_and` ( `bigint` ) → `bigint`<br/><br/>`bit_and` ( `bit` ) → `bit`<br/><br/> Computes the bitwise AND of all non-null input values.                                                                                  |    Yes     |
|                                                                                    []() `bit_or` ( `smallint` ) → `smallint`<br/><br/>`bit_or` ( `integer` ) → `integer`<br/><br/>`bit_or` ( `bigint` ) → `bigint`<br/><br/>`bit_or` ( `bit` ) → `bit`<br/><br/> Computes the bitwise OR of all non-null input values.                                                                                     |    Yes     |
|                                               []() `bit_xor` ( `smallint` ) → `smallint`<br/><br/>`bit_xor` ( `integer` ) → `integer`<br/><br/>`bit_xor` ( `bigint` ) → `bigint`<br/><br/>`bit_xor` ( `bit` ) → `bit`<br/><br/> Computes the bitwise exclusive OR of all non-null input values. Can be useful as a checksum for an unordered set of values.                                                |    Yes     |
|                                                                                                                                          []() `bool_and` ( `boolean` ) → `boolean`<br/><br/> Returns true if all non-null input values are true, otherwise false.                                                                                                                                          |    Yes     |
|                                                                                                                                           []() `bool_or` ( `boolean` ) → `boolean`<br/><br/> Returns true if any non-null input value is true, otherwise false.                                                                                                                                            |    Yes     |
|                                                                                                                                                                []() `count` ( `*` ) → `bigint`<br/><br/> Computes the number of input rows.                                                                                                                                                                |    Yes     |
|                                                                                                                                              `count` ( `"any"` ) → `bigint`<br/><br/> Computes the number of input rows in which the input value is not null.                                                                                                                                              |    Yes     |
|                                                                                                                                                   []() `every` ( `boolean` ) → `boolean`<br/><br/> This is the SQL standard's equivalent to `bool_and`.                                                                                                                                                    |    Yes     |
|                                                                                  []() `json_agg` ( `anyelement` ) → `json`<br/><br/>[]() `jsonb_agg` ( `anyelement` ) → `jsonb`<br/><br/> Collects all the input values, including nulls, into a JSON array. Values are converted to JSON as per `to_json` or `to_jsonb`.                                                                                  |     No     |
|                         []() `json_object_agg` ( *`key`* `"any"`, *`value`* `"any"` ) → `json`<br/><br/>[]() `jsonb_object_agg` ( *`key`* `"any"`, *`value`* `"any"` ) → `jsonb`<br/><br/> Collects all the key/value pairs into a JSON object. Key arguments are coerced to text; value arguments are converted as per `to_json` or `to_jsonb`. Values can be null, but not keys.                         |     No     |
| `max` ( *`see text`* ) → *`same as input type`*<br/><br/> Computes the maximum of the non-null input values. Available for any numeric, string, date/time, or enum type, as well as `inet`, `interval`, `money`, `oid`, `pg_lsn`, `tid`, and arrays of any of these types. |    Yes     |
| `min` ( *`see text`* ) → *`same as input type`*<br/><br/> Computes the minimum of the non-null input values. Available for any numeric, string, date/time, or enum type, as well as `inet`, `interval`, `money`, `oid`, `pg_lsn`, `tid`, and arrays of any of these types. |    Yes     |
|                                                                                                                                           []() `range_agg` ( *`value`* `anyrange` ) → `anymultirange`<br/><br/> Computes the union of the non-null input values.                                                                                                                                           |     No     |
| `range_intersect_agg` ( *`value`* `anyrange` ) → `anyrange`<br/><br/> Computes the intersection of the non-null input values. |     No     |
|                                              []() `string_agg` ( *`value`* `text`, *`delimiter`* `text` ) → `text`<br/><br/>`string_agg` ( *`value`* `bytea`, *`delimiter`* `bytea` ) → `bytea`<br/><br/> Concatenates the non-null input values into a string. Each value after the first is preceded by the corresponding *`delimiter`* (if it's not null).                                              |     No     |
|    []() `sum` ( `smallint` ) → `bigint`<br/><br/>`sum` ( `integer` ) → `bigint`<br/><br/>`sum` ( `bigint` ) → `numeric`<br/><br/>`sum` ( `numeric` ) → `numeric`<br/><br/>`sum` ( `real` ) → `real`<br/><br/>`sum` ( `double precision` ) → `double precision`<br/><br/>`sum` ( `interval` ) → `interval`<br/><br/>`sum` ( `money` ) → `money`<br/><br/> Computes the sum of the non-null input values.    |    Yes     |
|                                                                                                                          []() `xmlagg` ( `xml` ) → `xml`<br/><br/> Concatenates the non-null XML input values (see [Section 9.15.1.7](functions-xml.html#FUNCTIONS-XML-XMLAGG)).                                                                                                                           |     No     |
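
As a small illustration of how the aggregates in the table above treat null inputs, the following sketch runs several of them over an inline `VALUES` list. The alias `t` and the column `v` are made up for the example.

```
-- Three input rows, one of which is null.
SELECT count(*)                  AS all_rows,       -- counts every row: 3
       count(v)                  AS non_null_rows,  -- skips the null: 2
       sum(v)                    AS total,          -- nulls are ignored: 4
       array_agg(v)              AS with_nulls,     -- the null is kept as an element
       string_agg(v::text, ',')  AS without_nulls   -- the null is skipped
FROM (VALUES (1), (NULL), (3)) AS t(v);
```

The element order inside `with_nulls` and `without_nulls` is not guaranteed here, because no `ORDER BY` is given within the aggregate calls.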

 Except for `count`, these functions return a null value when no rows are selected. In particular, `sum` of no rows returns null, not zero as one might expect, and `array_agg` returns null rather than an empty array when there are no input rows. The `coalesce` function can be used to substitute zero or an empty array for null when necessary.
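
A minimal sketch of the empty-input case, assuming an inline table `t(v)` that is filtered down to zero rows; the names are illustrative only.

```
-- WHERE false removes every row, so each aggregate sees an empty input.
SELECT count(v)                      AS n,           -- 0
       sum(v)                        AS raw_sum,     -- null
       coalesce(sum(v), 0)           AS safe_sum,    -- 0
       array_agg(v)                  AS raw_array,   -- null
       coalesce(array_agg(v), '{}')  AS safe_array   -- empty array
FROM (VALUES (1), (2)) AS t(v)
WHERE false;
```

Because there is no `GROUP BY`, the query still returns exactly one row even though no input rows qualify.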

 The aggregate functions `array_agg`, `json_agg`, `jsonb_agg`, `json_object_agg`, `jsonb_object_agg`, `string_agg`, and `xmlagg`, as well as similar user-defined aggregate functions, produce meaningfully different result values depending on the order of the input values. This ordering is unspecified by default, but can be controlled by writing an `ORDER BY` clause within the aggregate call, as shown in [Section 4.2.7](sql-expressions.html#SYNTAX-AGGREGATES). Alternatively, supplying the input values from a sorted subquery will usually work. For example:

```
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

```

 Beware that this approach can fail if the outer query level contains additional processing, such as a join, because that might cause the subquery's output to be reordered before the aggregate is computed.
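
For comparison, here is a sketch of the in-aggregate `ORDER BY` form, which does not depend on how the outer query happens to process its input. The inline table `t` and its columns `v` and `label` are hypothetical.

```
-- Each aggregate call carries its own ordering.
SELECT array_agg(v ORDER BY v DESC)        AS descending,   -- {3,2,1}
       string_agg(label, ', ' ORDER BY v)  AS labels_by_v   -- one, two, three
FROM (VALUES (2, 'two'), (3, 'three'), (1, 'one')) AS t(v, label);
```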

### Note

 The boolean aggregates `bool_and` and `bool_or` correspond to the standard SQL aggregates `every` and `any` or `some`. PostgreSQL supports `every`, but not `any` or `some`, because there is an ambiguity built into the standard syntax:

```
SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;

```

 Here `ANY` can be considered either as introducing a subquery, or as being an aggregate function, if the subquery returns one row with a Boolean value. Thus the standard name cannot be given to these aggregates.
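
As a brief illustration, `every` can be used interchangeably with `bool_and`, while `bool_or` covers the role of the standard's `any` or `some`. The inline table `t(b)` is an assumption made for the example.

```
-- The null input is ignored by all three aggregates.
SELECT bool_and(b) AS all_true,     -- false
       every(b)    AS every_true,   -- false, same as bool_and
       bool_or(b)  AS any_true      -- true
FROM (VALUES (true), (false), (NULL::boolean)) AS t(b);
```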

### Note

 Users accustomed to working with other SQL database management systems might be disappointed by the performance of the `count` aggregate when it is applied to the entire table. A query like:

```
SELECT count(*) FROM sometable;

```

 will require effort proportional to the size of the table: PostgreSQL will need to scan either the entire table or the entirety of an index that includes all rows in the table.

[Table 9.58](functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE) shows aggregate functions typically used in statistical analysis. (These are separated out merely to avoid cluttering the listing of more-commonly-used aggregates.) Functions shown as accepting *`numeric_type`* are available for all the types `smallint`, `integer`, `bigint`, `numeric`, `real`, and `double precision`. Where the description mentions *`N`*, it means the number of input rows for which all the input expressions are non-null. In all cases, null is returned if the computation is meaningless, for example when *`N`* is zero.

**Table 9.58. Aggregate Functions for Statistics**

|                                                                                                   Function<br/><br/> Description                                                                                                   |Partial Mode|
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
|                                            []() []() `corr` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the correlation coefficient.                                            |    Yes     |
|                                          []() []() `covar_pop` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the population covariance.                                           |    Yes     |
|                                            []() []() `covar_samp` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the sample covariance.                                            |    Yes     |
| `regr_avgx` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the average of the independent variable, `sum(X)/N`. |    Yes     |
| `regr_avgy` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the average of the dependent variable, `sum(Y)/N`. |    Yes     |
|                                    []() `regr_count` ( *`Y`* `double precision`, *`X`* `double precision` ) → `bigint`<br/><br/> Computes the number of rows in which both inputs are non-null.                                    |    Yes     |
|     []() []() `regr_intercept` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the y-intercept of the least-squares-fit linear equation determined by the (*`X`*, *`Y`*) pairs.     |    Yes     |
|                                      []() `regr_r2` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the square of the correlation coefficient.                                      |    Yes     |
|          []() []() `regr_slope` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the slope of the least-squares-fit linear equation determined by the (*`X`*, *`Y`*) pairs.          |    Yes     |
| `regr_sxx` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the “sum of squares” of the independent variable, `sum(X^2) - sum(X)^2/N`. |    Yes     |
| `regr_sxy` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the “sum of products” of independent times dependent variables, `sum(X*Y) - sum(X) * sum(Y)/N`. |    Yes     |
| `regr_syy` ( *`Y`* `double precision`, *`X`* `double precision` ) → `double precision`<br/><br/> Computes the “sum of squares” of the dependent variable, `sum(Y^2) - sum(Y)^2/N`. |    Yes     |
| `stddev` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> This is a historical alias for `stddev_samp`. |    Yes     |
| `stddev_pop` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> Computes the population standard deviation of the input values. |    Yes     |
| `stddev_samp` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> Computes the sample standard deviation of the input values. |    Yes     |
| `variance` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> This is a historical alias for `var_samp`. |    Yes     |
| `var_pop` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> Computes the population variance of the input values (square of the population standard deviation). |    Yes     |
| `var_samp` ( *`numeric_type`* ) → `double precision` for `real` or `double precision`, otherwise `numeric`<br/><br/> Computes the sample variance of the input values (square of the sample standard deviation). |    Yes     |
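
The following sketch shows a few of the statistical aggregates on made-up data that lies exactly on the line y = 2x + 1, plus one row whose *`Y`* value is null. The CTE name `pts` and its columns are assumptions for the example. Note that the dependent variable is passed first and the independent variable second.

```
-- Hypothetical data: y = 2x + 1 for x = 1..4, plus one row with a null y.
WITH pts(x, y) AS (
    SELECT g, 2 * g + 1 FROM generate_series(1, 4) AS g
    UNION ALL
    SELECT 5, NULL
)
SELECT regr_count(y, x)     AS n,          -- the row with a null y is not counted
       regr_slope(y, x)     AS slope,      -- fitted over the four complete pairs
       regr_intercept(y, x) AS intercept,
       corr(y, x)           AS r,
       stddev_samp(x)       AS sd_x        -- single-argument: uses all five x values
FROM pts;
```

For this exactly linear input one would expect `n` to be 4, with a slope of 2 and an intercept of 1.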

[Table 9.59](functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE) shows some aggregate functions that use the *ordered-set aggregate* syntax. These functions are sometimes referred to as “inverse distribution” functions. Their aggregated input is introduced by `ORDER BY`, and they may also take a *direct argument* that is not aggregated, but is computed only once. All these functions ignore null values in their aggregated input. For those that take a *`fraction`* parameter, the fraction value must be between 0 and 1; an error is thrown if not. However, a null *`fraction`* value simply produces a null result.

**Table 9.59. Ordered-Set Aggregate Functions**

|                                                                                                                                                                                                                                     Function<br/><br/> Description                                                                                                                                                                                                                                    |Partial Mode|
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
|                                                                                                  []() `mode` () `WITHIN GROUP` ( `ORDER BY` `anyelement` ) → `anyelement`<br/><br/> Computes the *mode*, the most frequent value of the aggregated argument (arbitrarily choosing the first one if there are multiple equally-frequent values). The aggregated argument must be of a sortable type.                                                                                                   |     No     |
|             []() `percentile_cont` ( *`fraction`* `double precision` ) `WITHIN GROUP` ( `ORDER BY` `double precision` ) → `double precision`<br/><br/>`percentile_cont` ( *`fraction`* `double precision` ) `WITHIN GROUP` ( `ORDER BY` `interval` ) → `interval`<br/><br/> Computes the *continuous percentile*, a value corresponding to the specified *`fraction`* within the ordered set of aggregated argument values. This will interpolate between adjacent input items if needed.             |     No     |
|`percentile_cont` ( *`fractions`* `double precision[]` ) `WITHIN GROUP` ( `ORDER BY` `double precision` ) → `double precision[]`<br/><br/>`percentile_cont` ( *`fractions`* `double precision[]` ) `WITHIN GROUP` ( `ORDER BY` `interval` ) → `interval[]`<br/><br/> Computes multiple continuous percentiles. The result is an array of the same dimensions as the *`fractions`* parameter, with each non-null element replaced by the (possibly interpolated) value corresponding to that percentile.|     No     |
|                                                               []() `percentile_disc` ( *`fraction`* `double precision` ) `WITHIN GROUP` ( `ORDER BY` `anyelement` ) → `anyelement`<br/><br/> Computes the *discrete percentile*, the first value within the ordered set of aggregated argument values whose position in the ordering equals or exceeds the specified *`fraction`*. The aggregated argument must be of a sortable type.                                                                |     No     |
|                                                     `percentile_disc` ( *`fractions`* `double precision[]` ) `WITHIN GROUP` ( `ORDER BY` `anyelement` ) → `anyarray`<br/><br/> Computes multiple discrete percentiles. The result is an array of the same dimensions as the *`fractions`* parameter, with each non-null element replaced by the input value corresponding to that percentile. The aggregated argument must be of a sortable type.                                                     |     No     |
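
A short sketch of the ordered-set syntax, using an inline table `t(v)` invented for the example. The direct argument (the fraction) goes in the first pair of parentheses, and the aggregated input is named in the `WITHIN GROUP` clause.

```
-- Sorted input: 1, 1, 4, 10
SELECT mode() WITHIN GROUP (ORDER BY v)                              AS most_common,  -- 1
       percentile_cont(0.5) WITHIN GROUP (ORDER BY v)                AS median_cont,  -- 2.5, interpolated between 1 and 4
       percentile_disc(0.5) WITHIN GROUP (ORDER BY v)                AS median_disc,  -- 1, an actual input value
       percentile_cont(ARRAY[0.25, 0.75]) WITHIN GROUP (ORDER BY v)  AS quartiles     -- one array element per fraction
FROM (VALUES (1), (1), (4), (10)) AS t(v);
```

The difference between `median_cont` and `median_disc` shows why the choice matters: the continuous form may return a value that never occurs in the input, while the discrete form always returns one of the input values.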

 Each of the “hypothetical-set” aggregates listed in [Table 9.60](functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE) is associated with a window function of the same name defined in [Section 9.22](functions-window.html). In each case, the aggregate's result is the value that the associated window function would have returned for the “hypothetical” row constructed from *`args`*, if such a row had been added to the sorted group of rows represented by the *`sorted_args`*. For each of these functions, the list of direct arguments given in *`args`* must match the number and types of the aggregated arguments given in *`sorted_args`*. Unlike most built-in aggregates, these aggregates are not strict; that is, they do not drop input rows containing nulls. Null values sort according to the rule specified in the `ORDER BY` clause.

**Table 9.60. Hypothetical-Set Aggregate Functions**

|                                                                                                                       Function<br/><br/> Description                                                                                                                      |Partial Mode|
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|
|                               []() `rank` ( *`args`* ) `WITHIN GROUP` ( `ORDER BY` *`sorted_args`* ) → `bigint`<br/><br/> Computes the rank of the hypothetical row, with gaps; that is, the row number of the first row in its peer group.                               |     No     |
|                                 []() `dense_rank` ( *`args`* ) `WITHIN GROUP` ( `ORDER BY` *`sorted_args`* ) → `bigint`<br/><br/> Computes the rank of the hypothetical row, without gaps; this function effectively counts peer groups.                                  |     No     |
|          []() `percent_rank` ( *`args`* ) `WITHIN GROUP` ( `ORDER BY` *`sorted_args`* ) → `double precision`<br/><br/> Computes the relative rank of the hypothetical row, that is (`rank` - 1) / (total rows - 1). The value thus ranges from 0 to 1 inclusive.          |     No     |
|[]() `cume_dist` ( *`args`* ) `WITHIN GROUP` ( `ORDER BY` *`sorted_args`* ) → `double precision`<br/><br/> Computes the cumulative distribution, that is (number of rows preceding or peers with hypothetical row) / (total rows). The value thus ranges from 1/*`N`* to 1.|     No     |
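
To make the “hypothetical row” idea concrete, the sketch below asks where the value 3 would fall if it were added to the inline table `t(v)`, which is made up for the example. With the hypothetical row included, the sorted group would be 1, 2, 2, 3, 5.

```
SELECT rank(3)         WITHIN GROUP (ORDER BY v) AS rnk,        -- 4: three rows sort before it
       dense_rank(3)   WITHIN GROUP (ORDER BY v) AS dense_rnk,  -- 3: peer groups {1}, {2,2}, {3}
       percent_rank(3) WITHIN GROUP (ORDER BY v) AS pct_rnk,    -- (4 - 1) / (5 - 1) = 0.75
       cume_dist(3)    WITHIN GROUP (ORDER BY v) AS cume        -- 4 / 5 = 0.8
FROM (VALUES (1), (2), (2), (5)) AS t(v);
```

In the formulas from the table, “total rows” counts the hypothetical row as well, which is why the denominators here are based on 5 rather than 4.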

**Table 9.61. Grouping Operations**

|                                                                                                                                                                                                        Function<br/><br/> Description                                                                                                                                                                                                        |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|[]() `GROUPING` ( *`group_by_expression(s)`* ) → `integer`<br/><br/> Returns a bit mask indicating which `GROUP BY` expressions are not included in the current grouping set. Bits are assigned with the rightmost argument corresponding to the least-significant bit; each bit is 0 if the corresponding expression is included in the grouping criteria of the grouping set generating the current result row, and 1 if it is not included.|

 The grouping operations shown in [Table 9.61](functions-aggregate.html#FUNCTIONS-GROUPING-TABLE) are used in conjunction with grouping sets (see [Section 7.2.4](queries-table-expressions.html#QUERIES-GROUPING-SETS)) to distinguish result rows. The arguments to the `GROUPING` function are not actually evaluated, but they must exactly match expressions given in the `GROUP BY` clause of the associated query level. For example:

```
=> SELECT * FROM items_sold;
 make  | model | sales