# 7.2. Table Expressions
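A *table expression* computes a table. The table expression contains a `FROM` clause that is optionally followed by `WHERE`, `GROUP BY`, and `HAVING` clauses. Trivial table expressions simply refer to a table on disk, a so-called base table, but more complex expressions can be used to modify or combine base tables in various ways.
The optional `WHERE`, `GROUP BY`, and `HAVING` clauses in the table expression specify a pipeline of successive transformations performed on the table derived in the `FROM` clause. All these transformations produce a virtual table that provides the rows that are passed to the select list to compute the output rows of the query.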
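### 7.2.1. The `FROM` Clause
The `FROM` clause derives a table from one or more other tables given in a comma-separated table reference list.
FROM table_reference [, table_reference [, ...]]
A table reference can be a table name (possibly schema-qualified), or a derived table such as a subquery, a `JOIN` construct, or complex combinations of these. If more than one table reference is listed in the `FROM` clause, the tables are cross-joined (that is, the Cartesian product of their rows is formed; see below). The result of the `FROM` list is an intermediate virtual table that can then be subject to transformations by the `WHERE`, `GROUP BY`, and `HAVING` clauses and is finally the result of the overall table expression.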
When a table reference names a table that is the parent of a table inheritance hierarchy, the table reference produces rows of not only that table but all of its descendant tables, unless the key word `ONLY` precedes the table name. However, the reference produces only the columns that appear in the named table; any columns added in subtables are ignored.
Instead of writing `ONLY` before the table name, you can write `*` after the table name to explicitly specify that descendant tables are included. There is no real reason to use this syntax any more, because searching descendant tables is now always the default behavior. However, it is supported for compatibility with older releases.
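The effect of `ONLY` and of the trailing `*` can be sketched with a small inheritance hierarchy. The tables `cities` and `capitals` and their contents are hypothetical and serve only as an illustration.

```sql
-- Hypothetical parent and child tables; not defined anywhere in this section.
CREATE TABLE cities (name text, population int);
CREATE TABLE capitals (state char(2)) INHERITS (cities);

INSERT INTO cities VALUES ('Las Vegas', 600000);
INSERT INTO capitals VALUES ('Sacramento', 500000, 'CA');

-- Parent plus descendants: both rows are returned, but only the columns
-- name and population (the state column added by capitals is ignored).
SELECT * FROM cities;

-- Parent only: just the 'Las Vegas' row.
SELECT * FROM ONLY cities;

-- Legacy spelling of the default behavior; same result as the first query.
SELECT * FROM cities*;
```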
#### 7.2.1.1. Joined Tables
A joined table is a table derived from two other (real or derived) tables according to the rules of the particular join type. Inner, outer, and cross joins are available. The general syntax of a joined table is
T1 join_type T2 [ join_condition ]
Joins of all types can be chained together, or nested: either or both *`T1`* and *`T2`* can be joined tables. Parentheses can be used around `JOIN` clauses to control the join order. In the absence of parentheses, `JOIN` clauses nest left-to-right.
Join Types
T1 CROSS JOIN T2
For every possible combination of rows from *`T1`* and *`T2`* (i.e., a Cartesian product), the joined table will contain a row consisting of all columns in *`T1`* followed by all columns in *`T2`*. If the tables have N and M rows respectively, the joined table will have N * M rows.
`FROM T1 CROSS JOIN T2` is equivalent to `FROM T1 INNER JOIN T2 ON TRUE` (see below). It is also equivalent to `FROM T1, T2`.
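### Note
This latter equivalence does not hold exactly when more than two tables appear, because `JOIN` binds more tightly than comma. For example `FROM T1 CROSS JOIN T2 INNER JOIN T3 ON condition` is not the same as `FROM T1, T2 INNER JOIN T3 ON condition` because the *`condition`* can reference *`T1`* in the first case but not the second.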
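The binding difference described in this note can be seen in a minimal sketch. The tables `ta`, `tb`, and `tc` are hypothetical stand-ins for T1, T2, and T3.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE ta (id int);
CREATE TABLE tb (id int);
CREATE TABLE tc (a_id int);

-- Accepted: the joins nest left to right, so the ON condition of the
-- second join can refer to ta.
SELECT * FROM ta CROSS JOIN tb INNER JOIN tc ON tc.a_id = ta.id;

-- Rejected: JOIN binds more tightly than the comma, so this is read as
--   ta, (tb INNER JOIN tc ON ...)
-- and ta is not in scope inside the ON clause. PostgreSQL reports an
-- invalid reference to the FROM-clause entry for ta.
-- SELECT * FROM ta, tb INNER JOIN tc ON tc.a_id = ta.id;
```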
T1 { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2 ON boolean_expression
T1 { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2 USING ( join column list )
T1 NATURAL { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2
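The words `INNER` and `OUTER` are optional in all forms. `INNER` is the default; `LEFT`, `RIGHT`, and `FULL` imply an outer join.
The *join condition* is specified in the `ON` or `USING` clause, or implicitly by the word `NATURAL`. The join condition determines which rows from the two source tables are considered to “match”, as explained in detail below.
The possible types of qualified join are: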
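`INNER JOIN`
For each row R1 of T1, the joined table has a row for each row in T2 that satisfies the join condition with R1.
`LEFT OUTER JOIN`
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Thus, the joined table always has at least one row for each row in T1.
`RIGHT OUTER JOIN`
First, an inner join is performed. Then, for each row in T2 that does not satisfy the join condition with any row in T1, a joined row is added with null values in columns of T1. This is the converse of a left join: the result table will always have a row for each row in T2.
`FULL OUTER JOIN`
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition with any row in T2, a joined row is added with null values in columns of T2. Also, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added.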
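As a rough sketch of how the four qualified join types differ, consider two small hypothetical tables `l` and `r` (names and data are illustrative assumptions) that share the values 1 and 3 in their `id` columns.

```sql
-- Hypothetical sample tables.
CREATE TABLE l (id int, lv text);
CREATE TABLE r (id int, rv text);
INSERT INTO l VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO r VALUES (1, 'x'), (3, 'y'), (5, 'z');

-- Inner join: 2 rows (ids 1 and 3 match).
SELECT * FROM l INNER JOIN r ON l.id = r.id;

-- Left outer join: 3 rows; l.id = 2 appears with nulls in r's columns.
SELECT * FROM l LEFT JOIN r ON l.id = r.id;

-- Right outer join: 3 rows; r.id = 5 appears with nulls in l's columns.
SELECT * FROM l RIGHT JOIN r ON l.id = r.id;

-- Full outer join: 4 rows; both unmatched rows are kept.
SELECT * FROM l FULL JOIN r ON l.id = r.id;
```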
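The `ON` clause is the most general kind of join condition: it takes a Boolean value expression of the same kind as is used in a `WHERE` clause. A pair of rows from *`T1`* and *`T2`* match if the `ON` expression evaluates to true.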
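The `USING` clause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining *`T1`* and *`T2`* with `USING (a, b)` produces the join condition `ON T1.a = T2.a AND T1.b = T2.b`.
Furthermore, the output of `JOIN USING` suppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. While `JOIN ON` produces all columns from *`T1`* followed by all columns from *`T2`*, `JOIN USING` produces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns from *`T1`*, followed by any remaining columns from *`T2`*.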
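Finally, `NATURAL` is a shorthand form of `USING`: it forms a `USING` list consisting of all column names that appear in both input tables. As with `USING`, these columns appear only once in the output table. If there are no common column names, `NATURAL JOIN` behaves like `JOIN ... ON TRUE`, producing a cross-product join.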
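The difference in output columns between `ON`, `USING`, and `NATURAL` can be sketched as follows. The inline `VALUES` lists stand in for two hypothetical tables `l(id, lv)` and `r(id, rv)`; they are assumptions made for illustration.

```sql
-- JOIN ... ON keeps both join columns: the output columns are id, lv, id, rv.
SELECT *
FROM (VALUES (1, 'a'), (2, 'b')) AS l(id, lv)
JOIN (VALUES (1, 'x'), (3, 'y')) AS r(id, rv) ON l.id = r.id;

-- JOIN ... USING merges the pair: the output columns are id, lv, rv.
SELECT *
FROM (VALUES (1, 'a'), (2, 'b')) AS l(id, lv)
JOIN (VALUES (1, 'x'), (3, 'y')) AS r(id, rv) USING (id);

-- NATURAL JOIN: id is the only shared column name, so this behaves
-- like USING (id) and also yields the columns id, lv, rv.
SELECT *
FROM (VALUES (1, 'a'), (2, 'b')) AS l(id, lv)
NATURAL JOIN (VALUES (1, 'x'), (3, 'y')) AS r(id, rv);
```

If a column named `lv` were later added to `r`, only the `NATURAL` form would silently start joining on it as well.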
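### Note
`USING` is reasonably safe from column changes in the joined relations since only the listed columns are combined. `NATURAL` is considerably more risky since any schema changes to either relation that cause a new matching column name to be present will cause the join to combine that new column as well.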
To put this together, assume we have tables `t1`:
num | name
#### 7.2.1.2. Table and Column Aliases
A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a *table alias*.
To create a table alias, write
FROM table_reference AS alias
or
FROM table_reference alias
The `AS` key word is optional noise. *`alias`* can be any identifier.
A typical application of table aliases is to assign short identifiers to long table names to keep the join clauses readable. For example:
SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.id = a.num;
The alias becomes the new name of the table reference so far as the current query is concerned; it is not allowed to refer to the table by the original name elsewhere in the query. Thus, this is not valid:
SELECT * FROM my_table AS m WHERE my_table.a > 5;    -- wrong
Table aliases are mainly for notational convenience, but it is necessary to use them when joining a table to itself, e.g.:
SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
Additionally, an alias is required if the table reference is a subquery (see [Section 7.2.1.3](queries-table-expressions.html#QUERIES-SUBQUERIES)).
Parentheses are used to resolve ambiguities. In the following example, the first statement assigns the alias `b` to the second instance of `my_table`, but the second statement assigns the alias to the result of the join:
SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
Another form of table aliasing gives temporary names to the columns of the table, as well as the table itself:
FROM table_reference [AS] alias ( column1 [, column2 [, ...]] )
If fewer column aliases are specified than the actual table has columns, the remaining columns are not renamed. This syntax is especially useful for self-joins or subqueries.
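A minimal sketch of partial column aliasing follows. The definition of `people` is a hypothetical one chosen for illustration; only the first two of its three columns receive aliases.

```sql
-- Hypothetical table definition.
CREATE TABLE people (id int, mother_id int, name text);

-- Two column aliases for three columns: id and mother_id are renamed
-- to person and mother, while name keeps its original name.
SELECT c.person, c.mother, c.name
FROM people AS c (person, mother);
```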
When an alias is applied to the output of a `JOIN` clause, the alias hides the original name(s) within the `JOIN`. For example:
SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
is valid SQL, but:
SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
is not valid; the table alias `a` is not visible outside the alias `c`.
#### 7.2.1.3. Subqueries
Subqueries specifying a derived table must be enclosed in parentheses and *must* be assigned a table alias name (as in [Section 7.2.1.2](queries-table-expressions.html#QUERIES-TABLE-ALIASES)). For example:
FROM (SELECT * FROM table1) AS alias_name
This example is equivalent to `FROM table1 AS alias_name`. More interesting cases, which cannot be reduced to a plain join, arise when the subquery involves grouping or aggregation.
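A subquery that aggregates is one such case: the outer query filters on a value that exists only after grouping. The sketch below assumes a hypothetical `orders` table.

```sql
-- Hypothetical table for illustration.
CREATE TABLE orders (customer_id int, amount numeric);

-- The derived table t has one row per customer; the outer WHERE
-- filters on the aggregated total computed inside the subquery.
SELECT t.customer_id, t.total
FROM (SELECT customer_id, sum(amount) AS total
      FROM orders
      GROUP BY customer_id) AS t
WHERE t.total > 100;
```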
A subquery can also be a `VALUES` list:
FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow')) AS names(first, last)
Again, a table alias is required. Assigning alias names to the columns of the `VALUES` list is optional, but is good practice. For more information see [Section 7.7](queries-values.html).
#### 7.2.1.4. Table Functions
Table functions are functions that produce a set of rows, made up of either base data types (scalar types) or composite data types (table rows). They are used like a table, view, or subquery in the `FROM` clause of a query. Columns returned by table functions can be included in `SELECT`, `JOIN`, or `WHERE` clauses in the same manner as columns of a table, view, or subquery.
Table functions may also be combined using the `ROWS FROM` syntax, with the results returned in parallel columns; the number of result rows in this case is that of the largest function result, with smaller results padded with null values to match.
function_call [WITH ORDINALITY] [[AS] table_alias [(column_alias [, ... ])]]
ROWS FROM( function_call [, ... ] ) [WITH ORDINALITY] [[AS] table_alias [(column_alias [, ... ])]]
If the `WITH ORDINALITY` clause is specified, an additional column of type `bigint` will be added to the function result columns. This column numbers the rows of the function result set, starting from 1. (This is a generalization of the SQL-standard syntax for `UNNEST ... WITH ORDINALITY`.) By default, the ordinal column is called `ordinality`, but a different column name can be assigned to it using an `AS` clause.
The special table function `UNNEST` may be called with any number of array parameters, and it returns a corresponding number of columns, as if `UNNEST` ([Section 9.19](functions-array.html)) had been called on each parameter separately and combined using the `ROWS FROM` construct.
UNNEST( array_expression [, ... ] ) [WITH ORDINALITY] [[AS] table_alias [(column_alias [, ... ])]]
If no *`table_alias`* is specified, the function name is used as the table name; in the case of a `ROWS FROM()` construct, the first function's name is used.
If column aliases are not supplied, then for a function returning a base data type, the column name is also the same as the function name. For a function returning a composite type, the result columns get the names of the individual attributes of the type.
Some examples:
CREATE TABLE foo (fooid int, foosubid int, fooname text);
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT * FROM getfoo(1) AS t1;
SELECT * FROM foo
    WHERE foosubid IN (
        SELECT foosubid
        FROM getfoo(foo.fooid) z
        WHERE z.fooid = foo.fooid
    );
CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
SELECT * FROM vw_getfoo;
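The `WITH ORDINALITY`, `ROWS FROM`, and multi-argument `UNNEST` forms described earlier in this section can be sketched with built-in functions only. The alias and column names are arbitrary choices made for illustration.

```sql
-- WITH ORDINALITY appends a bigint row number starting from 1:
-- the rows are (a, 1), (b, 2), (c, 3).
SELECT * FROM unnest(ARRAY['a', 'b', 'c']) WITH ORDINALITY AS t(elem, n);

-- ROWS FROM runs the functions in parallel columns. The longer result
-- has three rows, so the shorter one is padded with a null:
-- the rows are (1, 10), (2, 11), (3, NULL).
SELECT * FROM ROWS FROM (generate_series(1, 3), generate_series(10, 11)) AS t(a, b);

-- UNNEST with several arrays behaves the same way:
-- the rows are (1, x), (2, y), (3, NULL).
SELECT * FROM unnest(ARRAY[1, 2, 3], ARRAY['x', 'y']) AS u(num, letter);
```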
In some cases it is useful to define table functions that can return different column sets depending on how they are invoked. To support this, the table function can be declared as returning the pseudo-type `record` with no `OUT` parameters. When such a function is used in a query, the expected row structure must be specified in the query itself, so that the system can know how to parse and plan the query. This syntax looks like:
function_call [AS] alias (column_definition [, ... ])
function_call AS [alias] (column_definition [, ... ])
ROWS FROM( ... function_call AS (column_definition [, ... ]) [, ... ] )
When not using the `ROWS FROM()` syntax, the *`column_definition`* list replaces the column alias list that could otherwise be attached to the `FROM` item; the names in the column definitions serve as column aliases. When using the `ROWS FROM()` syntax, a *`column_definition`* list can be attached to each member function separately; or if there is only one member function and no `WITH ORDINALITY` clause, a *`column_definition`* list can be written in place of a column alias list following `ROWS FROM()`.
Consider this example:
SELECT * FROM dblink('dbname=mydb', 'SELECT proname, prosrc FROM pg_proc') AS t1(proname name, prosrc text) WHERE proname LIKE 'bytea%';
The [dblink](contrib-dblink-function.html) function (part of the [dblink](dblink.html) module) executes a remote query. It is declared to return `record` since it might be used for any kind of query. The actual column set must be specified in the calling query so that the parser knows, for example, what `*` should expand to.
This example uses `ROWS FROM`:
SELECT * FROM ROWS FROM ( json_to_recordset('[{"a":40,"b":"foo"},{"a":"100","b":"bar"}]') AS (a INTEGER, b TEXT), generate_series(1, 3) ) AS x (p, q, s) ORDER BY p;
p | q | s
#### 7.2.1.5. `LATERAL` Subqueries
Subqueries appearing in `FROM` can be preceded by the key word `LATERAL`. This allows them to reference columns provided by preceding `FROM` items. (Without `LATERAL`, each subquery is evaluated independently and so cannot cross-reference any other `FROM` item.)
Table functions appearing in `FROM` can also be preceded by the key word `LATERAL`, but for functions the key word is optional; the function's arguments can contain references to columns provided by preceding `FROM` items in any case.
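A `LATERAL` item can appear at the top level in the `FROM` list, or within a `JOIN` tree. In the latter case it can also refer to any items that are on the left-hand side of a `JOIN` that it is on the right-hand side of.
When a `FROM` item contains `LATERAL` cross-references, evaluation proceeds as follows: for each row of the `FROM` item providing the cross-referenced column(s), or set of rows of multiple `FROM` items providing the columns, the `LATERAL` item is evaluated using that row or row set's values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).
A trivial example of `LATERAL` is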
SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
This is not especially useful since it has exactly the same result as the more conventional
SELECT * FROM foo, bar WHERE bar.id = foo.bar_id;
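`LATERAL` is primarily useful when the cross-referenced column is necessary for computing the row(s) to be joined. A common application is providing an argument value for a set-returning function. For example, supposing that `vertices(polygon)` returns the set of vertices of a polygon, we could identify close-together vertices of polygons stored in a table with: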
SELECT p1.id, p2.id, v1, v2
FROM polygons p1, polygons p2,
LATERAL vertices(p1.poly) v1,
LATERAL vertices(p2.poly) v2
WHERE (v1 <-> v2) < 10 AND p1.id != p2.id;
This query could also be written
SELECT p1.id, p2.id, v1, v2
FROM polygons p1 CROSS JOIN LATERAL vertices(p1.poly) v1,
polygons p2 CROSS JOIN LATERAL vertices(p2.poly) v2
WHERE (v1 <-> v2) < 10 AND p1.id != p2.id;
or in several other equivalent formulations. (As already mentioned, the `LATERAL` key word is unnecessary in this example, but we use it for clarity.)
It is often particularly handy to `LEFT JOIN` to a `LATERAL` subquery, so that source rows will appear in the result even if the `LATERAL` subquery produces no rows for them. For example, if `get_product_names()` returns the names of products made by a manufacturer, but some manufacturers in our table currently produce no products, we could find out which ones those are like this:
SELECT m.name
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true
WHERE pname IS NULL;
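The same `LEFT JOIN LATERAL` pattern also works with an ordinary subquery in place of a function. The sketch below assumes hypothetical `manufacturers` and `parts` tables and picks the most expensive part per manufacturer, keeping manufacturers that have no parts.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE manufacturers (id int PRIMARY KEY, name text);
CREATE TABLE parts (manufacturer_id int, name text, price numeric);

-- For each manufacturer, the subquery is evaluated with that row's m.id.
-- Manufacturers without parts still appear, with nulls in p's columns.
SELECT m.name, p.name AS top_part, p.price
FROM manufacturers m
LEFT JOIN LATERAL (
    SELECT pt.name, pt.price
    FROM parts pt
    WHERE pt.manufacturer_id = m.id
    ORDER BY pt.price DESC
    LIMIT 1
) p ON true;
```

Without `LATERAL`, the reference to `m.id` inside the subquery would be rejected, because the subquery would be evaluated independently of `m`.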
### 7.2.2. The `WHERE` Clause
The syntax of the `WHERE` clause is
WHERE search_condition
where *`search_condition`* is any value expression (see [Section 4.2](sql-expressions.html)) that returns a value of type `boolean`.
After the processing of the `FROM` clause is done, each row of the derived virtual table is checked against the search condition. If the result of the condition is true, the row is kept in the output table, otherwise (i.e., if the result is false or null) it is discarded. The search condition typically references at least one column of the table generated in the `FROM` clause; this is not required, but otherwise the `WHERE` clause will be fairly useless.
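The treatment of null results can be checked with a short sketch. The inline `VALUES` list is an illustrative stand-in for a table.

```sql
-- 1 > 5 is false and NULL > 5 is null, so both rows are discarded.
-- Only the row with v = 7 is returned.
SELECT v
FROM (VALUES (1), (7), (NULL)) AS t(v)
WHERE v > 5;
```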
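### Note
The join condition of an inner join can be written either in the `WHERE` clause or in the `JOIN` clause. For example, these table expressions are equivalent:
FROM a, b WHERE a.id = b.id AND b.val > 5
and:
FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5
or perhaps even:
FROM a NATURAL JOIN b WHERE b.val > 5
Which one of these you use is mainly a matter of style. The `JOIN` syntax in the `FROM` clause is probably not as portable to other SQL database management systems, even though it is in the SQL standard. For outer joins there is no choice: they must be done in the `FROM` clause. The `ON` or `USING` clause of an outer join is *not* equivalent to a `WHERE` condition, because it results in the addition of rows (for unmatched input rows) as well as the removal of rows in the final result.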
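Why an outer join's `ON` condition is not interchangeable with a `WHERE` condition can be sketched as follows. The definitions and contents of `a` and `b` are assumptions made for illustration.

```sql
-- Hypothetical tables and data.
CREATE TABLE a (id int);
CREATE TABLE b (id int, val int);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (1, 10), (2, 3);

-- Restriction in ON: a.id = 2 finds no matching row, so it is kept
-- with nulls in b's columns. Result: 2 rows.
SELECT * FROM a LEFT JOIN b ON a.id = b.id AND b.val > 5;

-- Restriction in WHERE: the join first pairs both rows, then the
-- filter removes the row with b.val = 3. Result: 1 row.
SELECT * FROM a LEFT JOIN b ON a.id = b.id WHERE b.val > 5;
```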
Here are some examples of `WHERE` clauses:
SELECT ... FROM fdt WHERE c1 > 5
SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2)
SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100
SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)
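`fdt` is the table derived in the `FROM` clause. Rows that do not meet the search condition of the `WHERE` clause are eliminated from `fdt`. Notice the use of scalar subqueries as value expressions. Just like any other query, the subqueries can employ complex table expressions. Notice also how `fdt` is referenced in the subqueries. Qualifying `c1` as `fdt.c1` is only necessary if `c1` is also the name of a column in the derived input table of the subquery. But qualifying the column name adds clarity even when it is not needed. This example shows how the column naming scope of an outer query extends into its inner queries.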
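### 7.2.3. The `GROUP BY` and `HAVING` Clauses
After passing the `WHERE` filter, the derived input table might be subject to grouping, using the `GROUP BY` clause, and elimination of group rows using the `HAVING` clause.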
SELECT select_list
FROM ...
[WHERE ...]
GROUP BY grouping_column_reference [, grouping_column_reference]...
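The `GROUP BY` clause is used to group together those rows in a table that have the same values in all the columns listed. The order in which the columns are listed does not matter. The effect is to combine each set of rows having common values into one group row that represents all rows in the group. This is done to eliminate redundancy in the output and/or compute aggregates that apply to these groups. For instance: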
=> SELECT * FROM test1;
x | y
### Tip
Grouping without aggregate expressions effectively calculates the set of distinct values in a column. This can also be achieved using the `DISTINCT` clause (see [Section 7.3.3](queries-select-lists.html#QUERIES-DISTINCT)).
Here is another example: it calculates the total sales for each product (rather than the total sales of all products):
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales FROM products p LEFT JOIN sales s USING (product_id) GROUP BY product_id, p.name, p.price;
In this example, the columns `product_id`, `p.name`, and `p.price` must be in the `GROUP BY` clause since they are referenced in the query select list (but see below). The column `s.units` does not have to be in the `GROUP BY` list since it is only used in an aggregate expression (`sum(...)`), which represents the sales of a product. For each product, the query returns a summary row about all sales of the product.
If the products table is set up so that, say, `product_id` is the primary key, then it would be enough to group by `product_id` in the above example, since name and price would be *functionally dependent* on the product ID, and so there would be no ambiguity about which name and price value to return for each product ID group.
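A minimal sketch of grouping by the primary key alone follows. The table definitions are hypothetical; the text above does not show how `products` and `sales` are declared.

```sql
-- Hypothetical definitions; product_id is declared as the primary key.
CREATE TABLE products (product_id int PRIMARY KEY, name text, price numeric);
CREATE TABLE sales (product_id int REFERENCES products, units int);

-- p.name and p.price are functionally dependent on p.product_id,
-- so they may appear in the select list without being grouped.
SELECT p.product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p LEFT JOIN sales s USING (product_id)
GROUP BY p.product_id;
```

If the primary key constraint were absent, the same query would be rejected because `p.name` and `p.price` would be neither grouped nor aggregated.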
In strict SQL, `GROUP BY` can only group by columns of the source table but PostgreSQL extends this to also allow `GROUP BY` to group by columns in the select list. Grouping by value expressions instead of simple column names is also allowed.
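Both extensions can be sketched against a hypothetical `events` table (an assumption made for illustration): grouping by a value expression, and grouping by the name of a select-list column.

```sql
-- Hypothetical table for illustration.
CREATE TABLE events (happened_at timestamp, kind text);

-- Grouping by a value expression.
SELECT date_trunc('day', happened_at) AS day, count(*)
FROM events
GROUP BY date_trunc('day', happened_at);

-- Grouping by the select-list column name; equivalent to the query above.
SELECT date_trunc('day', happened_at) AS day, count(*)
FROM events
GROUP BY day;
```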
If a table has been grouped using `GROUP BY`, but only certain groups are of interest, the `HAVING` clause can be used, much like a `WHERE` clause, to eliminate groups from the result. The syntax is:
SELECT select_list FROM ... [WHERE ...] GROUP BY ... HAVING boolean_expression
Expressions in the `HAVING` clause can refer both to grouped expressions and to ungrouped expressions (which necessarily involve an aggregate function).
Example:
=> SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;
 x | sum
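The division of labor between `WHERE` and `HAVING` can be sketched in a single query. The inline `VALUES` list is illustrative sample data, not the contents of any table shown above.

```sql
-- WHERE removes the row ('a', 1) before grouping. The remaining groups
-- are a = 3, b = 5, c = 2. HAVING then drops c, leaving a and b.
SELECT x, sum(y)
FROM (VALUES ('a', 3), ('c', 2), ('b', 5), ('a', 1)) AS sample(x, y)
WHERE y > 1            -- filters input rows before grouping
GROUP BY x
HAVING sum(y) > 2;     -- filters group rows after aggregation
```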
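### 7.2.4. `GROUPING SETS`, `CUBE`, and `ROLLUP`
More complex grouping operations than those described above are possible using the concept of *grouping sets*. The data selected by the `FROM` and `WHERE` clauses is grouped separately by each specified grouping set, aggregates are computed for each group just as for simple `GROUP BY` clauses, and then the results are returned. For example: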
=> SELECT * FROM items_sold;
brand | size | sales
### Note
The construct `(a, b)` is normally recognized in expressions as a [row constructor](sql-expressions.html#SQL-SYNTAX-ROW-CONSTRUCTORS). Within the `GROUP BY` clause, this does not apply at the top levels of expressions, and `(a, b)` is parsed as a list of expressions as described above. If for some reason you *need* a row constructor in a grouping expression, use `ROW(a, b)`.
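As a rough sketch of grouping sets in use, the query below groups a hypothetical `sold` table (name and data are illustrative assumptions) three ways in one pass: by brand, by size, and over all rows.

```sql
-- Hypothetical table and data.
CREATE TABLE sold (brand text, size text, sales int);
INSERT INTO sold VALUES
    ('Foo', 'L', 10), ('Foo', 'M', 20), ('Bar', 'M', 15), ('Bar', 'L', 5);

-- Five result rows: one per brand (Foo 30, Bar 20), one per size
-- (L 15, M 35), and one grand total (50). Columns that are not part
-- of a given grouping set are null in that set's rows.
SELECT brand, size, sum(sales)
FROM sold
GROUP BY GROUPING SETS ((brand), (size), ());

-- ROLLUP (brand, size) is shorthand for
-- GROUPING SETS ((brand, size), (brand), ()).
SELECT brand, size, sum(sales)
FROM sold
GROUP BY ROLLUP (brand, size);
```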
### 7.2.5. Window Function Processing
If the query contains any window functions (see [Section 3.5](tutorial-window.html), [Section 9.22](functions-window.html) and [Section 4.2.8](sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS)), these functions are evaluated after any grouping, aggregation, and `HAVING` filtering is performed. That is, if the query uses any aggregates, `GROUP BY`, or `HAVING`, then the rows seen by the window functions are the group rows instead of the original table rows from `FROM`/`WHERE`.
When multiple window functions are used, all the window functions having syntactically equivalent `PARTITION BY` and `ORDER BY` clauses in their window definitions are guaranteed to be evaluated in a single pass over the data. Therefore they will see the same sort ordering, even if the `ORDER BY` does not uniquely determine an ordering. However, no guarantees are made about the evaluation of functions having different `PARTITION BY` or `ORDER BY` specifications. (In such cases a sort step is typically required between the passes of window function evaluations, and the sort is not guaranteed to preserve ordering of rows that its `ORDER BY` sees as equivalent.)
Currently, window functions always require presorted data, and so the query output will be ordered according to one or another of the window functions' `PARTITION BY`/`ORDER BY` clauses. It is not recommended to rely on this, however. Use an explicit top-level `ORDER BY` clause if you want to be sure the results are sorted in a particular way.
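The evaluation order described in this section can be sketched as follows: the window function ranks group rows, not the original input rows, and an explicit top-level `ORDER BY` fixes the output order. The inline `VALUES` list is illustrative sample data.

```sql
-- Grouping yields a = 4, b = 5, c = 2. rank() is then evaluated over
-- these three group rows, so the output is b (1), a (2), c (3).
SELECT x,
       sum(y) AS total,
       rank() OVER (ORDER BY sum(y) DESC) AS rnk
FROM (VALUES ('a', 3), ('c', 2), ('b', 5), ('a', 1)) AS sample(x, y)
GROUP BY x
ORDER BY rnk;          -- explicit ordering rather than relying on the window sort
```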