## 7.8.`和`查询(公用表表达式) [7.8.1.`选择`在`和`](queries-with.html#QUERIES-WITH-SELECT) [7.8.2.递归查询](queries-with.html#QUERIES-WITH-RECURSIVE) [7.8.3.公用表表达式实现](queries-with.html#id-1.5.6.12.7) [7.8.4。中的数据修改语句`和`](queries-with.html#QUERIES-WITH-MODIFYING) [](<>)[](<>) `和`提供了一种编写辅助语句以用于更大查询的方法。这些语句,通常被称为公用表表达式或 CTE,可以被认为是定义仅用于一个查询的临时表。一个辅助语句中的每个`和`子句可以是`选择`,`插入`,`更新`, 或者`删除`;和`和`子句本身附加到一个主要语句,该语句也可以是`选择`,`插入`,`更新`, 要么`删除`. ### 7.8.1.`选择`在`和` 的基本价值`选择`在`和`就是将复杂的查询分解成更简单的部分。一个例子是: ``` WITH regional_sales AS ( SELECT region, SUM(amount) AS total_sales FROM orders GROUP BY region ), top_regions AS ( SELECT region FROM regional_sales WHERE total_sales > (SELECT SUM(total_sales)/10 FROM regional_sales) ) SELECT region, product, SUM(quantity) AS product_units, SUM(amount) AS product_sales FROM orders WHERE region IN (SELECT region FROM top_regions) GROUP BY region, product; ``` 它仅显示顶级销售地区的每种产品的销售总额。这`和`子句定义了两个名为`区域销售`和`顶部区域`,其中的输出`区域销售`用于`顶部区域`和输出`顶部区域`用于初级`选择`询问。这个例子可以不用写`和`, but we'd have needed two levels of nested sub-`SELECT`s. It's a bit easier to follow this way. ### 7.8.2. Recursive Queries [](<>)The optional`RECURSIVE`modifier changes`WITH`from a mere syntactic convenience into a feature that accomplishes things not otherwise possible in standard SQL. Using`RECURSIVE`, a`WITH`query can refer to its own output. A very simple example is this query to sum the integers from 1 through 100: ``` WITH RECURSIVE t(n) AS ( VALUES (1) UNION ALL SELECT n+1 FROM t WHERE n < 100 ) SELECT sum(n) FROM t; ``` The general form of a recursive`WITH`query is always a*non-recursive term*, then`UNION`(or`UNION ALL`), then a*recursive term*, where only the recursive term can contain a reference to the query's own output. Such a query is executed as follows: **Recursive Query Evaluation** 1. Evaluate the non-recursive term. For`UNION`(but not`UNION ALL`), discard duplicate rows. Include all remaining rows in the result of the recursive query, and also place them in a temporary*working table*. 2. So long as the working table is not empty, repeat these steps: 1. Evaluate the recursive term, substituting the current contents of the working table for the recursive self-reference. For`UNION`(but not`UNION ALL`), discard duplicate rows and rows that duplicate any previous result row. Include all remaining rows in the result of the recursive query, and also place them in a temporary*intermediate table*. 2. Replace the contents of the working table with the contents of the intermediate table, then empty the intermediate table. ### Note Strictly speaking, this process is iteration not recursion, but`RECURSIVE`is the terminology chosen by the SQL standards committee. In the example above, the working table has just a single row in each step, and it takes on the values from 1 through 100 in successive steps. In the 100th step, there is no output because of the`WHERE`clause, and so the query terminates. Recursive queries are typically used to deal with hierarchical or tree-structured data. A useful example is this query to find all the direct and indirect sub-parts of a product, given only a table that shows immediate inclusions: ``` WITH RECURSIVE included_parts(sub_part, part, quantity) AS ( SELECT sub_part, part, quantity FROM parts WHERE part = 'our_product' UNION ALL SELECT p.sub_part, p.part, p.quantity FROM included_parts pr, parts p WHERE p.part = pr.sub_part ) SELECT sub_part, SUM(quantity) as total_quantity FROM included_parts GROUP BY sub_part ``` #### 7.8.2.1. Search Order When computing a tree traversal using a recursive query, you might want to order the results in either depth-first or breadth-first order. This can be done by computing an ordering column alongside the other data columns and using that to sort the results at the end. Note that this does not actually control in which order the query evaluation visits the rows; that is as always in SQL implementation-dependent. This approach merely provides a convenient way to order the results afterwards. To create a depth-first order, we compute for each result row an array of rows that we have visited so far. For example, consider the following query that searches a table`tree`using a`link`field: ``` WITH RECURSIVE search_tree(id, link, data) AS ( SELECT t.id, t.link, t.data FROM tree t UNION ALL SELECT t.id, t.link, t.data FROM tree t, search_tree st WHERE t.id = st.link ) SELECT * FROM search_tree; ``` To add depth-first ordering information, you can write this: ``` WITH RECURSIVE search_tree(id, link, data, path) AS ( SELECT t.id, t.link, t.data, ARRAY[t.id] FROM tree t UNION ALL SELECT t.id, t.link, t.data, path || t.id FROM tree t, search_tree st WHERE t.id = st.link ) SELECT * FROM search_tree ORDER BY path; ``` In the general case where more than one field needs to be used to identify a row, use an array of rows. For example, if we needed to track fields`f1`and`f2`: ``` WITH RECURSIVE search_tree(id, link, data, path) AS ( SELECT t.id, t.link, t.data, ARRAY[ROW(t.f1, t.f2)] FROM tree t UNION ALL SELECT t.id, t.link, t.data, path || ROW(t.f1, t.f2) FROM tree t, search_tree st WHERE t.id = st.link ) SELECT * FROM search_tree ORDER BY path; ``` ### Tip Omit the`ROW()`syntax in the common case where only one field needs to be tracked. This allows a simple array rather than a composite-type array to be used, gaining efficiency. To create a breadth-first order, you can add a column that tracks the depth of the search, for example: ``` WITH RECURSIVE search_tree(id, link, data, depth) AS ( SELECT t.id, t.link, t.data, 0 FROM tree t UNION ALL SELECT t.id, t.link, t.data, depth + 1 FROM tree t, search_tree st WHERE t.id = st.link ) SELECT * FROM search_tree ORDER BY depth; ``` To get a stable sort, add data columns as secondary sorting columns. ### Tip The recursive query evaluation algorithm produces its output in breadth-first search order. However, this is an implementation detail and it is perhaps unsound to rely on it. The order of the rows within each level is certainly undefined, so some explicit ordering might be desired in any case. There is built-in syntax to compute a depth- or breadth-first sort column. For example: ``` WITH RECURSIVE search_tree(id, link, data) AS ( SELECT t.id, t.link, t.data FROM tree t UNION ALL SELECT t.id, t.link, t.data FROM tree t, search_tree st WHERE t.id = st.link ) SEARCH DEPTH FIRST BY id SET ordercol SELECT * FROM search_tree ORDER BY ordercol; WITH RECURSIVE search_tree(id, link, data) AS ( SELECT t.id, t.link, t.data FROM tree t UNION ALL SELECT t.id, t.link, t.data FROM tree t, search_tree st WHERE t.id = st.link ) SEARCH BREADTH FIRST BY id SET ordercol SELECT * FROM search_tree ORDER BY ordercol; ``` This syntax is internally expanded to something similar to the above hand-written forms. The`SEARCH`clause specifies whether depth- or breadth first search is wanted, the list of columns to track for sorting, and a column name that will contain the result data that can be used for sorting. That column will implicitly be added to the output rows of the CTE. #### 7.8.2.2. Cycle Detection When working with recursive queries it is important to be sure that the recursive part of the query will eventually return no tuples, or else the query will loop indefinitely. Sometimes, using`UNION`instead of`UNION ALL`can accomplish this by discarding rows that duplicate previous output rows. However, often a cycle does not involve output rows that are completely duplicate: it may be necessary to check just one or a few fields to see if the same point has been reached before. The standard method for handling such situations is to compute an array of the already-visited values. For example, consider again the following query that searches a table`graph`using a`link`field: ``` WITH RECURSIVE search_graph(id, link, data, depth) AS ( SELECT g.id, g.link, g.data, 0 FROM graph g UNION ALL SELECT g.id, g.link, g.data, sg.depth + 1 FROM graph g, search_graph sg WHERE g.id = sg.link ) SELECT * FROM search_graph; ``` This query will loop if the`link`relationships contain cycles. Because we require a “depth” output, just changing`联合所有`到`联盟`不会消除循环。相反,我们需要识别在遵循特定链接路径时是否再次到达同一行。我们添加两列`is_cycle`和`小路`到容易循环的查询: ``` WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS ( SELECT g.id, g.link, g.data, 0, false, ARRAY[g.id] FROM graph g UNION ALL SELECT g.id, g.link, g.data, sg.depth + 1, g.id = ANY(path), path || g.id FROM graph g, search_graph sg WHERE g.id = sg.link AND NOT is_cycle ) SELECT * FROM search_graph; ``` 除了防止循环之外,数组值本身通常也很有用,因为它代表了到达任何特定行所采用的“路径”。 在需要检查多个字段以识别循环的一般情况下,使用一组行。例如,如果我们需要比较字段`f1`和`f2`: ``` WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS ( SELECT g.id, g.link, g.data, 0, false, ARRAY[ROW(g.f1, g.f2)] FROM graph g UNION ALL SELECT g.id, g.link, g.data, sg.depth + 1, ROW(g.f1, g.f2) = ANY(path), path || ROW(g.f1, g.f2) FROM graph g, search_graph sg WHERE g.id = sg.link AND NOT is_cycle ) SELECT * FROM search_graph; ``` ### 提示 省略`排()`只需要检查一个字段来识别循环的常见情况下的语法。这允许使用简单数组而不是复合类型数组,从而提高效率。 有用于简化循环检测的内置语法。上面的查询也可以这样写: ``` WITH RECURSIVE search_graph(id, link, data, depth) AS ( SELECT g.id, g.link, g.data, 1 FROM graph g UNION ALL SELECT g.id, g.link, g.data, sg.depth + 1 FROM graph g, search_graph sg WHERE g.id = sg.link ) CYCLE id SET is_cycle USING path SELECT * FROM search_graph; ``` 它将在内部重写为上述形式。这`循环`子句首先指定要跟踪以进行循环检测的列列表,然后指定显示是否检测到循环的列名,最后指定将跟踪路径的另一列的名称。循环和路径列将隐式添加到 CTE 的输出行。 ### 提示 循环路径列的计算方式与上一节中显示的深度优先排序列相同。一个查询可以同时具有`搜索`和一个`循环`子句,但深度优先搜索规范和循环检测规范会产生冗余计算,因此仅使用`循环`clause and order by the path column. If breadth-first ordering is wanted, then specifying both`SEARCH`and`CYCLE`can be useful. A helpful trick for testing queries when you are not certain if they might loop is to place a`LIMIT`in the parent query. For example, this query would loop forever without the`LIMIT`: ``` WITH RECURSIVE t(n) AS ( SELECT 1 UNION ALL SELECT n+1 FROM t ) SELECT n FROM t LIMIT 100; ``` This works because PostgreSQL's implementation evaluates only as many rows of a`WITH`query as are actually fetched by the parent query. Using this trick in production is not recommended, because other systems might work differently. Also, it usually won't work if you make the outer query sort the recursive query's results or join them to some other table, because in such cases the outer query will usually try to fetch all of the`WITH`query's output anyway. ### 7.8.3. Common Table Expression Materialization A useful property of`WITH`queries is that they are normally evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling`WITH`queries. Thus, expensive calculations that are needed in multiple places can be placed within a`WITH`query to avoid redundant work. Another possible application is to prevent unwanted multiple evaluations of functions with side-effects. However, the other side of this coin is that the optimizer is not able to push restrictions from the parent query down into a multiply-referenced`WITH`query, since that might affect all uses of the`WITH`query's output when it should affect only one. The multiply-referenced`WITH`query will be evaluated as written, without suppression of rows that the parent query might discard afterwards. (But, as mentioned above, evaluation might stop early if the reference(s) to the query demand only a limited number of rows.) 但是,如果一个`和`查询是非递归且无副作用的(也就是说,它是一个`选择`不包含 volatile 函数)然后可以将其折叠到父查询中,从而允许两个查询级别的联合优化。默认情况下,如果父查询引用`和`只查询一次,但如果它引用`和`多次查询。您可以通过指定来覆盖该决定`物化`强制单独计算`和`查询,或通过指定`未物化`强制将其合并到父查询中。后一种选择有重复计算的风险`和`查询,但如果每次使用`和`查询只需要一小部分`和`查询的完整输出。 这些规则的一个简单示例是 ``` WITH w AS ( SELECT * FROM big_table ) SELECT * FROM w WHERE key = 123; ``` 这`和`查询将被折叠,产生相同的执行计划 ``` SELECT * FROM big_table WHERE key = 123; ``` 特别是,如果有一个索引`钥匙`,它可能会被用来只获取具有`键 = 123`. On the other hand, in ``` WITH w AS ( SELECT * FROM big_table ) SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref WHERE w2.key = 123; ``` the`WITH`query will be materialized, producing a temporary copy of`big_table`that is then joined with itself — without benefit of any index. This query will be executed much more efficiently if written as ``` WITH w AS NOT MATERIALIZED ( SELECT * FROM big_table ) SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref WHERE w2.key = 123; ``` so that the parent query's restrictions can be applied directly to scans of`big_table`. An example where`NOT MATERIALIZED`could be undesirable is ``` WITH w AS ( SELECT key, very_expensive_function(val) as f FROM some_table ) SELECT * FROM w AS w1 JOIN w AS w2 ON w1.f = w2.f; ``` Here, materialization of the`WITH`query ensures that`very_expensive_function`is evaluated only once per table row, not twice. The examples above only show`WITH`being used with`SELECT`, but it can be attached in the same way to`INSERT`,`UPDATE`, or`DELETE`. In each case it effectively provides temporary table(s) that can be referred to in the main command. ### 7.8.4. Data-Modifying Statements in`和` 您可以使用数据修改语句(`插入`,`更新`, 或者`删除`) 在`和`.这允许您在同一个查询中执行几个不同的操作。一个例子是: ``` WITH moved_rows AS ( DELETE FROM products WHERE "date" >= '2010-10-01' AND "date" < '2010-11-01' RETURNING * ) INSERT INTO products_log SELECT * FROM moved_rows; ``` 此查询有效地从`产品`到`产品日志`.这`删除`在`和`删除指定的行`产品`,通过其返回其内容`返回`条款;然后主查询读取该输出并将其插入`产品日志`. 上述示例的一个优点是`和`条款附在`插入`,而不是子`选择`内`插入`.这是必要的,因为数据修改语句只允许在`和`附加到顶级语句的子句。不过,正常`和`可见性规则适用,因此可以参考`和`子语句的输出`选择`. 中的数据修改语句`和`通常有`返回`条款(见[第 6.4 节](dml-returning.html)),如上例所示。它是输出`返回`条款,*不是*数据修改语句的目标表,它形成了可以由查询的其余部分引用的临时表。如果数据修改语句`和`缺少一个`返回`子句,则它不会形成临时表,并且不能在查询的其余部分中引用。尽管如此,这样的语句仍将被执行。一个不是特别有用的例子是: ``` WITH t AS ( DELETE FROM foo ) DELETE FROM bar; ``` 此示例将删除表中的所有行`富`和`酒吧`.报告给客户端的受影响行数将仅包括从`酒吧`. 不允许在数据修改语句中进行递归自引用。在某些情况下,可以通过引用递归的输出来解决这个限制`和`, 例如: ``` WITH RECURSIVE included_parts(sub_part, part) AS ( SELECT sub_part, part FROM parts WHERE part = 'our_product' UNION ALL SELECT p.sub_part, p.part FROM included_parts pr, parts p WHERE p.part = pr.sub_part ) DELETE FROM parts WHERE part IN (SELECT part FROM included_parts); ``` 此查询将删除产品的所有直接和间接子部分。 中的数据修改语句`和`只执行一次,并且总是完成,与主查询是否读取所有(或实际上任何)输出无关。请注意,这与规则不同`选择`在`和`:如上一节所述,执行`选择`仅在主查询需要其输出时才进行。 中的子语句`和`彼此同时执行并与主查询同时执行。因此,当在`和`,指定更新实际发生的顺序是不可预测的。所有语句都以相同的方式执行*快照*(看[第 13 章](mvcc.html)),因此他们无法“看到”彼此对目标表的影响。这减轻了行更新的实际顺序的不可预测性的影响,并意味着`返回`数据是不同人之间沟通变化的唯一方式`和`子语句和主查询。这方面的一个例子是 ``` WITH t AS ( UPDATE products SET price = price * 1.05 RETURNING * ) SELECT * FROM products; ``` 外层`选择`将在行动之前返回原始价格`更新`, 而在 ``` WITH t AS ( UPDATE products SET price = price * 1.05 RETURNING * ) SELECT * FROM t; ``` 外层`选择`将返回更新的数据。 不支持在单个语句中尝试两次更新同一行。只进行了一项修改,但要可靠地预测哪一项并不容易(有时也不可能)。这也适用于删除已在同一语句中更新的行:仅执行更新。因此,您通常应避免尝试在单个语句中两次修改单个行。特别避免写`和`可能影响由主语句或兄弟子语句更改的相同行的子语句。这种声明的影响是不可预测的。 目前,任何用作数据修改语句目标的表`和`不能有条件规则,也不能有`还`规则,也不是`反而`扩展到多个语句的规则。