Faithfully parse inheritance-recursion for external tables
When SQL standard table inheritance was added in upstream (by commit 2fb6cc90 in Postgres 7.1), mentioning a table in the FROM clause of a query would necessarily mean traversing through the inheritance hierarchy. The need to distinguish between the (legacy, less common, but legitimate nonetheless) intent of not recursing into child tables gave rise to two things: the guc `sql_inheritance` which toggles the default semantics of parent tables, and the `ONLY` keyword used in front of parent table names to explicitly skip descendant tables. ORCA doesn't like queries that skip descendant tables: it falls back to the legacy planner as soon as it detects that intent. Way way back in Greenplum-land, when external tables were given a separate designation in relstorage (RELSTORAGE_EXTERNAL), we seemed to have added code in parser (parse analysis) so that queries on external tables *never* recurse into their child tables, regardless of what the user specifies -- either via `ONLY` or `*` in the query, or via guc `sql_inheritance`. Technically, that process scrubs the range table entries to hard-code "do not recurse". The combination of those two things -- hard coding "do not recurse" in the RTE for the analyzed parse tree and ORCA detecting intent of `ONLY` through RTE -- led ORCA to *always* fall back to planner when an external table is mentioned in the FROM clause. commit 013a6e9d tried fixing this by *detecting harder* whether there's an external table. The behavior of the parse-analyzer hard coding a "do not recurse" in the RTE for an external table seems wrong for several reasons: 1. It seems unnecessarily defensive 2. It doesn't seem to belong in the parser. a. While changing "recurse" back to "do not recurse" abounds, all other occurrences happen in the planner as an optimization for childless tables. b. It deprives an optimizer of the actual intent expressed by the user: because of this hardcoding, neither ORCA nor planner would have a way of knowing whether the user specified `ONLY` in the query. c. It deprives the user of the ability to use child tables with an external table, either deliberately or coincidentally. d. A corollary is that any old views created as `SELECT a,b FROM ext_table` will be perpetuated as `SELECT a,b FROM ONLY ext_table`. This commit removes this defensive setting in the parse analyzer. As a consequence, we're able to reinstate the simpler RTE check before commit 013a6e9d. Queries and new views will include child tables as expected. Note that this commit will introduce a behavior change: (taken from https://github.com/greenplum-db/gpdb/pull/5455#issuecomment-412247709) 1. a (not external) table by default means "me and my descendants" even if it's childless 2. an external table with child tables previously would never recurse into child tables 2. after this patch you need to use ONLY to exclude descendant tables. (cherry picked from commit 2371cb3b)
Showing
想要评论请 注册 或 登录