- 27 Sep 2018, 1 commit
-
-
Committed by Paul Guo
As the comment said, this was useful; however, now that we have the upstream add_rte_to_flat_rtable() to handle that, let's remove this call.
-
- 23 Sep 2018, 1 commit
-
-
Committed by Daniel Gustafsson
getgpsegmentCount() was defined in both cdbvars.h and cdbutil.h. While having it in cdbvars.h avoided another header include in some cases, getgpsegmentCount() is not a variable and the correct location is cdbutil.h. Remove the prototype from cdbvars.h and update includes as required. Also fix the function comment to match reality, and do some minor tweaking of the debug elog(). Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
- 22 Sep 2018, 2 commits
-
-
Committed by Jesse Zhang
Commit 825ca1e3 didn't seem to work well when we hooked up ORCA's memory system to memory accounting. We are tripping multiple asserts in regression tests. The reg test failures seem to suggest we are double-free'ing somewhere (or incorrectly accounting). Reverting for now to get master back to green. This reverts commit 825ca1e3.
-
Committed by Taylor Vesely
The memory accounting system generates a new memory account for every execution node initialized in ExecInitNode. The addresses of these memory accounts are stored in the shortLivingMemoryAccountArray. If the memory allocated for shortLivingMemoryAccountArray is full, we repalloc the array with double the number of available entries. After creating approximately 67000000 memory accounts, it would need to allocate more than 1GB of memory to increase the array size, and it throws an ERROR, canceling the running query.

PL/pgSQL and SQL functions will create new executors/plan nodes that must be tracked by the memory accounting system. This level of detail is not necessary for tracking memory leaks, and creating a separate memory account for every executor would use a large amount of memory just to track these memory accounts. Instead of tracking millions of individual memory accounts, we consolidate any child executor account into a special 'X_NestedExecutor' account. If explain_memory_verbosity is set to 'detailed' or below, all child executors are consolidated into this account. If more detail is needed for debugging, set explain_memory_verbosity to 'debug', where, as was the previous behavior, every executor is assigned its own MemoryAccountId.

Originally we tried to remove nested execution accounts after they finished executing, but rolling over those accounts into an 'X_NestedExecutor' account was impracticable to accomplish without the possibility of a future regression. If any accounts created between nested executors are not rolled over to an 'X_NestedExecutor' account, recording which accounts were rolled over can grow in the same way that the shortLivingMemoryAccountArray grows today, and would also become too large to reasonably fit in memory. And iterating through the SharedHeaders every time we finish a nested executor is not likely to be very performant.

While we were at it, convert some of the convenience macros dealing with memory accounting for executor / planner nodes into functions, and move them out of the memory accounting header files into the sole callers' compilation units.

Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
Co-authored-by: Adam Berlin <aberlin@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
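For illustration only (not text from the commit): a sketch of how the two verbosity levels could be compared. The table and function names are invented; only the explain_memory_verbosity setting and its 'detailed'/'debug' values come from the message above, and it assumes the memory account summary is reported with EXPLAIN ANALYZE output.

    -- Hypothetical setup: a SQL function that creates a nested executor per call.
    CREATE TABLE acct_demo (id int) DISTRIBUTED BY (id);
    INSERT INTO acct_demo SELECT generate_series(1, 1000);
    CREATE FUNCTION acct_probe(x int) RETURNS int AS $$ SELECT x + 1 $$ LANGUAGE sql;

    -- 'detailed': nested executor accounts are rolled up into X_NestedExecutor.
    SET explain_memory_verbosity = 'detailed';
    EXPLAIN ANALYZE SELECT acct_probe(id) FROM acct_demo;

    -- 'debug': every nested executor keeps its own MemoryAccountId (the old behavior).
    SET explain_memory_verbosity = 'debug';
    EXPLAIN ANALYZE SELECT acct_probe(id) FROM acct_demo;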
-
- 05 Sep 2018, 3 commits
-
-
Committed by Daniel Gustafsson
-
Committed by Richard Guo
Co-authored-by: Alexandra Wang <lewang@pivotal.io> Co-authored-by: Richard Guo <guofenglinux@gmail.com>
-
Committed by Paul Guo
I checked our previous GPDB hacking code and the newest implementation, and found that the previous GPDB-related diffs no longer seem to apply to the current GPDB code, so the FIXME is removed. This patch also slightly refactors some code and changes some other existing comments.
-
- 03 Sep 2018, 1 commit
-
-
Committed by Heikki Linnakangas
Be more careful to not build a Redistribute Motion on an expression that's not GPDB-hashable. Fixes github issue #4868, as well as a couple of other similar cases that were found while investigating this.
-
- 31 Aug 2018, 1 commit
-
-
Committed by Heikki Linnakangas
The GPDB "prelim" functions did the same things as the "combine" functions introduced in PostgreSQL 9.6. This commit includes just the catalog changes, to essentially search & replace "prelim" with "combine". I did not pick the planner and executor changes that were made as part of this in the upstream, yet. Also replace the GPDB implementation of float8_amalg() and float8_regr_amalg() with the upstream float8_combine() and float8_regr_combine(). They do the same thing, but let's use upstream functions where possible.

Upstream commits:

commit a7de3dc5
Author: Robert Haas <rhaas@postgresql.org>
Date: Wed Jan 20 13:46:50 2016 -0500

    Support multi-stage aggregation.

    Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input.

    These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c.

    David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.

commit af025eed
Author: Robert Haas <rhaas@postgresql.org>
Date: Fri Apr 8 13:44:50 2016 -0400

    Add combine functions for various floating-point aggregates.

    This allows parallel aggregation to use them. It may seem surprising that we use float8_combine for both float4_accum and float8_accum transition functions, but that's because those functions differ only in the type of the non-transition-state argument.

    Haribabu Kommi, reviewed by David Rowley and Tomas Vondra
-
- 03 Aug 2018, 1 commit
-
-
Committed by Karen Huddleston
This reverts commit 4750e1b6.
-
- 02 Aug 2018, 1 commit
-
-
Committed by Richard Guo
This is the final batch of commits from PostgreSQL 9.2 development, up to the point where the REL9_2_STABLE branch was created and 9.3 development started on the PostgreSQL master branch. Notable upstream changes:

* Index-only scans were included in the batch of upstream commits. They allow queries to retrieve data only from indexes, avoiding heap access.
* Group commit was added to work effectively under heavy load. Previously, batching of commits became ineffective as the write workload increased, because of internal lock contention.
* A new fast-path lock mechanism was added to reduce the overhead of taking and releasing certain types of locks which are taken and released very frequently but rarely conflict.
* The new "parameterized path" mechanism was added. It allows inner index scans to use values from relations that are more than one join level up from the scan. This can greatly improve performance in situations where semantic restrictions (such as outer joins) limit the allowed join orderings.
* The SP-GiST (Space-Partitioned GiST) index access method was added to support unbalanced partitioned search structures. For suitable problems, SP-GiST can be faster than GiST in both index build time and search time.
* Checkpoints are now performed by a dedicated background process. Formerly the background writer did both dirty-page writing and checkpointing. Separating this into two processes allows each goal to be accomplished more predictably.
* Custom plans are now supported for specific parameter values even when using prepared statements.
* The FDW API was improved to let FDWs provide multiple access "paths" for their tables, allowing more flexibility in join planning.
* The security_barrier option was added for views, to prevent optimizations that might allow view-protected data to be exposed to users.
* The range data types were added, to store a lower and upper bound belonging to the base data type.
* CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility statement. The SELECT query is planned during the execution of the utility. To conform to this change, GPDB executes the utility statement only on the QD and dispatches the plan of the SELECT query to the QEs.

Co-authored-by: Adam Lee <ali@pivotal.io>
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Asim R P <apraveen@pivotal.io>
Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
Co-authored-by: Haozhou Wang <hawang@pivotal.io>
Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Co-authored-by: Paul Guo <paulguo@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
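To make two of the listed features concrete, a hedged sketch (object names are invented; it assumes index-only scans and range types behave in GPDB as they do in upstream 9.2):

    -- Index-only scan: the query can be answered from the index alone.
    CREATE TABLE ios_demo (a int, b int) DISTRIBUTED BY (a);
    INSERT INTO ios_demo SELECT g, g % 10 FROM generate_series(1, 10000) g;
    CREATE INDEX ios_demo_a_idx ON ios_demo (a);
    VACUUM ANALYZE ios_demo;
    EXPLAIN SELECT a FROM ios_demo WHERE a BETWEEN 100 AND 200;  -- expect an Index Only Scan

    -- Range types: a lower and an upper bound of the base type in a single value.
    SELECT int4range(10, 20) @> 15;                     -- true
    SELECT daterange('2018-01-01', '2018-02-01', '[)');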
-
- 09 Jul 2018, 1 commit
-
-
Committed by Heikki Linnakangas
Instead of completely disabling the generation of Paths with disabled plan types, add a high penalty to their cost estimates, like in the upstream. This reduces our diff vs. upstream, making future merges more straightforward. Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/Az2cDcqf73g/_tY6Yv1kBgAJ Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io> Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io> Reviewed-by: Richard Guo <riguo@pivotal.io>
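A hedged illustration of the difference (object names invented; the disable_cost figure in the comment is the upstream convention, not something stated in this commit): a disabled path is still generated, just with a prohibitive cost, so it can still be chosen when it is the only way to execute the query.

    CREATE TABLE penalty_demo (a int, b int) DISTRIBUTED BY (a);
    CREATE INDEX penalty_demo_a_idx ON penalty_demo (a);

    SET enable_seqscan = off;
    -- No index on b: the seq scan path survives, but carries a huge penalty
    -- (disable_cost, 1.0e10 in upstream), visible in the cost estimate.
    EXPLAIN SELECT * FROM penalty_demo WHERE b = 42;
    -- An index is available on a, so the index scan wins as usual.
    EXPLAIN SELECT * FROM penalty_demo WHERE a = 42;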
-
- 12 May 2018, 1 commit
-
-
Committed by Ashwin Agrawal
-
- 03 May 2018, 1 commit
-
-
Committed by Zhenghua Lyu
To prevent distributed deadlock, Greenplum DB holds an exclusive table lock for UPDATE and DELETE commands, so concurrent updates to the same table are effectively disabled. We add a backend process to do global deadlock detection so that we do not lock the whole table for UPDATE/DELETE, which helps improve the concurrency of Greenplum DB. The core idea of the algorithm is to divide locks into two types:

- Persistent: the lock can only be released after the transaction is over (abort/commit)
- Otherwise cases

This PR's implementation adds a persistent flag in the LOCK, and the rule for setting it is:

- Xid locks are always persistent
- Tuple locks are never persistent
- A relation lock is persistent if the relation has been closed with the NoLock parameter, otherwise it is not persistent
- Other types of locks are not persistent

For more details, please refer to the code and README. There are several known issues to pay attention to:

- This PR's implementation only cares about the locks that can be shown in the view pg_locks.
- This PR's implementation does not support AO tables. We keep upgrading the locks for AO tables.
- This PR's implementation does not take networking waits into account. Thus we cannot detect the deadlock of GitHub issue #2837.
- SELECT FOR UPDATE still locks the whole table.

Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
Co-authored-by: Ning Yu <nyu@pivotal.io>
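A hedged sketch of the user-visible effect on concurrent UPDATEs. The gp_enable_global_deadlock_detector setting named here is an assumption based on the feature as released, not spelled out in the message above; table names are invented.

    -- Enabling the detector requires a restart; shown only to name the setting:
    --   gpconfig -c gp_enable_global_deadlock_detector -v on

    CREATE TABLE gdd_demo (id int, val int) DISTRIBUTED BY (id);
    INSERT INTO gdd_demo VALUES (1, 10), (2, 20);

    -- Session 1:
    BEGIN;
    UPDATE gdd_demo SET val = val + 1 WHERE id = 1;

    -- Session 2: with the detector enabled, UPDATE takes row-level locks instead of
    -- an exclusive table lock, so this no longer blocks behind session 1.
    BEGIN;
    UPDATE gdd_demo SET val = val + 1 WHERE id = 2;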
-
- 02 May 2018, 1 commit
-
-
Committed by Heikki Linnakangas
I'm not sure why it's been disabled. It's not very hard to make it work, so let's do it. Might not be a very common query type, but if you happen to have a query where it helps, it helps a lot. This adds a GUC, gp_enable_minmax_optimization, to enable/disable the optimization. There's no such GUC in upstream, but we need at least a flag in PlannerConfig for it, so that we can disable the optimization for correlated subqueries, along with some other optimizer tricks. Seems best to also have a GUC for it, for consistency with other flags in PlannerConfig.
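A hedged example of what the optimization buys (object names invented; the expected plan shape, an index scan under a LIMIT, follows the upstream MIN/MAX optimization rather than anything quoted in this commit):

    CREATE TABLE minmax_demo (a int, b int) DISTRIBUTED BY (b);
    CREATE INDEX minmax_demo_a_idx ON minmax_demo (a);

    SET gp_enable_minmax_optimization = on;
    -- Can be answered by reading the first/last index entries instead of scanning the table.
    EXPLAIN SELECT min(a), max(a) FROM minmax_demo;

    SET gp_enable_minmax_optimization = off;
    EXPLAIN SELECT min(a) FROM minmax_demo;  -- falls back to a plain aggregate over the table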
-
- 29 Mar 2018, 2 commits
-
-
Committed by Pengzhou Tang
* Support replicated tables in GPDB

Currently, tables in GPDB are distributed across all segments by hash or randomly. There is a requirement for a new table type, the replicated table, where all segments hold a full, duplicate copy of the table data. To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark a replicated table, and a new locus type named CdbLocusType_SegmentGeneral to specify the distribution of a replicated table's tuples. CdbLocusType_SegmentGeneral implies the data is generally available on all segments but not on the qDisp, so a plan node with this locus type can be flexibly planned to execute on either a single QE or all QEs. It is similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral node can't be executed on the qDisp. To guarantee this, we try our best to add a gather motion on top of a CdbLocusType_SegmentGeneral node when planning motions for a join, even if the other rel has a bottleneck locus type. Such a motion may be redundant if the single QE is not promoted to execute on the qDisp in the end, so we detect that case and omit the redundant motion at the end of apply_motion(). We don't reuse CdbLocusType_Replicated, since it always implies a broadcast motion below it, and it's not easy to plan such a node as direct dispatch to avoid getting duplicate data.

We don't support replicated tables with inherit/partition-by clauses for now; the main problem is that update/delete on multiple result relations can't work correctly yet. We can fix this later.

* Allow spi_* to access replicated tables on QEs

Previously, GPDB didn't allow a QE to access a non-catalog table because the data is incomplete; we can remove this limitation now if it only accesses replicated tables. One problem is that the QE needs to know whether a table is replicated. Previously, the QE didn't maintain the gp_distribution_policy catalog, so we need to pass the policy info to the QE for replicated tables.

* Change the schema of gp_distribution_policy to identify replicated tables

Previously, we used a magic number -128 in the gp_distribution_policy table to identify a replicated table, which is quite a hack, so we add a new column in gp_distribution_policy to identify replicated tables and partitioned tables. This commit also abandons the old way that used a 1-length NULL list and a 2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED FULLY clauses. Besides, this commit refactors the code to make the decision-making of the distribution policy clearer.

* Support COPY for replicated tables

* Disable the row ctid unique path for replicated tables

Previously, GPDB used a special Unique path on rowid to address queries like "x IN (subquery)". For example, for "select * from t1 where t1.c2 in (select c2 from t3)" the plan looks like:

    -> HashAggregate
         Group By: t1.ctid, t1.gp_segment_id
         -> Hash Join
              Hash Cond: t2.c2 = t1.c2
              -> Seq Scan on t2
              -> Hash
                   -> Seq Scan on t1

Obviously, the plan is wrong if t1 is a replicated table, because ctid + gp_segment_id can't identify a tuple; in a replicated table, a logical row may have a different ctid and gp_segment_id on each segment. So we disable such plans for replicated tables temporarily. This is not the best solution, because the rowid unique path may be cheaper than a normal hash semi join, so we left a FIXME for later optimization.
* ORCA-related fix

Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>: fall back to the legacy query optimizer for queries over replicated tables.

* Adapt pg_dump/gpcheckcat to replicated tables

gp_distribution_policy is no longer a master-only catalog; do the same check as for other catalogs.

* Support gpexpand on replicated tables && altering the dist policy of replicated tables
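A hedged sketch of the user-visible side of the feature. The DISTRIBUTED REPLICATED spelling is the clause in released GPDB versions and may not match the exact syntax at the time of this commit; object names are invented.

    -- Every segment stores a full copy of the table.
    CREATE TABLE dim_country (
        code text,
        name text
    ) DISTRIBUTED REPLICATED;

    CREATE TABLE fact_sales (
        id bigint,
        country_code text,
        amount numeric
    ) DISTRIBUTED BY (id);

    -- Joining against the replicated dimension needs no Broadcast/Redistribute Motion
    -- for dim_country, since each segment already holds all of its rows.
    EXPLAIN SELECT f.id, d.name
    FROM fact_sales f JOIN dim_country d ON d.code = f.country_code;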
-
Committed by Dhanashree Kashid
With the 8.4 merge, the planner considers using HashAgg to implement DISTINCT. At the end of planning, we replace the expressions in the targetlist of certain operators (including Agg) with OUTER references to the targetlist of its lefttree (see set_plan_refs() > set_upper_references()). But, as per the code, when grouping() or group_id() are present in the target list of the Agg, it skips the replacement, and this is problematic when the Agg is implementing DISTINCT. It seems that the Agg's targetlist need not compute grouping() or group_id() when its lefttree is computing it; in that case, it may simply refer to it. This would then also apply to the other operators WindowAgg, Result & PartitionSelector. However, the Repeat node needs to compute these functions at each stage, because group_id is derived from RepeatState::repeat_count; thus it cannot be replaced by an OUTER reference. Hence, this commit removes the special case for these functions for all operators except Repeat. Then, a DISTINCT HashAgg produces the correct results. Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
-
- 16 Mar 2018, 1 commit
-
-
Committed by Shreedhar Hardikar
These were related to choosing the right arguments to send to GPDB's make_agg() and cost_agg() methods for queries containing DISTINCT or set operations. Hash aggregation, when used to implement a DISTINCT (in either form) in the query, is not related to grouping sets, and thus the arguments num_nullcols, input_grouping, grouping and rollup_gs_times should be 0. However, since SetOp uses the upstream TupleHashTable while HashAgg uses GPDB's HHashTable implementation, the hash table size calculations should be computed differently. This is fixed in this commit. Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
-
- 09 Feb 2018, 1 commit
-
-
Committed by Heikki Linnakangas
This removes much of the GPDB machinery to handle "deduplication paths" within the planner. We will now use the upstream code to build JOIN_SEMI paths, as well as paths where the outer side of the join is first deduplicated (JOIN_UNIQUE_OUTER/INNER). The old style "join first and deduplicate later" plans can be better in some cases, however. To still be able to generate such plans, add a new JOIN_DEDUP_SEMI join type, which is transformed into JOIN_INNER followed by the deduplication step after the join, during planning. This new way of constructing these plans is simpler, and allows removing a bunch of code, and reverting some more code to the way it is in the upstream. I'm not sure if this can generate the same plans that the old code could, in all cases. In particular, I think the old "late deduplication" mechanism could delay the deduplication further, all the way to the top of the join tree. I'm not sure when that would be useful, though, and the regression suite doesn't seem to contain any such cases (with EXPLAIN). Or maybe I misunderstood the old code. In any case, I think this is good enough.
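A hedged example of the query shape this affects (table names invented). Depending on costing, the planner may now produce either a plain semi join or the "join first, deduplicate after" plan that the new JOIN_DEDUP_SEMI type preserves:

    CREATE TABLE ds_outer (c1 int, c2 int) DISTRIBUTED BY (c1);
    CREATE TABLE ds_inner (c1 int, c2 int) DISTRIBUTED BY (c1);

    -- Either a Hash Semi Join, or an inner join followed by a deduplication step
    -- on the outer row identity, whichever is estimated to be cheaper.
    EXPLAIN SELECT * FROM ds_outer WHERE c2 IN (SELECT c2 FROM ds_inner);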
-
- 02 Feb 2018, 1 commit
-
-
Committed by Heikki Linnakangas
Instead, avoid creating such Result nodes in the first place, by making plan_pushdown_tlist() check if the Result node would have any work to do. With this, you get Result nodes in some cases where the old code could zap it away. But on the other hand, this can avoid inserting Result nodes, not only on top of Appends, but on top of any node. This can be seen in the included expected output changes: some test queries lose a Result, some gain one. So performance-wise this is about a wash, but this is simpler. The reason to do this right now is that we ran into issues with the "zapping" code while working on the 9.0 merge. I'm sure we could fix those issues, but let's do this rather than spend time debugging and fixing the zapping code with the merge.
-
- 13 Dec 2017, 4 commits
-
-
Committed by Daniel Gustafsson
The comment added in 916f460f created a nested comment structure by accident, which triggered a warning in clang for -Wcomment. Reword the comment slightly to make the compiler happy.

    planner.c:194:15: warning: '/*' within block comment [-Wcomment]
     * support pl/* statements (relevant when they are planned on the segments).
                  ^
-
Committed by Shreedhar Hardikar
The default value of Gp_role is set to GP_ROLE_DISPATCH, which means auxiliary processes inherit this value. FileRep does the same, but it also executes queries using SPI on the segment, which means Gp_role == GP_ROLE_DISPATCH is not a sufficient check for the master QD. So, bring back the check on GpIdentity. Author: Asim R P <apraveen@pivotal.io> Author: Shreedhar Hardikar <shardikar@pivotal.io>
-
Committed by Shreedhar Hardikar
The original name was deceptive because this check is also done for QE slices that run on master. For example:

    EXPLAIN SELECT * FROM func1_nosql_vol(5), foo;
                                              QUERY PLAN
    --------------------------------------------------------------------------------------------
     Gather Motion 3:1  (slice2; segments: 3)  (cost=0.30..1.37 rows=4 width=12)
       ->  Nested Loop  (cost=0.30..1.37 rows=2 width=12)
             ->  Seq Scan on foo  (cost=0.00..1.01 rows=1 width=8)
             ->  Materialize  (cost=0.30..0.33 rows=1 width=4)
                   ->  Broadcast Motion 1:3  (slice1)  (cost=0.00..0.30 rows=3 width=4)
                         ->  Function Scan on func1_nosql_vol  (cost=0.00..0.26 rows=1 width=4)
     Settings:  optimizer=off
     Optimizer status: legacy query optimizer
    (8 rows)

Note that in the plan, the function func1_nosql_vol() will be executed on a master slice with Gp_role as GP_ROLE_EXECUTE. Also, update output files. Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
-
Committed by Shreedhar Hardikar
We don't want to use the optimizer for planning queries in SQL, pl/pgSQL etc. functions when that is done on the segments. ORCA excels in complex queries, most of which will access distributed tables. We can't run such queries from the segment slices anyway because they require dispatching a query within another - which is not allowed in GPDB. Note that this restriction also applies to non-QD master slices. Furthermore, ORCA doesn't currently support pl/* statements (relevant when they are planned on the segments). For these reasons, restrict to using ORCA on the master QD processes only. Also revert commit d79a2c7f ("Fix pipeline failures caused by 0dfd0ebc.") and separate out gporca fault injector tests in newly added gporca_faults.sql so that the rest can run in a parallel group. Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
-
- 12 Dec 2017, 1 commit
-
-
Committed by Daniel Gustafsson
These error codes were marked as deprecated in September 2007 but the code didn't get the memo. Extend the deprecation into the code and actually replace the usage. Ten years seems long enough notice so also remove the renames, the odds of anyone using these in code which compiles against a 6X tree should be low (and easily fixed).
-
- 30 Nov 2017, 1 commit
-
-
Committed by Heikki Linnakangas
Looks like you can't actually get here with any aggregates or placeholders in the start/end offsets, or we would've gotten errors.
-
- 24 Nov 2017, 7 commits
-
-
Committed by Heikki Linnakangas
commit 96f990e2
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Jul 13 20:23:09 2011 -0400

    Update some comments to clarify who does what in targetlist creation.

    No code changes; just avoid blaming query_planner for things it doesn't really do.
-
Committed by Heikki Linnakangas
The test case added to the regression suite actually seems to work on GPDB even without this, but nevertheless it seems like a good idea to pick it now, since we have the code it affected. Also, I'm about to backport more stuff that depends on this.

commit c1d9579d
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Jul 12 18:23:55 2011 -0400

    Avoid listing ungrouped Vars in the targetlist of Agg-underneath-Window.

    Regular aggregate functions in combination with, or within the arguments of, window functions are OK per spec; they have the semantics that the aggregate output rows are computed and then we run the window functions over that row set. (Thus, this combination is not really useful unless there's a GROUP BY so that more than one aggregate output row is possible.) The case without GROUP BY could fail, as recently reported by Jeff Davis, because sloppy construction of the Agg node's targetlist resulted in extra references to possibly-ungrouped Vars appearing outside the aggregate function calls themselves. See the added regression test case for an example.

    Fixing this requires modifying the API of flatten_tlist and its underlying function pull_var_clause. I chose to make pull_var_clause's API for aggregates identical to what it was already doing for placeholders, since the useful behaviors turn out to be the same (error, report node as-is, or recurse into it). I also tightened the error checking in this area a bit: if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here, that was a long time ago, so complain instead of ignoring them.

    Backpatch into 9.1. The failure exists in 8.4 and 9.0 as well, but seeing that it only occurs in a basically-useless corner case, it doesn't seem worth the risks of changing a function API in a minor release. There might be third-party code using pull_var_clause.
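A hedged version of the kind of query the upstream fix targets (object names invented): the aggregate output rows are computed first, and the window function then runs over that grouped row set, so ungrouped columns may only appear inside the aggregate calls.

    CREATE TABLE agg_win_demo (dept int, salary numeric) DISTRIBUTED BY (dept);

    -- sum(salary) is computed per GROUP BY group; rank() runs over those
    -- aggregate output rows.
    SELECT dept,
           sum(salary) AS dept_total,
           rank() OVER (ORDER BY sum(salary) DESC) AS dept_rank
    FROM agg_win_demo
    GROUP BY dept;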
-
Committed by Heikki Linnakangas
We would get this later in PostgreSQL 8.4, but I'm about to cherry-pick more commits now that depend on this. Upstream commit:

commit 1d97c19a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun Apr 19 19:46:33 2009 +0000

    Fix estimate_num_groups() to not fail on PlaceHolderVars, per report from Stefan Kaltenbrunner. The most reasonable behavior (at least for the near term) seems to be to ignore the PlaceHolderVar and examine its argument instead. In support of this, change the API of pull_var_clause() to allow callers to request recursion into PlaceHolderVars. Currently estimate_num_groups() is the only customer for that behavior, but where there's one there may be others.
-
Committed by Heikki Linnakangas
This is similar to the old implementation, in that we use "+", "-" to compute the boundaries. Unfortunately it seems unlikely that this would be accepted in the upstream, but at least we have that feature back in GPDB now, the way it used to be. See discussion on pgsql-hackers about that: https://www.postgresql.org/message-id/26801.1265656635@sss.pgh.pa.us
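A hedged sketch of the re-added syntax (object names invented; it assumes an interval offset over a date ORDER BY column is resolved with the type's '-' operator, as described above):

    CREATE TABLE range_demo (ts date, amount numeric) DISTRIBUTED BY (ts);

    -- A trailing one-week window: the frame start is computed as ts - '6 days'.
    SELECT ts,
           sum(amount) OVER (ORDER BY ts
                             RANGE BETWEEN '6 days'::interval PRECEDING
                                   AND CURRENT ROW) AS trailing_week
    FROM range_demo;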
-
Committed by Heikki Linnakangas
This is functionality that was lost by the ripout & replace.

commit 34d26872
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Dec 15 17:57:48 2009 +0000

    Support ORDER BY within aggregate function calls, at long last providing a non-kluge method for controlling the order in which values are fed to an aggregate function. At the same time eliminate the old implementation restriction that DISTINCT was only supported for single-argument aggregates.

    Possibly release-notable behavioral change: formerly, agg(DISTINCT x) dropped null values of x unconditionally. Now, it does so only if the agg transition function is strict; otherwise nulls are treated as DISTINCT normally would, ie, you get one copy.

    Andrew Gierth, reviewed by Hitoshi Harada
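Hedged examples of the two capabilities this brings in (invented table): ordered input to an aggregate, and DISTINCT on a multi-argument aggregate.

    CREATE TABLE ord_agg_demo (grp int, x int, y text) DISTRIBUTED BY (grp);

    -- Values are fed to array_agg in a well-defined order.
    SELECT grp, array_agg(x ORDER BY x DESC) FROM ord_agg_demo GROUP BY grp;

    -- DISTINCT is no longer limited to single-argument aggregates.
    SELECT grp, string_agg(DISTINCT y, ',' ORDER BY y) FROM ord_agg_demo GROUP BY grp;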
-
Committed by Heikki Linnakangas
This loses the functionality, and leaves all the regression tests that used those functions failing. The plan is to later backport the upstream implementation of those functions from PostgreSQL 9.4. The feature is called "ordered set aggregates" there.
-
Committed by Heikki Linnakangas
This adds some limitations, and removes some functionality that the old implementation had. These limitations will be lifted, and the missing functionality will be added back, in subsequent commits:

* You can no longer have variables in start/end offsets
* RANGE is not implemented (except for UNBOUNDED)
* If you have multiple window functions that require a different sort ordering, the planner is not smart about placing them in a way that minimizes the number of sorts.

This also lifts some limitations that the GPDB implementation had:

* LEAD/LAG offsets can now be negative. In qp_olap_windowerr, a lot of queries that used to throw a "ROWS parameter cannot be negative" error are now passing. That error was an artifact of the way LEAD/LAG were implemented. Those queries contain window function calls like "LEAD(col1, col2 - col3)", and sometimes, with suitable values in col2 and col3, the second argument went negative. That caused the error. The new implementation of LEAD/LAG is OK with a negative argument.
* Aggregate functions with no prelimfn or invprelimfn are now supported as window functions
* Window functions, e.g. rank(), no longer require an ORDER BY. (The output will vary from one invocation to another, though, because the order is then not well defined. This is more annoying on GPDB than on PostgreSQL, because in GPDB the row order tends to vary because the rows are spread out across the cluster and will arrive at the master in unpredictable order)
* NTILE doesn't require the argument expression to be in PARTITION BY
* A window function's arguments may contain references to an outer query.

This changes the OIDs of the built-in window functions to match upstream. Unfortunately, the OIDs had been hard-coded in ORCA, so to work around that until those hard-coded values are fixed in ORCA, the ORCA translator code contains a hack to map the old OIDs to the new ones.
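Hedged examples of two of the newly allowed forms (invented table): an offset expression that may evaluate to a negative value, and a window function without ORDER BY.

    CREATE TABLE win_demo (c1 int, c2 int, c3 int) DISTRIBUTED BY (c1);

    -- The LEAD offset may now go negative at runtime (it then looks backwards).
    SELECT c1, lead(c1, c2 - c3) OVER (ORDER BY c1) FROM win_demo;

    -- rank() without an ORDER BY is accepted; the output is simply not well defined.
    SELECT c1, rank() OVER () FROM win_demo;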
-
- 23 Nov 2017, 2 commits
-
-
Committed by Heikki Linnakangas
The old logic was:

1. Decide if we need to put a Gather motion on top of the plan
2. Add nodes to handle DISTINCT
3. Add nodes to handle ORDER BY
4. Add the Gather node, if we decided so in step 1.

In step 1, if the result was already focused on a single segment, we would make note that no Gather is needed, and not add one in step 4. However, the DISTINCT processing might add a Redistribute Motion node, so that the final result is not focused on a single node. I couldn't come up with a query where that would happen, as the code stands, but we saw such a case on the "window functions rewrite" branch we've been working on. There, the sort order/distribution of the input can be changed to process window functions. But even if this isn't actively broken right now, it seems more robust to change the logic so that 'must_gather' means 'at the end, the result must end up on a single node', instead of 'we must add a Gather node'. The test that this adds exercises this issue after the window functions rewrite, but right now it passes with or without these code changes. But might as well add it now.
-
Committed by Heikki Linnakangas
The last 8.4 merge commit introduced support for DISTINCT with hashing, and refactored the way grouping_planner() works with the path keys. That broke DISTINCT with window functions, because the new distinct_pathkeys field was not set correctly. In commit 474f1db0, I moved some GPDB-added tests from the 'aggregates' test, to a new 'gp_aggregates' test. But I forgot to add the new test file to the test schedule, so it was not run. Oops. Add it to the schedule now. The tests in 'gp_aggregates' cover this bug.
-
- 21 Oct 2017, 1 commit
-
-
Committed by Heikki Linnakangas
If a CREATE TABLE AS query contained an ORDER BY, the planner put a Motion node on top of the plan that focuses all the rows to a single node. However, that was confused with the redistribute motion that CREATE TABLE AS is supposed to put at the top, to distribute the rows according to the DISTRIBUTED BY of the table. This used to work before commit 7e268107, because we used to not add an explicit Motion node on top of the plan for ORDER BY; we just changed the sort-order information in the Flow. I have a nagging feeling that the apply_motion code isn't dealing with a Motion on top of a Motion node correctly, because I would've expected to get a plan like that without this fix. Perhaps apply_motion silently refuses to add a Motion node on top of an existing Motion? That'd be a silly plan, of course, and the planner fortunately doesn't create such plans, so I'm not going to dig deeper into that right now. The test case is a simplified version of one of the "mpp21090_drop_col_oids_dml_*" TINC tests. I noticed this while moving those tests over from TINC to the main suite. We only run those tests in the concourse pipeline with "set optimizer=on", so it didn't catch this issue with optimizer=off. Fixes github issue #3577.
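A hedged reduction of the failing case (object names invented): CTAS with an ORDER BY and an explicit DISTRIBUTED BY, where the final motion must redistribute by the new table's distribution key rather than gather everything to a single node.

    CREATE TABLE ctas_src (a int, b int) DISTRIBUTED BY (a);

    -- The plan for the SELECT needs a motion for the ORDER BY, and on top of that a
    -- redistribute motion so the rows land according to DISTRIBUTED BY (b).
    CREATE TABLE ctas_dst AS
        SELECT a, b FROM ctas_src ORDER BY a
        DISTRIBUTED BY (b);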
-
- 13 Oct 2017, 1 commit
-
-
Committed by Jesse Zhang
`make_pathkeys_for_sortclauses` with a `true` last argument promises to canonicalize the returned path keys. We somehow cargo-culted a few unnecessary `canonicalize_pathkeys` immediately after those calls. This commit removes such superfluous calls to `canonicalize_pathkeys`. Signed-off-by: Max Yang <myang@pivotal.io>
-
- 12 Oct 2017, 1 commit
-
-
Committed by Heikki Linnakangas
This bug was introduced in commit 7e268107, which changed the way we track the "current" ordering in the planner.
-
- 27 Sep 2017, 2 commits
-
-
Committed by Shreedhar Hardikar
This was used to keep information about the subquery join tree for pulled-up sublinks, for use later in deconstruct_recurse(). With the upstream subselect merge, a JoinExpr is constructed at pull-up time itself, so this is no longer needed, since the subquery join tree information is available in the constructed JoinExpr. Also with the merge, deconstruct_recurse() handles JOIN_SEMI JoinExprs. However, since GPDB differs from upstream by treating SEMI joins as INNER joins for internal join planning, this commit also updates inner_join_rels correctly for SEMI joins (see regression test). Also remove the unused function declaration for not_null_inner_vars(). Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Ekta Khanna
commit c59d8dd4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Apr 28 21:31:16 2009 +0000

    Improve pull_up_subqueries logic so that it doesn't insert unnecessary PlaceHolderVar nodes in join quals appearing in or below the lowest outer join that could null the subquery being pulled up. This improves the planner's ability to recognize constant join quals, and probably helps with detection of common sort keys (equivalence classes) as well.

Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-