- 27 Sep 2017, 23 commits
-
-
Committed by Jemish Patel
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Aug 20 19:58:24 2008 +0000

Marginal improvement in sublink planning: allow the unknownEqFalse optimization to be used for SubLinks that are underneath a top-level OR clause. Just as at the very top level of WHERE, it's not necessary to be accurate about whether the sublink returns FALSE or NULL, because either result has the same impact on whether the WHERE will succeed.

(cherry picked from commit cc0dd438)
Ref [#142356645]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
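A sketch of the query shape this enables (table and column names are hypothetical, for illustration only):

```sql
-- The IN sublink sits under a top-level OR rather than directly under WHERE.
-- Whether the sublink yields FALSE or NULL makes no difference to whether
-- the WHERE succeeds, so unknownEqFalse can now be applied here too.
SELECT *
FROM orders o
WHERE o.status = 'open'
   OR o.customer_id IN (SELECT id FROM vip_customers);
```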
-
Committed by Jemish Patel
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Aug 20 15:49:30 2008 +0000

Fix obsolete comment. It's no longer the case that Param nodes don't carry typmod.

(cherry picked from commit 390e59cd)
Ref [#142356645]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Shreedhar Hardikar
This commit updates test answer files after merging e006a24a, 19e34b62 and e549722a. As a result of these merges, we produce inferior plans for which we have added FIX_ME's appropriately. They will be addressed in later commits.
-
For flattened IN or EXISTS sublinks, if we choose an INNER JOIN path instead of a SEMI JOIN, we need to apply duplicate suppression. The deduplication can be done in two ways:

1. Post-join dedup: unique-ify the inner join results. `try_postjoin_dedup` in `CdbRelDedupInfo` denotes whether we need to go for post-join dedup.
2. Pre-join dedup: unique-ify the rows coming from the rel containing the subquery result, before that rel is joined with any other rels. `join_unique_ininfo` in `CdbRelDedupInfo` denotes whether we need to go for pre-join dedup. `semi_operators` and `semi_rhs_exprs` are used for this. We ported a function from 9.5 to compute these in `make_outerjoininfo()`.

Upstream has a completely different implementation of this: it explores JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER paths, and deduplication is done in `create_unique_path()`. GPDB does this differently, since JOIN_UNIQUE_INNER and JOIN_UNIQUE_OUTER are obsolete for us. Hence we have kept the GPDB-style deduplication mechanism as is in this merge. Post-join dedup has been implemented in previous merge commits.

Ref [#146890743]
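A minimal sketch of why dedup is needed at all (table names are hypothetical):

```sql
-- An IN sublink flattened to a join. If the planner picks a plain inner
-- join, one outer row can match several inner rows and be emitted more
-- than once, so the result must be de-duplicated: either after the join
-- (post-join dedup) or by unique-ifying the subquery output before the
-- join (pre-join dedup). A true semi join needs neither.
SELECT o.id
FROM orders o
WHERE o.customer_id IN (SELECT c.customer_id FROM payments c);
```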
-
Committed by Shreedhar Hardikar
- Fix up post-join dedup logic after the cherry-pick.
- Fix pull_up_sublinks_jointree_recurse returning garbage relids.
- Update gporca, rangefuncs, eagerfree answer files:
  1. gporca: Previously we were generating a Hash Inner Join with a HashAggregate for deduplication. Now we generate a Hash Semi Join, in which case we do not need to deduplicate the inner side.
  2. rangefuncs: We updated this answer file during the cherry-pick of e006a24a since there was a change in plan. After these cherry-picks, we are back to the original plan as master. Hence we see the original error.
  3. eagerfree: We are generating a not-very-useful Subquery Scan node with this change. It does not produce wrong results, but this Subquery Scan needs to be removed. We will file a follow-up chore to investigate and fix this.
- We no longer need the helper function `hasSemiJoin()` to check whether the SpecialJoinInfo list has any SpecialJoinInfos constructed for a Semi Join (IN/EXISTS sublink). We have moved that check inside `cdb_set_cheapest_dedup()`.
- We are not exercising the pre-join-deduplication code path after this cherry-pick. Before this merge, we had three CDB-specific fields in `InClauseInfo` in which we recorded information for pre-join dedup in the case of simple uncorrelated IN sublinks: `try_join_unique`, `sub_targetlist` and `InOperators`. Since we now have `SpecialJoinInfo` instead of `InClauseInfo`, we need to devise a way to record this information in `SpecialJoinInfo`. We have filed a follow-up story for this.

Ref [#142356521]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Dhanashree Kashid
commit e549722a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Feb 25 03:30:38 2009 +0000

Get rid of the rather fuzzily defined FlattenedSubLink node type in favor of making pull_up_sublinks() construct a full-blown JoinExpr tree representation of IN/EXISTS SubLinks that it is able to convert to semi or anti joins. This makes pull_up_sublinks() a shade more complex, but the gain in semantic clarity is worth it. I still have more to do in this area to address the previously-discussed problems, but this commit in itself fixes at least one bug in HEAD, as shown by the added regression test case.

Ref [#142356521]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna
commit 19e34b62
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun Aug 17 01:20:00 2008 +0000

Improve sublink pullup code to handle ANY/EXISTS sublinks that are at the top level of a JOIN/ON clause, not only at the top level of WHERE. (However, we can't do this in an outer join's ON clause, unless the ANY/EXISTS refers only to the nullable side of the outer join, so that it can effectively be pushed down into the nullable side.) Per request from Kevin Grittner.

In passing, fix a bug in the initial implementation of EXISTS pullup: it would Assert if the EXISTS's WHERE clause used a join alias variable. Since we haven't yet flattened join aliases when this transformation happens, it's necessary to include join relids in the computed set of RHS relids.

Ref [#142356521]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
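A sketch of the newly handled query shape (tables hypothetical, for illustration):

```sql
-- The EXISTS sublink sits at the top level of a JOIN/ON clause rather
-- than in WHERE; after this commit the pullup code can still flatten it
-- into a semi join, as long as the ON clause belongs to an inner join.
SELECT *
FROM a
JOIN b
  ON a.id = b.a_id
 AND EXISTS (SELECT 1 FROM c WHERE c.b_id = b.id);
```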
-
This commit updates test answer files after merging with e006a24a.

- Replace `EXISTS Join` with `Semi Join`.
- Replace `Left Anti Semi Join` with `Anti Join`.
- Updated the plan for `table_functions` for IN queries, as we do not pull up the sublink and convert it into a join when the sublink testexpr does not contain Vars of the parent query.
- Updated the error in `rangefuncs.out`, since we now create a different plan for the following query:

```sql
CREATE TABLE foorescan (fooid int, foosubid int, fooname text, primary key(fooid,foosubid));
INSERT INTO foorescan values(5000,1,'abc.5000.1');
INSERT INTO foorescan values(5001,1,'abc.5001.1');
CREATE FUNCTION foorescan(int,int) RETURNS setof foorescan AS
  'SELECT * FROM foorescan WHERE fooid >= $1 and fooid < $2 ;' LANGUAGE SQL;
SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
```

Plan before fix:

```
 QUERY PLAN
------------------------------------------------------------------------------------------------------
 Sort  (cost=270.41..270.54 rows=50 width=50)
   Sort Key: f.fooid, f.foosubid
   ->  HashAggregate  (cost=268.50..269.00 rows=50 width=50)
         Group By: f.ctid::bigint, f.gp_segment_id
         ->  Hash Join  (cost=5.12..268.25 rows=50 width=29)
               Hash Cond: foorescan.fooid = f.fooid
               ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
               ->  Hash  (cost=4.50..4.50 rows=17 width=29)
                     ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..4.50 rows=50 width=29)
                           ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=29)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(12 rows)
```

Here the Function Scan is executed on the master, and since the function accesses a distributed relation, `cdbdisp_dispatchToGang()` errors out.

Plan after fix:

```
explain SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
 QUERY PLAN
------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice3; segments: 3)  (cost=299.16..299.29 rows=50 width=19)
   Merge Key: f.fooid, f.foosubid
   ->  Sort  (cost=299.16..299.29 rows=17 width=19)
         Sort Key: f.fooid, f.foosubid
         ->  Hash Semi Join  (cost=292.50..297.75 rows=17 width=19)
               Hash Cond: f.fooid = foorescan.fooid
               ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..4.50 rows=17 width=19)
                     Hash Key: f.fooid
                     ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=19)
               ->  Hash  (cost=280.00..280.00 rows=334 width=4)
                     ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..280.00 rows=1000 width=4)
                           Hash Key: foorescan.fooid
                           ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(15 rows)
```

With this new plan, the Function Scan is executed on a segment, in which case `init_sql_fcache()` first walks the query tree and checks whether it is safe to plan and execute on the segment, using `querytree_safe_for_segment_walker()`. `querytree_safe_for_segment_walker()` errors out since the function accesses a distributed table.

Both the new and old errors test the same scenario, but due to the plan change, the place where we bail out is different.

Ref [#142355175]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna
After merging with e006a24a, Anti Semi Join will be denoted by `JOIN_ANTI` instead of `JOIN_LASJ`.

Ref [#142355175]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Ekta Khanna
Since `InClauseInfo` and `OuterJoinInfo` are now combined into `SpecialJoinInfo` after merging with e006a24a, this commit removes them from the relevant places: access `join_info_list` instead of `in_info_list` and `oj_info_list`.

Previously, `CdbRelDedupInfo` contained a list of `InClauseInfo`s. While making join decisions and during overall join processing, we traversed this list and invoked CDB-specific functions: `cdb_make_rel_dedup_info()`, `cdbpath_dedup_fixup()`. Since `InClauseInfo` is no longer available, `CdbRelDedupInfo` now contains a list of `SpecialJoinInfo`s. All the CDB-specific routines previously called for the `InClauseInfo` list are now called if `CdbRelDedupInfo` has a valid `SpecialJoinInfo` list and the join type in the `SpecialJoinInfo` is `JOIN_SEMI`. A new helper routine `hasSemiJoin()` has been added, which traverses the `SpecialJoinInfo` list to check whether it contains `JOIN_SEMI`.

Ref [#142355175]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Ekta Khanna
After merging with e006a24a, the jointype JOIN_IN has been renamed to JOIN_SEMI. This commit makes the corresponding changes in ORCA.

Ref [#142355175]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Ekta Khanna
Add pullup decisions specific to CDB from `convert_IN_to_join`.

Ref [#142355175]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Dhanashree Kashid
After merging with e006a24a, this commit adds CDB-specific restrictions as follows:

- Add pullup decisions specific to CDB from `convert_EXISTS_to_join` and `convert_NOT_EXISTS_to_join`.
- Before this cherry-pick, we used to generate extra quals for a NOT EXISTS query. This was done by calling `cdbpullup_expr()` in `convert_NOT_EXISTS_to_join()`. However, for the exact same query with EXISTS, we never generated these extra quals.

```sql
create table foo(t text, n numeric, i int, v varchar(10)) distributed by (t);
explain select * from foo t0 where not exists (select 1 from foo t1 where t0.i=t1.i + 1);
```
```
 QUERY PLAN
-------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=1.08..2.12 rows=4 width=19)
   ->  Hash Left Anti Semi Join  (cost=1.08..2.12 rows=2 width=19)
         Hash Cond: t0.i = (t1.i + 1)
         ->  Seq Scan on foo t0  (cost=0.00..1.00 rows=1 width=19)
         ->  Hash  (cost=1.04..1.04 rows=1 width=4)
               ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..1.04 rows=1 width=4)
                     ->  Seq Scan on foo t1  (cost=0.00..1.00 rows=1 width=4)
                           Filter: (i + 1) IS NOT NULL    <-- extra filter
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(10 rows)
```
```sql
explain select * from foo t0 where exists (select 1 from foo t1 where t0.i=t1.i + 1);
```
```
 QUERY PLAN
-------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=1.08..2.12 rows=4 width=19)
   ->  Hash EXISTS Join  (cost=1.08..2.12 rows=2 width=19)
         Hash Cond: t0.i = (t1.i + 1)
         ->  Seq Scan on foo t0  (cost=0.00..1.00 rows=1 width=19)
         ->  Hash  (cost=1.04..1.04 rows=1 width=4)
               ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..1.04 rows=1 width=4)
                     ->  Seq Scan on foo t1  (cost=0.00..1.00 rows=1 width=4)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(9 rows)
```

  Currently, with this commit, the combined pull-up code for EXISTS and NOT EXISTS does not generate extra filters. This is a future TODO.
- Use `is_simple_subquery` in `simplify_EXISTS_query` to check whether the subquery can be pulled up or not.

Ref [#142355175]
Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Dhanashree Kashid
With the new flow, we don't need the following functions:

- pull_up_IN_clauses
- convert_EXISTS_to_join
- convert_NOT_EXISTS_to_antijoin
- not_null_inner_vars
- safe_to_convert_NOT_EXISTS
- convert_sublink_to_join

Ref [#142355175]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna
Original flow:

```
cdb_flatten_sublinks
  +--> pull_up_IN_clauses
  +--> convert_sublink_to_join
```

New flow:

```
cdb_flatten_sublinks
  +--> pull_up_sublinks
```

This commit contains the relevant changes for the above flow. Previously, `try_join_unique` was part of `InClauseInfo`; it was set in `convert_IN_to_join()` and used in `cdb_make_rel_dedup_info()`. Now, since `InClauseInfo` is not present, we construct a `FlattenedSublink` instead in `convert_ANY_sublink_to_join()`, and later in the flow we construct a `SpecialJoinInfo` from the `FlattenedSublink` in `deconstruct_sublink_quals_to_rel()`. Hence, `try_join_unique` is added to both `FlattenedSublink` and `SpecialJoinInfo`.

Ref [#142355175]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna
commit e006a24a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu Aug 14 18:48:00 2008 +0000

Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins.

Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.

Ref [#142355175]
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
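A sketch of the two NOT EXISTS shapes this commit teaches the planner to recognize as anti joins (tables hypothetical, for illustration):

```sql
-- Explicit NOT EXISTS subquery, convertible to an anti join:
SELECT a.*
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id);

-- Equivalent LEFT JOIN with an upper-level IS NULL filter, which the
-- planner now also recognizes as an anti join:
SELECT a.*
FROM a LEFT JOIN b ON b.a_id = a.id
WHERE b.a_id IS NULL;
```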
-
Committed by Lisa Owen
-
Committed by Lisa Owen
- admin guide working w/dbs section: pull some topics up a level
- promote ddl, crud topics; move querying topic up
-
Committed by Nadeem Ghani
Behave by default just grabs the latest parse_type package as part of the setup requirements. However, the newest parse_type package (0.4.2) uses a new packaging convention that is not supported by Python versions before 2.7.13. Since we're on Python 2.7.12, we break. Force the requirement to an older version as a hack to bypass this issue.

Signed-off-by: Marbin Tan <mtan@pivotal.io>
-
Committed by Mel Kiyama
-
Committed by Lav Jain
-
Committed by Lisa Owen
-
- 26 Sep 2017, 13 commits
-
-
Committed by Jacob Champion
Several GUCs are simply enumerated strings that are parsed into integer types behind the scenes. As of 8.4, the GUC system recognizes a new type, enum, which will do this for us. Move as many as we can to the new system. As part of this:

- gp_idf_deduplicate was changed from a char* string to an int, and new IDF_DEDUPLICATE_* macros were added for each option
- password_hash_algorithm was changed to an int
- for codegen_optimization_level, "none" is now the default when codegen is not enabled during compilation (instead of the empty string)

A couple of GUCs that *could* be represented as enums (optimizer_minidump, gp_workfile_compress_algorithm) have been purposefully kept on the prior system because they require the GUC variable to be something other than an integer anyway.

Signed-off-by: Jacob Champion <pchampion@pivotal.io>
-
Committed by Heikki Linnakangas
We had a very simplistic implementation in parse analysis already, which converted the FILTER (WHERE ...) clause into a CASE WHEN expression. That did not work for non-strict aggregates, and did not deparse back into a FILTER expression nicely, to name a few of its problems. Replace it with the PostgreSQL implementation.

TODO:
- ORCA support. It now falls back to the Postgres planner.
- I disabled the three-stage DQA plan types if there are any FILTERs.
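A sketch of the old rewrite and why it was insufficient (table and column names are hypothetical):

```sql
-- Aggregate FILTER clause, as in SQL:2003 / PostgreSQL:
SELECT count(*) FILTER (WHERE amount > 100) FROM orders;

-- The old parse-analysis shortcut rewrote it roughly as:
SELECT count(CASE WHEN amount > 100 THEN 1 END) FROM orders;
-- This works for count() because strict aggregates ignore NULL inputs,
-- but it breaks down for non-strict aggregates, and the CASE form does
-- not deparse back into FILTER syntax.
```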
-
Committed by Lav Jain
-
Committed by Xin Zhang
Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Ashwin Agrawal
Add a timestamp to better trace the command. Add ClusterConfiguration to improve reuse, and remove the ColdStartMaster class.

Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Xin Zhang
Under SEGWALREP, the mode for gp_segment_configuration is 'n' for not-in-sync and 's' for in-sync. When a new primary or mirror is added to the cluster, the initial state of the mode should be 'n' instead of 's' (as it was in the original filerep mode). If a mirror exists and is synchronized with the primary, the mode is then updated to 's' to indicate the primary-mirror pair is in sync.

Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Ashwin Agrawal
Originally, gp_add_segment() was only used to add a primary segment. We created a new gp_add_segment_primary() to keep the original functionality, which automatically generates the dbid and contentid.

We generalized gp_add_segment() to directly update gp_segment_configuration and pg_filespace_entry, so it can add any type of segment given a full specification of the segment and its filespace mappings. In this case, the new gp_add_segment() doesn't generate the dbid and contentid automatically, but relies on the input parameters.

Originally there was a separate code path in gp_add_segment_mirror() to figure out the primary dbid; such logic is actually common with gp_add_segment(), which can add both primaries and mirrors. So we refactored the primary-dbid detection logic into a common function add_segment(), and refactored gp_add_segment_mirror() to use add_segment() instead of add_segment_config_entry().

We updated gpinitsystem to use gp_add_segment() instead of updating the gp_segment_configuration and pg_filespace_entry tables directly via SQL.

Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Xin Zhang
Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Ben Christel
- Move Conan instructions above the manual ORCA setup instructions so that people don't do the manual work first (like we did).
- Inline the recommendation to parallelize make with -j8.
- Fix formatting of code blocks in README.md.

See issues:
- https://github.com/greenplum-db/gpdb/issues/3178
- https://github.com/greenplum-db/gpdb/pull/3332
-
Committed by Kavinder Dhaliwal
This commit will display the contents of the Optimizer Mem Account when the optimizer GUC is on and explain_memory_verbosity is set to 'summary'.

Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
-
Committed by Heikki Linnakangas
When we merged with PostgreSQL 8.3 and got support for NULLS FIRST, we missed or stubbed out some code that is required to honor NULLS FIRST/LAST in an aggregate's ORDER BY, e.g. `select array_agg(a order by b desc nulls last)`.
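A minimal sketch of the kind of query affected (table and data are hypothetical):

```sql
CREATE TABLE t (a int, b int);
INSERT INTO t VALUES (1, 2), (2, NULL), (3, 1);

-- NULLS LAST in the aggregate's ORDER BY must be honored: the row with
-- b = NULL should sort after b = 2 and b = 1 within the aggregated array,
-- giving {1,3,2} rather than {2,1,3}.
SELECT array_agg(a ORDER BY b DESC NULLS LAST) FROM t;
```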
-
Committed by Lisa Owen
-
Committed by Asim R P
Signed-off-by: Taylor Vesely <tvesely@pivotal.io>
-
- 25 Sep 2017, 4 commits
-
-
Committed by Heikki Linnakangas
It wasn't very useful. ORCA and Postgres both just stack WindowAgg nodes on top of each other, and no-one's been unhappy about that, so we might as well do that, too. This reduces the difference between GPDB and the upstream implementation, and will hopefully make it smoother to switch. Rename the Window Plan node type to WindowAgg, to match upstream, now that it is fairly close to the upstream version.
-
Committed by Heikki Linnakangas
To match upstream.
-
Committed by Heikki Linnakangas
This test case could throw either "ROWS parameter cannot be negative", or "Division By Zero", depending on which gets evaluated first. Remove the division by zero error, to make it more predictable.
-
Committed by Richard Guo
This case tests cancelling/terminating queries concurrently while they are running or waiting in a resource group.
-