- 27 September 2017, 13 commits
-
-
Committed by Ashwin Agrawal

After commit 7e268107, we started seeing warnings like:

```
cdbgroup.c:1478:68: warning: expression which evaluates to zero treated as a null pointer constant of type 'List *' (aka 'struct List *') [-Wnon-literal-null-conversion]
    result_plan = (Plan*)make_motion_gather_to_QE(root, result_plan, false);
                                                                     ^~~~~
```
-
Committed by Ashwin Agrawal

As part of commit efed2fcc, a new walrep state 'n' (not-in-sync) was added. The toolkit function gp_pgdatabase__() needs to be modified as well to check for that state. The original code uses direct characters like 's' and 'c' instead of #defines, which makes the relevant spots very tricky to find in the code. Hopefully, future changes will be easier to spot and make.
-
Committed by Shreedhar Hardikar

This was used to keep information about the subquery join tree of pulled-up sublinks, for later use in deconstruct_recurse(). With the upstream subselect merge, a JoinExpr is constructed at pull-up time itself, so this is no longer needed: the subquery join tree information is available in the constructed JoinExpr. Also with the merge, deconstruct_recurse() handles JOIN_SEMI JoinExprs. However, since GPDB differs from upstream by treating SEMI joins as INNER joins for internal join planning, this commit also updates inner_join_rels correctly for SEMI joins (see regression test). Also remove the unused function declaration for not_null_inner_vars(). Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Shreedhar Hardikar

1. convert_IN_to_antijoin() should fail the pull-up when the left relids are not a subset of available_rels; otherwise we get wrong results. See the regression tests in qp_correlated_query.sql.
2. convert_EXPR_to_join() is a GPDB-only function that already handles this case via ProcessSubqueryToJoin().

Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Tom Lane

commit e5536e77
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon Aug 25 22:42:34 2008 +0000

Move exprType(), exprTypmod(), expression_tree_walker(), and related routines into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.

Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>
-
We had a bunch of FIXMEs that we added as part of the subselect merge; all of them are now marked as `GPDB_84_MERGE_FIXME` so that they can be grepped easily.
-
Committed by Dhanashree Kashid

commit dc9cc887b74bfa0d40829c4df66dead509fdd8f6
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Sep 28 14:15:42 2010 -0400

The point of a PlaceHolderVar is to allow a non-strict expression to be evaluated below an outer join, after which its value bubbles up like a Var and can be forced to NULL when the outer join's semantics require that. However, there was a serious design oversight in that, namely that we didn't ensure that there was actually a correct place in the plan tree to evaluate the placeholder :-(. It may be necessary to delay evaluation of an outer join to ensure that a placeholder that should be evaluated below the join can be evaluated there. Per recent bug report from Kirill Simonov. Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.

Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Bhuvnesh Chaudhary

commit e6ae3b5d
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue Oct 21 20:42:53 2008 +0000

Add a concept of "placeholder" variables to the planner. These are variables that represent some expression that we desire to compute below the top level of the plan, and then let that value "bubble up" as though it were a plain Var (ie, a column value). The immediate application is to allow sub-selects to be flattened even when they are below an outer join and have non-nullable output expressions. Formerly we couldn't flatten because such an expression wouldn't properly go to NULL when evaluated above the outer join. Now, we wrap it in a PlaceHolderVar and arrange for the actual evaluation to occur below the outer join. When the resulting Var bubbles up through the join, it will be set to NULL if necessary, yielding the correct results. This fixes a planner limitation that's existed since 7.1. In future we might want to use this mechanism to re-introduce some form of Hellerstein's "expensive functions" optimization, ie place the evaluation of an expensive function at the most suitable point in the plan tree.

Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Dhanashree Kashid

commit e549722a
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed Feb 25 03:30:38 2009 +0000

Get rid of the rather fuzzily defined FlattenedSubLink node type in favor of making pull_up_sublinks() construct a full-blown JoinExpr tree representation of IN/EXISTS SubLinks that it is able to convert to semi or anti joins. This makes pull_up_sublinks() a shade more complex, but the gain in semantic clarity is worth it. I still have more to do in this area to address the previously-discussed problems, but this commit in itself fixes at least one bug in HEAD, as shown by added regression test case.

Ref [#142356521] Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna

commit 19e34b62
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun Aug 17 01:20:00 2008 +0000

Improve sublink pullup code to handle ANY/EXISTS sublinks that are at top level of a JOIN/ON clause, not only at top level of WHERE. (However, we can't do this in an outer join's ON clause, unless the ANY/EXISTS refers only to the nullable side of the outer join, so that it can effectively be pushed down into the nullable side.) Per request from Kevin Grittner. In passing, fix a bug in the initial implementation of EXISTS pullup: it would Assert if the EXISTS's WHERE clause used a join alias variable. Since we haven't yet flattened join aliases when this transformation happens, it's necessary to include join relids in the computed set of RHS relids.

Ref [#142356521] Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna

Since `InClauseInfo` and `OuterJoinInfo` are now combined into `SpecialJoinInfo` after merging with e006a24a, this commit removes them from the relevant places, and accesses `join_info_list` instead of `in_info_list` and `oj_info_list`. Previously, `CdbRelDedupInfo` contained a list of `InClauseInfo`s. While making join decisions and during overall join processing, we traversed this list and invoked the cdb-specific functions `cdb_make_rel_dedup_info()` and `cdbpath_dedup_fixup()`. Since `InClauseInfo` is no longer available, `CdbRelDedupInfo` now contains a list of `SpecialJoinInfo`s. All the cdb-specific routines that were previously called for the `InClauseInfo` list are now called if `CdbRelDedupInfo` has a valid `SpecialJoinInfo` list and the join type in a `SpecialJoinInfo` is `JOIN_SEMI`. A new helper routine, `hasSemiJoin()`, traverses the `SpecialJoinInfo` list to check whether it contains a `JOIN_SEMI` entry. Ref [#142355175] Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Dhanashree Kashid

With the new flow, we no longer need the following functions:
- pull_up_IN_clauses
- convert_EXISTS_to_join
- convert_NOT_EXISTS_to_antijoin
- not_null_inner_vars
- safe_to_convert_NOT_EXISTS
- convert_sublink_to_join

Ref [#142355175] Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Ekta Khanna

```
Original flow:
cdb_flatten_sublinks
  +--> pull_up_IN_clauses
         +--> convert_sublink_to_join

New flow:
cdb_flatten_sublinks
  +--> pull_up_sublinks
```

This commit contains the relevant changes for the above flow. Previously, `try_join_unique` was part of `InClauseInfo`: it was set in `convert_IN_to_join()` and used in `cdb_make_rel_dedup_info()`. Now that `InClauseInfo` is gone, we construct a `FlattenedSublink` instead in `convert_ANY_sublink_to_join()`, and later in the flow construct a `SpecialJoinInfo` from the `FlattenedSublink` in `deconstruct_sublink_quals_to_rel()`. Hence, `try_join_unique` is now part of both `FlattenedSublink` and `SpecialJoinInfo`. Ref [#142355175] Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
-
- 26 September 2017, 4 commits
-
-
Committed by Jacob Champion

Several GUCs are simply enumerated strings that are parsed into integer types behind the scenes. As of 8.4, the GUC system recognizes a new type, enum, which does this for us. Move as many as we can to the new system. As part of this:
- gp_idf_deduplicate was changed from a char* string to an int, and new IDF_DEDUPLICATE_* macros were added for each option
- password_hash_algorithm was changed to an int
- for codegen_optimization_level, "none" is now the default when codegen is not enabled at compile time (instead of the empty string)

A couple of GUCs that *could* be represented as enums (optimizer_minidump, gp_workfile_compress_algorithm) have been purposefully kept on the prior system, because they require the GUC variable to be something other than an integer anyway. Signed-off-by: Jacob Champion <pchampion@pivotal.io>
-
Committed by Heikki Linnakangas

We already had a very simplistic implementation in parse analysis, which converted the FILTER WHERE clause into a CASE WHEN expression. That did not work for non-strict aggregates and did not deparse back into a FILTER expression nicely, to name a few of its problems. Replace it with the PostgreSQL implementation.

TODO:
* ORCA support. It now falls back to the Postgres planner.
* I disabled the three-stage DQA plan types if there are any FILTERs.
-
Committed by Xin Zhang

Signed-off-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Kavinder Dhaliwal

This commit displays the contents of the Optimizer memory account when the optimizer GUC is on and explain_memory_verbosity is set to 'summary'. Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
-
- 25 September 2017, 4 commits
-
-
Committed by Heikki Linnakangas
It wasn't very useful. ORCA and Postgres both just stack WindowAgg nodes on top of each other, and no-one's been unhappy about that, so we might as well do that, too. This reduces the difference between GPDB and the upstream implementation, and will hopefully make it smoother to switch. Rename the Window Plan node type to WindowAgg, to match upstream, now that it is fairly close to the upstream version.
-
Committed by Heikki Linnakangas
To match upstream.
-
Committed by Heikki Linnakangas

A Motion node often needs to "merge" the incoming streams, to preserve the overall sort order. Instead of carrying the sort order information throughout the later stages of planning in the Flow struct, pass it as an argument directly to make_motion() and the other functions where a Motion node is created. This simplifies things. To make that work, we can no longer rely on apply_motion() to add the final Motion on top of the plan when the (sub-)query contains an ORDER BY, because we no longer have that information available at apply_motion(). Add the Motion node in grouping_planner() instead, where we still have that information as a path key. When I started to work on this, it also fixed a bug where the sortColIdx of a plan's Flow node could refer to the wrong resno; a test case for that is included. However, that case was since fixed by other coincidental changes to partition elimination, so now this is just refactoring.
-
Committed by Adam Lee

Replace popen() with popen_with_stderr(), which is also used by external web tables to collect the stderr output of the program. Since popen_with_stderr() forks a `sh` process, it is almost always successful; this commit therefore catches errors that happen in fwrite(). It also passes variables the same way external web tables do. Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
-
- 21 September 2017, 3 commits
-
-
Committed by Heikki Linnakangas

Before, this only worked for cursors declared with DECLARE CURSOR; you got a "there is no parameter $0" error if you tried it otherwise. This moves the decision on whether a plan is "simply updatable" from the parser to the planner. Doing it in the parser was awkward, because we only want to do it for queries that are used in a cursor, and for SPI queries we don't yet know that at parse time. For some reason, the copy, out, and read functions of CurrentOfExpr were missing the cursor_param field. While we're at it, reorder the code to match upstream. This only makes the required changes to the Postgres planner. ORCA has never supported updatable cursors; in fact, it falls back to the Postgres planner on any DECLARE CURSOR command, which is why the existing tests have passed even with optimizer=off.
-
Committed by Heikki Linnakangas
There was code in gp_read_error_log(), to "manually" dispatch the call to all the segments, if it was executed in the dispatcher. This was previously necessary, because even though the function was marked with prodataaccess='s', the planner did not guarantee that it's executed in the segments, when called in the targetlist like "SELECT gp_read_error_log('tab')". Now that we have the EXECUTE ON ALL SEGMENTS syntax, and are more rigorous about enforcing that in the planner, this hack is no longer required.
-
Committed by Bhuvnesh Chaudhary

If an aggregation query uses aliases that are the same as the table's actual column names, the aliases are propagated up from subqueries, and grouping is applied on the column alias, the planner may produce inconsistent target lists for the aggregation plan, causing a crash. For example:

```sql
CREATE TABLE t1 (a int) DISTRIBUTED RANDOMLY;
SELECT substr(a, 2) as a
FROM (SELECT ('-'||a)::varchar as a
      FROM (SELECT a FROM t1) t2) t3
GROUP BY a;
```
-
- 17 September 2017, 1 commit
-
-
Committed by Heikki Linnakangas

In GPDB, we have so far used a WindowFrame struct to represent the start and end window bounds in a ROWS/RANGE BETWEEN clause, while PostgreSQL uses the combination of a frameOptions bitmask and start and end expressions. Refactor to replace the WindowFrame with the upstream representation.
-
- 15 September 2017, 1 commit
-
-
Committed by Heikki Linnakangas

While working on the 8.4 merge, I had a bug that tripped an Insist inside the PG_TRY/PG_CATCH block. That was very difficult to track down, because of the way the error is logged here: using ereport() stamps the filename and line number where the error is re-emitted, not the original location. So all I got was "Unexpected internal error" in the log, with a meaningless filename and line number. This rewrites the way the error is reported, so that it preserves the original filename and line number. It also uses the original error level and preserves all the other fields.
-
- 14 September 2017, 1 commit
-
-
Committed by Heikki Linnakangas
Although I'm not too familiar with SystemTap, I'm pretty sure that recent versions can do user space tracing better. I don't think anyone is using these hacks anymore, so remove them.
-
- 12 September 2017, 2 commits
-
-
Committed by Shreedhar Hardikar

During NOT EXISTS sublink pull-up, we create a one-time false filter when the sublink contains aggregates, without checking the limit count. However, when the sublink contains an aggregate with LIMIT 0, we should not generate such a filter, as it produces incorrect results. Added a regress test. Also, properly initialize all the members of IncrementVarSublevelsUp_context. Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
-
Committed by Bhuvnesh Chaudhary

nMotionNodes tracks the number of Motions in a plan, and each plan node maintains its own nMotionNodes. Counting the Motions in a plan node by traversing the tree and adding up the nMotionNodes found in nested plans gives an incorrect count. So instead of using nMotionNodes, use a boolean flag to track whether the subtree, excluding initplans, contains a Motion node.
-
- 11 September 2017, 2 commits
-
-
Committed by Heikki Linnakangas
In commit d16710ca, I added an optimization to NOT IN subqueries, to remove any DISTINCT and ORDER BY. But on second thoughts, we should use the existing functions to do that. Also, the add_notin_subquery_rte() was not a good place to do it. It's a surprising side-effect for that function. Move the code to convert_IN_to_antijoin(), which is in line with the similar calls in convert_EXPR_to_join() and convert_IN_to_join().
-
Committed by Heikki Linnakangas

It occurred to me while looking at PR #1460 that a DISTINCT or an ORDER BY in a NOT IN (...) subselect won't make any difference to the overall result, so we can strip it off and save the effort. In its current form, PR #1460 would pessimize that case slightly more, by forcing the subselect's result to be gathered onto a single node for deduplication or final ordering, whereas before we would only do a local ordering/deduplication on each segment. But it is a waste of effort to do that even within each segment, and this PR gets rid of that.
-
- 08 September 2017, 2 commits
-
-
Committed by Ashwin Agrawal
With commit cedd89bf "Simplify tuple serialization in Motion nodes.", the usage for this function was removed.
-
Committed by Ashwin Agrawal

As inline functions, these were producing warnings; based on the discussion at https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/6fgKvN9QpV4/zjysjqIZAgAJ, convert them to macros, as upstream does. Add explicit type casts wherever needed, now that DatumGetPointer() returns (char *) instead of (void *).
-
- 07 September 2017, 3 commits
-
-
Committed by Richard Guo

Verify the new values of the GUCs 'statement_mem' and 'max_resource_groups' only if they are actually being set. One step in starting GPDB is to check whether all GUCs are valid with their new values, without actually setting them.
-
Committed by Heikki Linnakangas
In a stand-alone backend ("postgres --single"), you cannot realistically expect any of the infrastructure needed for MPP processing to be present. Let's force a stand-alone backend to run in utility mode, to make sure that we don't try to dispatch queries, participate in distributed transactions, or anything like that, in a stand-alone backend. Fixes github issue #3172, which was one such case where we tried to dispatch a SET command in single-user mode, and got all confused.
-
Committed by Haisheng Yuan

The planner generates plans that don't insert any Motion between a WorkTableScan and its corresponding RecursiveUnion, because Motions are currently not rescannable in GPDB. For example, an MPP plan for a recursive CTE query may look like:

```
Gather Motion 3:1
  -> Recursive Union
       -> Seq Scan on department
            Filter: name = 'A'::text
       -> Nested Loop
            Join Filter: d.parent_department = sd.id
            -> WorkTable Scan on subdepartment sd
            -> Materialize
                 -> Broadcast Motion 3:3
                      -> Seq Scan on department d
```

In the current solution, the WorkTableScan is always put on the outer side of the topmost join (the recursive part of the RecursiveUnion), so that we can safely rescan the inner child of the join without worrying about materializing a potential underlying Motion. This is a heuristic-based plan, not a cost-based plan. Ideally, the WorkTableScan could be placed on either side of the join at any depth, and the plan would be chosen based on the cost of the recursive plan and the number of recursions, but we leave that for later work. Note: hash join is temporarily disabled for plan generation of the recursive part, because if the hash table spills, the batch file is removed as it executes. We have a follow-up story to make a spilled hash table rescannable. See the discussion on the gpdb-dev mailing list: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/s_SoXKlwd6I
-
- 06 September 2017, 2 commits
-
-
Committed by Heikki Linnakangas
CdbDispatchPlan() was making a copy of the plan tree, in the same memory context as the old plan tree was in. If the plan came from the plan cache, the copy will also be stored in the CachedPlan context. That means that every execution of the cached plan will leak a copy of the plan tree in the long-lived memory context. Commit 8b693868 fixed this for cached plans being used directly with the extended query protocol, but it did not fix the same issue with plans being cached as part of a user-defined function. To fix this properly, revert the changes to exec_bind_message, and instead in CdbDispatchPlan, make the copy of the plan tree in a short-lived memory context. Aside from the memory leak, it was never a good idea to change the original PlannedStmt's planTree pointer to point to the modified copy of the plan tree. That copy has had all the parameters replaced with their current values, but on the next execution, we should do that replacement again. I think that happened to not be an issue, because we had code elsewhere that forced re-planning of all queries anyway. Or maybe it was in fact broken. But in any case, stop scribbling on the original PlannedStmt, which might live in the plan cache, and make a temporary copy that we can freely scribble on in CdbDispatchPlan, that's only used for the dispatch.
-
Committed by Heikki Linnakangas
They're not really per-portal settings, so it doesn't make much sense to pass them to PortalStart. And most of the callers were passing savedSeqServerHost/Port anyway. Instead, set the "current" host and port in postgres.c, when we receive them from the QD.
-
- 05 September 2017, 2 commits
-
-
Committed by Ning Yu

* Simplify tuple serialization in Motion nodes.

There is a fast path for tuples that contain no toasted attributes, which writes the raw tuple almost as is. However, the slow path is significantly more complicated, calling each attribute's binary send/receive functions (although there's a fast path for a few built-in datatypes). I don't see any need for calling I/O functions here: we can just write the raw Datum on the wire. If that works for tuples with no toasted attributes, it should work for all tuples, if we just detoast any toasted attributes first. This makes the code a lot simpler, and also fixes a bug with data types that don't have binary send/receive routines. We used to call the regular (text) I/O functions in that case, but didn't handle the resulting cstring correctly. Diagnosis and test case by Foyzur Rahman. Signed-off-by: Haisheng Yuan <hyuan@pivotal.io> Signed-off-by: Ning Yu <nyu@pivotal.io>
-
Committed by Heikki Linnakangas
These are just pro forma, as the location field isn't used for anything after parse analysis, but let's be tidy.
-