- 28 Apr 2020, 2 commits
-
-
Committed by Paul Guo
It was removed during the slice refactor work; I noticed this while running the test isolation2/terminate_in_gang_creation. This feature should be quite useful under parallel OLTP load. With this change, isolation2/terminate_in_gang_creation needs to be modified to be deterministic. Tests were also added in regression/dispatch for this change. Changes to other tests are unrelated (just cleanup).

Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
- 11 Jan 2020, 1 commit
-
-
Committed by Heikki Linnakangas
* Build a preliminary "planner slice table" at the end of planning, and attach it to the PlannedStmt. Executor startup turns that into the final executor slice table. This replaces the step where executor startup scanned the whole Plan tree to build the slice table.

* Now that executor startup gets a pre-built planner slice table, it no longer needs the Flow structures for building the slice table. Also refactor the few other remaining places in nodeMotion.c and nodeResult.c that accessed the Flows to use the information from the slice table instead. The executor no longer looks at the Flows at all, so we don't need to include them in the serialized plan tree anymore. The ORCA translator doesn't need to build Flow structures anymore either; instead, it now builds the planner slice table like the Postgres planner does.

* During createplan.c processing, keep track of the current "slice", and attach direct dispatch and other per-slice information to the PlanSlice struct directly, instead of carrying it in the Flow structs. This renders the Flows mostly unused in the planner too, but there is still one thing we use them for: to figure out when we need to add a Motion on top of a SubPlan's plan tree, to make the subplan's result available in the slice where the SubPlan is evaluated. A "sender slice" struct is attached to a Motion during create_plan() processing to represent the sending slice, but the slice ID is not assigned at that stage yet; Motion / slice IDs are assigned later, when the slice table is created.

* Only set 'flow' on the topmost Plan, and in the child of a Motion.

* Remove unused initplans and subplans near the end of planning, after set_plan_references(), but before building the planner slice table. We used to remove them a little earlier, before set_plan_references(), but there was at least one corner case involving inherited tables where a SubPlan could refer to a subplan in the plan tree before set_plan_references(), which set_plan_references() then transformed away. You would end up with an unused subplan in that case, even though we had already removed any unused subplans. This way we don't need to deal with unused slices in the executor.

* Rewrite the logic that accounts for the direct-dispatch cost saving in plan cost.

Reviewed-by: Taylor Vesely <tvesely@pivotal.io>
-
- 16 Dec 2019, 1 commit
-
-
Committed by Zhenghua Lyu
A Delete or Update statement may add motions above the result relation to determine which tuples to delete or update. For example (t1 distributed by (b), t2 distributed by (a)):

delete from t1 using t2 where t1.b = t2.a;

The SQL above might add a motion above t1 (the result relation) when creating the t1 join t2 plan. ExecDelete and ExecUpdate use the ctid to find the tuple, and a ctid only makes sense on its original segment, so Greenplum adds an explicit redistribute motion to send each tuple back before doing the delete or update. Previously, the condition for adding the explicit redistribute motion was that the subplan of the modify table contained no motions. This can be improved: if no motion was added above the result relation, we can elide this motion, namely when the subpath's locus equals the result relation's locus and both are hashed loci.
-
- 21 Nov 2019, 1 commit
-
-
Committed by Gang Xiong
We introduced a 'commit not prepared' command for transactions like: BEGIN; read-only queries; END; and then we introduced 'one-phase commit' for transactions that go to one single segment, like: INSERT INTO tbl VALUES(1); These are actually the same thing, so merge the code together.
-
- 28 May 2019, 1 commit
-
-
Committed by xiong-gang
Currently, an explicit 'BEGIN' creates a full-size writer gang and starts a transaction on it, and the following 'END' commits the transaction in a two-phase way. This can be optimized for some cases:

case 1: BEGIN; SELECT * FROM pg_class; END;
case 2: BEGIN; SELECT * FROM foo; SELECT * FROM bar; END;
case 3: BEGIN; INSERT INTO foo VALUES(1); INSERT INTO bar VALUES(2); END;

For case 1, there is no need to create a gang or to do two-phase commit. For case 2, two-phase commit is unnecessary because the executors don't write any XLOG. For case 3, there is no need to create a full-size writer gang or to run two-phase commit across a full-size gang.

Co-authored-by: Jialun Du <jdu@pivotal.io>
-
- 21 May 2019, 4 commits
-
-
Committed by Gang Xiong
This reverts commit d97a7f6c, because some behave tests failed.
-
Committed by Gang Xiong
-
Committed by Gang Xiong
This reverts commit 0630a9c7.
-
Committed by xiong-gang
Currently, an explicit 'BEGIN' creates a full-size writer gang and starts a transaction on it, and the following 'END' commits the transaction in a two-phase way. This can be optimized for some cases:

case 1: BEGIN; SELECT * FROM pg_class; END;
case 2: BEGIN; SELECT * FROM foo; SELECT * FROM bar; END;
case 3: BEGIN; INSERT INTO foo VALUES(1); INSERT INTO bar VALUES(2); END;

For case 1, there is no need to create a gang or to do two-phase commit. For case 2, two-phase commit is unnecessary because the executors don't write any XLOG. For case 3, there is no need to create a full-size writer gang or to run two-phase commit across a full-size gang.

Co-authored-by: Jialun Du <jdu@pivotal.io>
-
- 15 Mar 2019, 1 commit
-
-
Committed by Ning Yu
We used to do cost calculation with this property, assuming it equals the segment count of the cluster; however, that is wrong when the table is a partial one (which happens during gpexpand). We should always get numsegments from the motion. The gangsize.sql test is updated because in some of its queries the slice order differs from before due to the change in costs.
-
- 11 Mar 2019, 1 commit
-
-
Committed by Ning Yu
This method was introduced to improve data redistribution performance during gpexpand phase 2; however, per benchmark results the effect did not reach our expectation. For example, when expanding a table from 7 segments to 8, the reshuffle method is only 30% faster than the traditional CTAS method, and when expanding from 4 to 8 segments, reshuffle is even 10% slower than CTAS. When there are indexes on the table, the reshuffle performance can be worse, and an extra VACUUM is needed to actually free the disk space. According to our experiments, the bottleneck of the reshuffle method is the tuple deletion operation, which is much slower than the insertion operation used by CTAS.

The reshuffle method does have some benefits: it requires less extra disk space, and it requires less network bandwidth (similar to the CTAS method with the new JCH reduce method, but less than CTAS + MOD). It can also be faster in some cases, but as we cannot automatically determine when it is faster, it is not easy to benefit from it in practice. On the other hand, the reshuffle method is less tested and may have bugs in corner cases, so it is not production ready yet. We therefore decided to retire it entirely for now; we might add it back in the future if we can get rid of the slow deletion or find reliable ways to choose automatically between the reshuffle and CTAS methods.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
- 27 Nov 2018, 1 commit
-
-
Committed by Zhenghua Lyu
Previously the reshuffle node's numsegments was always set to the cluster size. Now that we have the flexible gang & dispatch API, we should correct the numsegments field of the reshuffle node by setting it to its lefttree's flow->numsegments.

Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
-
- 22 Nov 2018, 1 commit
-
-
Committed by Heikki Linnakangas
When determining the locus for a LEFT or RIGHT JOIN, we can use the outer side's distribution key as is. The EquivalenceClasses from the nullable side are not of interest above the join, and the outer side's distribution key can lead to better plans, because it can be made a Hashed locus rather than HashedOJ. A Hashed locus can be used for grouping, for example, unlike a HashedOJ. This buys back better plans for some INSERT and CTAS queries that started to need Redistribute Motions after the previous commit.

Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
-
- 07 Nov 2018, 1 commit
-
-
Committed by ZhangJackey
Now that we have partial tables and the flexible gang API, we can allocate gangs according to numsegments. With commit 4eb65a53, GPDB supports tables distributed on partial segments, and with the series of commits (a3ddac06, 576690f2), GPDB supports the flexible gang API. Now is a good time to combine the two features: create a gang only on the necessary segments for each slice. This commit also improves singleQE gang scheduling and does some code cleanup. However, if ORCA is enabled, the behavior is the same as before. The outline of this commit:

* Modify the FillSliceGangInfo API so that gang_size is truly flexible.
* Remove the numOutputSegs and outputSegIdx fields in the motion node; add a new field isBroadcast to mark whether the motion is a broadcast motion.
* Remove the global variable gp_singleton_segindex and choose the singleQE segment_id randomly (by gp_sess_id).
* Remove the field numGangMembersToBeActive in Slice, because it is now exactly slice->gangsize.
* Modify the message printed when the GUC Test_print_direct_dispatch_info is set.
* An explicit BEGIN now creates a full gang.
* Formatting cleanup; remove destSegIndex.
* Remove the isReshuffle flag in ModifyTable; it is useless, because it is only used when we want to insert a tuple into a segment outside the range of numsegments.

Co-authored-by: Zhenghua Lyu zlv@pivotal.io
-