1. Apr 28, 2020 (2 commits)
  2. Jan 11, 2020 (1 commit)
    • Refactor the late "parallelization" stages of the planner. · 93abe741
      Heikki Linnakangas authored
      * Build a preliminary "planner slice table" at the end of planning, and
        attach it to the PlannedStmt. Executor startup turns that into the
        final executor slice table. This replaces the step where executor
        startup scanned the whole Plan tree to build the slice table.
      
      * Now that the executor startup gets a pre-built planner slice table, it
        doesn't need the Flow structures for building the slice table anymore.
        Also refactor the few other remaining places in nodeMotion.c and
        nodeResult.c that accessed the Flows to use the information from the
        slice table instead. The executor no longer looks at the Flows at all,
        so we don't need to include them in the serialized plan tree anymore.
        The ORCA translator doesn't need to build Flow structures anymore
        either. Instead, it now builds the planner slice table the same way
        the Postgres planner does.
      
      * During createplan.c processing, keep track of the current "slice", and
        attach direct dispatch and other per-slice information to the PlanSlice
        struct directly, instead of carrying it in the Flow structs. This
        renders the Flows mostly unused in the planner too, but there is still
        one thing we use the Flows for: to figure out when we need to add a
        Motion on top of a SubPlan's plan tree, to make the subplan's result
        available in the slice where the SubPlan is evaluated. There's a "sender
        slice" struct attached to a Motion during create_plan() processing to
        represent the sending slice. But the slice ID is not assigned at that
        stage yet. Motion / Slice IDs are assigned later, when the slice table
        is created.
      
      * Only set 'flow' on the topmost Plan, and in the child of a Motion.
      
      * Remove unused initplans and subplans near the end of planning, after
        set_plan_references(), but before building the planner slice table. We
        used to remove init plans and subplans a little bit earlier, before
        set_plan_references(), but there was at least one corner case involving
        inherited tables where you could have a SubPlan referring to a subplan
        in the plan tree before set_plan_references(), but set_plan_references()
        transformed it away. You would end up with an unused subplan in that
        case, even though we previously removed any unused subplans. This way
        we don't need to deal with unused slices in the executor.
      
      * Rewrite the logic that accounts for the direct-dispatch cost saving in
        the plan cost.
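
      The planner-to-executor handoff described in the first two bullets can be
      sketched roughly as follows. All names here (PlanSlice, ExecSlice,
      build_executor_slice_table) are simplified stand-ins for illustration,
      not Greenplum's actual structs:

      ```python
      from dataclasses import dataclass, field

      @dataclass
      class PlanSlice:
          slice_id: int
          parent_id: int      # -1 for the root slice
          numsegments: int    # segments this slice runs on

      @dataclass
      class ExecSlice:
          slice_id: int
          parent_id: int
          gang_size: int = 0
          children: list = field(default_factory=list)

      def build_executor_slice_table(plan_slices):
          """Turn the planner slice table into the executor slice table,
          wiring up parent/child links and gang sizes without walking the
          Plan tree. Assumes slice_id equals the slice's index in the list,
          as ids are assigned sequentially."""
          table = [ExecSlice(s.slice_id, s.parent_id, gang_size=s.numsegments)
                   for s in plan_slices]
          for s in table:
              if s.parent_id >= 0:
                  table[s.parent_id].children.append(s.slice_id)
          return table
      ```

      The point of the sketch is that executor startup only iterates the flat
      slice list, rather than re-deriving slices from Flow nodes in the plan.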
      Reviewed-by: Taylor Vesely <tvesely@pivotal.io>
      93abe741
  3. Dec 16, 2019 (1 commit)
    • Elide explicit motion when result relation locus is not changed. · a46400ef
      Zhenghua Lyu authored
      A Delete or Update statement may add motions above the result relation
      to determine which tuples to delete or update. For example,
      
      -- t1 distributed by (b), t2 distributed by (a)
      delete from t1 using t2 where t1.b = t2.a;
      The SQL above might add a motion above t1 (the result relation) when
      creating the plan for t1 join t2.
      
      ExecDelete and ExecUpdate use the ctid to find the tuple, and a ctid only
      makes sense on its original segment. So Greenplum adds an explicit
      redistribute motion to send each tuple back to where it came from, and
      then performs the delete or update.
      
      Previously, the condition for adding the explicit redistribute motion was
      that there were no motions in the subplan of the ModifyTable. This can be
      improved: if no motion is added above the result relation, we can elide
      this motion; that is the case when the subpath's locus is equal to the
      result relation's locus and both are Hashed loci.
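      The improved condition can be expressed as a small predicate. This is an
      illustrative sketch only; the Locus type and field names are invented
      here and simplify Greenplum's actual CdbPathLocus:

      ```python
      from dataclasses import dataclass

      @dataclass(frozen=True)
      class Locus:
          loctype: str        # e.g. "Hashed", "Strewn", "Entry"
          distkeys: tuple     # distribution key expressions, as strings here
          numsegments: int

      def can_elide_explicit_motion(subpath: Locus, result_rel: Locus) -> bool:
          """Elide the explicit redistribute motion when the subpath's locus
          equals the result relation's locus and both are Hashed: every tuple
          is then already on the segment that owns its ctid."""
          return (subpath.loctype == "Hashed"
                  and result_rel.loctype == "Hashed"
                  and subpath == result_rel)
      ```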
      a46400ef
  4. Nov 21, 2019 (1 commit)
    • Merge 'one-phase commit' and 'commit not prepared' · 6e0d5998
      Gang Xiong authored
      We introduced the 'commit not prepared' command for transactions like:
         BEGIN;
         read-only queries;
         END;
      We then introduced 'one-phase commit' for a transaction that goes to a
      single segment, like:
         INSERT INTO tbl VALUES(1);
      They are actually the same thing, so merge the code together.
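      The merged decision can be sketched as one dispatcher-side function.
      Names and the string return values are invented for illustration; this
      is not Greenplum's actual code:

      ```python
      def choose_commit_protocol(segments_with_writes: int) -> str:
          """Pick the commit protocol. Read-only transactions (0 writing
          segments) and transactions writing on a single segment can both
          skip PREPARE; only multi-segment writes need two-phase commit."""
          if segments_with_writes <= 1:
              # 'commit not prepared' and 'one-phase commit' share this
              # path after the merge.
              return "one-phase"
          return "two-phase"
      ```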
      6e0d5998
  5. May 28, 2019 (1 commit)
    • Optimize explicit transactions · b43629be
      xiong-gang authored
      Currently, an explicit 'BEGIN' creates a full-size writer gang and starts
      a transaction on it; the following 'END' then commits the transaction in
      a two-phase manner. This can be optimized for some cases:
      case 1:
      BEGIN;
      SELECT * FROM pg_class;
      END;
      
      case 2:
      BEGIN;
      SELECT * FROM foo;
      SELECT * FROM bar;
      END;
      
      case 3:
      BEGIN;
      INSERT INTO foo VALUES(1);
      INSERT INTO bar VALUES(2);
      END;
      
      For case 1, it is unnecessary to create a gang, and there is no need for
      two-phase commit.
      For case 2, two-phase commit is unnecessary because the executors don't
      write any XLOG.
      For case 3, there is no need to create a full-size writer gang and run
      two-phase commit on it.
      Co-authored-by: Jialun Du <jdu@pivotal.io>
      b43629be
  6. May 21, 2019 (4 commits)
  7. Mar 15, 2019 (1 commit)
    • Retire PlannerConfig::cdbpath_segments · cd4c83a4
      Ning Yu authored
      We used to base cost calculations on this property, which is equal to the
      segment count of the cluster; however, this is wrong when the table is a
      partial one (which happens during gpexpand). We should always get
      numsegments from the motion.
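      A toy illustration of why the costing must use the motion's own
      numsegments; the function name and cost model here are made up for the
      sketch:

      ```python
      def motion_cost_per_segment(total_rows: float, motion_numsegments: int,
                                  per_row_cost: float = 1.0) -> float:
          """Per-segment cost of a motion: rows are divided across the
          segments the motion actually targets, so the divisor must be the
          motion's own numsegments, not the cluster-wide segment count."""
          return per_row_cost * total_rows / motion_numsegments
      ```

      During gpexpand a partial table may occupy, say, 4 of 8 segments;
      dividing by the cluster size of 8 would halve the estimated per-segment
      cost incorrectly.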
      
      The gangsize.sql test is updated because the slice order in some of its
      queries differs from before due to the changed costs.
      cd4c83a4
  8. Mar 11, 2019 (1 commit)
    • Retire the reshuffle method for table data expansion (#7091) · 1c262c6e
      Ning Yu authored
      This method was introduced to improve the data redistribution
      performance during gpexpand phase 2; however, benchmark results show
      that the effect does not meet our expectations. For example, when
      expanding a table from 7 segments to 8, the reshuffle method is only 30%
      faster than the traditional CTAS method, and when expanding from 4 to 8
      segments reshuffle is even 10% slower than CTAS. When there are indexes
      on the table the reshuffle performance can be worse, and an extra VACUUM
      is needed to actually free the disk space. According to our experiments,
      the bottleneck of the reshuffle method is the tuple deletion operation,
      which is much slower than the insertion operation used by CTAS.
      
      The reshuffle method does have some benefits: it requires less extra
      disk space, and it also requires less network bandwidth (similar to the
      CTAS method with the new JCH reduce method, but less than CTAS + MOD).
      It can also be faster in some cases; however, as we cannot automatically
      determine when it is faster, it is not easy to benefit from it in
      practice.
      
      On the other hand, the reshuffle method is less tested and may have bugs
      in corner cases, so it is not production-ready yet.
      
      Given all this, we decided to retire it entirely for now; we might add
      it back in the future if we can get rid of the slow deletion or find a
      reliable way to automatically choose between the reshuffle and CTAS
      methods.
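      The "JCH reduce method" mentioned above refers to jump consistent hash.
      A minimal Python sketch of that published algorithm (not Greenplum's
      implementation) shows why CTAS-based expansion moves so little data:

      ```python
      def jump_consistent_hash(key: int, num_segments: int) -> int:
          """Jump consistent hash: when the segment count grows from n to
          n+1, only about 1/(n+1) of the keys map to a new segment, so an
          expansion only needs to move that fraction of the rows."""
          b, j = -1, 0
          while j < num_segments:
              b = j
              # 64-bit linear congruential step on the key
              key = (key * 2862933555777941757 + 1) % (1 << 64)
              j = int(float(b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
          return b
      ```

      By contrast, a plain MOD mapping reassigns almost every key when the
      segment count changes, which is the extra network bandwidth the commit
      message refers to.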
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8xknWag-SkI/5OsIhZWdDgAJ
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      1c262c6e
  9. Nov 27, 2018 (1 commit)
  10. Nov 22, 2018 (1 commit)
    • Pick a smarter Hashed locus for LEFT and RIGHT JOINs. · 3d6c78c9
      Heikki Linnakangas authored
      When determining the locus for a LEFT or RIGHT JOIN, we can use the outer
      side's distribution key as is. The EquivalenceClasses from the nullable
      side are not of interest above the join, and the outer side's distribution
      key can lead to better plans, because it can be made a Hashed locus,
      rather than HashedOJ. A Hashed locus can be used for grouping, for
      example, unlike a HashedOJ.
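      The locus choice can be caricatured as follows. This is a deliberate
      simplification with invented names, and the FULL-join case is an
      assumption of the sketch rather than something the commit states:

      ```python
      def result_locus(jointype: str, outer_locus: str) -> str:
          """For LEFT and RIGHT joins, the outer side's distribution key
          columns are never nulled by the join, so the result can keep the
          outer side's plain Hashed locus. A FULL join, where both sides are
          nullable, still gets the weaker HashedOJ locus."""
          if jointype == "FULL":
              return "HashedOJ"
          return outer_locus
      ```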
      
      This buys back better plans for some INSERT and CTAS queries that
      started to need Redistribute Motions after the previous commit.
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
      3d6c78c9
  11. Nov 07, 2018 (1 commit)
    • Adjust gang size according to numsegments · 6dd2759a
      ZhangJackey authored
      Now we have partial tables and a flexible gang API, so we can allocate
      gangs according to numsegments.
      
      With commit 4eb65a53, GPDB supports tables distributed on partial
      segments, and with the series of commits (a3ddac06, 576690f2), GPDB
      supports a flexible gang API. Now is a good time to combine the two new
      features: the goal is to create a gang only on the necessary segments
      for each slice. This commit also improves singleQE gang scheduling and
      does some code cleanup. However, if ORCA is enabled, the behavior is the
      same as before.
      
      The outline of this commit is:
      
        * Modify the FillSliceGangInfo API so that gang_size is truly flexible.
        * Remove the numOutputSegs and outputSegIdx fields in the Motion node,
           and add a new field isBroadcast to mark whether the motion is a
           broadcast motion.
        * Remove the global variable gp_singleton_segindex and choose the
           singleQE segment_id randomly (based on gp_sess_id).
        * Remove the field numGangMembersToBeActive in Slice, because it is now
           exactly slice->gangsize.
        * Modify the message printed when the GUC Test_print_direct_dispatch_info
           is set.
        * An explicit BEGIN now creates a full gang.
        * Format code and remove destSegIndex.
        * Remove the isReshuffle flag in ModifyTable; it is useless because it
           is only used when inserting a tuple into a segment that is outside
           the range of numsegments.
      
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
      6dd2759a