1. 08 Oct 2019: 1 commit
  2. 04 Oct 2019: 3 commits
    • Remove redundant lines from partition_ddl test · 6958b7e8
      Committed by Asim R P
      Spotted when analyzing a CI failure pertaining to another test.
      
      Reviewed by Heikki and Georgios.
      6958b7e8
    • Strengthen the pattern to find auxiliary tables by name · eca68bbc
      Committed by Asim R P
      The previous pattern used by the test was not strong enough: it could
      accidentally match the names of partitioned tables. The name of a partition
      being added is generated by suffixing a random number if no name is
      specified by the user, e.g. "sales_1_prt_r1171829080_2_prt_usa". The test
      failed in CI at least once because of this weakness.
      
      The new pattern is strong enough to match only the auxiliary table
      names that end with "_<oid>".
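
      As an illustration only (the schema and the exact regex below are assumptions,
      not taken from the commit), a pattern anchored on a trailing "_<digits>" suffix
      can be checked against the catalog like this:

          -- Hypothetical sketch: list AO auxiliary tables whose names end in "_<oid>",
          -- without also matching partition names that merely contain random numbers.
          SELECT c.relname
          FROM   pg_class c
          JOIN   pg_namespace n ON n.oid = c.relnamespace
          WHERE  n.nspname = 'pg_aoseg'
          AND    c.relname ~ '_[0-9]+$';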
      
      Reviewed by Heikki and Georgios.
      eca68bbc
    • Force test to generate a plan similar to upstream's. · eff88417
      Committed by Heikki Linnakangas
      This test is about MergeAppends; there's even a comment saying "we want a
      plan with two MergeAppends". enable_mergejoin defaults to off in GPDB, so
      we have to enable it to get the same plan as in upstream.
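
      A minimal sketch of the idea (the table names are placeholders, not the actual
      regression test):

          -- enable_mergejoin defaults to off in GPDB; turn it on so the planner can
          -- choose a merge join fed by MergeAppend paths, as the upstream test expects.
          SET enable_mergejoin = on;
          EXPLAIN (COSTS OFF)
          SELECT * FROM part_a a JOIN part_b b USING (id) ORDER BY id;
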
      eff88417
  3. 02 Oct 2019: 4 commits
  4. 26 Sep 2019: 2 commits
    • Fix GRANT/REVOKE ALL statement PANIC when the schema contains partitioned relations · 7ba2af39
      Committed by Georgios Kokolatos
      The cause of the PANIC was an incorrectly populated list containing the
      namespace information for the affected relation. A GrantStmt contains the
      necessary objects in a list named objects. This gets initially populated during
      parsing (via the privilege_target rule) and is processed during parse analysis,
      based on the target type and object type, into RangeVar nodes, FuncWithArgs
      nodes, or plain names.
      
      In Greenplum, the catalog information about the partition hierarchies is not
      propagated to all segments. This information needs to be processed in the
      dispatcher and added back into the parsed statement for the segments to
      consume.
      
      In this commit, the partition hierarchy information is expanded only for the
      required target and object types. For those types, the parsed statement is
      updated with the partition references unconditionally before dispatching.
      
      The privileges tests have been updated to also check for privileges on the
      segments.
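
      A hedged sketch of the scenario (schema, table, and role names are made up for
      illustration):

          -- A schema containing a partitioned relation; before this fix, the GRANT
          -- below could PANIC because the partition hierarchy was mis-handled on dispatch.
          CREATE SCHEMA sales_schema;
          CREATE ROLE report_reader;
          CREATE TABLE sales_schema.sales (id int, region text)
              DISTRIBUTED BY (id)
              PARTITION BY LIST (region)
              (PARTITION usa VALUES ('usa'), DEFAULT PARTITION other);
          GRANT ALL ON ALL TABLES IN SCHEMA sales_schema TO report_reader;
          -- Spot-check that the grant reached the leaf partitions as well.
          SELECT c.relname, c.relacl
          FROM   pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
          WHERE  n.nspname = 'sales_schema';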
      
      Problem identified and initial patch by Fenggang <ginobiliwang@gmail.com>,
      reviewed and refactored by me.
      7ba2af39
    • Fix crash in COPY FROM for non-distributed/non-replicated table · 6793882b
      Committed by Ashwin Agrawal
      The current code for COPY FROM picks COPY_DISPATCH mode for a
      non-distributed/non-replicated table as well. This causes a crash. It
      should use COPY_DIRECT, the normal/direct mode for such tables.

      The crash was exposed by the following SQL commands:
      
          CREATE TABLE public.heap01 (a int, b int) distributed by (a);
          INSERT INTO public.heap01 VALUES (generate_series(0,99), generate_series(0,98));
          ANALYZE public.heap01;
      
          COPY (select * from pg_statistic where starelid = 'public.heap01'::regclass) TO '/tmp/heap01.stat';
          DELETE FROM pg_statistic where starelid = 'public.heap01'::regclass;
          COPY pg_statistic from '/tmp/heap01.stat';
      
      Important note: yes, it is known and strongly recommended not to touch
      `pg_statistic` or any other catalog table this way. But it's no good to
      panic either. After this change, the COPY into `pg_statistic` ERRORs out
      "correctly" with `cannot accept a value of type anyarray` instead of
      crashing, as there just isn't any way at the SQL level to insert data
      into pg_statistic's anyarray columns. Refer:
      https://www.postgresql.org/message-id/12138.1277130186%40sss.pgh.pa.us
      6793882b
  5. 25 Sep 2019: 1 commit
  6. 24 Sep 2019: 7 commits
    • Fix issue for "grant all on all tables in schema xxx to yyy;" · ba6148c6
      Committed by Fenggang
      It has been discovered in GPDB v6 and above that a 'GRANT ALL ON ALL TABLES IN
      SCHEMA XXX TO YYY;' statement will lead to a PANIC.

      From the resulting coredumps, now-obsolete code in the QD that tried to encode
      objects in a partition reference into RangeVars was identified as the culprit.
      The list that the resulting vars were anchored to was expecting and handling only
      StrVars. The original code was added on the premise that catalog
      information was not available in segments. It also tried to optimise caching,
      yet the code was not fully written.

      Instead, the offending block is removed, which solves the issue and allows for
      greater alignment with upstream.
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
      ba6148c6
    • Allow multiple SubPlan references to a single subplan. · 50239de1
      Committed by Heikki Linnakangas
      In PostgreSQL, there can be multiple SubPlan expressions referring to the
      outputs of the same subquery, but this mechanism had been lobotomized in
      GPDB. There was a pass over the plan tree, fixup_subplans(), that
      duplicated any subplans that were referred to more than once, and the rest
      of the GPDB planner and executor code assumed that there is only one
      reference to each subplan. Refactor the GPDB code, mostly cdbparallelize(),
      to remove that assumption, and stop duplicating SubPlans.
      
      * In cdbparallelize(), instead of immediately recursing into the plan tree
        of each SubPlan, process the subplan list in glob->subplans as a separate
        pass. Add a new 'recurse_into_subplans' argument to plan_tree_walker() to
        facilitate that; all other callers pass 'true' so that they still recurse.
      
      * Replace the SubPlan->qDispSliceId and initPlanParallel fields with new
        arrays in PlannerGlobal.
      
      * In FillSliceTable(), keep track of which subplans have already been
        recursed into, and only recurse on first encounter. (I've got a feeling
        that the executor startup is doing more work than it should need to,
        to set up the slice table. The slice information is available when the
        plan is built, so why does the executor need to traverse the whole plan
        to build the slice table? But I'll leave refactoring that for another
        day..)
      
      * Move the logic to remove unused subplans into cdbparallelize(). This
        used to be done as a separate pass from standard_planner(), but after
        refactoring cdbparallelize(), it is now very convenient and logical to do
        the unused subplan removal there, too.
      
      * Early in the planner, wrap SubPlan references in PlaceHolderVars. This
        is needed in case a SubPlan reference gets duplicated to two different
        slices. A single subplan can only be executed from one slice, because
        the motion nodes in the subplan are set up to send to a particular parent
        slice. The PlaceHolderVar makes sure that the SubPlan is evaluated only
        once, and if it's needed above the bottommost Plan node where it's
        evaluated, its value is propagated to the upper Plan nodes in the
        targetlists.
      
      There are many other plan tree walkers that still recurse to subplans from
      every SubPlan reference, but AFAICS recursing twice is harmless for all
      of them. Would be nice to refactor them, too, but I'll leave that for
      another day.
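
      For orientation only (the tables and query below are made up, not from the commit
      or its tests), this is the general shape of a query whose qual contains a SubPlan;
      how many SubPlan expression nodes end up pointing at the same underlying subplan
      is up to the planner:

          -- Hypothetical tables; the subquery in the WHERE clause shows up in EXPLAIN
          -- output as a SubPlan. The commit lets several SubPlan references share one
          -- entry in glob->subplans instead of duplicating the subplan per reference.
          CREATE TABLE facts (id int, val int) DISTRIBUTED BY (id);
          CREATE TABLE dims  (id int, lim int) DISTRIBUTED BY (id);
          EXPLAIN (COSTS OFF)
          SELECT *
          FROM   facts f
          WHERE  f.val > (SELECT max(d.lim) FROM dims d WHERE d.id = f.id);
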
      Reviewed-by: Bhuvnesh Chaudhary <bhuvnesh2703@gmail.com>
      Reviewed-by: Richard Guo <riguo@pivotal.io>
      50239de1
    • Replace nodeSubplan cmockery test with a fault injection case. · dc345256
      Committed by Heikki Linnakangas
      The mock setup in the old test was very limited: the Node structs it set
      up were left as zeros, and were even allocated with incorrect lengths (SubPlan
      vs SubPlanState). It worked just enough for the codepath it was
      testing, but IMHO it's better to test the error "in vivo", and it requires
      less setup, too. So remove the mock test and replace it with a fault
      injector test that exercises the same codepath.
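
      The general shape of such a test (a sketch; 'nodesubplan_fault' is a placeholder
      name, not the fault point the commit actually uses, and it assumes the
      gp_inject_fault extension):

          -- Arm a fault on the master, run a query that reaches the faulted codepath
          -- and expect a clean ERROR, then reset the fault.
          SELECT gp_inject_fault('nodesubplan_fault', 'error', dbid)
          FROM   gp_segment_configuration WHERE role = 'p' AND content = -1;
          -- ... run the query exercising the SubPlan codepath here ...
          SELECT gp_inject_fault('nodesubplan_fault', 'reset', dbid)
          FROM   gp_segment_configuration WHERE role = 'p' AND content = -1;
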
      dc345256
    • Use root's stat info instead of largest child's. · 8ca6c8d1
      Committed by Zhenghua Lyu
      Currently, for partitioned tables we maintain some
      stat info for the root table if the GUC optimizer_analyze_root_partition
      is set, so that we can use the root's stat info directly.
      
      Previously we used the largest child's stat info for the root partition.
      This can lead to serious issues. Consider a partitioned table t where
      all data with a NULL partition key goes into the default partition, and
      it happens to be the largest child. Then, for a query that joins t with
      another table on the partition key, we will estimate a result size of 0,
      because we use the default partition's stat info, which contains only
      NULL partition keys. What is worse, we may broadcast the join result.
      
      This commit fixes this issue but leaves some future work to do:
      maintain STATISTIC_KIND_MCELEM and STATISTIC_KIND_DECHIST for the root
      table. This commit sets the GUC gp_statistics_pullup_from_child_partition
      to false by default. Now the whole logic is (see the sketch after this list):
        * if gp_statistics_pullup_from_child_partition is true, we try to
          use the largest child's stats
        * if gp_statistics_pullup_from_child_partition is false, we first
          try to fetch the root's stats:
            - if the root contains stat info, that's fine, we just use it
            - otherwise, we still try to use the largest child's stats
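
      A minimal sketch of how this can be observed (the table name is a placeholder;
      the GUCs are the ones named above):

          -- Collect root-level statistics and prefer them over the largest child's.
          SET optimizer_analyze_root_partition = on;
          SET gp_statistics_pullup_from_child_partition = off;  -- the new default
          ANALYZE t;  -- 't' stands in for a partitioned table
          -- Root-level rows in pg_statistic mean the planner can use the root's
          -- stats rather than the default partition's.
          SELECT staattnum, stanullfrac
          FROM   pg_statistic
          WHERE  starelid = 't'::regclass;
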
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      8ca6c8d1
    • Omit slice information for SubPlans that are not dispatched separately. · 96c6d318
      Committed by Heikki Linnakangas
      Printing the slice information makes sense for Init Plans, which are
      dispatched separately, before the main query. But not so much for other
      Sub Plans, which are just part of the plan tree; there is no dispatching
      or motion involved at such SubPlans. The SubPlan might *contain* Motions,
      but we print the slice information for those Motions separately. The slice
      information was always just the same as the parent node's, which adds no
      information, and can be misleading if it makes the reader think that there
      is inter-node communication involved in such SubPlans.
      96c6d318
    • Avoid gp_tablespace_with_faults test failure by pg_switch_xlog() · efd76c4c
      Committed by Ashwin Agrawal
      The gp_tablespace_with_faults test writes a no-op record and waits for the
      mirror to replay it before deleting the tablespace directories. This step
      sometimes fails in CI and causes flaky behavior. This is due to existing
      code behavior in the startup and walreceiver processes. If the primary
      writes a big xlog record (one spanning multiple pages), flushes only part
      of the record via XLogBackgroundFlush(), and then restarts before
      committing the transaction, the mirror receives only the partial record
      and waits to get the complete record. Meanwhile, after recovery, the
      no-op record gets written in place of that big record, and the startup
      process on the mirror keeps waiting to receive xlog beyond the previously
      received point before proceeding further.
      
      Hence, as a temporary workaround until the actual code problem is
      resolved, and to avoid failures for this test, switch xlog before
      emitting the no-op xlog record, so that the no-op record lands far away
      from the previously emitted xlog record.
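
      Roughly, the workaround amounts to something like this (a sketch only; the
      helpers the real test uses around it are omitted):

          -- Force a WAL segment switch so that the no-op record the test emits next
          -- is written far away from any previously emitted, possibly partially
          -- flushed, record.
          SELECT pg_switch_xlog();
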
      efd76c4c
    • Fix CTAS with gp_use_legacy_hashops GUC · 9040f296
      Committed by Jimmy Yih
      When the gp_use_legacy_hashops GUC was set, CTAS would not assign the
      legacy hash operator class to the new table. This is because CTAS goes
      through a different code path and uses the first operator class of the
      SELECT's result when no distribution key is provided.
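
      For illustration (the table names are made up, and the distclass column assumes
      the GPDB 6 gp_distribution_policy catalog layout), the behavior can be checked by
      comparing the operator classes recorded for the distribution keys:

          SET gp_use_legacy_hashops = on;
          CREATE TABLE t_src (a int, b int) DISTRIBUTED BY (a);
          CREATE TABLE t_ctas AS SELECT * FROM t_src;  -- no DISTRIBUTED BY given
          -- Both tables should record the legacy hash operator class for column "a";
          -- before the fix, the CTAS table could end up with the default opclass.
          SELECT localoid::regclass, distclass
          FROM   gp_distribution_policy
          WHERE  localoid IN ('t_src'::regclass, 't_ctas'::regclass);
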
      9040f296
  7. 23 Sep 2019: 2 commits
    • Make estimate_hash_bucketsize MPP-correct · d6a567b4
      Committed by Zhenghua Lyu
      In Greenplum, when estimating costs, most of the time we are
      in a global view, but sometimes we should shift to a local
      view. Postgres does not suffer from this issue because everything
      is in one single segment.
      
      The function `estimate_hash_bucketsize` comes from Postgres and
      plays a very important role in the cost model of hash join.
      It should output a result based on a local view. However, the
      input parameters, like the number of rows in a table and the ndistinct
      of the relation, are all taken from a global view (from all segments).
      So, we have to do some compensation for it. The logic is:
        1. for a broadcast-like locus, the global ndistinct is the same
           as the local one; we do the compensation by `ndistinct*=numsegments`.
        2. for the case where the hash key is collocated with the locus, on each
           segment there are `ndistinct/numsegments` distinct groups, so
           there is no need to do the compensation.
        3. otherwise, the locus has to be partitioned and not collocated with
           the hash keys; for these cases, we first estimate the local distinct
           group number, and then do the compensation.
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      d6a567b4
  8. 21 Sep 2019: 1 commit
    • Enable Init Plans in queries executed locally in QEs. · 98c8b550
      Committed by Heikki Linnakangas
      I've been wondering for some time why we have disabled constructing Init
      Plans in queries that are planned in QEs, like in SPI queries that run in
      user-defined functions. So I removed the diff vs upstream in
      build_subplan() to see what happens. It turns out it was because we always
      ran the ExtractParamsFromInitPlans() function in QEs, to get the InitPlan
      values that the QD sent with the plan, even for queries that were not
      dispatched from the QD but planned locally. Fix the call in InitPlan to
      only call ExtractParamsFromInitPlans() for queries that were actually
      dispatched from the QD, and allow QE-local queries to build Init Plans.
      
      Include a new test case, for clarity, even though there were some existing
      ones that incidentally covered this case.
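
      For context (an illustrative query, not the commit's test case), an uncorrelated
      subquery like the one below is typically planned as an Init Plan; the commit lets
      the same thing happen when such a query is planned locally inside a QE, e.g. via
      SPI in a user-defined function:

          CREATE TABLE items (id int, qty int) DISTRIBUTED BY (id);
          EXPLAIN (COSTS OFF)
          SELECT * FROM items WHERE qty > (SELECT avg(qty) FROM items);
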
      98c8b550
  9. 20 Sep 2019: 3 commits
  10. 19 Sep 2019: 5 commits
  11. 17 Sep 2019: 1 commit
    • Single set command failed, rollback guc value · 73cab2cf
      Committed by Weinan WANG
      * If a single SET command fails, roll back the GUC value

      In gpdb, the GUC set flow is:
      	1. the QD sets the GUC
      	2. the QD dispatches the job to all QEs
      	3. the QEs set the GUC

      For a single SET command, gpdb is not 2PC-safe. If the SET command fails
      on a QE, only the QD's GUC value can be rolled back. A GUC that has the
      GUC_GPDB_NEED_SYNC flag requires the value to be the same across the whole session.

      To deal with this, record the rolled-back GUCs in AbortTransaction into a
      restore GUC list, and re-set these GUCs when the next query comes in.
      However, if that fails again, destroy all QEs, since we cannot keep the same
      value in that session. Hopefully, the GUC value can then be synchronized
      successfully in a later creategang stage using the '-c' option.
      73cab2cf
  12. 13 Sep 2019: 1 commit
    • Replace wait_for_trigger_fault() with gp_wait_until_triggered_fault() · 854233bb
      Committed by Ashwin Agrawal
      The old wait_for_trigger_fault() in setup.sql is no longer
      needed; gp_wait_until_triggered_fault() now provides the same
      functionality in much better shape. Hence, delete
      wait_for_trigger_fault().

      Existing usages of wait_for_trigger_fault() are replaced with
      gp_wait_until_triggered_fault().
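
      Typical usage of the built-in helper looks roughly like this (the fault name is a
      placeholder, and the argument order is my assumption of the common test idiom):

          -- Wait until the named fault has been hit at least once on the primary
          -- segment with content id 0.
          SELECT gp_wait_until_triggered_fault('some_fault_name', 1, dbid)
          FROM   gp_segment_configuration
          WHERE  role = 'p' AND content = 0;
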
      854233bb
  13. 12 Sep 2019: 8 commits
    • Avoid flakiness for vacuum_drop_phase_ao test · 84461b97
      Committed by Ashwin Agrawal
      The "0U: END" step should only be executed after VACUUM has started waiting
      for the lock. If it is executed before that, VACUUM will not wait for the lock,
      which invalidates the test and makes it flaky, failing with the
      following diff:
      
      --- /tmp/build/e18b2f02/gpdb_src/src/test/isolation2/expected/vacuum_drop_phase_ao.out    2019-09-05 00:14:41.580197372 +0000
      +++ /tmp/build/e18b2f02/gpdb_src/src/test/isolation2/results/vacuum_drop_phase_ao.out    2019-09-05 00:14:41.580197372 +0000
      @@ -32,11 +32,12 @@
       DELETE 4
       -- We should see that VACUUM blocks while the QE holds the access shared lock
       1&: VACUUM ao_test_drop_phase;  <waiting ...>
      +FAILED:  Forked command is not blocking; got output: VACUUM
      
       0U: END;
       END
       1<:  <... completed>
      -VACUUM
      +FAILED:  Execution failed
      
      This happens because "&:" in the isolation2 framework instructs that step
      to run in the background, meaning the next step, in a separate session,
      gets executed in parallel with this step.
      
      To resolve the situation, add a helper
      wait_until_waiting_for_required_lock() to check whether the query has reached
      the intended blocking point, which is waiting for the lock to be
      granted. Only after this state is reached do we execute the next
      command to unblock it.
      
      Currently, the isolation2 framework does not provide direct support
      for such blocking behavior. Hence, for the short term I feel adding the
      common helper function is a good step. In the long term we can, similar to
      the isolation framework, add something simple and generic for this, though
      I like the explicit checking for the exact lock type, relation,
      etc. that the current helper function provides.
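
      The core check such a helper needs is roughly this (a sketch; the exact lock mode
      and the retry loop that would wrap it are assumptions, not the commit's actual
      implementation):

          -- True once some backend is waiting for, but has not yet been granted, the
          -- lock on the table the test VACUUMs; the helper would poll this until true.
          SELECT count(*) > 0 AS vacuum_is_waiting
          FROM   pg_locks l JOIN pg_class c ON c.oid = l.relation
          WHERE  c.relname = 'ao_test_drop_phase'
          AND    l.mode = 'AccessExclusiveLock'
          AND    NOT l.granted;
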
      84461b97
    • Optimize the cost model of multi-stage AGG · 9936ca3b
      Committed by Jinbao Chen
      There are some problems with the old multi-stage agg cost model:
      1. We use the global group number and global work memory to estimate
      the number of spilled tuples. But the situation between the first
      stage agg and the second stage agg is completely different.
      Tuples are randomly distributed on the group key in the first stage
      agg, so the number of groups on each segment is almost equal to the
      number of global groups. But in the second stage agg, the distribution
      key is a subset of the group key, so the number of groups on each
      segment is equal to (number of global groups / segment number). So the
      old code can cause huge cost deviation.
      2. Using ((group number + input rows) / 2) as the number of spilled
      tuples is too rough.
      3. Using global group number * 1.3 as the output rows of the streaming
      agg node is very wrong. The output rows of the streaming agg node
      should be group number * segment number * param.
      4. We use numGroups to estimate the initial size of the hash table
      in the exec node. But numGroups is the global group number.
      
      So we made the following changes:
      1. Use a funtion 'groupNumberPerSegemnt' to estimate the group
      number per segment on first stage agg. Use numGroups/segment number
      as the group number per segment on second stage agg.
      2. Use funtion 'spilledGroupNumber' to estimate spilled tuple number.
      3. Use spilled tuple number * segment number as output tuple number
      of streaming agg node.
      4. Use numGroups as group number per segment.
      
      Also, we have information on the number of tuples in the top N groups, so we
      can predict the maximum number of tuples on the biggest segment
      when skew occurs. When we can predict skew, enable the 1-phase agg.
      Co-authored-by: Zhenghua Lyu <kainwen@gmail.com>
      9936ca3b
    • Do not hard code distrib values in partition deadlock tests · a53169ce
      Committed by Ning Yu
      The partition deadlock tests use hard-coded distribution values to form
      waiting relations on different segments, so we can't easily tell whether
      two rows are on the same segment or not. Even worse, hard-coded values
      are only correct when the cluster has the default size (count of primaries)
      and uses the default hash & reduce methods.
      
      In GDD tests we should use a helper function, segid(segid, nth), which
      returns the nth value on segment segid. It's easier to design and
      understand the tests with it.

      Also put more rows into the testing tables, so segid() can always return
      a valid row.
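
      A sketch of what such a helper might look like (an assumed implementation, not
      necessarily the one added by the commit):

          -- Return the nth value (1-based) that is stored on segment seg, drawn from
          -- a pre-populated helper table of candidate values.
          CREATE TABLE segid_values (val int) DISTRIBUTED BY (val);
          INSERT INTO segid_values SELECT generate_series(1, 1000);
          CREATE FUNCTION segid(seg int, nth int) RETURNS int AS $$
              SELECT val FROM segid_values
              WHERE  gp_segment_id = $1
              ORDER  BY val
              LIMIT  1 OFFSET ($2 - 1)
          $$ LANGUAGE sql;
          -- Usage: a value that lands on segment 0, and another on segment 1.
          SELECT segid(0, 1), segid(1, 1);
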
      a53169ce
    • Fix flaky partition deadlock tests · 09d82f2e
      Committed by Ning Yu
      To trigger a deadlock we need to construct several waiting relations;
      once the last waiting relation is formed the deadlock is detectable by
      the deadlock detector.  In update-deadlock-root-leaf-concurrent-op and
      delete-deadlock-root-leaf-concurrent-op we used to use `2&:` for the
      last waiting relation; the isolation2 framework then checks that the
      query blocks, that is, that it does not return a result within 0.5 seconds.
      However, it's possible that the deadlock detector is triggered just
      within that 0.5 seconds, so the isolation2 framework reports a
      failure, which makes the tests flaky.  To make these tests deterministic
      we should use `2>:` for the last waiting query; it puts the query in the
      background without the blocking check.
      09d82f2e
    • Revert "Avoid flakiness for vacuum_drop_phase_ao test" · 19182958
      Committed by Ashwin Agrawal
      This reverts commit 265bc393. We need to
      push the version of the commit that moves the function to setup
      and avoids including server_helpers.sql in these two tests. A fresh
      commit with that change will follow.
      19182958
    • 27ba026f
    • Replace "@gpcurusername@" with "@curusername@" · 8bfa2d6f
      Committed by Ashwin Agrawal
      8bfa2d6f
    • Avoid flakiness for vacuum_drop_phase_ao test · 265bc393
      Committed by Ashwin Agrawal
      Add a helper wait_until_waiting_for_required_lock() to check whether the
      query has reached the intended blocking point, which is waiting for the
      lock to be granted. Only after this state is reached do we execute the
      next command to unblock it.

      Currently, the isolation2 framework does not provide direct support
      for such blocking behavior. Hence, for the short term I feel adding the
      common helper function is a good step. In the long term we can, similar to
      the isolation framework, add something simple and generic for this, though
      I like the explicit checking for the exact lock type, relation,
      etc. that the current helper function provides.
      265bc393
  14. 11 Sep 2019: 1 commit
    • Limit bypass memory usage per query instead of per session · 6159d91f
      Committed by Hubert Zhang
      In resgroup mode, bypass queries such as SET commands use
      RESGROUP_BYPASS_MODE_MEMORY_LIMIT_ON_QD and
      RESGROUP_BYPASS_MODE_MEMORY_LIMIT_ON_QE to limit memory usage. But these
      two values are fixed, while the memory usage of bypass queries can accumulate
      in a session. As a result, the bypass memory limit is reached after many
      bypass queries have allocated memory whose lifecycle is session level.

      We introduce bypassMemoryLimitBase to make the bypass memory limit per query
      instead of per session.
      6159d91f