1. 24 Jun 2020, 5 commits
    • Check whether the directory exists when deleting the tablespace (#10305) · b1b99c43
      Jinbao Chen committed
      If the tablespace directory does not exist, we would raise an error
      at transaction commit, and an error during commit causes a panic.
      Check that the tablespace directory exists up front so the panic
      is avoided.
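      A minimal sketch of the idea, assuming a `location` path already
      resolved for the tablespace (the helper name and error wording are
      illustrative, not the actual patch):
      
      ```c
      #include <sys/stat.h>
      
      /* Sketch: validate the directory while a plain ERROR is still safe,
       * rather than discovering the problem at commit time, where an
       * error is promoted to PANIC. */
      static void
      check_tablespace_directory(const char *location)
      {
          struct stat st;
      
          if (stat(location, &st) < 0 || !S_ISDIR(st.st_mode))
              ereport(ERROR,
                      (errcode(ERRCODE_UNDEFINED_OBJECT),
                       errmsg("tablespace directory \"%s\" does not exist",
                              location)));
      }
      ```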
    • Only apply transformGroupedWindows() with ORCA. (#10306) · e52dd032
      Heikki Linnakangas committed
      * Only apply transformGroupedWindows() with ORCA.
      
      The Postgres planner doesn't need it. Move the code to do it, so that it's
      only used before passing a tree to ORCA. This doesn't change anything with
      ORCA, but with the Postgres planner, it has some benefits:
      
      * Some cases that gave incorrect results before this patch now run
        correctly (e.g. the `regress/olap_window_seq` test)
      * Fixes github issue #10143.
      
      * Make transformGroupedWindows walk the entire tree
      
      The transformGroupedWindows function now recursively transforms any
      Query node in the tree that has both window functions and groupby or
      aggregates.
      
      Also fixed a pre-existing bug where we put a subquery in the target
      list of such a Query node into the upper query, Q'. This meant that
      any outer references to the scope of Q' no longer had the correct
      varattno. The fix is to place the subquery into the target list of
      the lower query, Q'' instead, which has the same range table as the
      original query Q. Therefore, the varattnos to outer references to the
      scope of Q (now Q'') don't need to be updated. Note that varlevelsup to
      scopes above Q still need to be adjusted, since we inserted a new
      scope Q'. (See comments in code for explanations of Q, Q', Q'').
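      A hedged sketch of the recursive walk described above, using the
      standard node-walker idiom (the `transform_grouped_windows` entry
      point named here is an assumption, not the actual code):
      
      ```c
      /* Sketch: recursively transform every Query node that has both
       * window functions and grouping/aggregation, including Queries
       * buried in sublinks. */
      static bool
      grouped_windows_walker(Node *node, void *context)
      {
          if (node == NULL)
              return false;
          if (IsA(node, Query))
          {
              Query *qry = (Query *) node;
      
              if (qry->hasWindowFuncs && (qry->groupClause != NIL || qry->hasAggs))
                  transform_grouped_windows(qry);   /* hypothetical entry point */
              return query_tree_walker(qry, grouped_windows_walker, context, 0);
          }
          return expression_tree_walker(node, grouped_windows_walker, context);
      }
      ```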
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Hans Zeller <hzeller@vmware.com>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
    • Improve handling of target lists of window queries (#10309) · 33c4582e
      Hans Zeller committed
      This fixes two bugs related to handling queries with window functions and refactors the related code.
      
      ORCA can't handle expressions on window functions like rank() over() - 1 in a target list. To avoid these, we split Query blocks that contain them into two. The new lower Query computes the window functions, the new upper Query computes the expressions.
      
      We use three mutators and walkers to help with this process:
      
      1. Increase the varlevelsup of outer references in the new lower Query, since we have now inserted a new scope above it.
      2. Split expressions on window functions into the window functions themselves (for the lower scope) and expressions with a Var substituted for the WindowFunc (for the upper scope). Also adjust the varattno for Vars that now appear in the upper scope.
      3. Increase the ctelevelsup for any RangeTblEntries in the lower scope.
      
      The bugs we saw were related to these mutators: the second one didn't recurse correctly into the required types of subqueries, and the third one didn't always increment the query level correctly. The refactor hopefully simplifies this code somewhat. For details, see the individual commit messages. A sketch of the second mutator follows below.
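      A hedged sketch of the second mutator (`SplitContext` and
      `add_to_lower_targetlist` are hypothetical helpers, and the RTE
      index 1 for the lower query is an assumption):
      
      ```c
      /* Sketch: move each WindowFunc into the lower query's target list
       * and replace it in the upper scope with a Var referring to it. */
      static Node *
      window_split_mutator(Node *node, SplitContext *ctx)
      {
          if (node == NULL)
              return NULL;
          if (IsA(node, WindowFunc))
          {
              WindowFunc *wfunc = (WindowFunc *) node;
              /* compute the window function in the lower query Q'' ... */
              AttrNumber  resno = add_to_lower_targetlist(ctx->lower_query,
                                                          (Expr *) wfunc);
      
              /* ... and reference it from the upper query Q' */
              return (Node *) makeVar(1 /* RTE of lower query */, resno,
                                      wfunc->wintype, -1, wfunc->wincollid, 0);
          }
          return expression_tree_mutator(node, window_split_mutator, (void *) ctx);
      }
      ```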
      
      Note: In the 6X_STABLE branch, we currently have a temporary check that triggers a fallback to planner when we see window queries with outer refs in them. When this code gets merged into 6X, we will remove the temporary check. See #10265.
      
      * Add test cases
      * Refactor: Renaming misc variables and methods
      * Refactor RunIncrLevelsUpMutator
      
      Made multiple changes to how we use the mutator:
      
      1. Start the call with a method from gpdbwrappers.h, for two reasons:
         a) execute the needed wrapping code for GPDB calls
         b) avoid calling the walker function on the top node, since we don't
            want to increment the query level when we call the method on a
            query node
      
      2. Now that we don't have to worry anymore about finding a top-level
         query node, simplify the logic to recurse into subqueries by simply
         doing that when we encounter a Query node further down. Remove the
         code dealing with sublinks, RTEs, CTEs.
      
      3. From inside the walker functions, call GPDB methods without going
         through the wrapping layer again.
      
      4. Let the mutator code make a copy of the target entry instead of
         creating one before calling the mutator.
      
      * Refactor RunWindowProjListMutator, fix bug
      
      Same as the previous commit, this time RunWindowProjListMutator gets
      refactored, with the same list of changes to how we use the mutator.
      This change should also fix one of the bugs we have seen: this
      mutator did not recurse into derived tables that were inside scalar
      subqueries in the select list.
      
      
      * Refactor RunFixCTELevelsUpMutator, fix bug
      
      Converted this mutator into a walker, since only walkers visit RTEs, which
      makes things a lot easier.
      
      Fixed a bug where we incremented the CTE levels for scalar subqueries
      that went into the upper-level query.
      
      Otherwise, same types of changes as in previous two commits.
      
      * Refactor and reorder code
      
      Slightly modified the flow in methods CQueryMutators::ConvertToDerivedTable
      and CQueryMutators::NormalizeWindowProjList
      
      * Remove obsolete methods
      * Update expected files
    • Drop -Wno-variadic-macros, which is inapplicable. · 3a84a379
      Jesse Zhang committed
      ORCA actively uses variadic macros (__VA_ARGS__), and we used to
      suppress a warning about them out of pedantry (they are a widely
      available language extension, but not part of the C++98 standard).
      Now that variadic macros are part of standard C++11, and we mandate
      C++14, drop the warning suppression.
    • Avoid non-transactional modification of relfrozenxid during CLUSTER · 7f7fa498
      Andrey Borodin committed
      CLUSTER calls vac_update_relstats(), which modifies pg_class in
      place (non-transactionally). If the CLUSTER command aborts, these
      changes cannot be rolled back, leaving behind an inaccurate
      relfrozenxid and other fields.
      
      Non-transactional updates to reltuples, relpages and relallvisible
      are fine, but not to relfrozenxid and relminmxid. Hence, this
      commit avoids updating relfrozenxid and relminmxid in place for
      CLUSTER.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/10150.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
  2. 23 Jun 2020, 3 commits
    • Fix pgbench --tablespace option. · f6ec65f7
      Heikki Linnakangas committed
      The CREATE TABLE commands constructed in pgbench had the DISTRIBUTED BY
      and TABLESPACE options the wrong way 'round, so that you got a syntax
      error. For example:
      
      $ pgbench postgres -i --tablespace "pg_default"
      creating tables...
      ERROR:  syntax error at or near "tablespace"
      LINE 1: ...22)) with (appendonly=false) DISTRIBUTED BY (bid) tablespace...
                                                                   ^
      Put the clauses in the right order.
      
      We have no test coverage for this at the moment, but PostgreSQL v11 adds
      a test for this (commit ed8a7c6f). I noticed this while looking at test
      failures with the PostgreSQL v12 merge.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • Fix tupdesc dangling pointer segfault in HashAgg · 41ce55bf
      Denis Smirnov committed
      This problem manifests itself with a HashAgg on top of a
      DynamicIndexScan node and can cause a segmentation fault.
      
      1. A HashAgg node initializes a tuple descriptor for its hash
      slot using a reference from input tuples (coming from
      DynamicIndexScan through a Sequence node).
      2. At the end of every partition index scan in DynamicIndexScan
      we unlink and free unused memory chunks and reset the partition's
      memory context. This destroys all objects in the context,
      including the partition index tuple descriptor used by the
      HashAgg node.
      
      As a result, HashAgg is left with a dangling pointer when
      DynamicIndexScan switches to a new index partition, which can
      cause a segfault.
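      A hedged sketch of the kind of fix this implies (the slot field
      name is illustrative, not the actual patch): give the hash slot its
      own copy of the descriptor instead of borrowing the input's.
      
      ```c
      /* Sketch: make the hash slot own its tuple descriptor, so a reset
       * of the partition's memory context cannot leave it dangling. */
      TupleDesc inputDesc = ExecGetResultType(outerPlanState(aggstate));
      
      ExecSetSlotDescriptor(aggstate->hashslot,
                            CreateTupleDescCopy(inputDesc));
      ```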
    • Make cdbpullup_missingVarWalker also consider PlaceHolderVar. · 2cb36320
      Zhenghua Lyu committed
      When the planner adds a Redistribute Motion above a subplan, it
      invokes `cdbpullup_findEclassInTargetList` to make sure the
      distribution key can be computed from the subplan's target list.
      When the distribution key is an expression built on PlaceHolderVar
      elements of the target list, the function
      `cdbpullup_missingVarWalker` did not handle it correctly.
      
      For example, when distkey is:
      
      ```sql
      CoalesceExpr [coalescetype=23 coalescecollid=0 location=586]
              [args]
                      PlaceHolderVar [phrels=0x00000040 phid=1 phlevelsup=0]
                              [phexpr]
                                      CoalesceExpr [coalescetype=23 coalescecollid=0 location=49]
                                              [args] Var [varno=6 varattno=1 vartype=23 varnoold=6 varoattno=1]
      ```
      
      and targetlist is:
      
      ```
      TargetEntry [resno=1]
              Var [varno=2 varattno=1 vartype=23 varnoold=2 varoattno=1]
      TargetEntry [resno=2]
              Var [varno=2 varattno=2 vartype=23 varnoold=2 varoattno=2]
      TargetEntry [resno=3]
              PlaceHolderVar [phrels=0x00000040 phid=1 phlevelsup=0]
                      [phexpr]
                              CoalesceExpr [coalescetype=23 coalescecollid=0 location=49]
                                      [args] Var [varno=6 varattno=1 vartype=23 varnoold=6 varoattno=1]
      TargetEntry [resno=4]
              PlaceHolderVar [phrels=0x00000040 phid=2 phlevelsup=0]
                      [phexpr]
                              CoalesceExpr [coalescetype=23 coalescecollid=0 location=78]
                                      [args] Var [varno=6 varattno=2 vartype=23 varnoold=6 varoattno=2]
      ```
      
      Previously the walker only considered Var nodes, which made
      `cdbpullup_missingVarWalker` fail on such target lists.
      
      See Github issue: https://github.com/greenplum-db/gpdb/issues/10315 for
      details.
      
      This commit fixes the issue by considering PlaceHolderVar in function
      `cdbpullup_missingVarWalker`.
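      A hedged sketch of the adjusted walker (a simplification of the
      idea, not the exact patch):
      
      ```c
      /* Sketch: treat a PlaceHolderVar that appears verbatim in the
       * target list as "found", instead of recognizing only Vars. */
      static bool
      cdbpullup_missingVarWalker(Node *node, void *targetlist)
      {
          if (node == NULL)
              return false;
          if (IsA(node, Var) || IsA(node, PlaceHolderVar))
          {
              if (tlist_member((Expr *) node, (List *) targetlist))
                  return false;   /* computable from the target list */
              if (IsA(node, Var))
                  return true;    /* a Var missing from the target list */
              /* for a PlaceHolderVar, fall through and check its contents */
          }
          return expression_tree_walker(node, cdbpullup_missingVarWalker,
                                        targetlist);
      }
      ```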
  3. 22 Jun 2020, 2 commits
    • Fix parameterized paths · 9cc1da61
      Richard Guo committed
      This patch fixes two issues related to parameterized path logic on
      master.
      
      1. When generating a unique row ID on the outer/inner side for
      JOIN_DEDUP_SEMI/JOIN_DEDUP_SEMI_REVERSE joins, we need to pass the
      param info of the outer/inner path to the projection path. Otherwise
      we would have problems when deciding whether a join clause is
      movable to this join rel.
      
      2. We should not pick a parameterized path when its required outer
      rels are beyond a Motion, since we cannot pass a param through a
      Motion.
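      A hedged sketch of the second point (the `rels_beyond_motion` set
      is a hypothetical stand-in for however the caller knows which rels
      sit on the far side of a Motion):
      
      ```c
      /* Sketch: skip parameterized paths whose required outer rels could
       * only be supplied from across a Motion. */
      if (path->param_info != NULL &&
          bms_overlap(path->param_info->ppi_req_outer, rels_beyond_motion))
          continue;           /* reject this path */
      ```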
      
      Fixes issue #10012
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Jinbao Chen <jinchen@pivotal.io>
    • Fix flaky appendonly test. · f860ff0c
      (Jerome)Junfeng Yang committed
      This fixes the error:
      ```
      ---
      /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/appendonly.out
      2020-06-16 08:30:46.484398384 +0000
      +++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/appendonly.out
      2020-06-16 08:30:46.556404454 +0000
      @@ -709,8 +709,8 @@
         SELECT oid FROM pg_class WHERE relname='tenk_ao2'));
             case    | objmod | last_sequence | gp_segment_id
              -----------+--------+---------------+---------------
            + NormalXid |      0 | 1-2900        |             1
              NormalXid |      0 | >= 3300       |             0
            - NormalXid |      0 | >= 3300       |             1
              NormalXid |      0 | >= 3300       |             2
              NormalXid |      1 | zero          |             0
              NormalXid |      1 | zero          |             1
      ```
      
      The flakiness occurs because under ORCA a `CREATE TABLE` statement
      without `DISTRIBUTED BY` treats the table as randomly distributed,
      while the planner treats it as distributed by the table's first
      column.
      
      ORCA:
      ```
      CREATE TABLE tenk_ao2 with(appendonly=true, compresslevel=0,
      blocksize=262144) AS SELECT * FROM tenk_heap;
      NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL
      policy entry.
      ```
      
      Planner:
      ```
      CREATE TABLE tenk_ao2 with(appendonly=true, compresslevel=0,
      blocksize=262144) AS SELECT * FROM tenk_heap;
      NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s)
      named 'unique1' as the Greenplum Database data distribution key for this
      table.
      ```
      
      So the data distribution for table tenk_ao2 is not as expected.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
  4. 19 Jun 2020, 3 commits
    • Fix cursor snapshot dump xid issue · 32a3a4db
      Weinan WANG committed
      For the cursor snapshot dump we need to record both the distributed
      and the local xid. So far we only recorded the distributed xid in
      the dump, and the dump-read function incorrectly assigned the
      distributed xid to the local xid.
      
      Fix it.
    • Re-enable test segwalrep/dtx_recovery_wait_lsn (#10320) · fe26d931
      Paul Guo committed
      Enable and refactor the test isolation2:segwalrep/dtx_recovery_wait_lsn.
      
      The test was disabled in 791f3b01 because of concern that changes to
      the line numbers in sql_isolation_testcase.py would break the answer
      file. Refactor the test to ease that concern, then enable it again.
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • Avoid generating core files during testing. (#10304) · 4a61357c
      Paul Guo committed
      Some negative tests need to panic and thus end up generating core
      files if the system is configured for core dumps. Long ago we
      optimized away core file generation in some cases. We have now found
      other scenarios that can be optimized as well.
      
      1. Avoid core file generation with setrlimit() in the FATAL
      fault-injection code. Sometimes FATAL is upgraded to PANIC (e.g. in
      a critical section, or on failure while doing QD prepare-related
      work), so we can avoid generating a core file for this scenario too.
      Note that even if the FATAL is not upgraded, it is mostly fine to
      suppress the core file since the process will quit soon. With this
      change we avoid two core files in test isolation2:crash_recovery_dtm.
      (See the sketch after this list.)
      
      2. We previously sanity-checked dbid/segidx in QE:HandleFtsMessage()
      and panicked on inconsistency when cassert is enabled. But we really
      do not need to panic: the root cause of such a failure is quite
      straightforward, the call stack is quite simple (PostgresMain() ->
      HandleFtsMessage()), and that part of the code does not involve
      shared memory, so there is no shared-memory mess to worry about
      (otherwise we might want a core file to check). Downgrade the log
      level to FATAL. This avoids 6 core files in test
      isolation2:segwalrep/recoverseg_from_file.
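      A hedged sketch of the setrlimit() idea from point 1 (the helper
      name is illustrative, not the actual function in the patch):
      
      ```c
      #include <sys/resource.h>
      
      /* Sketch: turn off core dumps for this process before raising an
       * injected FATAL that may be promoted to PANIC. */
      static void
      disable_core_dump(void)
      {
          struct rlimit rl = {0, 0};      /* zero soft and hard limits */
      
          if (setrlimit(RLIMIT_CORE, &rl) != 0)
              elog(WARNING, "could not disable core dump: %m");
      }
      ```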
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
  5. 18 Jun 2020, 4 commits
    • Fix CASE WHEN IS NOT DISTINCT FROM clause incorrect dump. (#10298) · 3b2aed6e
      (Jerome)Junfeng Yang committed
      Dumping a 'CASE WHEN (arg1) IS NOT DISTINCT FROM (arg2)' clause
      loses arg1. For example:
      ```
      CREATE OR REPLACE VIEW xxxtest AS
      SELECT
          CASE
          WHEN 'I will disappear' IS NOT DISTINCT FROM ''::text
          THEN 'A'::text
          ELSE 'B'::text
          END AS t;
      ```
      The dump loses 'I will disappear':
      
      ```
      SELECT
          CASE
          WHEN IS NOT DISTINCT FROM ''::text
          THEN 'A'::text
          ELSE 'B'::text
          END AS t;
      ```
    • Fix a flaky test for gdd/dist-deadlock-upsert (#10302) · a3f34ae7
      Hao Wu committed
      * Fix a flaky test for gdd/dist-deadlock-upsert
      
      When the GDD probe runs is nondeterministic, but it is important for
      the test gdd/dist-deadlock-upsert. If the GDD probe runs immediately
      after the two inter-deadlocked transactions start, one of the
      transactions is killed. The isolation2 framework considers a
      transaction blocked only if it has not finished within 0.5 seconds,
      so if the killed transaction aborts too early, the test framework
      sees no deadlock.
      Analyzed-by: Gang Xiong <gxiong@pivotal.io>
      
      * rm sleep
    • resgroup: fix the cpu value of the per host status view · e0d78729
      Ning Yu committed
      Resource groups do not distinguish per-segment CPU usage: the CPU
      usage reported by a segment is actually the total CPU usage of all
      the segments on the host. This is by design, not a bug. However, the
      gp_toolkit.gp_resgroup_status_per_host view reports the CPU usage as
      the sum over all the segments on the same host, so the reported
      per-host CPU usage is actually N times the real usage, where N is
      the number of segments on that host.
      Fixed by reporting the avg() instead of the sum().
      
      Tests are not provided, as resgroup/resgroup_views has never
      verified CPU usage because CPU usage is unstable on the pipelines.
      However, I have verified the fix manually.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
    • Enable brin in ao/aocs table (#9537) · 46d9e26a
      Jinbao Chen committed
      We merged BRIN from PostgreSQL 9.5, but Greenplum did not enable
      BRIN on AO/AOCS tables.
      
      The reason BRIN cannot be used directly on an AO/AOCS table is that
      its storage layout differs from a heap table's. A heap table has a
      single physical file whose block numbers are contiguous, and the
      revmap in BRIN is an array spanning multiple consecutive blocks;
      that layout does not make sense for an AO/AOCS table.
      
      An AO/AOCS table has 128 segment files, and the block numbers in
      these segments are spread over the entire value range. If we used a
      flat array to record information about every block, the array would
      be far too large.
      
      So we introduce an upper-level structure to solve this problem: an
      array that records the block numbers of the revmap blocks. The
      revmap blocks are not contiguous; when we need a new revmap block,
      we just extend the relation by one block and record its block number
      in the upper-level array.
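      A hedged sketch of the two-level lookup this describes (function
      and variable names are illustrative, not the actual code):
      
      ```c
      /* Sketch: find the physical block that holds the revmap entry for
       * a given table block, going through the upper-level array. */
      static BlockNumber
      ao_revmap_block(BlockNumber tblBlk, uint32 entries_per_revmap_page,
                      const BlockNumber *upper_level)
      {
          /* which logical revmap page covers this block? */
          BlockNumber idx = tblBlk / entries_per_revmap_page;
      
          /* the upper level records where that revmap page physically lives */
          return upper_level[idx];
      }
      ```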
      Reviewed-by: Asim R P <pasim@vmware.com>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: xiong-gang <gxiong@pivotal.io>
      Reviewed-by: Adam Lee <adam8157@gmail.com>
  6. 17 Jun 2020, 6 commits
    • Disallow changing the distribution policy to REPLICATED for partition tables (#10313) · 78cccb81
      Hao Wu committed
      This patch fixes https://github.com/greenplum-db/gpdb/issues/10224.
      A replicated table is not allowed to be a partition table, so an
      existing partition table must not have its distribution policy
      altered to REPLICATED.
      Reported-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
    • GetLatestSnapshot on QEs always returns without a distributed snapshot. · d8f4a45f
      Zhenghua Lyu committed
      Greenplum tests the visibility of heap tuples first using the
      distributed snapshot. Distributed snapshots are generated on the QD
      and then dispatched to the QEs. Some utility statements need to work
      under the latest snapshot when executing, so they invoke the
      function `GetLatestSnapshot` on the QEs. But remember: we cannot get
      the latest distributed snapshot there.
      
      The subtle cases are ALTER TABLE and ALTER DOMAIN statements, which
      on the QD take a snapshot in PortalRun and then try to acquire locks
      on the target table in ProcessUtilitySlow. Here is the key point:
        1. acquiring the lock might block behind other transactions
        2. the statement is later woken up to continue
        3. by the time it continues, the world has changed, because the
           transactions that blocked it have finished
      
      Previously, the QD did not take a new snapshot before dispatching a
      utility statement to the QEs, so the distributed snapshot did not
      reflect that "world change". This can lead to bugs. For example, if
      the first transaction rewrites the whole heap, and the second
      (ALTER TABLE or ALTER DOMAIN) statement continues with a distributed
      snapshot under which txn1 has not yet committed, it will see no
      tuples in the new heap!
      
      This commit fixes the issue by taking a local snapshot when
      `GetLatestSnapshot` is invoked on a QE.
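      A hedged sketch of the shape of the fix, as a fragment at the top
      of GetLatestSnapshot() (the local-snapshot helper named here is
      hypothetical, not the actual GPDB call):
      
      ```c
      /* Sketch: on a QE the distributed snapshot was fixed at dispatch
       * time and cannot be refreshed, so return a purely local snapshot. */
      if (Gp_role == GP_ROLE_EXECUTE)
          return GetLocalLatestSnapshot();   /* hypothetical local-only helper */
      ```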
      
      See Github issue: https://github.com/greenplum-db/gpdb/issues/10216
      Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
    • Remove dtx_recovery_wait_lsn test · 791f3b01
      Tyler Ramer committed
      The test addressed in this commit was added in commit f3df8b18. It
      fails for an entirely unrelated reason: a modification of
      sql_isolation_testcase.py changed the line numbers it expects.
      
      I find this test very fragile for that reason, and because we are
      relying on an execution failure in isolation2 Python code to test
      the database code. This means any refactoring of isolation2 will
      cause this test to fail, which should not be the case.
      
      I looked into adding an ignore for the exact lines, but isolation2
      wants a matching ignore in the input SQL file, which makes the test
      useless, because we are looking for an exact exception from
      isolation2 for a valid SQL input. Isolation2 doesn't give us the
      framework to ignore just some messages on the output side. Using an
      isolation2 init modification would still just ignore the actual
      problem, only in a different file.
      
      This fix should be considered temporary work to get the pipeline
      green while a better solution is determined.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
    • Update isolation2 expected output considering changes in pg · 1131c5a9
      Tyler Ramer committed
      The updated PyGreSQL pg connection makes the output of isolation2
      SQL tests more similar to psql. We therefore revert some of the
      changes made in commit 20b3aa3a to be more in line with the usual
      psql output. Notably, trailing zeroes on floats are trimmed.
      Co-authored-by: Tyler Ramer <tramer@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
    • Refactor dbconn · 330db230
      Tyler Ramer committed
      One reason pygresql was previously modified was that it did not
      handle closing a connection very gracefully. In the process of
      updating pygresql, we wrapped the connection it provides with a
      ClosingConnection function, which gracefully closes the connection
      when the "with dbconn.connect as conn" syntax is used.
      
      This did, however, expose issues where a cursor might have been
      created as the result of a dbconn.execSQL() call, which seems to
      hold the connection open if not specifically closed.
      
      It is therefore necessary to remove the ability to get a cursor from
      dbconn.execSQL(). To highlight this difference, and to make future
      use of this library easy, I've cleaned up and clarified the dbconn
      execution code, to include the following features.
      
      - dbconn.execSQL() closes the cursor as part of the function; it
        returns no rows
      - function dbconn.query() is added, which behaves like
        dbconn.execSQL() except that it returns a cursor
      - function dbconn.execQueryforSingleton() is renamed
        dbconn.querySingleton()
      - function dbconn.execQueryforSingletonRow() is renamed
        dbconn.queryRow()
      Authored-by: Tyler Ramer <tramer@pivotal.io>
    • Update PyGreSQL from 4.0.0 to 5.1.2 · f5758021
      Tyler Ramer committed
      This commit updates PyGreSQL from 4.0.0 to 5.1.2, which requires
      numerous changes to take advantage of the major result-syntax change
      that PyGreSQL 5 implemented. Of note, cursor and query objects
      automatically cast returned values to appropriate Python types: a
      list of ints, for example, instead of a string like "{1,2}". This is
      the bulk of the changes.
      
      Updating to PyGreSQL 5.1.2 provides numerous benefits, including the
      following:
      
      - CVE-2018-1058 was addressed in PyGreSQL 5.1.1
      
      - We can save notices in the pgdb module, rather than relying on
      importing the pg module, thanks to the new "set_notices()"
      
      - PyGreSQL 5 supports Python 3
      
      - Thanks to a change in the cursor, using a "with" syntax guarantees
        a "commit" on the close of the with block.
      
      This commit is a starting point for additional changes, including
      refactoring the dbconn module.
      
      Additionally, since isolation2 uses pygresql, some pl/python scripts
      were updated, and isolation2 SQL output is further decoupled from
      pygresql. The output of a psql command should be similar enough to
      isolation2's pg output that minimal or no modification is needed to
      ensure gpdiff can recognize the output.
      Co-authored-by: Tyler Ramer <tramer@pivotal.io>
      Co-authored-by: Jamie McAtamney <jmcatamney@pivotal.io>
  7. 16 Jun 2020, 5 commits
    • Properly mark null return from combine functions · 736898ad
      Jesse Zhang committed
      We had a bug in a few of the combine functions: if the combine
      function returned a NULL, it didn't set fcinfo->isnull = true. This
      led to a segfault inside the serial function when we spilled in the
      final hashagg of a two-stage agg. So, properly mark NULL outputs
      from the combine functions.
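      A hedged sketch of the pattern (the function shown is an
      illustrative combine function, not necessarily one of the ones
      patched):
      
      ```c
      /* Sketch: return NULL via PG_RETURN_NULL(), which sets
       * fcinfo->isnull = true, instead of returning a bare zero Datum. */
      Datum
      example_combine(PG_FUNCTION_ARGS)
      {
          bytea *state = PG_ARGISNULL(0) ? NULL : PG_GETARG_BYTEA_P(0);
      
          if (state == NULL)
              PG_RETURN_NULL();       /* marks fcinfo->isnull = true */
      
          PG_RETURN_BYTEA_P(state);
      }
      ```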
      Co-authored-by: Denis Smirnov <sd@arenadata.io>
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
    • Fix double deduction of FREEABLE_BATCHFILE_METADATA · 66a0cb4d
      Jesse Zhang committed
      Earlier, we always deducted FREEABLE_BATCHFILE_METADATA inside
      closeSpillFile(), regardless of whether the spill file was already
      suspended. This deduction is already performed inside
      suspendSpillFiles(). The double accounting drives
      hashtable->mem_for_metadata negative, and we get:
      
      FailedAssertion("!(hashtable->mem_for_metadata > 0)", File: "execHHashagg.c", Line: 2141)
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
    • Fix assert condition in spill_hash_table() · 067bb350
      Jesse Zhang committed
      This commit fixes the following assertion failure, reported in
      https://github.com/greenplum-db/gpdb/issues/9902:
      
      FailedAssertion("!(hashtable->nbuckets > spill_set->num_spill_files)", File: "execHHashagg.c", Line: 1355)
      
      hashtable->nbuckets can actually end up equal to
      spill_set->num_spill_files, which triggers the failure. This is
      because:
      
      hashtable->nbuckets is set from HashAggTableSizes->nbuckets, which
      can end up equal to gp_hashagg_default_nbatches. Refer:
      nbuckets = Max(nbuckets, gp_hashagg_default_nbatches);
      
      Also, spill_set->num_spill_files is set from
      HashAggTableSizes->nbatches, which is in turn set to
      gp_hashagg_default_nbatches.
      
      Thus, the two quantities can be equal.
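      A hedged sketch of the corrected assertion, allowing equality:
      
      ```c
      /* Sketch: nbuckets may legitimately equal num_spill_files, so the
       * assertion must be >= rather than >. */
      Assert(hashtable->nbuckets >= spill_set->num_spill_files);
      ```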
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
    • Increase retry count for pg_rewind tests' replication promotion and streaming. (#10292) · a3d8302a
      (Jerome)Junfeng Yang committed
      Increase the retry count to prevent test failures; most of the time
      the failure is due to slow processing.
    • Fix ICW test if GPDB compiled without ORCA · 9aa2b26c
      Chris Hajas committed
      We need to ignore the output when enabling/disabling an ORCA xform:
      if the server is not compiled with ORCA there will be a diff, and we
      don't really care about this output.
      
      Additionally, clean up unnecessary/excessive setting of GUCs.
      
      Some of these GUCs were on by default or only intended for a
      specific test. Explicitly setting them caused them to appear at the
      end of `explain verbose` plans, making the expected output harder to
      match when the server was built with or without ORCA.
  8. 15 Jun 2020, 4 commits
    • Retry more for replication synchronization waiting to avoid isolation2 test flakiness. (#10281) · ca360700
      Paul Guo committed
      Some test cases have been failing due to too few retries. Increase
      them, and also create some common UDFs for reuse.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • Fix flakiness of "select 1" output after master reset due to injected panic fault before_read_command (#10275) · 02ad1fc4
      Paul Guo committed
      
      Several tests inject a panic in before_read_command to trigger a
      master reset. Previously we ran "select 1" after the fault-injection
      query to verify it, but the output is sometimes nondeterministic,
      i.e. sometimes we do not see the line
      
      PANIC:  fault triggered, fault name:'before_read_command' fault type:'panic'
      
      This was actually observed in test crash_recovery_redundant_dtx, per
      its commit message and test comment. That test ignores the output of
      "select 1", but we probably still want the output to verify the
      fault is encountered.
      
      It is still mysterious why the PANIC message is sometimes missing. I
      spent some time digging but reckon I cannot root-cause it in a short
      time. One guess is that the PANIC message was sent to the frontend
      in errfinish(), but the kernel-buffered data was dropped after
      abort() due to ereport(PANIC); another guess is something wrong
      related to the libpq protocol (not saying it is a libpq bug). In any
      case, it does not deserve much more time for the tests alone, so
      simply mask the PANIC message to make the test result deterministic
      without affecting the test's purpose.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
    • Move to a resource group with memory_limit 0 · 37a19376
      xiong-gang committed
      When moving a query to a resource group whose memory_limit is 0,
      the available memory is the currently available global shared
      memory.
    • Fix a recursive AbortTransaction issue · b5c4fdc0
      xiong-gang committed
      When an error happens after ProcArrayEndTransaction, we recurse back
      into AbortTransaction; we need to make sure this does not generate
      extra WAL records or fail the assertions.
  9. 13 Jun 2020, 2 commits
  10. 12 Jun 2020, 1 commit
    • Create external table fdw extension under gpcontrib. (#10187) · d86f32e5
      (Jerome)Junfeng Yang committed
      Remove pg_exttable.h, since the catalog no longer exists. Move the
      function declarations from pg_exttable.h into external.h. Extract
      related code into external.c, which holds all the code that cannot
      be moved into an external-table FDW extension.
      
      Also move the external table ORCA interface into external.c as a
      workaround; maybe provide an ORCA FDW routine in the future.
      
      Extract the external table's execution logic into the external table
      FDW extension.
      
      Create the gp_exttable_fdw extension during gpinitsystem to allow
      creating system external tables.
  11. 11 Jun 2020, 3 commits
    • Revert "Fix flaky test exttab1" · f538f4b6
      Hubert Zhang committed
      This reverts commit 026e4595, which broke a PXF test case. We need
      to handle that first.
    • Fix flaky test terminate_in_gang_creation · 63b5adf9
      Hubert Zhang committed
      The test case restarts all primaries and expects the old session to
      fail on its next query, since gangs are cached. But the restart may
      take more than 18 seconds, which is the maximum idle time cached QEs
      may live; in that case the new query in the old session simply
      fetches a new gang without the expected errors. Set
      gp_vmem_idle_resource_timeout to 0 to fix this flaky test.
      Reviewed-by: Paul Guo <pguo@pivotal.io>
    • Fix flaky test exttab1 · 026e4595
      Hubert Zhang committed
      The flaky case happens when selecting from an external table with
      the option "fill missing fields": inspecting the QE with gdb shows
      the value is sometimes not false there. In ProcessCopyOptions we
      used intVal(defel->arg) to parse the boolean value, which is not
      correct; use defGetBoolean instead.
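      A hedged sketch of what the fix looks like in ProcessCopyOptions()
      (the option-handling branch shown is illustrative):
      
      ```c
      /* Sketch: a boolean DefElem argument is not always an integer node,
       * so defGetBoolean() must be used rather than intVal(). */
      else if (strcmp(defel->defname, "fill_missing_fields") == 0)
          cstate->fill_missing = defGetBoolean(defel);  /* was intVal(defel->arg) */
      ```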
  12. 10 Jun 2020, 2 commits
    • Add GUC write_to_gpfdist_timeout (#10214) · ab737132
      Huiliang.liu committed
      * Add GUC write_to_gpfdist_timeout
      
      write_to_gpfdist_timeout controls the timeout (in seconds) for writing data to a gpfdist server. The default value is 300; the valid range is [1, 7200].
      
      Set CURLOPT_TIMEOUT to write_to_gpfdist_timeout.
      On any error, retry with a doubled interval, and return a SQL ERROR once write_to_gpfdist_timeout is reached.
      
      Add regression test for GUC writable_external_table_timeout
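      A hedged sketch of wiring the GUC into the curl handle used for the
      write (the handle variable is illustrative; CURLOPT_TIMEOUT is the
      real libcurl option):
      
      ```c
      /* Sketch: bound each write to gpfdist by the GUC, in seconds. */
      curl_easy_setopt(curl_handle, CURLOPT_TIMEOUT,
                       (long) write_to_gpfdist_timeout);
      ```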
    • Fix test_gpdb slack command pipeline to work with new master changes · 6a979eec
      Chris Hajas committed
      The Python changes required new images, so we now need a pipeline
      for slack commands separate from 6X. We also no longer need
      libsigar.