1. 14 Aug 2018, 13 commits
    • Fix 'stack_base_ptr' issue with '--enable-testutils'. · 0aa9289a
      Committed by Richard Guo
      This fixes a 'stack_base_ptr' assertion failure with '--enable-testutils'
      and also revises the related code to match upstream.
    • Fix INSERT RETURNING on partitioned table. · 62401aaa
      Committed by Heikki Linnakangas
      The ResultRelInfos we build for the partitions, in slot_get_partition(),
      don't contain the ProjectionInfo needed to execute RETURNING. We need to
      look that up in the parent ResultRelInfo and, when executing it, be
      careful to use the "parent" version of the tuple, the one before
      mapping the columns for the target partition.
      
      Fixes github issue #4735.
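      As a minimal sketch (table and column names are hypothetical, not from
      the commit), this is the kind of statement that used to fail:
      ```
      -- Hypothetical repro: route an INSERT into a partition, then project
      -- the pre-mapping "parent" tuple through RETURNING.
      CREATE TABLE pt (id int, val text)
      DISTRIBUTED BY (id)
      PARTITION BY RANGE (id) (START (1) END (21) EVERY (10));

      INSERT INTO pt VALUES (1, 'one') RETURNING id, val;
      ```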
    • Fix memory bug within cdbdisp_get_PQerror · f72429a7
      Committed by Pengzhou Tang
      cdbdisp_get_PQerror creates a new error data object and initializes it
      with the filename and function values received from the QE. ErrorData
      expects a constant filename and function, and does not copy them into
      ErrorContext. The problem was that filename and function pointed to
      unstable memory, so when the edata was used later it could trigger a
      SIGSEGV. To resolve this, copy them into the transaction context, since
      this error data can only be used inside the current transaction.
    • eagerfree in executor: support index only scan. (#5462) · f445f830
      Committed by Paul Guo
      Index-only scan is a new feature from the PostgreSQL 9.2 merge; the
      eagerfree-related functions did not yet support it.
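      For context, a rough sketch of a query that can produce an index-only
      scan (table and index names are hypothetical):
      ```
      -- All referenced columns are covered by the index, so the planner may
      -- choose an Index Only Scan that skips the heap when the visibility
      -- map allows it.
      CREATE TABLE ios_tbl (a int, b int);
      INSERT INTO ios_tbl SELECT i, i FROM generate_series(1, 100000) i;
      CREATE INDEX ios_idx ON ios_tbl (a);
      VACUUM ANALYZE ios_tbl;
      EXPLAIN SELECT a FROM ios_tbl WHERE a BETWEEN 100 AND 200;
      ```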
    • Refine dispatching of COPY command · a1b6b2ae
      Committed by Pengzhou Tang
      Previously, COPY used CdbDispatchUtilityStatement directly to
      dispatch 'COPY' statements to all QEs and then sent/received
      data through the primaryWriterGang. This happened to work only
      because the primaryWriterGang is not recycled when a dispatcher
      state is destroyed, which is nasty given that the COPY command
      has already finished logically at that point.

      This commit splits the COPY dispatching logic into two parts to
      make it more reasonable.
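      For reference, a minimal example of the kind of command that takes this
      dispatch path (table name and file path are hypothetical):
      ```
      -- The COPY statement is dispatched to all QEs; the row data then
      -- flows between the QD and the segments through the writer gang.
      CREATE TABLE copy_tbl (a int, b text) DISTRIBUTED BY (a);
      COPY copy_tbl FROM '/tmp/data.csv' CSV;
      COPY copy_tbl TO STDOUT;
      ```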
    • Add two helper functions to construct query parms · b8fb0957
      Committed by Pengzhou Tang
      * cdbdisp_buildUtilityQueryParms
      * cdbdisp_buildCommandQueryParms
    • Remove cdbdisp_finishCommand · 957629d1
      Committed by Pengzhou Tang
      Previously, cdbdisp_finishCommand did three things:
      1. cdbdisp_checkDispatchResult
      2. cdbdisp_getDispatchResult
      3. cdbdisp_destroyDispatcherState
      
      However, cdbdisp_finishCommand didn't make the code cleaner or more
      convenient to use; on the contrary, it made error handling more
      difficult and the code more complicated and inconsistent.
      
      This commit also resets estate->dispatcherState to NULL to avoid
      re-entry of cdbdisp_* functions.
    • Rename CdbCheckDispatchResult to follow naming convention · 60bd3ab2
      Committed by Pengzhou Tang
      Use cdbdisp_checkDispatchResult instead of CdbCheckDispatchResult,
      for consistency with the other cdbdisp_* functions.
    • Do not call CdbDispatchPlan() to dispatch nothing · 4f38b425
      Committed by Pengzhou Tang
      Previously, CdbDispatchPlan might be called to dispatch nothing if:
      1. the init plan is parallel but the main plan is not; CdbDispatchPlan
      was still called for the main plan.
      2. the init plan is not parallel; CdbDispatchPlan was still called for
      the init plan.

      The reason is that DISPATCH_PARALLEL describes the whole plan, including
      the main plan and the init plans. This commit adds a way to tell exactly
      which plan is parallel, to avoid unnecessary dispatching.
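      For context, an init plan is a subplan evaluated once before the main
      plan runs; a minimal illustration (hypothetical tables):
      ```
      -- The uncorrelated subquery becomes an InitPlan: it is planned and
      -- dispatched separately, and its result feeds the main plan as a param.
      CREATE TABLE facts (a int) DISTRIBUTED BY (a);
      CREATE TABLE dims  (a int) DISTRIBUTED BY (a);
      EXPLAIN SELECT * FROM facts WHERE a > (SELECT max(a) FROM dims);
      ```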
    • Upgrade docker OpenJDK to 1.8 (#5457) · 318c0c74
      Committed by Lav Jain
    • Add madlib jobs to generated master pipeline. (#5469) · e1db3846
      Committed by kaknikhil
      PR https://github.com/greenplum-db/gpdb/pull/5432 was merged but the
      master pipeline wasn't recreated from the template. This PR generates
      the master pipeline from the template so that the madlib jobs can be run
      on the master pipeline.
    • Add madlib_build_gppkg job to master pipeline (#5432) · 0a974e5f
      Committed by Jingyi Mei
      * Add madlib_build_gppkg job to master pipeline
      
      The current gpdb master pipeline fetches a madlib gppkg compiled against
      an earlier version of the gpdb code, installs the gppkg, and runs dev
      check in a container with the latest gpdb installed. If there is a
      catalog change in gpdb, the test will fail.

      To solve this issue, we add a build_madlib_gppkg job to compile the
      madlib gppkg from source and pass it to downstream dev check jobs, so
      that madlib is always compiled and tested against the latest catalog
      changes.
      Co-authored-by: Domino Valdano <dvaldano@pivotal.io>
    • Move distributed_transactions regression test in greenplum_schedule · 255c3c1e
      Committed by Jimmy Yih
      The distributed_transactions test contains a serializable
      transaction. This serializable transaction may intermittently cause
      the appendonly test to fail when run in the same test group. The
      appendonly test runs VACUUM on some appendonly tables and checks that
      last_sequence is nonzero in gp_fastsequence. Serializable transactions
      make concurrent VACUUM operations on appendonly tables exit early.
      
      To fix the contention, let's move the distributed_transactions test to
      another test group.
      
      appendonly test failure diff:
      *** 632,640 ****
         NormalXid |      0 | t        |             0
         NormalXid |      0 | t        |             1
         NormalXid |      0 | t        |             2
      !  NormalXid |      1 | t        |             0
      !  NormalXid |      1 | t        |             1
      !  NormalXid |      1 | t        |             2
        (6 rows)
      
      --- 630,638 ----
         NormalXid |      0 | t        |             0
         NormalXid |      0 | t        |             1
         NormalXid |      0 | t        |             2
      !  NormalXid |      1 | f        |             0
      !  NormalXid |      1 | f        |             1
      !  NormalXid |      1 | f        |             2
        (6 rows)
      
      Repro:
      1: CREATE TABLE heap_table (a int, b int);
      1: INSERT INTO heap_table SELECT i, i FROM generate_series(1,100)i;
      1: CREATE TABLE ao_table WITH (appendonly=true) AS SELECT * FROM heap_table;
      1: SELECT gp_segment_id, * FROM gp_dist_random('gp_fastsequence') WHERE gp_segment_id = 0;
      2: BEGIN ISOLATION LEVEL SERIALIZABLE;
      2: SELECT 1;
      1: VACUUM ao_table; -- VACUUM exits early
      1: SELECT gp_segment_id, * FROM gp_dist_random('gp_fastsequence') WHERE gp_segment_id = 0;
      2: END;
      1: VACUUM ao_table; -- VACUUM completes
      1: SELECT gp_segment_id, * FROM gp_dist_random('gp_fastsequence') WHERE gp_segment_id = 0;
  2. 13 Aug 2018, 9 commits
  3. 11 Aug 2018, 3 commits
    • docs - add missing gp-specific options to pg_dump (#5433) · e381b848
      Committed by Lisa Owen
      * docs - add missing gp-specific options to pg_dump
      
      * qualify the options as unsupported
      
      * use a note
    • Bump ORCA version to v2.68.0 · e2b805b8
      Committed by Ashuka Xue
      Signed-off-by: Abhijit Subramanya <asubramanya@pivotal.io>
    • Adding GiST support for GPORCA · ec3693e6
      Committed by Ashuka Xue
      Prior to this commit, there was no support for GiST indexes in GPORCA.
      For queries involving GiST indexes, ORCA selected Table Scan paths as
      the optimal plan. These plans could take up to 300+ times longer than
      Planner's, which generated an index scan plan using the GiST index.
      
      Example:
      ```
      CREATE TABLE gist_tbl (a int, p polygon);
      CREATE TABLE gist_tbl2 (b int, p polygon);
      CREATE INDEX poly_index ON gist_tbl USING gist(p);
      
      INSERT INTO gist_tbl SELECT i, polygon(box(point(i, i+2),point(i+4,
      i+6))) FROM generate_series(1,50000)i;
      INSERT INTO gist_tbl2 SELECT i, polygon(box(point(i+1, i+3),point(i+5,
      i+7))) FROM generate_series(1,50000)i;
      
      ANALYZE;
      ```
      With the query `SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE
      gist_tbl.p <@ gist_tbl2.p;`, we see a performance increase with the
      support of GiST.
      
      Before:
      ```
      EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                           QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..171401912.12 rows=1 width=8)
               ->  Aggregate  (cost=0.00..171401912.12 rows=1 width=8)
                     ->  Nested Loop  (cost=0.00..171401912.12 rows=335499869 width=1)
                           Join Filter: gist_tbl.p <@ gist_tbl2.p
                           ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                           ->  Materialize  (cost=0.00..530.81 rows=49997 width=101)
                                 ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..525.76 rows=49997 width=101)
                                       ->  Table Scan on gist_tbl  (cost=0.00..432.24 rows=16666 width=101)
       Optimizer status: PQO version 2.65.1
      (10 rows)
      
      Time: 170.172 ms
      SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
       count
      -------
       49999
      (1 row)
      
      Time: 546028.227 ms
      ```
      
      After:
      ```
      EXPLAIN SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
                                                        QUERY PLAN
      ---------------------------------------------------------------------------------------------------------------
       Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..21749053.24 rows=1 width=8)
               ->  Aggregate  (cost=0.00..21749053.24 rows=1 width=8)
                     ->  Nested Loop  (cost=0.00..21749053.24 rows=335499869 width=1)
                           Join Filter: true
                           ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..526.39 rows=50328 width=101)
                                 ->  Table Scan on gist_tbl2  (cost=0.00..432.25 rows=16776 width=101)
                           ->  Bitmap Table Scan on gist_tbl  (cost=0.00..21746725.48 rows=6667 width=1)
                                 Recheck Cond: gist_tbl.p <@ gist_tbl2.p
                                 ->  Bitmap Index Scan on poly_index  (cost=0.00..0.00 rows=0 width=0)
                                       Index Cond: gist_tbl.p <@ gist_tbl2.p
       Optimizer status: PQO version 2.65.1
      (12 rows)
      
      Time: 617.489 ms
      
      SELECT count(*) FROM gist_tbl, gist_tbl2 WHERE gist_tbl.p <@ gist_tbl2.p;
       count
      -------
       49999
      (1 row)
      
      Time: 7779.198 ms
      ```
      
      GiST support was implemented by sending over GiST index information to
      GPORCA in the metadata using a new index enum specifically for GiST.
      Signed-off-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  4. 10 Aug 2018, 7 commits
  5. 09 Aug 2018, 5 commits
    • Remove obsolete comment. · 31e863a9
      Committed by Heikki Linnakangas
    • Fix version comparison bug · dd8230dd
      Committed by Daniel Gustafsson
      This was caused by a misplaced parenthesis which caused the check
      to always return false.
    • Fix a few small leaks in pg_upgrade · 92d5946b
      Committed by Daniel Gustafsson
      If we don't find any AO tables we exit early, but we failed to close
      the PQExpBuffer we had for the query. Fix by destroying the buffer
      explicitly.
      
      Move the freeing of numeric_rels to be unconditional, since pg_free
      can cope with a NULL pointer.

      Save the quote_identifier() returned strings in a char * rather than
      passing them to fprintf(), so we can pg_free() them on the way out.
    • Plug some trivial memory leaks in pg_dump and pg_upgrade. · e4f36f46
      Committed by Daniel Gustafsson
      This is a partial backport of the below commit from upstream, with
      one hunk removed as it touches code still in the future of this fork
      and another hunk massaged to account for path changes. Since we
      want the pg_upgrade hunk from this commit to silence Coverity, we
      may as well grab all the hunks that are relevant and close those
      leaks ahead of time in the fork.
      
        commit f712289f
        Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
        Date:   Thu Jul 2 20:58:51 2015 +0300
      
          Plug some trivial memory leaks in pg_dump and pg_upgrade.
      
          There's no point in trying to free every small allocation in these
          programs that are used in a one-shot fashion, but these ones seems like
          an improvement on readability grounds.
      
          Michael Paquier, per Coverity report.
    • Fix typo in comment · 401a8d43
      Committed by Daniel Gustafsson
  6. 08 Aug 2018, 3 commits
    • Correctly detoast const datum during translation to array expr · c417d2b0
      Committed by Jesse Zhang and Melanie Plageman
      For a PL/pgSQL function like the following:
      
      set optimizer_trace_fallback to on;
      CREATE OR REPLACE FUNCTION boom()
      RETURNS bool AS $$
      DECLARE
      	mel bool;
      	sesh int[];
      BEGIN
      	sesh := '{42,1}'::int[]; -- query 1
      	select c = ANY (sesh) INTO mel FROM (values (42), (0)) nums(c); -- query 2
      	return mel;
      END
      $$
      LANGUAGE plpgsql VOLATILE;
      
      SELECT boom();
      
      With Orca enabled, the database crashes. Starting in 9.2, PL/pgSQL
      supplies bound param values in more statement types, enabling the
      planner to fold constants in more cases, in contrast to leaving the
      param intact and waiting until execution to substitute its value.
      Previously, only dynamic execution ("EXECUTE 'SELECT $1' USING sesh")
      got this treatment. This revealed the bug, because Orca could not plan
      queries whose query trees included params that were not in subplans
      (external params) and would simply fall back.
      
      When query 1 is planned, it is translated into select '{42,1}'::int[];
      For uninteresting reasons, the planner-produced plan for query 1 is
      considered "simple", and the ORCA-produced plan is considered regular
      (not simple). PL/pgSQL has a fast-path for "simple" plans, minimally
      starting the executor via `ExecEvalExpr`. Regular plans are executed
      through SPI. During execution, SPI will pack (as part of
      `heap_form_tuple`) the 4-byte header datum into a 1-byte header datum.
      
      While planning query 2, we will attempt to substitute the param "sesh"
      with the actual const value during pre-processing.  Since Orca doesn't
      recognize const arrays as arrays, the translator will take the
      additional step of translating the const into an array expression.  When
      accessing the array-typed const, we need to "unpack"
      (`DatumGetArrayTypeP`) the datum.  This commit does that.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
    • Improved handling of empty lines and comments in commands log · bcf1345c
      Committed by yanchaozhong
      There are a lot of blank lines in the 'gpinitsystem' log. These are
      read directly from the configuration file with the 'cat' command; the
      comments are turned into blank lines, but they are not removed:
      
        gpinitsystem:node:gp6-[INFO]:-Start Main
        gpinitsystem:node:gp6-[INFO]:-Command line options passed to utility = -c ../gpinitsystem_config
        gpinitsystem:node:gp6-[INFO]:-Start Function CHK_GPDB_ID
        ...
        gpinitsystem:node:gp6-[INFO]:-End Function CHK_FILE
        gpinitsystem:node:gp6-[INFO]:-Dumping gpinitsystem_config to logfile for reference
      
        ARRAY_NAME="EMC Greenplum DW"
      
        SEG_PREFIX=gpseg
      
        PORT_BASE=40300
      
      This extends the exclusion regex used when appending to the logfile
      to remove blank lines completely and to handle comments that don't
      start in column zero.
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
    • Fix header file inclusion for PG_PRINTF_ATTRIBUTE in mapred · 795e0f09
      Committed by Daniel Gustafsson
      Commit 8e60838c22735fcacabc125170fc1d removed PG_PRINTF_ATTRIBUTE
      from pg_config_manual.h, which exposed the fact that gpmapreduce
      was erroneously including that header instead of the correct one.
      Instead, include postgres_fe.h for now, since gpmapreduce is a
      client-side tool and not a separate extension, to fix compilation.
      Reviewed-by: Asim R P <apraveen@pivotal.io>