  1. 22 November 2018, 12 commits
    • Fix confusion with distribution keys of queries with FULL JOINs. · a25e2cd6
      Committed by Heikki Linnakangas
      There was some confusion on how NULLs are distributed, when CdbPathLocus
      is of Hashed or HashedOJ type. The comment in cdbpathlocus.h suggested
      that NULLs can be on any segment. But the rest of the code assumed that
      that's true only for HashedOJ, and that for Hashed, all NULLs are stored
      on a particular segment. There was a comment in cdbgroup.c that said "Or
      would HashedOJ ok, too?"; the answer to that is "No!". Given the comment
      in cdbpathlocus.h, I'm not surprised that the author was not very sure
      about that. Clarify the comments in cdbpathlocus.h and cdbgroup.c on that.
      
      There were a few cases where we got that actively wrong. The
      repartitionPlan() function is used to inject a Redistribute Motion into
      queries used for
      CREATE TABLE AS and INSERT, if the "current" locus didn't match the target
      table's policy. It did not check for HashedOJ. Because of that, if the
      query contained FULL JOINs, NULL values might end up on all segments. Code
      elsewhere, particularly in cdbgroup.c, assumes that all NULLs in a table
      are stored on a single segment, identified by the cdbhash value of a NULL
      datum. Fix that, by adding a check for HashedOJ in repartitionPlan(), and
      forcing a Redistribute Motion.
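      
      To illustrate, a hedged sketch of the failure mode (the table names are
      hypothetical, not from the commit):
      
          CREATE TABLE t_a (id int) DISTRIBUTED BY (id);
          CREATE TABLE t_b (id int) DISTRIBUTED BY (id);
          
          -- A FULL JOIN emits NULLs in both join columns for non-matching
          -- rows, so the join result's locus is HashedOJ: NULLs may sit on
          -- any segment.
          CREATE TABLE t_c AS
              SELECT t_a.id AS a_id, t_b.id AS b_id
              FROM t_a FULL JOIN t_b ON t_a.id = t_b.id
          DISTRIBUTED BY (a_id);
          -- Before the fix, this path did not force a Redistribute Motion
          -- for a HashedOJ locus, so rows with NULL a_id could stay on any
          -- segment, while other code assumed they all live on the segment
          -- a NULL datum hashes to.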
      
      CREATE TABLE AS had a similar problem, in the code to decide which
      distribution key to use, if the user didn't specify DISTRIBUTED BY
      explicitly. The default behaviour is to choose a distribution key that
      matches the distribution of the query, so that we can avoid adding an
      extra Redistribute Motion. After fixing repartitionPlan, there was no
      correctness problem, but if we chose the key based on a HashedOJ locus,
      there is no performance benefit because we'd need a Redistribute Motion
      anyway. So modify the code that chooses the CTAS distribution key to
      ignore HashedOJ.
      
      While we're at it, refactor the code to choose the CTAS distribution key,
      by moving it to a separate function. It had become ridiculously deeply
      indented.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/6154, and adds tests.
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
    • Add tests for deriving distribution keys from query in CREATE TABLE AS. · 9457fe71
      Committed by Heikki Linnakangas
      The case where some, but not all, of the query's distribution keys are
      present in the result set was not covered by any existing tests.
      
      Per Paul Guo's observation.
    • Cosmetic fixes in the code to determine distribution key for CTAS. · a5fa3110
      Committed by Heikki Linnakangas
      Fix indentation. In the code that generates a NOTICE, remove an if()
      test for a condition already checked earlier in the function, and use a
      StringInfo for building the string.
    • Remove unused variables, to silence compiler warnings. · 88a185a5
      Committed by Heikki Linnakangas
      These were left behind by commit 7f6066ea.
    • Fix obsolete comment in cdb_build_distribution_pathkeys(). · 8c9c0576
      Committed by Heikki Linnakangas
      It returns a simple list of PathKeys, not a list of lists. The code was
      already changed in the 8.3-era merge of equivalence classes, but we
      neglected to update the comment.
    • Remove dead code in ATExecSetDistributedBy · 50f2e3bb
      Committed by Zhenghua Lyu
      This commit is the first step in refactoring ATExecSetDistributedBy. Its
      main purpose is to remove some dead code in this function; in the
      process we found that some helper functions could also be simplified, so
      that simplification is included in this commit as well.
      
      According to MPP-7770, we should disable changing storage options for now.
      It is ugly to just throw an error on encountering the `appendonly` option
      without removing the code, so this commit removes all the related logic.
      
      Because the WITH clause can only contain reshuffle|reorganize, we only
      build new_rel_opts if the table itself is AO or AOCO; there is no need
      to deduce it from the WITH clause.
      
      We also remove the unnecessary checks at the start of the function,
      because these checks have already been done in `ATPrepCmd`.
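      
      For reference, a sketch of the forms this path still handles after the
      cleanup (hypothetical table and column names):
      
          -- Change the distribution key; data is redistributed as needed.
          ALTER TABLE t1 SET DISTRIBUTED BY (c1);
          -- REORGANIZE is one of the options still accepted in the WITH
          -- clause; it rewrites the table using the current policy.
          ALTER TABLE t1 SET WITH (REORGANIZE=true);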
      
      Co-authored-by: Shuejie Zhang <shzhang@pivotal.io>
    • After applying that commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the... · 1136f2fb
      Committed by Shaoqi Bai
      After applying commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the
      master will block when the standby master is down.
      
      There are a couple of things that need to be done to unblock the master:
      1. Run gpinitstandby -n to start the standby master back up.
      2. Run psql postgres -c "ALTER SYSTEM SET synchronous_standby_names = '';" and reload the master segment.
      
      Note that the ALTER SYSTEM SET has to be called again to set synchronous_standby_names back to '*' (and master config reloaded) to enable synchronous replication again.
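      
      A sketch of that re-enabling step (the GUC name is from this note; the
      reload via pg_reload_conf() is an assumption, any reload method works):
      
          ALTER SYSTEM SET synchronous_standby_names = '*';
          SELECT pg_reload_conf();  -- reload the master's configuration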
      Thoughts were to make it one step, combined into gpinitstandby -n,
      instead of documenting a multi-step process.
      
      What this commit does is exactly that: make it one step, combined into
      gpinitstandby -n.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
    • Remove Master/Standby SyncRepWait Greenplum hack · 7f6066ea
      Committed by Jimmy Yih
      When the standby master is unavailable, the master will not block on
      commits even when synchronous replication is enabled. This is because we
      have a Greenplum hack which checks if the WAL stream with the standby
      master is valid. If the stream is invalid, the master will quickly skip
      SyncRepWait and continue with its commit.
      
      Remove this hack in order to make Master/Standby and Primary/Mirror WAL
      replication more similar.
      Co-authored-by: Shaoqi Bai <sbai@pivotal.io>
    • Use max numsegments of subpaths for Append node · 1b2f7bcd
      Committed by Ning Yu
      Suppose t1 has numsegments=1 and t2 has numsegments=2; then the query
      below gets an incorrect plan:
      
          explain (costs off) select * from t2 a join t2 b using(c2)
                    union all select * from t1 c join t1 d using(c2);
                                         QUERY PLAN
          ------------------------------------------------------------------------
           Gather Motion 1:1  (slice3; segments: 1)
             ->  Append
                   ->  Hash Join
                         Hash Cond: (a.c2 = b.c2)
                         ->  Redistribute Motion 2:2  (slice1; segments: 2)
                               Hash Key: a.c2
                               ->  Seq Scan on t2 a
                         ->  Hash
                               ->  Redistribute Motion 2:2  (slice2; segments: 2)
                                     Hash Key: b.c2
                                     ->  Seq Scan on t2 b
                   ->  Hash Join
                         Hash Cond: (c.c2 = d.c2)
                         ->  Seq Scan on t1 c
                         ->  Hash
                               ->  Seq Scan on t1 d
           Optimizer: legacy query optimizer
          (17 rows)
      
      slice2 has a 2:2 Redistribute Motion to slice3, but slice3 only has
      1 segment. This is because the Append's numsegments is taken from its
      last subpath.
      
      To fix the issue, we should use the max numsegments of the subpaths for
      the Append.
      
      The issue was already fixed in 39856768;
      we are only adding tests for it now.
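      
      For reference, the partially distributed tables in this example can be
      set up with the gp_debug_numsegments extension from commit 3119009a in
      this same batch (a sketch; assumes the extension is installed):
      
          CREATE EXTENSION gp_debug_numsegments;
          SELECT gp_debug_set_create_table_default_numsegments(1);
          CREATE TABLE t1 (c1 int, c2 int);  -- numsegments=1
          SELECT gp_debug_set_create_table_default_numsegments(2);
          CREATE TABLE t2 (c1 int, c2 int);  -- numsegments=2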
    • New extension to debug partially distributed tables · 3119009a
      Committed by Ning Yu
      Introduced a new debugging extension gp_debug_numsegments to get / set
      the default numsegments when creating tables.
      
      gp_debug_get_create_table_default_numsegments() gets the default
      numsegments.
      
      gp_debug_set_create_table_default_numsegments(text) sets the default
      numsegments in text format; valid values are:
      - 'FULL': all the segments;
      - 'RANDOM': pick a random set of segments each time;
      - 'MINIMAL': the minimal set of segments;
      
      gp_debug_set_create_table_default_numsegments(integer) sets the default
      numsegments directly; the valid range is [1, gp_num_contents_in_cluster].
      
      gp_debug_reset_create_table_default_numsegments(text) or
      gp_debug_reset_create_table_default_numsegments(integer) resets the
      default numsegments to the specified value, and records it so it can be
      reused later.
      
      gp_debug_reset_create_table_default_numsegments() resets the default
      numsegments to the value passed last time; if there has been no previous
      call to it, the value is 'FULL'.
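      
      A usage sketch based on the functions above (assumes the extension is
      installed via CREATE EXTENSION):
      
          CREATE EXTENSION gp_debug_numsegments;
          SELECT gp_debug_set_create_table_default_numsegments('MINIMAL');
          CREATE TABLE t1 (c1 int);  -- created on the minimal set of segments
          SELECT gp_debug_get_create_table_default_numsegments();
          SELECT gp_debug_reset_create_table_default_numsegments();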
      
      Refactored ICG test partial_table.sql to create partial tables with this
      extension.
    • Fix dynahash HASH_ENTER usage · e576c0b9
      Committed by Daniel Gustafsson
      rel_partitioning_is_uniform() and addMCVToHashTable() inserted with
      HASH_ENTER, and subsequently checked the return value for NULL in
      order to error out on "out of memory". HASH_ENTER, however, doesn't
      return if it couldn't insert and will error out itself, so remove the
      test, as that case cannot happen.
      
      groupHashNew() was using HASH_ENTER_NULL, which does return NULL in
      out-of-memory situations, but it failed to correctly handle the
      return value and dereferenced it without a check, risking a null
      pointer dereference under memory pressure. Fix by using HASH_ENTER
      instead, as the code clearly expects that behavior.
      Reviewed-by: Paul Guo <paulguo@gmail.com>
    • Another attempt at fixing Assertion on empty AO tables. · 77ac9bdf
      Committed by Heikki Linnakangas
      Commit cc2e211f attempted to silence the assertion in
      vac_update_relstats(), but the assertion it in turn added was hit
      heavily. The vacuum_appendonly_fill_stats() function, where I added the
      check for the zero-pages and non-zero-tuples combination, is also
      reached in QD mode, contrary to the comments and the assertion that I
      added. I'm not sure why we look at the totals in QD mode - AFAICS we
      just throw them away - but I'm reluctant to start restructuring this
      code right now. So move the code that zaps reltuples to 0 into
      vac_update_relstats().
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
  2. 21 November 2018, 11 commits
    • Set statistics to zero when no sampled rows · fb537b53
      Committed by Daniel Gustafsson
      In the unlikely event that we reach this codepath with a samplerows
      value of zero, avoid performing a division by zero and instead set the
      null fraction to zero, as we clearly don't have any more information to
      go on. The HLL code calls the compute_stats function pointer with zero
      samplerows, and while that's using a different compute_stats function,
      it's an easy mistake to make when not all such functions can handle a
      samplerows of zero. This is defensive programming, prompted by a report
      that triggered an old bug like this without actually hitting this one,
      but there is little reason to take the risk of a crash. Suspenders go
      well with belts.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Fix assertion failure when vacuuming an AO table in utility mode. · cc2e211f
      Committed by Heikki Linnakangas
      There's an assertion in vac_update_relstats() that if num_tuples is
      non-zero, num_pages must also be non-zero. Makes sense: if the table
      takes up no space, there can't be any tuples in it. But we hit that case
      on vacuum of an AO table in the QD node in utility mode. The QD has the
      total tuple counts of each AO segment in the pg_aoseg table, across the
      whole cluster, but it doesn't have the physical sizes. So it reported
      N tuples, but 0 bytes.
      
      We started hitting that assertion after commit c0ce2eb9, which fixed the
      rounding in RelationGuessNumberOfBlocks() so that it now returns 0
      blocks for a 0-byte relation. It used to return 1, which masked the
      problem.
      
      Fix by reporting 0 tuples and 0 blocks in the QD. That's a change from the
      old behaviour, which was to report N tuples and 1 block, but it's
      consistent with heap tables.
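      
      A reproducer sketch for the conditions described above (hypothetical
      table name; assumes the VACUUM runs in a utility-mode connection to
      the QD):
      
          CREATE TABLE ao_t (a int) WITH (appendonly=true) DISTRIBUTED BY (a);
          INSERT INTO ao_t SELECT generate_series(1, 100);
          -- On the QD, pg_aoseg has cluster-wide tuple counts but no
          -- physical sizes, so this used to report N tuples and 0 pages.
          VACUUM ao_t;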
      Reviewed-by: Jacob Champion <pchampion@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • docs - update docs with PG 8.4 merge, iteration 3 (#6269) · b7756c77
      Committed by Chuck Litzell
      * Update DITA docs with Postgres 8.4 merge, iteration 3 changes to SGML
      files. (PR #3780)
      
      * Fix example syntax
      
      * Add missing <p> codes and move into sectiondiv
    • docs - using dblink as a non-superuser (#6270) · 5e2dc9bd
      Committed by Mel Kiyama
      * docs - using dblink as a non-superuser
      
      - update installing dblink on 6.0 - uses CREATE EXTENSION
      - clarify some dblink_connect() information
      - add using dblink_connect_u() as a non-superuser
      
      * docs - using dblink as a non-superuser - review comment updates
      
      * docs - using dblink as a non-superuser - more review comment updates
    • docs - gpcopy - add option --dest-dbname (#6276) · 66737c75
      Committed by Mel Kiyama
      * docs - gpcopy - add option --dest-dbname
      
      Also, add that the list of database names must be enclosed in quotes.
      
      * docs - gpcopy --dest-dbname review comment updates.
      Also, changed the list of multiple db names to specify no spaces between names.
      
      * docs - gpcopy --dest-dbname - fix typo
    • pg_rewind: Avoid need for empty callback test function. · a1d62a46
      Committed by Ashwin Agrawal
      Melanie suggested using
      `declare -F <function_name> > /dev/null && <function_name>` in the
      pg_rewind tests.
      
      With -F, declare prints only the name of the function, and the exit code
      is non-zero if the function is not defined; and-ing that with a call to
      the function means the function only runs in run_test if it is defined.
      This helps avoid coding empty functions when a test doesn't need an
      implementation.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
    • pg_rewind: handle AO and CO tables. · 97a76214
      Committed by Ashwin Agrawal
      Copy the tail end of an AO file from source to target (with some
      exceptions) if an xlog record for it is found on the target after
      divergence. COPY_TAIL is used because AO inserts are not of fixed size,
      so the best we can do is copy from the modification offset to the end
      of the file.
      
      The truncate record for AO is ignored, as for heap, for the same
      reasons.
      
      Tests are added for AO and CO tables to validate that COPY and COPY_TAIL
      are performed correctly.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • Fix pg_rewind to copy BTREE_METAPAGE for newroot. · 0a08f319
      Committed by Ashwin Agrawal
      The XLOG_BTREE_NEWROOT xlog record type doesn't log the update to the
      metapage, so pg_rewind misses rewinding the metapage change. The redo
      routine handling XLOG_BTREE_NEWROOT implicitly updates the metapage,
      which is always hard-coded to block zero.
      Co-authored-by: David Kimura <dkimura@pivotal.io>
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
    • pg_rewind tests: promote and validate master after rewind. · adb10fb2
      Committed by Ashwin Agrawal
      Enhance the test framework to promote the old master after pg_rewind and
      actually run queries against it. Previously, queries were run only
      against the standby, which didn't undergo pg_rewind, so the tests
      basically missed validating whether pg_rewind did its job correctly.
      Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
  3. 20 November 2018, 10 commits
  4. 19 November 2018, 7 commits
    • Use block-level sampling for ANALYZE on heap tables. · baa76536
      Committed by Heikki Linnakangas
      The user-visible feature here is to speed up ANALYZE on heap tables, by
      using the block-level sampling method from PostgreSQL, instead of
      scanning the whole table. This refactors the ANALYZE code to make it
      possible to call the upstream function for block-level sampling again.
      
      A secondary feature is that this enables the ANALYZE callback for
      Foreign Data Wrappers, introduced in PostgreSQL 9.2. It was disabled in
      the 9.2 merge because the function signature had been changed in GPDB,
      making it cumbersome to call. This patch reverts the acquire-function's
      signature to the same as in upstream, so the FDW callback works again.
      
      This also moves the logic to decide which rels to analyze, based on the
      optimizer_analyze_root_partition and optimizer_analyze_midlevel_partition
      GUCs and the ROOTPARTITION option, wholly to get_rel_oids() in vacuum.c.
      It was mostly done there already, but there were some warts in analyze.c.
      None of the code in analyze.c should now treat partitions differently
      from normal inherited tables, which helps to minimize the diff vs
      upstream. (Except that the HLL code still heavily uses partition-related
      functions. That's a TODO.)
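      
      A sketch of the knobs involved (the GUC and option names are from this
      message; the table name is hypothetical):
      
          SET optimizer_analyze_root_partition = on;
          SET optimizer_analyze_midlevel_partition = off;
          ANALYZE ROOTPARTITION sales;  -- update only the root's stats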
    • Fix test case for changed data distribution with Jump Consistent hash. · da0f7a74
      Committed by Heikki Linnakangas
      Commit 4a174240 changed the hash method, which changed the distribution
      of data. It adjusted many test cases that were dependent on the data
      distribution, but missed this one. This one isn't failing at the moment,
      but it was clearly meant to look at the relfrozenxid of the segment that
      it inserts all the data into. Change the value of the distribution key
      so that all the data falls into segment 0 again.
    • Fix rounding in RelationGuessNumberOfBlocks(). · c0ce2eb9
      Committed by Heikki Linnakangas
      Doesn't matter much for AO tables, but if this is to be used for heap
      tables, where the relation size is always a multiple of BLCKSZ, it would
      be nice to not be off-by-one.
    • Simplify the data structure to track "too wide" datums slightly. · fe914bcc
      Committed by Heikki Linnakangas
      The per-column 'toowide_cnt' was only used to check whether there are
      any, so it's redundant with the actual flags. Replace the bool array
      with a Bitmapset. The Bitmapset is left NULL when no bits are set, so we
      can use a NULL check in place of checking toowide_cnt.
    • In a database-wide ANALYZE, analyze root partitions last. · ec35f3f9
      Committed by Heikki Linnakangas
      The HLL method of updating the root stats only works if the leaf
      partitions have already been analyzed. So by always analyzing the leaves
      first, we can make use of HLL in updating the root stats. Otherwise,
      it's pure luck whether it works out or not, depending on the order in
      which the leaves and parents are analyzed.
    • Set dbname and username when MyProcPort is not initialized when setup… (#6206) · b7598541
      Committed by Hubert Zhang
      For a background worker, MyProcPort - which is used to set the dbname
      and username when creating a gang - is not initialized. We can use
      MyDatabaseId and AuthenticatedUserId instead, which are initialized in
      InitPostgres() for background workers.