1. 25 Jan 2019 (1 commit)
  2. 24 Jan 2019 (18 commits)
    • Tidy up error reporting in gp_sparse_vector a little · 9208b8f3
      Daniel Gustafsson committed
      This cleans up the error messages in the sparse vector code a little
      by ensuring they mostly conform to the style guide for error handling.
      Also fixes a nearby typo and removes commented-out elogs which are
      clearly dead code.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Fix up incorrect license statements for module · 41e8ed17
      Daniel Gustafsson committed
      The gp_sparse_vector module was covered by the relicensing done as
      part of the Greenplum open sourcing, but a few mentions of the previous
      licensing remained in the code. The legal situation of this code has
      been reviewed and cleared by Pivotal legal, so remove the incorrect
      statements and replace them with the standard copyright file headers.

      This also cleans up a few comments while at it.

      Reviewed-by: Cyrus Wadia
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Remove unused header file · 0c1add31
      Daniel Gustafsson committed
      The float_specials.h header was removed shortly after this contrib
      module was imported in 2010, and has been dead code since. Remove.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Remove unused and inline single-use functions · f1277a5c
      Daniel Gustafsson committed
      This removes a few unused functions, and inlines the function body of
      another one which only had a single caller. Also properly mark a
      few functions as static.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Stabilize gp_sparse_vector test · ccfe82b7
      Daniel Gustafsson committed
      Remove a redundant test on array_agg which didn't have stable output, and
      remove an ORDER BY to let atmsort deal with the differences instead.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Allocate histogram with palloc to avoid memleak · 0d4908ee
      Daniel Gustafsson committed
      The histogram structure was allocated statically via malloc(), but it
      had no data retention between calls; it was purely a micro-optimization
      to avoid the cost of repeated allocations. This led to the allocated
      memory leaking, as it is not cleaned up automatically. Fix by palloc'ing
      the memory instead and accepting the cost of repeated allocation.

      Also ensure that allocated memory is properly cleaned up in failure cases.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • d613a759
    • Fix memory management in gp_sparse_vector · 970a0395
      Daniel Gustafsson committed
      palloc() is guaranteed to only return on successful allocation, so there
      is no need to check its result. ereport(ERROR, ...) is guaranteed never
      to return, and to clean up on its way out, so pfree()ing after an
      ereport() is not just unreachable code; it would be a double-free if it
      were reached.

      Also add proper checks on the malloc() and strdup() calls, as those must
      be checked for failure by the programmer in the usual way.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • pg_dump: free temporary variable qualTmpExtTable · 68b11d46
      Adam Lee committed
      This part of the code is not covered by the PR pipeline; it was tested
      manually.
    • pg_dump: fix dropping temp external table failure on CI · ed940ab6
      Adam Lee committed
      fmtQualifiedId() and fmtId() share the same output buffer, so we cannot
      call one of them again until we are finished with the result of the
      previous call.
    • Don't choose indexscan when you need a motion for a subplan (#6665) · cd055f99
      Melanie committed

      When you have a subquery under a SUBLINK that might get pulled up, you
      should not allow indexscans to be chosen for the relation which is the
      range table for the subquery. If that relation is distributed and the
      subquery is pulled up, you will need to redistribute or broadcast that
      relation and materialize it on the segments, and cdbparallelize will not
      add a motion and materialize an indexscan, so you cannot use an indexscan
      in these cases.
      You can't materialize an indexscan because it will materialize only one
      tuple at a time, and when you compare that to the param you get from the
      relation on the segments, you can get wrong results.

      Because we don't pick indexscans very often, we don't see this issue very
      often. You need a subquery referring to a distributed table in a subplan
      which, during planning, gets pulled up, and then when adding paths, the
      indexscan is cheapest.
      Co-authored-by: Adam Berlin <aberlin@pivotal.io>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
    • Remove stale replication slots on mirrors. · fa09dd80
      David Kimura committed
      Stale replication slots can exist on mirrors that were once acting as
      primaries. In this case restart_lsn is a non-zero value left over from
      the past replication slot setup. The stale replication slot continues to
      retain xlog on the mirror, which is problematic and unnecessary.

      This patch drops the internal replication slot on startup of the mirror.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
    • Prevent int128 from requiring more than MAXALIGN alignment. · d74dd56a
      Jesse Zhang committed
      We backported 128-bit integer support to speed up aggregates (commits
      8122e143 and 959277a4) from upstream 9.6 into Greenplum (in
      commits 9b164486 and 325e6fcd). However, we forgot to also port a
      follow-up fix, postgres/postgres@7518049980b, mostly because it's nuanced
      and hard to reproduce.

      There are two ways to observe the breakage:

      1. On a lucky day, tests would fail on my workstation, but not my laptop (or
         vice versa).

      2. If you stare at the generated code for `int8_avg_combine` (and friends),
         you'll notice the compiler uses "aligned" instructions like `movaps` and
         `movdqa` (on AMD64).
      
      Today's my lucky day.
      
      Original commit message from postgres/postgres@7518049980b (by Tom Lane):
      
      > Our initial work with int128 neglected alignment considerations, an
      > oversight that came back to bite us in bug #14897 from Vincent Lachenal.
      > It is unsurprising that int128 might have a 16-byte alignment requirement;
      > what's slightly more surprising is that even notoriously lax Intel chips
      > sometimes enforce that.
      
      > Raising MAXALIGN seems out of the question: the costs in wasted disk and
      > memory space would be significant, and there would also be an on-disk
      > compatibility break.  Nor does it seem very practical to try to allow some
      > data structures to have more-than-MAXALIGN alignment requirement, as we'd
      > have to push knowledge of that throughout various code that copies data
      > structures around.
      
      > The only way out of the box is to make type int128 conform to the system's
      > alignment assumptions.  Fortunately, gcc supports that via its
      > __attribute__(aligned()) pragma; and since we don't currently support
      > int128 on non-gcc-workalike compilers, we shouldn't be losing any platform
      > support this way.
      
      > Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF) and
      > called it a day, I did a little bit of extra work to make the code more
      > portable than that: it will also support int128 on compilers without
      > __attribute__(aligned()), if the native alignment of their 128-bit-int
      > type is no more than that of int64.
      
      > Add a regression test case that exercises the one known instance of the
      > problem, in parallel aggregation over a bigint column.
      
      > This will need to be back-patched, along with the preparatory commit
      > 91aec93e.  But let's see what the buildfarm makes of it first.
      
      > Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org
      
      (cherry picked from commit 75180499)
    • Rearrange c.h to create a "compiler characteristics" section. · 60a08bc2
      Jesse Zhang committed
      This cherry-picks 91aec93e. We had to be extra careful to preserve
      the still-in-use macros UnusedArg and STATIC_IF_INLINE and friends.
      
      > Generalize section 1 to handle stuff that is principally about the
      > compiler (not libraries), such as attributes, and collect stuff there
      > that had been dropped into various other parts of c.h.  Also, push
      > all the gettext macros into section 8, so that section 0 is really
      > just inclusions rather than inclusions and random other stuff.
      
      > The primary goal here is to get pg_attribute_aligned() defined before
      > section 3, so that we can use it with int128.  But this seems like good
      > cleanup anyway.
      
      > This patch just moves macro definitions around, and shouldn't result
      > in any changes in generated code.  But I'll push it out separately
      > to see if the buildfarm agrees.
      
      > Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org
      
      (cherry picked from commit 91aec93e)
    • Update GDD to not assign global transaction ids · e24ddd70
      David Kimura committed
      Currently GDD sets DistributedTransactionContext to
      DTX_CONTEXT_QD_DISTRIBUTED_CAPABLE and as a result allocates a
      distributed transaction id. It creates an entry in
      ProcGlobal->allTmGxact with state DTX_STATE_ACTIVE_NOT_DISTRIBUTED. The
      effect of this is that any query taking a snapshot sees this
      transaction as in progress. Since the GDD transaction is short-lived
      this is not an issue in general, but in CI it causes flaky behavior for
      some of the vacuum tests. The flaky behavior shows up as unvacuumed
      tables, where the vacuum snapshot was taken while the GDD transaction
      was running, forcing vacuum to lower its oldest XMIN. The current
      behavior of GDD consuming a distributed transaction id (every 2 minutes
      by default) is also wasteful.

      Currently GDD also sends a snapshot to the QEs, but this isn't required
      and is wasteful as well.

      With this change GDD keeps DistributedTransactionContext as
      DTX_CONTEXT_LOCAL_ONLY and avoids dispatching snapshots to QEs.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
    • Gpexpand use gp_add_segment to register primaries. · e2c699c8
      Ashwin Agrawal committed
      Currently, the dbid is used in the tablespace path, hence creating a
      segment requires the dbid. To get the dbid, the segment needs to be
      added to the catalog first, but adding a segment to the catalog before
      creating it causes issues. Hence, modify gpexpand to not let the
      database generate the dbid, but instead pass in the dbid generated
      upfront while registering the segment in the catalog. This way the dbid
      used while creating the segment is the same as the dbid in the catalog.
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
    • Add libzstd-devel to docker images (#6787) · bcdcb827
      Lav Jain committed
    • Explicitly pass 0 as number of dead tuples to pgstat when vacuuming AO tables. · 148d718d
      Georgios Kokolatos committed
      An argument can be made that hidden tuples in AO tables are similar to
      dead tuples in regular tables. However, the use of this information with
      regard to pgstats is semantically distinct and consequently should not
      be exposed. As an example, after a VACUUM (FULL, ANALYZE) of an AO
      table, hidden tuples will remain if the AO compaction thresholds are
      not met.

      It seems preferable to explicitly pass 0 instead of the already zeroed
      LVRelStats member, for clarity.
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
  3. 23 Jan 2019 (15 commits)
    • Fix bug: zombie record in gp_distribution_policy (#6768) · b949ac96
      Jialun committed
      When a table has been transformed into a view by creating an ON SELECT
      rule, the record in gp_distribution_policy should be deleted as well,
      since there is no such record for a view.
      Also, the relstorage in pg_class should be changed to 'v'.
    • Add libzstd to CentOS dependencies README · 78038632
      Dmitriy Dubson committed
      Add missing documentation on the newly required `libzstd` dependency.
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
    • gp_toolkit.gp_skew_* should support replicated table correctly · 6e862c90
      Pengzhou Tang committed
      The gp_toolkit.gp_skew_* series of views/functions is used to query how
      data is skewed in the database. The idea is to use a query like
      "select gp_segment_id, count(*) cnt from foo group by gp_segment_id"
      and compare the cnt values across segments.

      For a replicated table, only one replica is picked by the planner to
      count the tuple number, so the old calculation logic produced the
      confusing result that a replicated table is skewed, which is not
      expected:

      gpadmin=# select * from gp_toolkit.gp_skew_idle_fractions;
       sifoid | sifnamespace | sifrelname |      siffraction
      --------+--------------+------------+------------------------
        16385 | public       | rpt        | 0.66666666666666666667

      What's more, gp_segment_id is ambiguous for a replicated table, so in
      commit b120194a we disallowed users from accessing system columns,
      including gp_segment_id, and the gp_toolkit.gp_skew_* views now report
      an error.

      This commit corrects the results of the gp_toolkit.gp_skew_*
      views/functions for replicated tables. Although the results are
      pointless, this way is more friendly for users.
    • Remove the obsolete comment for RETURNING and put the test in a parallel... · 00daeffe
      Paul Guo committed
      Remove the obsolete comment for RETURNING and put the test in a parallel
      running group, following pg upstream.
    • Run gp_toolkit early to reduce testing time of it due to less logs. · 5bc0bcb2
      Paul Guo committed
      The gp_toolkit test exercises various log-related views like
      gp_log_system(). If we run the test earlier, fewer logs have been
      generated and thus the test runs faster. In my test environment the
      test time drops from ~22 seconds to 6.x seconds with this patch. I also
      checked the whole test case; this change does not affect the test
      coverage.
    • Declare cursor for update should handle replicated table too · 80f49a18
      Pengzhou Tang committed
      In the 9.0 merge, we added the below rule for FOR UPDATE:

      SELECT FOR UPDATE will lock the whole table; we do it at
      addRangeTableEntry. The reason is that GPDB is an MPP database and the
      result tuples may not be on the same segment. And for a cursor
      statement, the reader gang cannot get the Xid to lock the tuples, so we
      did not add a LockRows node for distributed tables to avoid this.

      This rule should also apply to replicated tables.
    • Synchronize mpp_execute option description and precedence rules in en… (#6734) · 7e0bd349
      David Yozie committed

      * Synchronize mpp_execute option description and precedence rules in end-user documentation

      * describe the order of precedence in each command

      * one any -> any one

      * Feedback from Lisa
    • Parent partition and children partition table must have same columns · d8a613b8
      ZhangJackey committed
      In the previous code, we could modify the parent partition's columns
      by ALTER TABLE ONLY, so the columns of the parent partition
      and the child partitions could differ.

      In order to prohibit this situation, we check the DROP COLUMN /
      ADD COLUMN / ALTER TYPE COLUMN statements to prevent the user from
      modifying only the columns of the parent partition or only those of
      the child partitions.

      There was a discussion on gpdb-dev@:
      https://groups.google.com/a/greenplum.org/forum/#!msg/gpdb-dev/0SzL_gSbqKo/d-2RpwKrFwAJ
    • Delete top-level Dockerfile · ae67ca0f
      Bradford D. Boyle committed
      It doesn't build, because --disable-orca is not being passed to configure
      and pivotaldata/gpdb-devel doesn't have Xerces, on which ORCA depends.

      It seems this Dockerfile is not used. The Dockerfiles in
      ./src/tools/docker/*/Dockerfile are more recently maintained.
      Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
      Co-authored-by: Ben Christel <bchristel@pivotal.io>
    • CI: Remove extra sles11 task input for RC job · ca26fb34
      Kris Macoskey committed
      For GPDB 6 Beta, only CentOS 6/7 need to be passing for the same commit
      to be a valid release candidate.

      This was originally done in commit fa63e7ab, but that commit was
      missing an update to the task yaml for the Release_Candidate job to
      accommodate the removal of the sles11 input.
      Authored-by: Kris Macoskey <kmacoskey@pivotal.io>
    • a9cd61e0
    • Validation for gp_dbid and gp_contentid between QD catalog and QE. · 78aed203
      Ashwin Agrawal committed
      Since gp_dbid and gp_contentid are stored in conf files on the QE, it
      is helpful to have validation comparing the values with the QD catalog
      table gp_segment_configuration. This validation is performed using
      FTS. The FTS message includes the gp_dbid and gp_contentid values from
      the catalog; the QE validates the values while handling the FTS message
      and PANICs if it finds an inconsistency.

      This check is mostly targeted at development, to catch missed handling
      of gp_dbid and gp_contentid values in config files, for future features
      like pg_upgrade and gpexpand which copy the master directory and
      convert it to a segment.
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
    • Delete gpsetdbid.py and gp_dbid.py. · 549cd61c
      Ashwin Agrawal committed
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
    • Store gp_dbid and gp_contentid in conf files. · 4eaeb7bc
      Ashwin Agrawal committed
      Currently, gp_dbid and gp_contentid are passed as command line
      arguments when starting a QD or QE. Since the values are stored in the
      master's catalog table, the master must be started first to get the
      right values. Hence, a hard-coded dbid=1 was always used for starting
      the master in admin mode. This worked fine as long as the dbid was not
      used for anything on disk. But given that the dbid is used in the
      tablespace path in GPDB 6, starting the instance with the wrong dbid
      invites recovery-time failures, data corruption or data loss. Dbid=1
      goes wrong after failover to the standby master, as it has dbid != 1.
      This commit hence eliminates the need to pass gp_dbid and gp_contentid
      on the command line; instead, the values are stored in conf files while
      creating the instance.

      This also helps to avoid passing gp_dbid as an argument to pg_rewind,
      which needs to start the target instance in single user mode to
      complete recovery before performing the rewind operation.

      Plus, this eases development: just use pg_ctl start without having to
      correctly pass these values.

       - gp_contentid is stored in the postgresql.conf file.

       - gp_dbid is stored in internal.auto.conf.

       - Introduce the internal.auto.conf file, created during
         initdb. internal.auto.conf is included from the postgresql.conf file.

       - A separate file is chosen for gp_dbid to ease handling during
         pg_rewind and pg_basebackup, as this file can be excluded from the
         copy from primary to mirror, instead of trying to edit its contents
         after the copy during these operations. gp_contentid stays the same
         for a primary and its mirror, hence having it in postgresql.conf
         makes sense. If gp_contentid were also stored in the new
         internal.auto.conf file, pg_basebackup would need to be passed the
         contentid as well to write to this file.

       - pg_basebackup: write the gp_dbid after backup. Since gp_dbid is
         unique for a primary and its mirror, pg_basebackup excludes copying
         the internal.auto.conf file storing the gp_dbid. pg_basebackup
         explicitly (over)writes the file with the value passed as
         --target-gp-dbid, which is therefore now a mandatory argument to
         pg_basebackup.

       - gpexpand: update gp_dbid and gp_contentid after the directory copy.

       - pg_upgrade: retain all configuration files for the
         segment. postgresql.auto.conf and internal.auto.conf are also
         internal configuration files which should be restored back after the
         directory copy. A similar change is required in the gp_upgrade repo
         in restoreSegmentFiles() after copyMasterDirOverSegment().

       - Update tests to avoid passing gp_dbid and gp_contentid.
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
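A hypothetical sketch of the file layout this commit describes on a segment data directory (values made up for illustration):

```
# $PGDATA/postgresql.conf
gp_contentid = 0                  # same for a primary and its mirror
include 'internal.auto.conf'      # pulls in the per-instance dbid

# $PGDATA/internal.auto.conf  (excluded from pg_basebackup's copy,
# rewritten via --target-gp-dbid)
gp_dbid = 2                       # unique per instance
```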
    • gpinitsystem: add mirror to catalog first and then create them. · f6e85f1f
      Ashwin Agrawal committed
      To create a mirror, a pg_basebackup needs to be performed, and
      pg_basebackup needs the dbid as an argument to correctly handle
      tablespaces. This requirement exists because the dbid is used in the
      tablespace path.

      For the dbid in the master catalog to be in sync with what the mirror
      uses for tablespaces, the mirror needs to be added to the catalog
      first. Get the dbid and pass it to pg_basebackup when creating the
      mirror.
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
  4. 22 Jan 2019 (3 commits)
    • Remove the case of external partition · 9cef1c91
      Adam Lee committed
      pg_upgrade doesn't like it; please revert this commit once the
      restriction is removed.
      
      ```
      Checking for external tables used in partitioning           fatal
      
      | Your installation contains partitioned tables with external
      | tables as partitions.  These partitions need to be removed
      | from the partition hierarchy before the upgrade.  A list of
      | external partitions to remove is in the file:
      | 	external_partitions.txt
      
      Failure, exiting
      ```
    • pg_dump: dump the namespace while processing external partitions · bbb9f9dd
      Adam Lee committed
      We forgot to dump the namespace while processing external partitions.
      This would be a problem since upstream pg_dump decided not to dump the
      search_path; this commit fixes it.
    • Fix gppkg error when master and standby master are in the same node · 1f33759b
      Haozhou Wang committed
      If both the master and the standby master are set up on the same node,
      the gppkg utility reports an error when uninstalling a gppkg. This is
      because the gppkg utility assumes the master and standby master are on
      different nodes, which may not be true in a test environment.

      This patch fixes the issue: when the master and standby master are on
      the same node, we skip installing/uninstalling the gppkg on the standby
      master node.
  5. 21 Jan 2019 (2 commits)
    • Remove GPDB_93_MERGE_FIXME (#6699) · d286b105
      Shaoqi Bai committed
      The code was added to tackle the following case: when FTS sends a
      promote message, the mirror creates the PROMOTE file and is signaled to
      promote. But while the mirror is still under promotion and not yet
      done, FTS may send promote again, which creates the PROMOTE file
      again. Now, this PROMOTE file exists on the promoted mirror which is
      acting as primary. So, if a basebackup was taken from this primary to
      create a mirror, it included the PROMOTE file and auto-promoted the
      mirror on creation, which is incorrect. Hence, code was added to detect
      via FTS whether this file exists and delete the PROMOTE file, along
      with pg_basebackup excluding the PROMOTE file from the copy.

      Now, given that background, the upstream commit to always delete the
      PROMOTE file on postmaster start covers even the case where the PROMOTE
      file is created after mirror promotion and gets copied over by
      pg_basebackup: on mirror startup there is no risk of auto-promotion.
      So we can safely remove this code now.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: Paul Guo <pguo@pivotal.io>
    • Use the right rel for largest_child_relation(). · 8712da1e
      Richard Guo committed
      Function largest_child_relation() is used to find the largest child
      relation of an inherited/partitioned relation, recursively. Previously
      we passed the wrong rel as its parameter.

      This patch finds in root->simple_rel_array the right rel for
      largest_child_relation(). It also replaces several rt_fetch calls with
      a lookup in root->simple_rte_array.

      This patch fixes #6599.
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
  6. 19 Jan 2019 (1 commit)
    • docs - reorg pxf content, add multi-server, objstore content (#6736) · f601572d
      Lisa Owen committed
      * docs - reorg pxf content, add multi-server, objstore content
      
      * misc edits, SERVER not optional
      
      * add server, remove creds from examples
      
      * address comments from alexd
      
      * most edits requested by david
      
      * add Minio to table column name
      
      * edits from review with pxf team (start)
      
      * clear text credentials, reorg objstore cfg page
      
      * remove steps with XXX placeholder
      
      * add MapR to supported hadoop distro list
      
      * more objstore config updates
      
      * address objstore comments from alex
      
      * one parquet data type mapping table, misc edits
      
      * misc edits from david
      
      * add mapr hadoop config step, misc edits
      
      * fix formatting
      
      * clarify copying libs for MapR
      
      * fix pxf links on CREATE EXTERNAL TABLE page
      
      * misc edits
      
      * mapr paths may differ based on version in use
      
      * misc edits, use full topic name
      
      * update OSS book for pxf subnav restructure