1. 02 Feb, 2019 (1 commit)
    • Docs: remove install guide source (#6859) · 0af5719f
      Authored by David Yozie
      * bump postgresql url reference to 9.4
      
      * Remove source for install guide
      
      * Revert "bump postgresql url reference to 9.4"
      
      This reverts commit ab3405ae380f2f5a08ca5305f51fd431f479eae3.
  2. 01 Feb, 2019 (17 commits)
    • Fix gpcheckcat test case, distkey cannot be NULL anymore. · 6cacc636
      Authored by Heikki Linnakangas
      A randomly distributed table is now represented by an empty int2vector.
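
      For illustration, a minimal sketch of the new representation (the table
      name is hypothetical):

      ```sql
      CREATE TABLE rand_tab (a int, b text) DISTRIBUTED RANDOMLY;

      SELECT localoid::regclass, distkey
      FROM gp_distribution_policy
      WHERE localoid = 'rand_tab'::regclass;
      -- expected: distkey is '' (an empty int2vector), not NULL
      ```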
    • Error out on multiple writers in CTE · bfcb7882
      Authored by Daniel Gustafsson
      While Greenplum can plan a CTE query with multiple writable expressions,
      it cannot execute it, as it is limited to a single writer
      gang. Until we can support multiple writer gangs, let's error out with
      a graceful error message rather than failing during execution with a
      more cryptic internal error.
      
      Ideally this will be reverted in GPDB 7.X but right now it's much too
      close to release for attacking this.
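
      A sketch of the kind of query affected (table names are hypothetical);
      it now fails with a clear error up front rather than a cryptic one at
      execution:

      ```sql
      WITH w1 AS (INSERT INTO t1 VALUES (1) RETURNING *),
           w2 AS (UPDATE t2 SET a = a + 1 RETURNING *)  -- second writer
      SELECT * FROM w1, w2;
      ```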
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Fix leftover merge conflict in xmlmap test · cfad092b
      Authored by Daniel Gustafsson
      The 9.4.20 merge mistakenly left a merge conflict in the alternative
      output for the xmlmap test. Fix verified against a backend without
      XML support.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Use normal hash operator classes for data distribution. · 242783ae
      Authored by Heikki Linnakangas
      Replace the use of the built-in hashing support for built-in datatypes, in
      cdbhash.c, with the normal PostgreSQL hash functions. Now is a good time
      to do this, since we've already made the change to use jump consistent
      hashing in GPDB 6, so we'll need to deal with the upgrade problems
      associated with changing the hash functions, anyway.
      
      It is no longer enough to track which columns/expressions are used to
      distribute data. You also need to know the hash function used. For that,
      a new field is added to gp_distribution_policy, to record the hash
      operator class used for each distribution key column. In the planner,
      a new opfamily field is added to DistributionKey, to track that throughout
      the planning.
      
      Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the
      default hash operator class for the datatype is used. But this patch
      extends the syntax so that you can specify the operator class explicitly,
      like "... DISTRIBUTED BY (column opclass)". This is similar to how an
      operator class can be specified for each column in CREATE INDEX.
      
      To support upgrade, the old hash functions have been converted to special
      (non-default) operator classes, named cdbhash_*_ops. For example, if you
      want to use the old hash function for an integer column, you could do
      "DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist
      of operators that have "compatible" cdbhash functions has been replaced
      by putting the compatible hash opclasses in the same operator family. For
      example, all legacy integer operator classes, cdbhash_int2_ops,
      cdbhash_int4_ops and cdbhash_int8_ops, are all part of the
      cdbhash_integer_ops operator family.
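
      A minimal sketch of both forms (table and column names are illustrative;
      int4_ops is assumed to be the default hash opclass name for int4):

      ```sql
      -- Default hash opclass, chosen implicitly:
      CREATE TABLE t_new (id int4) DISTRIBUTED BY (id);

      -- The same, with the opclass spelled out explicitly:
      CREATE TABLE t_explicit (id int4) DISTRIBUTED BY (id int4_ops);

      -- Legacy GPDB 5 hashing, via the special non-default opclass:
      CREATE TABLE t_legacy (id int4) DISTRIBUTED BY (id cdbhash_int4_ops);
      ```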
      
      This removes the pg_database.hashmethod field. The hash method is now
      tracked on a per-table and per-column basis, using the opclasses, so it's
      not needed anymore.
      
      To help with upgrade from GPDB 5, this introduces a new GUC called
      'gp_use_legacy_hashops'. If it's set, CREATE TABLE uses the legacy hash
      opclasses, instead of the default hash opclasses, if the opclass is not
      specified explicitly. pg_upgrade will set the new GUC, to force the use of
      legacy hashops, when restoring the schema dump. It will also set the GUC
      on all upgraded databases, as a per-database option, so any new tables
      created after upgrade will also use the legacy opclasses. It seems better
      to be consistent after upgrade, so that collocation between old and new
      tables works, for example. The idea is that some time after the upgrade, the
      admin can reorganize all tables to use the default opclasses instead. At
      that point, the admin should also clear the GUC on the converted databases.
      (Or rather, the automated tool that hasn't been written yet should do that.)
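
      A sketch of the knob in use (the database name is hypothetical):

      ```sql
      -- Session level: new tables default to the legacy cdbhash_*_ops opclasses.
      SET gp_use_legacy_hashops = on;
      CREATE TABLE t_after_upgrade (id int4) DISTRIBUTED BY (id);

      -- Per database, as pg_upgrade sets it on upgraded databases:
      ALTER DATABASE mydb SET gp_use_legacy_hashops = on;

      -- Later, once all tables are reorganized onto the default opclasses:
      ALTER DATABASE mydb RESET gp_use_legacy_hashops;
      ```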
      
      ORCA doesn't know about hash operator classes, or the possibility that we
      might need to use a different hash function for two columns with the same
      datatype. Therefore, it cannot produce correct plans for queries that mix
      different distribution hash opclasses for the same datatype, in the same
      query. There are checks in the Query->DXL translation, to detect that
      case, and fall back to planner. As long as you stick to the default
      opclasses in all tables, we let ORCA create the plan without any regard
      to them, and use the default opclasses when translating the DXL plan to a
      Plan tree. We also allow the case that all tables in the query use the
      "legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the
      two, or using any non-default opclasses, forces ORCA to fall back.
      
      One curiosity with this is the "int2vector" and "aclitem" datatypes. They
      have a hash opclass, but no b-tree operators. GPDB 4 used to allow them
      as DISTRIBUTED BY columns, but we forbid that in GPDB 5, in commit
      56e7c16b. Now they are allowed again, so you can specify an int2vector
      or aclitem column in DISTRIBUTED BY, but it's still pretty useless,
      because the planner still can't form EquivalenceClasses on it, and will
      treat it as "strewn" distribution, and won't co-locate joins.
      
      Abstime, reltime, tinterval datatypes don't have default hash opclasses.
      They are being removed completely in PostgreSQL v12, and users shouldn't
      be using them in the first place, so instead of adding hash opclasses for
      them now, we accept that they can't be used as distribution key columns
      anymore. Add a check to pg_upgrade, to refuse upgrade if they are used
      as distribution keys in the old cluster. Do the same for 'money' datatype
      as well, although that's not being removed in upstream.
      
      The legacy hashing code for anyarray in GPDB 5 was actually broken. It
      could produce a different hash value for two arrays that are considered
      equal, according to the = operator, if there were differences in e.g.
      whether the null bitmap was stored or not. Add a check to pg_upgrade, to
      reject the upgrade if array types were used as distribution keys. The
      upstream hash opclass for anyarray works, though, so it is OK to use
      arrays as distribution keys in new tables. We just don't support binary
      upgrading them from GPDB 5. (See github issue
      https://github.com/greenplum-db/gpdb/issues/5467). The legacy hashing of
      'anyrange' had the same problem, but that was new in GPDB 6, so we don't
      need a pg_upgrade check for that.
      
      This also tightens the checks in ALTER TABLE ALTER COLUMN and CREATE UNIQUE
      INDEX, so that you can no longer create a situation where a non-hashable
      column becomes the distribution key. (Fixes github issue
      https://github.com/greenplum-db/gpdb/issues/6317)
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
      Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      Reviewed-by: Simon Gao <sgao@pivotal.io>
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
    • Rename gp_distribution_policy.attrnums to distkey, and make it int2vector. · 69ec6926
      Authored by Heikki Linnakangas
      This is in preparation for adding operator classes as a new column
      (distclass) to gp_distribution_policy. This naming is consistent with
      pg_index.indkey/indclass. Change the datatype to int2vector, also for
      consistency with pg_index, and some other catalogs that store attribute
      numbers, and because int2vector is slightly more convenient to work with
      in the backend. Move the column to the end of the table, so that all the
      variable-length and nullable columns are at the end, which makes it
      possible to reference the other columns directly in Form_gp_policy.
      
      Add a backend function, pg_get_table_distributedby(), to deparse the
      DISTRIBUTED BY definition of a table into a string. This is similar to
      pg_get_indexdef_columns(), pg_get_functiondef() etc. functions that we
      have. Use the new function in psql and pg_dump, when connected to a GPDB6
      server.
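
      A sketch of the new function in use (the table name is hypothetical):

      ```sql
      SELECT pg_get_table_distributedby('t_new'::regclass);
      -- returns, e.g.: DISTRIBUTED BY (id)
      ```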
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Peifeng Qiu <pqiu@pivotal.io>
      Co-authored-by: Adam Lee <ali@pivotal.io>
    • Make GDD tests deterministic · a0b9fde8
      Authored by Pengzhou Tang
      The GDD test framework now acquires the desired lock by updating the nth
      tuple in a segment instead of a tuple with a specific value, so the tests
      are unaffected even if the hash algorithm changes. This works fine unless
      a segment doesn't have enough tuples to provide the nth tuple. The fix is
      simple: enlarge the test tables from 20 rows to 100 rows.
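
      A hypothetical sketch of the technique (table and column names are made
      up): lock the nth tuple on a chosen segment by position rather than by
      value, so the hash algorithm no longer matters:

      ```sql
      UPDATE gdd_tab SET val = val
      WHERE id = (SELECT id FROM gdd_tab
                  WHERE gp_segment_id = 0        -- pick a segment...
                  ORDER BY id LIMIT 1 OFFSET 3); -- ...and its 4th tuple
      ```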
      
      Authored-by: Ning Yu <nyu@pivotal.io>
    • Update serially when GDD is disabled · 29e7f102
      Authored by Zhang Shujie
      If the Global Deadlock Detector (GDD) is enabled, the table lock may be
      downgraded to RowExclusiveLock, which can lead to two problems:
      
      1. When updating distribution keys concurrently, the SplitUpdate node
         would generate extra tuples in the table.
      2. When updating concurrently, the EvalPlanQual function may be
         triggered; when the SubPlan has a Motion node, it cannot execute
         correctly.
      
      Now we add a GUC for GDD: if it is disabled, we execute these UPDATE
      statements serially; if it is enabled, we raise an error when
      updating concurrently.
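
      The message doesn't name the GUC; assuming it is
      gp_enable_global_deadlock_detector, the behavior sketches as:

      ```sql
      SHOW gp_enable_global_deadlock_detector;
      -- off: concurrent UPDATEs take a stronger table lock and run serially
      -- on:  concurrent updates of a distribution key raise an error
      ```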
      
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
    • Remove ext submodule folder · 1e43b584
      Authored by Peifeng Qiu
      We removed the submodule address but didn't remove the actual
      folder. Submodule clone will fail due to the missing url link. Remove
      the folder to avoid that.
    • Fix OOM after cluster reset when gp_vmem_protect_limit > 16GB (#6862) · 9e0e7c27
      Authored by Jialun
      The function VmemTracker_ShmemInit initializes chunkSizeInBits, the
      unit of chunk size, according to gp_vmem_protect_limit. The base value
      of chunkSizeInBits is 20 (1MB). If gp_vmem_protect_limit is larger
      than 16GB, it is increased to adapt to large-memory environments. This
      value should not change after initialization, but if the function is
      called more than once, chunkSizeInBits accumulates.
      
      Consider the scenario: the QD crashes, then the postmaster reaps the
      QD process and resets shared memory. This leads to VmemTracker_ShmemInit
      being called again, so chunkSizeInBits increases after every crash when
      gp_vmem_protect_limit is larger than 16GB. Eventually the chunk size
      becomes so large that the number of newly reserved chunks is always
      zero or a very small value, so the memory limit mechanism takes no
      effect and causes Out-of-Memory failures when memory can no longer
      actually be allocated.
      
      So we set chunkSizeInBits to BITS_IN_MB in VmemTracker_ShmemInit
      every time, instead of asserting on it.
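
      A simplified C sketch of the fix, not the verbatim source (the helper
      name is made up; BITS_IN_MB follows the message above):

      ```c
      #define BITS_IN_MB 20

      static int chunkSizeInBits;

      /* Sketch of VmemTracker_ShmemInit's chunk-size setup. */
      void
      VmemTracker_ShmemInit_sketch(int vmem_protect_limit_mb)
      {
          int limitMB = vmem_protect_limit_mb;

          /*
           * Before: Assert(chunkSizeInBits == BITS_IN_MB), which let a stale
           * value survive a post-crash shared-memory reset and accumulate.
           * After: reset unconditionally on every call.
           */
          chunkSizeInBits = BITS_IN_MB;

          /*
           * Grow the chunk unit for large limits; e.g. 65535MB > 16GB doubles
           * the chunk size twice: 1MB -> 4MB (bits 20 -> 22).
           */
          while (limitMB > 16 * 1024)
          {
              chunkSizeInBits++;
              limitMB >>= 1;
          }
      }
      ```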
      
      Why is there no new test case in this commit?
      - We just change an Assert to an assignment, no logic changes.
      - It is very difficult to add a crash case in the current isolation test
        framework, since the connection will be lost due to the crash.
      
      We have verified the case manually in our dev environment by setting
      gp_vmem_protect_limit to 65535 and killing the QD process with kill -9.
      chunkSizeInBits then increases every time; at last we got the error
      message "ERROR:  Canceling query because of high VMEM usage."
    • Remove unused gpfdist dependency submodule and WIN32 Readme (#6861) · 7281a162
      Authored by Peifeng Qiu
      We no longer use the ext submodule for gpfdist dependencies. Remove
      it to avoid confusion. The WIN32 build process has changed to a
      native build; we will add a README when it's ready.
    • Remove a FIXME related to recoveryTargetIsLatest. (#6863) · 406fa028
      Authored by Paul Guo
      The recoveryTargetIsLatest setting code was somehow missing and was
      later added back in commit 55808e18. Remove the FIXME comment.
      Reviewed-by: Jimmy Yih <jyih@pivotal.io>
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
    • Set needToPromoteCatalog before updating ControlFile->state. · 434bd5b9
      Authored by Ashwin Agrawal
      Commit 6d80ce31 moved the update of the
      control file state earlier, which caused a CI failure in the
      gpactivatestandby test: the catalog update was missed because
      needToPromoteCatalog remained false. Hence, set
      needToPromoteCatalog before setting ControlFile->state.
    • Avoid FinishPreparedTransaction() calling readRecoveryCommandFile() · cb256d04
      Authored by Ashwin Agrawal
      Not sure why we had FinishPreparedTransaction() calling
      readRecoveryCommandFile(); it seems to serve no purpose. It appears
      to have existed for ages, and I wasn't able to find the rationale for
      it, certainly not with the current code. Reading and parsing the file
      on every commit is an unnecessary performance hit.
    • Align transaction log manager (xlog.c and xlog.h) to upstream. · 6d80ce31
      Authored by Ashwin Agrawal
      Lots of differences compared to upstream have collected over the
      years, along with some confusing or redundant code, so it is better
      to make these files match upstream.
    • concourse: Remove unused dev_generate_installer.yml · b55a0b71
      Authored by Amil Khanzada
      - We're not sure when this file became abandoned, but it doesn't seem
        to be used anywhere.
      - Also remove the task file and bash scripts that were only referenced
        by this pipeline.
      Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
      Co-authored-by: Amil Khanzada <akhanzada@pivotal.io>
    • Remove disabled code in set_plan_references_input_asserts() · e8238cc1
      Authored by Georgios Kokolatos
      This commit removes GPDB_93_MERGE_FIXME introduced while including
      46c508fb from upstream. The intention of the upstream commit is
      to keep planner params separated so that they don't get reused
      incorrectly. In doing so, it removed the need for a global list of
      PlannerParamItems.
      
      The removed assertion in this commit was verifying that each Param
      in the tree was included in a global list of PlannerParamItems, and
      that each datatype of each Param matches that in the global list.
      
      At the time of the assertion, we simply don't have the necessary
      information to be able to verify properly. An argument could be made
      for re-introducing such a global list of PlannerParamItems. However,
      this assertion would not verify that a parameter is anchored in the
      right place, and it would introduce additional code to maintain.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Remove fixme comment as it is not related to the merge itself · 9baf5269
      Authored by Georgios Kokolatos
      However, it correctly identifies that xidWarnLimit should not be
      configurable. The same should also apply to xidStopLimit. The GUCs
      for those were added in time immemorial, i.e. significantly before
      Greenplum was open-sourced, with a commit message clearly identifying
      their addition as a one-shot hotfix.
      
      A proposal for their deprecation has been made in the forum.
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
  3. 31 Jan, 2019 (9 commits)
    • Remove workarounds for SSL_get_current_compression. · cb5d1db9
      Authored by Heikki Linnakangas
      gpcloud uses OpenSSL's libcrypto, even if you ran configure
      --without-openssl. The #include <openssl/ssl.h> in gpcloud clashed with
      the #define in port.h. I suspect the "ssl.h" was a typo, and should've been
      "sha.h", because gpcloud only uses OpenSSL for the hash functions. Change
      it that way.
      
      It's a bit bogus that it builds with libcrypto, even if you specified no
      OpenSSL support in configure, but
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
    • docs - gpcheckcat: add test orphaned_toast_tables (#6737) · 492059c9
      Authored by Mel Kiyama
      * docs - gpcheckcat: add test orphaned_toast_tables
      
      * docs - update Note for gpcheckcat test orphaned_toast_tables.
      
      * docs - gpcheckcat orphaned TOAST table change. Added the term 'mismatch'
      to the note; 'mismatch' is the term used in the gpcheckcat output/logfile.
      
      * docs - gpcheckcat: clarified the note that only a TOAST table can be
      orphaned
    • 94b727f1
    • Docs - updating Gemfile.lock · 90d4ec18
      Authored by David Yozie
    • Remove gpfaultinjector · eaf7258b
      Authored by Jimmy Yih
      The gpfaultinjector utility has been replaced by the gp_inject_fault
      extension located in the gpcontrib directory.
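
      For reference, a sketch of the replacement in use, assuming the
      extension's three-argument form (the fault name and target are
      illustrative):

      ```sql
      CREATE EXTENSION gp_inject_fault;

      -- Skip the named fault point on the master (content = -1):
      SELECT gp_inject_fault('checkpoint', 'skip', dbid)
      FROM gp_segment_configuration
      WHERE role = 'p' AND content = -1;
      ```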
    • pg_basebackup and pg_rewind: exclude internal.auto.conf. · c238563e
      Authored by Ashwin Agrawal
      Previously, pg_rewind and pg_basebackup used different mechanisms to
      exclude this Greenplum-specific `internal.auto.conf` file, and neither
      mechanism existed in the backend code. This commit uses the new
      exclusion code we have now to easily exclude the `internal.auto.conf`
      file for both pg_basebackup and pg_rewind.
    • pg_basebackup and pg_rewind: exclude pg_log. · 6ec64226
      Authored by Ashwin Agrawal
      This commit makes pg_rewind exclude pg_log from all operations,
      meaning both comparing and copying from the current primary to the
      old primary.
      
      Also, it uses the new upstream style to exclude pg_log in
      pg_basebackup.
    • Make pg_rewind skip files and directories that are removed during server start. · 8d2595bd
      Authored by Fujii Masao
      The target cluster that was rewound needs to perform recovery from
      the checkpoint created at failover, which leads it to remove or recreate
      some files and directories that may have been copied from the source
      cluster. So pg_rewind can skip synchronizing such files and directories,
      which reduces the amount of data transferred during a rewind
      without changing the usefulness of the operation.
      
      Author: Michael Paquier
      Reviewed-by: Anastasia Lubennikova, Stephen Frost and me
      
      Discussion: https://postgr.es/m/20180205071022.GA17337@paquier.xyz
      (cherry picked from commit 266b6acb)
    • pg_basebackup: cherry-pick upstream code to exclude directories and files. · 9ac2cf9c
      Authored by Ashwin Agrawal
      This commit cherry-picks parts of upstream commit
      6ad8ac60 "Exclude additional
      directories in pg_basebackup".
      ------------
      Author: Peter Eisentraut <peter_e@gmx.net>
      Date:   Wed Sep 28 12:00:00 2016 -0400
      
          Exclude additional directories in pg_basebackup
      
          The list of files and directories that pg_basebackup excludes from the
          backup was somewhat incomplete and unorganized.  Change that with having
          the exclusion driven from tables.  Clean up some code around it.  Also
          document the exclusions in more detail so that users of pg_start_backup
          can make use of it as well.
      
          The contents of these directories are now excluded from the backup:
          pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans
      
          Also fix a bug that a pg_repl_slot or pg_stat_tmp being a symlink would
          cause a corrupt tar header to be created.  Now such symlinks are
          included in the backup as empty directories.  Bug found by Ashutosh
          Sharma <ashu.coek88@gmail.com>.
      
          From: David Steele <david@pgmasters.net>
          Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
          (cherry picked from commit 6ad8ac60)
      ------------
      
      Pieces relating to symlink handling already exist from our merge to
      9.4.20. This commit mainly brings in the code to drive the exclusions
      from tables, and helps keep the code the same as pg_rewind's.
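
      An abridged C sketch of the table-driven exclusion lists the upstream
      commit introduces (see basebackup.c for the real, fully commented
      lists):

      ```c
      /* Directories whose contents are excluded from the base backup. */
      static const char *const excludeDirContents[] =
      {
          "pg_dynshmem",
          "pg_notify",
          "pg_serial",
          "pg_snapshots",
          "pg_subtrans",
          NULL
      };

      /* Individual files excluded from the base backup. */
      static const char *const excludeFiles[] =
      {
          "postmaster.pid",
          "postmaster.opts",
          NULL
      };
      ```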
  4. 30 Jan, 2019 (13 commits)