- 02 Feb 2019, 1 commit
-
Committed by David Yozie
* bump postgresql url reference to 9.4
* Remove source for install guide
* Revert "bump postgresql url reference to 9.4". This reverts commit ab3405ae380f2f5a08ca5305f51fd431f479eae3.
-
- 01 Feb 2019, 17 commits
-
Committed by Heikki Linnakangas
A randomly distributed table is now represented by an empty int2vector.
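A brief sketch of what this looks like from SQL; the table name is hypothetical, and the gp_distribution_policy column names (localoid, distkey) are assumed from the sibling catalog commit below:

    CREATE TABLE t_rand (a int) DISTRIBUTED RANDOMLY;
    -- After this change, the policy row for a randomly distributed table
    -- holds an empty int2vector as its distribution key rather than a NULL.
    SELECT localoid::regclass, distkey
      FROM gp_distribution_policy
     WHERE localoid = 't_rand'::regclass;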
-
Committed by Daniel Gustafsson
While Greenplum can plan a CTE query with multiple writable expressions, it cannot execute one, as there is a limitation to a single writer gang. Until we can support multiple writer gangs, error out with a graceful message rather than failing during execution with a more cryptic internal error. Ideally this will be reverted in GPDB 7.X, but right now it is much too close to release for attacking this. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
Committed by Daniel Gustafsson
The 9.4.20 merge mistakenly left a merge conflict in the alternative output for the xmlmap test. The fix was verified against a backend without XML support. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
Committed by Heikki Linnakangas
Replace the use of the built-in hashing support for built-in datatypes in cdbhash.c with the normal PostgreSQL hash functions. Now is a good time to do this, since we have already made the change to use jump consistent hashing in GPDB 6, so we need to deal with the upgrade problems associated with changing the hash functions anyway.

It is no longer enough to track which columns/expressions are used to distribute data; you also need to know the hash function used. For that, a new field is added to gp_distribution_policy, to record the hash operator class used for each distribution key column. In the planner, a new opfamily field is added to DistributionKey, to track that throughout planning.

Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the default hash operator class for the datatype is used. But this patch extends the syntax so that you can specify the operator class explicitly, like "... DISTRIBUTED BY (column opclass)". This is similar to how an operator class can be specified for each column in CREATE INDEX.

To support upgrade, the old hash functions have been converted to special (non-default) operator classes, named cdbhash_*_ops. For example, if you want to use the old hash function for an integer column, you could do "DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist of operators that have "compatible" cdbhash functions has been replaced by putting the compatible hash opclasses in the same operator family. For example, all legacy integer operator classes (cdbhash_int2_ops, cdbhash_int4_ops and cdbhash_int8_ops) are part of the cdbhash_integer_ops operator family.

This removes the pg_database.hashmethod field. The hash method is now tracked on a per-table and per-column basis, using the opclasses, so it is not needed anymore.

To help with upgrade from GPDB 5, this introduces a new GUC called 'gp_use_legacy_hashops'. If it is set, CREATE TABLE uses the legacy hash opclasses instead of the default ones when the opclass is not specified explicitly. pg_upgrade will set the new GUC to force the use of legacy hashops when restoring the schema dump. It will also set the GUC on all upgraded databases, as a per-database option, so any new tables created after upgrade will also use the legacy opclasses. It seems better to be consistent after upgrade, so that collocation between old and new tables works, for example. The idea is that some time after the upgrade, the admin can reorganize all tables to use the default opclasses instead; at that point, he should also clear the GUC on the converted databases. (Or rather, the automated tool that hasn't been written yet should do that.)

ORCA doesn't know about hash operator classes, or the possibility that we might need to use a different hash function for two columns with the same datatype. Therefore, it cannot produce correct plans for queries that mix different distribution hash opclasses for the same datatype in the same query. There are checks in the Query->DXL translation to detect that case and fall back to the planner. As long as you stick to the default opclasses in all tables, we let ORCA create the plan without any regard to them, and use the default opclasses when translating the DXL plan to a Plan tree. We also allow the case that all tables in the query use the "legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the two, or using any non-default opclasses, forces ORCA to fall back.

One curiosity with this is the "int2vector" and "aclitem" datatypes. They have a hash opclass but no b-tree operators. GPDB 4 used to allow them as DISTRIBUTED BY columns, but we forbade that in GPDB 5, in commit 56e7c16b. Now they are allowed again, so you can specify an int2vector or aclitem column in DISTRIBUTED BY, but it's still pretty useless, because the planner still can't form EquivalenceClasses on them, will treat them as "strewn" distribution, and won't co-locate joins.

The abstime, reltime and tinterval datatypes don't have default hash opclasses. They are being removed completely in PostgreSQL v12, and users shouldn't be using them in the first place, so instead of adding hash opclasses for them now, we accept that they can't be used as distribution key columns anymore. Add a check to pg_upgrade to refuse the upgrade if they are used as distribution keys in the old cluster. Do the same for the 'money' datatype as well, although that is not being removed upstream.

The legacy hashing code for anyarray in GPDB 5 was actually broken. It could produce a different hash value for two arrays that are considered equal according to the = operator, if there were differences in e.g. whether the null bitmap was stored or not. Add a check to pg_upgrade to reject the upgrade if array types were used as distribution keys. The upstream hash opclass for anyarray works, though, so it is OK to use arrays as distribution keys in new tables; we just don't support binary upgrading them from GPDB 5. (See github issue https://github.com/greenplum-db/gpdb/issues/5467.) The legacy hashing of 'anyrange' had the same problem, but that was new in GPDB 6, so we don't need a pg_upgrade check for that.

This also tightens the checks in ALTER TABLE ALTER COLUMN and CREATE UNIQUE INDEX, so that you can no longer create a situation where a non-hashable column becomes the distribution key. (Fixes github issue https://github.com/greenplum-db/gpdb/issues/6317)

Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
Co-authored-by: Chris Hajas <chajas@pivotal.io>
Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
Reviewed-by: Ning Yu <nyu@pivotal.io>
Reviewed-by: Simon Gao <sgao@pivotal.io>
Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
Reviewed-by: Yandong Yao <yyao@pivotal.io>
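A minimal SQL sketch of the syntax and GUC described above; the table, column, and database names are hypothetical, while the opclass and GUC names come from the commit message:

    -- Default: the datatype's default hash opclass is chosen implicitly.
    CREATE TABLE t_new (id int4, payload text) DISTRIBUTED BY (id);

    -- Explicit legacy opclass, e.g. to keep GPDB 5 hashing for one table.
    CREATE TABLE t_legacy (id int4, payload text) DISTRIBUTED BY (id cdbhash_int4_ops);

    -- What pg_upgrade effectively does per database, so tables created
    -- after the upgrade also default to the legacy hashing.
    ALTER DATABASE mydb SET gp_use_legacy_hashops = on;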
-
Committed by Heikki Linnakangas
This is in preparation for adding operator classes as a new column (distclass) to gp_distribution_policy. This naming is consistent with pg_index.indkey/indclass. Change the datatype to int2vector, also for consistency with pg_index and some other catalogs that store attribute numbers, and because int2vector is slightly more convenient to work with in the backend. Move the column to the end of the table, so that all the variable-length and nullable columns are at the end, which makes it possible to reference the other columns directly in Form_gp_policy. Add a backend function, pg_get_table_distributedby(), to deparse the DISTRIBUTED BY definition of a table into a string. This is similar to the pg_get_indexdef_columns(), pg_get_functiondef() etc. functions that we have. Use the new function in psql and pg_dump when connected to a GPDB6 server. Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Co-authored-by: Peifeng Qiu <pqiu@pivotal.io> Co-authored-by: Adam Lee <ali@pivotal.io>
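A hedged usage sketch of the new deparse function; the exact argument type is an assumption (a regclass/oid, by analogy with pg_get_indexdef), and the table name is hypothetical:

    -- Deparse a table's DISTRIBUTED BY clause back into SQL text,
    -- e.g. for use by psql's \d and by pg_dump.
    SELECT pg_get_table_distributedby('t_legacy'::regclass);
    -- expected shape of the result: DISTRIBUTED BY (id cdbhash_int4_ops)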
-
Committed by Pengzhou Tang
The GDD test framework now acquires the desired lock by updating the nth tuple in a segment instead of a tuple with a specific value, so even if the hash algorithm changes, the tests are not affected. This method works fine except when a segment does not have enough tuples to provide the nth tuple. The fix is simple: enlarge the test tables from 20 rows to 100 rows. Authored-by: Ning Yu <nyu@pivotal.io>
-
Committed by Zhang Shujie
If the Global Deadlock Detector is enabled, the table lock may be downgraded to RowExclusiveLock, which can lead to two problems: 1. When updating distribution keys concurrently, the SplitUpdate node would generate extra tuples in the table. 2. When updating concurrently, the EvalPlanQual function may be triggered, and when the subplan has a Motion node it cannot execute correctly. We now add a GUC for GDD: if it is disabled, these UPDATE statements execute serially; if it is enabled, we raise an error when updating concurrently (sketched below). Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
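A hedged sketch of the behavior described above, assuming the GDD GUC is gp_enable_global_deadlock_detector (the commit message does not name it) and a hypothetical table:

    SHOW gp_enable_global_deadlock_detector;

    -- session 1
    BEGIN;
    UPDATE t_dist SET distkey = distkey + 1 WHERE id = 1;

    -- session 2, concurrently: with the GUC off the UPDATE waits for
    -- session 1 (serial execution); with it on, a concurrent update of
    -- the distribution key raises an error instead.
    UPDATE t_dist SET distkey = distkey + 1 WHERE id = 2;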
-
Committed by Peifeng Qiu
We removed the submodule address but didn't remove the actual folder. Submodule clones will fail due to the missing url link. Remove the folder to avoid that.
-
Committed by Jialun
The function VmemTracker_ShmemInit initializes chunkSizeInBits, the unit of chunk size, according to gp_vmem_protect_limit. The base value of chunkSizeInBits is 20 (1MB); if gp_vmem_protect_limit is larger than 16GB, it is increased to adapt to large-memory environments. This value should not change after initialization, but if the function is called multiple times, chunkSizeInBits accumulates. Consider the scenario where the QD crashes: the postmaster reaps the QD process and resets shared memory, which causes VmemTracker_ShmemInit to be called again. So chunkSizeInBits increases after every crash when gp_vmem_protect_limit is larger than 16GB. Eventually the chunk size becomes so large that a newly reserved chunk is always zero or a very small value, so the memory limit mechanism takes no effect and an out-of-memory failure occurs when new memory cannot actually be allocated. So we set chunkSizeInBits to BITS_IN_MB in VmemTracker_ShmemInit every time, instead of only asserting it. Why is there no new test case in this commit?
- We just change an Assert to an assignment; there are no logic changes.
- It is very difficult to add a crash case in the current isolation test framework, because the connection is lost due to the crash. We verified the case manually in our dev environment by setting gp_vmem_protect_limit to 65535 and running kill -9 on the QD process: chunkSizeInBits increased every time, and at last we got the error message "ERROR: Canceling query because of high VMEM usage."
-
Committed by Peifeng Qiu
We no longer use the ext submodule for gpfdist dependencies; remove it to avoid confusion. The WIN32 build process is changed to a native build. We will add a README when it's ready.
-
Committed by Paul Guo
The recoveryTargetIsLatest setting code was somehow missing and was later added back in commit 55808e18. Remove the FIXME comment. Reviewed-by: Jimmy Yih <jyih@pivotal.io> Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
Committed by Ashwin Agrawal
Commit 6d80ce31 moved the control file state update earlier, which caused a CI failure in the gpactivatestandby test: the catalog update was missed because needToPromoteCatalog remained false. Hence, set needToPromoteCatalog before setting ControlFile->state.
-
Committed by Ashwin Agrawal
Not sure why we had FinishPreparedTransaction() calling readRecoveryCommandFile(); it seems to serve no purpose. It appears to date back ages, and I wasn't able to find the rationale for it, certainly not with the current code. It is an unnecessary performance hit to read and parse the file on every commit.
-
Committed by Ashwin Agrawal
A lot of differences compared to upstream have accumulated over the years, including some confusing or redundant code, so it is better to make this match upstream.
-
Committed by Amil Khanzada
- We're not sure when this file became abandoned, but it doesn't seem to be used anywhere.
- Also remove the task file and bash scripts that were only referenced by this pipeline.
Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io> Co-authored-by: Amil Khanzada <akhanzada@pivotal.io>
-
Committed by Georgios Kokolatos
This commit removes a GPDB_93_MERGE_FIXME introduced while including 46c508fb from upstream. The intention of the upstream commit is to keep planner params separated so that they don't get reused incorrectly; in doing so, it removed the need for a global list of PlannerParamItems. The assertion removed in this commit verified that each Param in the tree was included in a global list of PlannerParamItems, and that the datatype of each Param matched that in the global list. At the point of the assertion, we simply don't have the information necessary to verify this properly. An argument could be made for re-introducing such a global list of PlannerParamItems; however, the assertion would not verify that a parameter is anchored in the right place, and it would introduce additional code to maintain. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
Committed by Georgios Kokolatos
However, it correctly identifies that xidWarnLimit should not be configurable; the same also applies to xidStopLimit. The GUCs for those were added in time immemorial, i.e. significantly before Greenplum was open-sourced, with a commit message clearly identifying their addition as a one-shot hotfix. A proposal for their deprecation has been made in the forum. Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io> Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
-
- 31 Jan 2019, 9 commits
-
Committed by Heikki Linnakangas
gpcloud uses OpenSSL's libcrypto even if you ran configure --without-openssl. The #include <openssl/ssl.h> in gpcloud clashed with a #define in port.h. I suspect the "ssl.h" was a typo and should have been "sha.h", because gpcloud only uses OpenSSL for the hash functions; change it that way. It's a bit bogus that it builds with libcrypto even if you specified no OpenSSL support in configure, but that is not addressed here. Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
-
Committed by Mel Kiyama
* docs - gpcheckcat: add test orphaned_toast_tables
* docs - update Note for gpcheckcat test orphaned_toast_table.
* docs - gpcheckcat orphaned TOAST table change. Added the term "mismatch" to the note; mismatch is the term used in the gpcheckcat output/logfile.
* docs - gpcheckcat: clarified the note on how a TOAST table can be orphaned.
-
Committed by Karen Huddleston
Authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by David Yozie
-
Committed by Jimmy Yih
The gpfaultinjector utility has been replaced with the gp_inject_fault extension located in the gpcontrib directory.
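A hedged usage sketch of the replacement extension; the fault name and exact argument list are assumptions and may differ across GPDB versions:

    -- Install the extension, then inject and later reset a named fault
    -- on the content-0 primary segment.
    CREATE EXTENSION IF NOT EXISTS gp_inject_fault;
    SELECT gp_inject_fault('checkpoint', 'skip', dbid)
      FROM gp_segment_configuration WHERE role = 'p' AND content = 0;
    SELECT gp_inject_fault('checkpoint', 'reset', dbid)
      FROM gp_segment_configuration WHERE role = 'p' AND content = 0;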
-
Committed by Ashwin Agrawal
Previously, pg_rewind and pg_basebackup used different mechanisms to exclude the Greenplum-specific `internal.auto.conf` file, and neither mechanism lived in the backend code. This commit uses the new exclusion code we now have to exclude the `internal.auto.conf` file for both pg_basebackup and pg_rewind.
-
Committed by Ashwin Agrawal
This commit makes pg_rewind exclude pg_log from any operation, i.e. from comparing and copying from the current primary to the old primary. It also uses the new upstream style to exclude pg_log in pg_basebackup.
-
Committed by Fujii Masao
The target cluster that was rewound needs to perform recovery from the checkpoint created at failover, which leads it to remove or recreate some files and directories that may have been copied from the source cluster. So pg_rewind can skip synchronizing such files and directories, which reduces the amount of data transferred during a rewind without changing the usefulness of the operation. Author: Michael Paquier Reviewed-by: Anastasia Lubennikova, Stephen Frost and me Discussion: https://postgr.es/m/20180205071022.GA17337@paquier.xyz (cherry picked from commit 266b6acb)
-
Committed by Ashwin Agrawal
This commit cherry-picks parts of upstream commit 6ad8ac60, "Exclude additional directories in pg_basebackup".
------------
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Wed Sep 28 12:00:00 2016 -0400

Exclude additional directories in pg_basebackup

The list of files and directories that pg_basebackup excludes from the backup was somewhat incomplete and unorganized. Change that by having the exclusion driven from tables. Clean up some code around it. Also document the exclusions in more detail so that users of pg_start_backup can make use of it as well.

The contents of these directories are now excluded from the backup: pg_dynshmem, pg_notify, pg_serial, pg_snapshots, pg_subtrans.

Also fix a bug where a pg_repl_slot or pg_stat_tmp being a symlink would cause a corrupt tar header to be created. Now such symlinks are included in the backup as empty directories. Bug found by Ashutosh Sharma <ashu.coek88@gmail.com>.

From: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
(cherry picked from commit 6ad8ac60)
------------
The pieces relating to symlink handling already exist from our merge to 9.4.20. This commit mainly brings in the code to have the exclusion driven from tables, and helps make the code the same as pg_rewind's.
-
- 30 Jan 2019, 13 commits
-
Committed by Karen Huddleston
This reverts commit b1608404.
-
Committed by Karen Huddleston
This reverts commit 577b09ce.
-
Committed by Karen Huddleston
This reverts commit 05611ba6.
-
Committed by Richard Guo
For a LASJ (left anti semi join), the result is supposed to be empty if there are NULLs on the inner side. To check for NULLness, the join clauses are split into outer and inner argument values so that we can evaluate those subexpressions separately. This patch adds verification, when doing the extraction, that the join clauses are of the form 'foo = ANY bar' and that the equality operation is strict. This fixes issue #6389, in which the equality operator is implemented by a function; in that case the argument list has length one, so when the code tried to extract the second argument it dereferenced an invalid pointer and segfaulted. Reviewed-by: Ekta Khanna <ekhanna@pivotal.io> Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
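A small illustration of the join shape involved, with hypothetical tables; a NOT IN subquery typically plans as a LASJ, and a NULL on the inner side makes the result empty:

    CREATE TABLE outer_t (a int);
    CREATE TABLE inner_t (b int);
    INSERT INTO outer_t VALUES (1), (2);
    INSERT INTO inner_t VALUES (1), (NULL);
    -- Plans as a left anti semi join; the NULL in inner_t.b means
    -- "a NOT IN (...)" is never true, so no rows are returned.
    SELECT * FROM outer_t WHERE a NOT IN (SELECT b FROM inner_t);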
-
Committed by Karen Huddleston
We are no longer consuming Python from Ivy. We now build it ourselves against the version of OpenSSL provided by the OS. Co-authored-by: Karen Huddleston <khuddleston@pivotal.io> Co-authored-by: Ben Christel <bchristel@pivotal.io> Co-authored-by: David Sharp <dsharp@pivotal.io>
-
Committed by David Sharp
This commit updates the extensions tarball to exclude krb5, and drops other dependencies from ivy.xml as well; we have shifted these dependencies into our build images. Removed libs:
- krb5
- openssl
- curl
- python
- openldap
These all have to be removed together because we cannot easily link against multiple versions of the same library, and the SOVERSION of the OpenSSL installed on centos7 differs from the one fetched via ivy. As of this commit we no longer package libldap .so files with GPDB, on CentOS only. Soon we will make the same change for SLES and Ubuntu, and the conditional for Linux_LOADERS_LIBS will no longer be necessary. Co-authored-by: David Sharp <dsharp@pivotal.io> Co-authored-by: Ben Christel <bchristel@pivotal.io> Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io> Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Bradford D. Boyle
This drops unused libraries and programs from the extensions tarball. Unused dependencies that were dropped:
- clapack
- gimli
- json-c
- net-snmp
- pcre
Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io> Co-authored-by: Ben Christel <bchristel@pivotal.io> Co-authored-by: Karen Huddleston <khuddleston@pivotal.io>
-
Committed by Jacob Champion
These constants, all related to file replication in some way, no longer have any uses in the codebase. Co-authored-by: Shoaib Lari <slari@pivotal.io>
-
Committed by Jacob Champion
Replace both concepts with MODE_NOT_SYNC; WALrep doesn't make any further distinction.
-
Committed by Jacob Champion
Most of these codes were set only by GpSegStart.__convertSegments, which has been unused since ce4d96b6; remove that function as well. The final client of SEGSTART_ERROR_MIRRORING_FAILURE, gpstart, has been simplified: the concept of a "mirroring failure" has not been supported since the removal of filerep.
-
Committed by Jacob Champion
- happy path
- mirrors are marked down
- mirrors are dead but marked up
Co-authored-by: Mark Sliva <msliva@pivotal.io>
-
Committed by Shoaib Lari
We no longer have a filerep-based API for retrieving a mirror's version. Instead, have the postmaster append a known string (POSTMASTER_MIRROR_VERSION_DETAIL_MSG) to the detail message when a client attempts to connect to a mirror. gpstate then looks for this string to determine a mirror's version. (This is similar to the current practice of returning the replication state in the detail message.) Co-authored-by: Mark Sliva <msliva@pivotal.io> Co-authored-by: Jacob Champion <pchampion@pivotal.io>
-
Committed by Jacob Champion
These steps relied on has_process_eventually_stopped() to tell them when the process with a given PID had finally exited. Unfortunately, that function doesn't take a PID; it takes a process name and passes it to pgrep. Replace the implementation here with one that supports PIDs. (And bring the default timeout down from 2 minutes; there's no reason a process kill should take that long.)
-