- 16 Oct 2018, 5 commits
-
-
Committed by Abhijit Subramanya
-
Committed by Dhanashree Kashid
This patch enables support for optimizing SELECT and INSERT statements on replicated tables in ORCA. UPDATE and DELETE statements are not yet supported by this patch.
Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
-
Committed by mkiyama
-
Committed by Bhuvnesh Chaudhary
If an append-only, row-oriented table was empty, its relpages was estimated as 1 per segment, so the relpages value recorded in pg_class ended up being the total number of segments in the cluster, which is incorrect. For an empty table, relpages in pg_class should be 1. Adjust the calculation to behave consistently with other table types.
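The relpages fix described above can be sketched as follows. This is a hypothetical Python illustration, not GPDB's actual C code; the function name and the per-segment input representation are invented for clarity:

```python
# Hypothetical sketch of the relpages estimate for an append-only table.
# pages_per_segment stands in for the page counts gathered from each segment.
def estimate_ao_relpages(pages_per_segment):
    total = sum(pages_per_segment)
    if total == 0:
        # Empty table: pg_class.relpages should be 1, consistent with other
        # table types, not 1-per-segment summed to the cluster size.
        return 1
    return total
```

Before the fix, each empty segment contributed an estimate of 1, so an empty table on an N-segment cluster ended up with relpages = N.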
-
Committed by Jimmy Yih
This used to be handled by persistent tables. When persistent tables were removed, we forgot to add back the dropping of shared buffer cache entries as part of the DROP DATABASE operation. This commit adds it back, along with a test so we never forget again. Some relevant comments from Heikki/Jesse: the reason this issue was not seen is that there is a RequestCheckpoint() near the end of dropdb(), so before dropdb() actually removes the files for the database, all dirty buffers have already been flushed to disk. The buffer manager does not write clean buffers back to disk, so they simply age out of the buffer cache over time. One rare scenario in which the issue could have surfaced, had we not caught it, is the same database OID later being reused for a different database, which could cause false positives in future buffer cache lookups.
-
- 15 Oct 2018, 2 commits
-
-
Committed by Ning Yu
Now there is only the async dispatcher. The dispatcher API interface is kept so we can add new backends in the future. The GUC gp_connections_per_thread, which was used to switch between the async and threaded backends, is also retired.
-
Committed by Ning Yu

* Protect catalog changes on master. To allow gpexpand to do its job without restarting the cluster, we need to prevent concurrent catalog changes on master. A catalog lock is provided for this purpose: all inserts/updates/deletes to catalog tables must hold this lock in shared mode before making changes, while gpexpand can hold it in exclusive mode to 1) wait for in-progress catalog updates to commit/rollback and 2) prevent concurrent catalog updates. Add a UDF to hold the catalog lock in exclusive mode, and test cases for the catalog lock.

* Numsegments management. GpIdentity.numsegments is a global variable holding the cluster size in each process, and it is important for the cluster: a change of cluster size means an expansion has finished and the new segments have taken effect. FTS counts the number of primary segments from gp_segment_configuration and records it in shared memory; other processes (master, FTS, GDD, QD) then update GpIdentity.numsegments from this information. So far it is not easy to make old transactions run against new segments, so on the QD, GpIdentity.numsegments can only be updated at the beginning of a transaction; old transactions can only see the old segments. The QD dispatches GpIdentity.numsegments to the QEs, so they see the same cluster size. Catalog changes in old transactions are disallowed. Consider the workflow below:

  A: begin;
  B: gpexpand from N segments to M (M > N);
  A: create table t1 (c1 int);
  A: commit;
  C: drop table t1;

  Transaction A began when the cluster size was still N, so all of its commands are dispatched to only the N old segments, even after the cluster has been expanded to M segments. t1 is therefore created only on the N segments, both its data distribution and its catalog records, which causes the later DROP TABLE to fail on the new segments. To prevent this issue we currently disable all catalog updates in old transactions once the expansion is done. New transactions can update the catalog, as they already run on all M segments.

* Online gpexpand implementation. Do not restart the cluster during expansion:
  - Lock the catalog
  - Create the new segments from the master segment
  - Add the new segments to the cluster
  - Reload the cluster so new transactions and background processes see the new segments
  - Unlock the catalog
  Add test cases.

* New job to run ICW after expansion. It is better to run ICW after an online expansion to see if the cluster works well, so we add a new job that first creates a cluster with two segments, then expands to three segments and runs all the ICW cases. Restarting the cluster is forbidden, since we must be sure all the cases pass after an online expansion, so any case that restarts the cluster must be removed from this job. The new job is icw_planner_centos7_online_expand; it is the same as icw_planner_centos7 but takes two new parameters, EXCLUDE_TESTS and ONLINE_EXPAND. If ONLINE_EXPAND is set, the ICW shell takes a different branch: creating a cluster of size two, expanding, and so on. EXCLUDE_TESTS lists the cases that restart the cluster, or that would fail without a restart. After the whole run, the pid of the master is checked; if it has changed, the cluster must have been restarted, and the job fails. Consequently, any new test case that restarts the cluster must be added to EXCLUDE_TESTS.

* Add README.

* Small changes per review comments:
  - Delete a confusing test case
  - Rename updateBackendGpIdentityNumsegments to updateSystemProcessGpIdentityNumsegments
  - Fix some typos

* Remove useless asserts. These two asserts caused failures in isolation2/uao/compaction_utility_insert, which checks an AO table in utility mode. numsegments is meaningless in a sole segment, so the asserts must be removed.

Co-authored-by: Jialun Du <jdu@pivotal.io>
Co-authored-by: Ning Yu <nyu@pivotal.io>
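The catalog-lock protocol described above (shared mode for ordinary catalog writers, exclusive mode for gpexpand, which both waits out in-progress updates and blocks new ones) can be sketched with a minimal reader-writer lock. This is an illustrative Python model, not GPDB's actual implementation; all names here are invented:

```python
import threading

# Illustrative model of the gpexpand catalog lock: catalog writers take it
# shared; gpexpand takes it exclusive, which waits for in-progress catalog
# updates and blocks new ones.
class CatalogLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0          # number of shared (catalog-writer) holders
        self._exclusive = False   # held exclusively (by gpexpand)?

    def acquire_shared(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._shared += 1

    def release_shared(self):
        with self._cond:
            self._shared -= 1
            self._cond.notify_all()

    def try_acquire_exclusive(self):
        # Non-blocking variant for illustration: succeeds only when no
        # catalog update is currently in progress.
        with self._cond:
            if self._shared == 0 and not self._exclusive:
                self._exclusive = True
                return True
            return False

    def release_exclusive(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()
```

The key property mirrored from the commit message: exclusive acquisition cannot succeed while any shared holder (an in-progress catalog update) remains.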
-
- 14 Oct 2018, 1 commit
-
-
Committed by Heikki Linnakangas
In the query in checkPGClass, AFAICS the reltoastrelid and reltoastidxid fields were not used for anything, so just remove them from the query. (To be honest, the whole query seems pretty pointless to me.) In the checkOwners() query, I also removed the check for the toast index. We have a regression test that checks the owner is set correctly by ALTER TABLE commands, in the main test suite's 'alter_distribution_policy' test, so it seems unlikely for that particular problem to reappear. An index should always have the same owner as the table itself, so adding a test for that more general case might be useful, but I don't see much value in testing the toast index specifically. The pg_class.reltoastidxid field will be removed in PostgreSQL 9.4. Until then, there's nothing wrong with these queries as such, but I wanted to open a separate PR to get more eyeballs on the decisions.
-
- 13 Oct 2018, 8 commits
-
-
Committed by Ashwin Agrawal
The planner expect file is the same as the optimizer one. The optimizer file was actually stale, and after commit bd8fb75d changed the test to diff against the specific file only, this test started failing :-)
-
Committed by Ashwin Agrawal
In GPDB, platform_expectfile is used for determining the ORCA/planner/resgroup expect files, whereas in upstream that is not the case and it is based on the underlying platform. Thus it is unnecessary and confusing to compare against the default answer file even when a platform_expectfile exists. It gets confusing because the block below chooses the best expect file based on the number of lines in the diff file.
Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>
-
Committed by Joao Pereira
Starting with the 9.3 merge, the arguments for the global connection to master were being used to create the connections to all databases during pg_dump. This caused pg_dump to not retrieve data present in the segments, because we set utility mode on the global connection. We reverted the utility-mode change to align with upstream, as utility mode is not required to retrieve the information we need from master. Now that pg_dumpall no longer accidentally sets utility mode for itself, we have to set it explicitly during pg_upgrade.
Co-authored-by: Jacob Champion <pchampion@pivotal.io>
-
Committed by Heikki Linnakangas
The DISTRIBUTED BY columns don't need to be a left-subset of the UNIQUE index's columns; any subset will do. That has been true at least since e572af54, although the check here has been the same even longer, so I suspect we allowed that for ALTER TABLE even before that.
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
Committed by Heikki Linnakangas
When this test was ported over from TINC in commit 64f83aa8, I accidentally left it out of 'greenplum_schedule'. Oops. It has somewhat bitrotted since, due to changes in the error context messages. Fix the bitrot, and enable the test in the schedule.
-
Committed by Heikki Linnakangas
The test functions modify the 'bar' table. If it's run in multiple backends concurrently, they can deadlock. We've seen that causing failures in the pipeline a few times. To fix, have a separate copy of the 'foo' and 'bar' test tables for each test that uses these functions. Thanks to Ning Yu, Jialun Du and Melanie Plageman for hunting down the cause of the deadlock.
-
Committed by David Kimura
During binary upgrade of the segments, the distributed transaction log used at startup has just been copied as-is from the master data directory. This distributed log isn't actually maintained by the master, so it may not have all the pages we need, which leads to a stack trace on startup: Could not read from file "pg_distributedlog/[file]" at offset [num]. Since no distributed transactions exist during upgrade (there's no cluster to distribute to), and the distributed log is going to be blown away and replaced with the old segment's copy later during the upgrade, let's just make sure it is initialized to all zeroes for the pages between oldestActiveXid and nextXid.
Co-authored-by: Jacob Champion <pchampion@pivotal.io>
Co-authored-by: Jim Doty <jdoty@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
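The zero-fill described above can be sketched as follows. This is a hypothetical Python illustration; the page geometry constants and function name are invented, not GPDB's actual values, and a dict stands in for the pg_distributedlog files:

```python
# Illustrative sketch: initialize every distributed-log page covering the
# xid range [oldest_active_xid, next_xid] to all zeroes, so startup never
# tries to read a page that was never written.
ENTRIES_PER_PAGE = 2048  # invented for illustration, not the real constant
PAGE_SIZE = 8192

def zero_fill_pages(storage, oldest_active_xid, next_xid):
    """storage: dict mapping page number -> page bytes."""
    first_page = oldest_active_xid // ENTRIES_PER_PAGE
    last_page = next_xid // ENTRIES_PER_PAGE
    for page_no in range(first_page, last_page + 1):
        storage[page_no] = bytes(PAGE_SIZE)  # an all-zero page
    return storage
```

This is safe in the upgrade scenario precisely because no distributed transactions exist, so all-zero pages are the correct content.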
-
Committed by Jacob Champion
We were doing a linear search through the list of tables to find the correct OID. That's what findTableByOid() is for -- it does a binary search on the sorted list of tables -- so use it instead.
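The lookup strategy described above can be sketched with Python's bisect module. The function name mirrors pg_dump's findTableByOid(), but the table representation here is a simplified stand-in, not the actual TableInfo struct:

```python
from bisect import bisect_left

# Sketch of a binary search over a list of tables kept sorted by OID,
# replacing the linear scan the commit removed.
def find_table_by_oid(tables, oid):
    """tables: list of (oid, name) tuples sorted by oid; returns name or None."""
    i = bisect_left(tables, (oid,))  # (oid,) sorts before any (oid, name)
    if i < len(tables) and tables[i][0] == oid:
        return tables[i][1]
    return None
```

The win is the same as in the commit: O(log n) per lookup instead of O(n), which matters when the lookup runs once per table.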
-
- 12 Oct 2018, 11 commits
-
-
Committed by Heikki Linnakangas
RESUME_INTERRUPTS() contains this same check, so this is redundant. The message from RESUME_INTERRUPTS() is more generic, but a PANIC prints a backtrace in the log, and possibly a core dump, and all the extra information in these messages should be evident from the backtrace, too. Besides, this isn't a very common failure mode, I don't remember ever hitting this PANIC. These extra checks were added a very long time ago, maybe there was a specific problem that the developers were trying to hunt down back then, but they don't seem useful anymore. The upcoming 9.4 merge would further complicate these, or make them even less useful. In 9.4, LWLocks are no longer identified by a simple integer, so all we have here is a pointer to the LWLock struct, and parsing that into a human-readable message would require more effort. Or we could just print the pointer value, which is pretty useless.
-
Committed by Heikki Linnakangas
The 'fts_recovery_in_progress' test fails sometimes with a diff like this:

--- /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/fts_recovery_in_progress.out 2018-10-12 01:35:16.895897976 +0000
+++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/fts_recovery_in_progress.out 2018-10-12 01:35:16.899897976 +0000
@@ -37,7 +48,7 @@
 show gp_fts_probe_retries;
  gp_fts_probe_retries
 ----------------------
- 2
+ 5
 (1 row)

The test sets the 'gp_fts_probe_retries' GUC with gpconfig, changing it from 5 to 2, sends SIGHUP, and then issues the SHOW statement above to verify that the change took effect. But on a busy system, the SIGHUP and the config change might not reach the backend quickly enough. Add a 5 second delay to give it more time. On a very busy system even that might of course not be enough, but let's see how far we get with this before we start inventing a more complicated waiting mechanism.
-
Committed by Heikki Linnakangas
If optimizer_trace_fallback is enabled, report the ORCA exception's message to the user. It usually says something like "Feature not supported by the Pivotal Query Optimizer: Non-default collation", which is quite helpful. While we're at it, simplify setting the "had unexpected failure" flag, by putting it in SOptContext. That allows removing one layer of catching and re-throwing the exception.
-
Committed by Heikki Linnakangas
There are a few of these in the 'gporca' test. In the next commit, I'm about to improve "orca_trace_fallback" so that the reason for the fallback is printed to the user, so it'd be nice if the message was more human-readable, and also less likely to change from one PG version to another.
-
Committed by Heikki Linnakangas
After commit f8a78c25, I started seeing this compiler warning:

check.c: In function ‘check_new_cluster’:
check.c:206:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  if (user_opts.segment_mode == DISPATCHER)
  ^~
check.c:210:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘if’
  check_for_prepared_transactions(&new_cluster);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix the indentation to use tabs instead of spaces, to silence it. While we're at it, add an extra pair of braces for clarity; I think that's clearer, and Paul Guo also pointed it out in the original review of the PR. This is in line with David Krieger's wish to keep the code as close to upstream as possible.
-
Committed by BaiShaoqi
Reviewed by Paul, Heikki. Refactored by Heikki.
-
Committed by Zhenghua Lyu
This commit fixes GitHub issue 5976.
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
-
Committed by David Krieger
This is a pg_upgrade change to support Greenplum cluster upgrade (gpupgrade). pg_upgrade is subordinate to the overall scheme, which essentially forces the master's catalog tables onto the segments, so the check that the target cluster has only one user isn't appropriate for a Greenplum cluster segment.
-
Committed by Heikki Linnakangas
Function Scan materializes its result in a TupleStore and can be rescanned. Mark it as rescannable in the planner, so that we avoid putting a pointless Materialize node on top of it.
Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
-
Committed by Heikki Linnakangas
This is the same fix that was applied to DistributedLog_AdvanceOldestXmin() in commit b26d34cd. This one is harder to hit, because DistributedLog_GetOldestXmin() is only used by segment-local snapshots, i.e. only by utility-mode connections. I was able to reproduce it with this:
1. Add a little sleep() in GetOldestSnapshot(), before the call to DistributedLog_GetOldestXmin().
2. Launch pgbench in one terminal.
3. Open a utility-mode connection with psql in another terminal, and run some queries.
The assertion looked like this:
FATAL: Unexpected internal error (distributedlog.c:300)
DETAIL: FailedAssertion("!(TransactionIdFollowsOrEquals(oldestLocalXmin, result))", File: "distributedlog.c", Line: 300)
We bumped into this in the 9.4 merge because, with 9.4, SnapshotNow goes away and all catalog lookups use a regular MVCC snapshot instead. In the segments, the "regular" MVCC snapshot is a segment-local snapshot, however, so we started exercising this codepath a lot more and began seeing these assertion failures occasionally.
-
Committed by Francisco Guerrero
PXF is a public repository and no longer requires a private key to access it, so remove this property from the pipeline template.
Co-authored-by: Francisco Guerrero <aguerrero@pivotal.io>
Co-authored-by: Divya Bhargov <dbhargov@pivotal.io>
-
- 11 Oct 2018, 10 commits
-
-
Committed by Heikki Linnakangas
This reverts commit c605ca73. This depended on the fixes to shell-quoting the environment variables, in commit 7a9de421. I had to revert that, so this needs to be reverted as well.
-
Committed by Heikki Linnakangas
This reverts commit 7a9de421. It caused regression failures like this:

--- /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/gpcopy.out 2018-10-11 07:42:24.231194166 +0000
+++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/gpcopy.out 2018-10-11 07:42:24.287194166 +0000
@@ -1373,10 +1407,10 @@
 COPY foo FROM PROGRAM 'cat /tmp/gpcopyenvtest';
 COPY foo FROM PROGRAM 'echo database in COPY FROM: $GP_DATABASE';
 select * from foo;
-                        t
--------------------------------------------------
+                        t
+--------------------------------------------------
 data1
- database in COPY FROM: funny copy"db'withquotes
- database in COPY TO: funny copy"db'withquotes
+ database in COPY FROM: funny copy"db'with\quotes
+ database in COPY TO: funny copy"db'with\quotes
 (3 rows)

on the Concourse pipeline, on SLES and CentOS 6 and 7 systems. It looks like the shell-quoting rules are different on those systems than on Debian/Ubuntu. (More precisely, I guess the rules depend on the shell.) The output with the backslash actually looks more correct to me; I wasn't paying enough attention to the expected output when I accepted the version without it.
-
Committed by Heikki Linnakangas
That's effectively what we were choosing as the session user already, but findSuperuser() was an overly complicated way to do it. In addition to setting the session userid, also set the authenticated userid by calling InitializeSessionUserIdStandalone(), like most other aux processes do.
-
Committed by Ming Li
Signed-off-by: Tingfang Bao <bbao@pivotal.io>
-
Committed by Heikki Linnakangas
This showed up as a regression test failure on the 9.4 merge branch, although I forgot which test it was :-(. On master, I believe this is harmless, because none of the callers will look at the collation if it's a system column, but we should certainly fix this in any case.
-
Committed by Heikki Linnakangas
The external program executed with COPY PROGRAM or EXECUTE-type external table is passed a bunch of environment variables. They are passed by adding them to the command line of the program being executed, with "<var>=<value> && export VAR && ...". However, the quoting in the code that builds that command line was broken. Fix it, and add a test. Fixes github issue https://github.com/greenplum-db/gpdb/issues/5925
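The quoting fix described above can be sketched in Python using the standard library's shlex.quote, standing in for the C helper the commit actually touched. The function name is invented; the command-line prefix shape ("VAR=value && export VAR && ...") follows the commit message:

```python
import shlex

# Sketch: build the environment-export prefix for the external program's
# command line, quoting each value so spaces, quotes, and $ survive the
# shell intact.
def build_env_prefix(env):
    parts = []
    for var, value in env.items():
        parts.append("%s=%s && export %s" % (var, shlex.quote(value), var))
    return " && ".join(parts)
```

The bug class being fixed: without quoting, a database name like funny copy"db'withquotes breaks the generated command line or silently changes the exported value.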
-
Committed by Heikki Linnakangas
It previously didn't make any attempt at quoting and escaping database names. It's just a developer tool at the moment, but if it can't deal with funny characters, that gets in the way of testing other parts of the system with them.
-
Committed by Ning Yu
The query string is dispatched with each sub-command; when the query string and the number of sub-commands are both large, dispatching it repeatedly is a huge effort. One simple example:

create table t (c1 int, c2 int)
/* comment */ /* ... */ /* comment */
partition by range(c2) (start(0) end(1000) every(1));

This SQL contains 2001 sub-commands; if we put 10MB of comments into it, that is 10MB * 2001 = 20GB of data to dispatch, which hurts performance a lot. There are also real-world examples:

create table t (c1 int, c2 int, ..., c100 int)
with (appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
partition by range(c2) (
  start(0) end(1) every(1)
    with (tablename='p1', appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
    column c1 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
    ...
    column c100 encoding (compresslevel=5, compresstype=zlib, blocksize=32768),
  ...
  start(999) end(1000) every(1)
    with (tablename='p1000', appendonly=true, compresslevel=5, orientation=column, compresstype=zlib)
    column c1 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
    ...
    column c100 encoding (compresslevel=5, compresstype=zlib, blocksize=32768)
);

However, the dispatched query string is not always useful; it is mainly for debugging purposes when there is a valid query tree or plan tree. In such a case we can dispatch a truncated query string to improve performance. The truncation size is hard-coded to 1KB for now. Discussion on the mailing list: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/8tz4oivq3vM/cp12HKL-BAAJ
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
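The truncation logic described above can be sketched as follows. This is a hypothetical Python illustration; the function name is invented, while the 1KB limit comes from the commit message:

```python
# Sketch: truncate the query string before dispatching it to the QEs, but
# only when a valid query/plan tree is dispatched alongside it, i.e. when
# the string is informational (debugging) only.
QUERY_STRING_TRUNCATE_SIZE = 1024  # 1KB, as hard-coded in the commit

def truncate_query_string(query, has_plan_tree):
    if has_plan_tree and len(query) > QUERY_STRING_TRUNCATE_SIZE:
        return query[:QUERY_STRING_TRUNCATE_SIZE]
    return query
```

With the 20GB example above, each of the 2001 sub-commands would dispatch at most 1KB of query text instead of the full 10MB string.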
-
Committed by Shreedhar Hardikar
Also bump ORCA version to v3.2.0.
-
Committed by Heikki Linnakangas
Got the column name wrong. Broken in the 9.3 merge, I think.
-
- 10 Oct 2018, 3 commits
-
-
Committed by Daniel Gustafsson
The special handling for children of append nodes was removed in the 9.3 merge, but the variable was left behind. Fix by removing. Reviewed-by: NHeikki Linnakangas <hlinnakangas@pivotal.io>
-
Committed by Daniel Gustafsson
Align with upstream and use --quote-all-identifiers when dumping the schema, since we have merged the underlying functionality to do so from PostgreSQL.
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
Committed by Daniel Gustafsson
To make the Oid preassignment stable against quoted identifiers, grab the namespace name from the catalog every time rather than trying to dequote the passed name.
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
-