- 09 Jan 2020, 1 commit
-
-
Committed by dyozie
-
- 08 Jan 2020, 3 commits
-
-
Committed by Paul Guo
We previously had code on the QE to terminate the connection when needed, to avoid potential data inconsistency. This is gpdb-specific, since the upstream code does not seem friendly to failover + data consistency. However, that introduced various abort or assertion failures, because some shmem exit callback functions are apparently not friendly to the current transaction state. Some stack examples are below. Originally I fixed them in those callback functions, but I found on gpdb6 that after I fixed one, another (in another callback function) came out; that is why I could collect so many gpdb 6 stacks below. I collected just one gpdb master stack, but there should be more there too if we kept fixing those callbacks one by one. Finally I decided to fix this by delaying the ereport(FATAL) in exec_mpp_dtx_protocol_command() instead, and let the QD retry 2PC to ensure data consistency. Note that 1PC retry is currently not implemented; that should come in another PR.

gpdb master (7) stack:
```
2  0x0000000000b48ddc in ExceptionalCondition (conditionName=0xe527e8 "!(ShmemAddrIsValid(nextElem))", errorType=0xe527bd "FailedAssertion", fileName=0xe527b2 "shmqueue.c", lineNumber=74) at assert.c:66
3  0x0000000000996311 in SHMQueueDelete (queue=0x7ff5e6676da8) at shmqueue.c:74
4  0x00000000009689de in SyncRepCleanupAtProcExit () at syncrep.c:436
5  0x00000000009a7b49 in ProcKill (code=1, arg=0) at proc.c:949
6  0x000000000098c001 in shmem_exit (code=1) at ipc.c:288
7  0x000000000098be5f in proc_exit_prepare (code=1) at ipc.c:212
8  0x000000000098bd64 in proc_exit (code=1) at ipc.c:104
9  0x0000000000b4a7d4 in errfinish (dummy=0) at elog.c:738
10 0x000000000096860e in SyncRepWaitForLSN (lsn=210148624, commit=1 '\001') at syncrep.c:303
11 0x000000000055c082 in RecordTransactionCommitPrepared (xid=638, gid=0x2c603ad "1575462785-0000000012", nchildren=0, children=0x2c6d2d0, nrels=0, rels=0x2c6d2d0, ndeldbs=0, deldbs=0x2c6d2d0, ninvalmsgs=0, invalmsgs=0x2c6d2d0, initfileinval=0 '\000') at twophase.c:2283
12 0x000000000055aae3 in FinishPreparedTransaction (gid=0x2c603ad "1575462785-0000000012", isCommit=1 '\001', raiseErrorIfNotFound=0 '\000') at twophase.c:1493
13 0x0000000000c4e4fe in performDtxProtocolCommitPrepared (gid=0x2c603ad "1575462785-0000000012", raiseErrorIfNotFound=0 '\000') at cdbtm.c:2037
14 0x0000000000c4e9d5 in performDtxProtocolCommand (dtxProtocolCommand=DTX_PROTOCOL_COMMAND_RECOVERY_COMMIT_PREPARED, gid=0x2c603ad "1575462785-0000000012", contextInfo=0x1220f20) at cdbtm.c:2215
```

gpdb 6 stacks:
```
2  0x0000000000ad9ea5 in ExceptionalCondition (conditionName=0xdbfddb "!(MyProc->syncRepState == 0)", errorType=0xdbfd28 "FailedAssertion", fileName=0xdbfcd0 "syncrep.c", lineNumber=130) at assert.c:66
3  0x000000000091ce81 in SyncRepWaitForLSN (XactCommitLSN=3400317528) at syncrep.c:130
4  0x000000000053991a in RecordTransactionCommit () at xact.c:1663
5  0x000000000053b0b2 in CommitTransaction () at xact.c:2756
6  0x000000000053c024 in CommitTransactionCommand () at xact.c:3646
7  0x00000000005c6c25 in RemoveTempRelationsCallback (code=1, arg=0) at namespace.c:4107
8  0x000000000093c353 in shmem_exit (code=1) at ipc.c:257
9  0x000000000093c248 in proc_exit_prepare (code=1) at ipc.c:214
10 0x000000000093c146 in proc_exit (code=1) at ipc.c:104
11 0x0000000000adb93d in errfinish (dummy=0) at elog.c:754
12 0x000000000091d2ef in SyncRepWaitForLSN (XactCommitLSN=3400294096) at syncrep.c:284
13 0x0000000000549d8e in EndPrepare (gxact=0x7f8a7d5fa0e0) at twophase.c:1241

3  0x0000000000ade6d1 in elog_finish (elevel=22, fmt=0xc3a898 "cannot abort transaction %u, it was already committed") at elog.c:1735
4  0x0000000000539d22 in RecordTransactionAbort (isSubXact=0 '\000') at xact.c:1923
5  0x000000000053b95c in AbortTransaction () at xact.c:3340
6  0x000000000053e0a7 in AbortOutOfAnyTransaction () at xact.c:5248
7  0x00000000005c68b9 in RemoveTempRelationsCallback (code=1, arg=0) at namespace.c:4088
8  0x000000000093c371 in shmem_exit (code=1) at ipc.c:257
9  0x000000000093c266 in proc_exit_prepare (code=1) at ipc.c:214
10 0x000000000093c164 in proc_exit (code=1) at ipc.c:104
11 0x0000000000adb94e in errfinish (dummy=0) at elog.c:754
12 0x000000000091d30d in SyncRepWaitForLSN (XactCommitLSN=19529538376) at syncrep.c:284
13 0x000000000053985a in RecordTransactionCommit () at xact.c:1663

2  0x0000000000adb9a9 in ExceptionalCondition (conditionName=0xdb2560 "!(entry->trans == ((void *)0))", errorType=0xdb2550 "FailedAssertion", fileName=0xdb216a "pgstat.c", lineNumber=842) at assert.c:66
3  0x00000000008d3391 in pgstat_report_stat (force=1 '\001') at pgstat.c:842
4  0x00000000008d65e8 in pgstat_beshutdown_hook (code=1, arg=0) at pgstat.c:2685
5  0x000000000093deba in shmem_exit (code=1) at ipc.c:290
6  0x000000000093dd1a in proc_exit_prepare (code=1) at ipc.c:214
7  0x000000000093dc18 in proc_exit (code=1) at ipc.c:104
8  0x0000000000add441 in errfinish (dummy=0) at elog.c:750
9  0x000000000091ee5c in SyncRepWaitForLSN (XactCommitLSN=225227432) at syncrep.c:333
10 0x0000000000549dd8 in EndPrepare (gxact=0x7f02508680e0) at twophase.c:1241

2  0x0000000000adb9c2 in ExceptionalCondition (conditionName=0xdcb458 "!(!((allPgXact[proc->pgprocno].xid) != ((TransactionId) 0)))", errorType=0xdcb408 "FailedAssertion", fileName=0xdcb3d9 "procarray.c", lineNumber=369) at assert.c:66
3  0x000000000093f614 in ProcArrayRemove (proc=0x7f4f1f5a05d0, latestXid=0) at procarray.c:369
4  0x00000000009586ec in RemoveProcFromArray (code=1, arg=0) at proc.c:904
5  0x000000000093ded3 in shmem_exit (code=1) at ipc.c:290
6  0x000000000093dd33 in proc_exit_prepare (code=1) at ipc.c:214
7  0x000000000093dc31 in proc_exit (code=1) at ipc.c:104
8  0x0000000000add45a in errfinish (dummy=0) at elog.c:750
9  0x000000000091ee75 in SyncRepWaitForLSN (XactCommitLSN=348629504) at syncrep.c:333
10 0x0000000000549dd8 in EndPrepare (gxact=0x7f4f1fa8cce0) at twophase.c:1241
11 0x000000000053b621 in PrepareTransaction () at xact.c:3115
```

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Asim R P <apraveen@pivotal.io>
Cherry-picked from 7b761730
-
Committed by Abhijit Subramanya
The modify_table_data_corrupt test failed due to a difference in the ORCA version string, so ignore it by adding the pattern to the isolation2 init_file.
-
Committed by Heikki Linnakangas
The code in EXPLAIN that displays the gang information of a node was correctly prepared to handle the case where a node is missing flow information, by looking at the child plan's flow instead. However, it's possible to have two such nodes on top of each other; we need to look at the grandchild's flow in that case. This only occurs with JSON/YAML format EXPLAIN, because in text format we only print the gang information on Motion nodes. Fixes https://github.com/greenplum-db/gpdb/issues/9359. Fix on 6X_STABLE only; 5X_STABLE didn't have JSON/YAML format output, and this works on master. I'm not entirely sure why this works on master, but I'm not going to spend time figuring that out right now, because I'm just about to refactor this so that it's not based on the Flow nodes at all (https://github.com/greenplum-db/gpdb/pull/9093). Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
-
- 07 Jan 2020, 3 commits
-
-
Committed by Abhijit Subramanya
-
Committed by Jesse Zhang
This commit fixes the new Clang 10 warnings around misleading indentation, in the same vein as commit b93a631f.
-
Committed by Paul Guo
Saw the below test failure (e.g. in test starve_case) a few times. It seems to be caused by test misc, which was run in parallel with other tests (including starve_case). It creates a table and an index in utility mode, which could easily introduce an oid conflict. Moving that test out of the parallel running group fixes the failures.
```
 create table starve (c int);
 CREATE
 create table starve_helper (name varchar, sessionid int);
-CREATE
+DETAIL: Key (oid)=(131128) already exists.
+ERROR: duplicate key value violates unique constraint "pg_type_oid_index"
```
(cherry picked from commit 2de2f3f7)
-
- 03 Jan 2020, 9 commits
-
-
Committed by Zhenghua Lyu
If a subquery's locus is general, we should keep it general here. And a general locus's numsegments should be the cluster size.
-
Committed by Zhenghua Lyu
A RecursiveUnion plannode contains two non-empty subplan trees, so the plannode's flow and locus should take both trees into consideration. Besides, there must be no Motion nodes between the RecursiveUnion plannode and the WorkTableScan plannode, because the execution of the WorkTableScan depends on the RecursiveUnion's data structure. We always use the cteplan's locus as the WTS's locus. Remember that a WTS path cannot be turned into replicated (meaning broadcast) when dealing with joins. In most cases that is OK, but a replicated table whose locus is CdbLocusType_SegmentGeneral cannot be taken as "everywhere"; we will gather it to SingleQE or redistribute it when joining. To avoid such cases, if the cteplan's locus is CdbLocusType_SegmentGeneral, we build the WTS path using SingleQE, and later in the function `set_recursive_union_flow` add a gather on top of the cteplan.
-
Committed by Ning Yu
A temp table's schema name is pg_temp_<session_id> in normal mode; in utility mode the name is pg_temp_<backend_id>. If the normal-mode session id happens to equal the utility-mode backend id, the two will conflict with each other and cause catalog corruption on the segment. To fix this we changed the name to pg_temp_0<backend_id> in utility mode; this still matches the pattern "pg_temp_[0-9]+", which is expected for temp schema names. (cherry picked from commit 9bde1b01)
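The naming trick above can be sketched as a toy model (Python for illustration only; the helper name `temp_schema_name` is made up here, not GPDB code):

```python
import re

def temp_schema_name(id_, utility_mode):
    """Old scheme: both modes produced pg_temp_<n>, so a normal-mode
    session id equal to a utility-mode backend id collided.
    New scheme: prefix the utility-mode name with an extra '0'."""
    return f"pg_temp_0{id_}" if utility_mode else f"pg_temp_{id_}"

# With the fix, session id 7 (normal mode) and backend id 7 (utility
# mode) no longer map to the same schema name...
normal = temp_schema_name(7, utility_mode=False)    # 'pg_temp_7'
utility = temp_schema_name(7, utility_mode=True)    # 'pg_temp_07'
assert normal != utility

# ...while both names still match the expected temp-schema pattern.
pattern = re.compile(r"^pg_temp_[0-9]+$")
assert pattern.match(normal) and pattern.match(utility)
```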
-
Committed by Huiliang.liu
Cherry-picked from gpdb master. gpload will run in GPDB 6 compatibility mode if importing gpVersion fails.
-
Committed by Ning Yu
In AddInvalidationMessage() a new chunk always has double the size of the last chunk, but once the size exceeds 1GB, the max allowed alloc size, an error like "invalid memory alloc request size 1342177300" will be thrown. Fixed by limiting the chunk size. (cherry picked from commit a268b387530ba4007d97a9c72e402546f48ce9bc) (cherry picked from commit 8fdf36d7)
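The capped-growth idea can be sketched as follows (a Python toy model, not the actual C code; the cap value mirrors PostgreSQL's MaxAllocSize of just under 1GB, and the helper name is an assumption):

```python
MAX_ALLOC_SIZE = 0x3fffffff  # just under 1GB, PostgreSQL's MaxAllocSize

def next_chunk_size(last_size):
    """Double the previous chunk size, but clamp at the maximum
    allocation size so the allocator never sees an oversized request."""
    return min(last_size * 2, MAX_ALLOC_SIZE)

# Uncapped doubling would eventually exceed the 1GB limit and error
# out; the clamped sequence converges to the cap instead.
size = 4096
for _ in range(30):
    size = next_chunk_size(size)
assert size == MAX_ALLOC_SIZE
```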
-
Committed by Zhenghua Lyu
Commit d95f351a did not add a splitupdate case. This commit adds such test cases.
-
Committed by Huiliang.liu
GPload: change the metadata query SQL to improve performance. The old query could take a long time if the catalog is large.
-
Committed by Ashwin Agrawal
The alter_db_set_tablespace test has scenarios that inject an error fault for content 0 and then run the ALTER DATABASE SET TABLESPACE command. Once the error is hit on content 0, the transaction is aborted. Depending on when the transaction gets aborted, it is unpredictable how far the command has progressed on the non-content-0 primaries. Only if the non-content-0 primaries have reached the point of directory copy will their abort record carry a database directory deletion record to be replayed on the mirror. The test was waiting for the directory deletion fault to be triggered on all the content mirrors. This expectation is incorrect and makes the test flaky depending on timing. Hence, modify the error scenarios to only wait for directory deletion for content 0, then wait for all the mirrors to replay all the currently generated wal records, and after that make sure the destination directory is empty. This should eliminate the flakiness from the test. Reviewed-by: Asim R P <apraveen@pivotal.io>
-
Committed by Ashwin Agrawal
gpdeletesystem uses GpDirsExist() to check whether dump directories are present, so it can warn and avoid deleting the cluster; only with the "-f" option is it allowed to delete a cluster that has dump directories present. However, this function incorrectly checks for both files and directories named "*dump*", not just directories. So gpdeletesystem started failing after commit eb036ac1: FTS writes a file named `gpsegconfig_dump`, and GpDirsExist() incorrectly reports it as a backup directory being present and fails. Fix this by only checking for directories, not files. Fixes https://github.com/greenplum-db/gpdb/issues/8442 Reviewed-by: Asim R P <apraveen@pivotal.io>
-
- 02 Jan 2020, 1 commit
-
-
Committed by (Jerome)Junfeng Yang
The error code should not be set twice with different codes in `errfinish_and_return`, since the evaluation order of a function's parameters is compiler-dependent. If the final error code is ERRCODE_INTERNAL_ERROR, the file name and line number are printed out:
```
ERROR: Error on receive from SEG IP:PORT pid=PID: *** (cdbdispatchresult.c:487)
```
It's strange to print the file name and line number here. So, remove ERRCODE_INTERNAL_ERROR and only keep ERRCODE_GP_INTERCONNECTION_ERROR, which was also the error code before commit 143bb7c6. Reviewed-by: Paul Guo <paulguo@gmail.com> (cherry picked from commit 55d6415b)
-
- 01 Jan 2020, 1 commit
-
-
Committed by Ashwin Agrawal
The test should make sure the mirror has processed the drop database wal record before proceeding to check that the destination tablespace directory does not exist. It skipped the wait for content 0 in case of a panic after writing the wal record, which is incorrect. Add a wait for all the mirrors to process the wal record and only then perform the validation. This should fix the failures seen in CI with the below diff:
```
--- /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/alter_db_set_tablespace.out 2019-10-14 16:09:43.638372174 +0000
+++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/alter_db_set_tablespace.out 2019-10-14 16:09:43.714379108 +0000
@@ -1262,25 +1271,352 @@
 CONTEXT: PL/Python function "stat_db_objects"
 NOTICE: dboid dir for database alter_db does not exist on dbid = 4
 CONTEXT: PL/Python function "stat_db_objects"
-NOTICE: dboid dir for database alter_db does not exist on dbid = 5
-CONTEXT: PL/Python function "stat_db_objects"
 NOTICE: dboid dir for database alter_db does not exist on dbid = 6
 CONTEXT: PL/Python function "stat_db_objects"
 NOTICE: dboid dir for database alter_db does not exist on dbid = 7
 CONTEXT: PL/Python function "stat_db_objects"
 NOTICE: dboid dir for database alter_db does not exist on dbid = 8
 CONTEXT: PL/Python function "stat_db_objects"
- dbid | relfilenode_dboid_relative_path | size
-------+---------------------------------+------
-    1 |                                 |
-    2 |                                 |
-    3 |                                 |
-    4 |                                 |
-    5 |                                 |
-    6 |                                 |
-    7 |                                 |
-    8 |                                 |
-(8 rows)
+ dbid | relfilenode_dboid_relative_path |  size
+------+---------------------------------+--------
+    1 |                                 |
+    2 |                                 |
+    3 |                                 |
+    4 |                                 |
+    5 | 180273/112                      |  32768
+    5 | 180273/113                      |  32768
+    5 | 180273/12390                    |  65536
+    5 | 180273/12390_fsm                |  98304
<....choping output as very long...>
+    5 | 180273/PG_VERSION               |      4
+    5 | 180273/pg_filenode.map          |   1024
+    6 |                                 |
+    7 |                                 |
+    8 |                                 |
+(337 rows)
```
Reviewed-by: Asim R P <apraveen@pivotal.io>
-
- 31 Dec 2019, 1 commit
-
-
Committed by Paul Guo
This helps script handling by checking return values. Reviewed-by: Asim R P <apraveen@pivotal.io>
-
- 30 Dec 2019, 1 commit
-
-
Committed by xiong-gang
When 'CopyReadLineText' finds a broken end-of-copy marker, it errors out without setting the current index in the buffer. If 'reject limit' is set, copy will then process the same line again.
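The failure mode above (raising an error before advancing the buffer cursor, so an error-tolerant caller re-reads the same line) can be illustrated with a toy sketch; this is a Python analogy with made-up names, not the actual COPY code:

```python
def read_lines(buf, reject_limit):
    """Toy line reader: the cursor must be advanced past a bad line
    *before* rejecting it, otherwise a reject-limit retry loop would
    see the same bad line again on the next iteration."""
    pos, good, rejected = 0, [], 0
    while pos < len(buf):
        end = buf.index("\n", pos)
        line = buf[pos:end]
        pos = end + 1                 # the fix: advance first, then reject
        if line.startswith("\\."):    # stands in for a broken EOC marker
            rejected += 1
            if rejected > reject_limit:
                raise ValueError("reject limit exceeded")
            continue
        good.append(line)
    return good, rejected

good, rejected = read_lines("a\n\\.x\nb\n", reject_limit=5)
assert good == ["a", "b"] and rejected == 1
```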
-
- 28 Dec 2019, 1 commit
-
-
Committed by Ashwin Agrawal
Based on reports from the field for GPDB, the 1 min default of the wal_sender_timeout GUC causes the primary to terminate the replication connection too often under heavy workloads. This causes the mirror to be marked down and WAL to pile up on the primary. This is mostly seen in configurations where fsync takes a long time on mirrors. Hence, a higher default value for this GUC helps avoid unnecessarily marking mirrors down. The only downside of this change is that when the connection between primary and mirror exists but the mirror doesn't respond for some reason, it will be detected a little later than with the previous 1 min timeout. But the 1 min timeout has the major downside that mirrors need to be manually recovered after being marked down, so it is desirable not to falsely break the connection due to the timeout. Increasing the timeout to 5 mins is just an educated guess, since it is hard to come up with a reasonable default, but bumping the value is desired based on the inputs. Reviewed-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
-
- 27 Dec 2019, 2 commits
-
-
Committed by Zhenghua Lyu
The previous commit c9655e2d forgot to check tuple locality for update. This commit adds a tuple-locality check for ExecUpdate and refactors the test cases.
-
Committed by Chuck Litzell
-
- 26 Dec 2019, 3 commits
-
-
Committed by Ning Yu
ZSTD creates CCtx and DCtx with malloc() by default, and a NULL pointer is returned on OOM; the callers must check for NULL pointers. Also fixed a typo in a comment. Fixes: https://github.com/greenplum-db/gpdb/issues/9294 Reported-by: shellboy Reviewed-by: Zhenghua Lyu <zlv@pivotal.io> (cherry picked from commit d74aa39f)
-
Committed by Paul Guo
We recently started to control wal write bursts by calling SyncRepWaitForLSN() more frequently, and that change made the segspace test flaky. The segspace test case has an injected fault (exec_hashjoin_new_batch) with an interrupt event, which makes it easier for the test to see the cancel event in SyncRepWaitForLSN() and then sometimes produce additional output. Fix this by disabling the current cancel handling code when it is not a commit call of SyncRepWaitForLSN(). Here is the diff of the test failure:
```
 begin;
 insert into segspace_t1_created SELECT t1.* FROM segspace_test_hj_skew AS t1, segspace_test_hj_skew AS t2 WHERE t1.i1=t2.i2;
+DETAIL: The transaction has already changed locally, it has to be replicated to standby.
 ERROR: canceling MPP operation
+WARNING: ignoring query cancel request for synchronous replication to ensure cluster consistency
 rollback;
```
Cherry-picked from 84642c4b. Besides, add the commit parameter to SyncRepWaitForLSN() following the master code. Checked the related upstream patch; adding the new parameter in the current gpdb version should be fine.
-
Committed by Ning Yu
The alter_db_set_tablespace test has been flaky for a long time; one typical failure looks like this:
```
--- /regress/expected/alter_db_set_tablespace.out
+++ /regress/results/alter_db_set_tablespace.out
@@ -1204,21 +1213,348 @@
 NOTICE: dboid dir for database alter_db does not exist on dbid = 2
 NOTICE: dboid dir for database alter_db does not exist on dbid = 3
 NOTICE: dboid dir for database alter_db does not exist on dbid = 4
-NOTICE: dboid dir for database alter_db does not exist on dbid = 5
 NOTICE: dboid dir for database alter_db does not exist on dbid = 6
 NOTICE: dboid dir for database alter_db does not exist on dbid = 7
 NOTICE: dboid dir for database alter_db does not exist on dbid = 8
```
The test disables fts probing with fault injection, but it does not wait for the fault to be triggered. The other problem is that fts probing was disabled after the PANIC, which might not be in time. So the problem was that we were injecting the fault after the fts loop was already beyond the fault point, and then when the subsequent PANIC was caused, fts was still active. By manually triggering the fault, and then waiting to ensure that it is hit at least once, we can guarantee that the scenario described above doesn't happen. Reviewed-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io> Reviewed-by: Taylor Vesely <tvesely@pivotal.io> Reviewed-by: Hubert Zhang <hzhang@pivotal.io> (cherry picked from commit 54e3af6d)
-
- 24 Dec 2019, 3 commits
-
-
Committed by (Jerome)Junfeng Yang
For the below external table:
```
CREATE EXTERNAL WEB TABLE web_ext (junk text)
execute 'echo hi' on master
FORMAT 'text' (delimiter 'OFF' null E'\\N' escape E'\\');
```
querying the table raises an unexpected error:
```
SELECT * FROM web_ext;
ERROR: using no delimiter is only supported for external tables
```
The external scan calls BeginCopyFrom to init CopyStateData. When `ProcessCopyOptions` runs in `BeginCopy`, the relation may be an external relation. The fix checks whether the relation is an external relation and, if so, sets the correct parameters for `ProcessCopyOptions`.
-
Committed by Ashwin Agrawal
Similar to commit 8c40565a, apply the same change to the commit_blocking_on_standby test as well. Checking sync_state is unnecessary and makes the test flaky; the only alternative would be to add retries, so not checking it is better.
-
Committed by Ashwin Agrawal
This test sometimes fails with the below diff:
```
 -- Sync state between master and standby must be restored at the end.
 select application_name, state, sync_state from pg_stat_replication;
  application_name |   state   | sync_state
 ------------------+-----------+------------
- gp_walreceiver   | streaming | sync
+ gp_walreceiver   | streaming | async
 (1 row)
```
The reason: the query may be executed in the window between when the standby is created and changes state to streaming, but before flush is set to a valid location based on a reply from the standby. pg_stat_get_wal_senders() reports sync_state as "async" if the flush location is an invalid pointer. Hence we sometimes get the above diff depending on timing. To fix this, remove the sync_state field from the above query in this test. Since in GPDB we always create the standby as sync-only, "state" already tells us what we wish to check here, namely whether the standby is up and running. Checking sync_state is unnecessary, so avoid it and make the test stable. If we had to keep sync_state, we would have to add unnecessary retry logic for this query.
-
- 23 Dec 2019, 3 commits
-
-
Committed by Zhenghua Lyu
transformRelOptions may return a null pointer in some cases; add the check in the function `add_partition_rule`.
-
Committed by Heikki Linnakangas
The Motion sender code has four different codepaths for serializing a tuple from the input slot:

1. Fetch MemTuple from slot, copy it out as it is.
2. Fetch MemTuple from slot, re-format it into a new MemTuple by fetching and inlining any toasted datums. Copy out the re-formatted MemTuple.
3. Fetch HeapTuple from slot, copy it out as it is.
4. Fetch HeapTuple from slot, copy out each attribute separately, fetching and inlining any toasted datums.

In addition to the above, there are "direct" versions of codepaths 1 and 3, used when the tuple fits in the caller-provided output buffer. As discussed in https://github.com/greenplum-db/gpdb/issues/9253, the fourth codepath is very inefficient if the input tuple contains datums that are compressed inline, but not toasted. We decompress such tuples before serializing, and in the worst case might need to recompress them again in the receiver if they are written out to a table. I tried to fix that in commit 4c7f6cf7, but it was broken and was reverted in commit 774613a8. This is a new attempt at fixing the issue. This commit removes codepath 4 altogether, so that if the input tuple is a HeapTuple with any toasted attributes, it is first converted to a MemTuple and codepath 2 is used to serialize it. That way, we have less code to test, and materializing a MemTuple is roughly as fast as the old code to write out the attributes of a HeapTuple one by one, except that the MemTuple codepath avoids the decompression of already-compressed datums. While we're at it, add some tests for the various codepaths through SerializeTuple().

To test the performance of the affected case, where the input tuple is a HeapTuple with toasted datums, I used this:
```
CREATE temporary TABLE foo (a text, b text, c text, d text, e text, f text, g text, h text, i text, j text, k text, l text, m text, n text, o text, p text, q text, r text, s text, t text, u text, v text, w text, x text, y text, z text, large text);
ALTER TABLE foo ALTER COLUMN large SET STORAGE external;
INSERT INTO foo SELECT 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', repeat('1234567890', 1000) FROM generate_series(1, 10000);
-- verify that the data is uncompressed, should be about 110 MB.
SELECT pg_total_relation_size('foo');
\o /dev/null
\timing on
SELECT * FROM foo; -- repeat a few times
```
The last select took about 380 ms on my laptop, with or without this patch. So the new codepath, where the input HeapTuple is converted to a MemTuple first, is about as fast as the old method. There might be small differences in the serialized size of the tuple, too, but I didn't explicitly measure that. If you have a toasted but not compressed datum, the input must be quite large, so small differences in the datum header sizes shouldn't matter much. If the input HeapTuple contains any compressed datums, this avoids the recompression, so even if converting to a MemTuple were somewhat slower in that case, it should still be much better than before. I kept the HeapTuple codepath for the case that there are no toasted datums. I'm not sure it's significantly faster than converting to a MemTuple either; the caller has to slot_deform_tuple() the received tuple before it can do much with it, and that is slower with HeapTuples than MemTuples. But that codepath is straightforward enough that getting rid of it wouldn't save much code, and I don't feel like doing the performance testing to justify it right now. Reviewed-by: Asim R P <apraveen@pivotal.io>
-
Committed by Heikki Linnakangas
It cannot return NULL: it will either return a valid pointer, or the palloc() will ERROR out. Reviewed-by: Asim R P <apraveen@pivotal.io>
-
- 21 Dec 2019, 3 commits
-
-
Committed by ggbq
The master, with mode COPY_DISPATCH, incorrectly allocated the TupleTableSlot for a new ResultRelInfo of a partition on the per-tuple memory context. This can happen in the presence of ResultRelInfo::ri_partInsertMap. It causes a crash because the per-tuple context is reset for each tuple iteration; it should be using the per-query context. Reproduce the crash using the following SQL commands:
```
DROP TABLE IF EXISTS partition_test;
CREATE TABLE partition_test (id INT, tm TIMESTAMP)
DISTRIBUTED BY (id)
PARTITION BY RANGE(tm)
(
    PARTITION p2019 START ('2019-01-01'::TIMESTAMP) END ('2020-01-01'::TIMESTAMP),
    DEFAULT PARTITION extra
);
ALTER TABLE partition_test ADD COLUMN dd TIMESTAMP;
ALTER TABLE partition_test DROP COLUMN dd;
ALTER TABLE partition_test ADD COLUMN dd TEXT;
ALTER TABLE partition_test SPLIT DEFAULT PARTITION
    START ('2020-01-01'::TIMESTAMP) END ('2021-01-01'::TIMESTAMP)
    INTO (PARTITION p2020, DEFAULT PARTITION);
COPY (SELECT generate_series, '2020-12-20'::TIMESTAMP, 'ABCDEF' FROM generate_series(1, 10000)) TO '/tmp/partition_test.txt';
COPY partition_test FROM '/tmp/partition_test.txt';
```
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
Commit 589c737e bumped the expected ORCA version number to 3.86, but forgot to update the error message.
-
- 20 Dec 2019, 5 commits
-
-
Committed by Hao Wu
In PR https://github.com/greenplum-db/gpdb/pull/9248 we set the default value of wal_keep_segments to 0, the same as upstream, because we have a replication slot to avoid removal of WAL files required by the mirror. That seems fine, but there is no replication slot for master/standby yet, so it is unsafe to remove WAL files that may be required by the standby. For now, until a replication slot is added for the master, set the default value of wal_keep_segments to 5. (cherry picked from commit 3ce78553)
-
Committed by Ashwin Agrawal
To cover both ALTER TABLE and CTAS, heap_insert() is the common place, so we felt it better to have the call in heap_insert() instead of spreading calls for just those two functionalities. Vacuum full uses the cluster code, so we placed a separate call for vacuum full, which also covers cluster. Lazy vacuum needed a separate call as well. Co-authored-by: Adam Lee <ali@pivotal.io> Reviewed-by: Paul Guo <pguo@pivotal.io> (cherry picked from commit 22b4073d)
-
Committed by Ashwin Agrawal
Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Paul Guo <pguo@pivotal.io> (cherry picked from commit eae1c6ef)
-
Committed by Ashwin Agrawal
Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Paul Guo <pguo@pivotal.io> (cherry picked from commit cf254d1d)
-
Committed by Ashwin Agrawal
On commit, transactions in GPDB wait for replication and make sure WAL is flushed up to the commit lsn on the mirror. While commit is the mandatory sync/wait point, waiting for replication at periodic intervals even before that may be desirable and more efficient, to act as a good citizen in the system. Consider for example a setup where the primary and mirror can write at 20GB/sec, while the network between them can only transfer 2GB/sec. If a CTAS for a large table is run in such a setup, it can generate WAL very aggressively on the primary, which cannot be transferred at that rate to the mirror; hence pending WAL builds up on the primary. This has two main consequences:

- new write transactions (even single-tuple I/U/D) see latency equivalent to the time it takes for the pending WAL to be shipped and flushed to the mirror
- the primary needs space to hold that much WAL, since WAL cannot be recycled until it has been shipped to the mirror

So, to improve the situation, instead of waiting for the mirror only at the commit point, we need a way to keep the primary from racing ahead with WAL generation, and instead move large transactions at a speed sustainable by the network and mirrors. This helps avoid bulk transactions starving concurrent transactions from committing due to sync rep. Add a global (backend-local) variable which tracks the amount of wal written by the transaction, and an interface `wait_to_avoid_large_repl_lag()` which can be called at strategic points to wait for replication. If a threshold amount of WAL (defined by a new GUC) has been written by the transaction, this interface calls SyncRepWaitForLSN() with an LSN equal to the cached value of the WAL flush point. Using this interface, transactions that generate a lot of WAL can wait for replication based on the amount of WAL they have written, well before reaching the commit point.

Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/3qMsyIj3ikA/bcioZv8wAQAJ
Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Paul Guo <pguo@pivotal.io> (cherry picked from commit 0aec3c8f)
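The throttling idea above can be sketched as a toy model (Python for illustration; the class name, the 64MB threshold, and the `sync_rep_wait` callback are assumptions standing in for the GUC and SyncRepWaitForLSN(), not the actual GPDB code):

```python
class WalThrottle:
    """Backend-local tracker: after every threshold bytes of WAL
    written by this transaction, wait for the mirror to catch up
    to the cached flush point, then start counting again."""

    def __init__(self, threshold_bytes, sync_rep_wait):
        self.threshold = threshold_bytes
        self.sync_rep_wait = sync_rep_wait  # stand-in for SyncRepWaitForLSN
        self.wal_written = 0
        self.flush_lsn = 0

    def record_wal_write(self, nbytes, lsn):
        self.wal_written += nbytes
        self.flush_lsn = max(self.flush_lsn, lsn)

    def wait_to_avoid_large_repl_lag(self):
        # Only wait once the transaction has written enough WAL;
        # then reset the counter for the next interval.
        if self.wal_written >= self.threshold:
            self.sync_rep_wait(self.flush_lsn)
            self.wal_written = 0

waits = []
t = WalThrottle(threshold_bytes=64 << 20, sync_rep_wait=waits.append)
for i in range(10):
    t.record_wal_write(16 << 20, lsn=i)   # 16MB of WAL per step
    t.wait_to_avoid_large_repl_lag()
assert len(waits) == 2   # waited after 64MB and again after 128MB
```

A bulk load thus pauses periodically instead of letting WAL pile up until commit, which is the behavior the commit message describes.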
-