1. 18 September 2020 (1 commit)
    • Align Orca relhasindex behavior with Planner (#10788) · 8083a046
      Authored by David Kimura
      Function `RelationGetIndexList()` does not filter out invalid indexes;
      that responsibility is left to the caller (e.g. `get_relation_info()`).
      The issue is that Orca was not checking index validity.
      
      This commit also introduces an optimization to Orca that is already used
      in Planner, whereby we first check relhasindex before checking pg_index
      (see the sketch below).
      
      (cherry picked from commit b011c351)
      8083a046
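      Below is a minimal sketch of the pattern described above, loosely based on
      how PostgreSQL's planner handles it in `get_relation_info()`; it is not the
      literal Orca translator code, and the helper name is illustrative.

      ```
      /* Sketch only: gather OIDs of valid indexes, checking relhasindex first so
       * relations without indexes never touch pg_index. */
      static List *
      get_valid_index_oids(Relation rel)
      {
          List     *result = NIL;
          ListCell *lc;

          if (!rel->rd_rel->relhasindex)
              return NIL;                 /* cheap pre-check, no pg_index scan */

          foreach(lc, RelationGetIndexList(rel))
          {
              Oid      indexOid = lfirst_oid(lc);
              Relation indexRel = index_open(indexOid, AccessShareLock);

              /* RelationGetIndexList() does not filter these out */
              if (indexRel->rd_index->indisvalid)
                  result = lappend_oid(result, indexOid);

              index_close(indexRel, AccessShareLock);
          }
          return result;
      }
      ```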
  2. 17 September 2020 (2 commits)
    • Do not read a persistent tuple after it is freed · 5f765a8e
      Authored by Asim R P
      This bug was found in a production environment where vacuum on
      gp_persistent_relation was concurrently running with a backend
      performing end-of-xact filesystem operations.  And the GUC
      debug_persistent_print was enabled.
      
      The *_ReadTuple() function was called on a persistent TID after the
      corresponding tuple was deleted with frozen transaction ID.  The
      concurrent vacuum recycled the tuple and it led to a SIGSEGV when the
      backend tried to access values from the tuple.
      
      Fix it by skipping the debug log message in the case where the persistent
      tuple is freed (transitioning to the FREE state); all other state
      transitions are still logged (see the sketch below).
      
      In the absence of a concurrent vacuum, things worked just fine because the
      *_ReadTuple() interface reads tuples from persistent tables directly
      using TID.
      5f765a8e
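      A rough sketch of the shape of the fix; the state name, the tuple-describing
      helper, and the C-level spelling of the debug_persistent_print GUC variable
      are assumptions, not the real GPDB identifiers.

      ```
      /* Sketch only: never dereference the tuple for logging once it has
       * transitioned to FREE, because a concurrent VACUUM may recycle it. */
      if (Debug_persistent_print && newState != PERSISTENT_STATE_FREE)
          elog(LOG, "persistent tuple %s moved to state %d",
               DescribePersistentTuple(persistentTid), newState);
      ```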
    • Skip FK check when do relation truncate · b50c134b
      Authored by Weinan WANG
      GPDB does not support foreign keys, but it keeps the FK grammar in DDL,
      since that reduces the manual workload of migrating databases from other
      systems. Hence the FK check is not needed for the TRUNCATE command; get
      rid of it (see the sketch below).
      b50c134b
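      A sketch of the kind of change this implies, assuming the stock PostgreSQL
      call site in ExecuteTruncate(); the actual GPDB 5X diff may look different.

      ```
      /* In ExecuteTruncate(): skip the foreign-key dependency check entirely,
       * since GPDB only accepts FK syntax and never enforces the constraints. */
      #if 0
          /* Upstream: error out unless every referencing table is also listed
           * in the TRUNCATE (or CASCADE is given). */
          heap_truncate_check_FKs(rels, false);
      #endif
      ```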
  3. 11 September 2020 (1 commit)
  4. 10 September 2020 (5 commits)
    • Add .git-blame-ignore-revs · 9b8a2a2f
      Authored by Jesse Zhang
      This file will be used to record commits to be ignored by default by
      git-blame (the user still has to opt in; a usage example follows below).
      It is intended to include large (generally automated) reformatting or
      renaming commits.
      
      (cherry picked from commit b19e6abb)
      9b8a2a2f
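      For reference, this is how a user typically opts in (standard git 2.23+
      options; not part of the commit itself):

      ```
      # One-off: ignore the listed revisions for a single blame run
      git blame --ignore-revs-file .git-blame-ignore-revs path/to/file.c

      # Or opt in permanently for this clone
      git config blame.ignoreRevsFile .git-blame-ignore-revs
      ```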
    • gpstart: skip filespace checks for standby when unreachable · 0ceee69f
      Authored by Kalen Krempely
      When the standby is unreachable and the user proceeds with startup,
      gpstart fails to start when temporary or transaction files have been
      moved to a non-default filespace.
      
      To determine when the standby is unreachable, fetch_tli was reworked to
      raise a StandbyUnreachable exception, and the standby is not started if
      it is unreachable.
      Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@vmware.com>
      0ceee69f
    • behave: test that gpstart continues if standby is unreachable · e037b5ba
      Authored by Jacob Champion
      Add a failing behave test to ensure that gpstart prompts and continues
      successfully if the standby host is unreachable. The subsequent commit
      will fix the test case.
      Co-authored-by: Kalen Krempely <kkrempely@vmware.com>
      e037b5ba
    • 706acd7b
    • Allow direct dispatch in Orca if predicate on column gp_segment_id (#10679) (#10785) · b52d5b9e
      Authored by David Kimura
      This approach special-cases gp_segment_id just enough to include the
      column as a distribution-column constraint. It also updates the direct
      dispatch info to be aware of gp_segment_id, which holds the raw value of
      the segment where the data resides. This is different from other columns,
      which hash the datum value to decide where the data resides.
      
      After this change, the following query shows a Gather Motion from 2
      segments on a 3-segment demo cluster.
      
      ```
      CREATE TABLE t(a int, b int) DISTRIBUTED BY (a);
      EXPLAIN SELECT gp_segment_id, * FROM t WHERE gp_segment_id=1 or gp_segment_id=2;
                                        QUERY PLAN
      -------------------------------------------------------------------------------
       Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..431.00 rows=1 width=12)
         ->  Seq Scan on t  (cost=0.00..431.00 rows=1 width=12)
               Filter: ((gp_segment_id = 1) OR (gp_segment_id = 2))
       Optimizer: Pivotal Optimizer (GPORCA)
      (4 rows)
      
      ```
      
      (cherry picked from commit 10e2b2d9)
      
      * Bump ORCA version to 3.110.0
      b52d5b9e
  5. 09 September 2020 (4 commits)
  6. 04 September 2020 (1 commit)
  7. 03 September 2020 (5 commits)
    • Fix formatting issue in answer file · 13baea67
      Authored by Hubert Zhang
      13baea67
    • Bump ORCA version to 3.109, add test cases for corr subq with LOJs (#10512) · c023d9db
      Authored by Hans Zeller
      * Add test cases for correlated subqueries with outer joins
      
      We found several problems with outer references in outer joins and
      related areas, especially when using optimizer_join_order = exhaustive2.
      
      Adding some tests. Please note that due to some remaining problems in
      both ORCA and planner, the tests contain some FIXMEs.
      
      * Bump ORCA version to 3.109.0
      c023d9db
    • Using lwlock to protect resgroup slot in session state · 1e24b618
      Authored by Hubert Zhang
      Resource groups used to access resGroupSlot in SessionState without a
      lock. That is correct as long as a session only accesses its own
      resGroupSlot, but with the runaway feature we need to traverse the
      session array to find the top memory consumer when the red zone is
      reached. This requires that (see the sketch below):
      1. the runaway detector hold the resgroup lock in shared mode, so that a
      resGroupSlot cannot be detached from a session concurrently while the
      red zone is being handled;
      2. a normal session hold the lock in exclusive mode when modifying the
      resGroupSlot in its SessionState.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      
      (cherry picked from commit a4cb06b4)
      1e24b618
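      A minimal sketch of the locking protocol described above, assuming
      PostgreSQL's LWLock API and a ResGroupLock; the array and helper names are
      illustrative, and the real code in the GPDB resource-group module differs
      in detail.

      ```
      /* Runaway detector: read-only traversal of all sessions' slots. */
      LWLockAcquire(ResGroupLock, LW_SHARED);
      for (int i = 0; i < NumSessionStateEntries; i++)
      {
          SessionState *s = &AllSessionStates[i];     /* illustrative names */

          if (s->resGroupSlot != NULL)
              ConsiderAsRunawayCandidate(s);
      }
      LWLockRelease(ResGroupLock);

      /* Normal session: attach/detach its own slot under the exclusive lock. */
      LWLockAcquire(ResGroupLock, LW_EXCLUSIVE);
      MySessionState->resGroupSlot = slot;            /* or NULL on detach */
      LWLockRelease(ResGroupLock);
      ```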
    • Fix resource group runaway rounding issue · e9223710
      Authored by Hubert Zhang
      When calculating the runaway safeChunksThreshold in resource groups, we
      used to divide by 100 to get the number of safe chunks, which can round
      small chunk counts down to zero. Fix it by storing safeChunksThreshold100
      (100 times the real safe-chunk count) and doing the computation on the
      fly (see the sketch below).
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      (cherry picked from commit 757184f9)
      e9223710
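      A sketch of the scaled-by-100 idea; apart from safeChunksThreshold100, the
      variable and helper names are illustrative.

      ```
      /* Old (lossy): integer division can round the threshold down to 0 for
       * small groups:
       *     safeChunksThreshold = totalChunks * runawayPercent / 100;
       *     if (usedChunks > safeChunksThreshold) ...
       *
       * New: keep the value scaled by 100 and compare without pre-dividing. */
      int64 safeChunksThreshold100 = (int64) totalChunks * runawayPercent;

      if ((int64) usedChunks * 100 > safeChunksThreshold100)
          HandleRunaway();
      ```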
    • Correctly use atomic variable in ResGroupControl.freeChunks. (#8434) · 1557fd13
      Authored by Paul Guo
      This variable was accessed with a mix of atomic API functions and direct
      reads and writes. That is usually not wrong in real scenarios, but it is
      not a good implementation, since 1) it depends on the compiler and the
      hardware to ensure the correctness of the direct accesses, and 2) the
      code is not graceful.
      
      Change all accesses to go through the atomic API functions (see the
      sketch below).
      Reviewed-by: Georgios Kokolatos <gkokolatos@pivotal.io>
      (cherry picked from commit f59307f5)
      1557fd13
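      An illustrative sketch of the principle (every access goes through one
      atomic API, never a raw read of the same variable), written with C11
      atomics rather than the backend's own atomic wrappers.

      ```
      #include <stdatomic.h>
      #include <stdbool.h>

      static atomic_int freeChunks;    /* stand-in for ResGroupControl.freeChunks */

      /* Good: both the check and the update use the atomic API. */
      static bool
      reserve_chunks(int n)
      {
          int old = atomic_load(&freeChunks);

          while (old >= n)
          {
              if (atomic_compare_exchange_weak(&freeChunks, &old, old - n))
                  return true;         /* reserved n chunks */
          }
          return false;                /* not enough free chunks */
      }

      /* Bad (what the commit removes): a raw read such as
       *     if (freeChunks >= n) ...
       * mixed with atomic updates of the same variable elsewhere. */
      ```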
  8. 29 August 2020 (2 commits)
    • Fix double deduction of FREEABLE_BATCHFILE_METADATA · 567025bd
      Authored by Jesse Zhang
      Earlier, we always deducted FREEABLE_BATCHFILE_METADATA inside
      closeSpillFile(), regardless of whether the spill file was already
      suspended. This deduction is already performed inside
      suspendSpillFiles(), so the double accounting drives
      hashtable->mem_for_metadata negative (see the sketch below) and we get:
      
      FailedAssertion("!(hashtable->mem_for_metadata > 0)", File: "execHHashagg.c", Line: 2019)
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
      567025bd
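      A sketch of the shape of the fix, with a hypothetical "already suspended"
      flag; the real bookkeeping lives in execHHashagg.c and differs in detail.

      ```
      /* Sketch only: deduct the metadata quota exactly once per batch file.
       * suspendSpillFiles() already deducted for the files it suspended, so
       * closeSpillFile() must skip those. */
      if (!spill_file->suspended)          /* hypothetical flag */
          hashtable->mem_for_metadata -= FREEABLE_BATCHFILE_METADATA;

      /* ... then close the underlying workfile as before ... */
      ```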
    • Fix assert condition in spill_hash_table() · 679ed508
      Authored by Jesse Zhang
      This commit fixes the following assertion failure reported in issue
      #9902 (https://github.com/greenplum-db/gpdb/issues/9902):
      
      FailedAssertion("!(hashtable->nbuckets > spill_set->num_spill_files)", File: "execHHashagg.c", Line: 1355)
      
      hashtable->nbuckets can actually end up equal to
      spill_set->num_spill_files, which causes the failure. This is because
      hashtable->nbuckets is set from HashAggTableSizes->nbuckets, which can
      end up equal to gp_hashagg_default_nbatches (see:
      nbuckets = Max(nbuckets, gp_hashagg_default_nbatches);),
      while spill_set->num_spill_files is set from
      HashAggTableSizes->nbatches, which is in turn set to
      gp_hashagg_default_nbatches.
      
      Thus, these two entities can be equal (see the sketch below).
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
      (cherry picked from commit 067bb350)
      679ed508
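      The message implies the assertion only needs to permit equality; a sketch
      of that relaxation (presumably what the fix amounts to, though the actual
      patch may also adjust the sizing logic):

      ```
      /* Old: fails when nbuckets == num_spill_files, which is a legal state. */
      Assert(hashtable->nbuckets > spill_set->num_spill_files);

      /* Relaxed: both values may legitimately be gp_hashagg_default_nbatches. */
      Assert(hashtable->nbuckets >= spill_set->num_spill_files);
      ```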
  9. 27 August 2020 (3 commits)
    • Error out when changing datatype of column with constraint. (#10712) · 9ebc0423
      Authored by (Jerome)Junfeng Yang
      Raise a meaningful error message for this case.
      GPDB doesn't support ALTER TYPE on a primary key or unique constraint
      column, because that requires drop-and-recreate logic. The drop currently
      only happens on the master, which leads to an error when recreating the
      index (the recreate is dispatched to the segments, where the old
      constraint index still exists).
      
      This fixes the issue https://github.com/greenplum-db/gpdb/issues/10561.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      (cherry picked from commit 32446a32)
      9ebc0423
    • Fix assertion failures in BackoffSweeper · c9f2a816
      Authored by ggbq
      Previous commits ab74e1c6 and c7befb1d did not completely solve the race
      condition: they did not test for the last iteration of the while/for
      loop, which could result in a failed assertion in the following loop.
      This patch moves the check to the end of the for loop, which is safe
      because the first iteration can never trigger Assert(activeWeight > 0.0).
      
      Also, another race condition can trigger the assertion
      Assert(gl->numFollowersActive > 0). Consider this situation:
      
          Backend A, B belong to the same statement.
      
          Timestamp1: backend A's leader is A, backend B's leader is B.
      
          Timestamp2: backend A's numFollowersActive remains zero due to timeout.
      
          Timestamp3: Sweeper calculates leader B's numFollowersActive to 1.
      
          Timestamp4: backend B changes its leader to A even if A is inactive.
      
      We stop sweeping for this race condition just like commit ab74e1c6 did.
      
      Both Assert(activeWeight > 0.0) and Assert(gl->numFollowersActive > 0)
      are removed.
      
      (cherry picked from commit b1c19196)
      c9f2a816
    • Minimize the race condition in BackoffSweeper() · a3233b6b
      Authored by Pengzhou Tang
      There is a long-standing race condition in BackoffSweeper() which
      triggers an error and then another assertion failure because
      sweeperInProgress is not reset to false.
      
      This commit doesn't resolve the race condition fundamentally with a lock
      or another implementation, because the whole backoff mechanism never
      asked for accurate control, so skipping some sweeps should be fine for
      now. We also downgrade the log level to DEBUG because a restart of the
      sweeper backend is unnecessary.
      
      (cherry picked from commit ab74e1c6)
      a3233b6b
  10. 26 August 2020 (4 commits)
    • PANIC when the shared memory is corrupted · 4f5a2c23
      Authored by xiong-gang
      shmNumGxacts and shmGxactArray are accessed under the protection of
      shmControlLock; this commit adds some defensive code so that we PANIC at
      the earliest point when the shared memory is corrupted (see the sketch
      below).
      4f5a2c23
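      A sketch of what such a defensive check can look like, reusing the names
      from the message; the capacity constant and the exact pointer types are
      assumptions about the real code in the distributed-transaction manager.

      ```
      /* Sketch only: with shmControlLock held, validate the shared bookkeeping
       * and PANIC as soon as an impossible state is seen, instead of letting
       * later code read or write past shmGxactArray. */
      if (*shmNumGxacts < 0 || *shmNumGxacts > max_tm_gxacts)
          elog(PANIC, "corrupted shared memory: shmNumGxacts = %d",
               *shmNumGxacts);
      ```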
    • Fix dblink's libpq issue on gpdb 5X (#10695) · ed85fe85
      Authored by Xiaoran Wang
      * Fix dblink's libpq issue
      
      When using dblink to connect to a Postgres database, it reports the
      following error:
      unsupported frontend protocol 28675.0: server supports 2.0 to 3.0
      
      Even though dblink.so is dynamically linked against libpq.so, which is
      compiled with the option -DFRONTEND, when it is loaded into GPDB and run
      it ends up using the backend libpq code that is compiled into the
      postgres program, and reports the error above. So we define FRONTEND
      before including libpq-fe.h (see the sketch below).
      
      * dblink can't be built on Mac
      ed85fe85
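      A sketch of the header-inclusion fix described above; the exact file and
      surrounding includes are assumptions.

      ```
      /* contrib/dblink/dblink.c (sketch): make sure the frontend variants of
       * the libpq symbols are selected before any libpq header is pulled in. */
      #ifndef FRONTEND
      #define FRONTEND 1
      #endif

      #include "libpq-fe.h"
      ```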
    • Fix gp_error_handling makefile · b6970883
      Authored by Xiaoran Wang
      b6970883
    • Harden analyzedb further against dropped/recreated tables (#10704) · 62013e67
      Authored by Chris Hajas
      Commit 445fc7cc hardened some parts of analyzedb. However, it missed a
      couple of cases.
      
      1) When the statement that gets the modcount from the pg_aoseg table
      failed due to a dropped table, the transaction was also terminated. This
      caused further modcount queries to fail, so although those tables were
      analyzed, analyzedb would error out and not record the modcount properly.
      Therefore, we now restart the transaction when it errors.
      
      2) If the table is dropped and then recreated while analyzedb is running
      (or some other mechanism that results in the table being successfully
      analyzed, but the pg_aoseg table did not exist during the initial
      check), the logic to update the modcount may fail. Now, we skip the
      update for the table if this occurs. In this case, the modcount would
      not be recorded and the next analyzedb run will consider the table
      modified (or dirty) and re-analyze it, which is the desired behavior.
      
      Note: this isn't as hardened as GPDB master/6X, which benefit from
      improvements in newer versions of PyGreSQL, so there is still a window
      where dropped tables cause analyzedb to fail.
      62013e67
  11. 25 August 2020 (1 commit)
    • Fix unexpected corrupt of persistent filespace table (#10623) · 424e382a
      Authored by Tang Pengzhou
      For a segment whose primary is down and whose mirror has been promoted
      to primary, running gp_remove_segment_mirror removes the mirror of the
      segment, and the mirror-related fields are cleaned up in
      gp_persistent_filespace_node. But when gp_remove_segment_mirror is run
      for the same segment again, the primary-related fields are also cleaned
      up, which is wrong and not expected.
      
      Such a case was observed in production when gprecoverseg -F was
      interrupted in the middle of __updateSystemConfigRemoveAddMirror() and
      run again.
      Reviewed-by: Asim R P <pasim@vmware.com>
      424e382a
  12. 13 August 2020 (2 commits)
    • Modify error context callback functions to not assume that they can fetch · dc572635
      Authored by (Jerome)Junfeng Yang
      catalog entries via SearchSysCache and related operations.  Although, at the
      time that these callbacks are called by elog.c, we have not officially aborted
      the current transaction, it still seems rather risky to initiate any new
      catalog fetches.  In all these cases the needed information is readily
      available in the caller and so it's just a matter of a bit of extra notation
      to pass it to the callback.
      
      Per crash report from Dennis Koegel.  I've concluded that the real fix for
      his problem is to clear the error context stack at entry to proc_exit, but
      it still seems like a good idea to make the callbacks a bit less fragile
      for other cases.
      
      Backpatch to 8.4.  We could go further back, but the patch doesn't apply
      cleanly.  In the absence of proof that this fixes something and isn't just
      paranoia, I'm not going to expend the effort.
      
      (cherry picked from commit a836abe9)
      Note: the changes from the above commit in `inline_set_returning_function`
      are not included because the function does not exist in 5X right now. A
      sketch of the callback pattern follows below.
      Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
      dc572635
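      A sketch of the pattern the commit enforces, using the standard PostgreSQL
      ErrorContextCallback machinery: the caller passes whatever the callback
      needs through `arg`, so the callback never does catalog lookups of its
      own. The struct, callback, and variable names here are illustrative.

      ```
      typedef struct MyCallbackArg
      {
          const char *relname;    /* resolved by the caller, up front */
          const char *attname;
      } MyCallbackArg;

      static void
      my_error_callback(void *arg)
      {
          MyCallbackArg *info = (MyCallbackArg *) arg;

          /* No SearchSysCache() here: just report what the caller handed us. */
          errcontext("while processing column \"%s\" of relation \"%s\"",
                     info->attname, info->relname);
      }

      /* Caller side, around the code that might elog(): */
      ErrorContextCallback errcallback;
      MyCallbackArg        cbarg = { relname, attname };

      errcallback.callback = my_error_callback;
      errcallback.arg = (void *) &cbarg;
      errcallback.previous = error_context_stack;
      error_context_stack = &errcallback;

      /* ... risky work ... */

      error_context_stack = errcallback.previous;
      ```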
  13. 12 August 2020 (1 commit)
    • Print CTID when we detect data distribution wrong for UPDATE|DELETE. · 324b7834
      Authored by Zhenghua Lyu
      When an UPDATE or DELETE statement errors out because the CTID does not
      belong to the local segment, we should also print the CTID of the tuple,
      so that it is much easier to locate the wrongly distributed data via:
        `select * from t where gp_segment_id = xxx and ctid='(aaa,bbb)'`.
      324b7834
  14. 03 August 2020 (1 commit)
    • Resolve high `CacheMemoryContext` usage for `ANALYZE` on large partition table. (#10555) · 3d41c361
      Authored by (Jerome)Junfeng Yang
      In some cases, the merge-stats logic for a root partition table may
      consume very high memory in CacheMemoryContext. This can lead to
      `Canceling query because of high VMEM usage` when partition tables are
      ANALYZEd concurrently.
      
      For example, suppose there are several root partition tables, each with
      thousands of leaf tables, all of them wide tables with hundreds of
      columns. When analyze()/auto_stats() runs on leaf tables concurrently,
      `leaf_parts_analyzed` consumes lots of memory (catalog cache entries for
      pg_statistic and pg_attribute) under CacheMemoryContext in each backend,
      which may hit the VMEM protect limit.
      In `leaf_parts_analyzed`, a single backend's leaf-table analysis for one
      root partition table may add up to
      number_of_leaf_tables * number_of_columns cache entries from pg_statistic
      and as many again from pg_attribute.
      Setting the GUC `optimizer_analyze_root_partition` or
      `optimizer_analyze_enable_merge_of_leaf_stats` to false skips the merge
      of stats for the root table, so `leaf_parts_analyzed` is not executed.
      
      To resolve this issue:
      1. When checking whether merged stats are available for a root table in
      `leaf_parts_analyzed`, first check whether all leaf tables are ANALYZEd;
      if un-ANALYZEd leaf tables still exist, return quickly to avoid touching
      pg_attribute and pg_statistic for every column of every leaf table (this
      saves a lot of time). Also, don't rely on the system catalog cache; use
      an index scan to fetch the stats tuple, to avoid one-off cache usage (in
      common cases). A sketch of the index-scan approach follows below.
      
      2. When merging stats in `merge_leaf_stats`, likewise don't rely on the
      system catalog cache and use an index scan to fetch the stats tuple.
      
      There are side effects of not relying on the system catalog cache (all
      of them **rare** situations):
      1. INSERT/UPDATE/COPY into several leaf tables under the **same root
      partition** table in the **same session**, when all leaf tables are
      already **analyzed**, becomes much slower, since auto_stats calls
      `leaf_parts_analyzed` every time a leaf table gets updated and we no
      longer rely on the system catalog cache.
      (`set optimizer_analyze_enable_merge_of_leaf_stats=false` avoids this.)
      
      2. ANALYZE of the same root table several times in the same session is
      much slower than before, since we don't rely on the system catalog cache.
      
      This solution appears to improve ANALYZE performance, and it also keeps
      ANALYZE from hitting the memory issue.
      
      (cherry picked from commit 533a47dd)
      3d41c361
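      A sketch of the "fetch the stats tuple by index instead of through the
      syscache" idea; the catalog, index, and scan-key names follow stock
      PostgreSQL conventions of that era and are assumptions about the actual
      GPDB 5X patch.

      ```
      /* Sketch only: look up the pg_statistic row for (relid, attnum) with an
       * index scan, so nothing is added to the backend's catalog cache. */
      static HeapTuple
      fetch_stats_tuple_by_index(Oid relid, AttrNumber attnum)
      {
          Relation    rel = heap_open(StatisticRelationId, AccessShareLock);
          ScanKeyData key[2];
          SysScanDesc scan;
          HeapTuple   tup;

          ScanKeyInit(&key[0], Anum_pg_statistic_starelid,
                      BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(relid));
          ScanKeyInit(&key[1], Anum_pg_statistic_staattnum,
                      BTEqualStrategyNumber, F_INT2EQ, Int16GetDatum(attnum));

          scan = systable_beginscan(rel, StatisticRelidAttnumIndexId, true,
                                    SnapshotNow, 2, key);
          /* heap_copytuple() returns NULL if no row was found */
          tup = heap_copytuple(systable_getnext(scan));

          systable_endscan(scan);
          heap_close(rel, AccessShareLock);
          return tup;
      }
      ```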
  15. 31 July 2020 (1 commit)
    • Bump Orca version to v3.108.0 (#10559) · d3a18041
      Authored by Chris Hajas
      Corresponding Orca commit: "Add knowledge of partition selectors to
      Orca's DPv2 algorithm"
      
      Also includes "Do not allocate MemoryPoolManager from a memory pool".
      d3a18041
  16. 23 July 2020 (1 commit)
    • Allow merging of statistics for domain types · 33f71c07
      Authored by Ashuka Xue
      Prior to this commit, incremental analyze would error out when merging
      statistics of partition tables containing domain types with a message
      saying that the domain type is not hashable.
      
      We should not be trying to hash the domain type; instead we should hash
      the underlying base type of the domain (see the sketch below). We noticed
      that domains of array types are not merged in any circumstance, due to
      logic in `isGreenplumDbHashable` inside analyze.c (for example, stats for
      `CREATE DOMAIN int32_arr int[]` will not be merged; we compute scalar
      stats instead). We did not want to enable that functionality, as it would
      have other ramifications in GPDB5.
      
      Example:
      ```
      CREATE DOMAIN int32 int;
      
      CREATE TABLE foo (x int32) PARTITION BY RANGE (x) (START (1) END (5) EVERY (1));
      INSERT INTO foo SELECT i % 4 + 1 FROM generate_series(1, 20) i;
      ANALYZE foo;
      ```
      will no longer error out during ANALYZE, but will generate proper
      statistics for table foo.
      Co-authored-by: Ashuka Xue <axue@vmware.com>
      Co-authored-by: Chris Hajas <chajas@vmware.com>
      33f71c07
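      A sketch of the base-type resolution described above, using PostgreSQL's
      getBaseType(); isGreenplumDbHashable() is named in the commit, but how it
      is called here is an assumption.

      ```
      /* Sketch only: when deciding whether a column's statistics can be merged
       * by hashing, resolve a domain down to its base type first. */
      Oid typid = attr->atttypid;
      Oid basetype = getBaseType(typid);   /* returns typid itself for non-domains */

      if (isGreenplumDbHashable(basetype))
      {
          /* merge leaf stats by hashing values of the underlying base type */
      }
      else
      {
          /* fall back to computing scalar stats for the root partition */
      }
      ```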
  17. 17 July 2020 (2 commits)
  18. 15 July 2020 (2 commits)
    • Fix subselect_gp_optimizer.out · cc2f61e4
      Authored by Richard Guo
      cc2f61e4
    • Fix pulling up EXPR sublinks · a6ee98bf
      Authored by Richard Guo
      Currently GPDB tries to pull up EXPR sublinks to inner joins. For query
      
      select * from foo where foo.a >
          (select avg(bar.a) from bar where foo.b = bar.b);
      
      GPDB would transform it to:
      
      select * from foo inner join
          (select bar.b, avg(bar.a) as avg from bar group by bar.b) sub
      on foo.b = sub.b and foo.a > sub.avg;
      
      To do that, GPDB needs to recurse through the quals in sub-select and
      extract quals of form 'outervar = innervar' and then build new
      SortGroupClause items and TargetEntry items based on these quals for
      sub-select.
      
      But for quals of the form 'function(outervar, innervar1) = innervar2',
      GPDB handled them incorrectly, causing wrong-results issues as described
      in issue #9615.
      
      This patch fixes the issue by treating these kinds of quals as not
      compatible for the correlated pull-up, so the sub-select is not converted
      to a join.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Asim R P <apraveen@pivotal.io>
      (cherry picked from commit dcdc6c0b)
      a6ee98bf
  19. 13 July 2020 (1 commit)
    • Fix the assert failure on pullup flow in within group · 7246f370
      Authored by Jinbao Chen
      The Flow in the AggNode had the wrong TargetList. An AggNode has a
      different TargetList from its child nodes, so copying the flow directly
      from the child node to the AggNode is completely wrong. We need to use
      pullupflow to generate this TargetList when creating the within-group
      plan with a single QE.
      7246f370