- 05 Aug, 2020 1 commit
-
-
Submitted by Lisa Owen
* docs - note how oss users get gpbackup * small edit
-
- 04 Aug, 2020 2 commits
-
-
Submitted by Ning Yu
Fixed a bug where the SIGHUP handler was installed for SIGINT by mistake, so the ic-proxy bgworkers would die on SIGHUP. With the signal name corrected, the ic-proxy bgworkers now reload postgresql.conf when "gpstop -u" is executed. Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
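The bug class is easy to sketch outside of C. A minimal Python illustration (the handler and names are mine, not the ic-proxy code), assuming a Unix platform:

```python
import signal

# Track which signals the reload handler has seen.
seen = []

def reload_config(signum, frame):
    # Stand-in for "reload postgresql.conf".
    seen.append(signum)

# The buggy code effectively did:
#   signal.signal(signal.SIGINT, reload_config)   # wrong signal constant
# so a SIGHUP (sent on "gpstop -u") hit the default disposition and killed
# the process. The fix installs the handler for the intended signal:
signal.signal(signal.SIGHUP, reload_config)

# Deliver SIGHUP to the current process; the handler runs instead of dying.
signal.raise_signal(signal.SIGHUP)
```

After this, `seen` holds the one delivered `SIGHUP`.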
-
Submitted by Adam Lee
Note that since PyGreSQL 5.0 this method returns the values of array-type columns as Python lists. Ref: https://pygresql.org/contents/pg/query.html
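A hedged sketch of what that behavior change means for callers (the helper below is my own illustration, not PyGreSQL API): code written against the pre-5.0 string form of array columns can normalize both representations.

```python
def as_list(value):
    """Normalize an array-column value across PyGreSQL versions."""
    if isinstance(value, list):
        return value  # PyGreSQL >= 5.0 already returns a Python list
    # Older versions returned a PostgreSQL array literal such as '{1,2,3}';
    # parse the simple integer case here, for illustration only.
    return [int(v) for v in value.strip("{}").split(",") if v]
```

Both `as_list([1, 2, 3])` and `as_list("{1,2,3}")` then yield the same Python list.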
-
- 03 Aug, 2020 7 commits
-
-
Submitted by Hubert Zhang
The return value of strcmp is not checked in some branches.
-
Submitted by Hubert Zhang
conn = dbconn.connect() should be aligned with the if statement; otherwise it will never be executed.
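The shape of the bug, sketched in Python with made-up names (the real code lives in a gpdb management utility):

```python
def get_conn_buggy(condition, connect):
    conn = None
    if condition:
        # ... branch-specific setup ...
        conn = connect()  # BUG: over-indented, runs only inside the branch
    return conn

def get_conn_fixed(condition, connect):
    if condition:
        pass  # ... branch-specific setup ...
    conn = connect()  # aligned with the if: always executed
    return conn
```

When the condition is false, the buggy version silently returns no connection at all.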
-
Submitted by Ning Yu
In a query that contains multiple init/sub plans, the packets of the second subplan might be received while the first is still being processed in ic-proxy mode; this is because ic-proxy uses a local host handshake instead of the global one. To distinguish the packets of different subplans, especially early-arriving ones, we must stop handling on the BYE immediately and pass any unhandled early packets to the successor or the placeholder. This fixes the random hanging during the ICW parallel group of qp_functions_in_from. No new test is added. Co-authored-by: Hubert Zhang <hzhang@pivotal.io> Co-authored-by: Ning Yu <nyu@pivotal.io>
-
Submitted by Hubert Zhang
sizeof(HeapTuple *) should be sizeof(HeapTuple)
-
Submitted by Hubert Zhang
Remove dead code: insertDesc is always NULL in ao_vacuum_rel_compact().
-
Submitted by Hubert Zhang
Remove unused variable. For tuplesort.h, we don't support mksort-based CLUSTER, so we should just set is_mk_tuplesortstate to false.
-
Submitted by Hubert Zhang
Clean up identical code in heap.c and analyze.c.
-
- 01 Aug, 2020 5 commits
-
-
Submitted by Heikki Linnakangas
-
Submitted by Heikki Linnakangas
Also, the entry for ExtprotocolRelationid was in the wrong place in object_classes[]. It's a bit surprising that this didn't cause any ill effects, but let's fix it in any case.
-
Submitted by Heikki Linnakangas
-
Submitted by Heikki Linnakangas
It was added in commit 0138eed4, but was never used for anything.
-
Submitted by Heikki Linnakangas
-
- 31 Jul, 2020 7 commits
-
-
Submitted by Tyler Ramer
The command execution framework shipped with a fault injection in delivered code. See https://github.com/greenplum-db/gpdb/issues/10546 for execution details and implications. The fault injection framework appears to have been added in 2009 and used sparingly; it should be removed until it can be safely replaced. Additionally, the "gppylib/test/regress" folder used the fault injector, but the "check-regress" target seems not to have been called: obviously so, because pygresql regression checks are present, yet pygresql has been out of master for some time without these tests ever failing. Authored-by: Tyler Ramer <tramer@vmware.com>
-
Submitted by Hubert Zhang
Check that the len parameter of memcpy is not negative in gp_hyperloglog.c. Use errno instead of seekResult in cdbappendonlystorageread.c.
-
Submitted by Adam Lee
Unlogged tables do not propagate to replica servers; skip them and their initialization forks.
-
Submitted by Hubert Zhang
Fix some resource leaks.
-
Submitted by Ashwin Agrawal
Add pg_stat_clear_snapshot() in functions looping over gp_stat_replication / pg_stat_replication to refresh the result every time the query is run as part of the same transaction. Without pg_stat_clear_snapshot(), the query result is not refreshed for pg_stat_activity nor for the xx_stat_replication functions on multiple invocations inside a transaction, so in its absence the tests become flaky. Also, the tests commit_blocking_on_standby and dtx_recovery_wait_lsn were initially committed with wrong expectations and hence missed testing the intended behavior. Now they reflect the correct expectation.
-
Submitted by Ashwin Agrawal
This was missed in commit 96b332c0.
-
Submitted by Chris Hajas
Orca's DP algorithms currently generate logical alternatives based only on cardinality; they do not take into account motions/partition selectors, as these are physical properties handled later in the optimization process. Since DPv2 doesn't generate all possible alternatives for the optimization stage, we end up generating alternatives that do not support partition selection or can only place poor partition selectors. This PR introduces partition knowledge into the DPv2 algorithm. If there is a possible partition selector, it will generate an alternative that considers it, in addition to the previous alternatives. We introduce a new property, m_contain_PS, to indicate whether an SExpressionInfo contains a PS for a particular expression. We consider an expression to have a possible partition selector if the join expression columns and the partitioned table's partition key overlap. If they do, we mark this expression as containing a PS for that particular PT. We consider a good PS to be one that is selective. E.g.:
```
- DTS
- PS
-TS
- Pred
```
would be selective. However, if there is no selective predicate, we do not consider this a promising PS. For now, we add just a single alternative that satisfies this property and only consider linear trees.
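The overlap test described above can be sketched in a few lines (the function and column names are assumptions for illustration, not ORCA's actual API):

```python
def has_possible_ps(join_columns, partition_key):
    # An expression is marked as containing a possible partition selector
    # (PS) when the join expression columns intersect the partitioned
    # table's partition key.
    return bool(set(join_columns) & set(partition_key))
```

So a join on a partition-key column qualifies, while a join only on non-key columns does not.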
-
- 30 Jul, 2020 3 commits
-
-
Submitted by (Jerome)Junfeng Yang
-
Submitted by Ashuka Xue
This commit only affects cardinality estimation in ORCA when the user sets `optimizer_damping_factor_join = 0`. It improves the square root algorithm first introduced by commit ce453cf2. In the original square root algorithm, we assumed that distribution column predicates would have some correlation with other predicates in the join and would therefore be damped accordingly when calculating join cardinality. However, distribution columns are ideally unique in order to gain the best performance from GPDB. Under this assumption, distribution columns should not be correlated and thus need to be treated as independent when calculating join cardinality. This is a best guess, since we do not have a way to support correlated columns at this time. Co-authored-by: Ashuka Xue <axue@vmware.com> Co-authored-by: Chris Hajas <chajas@vmware.com>
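To illustrate the difference between damped and independent predicate combination, here is a simplified model of the idea, not ORCA's actual formula: with square-root damping, the i-th most selective predicate contributes selectivity^(1/2^i), while under independence the selectivities simply multiply.

```python
def combined_selectivity(sels, damp=True):
    """Combine predicate selectivities for a join cardinality estimate."""
    sels = sorted(sels)  # give full weight to the most selective predicate
    result = 1.0
    for i, s in enumerate(sels):
        # Damped: s, sqrt(s), sqrt(sqrt(s)), ...; independent: plain product.
        result *= s ** (1.0 / (2 ** i)) if damp else s
    return result
```

Damping assumes correlation, so it yields a larger (less selective) combined estimate; treating a distribution-column predicate as independent lowers the join cardinality estimate accordingly.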
-
- 29 Jul, 2020 4 commits
-
-
Submitted by Ning Yu
We used to store them under /tmp/; we include the postmaster port number in the file name in the hope that two clusters will not conflict with each other on this file. However, the conflict still happens in the test src/bin/pg_basebackup, and it can also happen if a second cluster is misconfigured by accident. To make things safe we also include the postmaster pid in the domain socket path; there is no chance for two postmasters to share the same pid. Reviewed-by: Paul Guo <pguo@pivotal.io>
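The naming scheme can be sketched as follows (the path layout and helper name are illustrative assumptions, not the exact gpdb code):

```python
import os

def proxy_socket_path(port, pid, tmpdir="/tmp"):
    # Include both port and pid: ports can collide across misconfigured
    # clusters, but two live postmasters can never share a pid.
    return os.path.join(tmpdir, ".s.ICPROXY.{}.{}".format(port, pid))
```

Two clusters accidentally configured with the same port still get distinct socket paths because their postmaster pids differ.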
-
Submitted by David Yozie
-
Submitted by David Kimura
This commit allows Orca to select plans that leverage the IndexOnlyScan node. A new GUC, 'optimizer_enable_indexonlyscan', is used to enable or disable this feature. Index only scan is disabled by default until the following issues are addressed: 1) Implement a cost comparison model for index only scans; currently, cost is hard coded for testing purposes. 2) Support index only scans using GiST and SP-GiST where allowed; currently, the code only supports index only scans on b-tree indexes. Co-authored-by: Chris Hajas <chajas@vmware.com>
-
Submitted by Mel Kiyama
Add information about the compression parameters used when a column is added to an AOCO table.
-
- 28 Jul, 2020 5 commits
-
-
Submitted by Tyler Ramer
The libuv source is in the build container, rather than in a GCP bucket. This commit partially reverts fetching the source from a GCP bucket and instead uses the libuv available in the build container image. Co-authored-by: Tyler Ramer <tramer@vmware.com> Co-authored-by: Kris Macoskey <kmacoskey@vmware.com> Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
-
Submitted by Paul Guo
Here is the diff output of the test result:

 drop database some_database_without_tablespace;
-DROP
+ERROR:  database "some_database_without_tablespace" is being accessed by other users
+DETAIL:  There is 1 other session using the database.
 drop tablespace some_basebackup_tablespace;
-DROP
+ERROR:  tablespace "some_basebackup_tablespace" is not empty

The reason is that after the client connection to the database exits, the server needs some time (the process might be scheduled out soon, and the operation needs to contend for the ProcArrayLock lock) to release the PGPROC in proc_exit()->ProcArrayRemove(). During dropdb() (for database drop), postgres calls CountOtherDBBackends() to see if there are still sessions using the database by checking proc->databaseId, and it retries for at most 5 sec. This test quits the db connection of some_database_without_tablespace and then drops the database immediately. This should mostly be fine, but if the system is slow or under heavy load, it could still lead to test flakiness. The issue can be simulated using gdb. Let's poll until the database drop command succeeds for the affected database. Since the DROP DATABASE sql command cannot run in a transaction block, plpgsql could not be used to implement this; instead the dropdb utility and a bash loop are used. Reviewed-by: Asim R P <pasim@vmware.com>
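The polling fix boils down to a retry-until-deadline loop. A sketch in Python (the actual test uses the dropdb utility in a bash loop; the names and timings here are illustrative):

```python
import time

def retry_until(op, timeout=5.0, interval=0.01):
    """Run op() until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        if op():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in for `dropdb`: fails twice while the old backend's PGPROC entry
# is still visible, then succeeds once the server has cleaned it up.
attempts = {"n": 0}

def flaky_drop():
    attempts["n"] += 1
    return attempts["n"] >= 3
```

`retry_until(flaky_drop)` returns True after the third attempt, which is exactly the behavior the flaky test needed.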
-
Submitted by Heikki Linnakangas
None of the mock tests in the backend call into GPORCA functions, so there is no need to generate mock objects for them. This saves some time and space when running mock tests. Reviewed-by: Ashuka Xue <axue@pivotal.io>
-
Submitted by Tyler Ramer
Co-authored-by: Tyler Ramer <tramer@vmware.com> Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
-
Submitted by Tyler Ramer
Gphostcache has numerous issues and has been a pain point for some time. For this reason, we are removing it. This commit moves the useful function of gphostcache, the hostname deduping, to the gparray class, where a list of deduped hostnames is returned from gpArray.get_hostlist(). There is a FIXME about correctly adding the hostname to a newly added or recovered mirror. The hostname resolution from address was incorrect and faulty in its logic: an IP address never requires a hostname associated with it. However, the hostname field in gp_segment_configuration should be populated somehow; we recommend a "hostname" field addition to any configuration files that require it. For now, we simply set the "hostname" to "address", which ultimately delivers the same functionality as the gphostcache implementation. Co-authored-by: Tyler Ramer <tramer@vmware.com> Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
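The deduping that moved into gparray amounts to an order-preserving unique pass over segment hostnames (the segment representation and function name here are simplified assumptions, not the gpArray API):

```python
def dedupe_hosts(segments):
    """Return the unique hostnames of a segment list, preserving order."""
    seen = set()
    hosts = []
    for seg in segments:
        if seg["hostname"] not in seen:
            seen.add(seg["hostname"])
            hosts.append(seg["hostname"])
    return hosts
```

For a cluster where several segments live on the same host, the returned list contains each host exactly once.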
-
- 25 Jul, 2020 1 commit
-
-
Submitted by mkiyama
-
- 24 Jul, 2020 3 commits
-
-
Submitted by Heikki Linnakangas
We skipped these during the PostgreSQL 9.6 merge. Time to pay down this little technical debt. We had copied the latest version of the t/002_pg_dump.pl test file from the upstream REL9_6_STABLE branch. That included the test change from upstream commit e961341cc, but we didn't have the corresponding code changes yet. Revert the test change so that it passes. We'll get the bug fix, along with the test change again, when we continue with the upstream merge, but until then let's just keep the test in sync with the code. Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
-
Submitted by (Jerome)Junfeng Yang
In some cases, the merge stats logic for a root partition table may consume very high memory in CacheMemoryContext. This can lead to `Canceling query because of high VMEM usage` when concurrently ANALYZEing partition tables. For example, suppose there are several root partition tables, each with thousands of leaf tables, and these are all wide tables with hundreds of columns. When analyze()/auto_stats() runs on leaf tables concurrently, `leaf_parts_analyzed` consumes lots of memory (catalog cache entries for pg_statistic and pg_attribute) under CacheMemoryContext in each backend, which may hit the VMEM protect limit. In `leaf_parts_analyzed`, a single backend's leaf table analysis for a root partition table may add cache entries for up to number_of_leaf_tables * number_of_columns tuples from pg_statistic and number_of_leaf_tables * number_of_columns tuples from pg_attribute. Setting the GUC `optimizer_analyze_root_partition` or `optimizer_analyze_enable_merge_of_leaf_stats` to false skips the merge of stats for the root table, so `leaf_parts_analyzed` does not execute. To resolve this issue: 1. When checking in `leaf_parts_analyzed` whether merged stats are available for a root table, first check whether all leaf tables are ANALYZEd; if un-ANALYZEd leaf tables still exist, return quickly to avoid touching each leaf table's pg_attribute and pg_statistic columns (this saves lots of time). Also, don't rely on the system catalog cache; use the index to fetch the stats tuple, avoiding one-time cache usage (in common cases). 2. When merging stats in `merge_leaf_stats`, don't rely on the system catalog cache; use the index to fetch the stats tuple. There are side effects of not relying on the system catalog cache (all of which are **rare** situations): 1. Inserting/updating/copying into several leaf tables under the **same root partition** table in the **same session**, when all leaf tables are **analyzed**, will be much slower, since auto_stats calls `leaf_parts_analyzed` whenever a leaf table gets updated and we no longer rely on the system catalog cache. (`set optimizer_analyze_enable_merge_of_leaf_stats=false` avoids this.) 2. ANALYZEing the same root table several times in the same session is much slower than before, since we don't rely on the system catalog cache. This solution improves the performance of ANALYZE, and it also keeps ANALYZE from hitting the memory issue.
-
Submitted by Ning Yu
We must install libuv in the PR pipeline to compile with ic-proxy enabled. ICW tests are still run in ic-udpifc mode.
-
- 23 Jul, 2020 2 commits
-
-
Submitted by Denis Smirnov
Fix explain analyze error handling
-
Submitted by Ning Yu
We used to mark the GUC gp_interconnect_proxy_addresses as PGC_POSTMASTER, so the cluster had to be restarted to reload this setting. This can be a problem during gpexpand: the cluster expansion itself is online, but a restart was needed to configure the proxy addresses for the new segments. Now we change it to PGC_SIGHUP, so the setting can be reloaded on SIGHUP. Also changed the setting from a developer option to a normal one.
-