Commits · 9eb9c2acb27b2ce85ebc27ccd79579623cb16ef0 · Greenplum / Gpdb

13 Jul, 2020 4 commits

Docs - remove HCI warning · 9eb9c2ac
Authored by David Yozie on Jul 13, 2020

9eb9c2ac

Update linux installation guide · ba5792fa

Authored by Tyler Ramer on Jul 10, 2020

Issue #10069 noted some problems with the linux documentation.

Update this documentation to be more accurate and to direct configuration
steps to the appropriate documentation.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

ba5792fa

Remove unused function pathnode_walk_node. · 7339a178

Authored by Zhenghua Lyu on Jul 13, 2020

Previously, `cdbpath_dedup_fixup` was the only function that
invoked `pathnode_walk_node`, and it was removed by
commit 9628a332.

So in this commit we remove these unused functions.

7339a178

Fix flaky test for replication_keeps_crash. (#10423) · db60b003

Authored by (Jerome)Junfeng Yang on Jul 13, 2020

Remove the setting of `gp_fts_probe_retries` to 1, which may cause the FTS probe to fail.
It was first added to reduce the test time, but a lower retry
value may cause the test to fail to probe FTS and update the segment
configuration. Since reducing `gp_fts_replication_attempt_count` also
saves test time, skip altering `gp_fts_probe_retries`.

Also found that an assertion may not hold when marking the mirror down happens before
the walsender exits: the replication status is freed before the walsender
exits and tries to record the disconnect info, which leads the segment to crash
and start recovery.

db60b003

10 Jul, 2020 11 commits

ic-proxy: enable ic-proxy with --enable-ic-proxy · 81810a20

Authored by Ning Yu on Jun 15, 2020

We used to use the option --with-libuv to enable ic-proxy, but the purpose
of that option is not straightforward to understand. So we
renamed it to --enable-ic-proxy, and the default setting is changed to
"disable".

Suggested by Kris Macoskey <kmacoskey@pivotal.io>

81810a20

ic-proxy: let backends connect to the proxy bgworker · 94c9d996

Authored by Ning Yu on May 18, 2020

Only in proxy mode, of course. Currently the ic-proxy mode shares most
of the backend logic with ic-tcp mode, so instead of copying the code we
actually embed the ic-proxy specific logic in ic_tcp.c.

94c9d996

ic-proxy: launch as a bgworker · 5b60069c
Authored by Ning Yu on May 18, 2020

5b60069c

ic-proxy: new value "proxy" in GUC gp_interconnect_type · 245ca266

Authored by Ning Yu on May 18, 2020

It is for the ic-proxy mode.

245ca266

ic-proxy: make gp_interconnect_proxy_addresses a GUC · 3140a44f
Authored by Ning Yu on May 18, 2020

3140a44f

ic-proxy: implement the core logic · 6188fb1f

Authored by Ning Yu on May 18, 2020

The interconnect proxy mode, a.k.a. ic-proxy, is a new interconnect
mode in which all the backends communicate via a proxy bgworker. All the
backends on the same segment share the same proxy bgworker, so every two
segments need only one network connection between them, which reduces
the number of network flows as well as ports.

To enable the proxy mode we first need to configure the GUC
gp_interconnect_proxy_addresses, for example:

    gpconfig \
      -c gp_interconnect_proxy_addresses \
      -v "'1:-1:10.0.0.1:2000,2:0:10.0.0.2:2001,3:1:10.0.0.3:2002'" \
      --skipvalidation

Then restart to take effect.
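
To make the shape of that value easier to read, here is a minimal Python sketch that splits it into entries. It is an illustration only, not code from the commit. It assumes each comma-separated entry has the form `dbid:content:address:port`, which the example suggests but the message does not spell out, and it handles only IPv4 addresses or hostnames. The names `ProxyAddress` and `parse_proxy_addresses` are hypothetical.

```python
from typing import List, NamedTuple


class ProxyAddress(NamedTuple):
    """One entry of the GUC value (field meanings are an assumption)."""
    dbid: int
    content: int
    host: str
    port: int


def parse_proxy_addresses(value: str) -> List[ProxyAddress]:
    """Split a comma-separated list of 'dbid:content:host:port' entries.

    Hypothetical helper: the real parser lives in the server and may
    accept more forms (for example IPv6), which this sketch does not.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        dbid, content, host, port = item.split(":")
        entries.append(ProxyAddress(int(dbid), int(content), host, int(port)))
    return entries


if __name__ == "__main__":
    example = "1:-1:10.0.0.1:2000,2:0:10.0.0.2:2001,3:1:10.0.0.3:2002"
    for entry in parse_proxy_addresses(example):
        print(entry)
```

Under that assumption, the example above configures one proxy address per segment instance, with content -1 being the master.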

6188fb1f

Store dbid in CdbProcess · 8804bf39

Authored by Ning Yu on May 18, 2020

This is preparation for the ic-proxy mode; we need this information to
distinguish a primary segment from its mirror.

8804bf39

Fix pyyaml windows build (#10451) · 3daafd2f

Authored by Peifeng Qiu on Jul 10, 2020

Local fork at gpMgmt/bin/ext/yaml was removed by 8d6c3059. Unpack
it from gpMgmt/bin/pythonSrc/ext just like pygresql.

3daafd2f

[Refactor] Pull out KHeap into CKHeap.h · 9e8f261d
Authored by Ashuka Xue on Jun 22, 2020

Pull out the implementation for the binary heap into its own templated header
file.

9e8f261d

Make histograms commutative when merging · 9b427611

Authored by Ashuka Xue on Jun 22, 2020

Prior to this commit, merging two histograms was not commutative,
meaning histogram1->Union(histogram2) could result in a row estimate of
1500 rows, but histogram2->Union(histogram1) could result in a row
estimate of 600 rows.

Now, MakeBucketMerged has been renamed to SplitAndMergeBuckets. This
function, which calculates the statistics for the merged bucket, now
consistently returns the same histogram buckets regardless of the order
of input. This, in turn, makes MakeUnionHistogramNormalize and
MakeUnionAllHistogramNormalize commutative.

Once we have successfully split the buckets and merged them as
necessary, we may have generated up to 3X the number of buckets that
were originally present. Thus we cap the number of buckets to be either
the max size of the two incoming buckets, or, 100 buckets.

CombineBuckets will then reduce the size of the histogram by combining
consecutive buckets that have similar information. It does this by using
a combination of two ratios: freq/ndv and freq/bucket_width. These two
ratios were decided based off the following examples:

Assuming that we calculate row counts for selections like the following:
- For a predicate col = const: rows * freq / NDVs
- For a predicate col < const: rows * (sum of full or fractional frequencies)

Example 1 (rows = 100), freq/width, ndvs/width and ndvs/freq are all the same:
```
Bucket 1: [0, 4)   freq .2  NDVs 2  width 4  freq/width = .05 ndv/width = .5 freq/ndv = .1
Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
Combined: [0, 12)  freq .6  NDVs 6  width 12
```

This should give the same estimates for various predicates, with separate or combined buckets:
```
pred          separate buckets         combined bucket   result
-------       ---------------------    ---------------   -----------
col = 3  ==>  100 * .2 / 2           = 100 * .6 / 6    = 10 rows
col = 5  ==>  100 * .4 / 4           = 100 * .6 / 6    = 10 rows
col < 6  ==>  100 * (.2 + .25 * .4)  = 100 * .5 * .6   = 30 rows
```

Example 2 (rows = 100), freq and ndvs are the same, but width is different:
```
Bucket 1: [0, 4)   freq .4  NDVs 4  width 4  freq/width = .1 ndv/width = 1 freq/ndv = .1
Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
Combined: [0, 12)  freq .8  NDVs 8  width 12
```

This will give different estimates with the combined bucket, but only for non-equal preds:
```
pred          separate buckets         combined bucket   results
-------       ---------------------    ---------------   --------------
col = 3  ==>  100 * .4 / 4           = 100 * .8 / 8    = 10 rows
col = 5  ==>  100 * .4 / 4           = 100 * .8 / 8    = 10 rows
col < 6  ==>  100 * (.4 + .25 * .4) != 100 * .5 * .8     50 vs. 40 rows
```

Example 3 (rows = 100), now NDVs / freq is different:
```
Bucket 1: [0, 4)   freq .2  NDVs 4  width 4  freq/width = .05 ndv/width = 1 freq/ndv = .05
Bucket 2: [4, 12)  freq .4  NDVs 4  width 8  freq/width = .05 ndv/width = .5 freq/ndv = .1
Combined: [0, 12)  freq .6  NDVs 8  width 12
```

This will give different estimates with the combined bucket, but only for equal preds:
```
pred          separate buckets         combined bucket   results
-------       ---------------------    ---------------   ---------------
col = 3  ==>  100 * .2 / 4          != 100 * .6 / 8      5 vs. 7.5 rows
col = 5  ==>  100 * .4 / 4          != 100 * .6 / 8      10 vs. 7.5 rows
col < 6  ==>  100 * (.2 + .25 * .4)  = 100 * .5 * .6   = 30 rows
```
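
The arithmetic in the three examples can be reproduced with a short Python sketch. It is not the Orca implementation. It only applies the two estimation formulas stated above (`rows * freq / NDVs` for equality and `rows * sum of full or fractional frequencies` for less-than), and it assumes values are spread uniformly within a bucket. The names `Bucket`, `rows_eq`, `rows_lt` and `combine` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    lo: float    # inclusive lower bound
    hi: float    # exclusive upper bound
    freq: float  # fraction of all rows falling in this bucket
    ndv: float   # number of distinct values in this bucket


def rows_eq(rows: float, buckets: List[Bucket], const: float) -> float:
    """Estimate rows for `col = const`: rows * freq / NDVs of the bucket hit."""
    for b in buckets:
        if b.lo <= const < b.hi:
            return rows * b.freq / b.ndv
    return 0.0


def rows_lt(rows: float, buckets: List[Bucket], const: float) -> float:
    """Estimate rows for `col < const`: full buckets plus a linear fraction."""
    total = 0.0
    for b in buckets:
        if const >= b.hi:
            total += b.freq
        elif const > b.lo:
            total += b.freq * (const - b.lo) / (b.hi - b.lo)
    return rows * total


def combine(b1: Bucket, b2: Bucket) -> Bucket:
    """Merge two consecutive buckets by summing frequencies and NDVs."""
    return Bucket(b1.lo, b2.hi, b1.freq + b2.freq, b1.ndv + b2.ndv)


if __name__ == "__main__":
    # Example 3 from the text: freq/ndv differs between the two buckets.
    separate = [Bucket(0, 4, 0.2, 4), Bucket(4, 12, 0.4, 4)]
    combined = [combine(*separate)]
    for label, fn, const in (("col = 3", rows_eq, 3),
                             ("col = 5", rows_eq, 5),
                             ("col < 6", rows_lt, 6)):
        print(label, fn(100, separate, const), fn(100, combined, const))
```

Running the same comparison for Examples 1 and 2 shows why both ratios are checked: a freq/bucket_width mismatch changes only the range estimates, while a freq/ndv mismatch changes only the equality estimates.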

This commit also adds an attribute to the statsconfig for MaxStatsBuckets
and changes the scaling method when creating singleton buckets.

9b427611

[Refactor] Update MakeStatsFilter, Rename CreateHistMashMapAfterMergingDisjPreds -> · c14fbb92

Authored by Ashuka Xue on Apr 16, 2020

MergeHistogramMapsforDisjPreds

This commit refactors MakeStatsFilter to use
MakeHistHashMapConjOrDisjFilter instead of individually calling
MakeHistHashMapConj and MakeHistHashMapDisj.

This commit also modifies MergeHistogramMapsForDisjPreds to avoid copying
and creating unnecessary histogram buckets.

c14fbb92

09 Jul, 2020 4 commits

Use yaml safe_load in gppkg · 4aa3b2a3

Authored by Tyler Ramer on Jul 08, 2020

Commit 21a2cb27b38117cce90c4ff06d8d447842c5acf1, added in PR #10361,
updated yaml and changed yaml.load to yaml.safe_load in gpload.

gppkg uses yaml as well, but references were not updated - this commit
resolves that discrepancy.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

4aa3b2a3

Alphabetically arrange the (un)sync_guc_name.h files · 93d8881e
Authored by Ashwin Agrawal on Jul 08, 2020

93d8881e

Remove duplicated GUC names in unsync_guc_name.h · 91e4e3f7

Authored by Fang Zheng on Jul 07, 2020

Fixes https://github.com/greenplum-db/gpdb/issues/10437

91e4e3f7

Fix colrefs getting mangled when merging equivalent classes in Orca · 3062b121

Authored by Chris Hajas on Jul 07, 2020

Previously, the PdrgpcrsAddEquivClass function would modify the input
colref set. This does not appear intentional, as this same reference may
be accessed in other places. This caused Orca to fall back to planner in
some cases during translation with "Attribute number 0 not found in
project list".
Co-authored-by: mubo.fy <mubo.fy@alibaba-inc.com>
Co-authored-by: Chris Hajas <chajas@pivotal.io>
Co-authored-by: Hans Zeller <hzeller@vmware.com>

3062b121

08 Jul, 2020 8 commits

gpcheckcat: fix gpcheckcat vpinfo issue · 988d7c03

Authored by xiong-gang on Jul 08, 2020

The entry in aocsseg table might be compacted and waiting for drop, so we
should use 'state' to filter the unused entry.

988d7c03

Fix pygresql windows build (#10420) · 49765579

Authored by Peifeng Qiu on Jul 08, 2020

- CMakeLists.txt moved to gpMgmt/bin/pythonSrc/PyGreSQL
- Unpack source code from gpMgmt/bin/pythonSrc/ext/PyGreSQL-*.tar.gz
- Add declaration to force dllexport on init_pg
- Remove the pygresql level folder. All files are moved up.

49765579

gpcheckcat: add the check of vpinfo consistency · f2efbda3

Authored by xiong-gang on Jul 08, 2020

The column 'vpinfo' in pg_aoseg.pg_aocsseg_xxx records the 'eof' of each attribute
in the AOCS table. Add a new check 'aoseg_table' in gpcheckcat that verifies the
number of attributes in 'vpinfo' is the same as the number of attributes in
'pg_attribute'. This check is performed in parallel and independently on each
segment, and it checks the aoseg table and pg_attribute in different transactions,
so it should be run 'offline' to avoid false alarms.

f2efbda3

Use separate make and make install in travis · 0f02a355

Authored by Tyler Ramer on Jun 24, 2020

Travis will consume some of the output if make -s install is used
instead of separate make and make install steps.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

0f02a355

Remove unused imports · 931b5077

Authored by Tyler Ramer on Jun 23, 2020

Yaml was imported but unused in several locations.
gpMgmt/test/behave/mgmt_utils/steps/mgmt_utils.py had numerous unused
or duplicated imports.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

931b5077

Remove unused yaml class in mainUtils · 949010f8

Authored by Tyler Ramer on Jun 23, 2020

It seems this yaml class is dead code. Removing it for this reason.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

949010f8

Update PyYAML to 5.3.1 · 8d6c3059

Authored by Tyler Ramer on Jun 23, 2020

The version of PyYAML vendored in gpMgmt/bin/ext is old, unmaintained,
and does not support python3. Actually, it does not even contain a
`__version__` attribute, so it is not possible to know the version.

We need to unvendor YAML and get to a library version that supports
python3 - for this reason, we are updating to the latest PyYAML
available.

Also update yaml.load to use yaml.safe_load instead.
Co-authored-by: Tyler Ramer <tramer@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

8d6c3059

docs - gphdfs2pxf migration pxf supports avro compression (#10415) · ab91fee5
Authored by Lisa Owen on Jul 07, 2020

* docs - gphdfs2pxf migration pxf supports avro compression

* missing plural

ab91fee5

07 Jul, 2020 5 commits

Alter table add column on AOCS table inherits the default storage settings · 9a574915

Authored by xiong-gang on Jul 07, 2020

When ALTER TABLE adds a column to an AOCS table, the storage settings (compresstype,
compresslevel and blocksize) of the new column can be specified in the ENCODING
clause; the column inherits the settings from the table if ENCODING is not specified; it
uses the value from the GUC 'gp_default_storage_options' when the table doesn't
have a compression configuration.

9a574915

Fix flaky test gp_replica_check · a1a0af55

Authored by xiong-gang on Jul 07, 2020

When there is a big lag between primary and mirror replay, gp_replica_check
will fail if the checkpoint is not replayed in about 60 seconds. Extend the
timeout to 600 seconds to reduce the chance of flakiness.

a1a0af55

Disallow the replicated table inherit or to be inherited (#10344) · dc4b839e

Authored by Hao Wu on Jul 07, 2020

Currently, replicated tables are not allowed to inherit a parent
table, but ALTER TABLE .. INHERIT can bypass that restriction.

On the other hand, a replicated table is allowed to be inherited
by a hash distributed table, which makes things much more complicated.
When the parent table is declared as a replicated table and is inherited by
a hash distributed table, the data on the parent is replicated
but the data on the child is hash distributed. When running
`select * from parent;`, the generated plan is:
```
gpadmin=# explain select * from parent;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..4.42 rows=14 width=6)
   ->  Append  (cost=0.00..4.14 rows=5 width=6)
         ->  Result  (cost=0.00..1.20 rows=4 width=7)
               One-Time Filter: (gp_execution_segment() = 1)
               ->  Seq Scan on parent  (cost=0.00..1.10 rows=4 width=7)
         ->  Seq Scan on child  (cost=0.00..3.04 rows=2 width=4)
 Optimizer: Postgres query optimizer
(7 rows)
```
It's not particularly useful for the parent table to be replicated.
So, we disallow the replicated table to be inherited.
Reported-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>

dc4b839e

Move slack command trigger repo · 8e6a46f7

Authored by Chris Hajas on Jul 02, 2020

We've moved the repo that holds trigger commits to a private repo since
there wasn't anything interesting there.

8e6a46f7

Fix vacuum on temporary AO table · 327abdb5

Authored by Ashwin Agrawal on Jul 06, 2020

The path constructed in OpenAOSegmentFile() didn't take into account
the "t_" semantic of the filename. Ideally, the correct filename is passed to
the function, so there is no need to construct it again.

It would be better to move MakeAOSegmentFileName() inside
OpenAOSegmentFile(), as all callers call it except
truncate_ao_perFile(), which doesn't fit that model.

327abdb5

06 Jul, 2020 1 commit

Fix bitmap scan crash issue for AO/AOCS table. (#10407) · cb5d18d1

Authored by (Jerome)Junfeng Yang on Jul 06, 2020

When ExecReScanBitmapHeapScan is executed, the bitmap state (tbmiterator
and tbmres) is freed in freeBitmapState. So tbmres is NULL, and we
need to reinitialize the bitmap state to start the scan from the beginning and reset the AO/AOCS
bitmap pages' flags (baos_gotpage, baos_lossy, baos_cindex and baos_ntuples).

This matters especially when ExecReScan happens on the bitmap append-only scan and
not all the matched tuples in the bitmap have been consumed, for example, with a Bitmap
Heap Scan as the inner plan of a Nest Loop Semi Join. If tbmres is not initialized
and not all tuples in the last bitmap were read, BitmapAppendOnlyNext will assume the
current bitmap page still has data to return, but the bitmap state has already been freed.

From the code, for a Nest Loop Semi Join, when a match is found, a new outer slot is
requested, and then `ExecReScanBitmapHeapScan` is called; `node->tbmres` and
`node->tbmiterator` are set to NULL, while `node->baos_gotpage` stays true.
When `BitmapAppendOnlyNext` executes, it skips creating a new `node->tbmres`
and jumps to accessing `tbmres->recheck`.
Reviewed-by: Jinbao Chen <jinchen@pivotal.io>
Reviewed-by: Asim R P <pasim@vmware.com>

cb5d18d1

03 Jul, 2020 4 commits

Do not remove PYLIB_SRC_EXT during make clean/distclean · ef39a644

Authored by Taylor Vesely on Jul 02, 2020

Commit 8190ed40 removed lockfile from
mainUtils, but did not remove a reference to its source directory in the
make clean/distclean target. As a result, because LOCKFILE_DIR is no
longer defined, the make clean/distclean target removes the
PYLIB_SRC_EXT directory.

ef39a644

Force to update checkpoint timeline id in control data after promotion and... · e072dff7

Authored by Paul Guo on Jul 03, 2020

Force to update checkpoint timeline id in control data after promotion and before pg_rewind. (#10402)

After promotion there is a checkpoint but not an immediate one, so the
checkpoint timeline id in pg_control data might not be updated in time when
running pg_rewind which depends on that to decide the necessity of incremental
recovery. Let's run checkpoint after promotion in tests so that later
pg_rewind will not bypass the incremental recovery.

This should be able to fix some long-existing annoying flaky pg_rewind tests.
Reviewed-by: (Jerome)Junfeng Yang <jeyang@pivotal.io>

e072dff7

docs - remove spaces in cross references. · 4090165b
Authored by mkiyama on Jul 02, 2020

4090165b

docs - add config file parameter fill_missing_fields (#10404) · a1c42dbb

Authored by Mel Kiyama on Jul 02, 2020

--Also, add some links to other topics.

a1c42dbb

02 Jul, 2020 2 commits

docs - add GUC -write_to_gpfdist_timeout (#10391) · 86a53828

Authored by Mel Kiyama on Jul 01, 2020

* docs - add GUC -write_to_gpfdist_timeout

Add GUC and add link to GUC in gpfdist reference

* docs - correct default value 600 --> 300. Fix xref.

86a53828

Fix Orca optimizer search stage couldn't measure elapsed time correctly · db25c3c8

Authored by Haisheng Yuan on Jun 28, 2020

Previously, CTimerUser didn't initialize timer, so the elapsed time provided by
Orca was not meaningful, sometimes confusing.

When traceflag T101012 is turned on, we can see the following trace message:

[OPT]: Memo (stage 0): [20 groups, 0 duplicate groups, 44 group expressions, 4 activated xforms]
[OPT]: stage 0 completed in 860087 msec,  plan with cost 1028.470667 was found
[OPT]: <Begin Xforms - stage 0>
......
[OPT]: <End Xforms - stage 0>
[OPT]: Search terminated at stage 1/1
[OPT]: Total Optimization Time: 67ms

As shown above, the stage 0 elapsed timer is much greater than the total
optimization time, which is obviously incorrect.

db25c3c8

01 Jul, 2020 1 commit

Let FTS mark mirror down if replication keeps crash. (#10327) · 252ba888

Authored by (Jerome)Junfeng Yang on Jul 01, 2020

For GPDB FTS, if the primary-mirror replication keeps crashing
continuously and attempts to create the replication connection too many times,
FTS should mark the mirror down. Otherwise, it may block other
processes.
If the WAL starts streaming, reset the attempt count to 0. This is because the blocked
transaction can only be released once the WAL is in streaming state.

The solution for this is:

1. Use `FTSReplicationStatus`, which lives in `gp_replication.c`, to track the current primary-mirror
replication status. This includes:
- A continuous failure counter. The counter is reset once the replication
starts streaming or the replication is restarted.
- A record of the last disconnect timestamp, which is moved out of the
`WalSnd` slot.
The reason for moving it: when the FTS probe happens, the `WalSnd`
slot may already have been freed, and the `WalSnd` slot is designed to be reusable.
It's hacky to read a value from a freed slot in shared memory.

2. When handling each probe query, `GetMirrorStatus` checks the current
mirror status and the failure count from the walsender's application `FTSReplicationStatus`.
If the count exceeds the limit, the retry test ignores the last replication
disconnect time, since it is refreshed whenever a new walsender starts. (In
the current case, the walsender keeps restarting.)

3. In the FTS bgworker, if the mirror is down and retry is set to false, mark the mirror
down.

A `gp_fts_replication_attempt_count` GUC is added. When the replication failure count
exceeds this GUC, the last replication disconnect time is ignored when checking for mirror
probe retry.

The life cycle of an `FTSReplicationStatus`:
1. It is created when replication is first enabled during the replication
start phase. Each replication's sender should have a unique
`application_name`, which is also used to specify the replication priority
in a multi-mirror environment. So `FTSReplicationStatus` uses the `application_name` to identify
itself.

2. The `FTSReplicationStatus` for a replication exists until FTS detects a
failure and stops the replication between primary and mirror. Then the
`FTSReplicationStatus` for that `application_name` is dropped.

Now the `FTSReplicationStatus` is used only for GPDB primary-mirror replication.
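
The retry decision described above can be modelled with a small Python sketch. It is a simplified illustration and not the server's C code. The class `ReplicationStatus`, the function `should_retry_probe`, and the `grace_seconds` parameter are all hypothetical; in particular, the assumption that a recent disconnect normally grants a retry within some grace window is mine, not stated in the message. `attempt_limit` stands in for `gp_fts_replication_attempt_count`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicationStatus:
    """Hypothetical stand-in for the per-application replication status."""
    application_name: str
    attempt_count: int = 0
    last_disconnect_time: Optional[float] = None

    def record_attempt(self) -> None:
        # A walsender tried to (re)establish the replication connection.
        self.attempt_count += 1

    def record_streaming(self) -> None:
        # WAL is streaming again, so the continuous failure counter resets.
        self.attempt_count = 0

    def record_disconnect(self, now: float) -> None:
        # Kept here rather than in a reusable walsender slot.
        self.last_disconnect_time = now


def should_retry_probe(status: ReplicationStatus, now: float,
                       attempt_limit: int, grace_seconds: float) -> bool:
    """Return True if the probe should retry instead of marking the mirror down."""
    if status.attempt_count > attempt_limit:
        # Too many continuous failures: ignore the last disconnect time,
        # because every restarted walsender refreshes it.
        return False
    if status.last_disconnect_time is None:
        return False
    # Assumed behaviour: a recent disconnect earns a retry within the window.
    return (now - status.last_disconnect_time) < grace_seconds


if __name__ == "__main__":
    status = ReplicationStatus("mirror_app")  # hypothetical application_name
    for t in (100.0, 101.0, 102.0):
        status.record_attempt()
        status.record_disconnect(t)
    print(should_retry_probe(status, 103.0, attempt_limit=2, grace_seconds=30.0))
```

The point of the counter is visible here: without it, a walsender that keeps restarting would keep refreshing the disconnect time and the probe would retry indefinitely.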

252ba888