1. 12 May 2020, 1 commit
    • Limit DPE stats to groups with unresolved partition selectors (#9988) · dddd8366
      Committed by Hans Zeller
      DPE stats are computed when we have a dynamic partition selector that's
      applied on another child of a join. The current code continues to use
      DPE stats even for the common ancestor join and nodes above it, but
      those nodes aren't affected by the partition selector.
      
      Regular Memo groups pick the best expression among several to compute
      stats, which makes row count estimates more reliable. We don't have
      that luxury with DPE stats, therefore they are often less reliable.
      
      By minimizing the places where we use DPE stats, we should overall get
      more reliable row count estimates with DPE stats enabled.
      
      The fix also ignores DPE stats with row counts greater than the group
      stats. Partition selectors eliminate certain partitions, therefore
      it is impossible for them to increase the row count.
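
      For context, a hedged sketch of the kind of query where ORCA places a dynamic
      partition selector on one child of a join (all table and column names are
      illustrative, not taken from the commit):

      ```sql
      CREATE TABLE dim (d_id int, d_date date) DISTRIBUTED BY (d_id);
      CREATE TABLE fact (f_id int, d_id int, amount numeric)
      DISTRIBUTED BY (f_id)
      PARTITION BY RANGE (d_id) (START (1) END (101) EVERY (10));

      -- The selective filter on dim lets ORCA apply a dynamic partition selector
      -- to the partitioned fact table. With this fix, DPE stats are no longer
      -- used for the common ancestor join or the nodes above it.
      SELECT sum(f.amount)
      FROM fact f JOIN dim d ON f.d_id = d.d_id
      WHERE d.d_date = '2020-05-12';
      ```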
  2. 09 May 2020, 11 commits
  3. 08 May 2020, 2 commits
    • Fix possible crash in COPY FROM on QEs · e9c00b80
      Committed by Zhenghua Lyu
      Target partitions need new ResultRelInfos, which override the previous
      estate->es_result_relation_info in NextCopyFromExecute(). The new
      ResultRelInfo may leave its resultSlot as NULL. If sreh is on, a parsing
      error is caught and COPY loops back to parse another row; however,
      estate->es_result_relation_info has already been changed. This can cause
      a crash.
      
      Reproduce:
      
      ```sql
      CREATE TABLE partdisttest(id INT, t TIMESTAMP, d VARCHAR(4))
      DISTRIBUTED BY (id)
      PARTITION BY RANGE (t)
      (
        PARTITION p2020 START ('2020-01-01'::TIMESTAMP) END ('2021-01-01'::TIMESTAMP),
        DEFAULT PARTITION extra
      );
      
      COPY partdisttest FROM STDIN LOG ERRORS SEGMENT REJECT LIMIT 2;
      1	'2020-04-15'	abcde
      1	'2020-04-15'	abc
      \.
      ```
      Authored-by: ggbq <taos.alias@outlook.com>
    • Fix a spinlock leak for fault injector · d1b10d64
      Committed by Pengzhou Tang
      This is a backport from master 2a7b2bf6
  4. 07 May 2020, 1 commit
  5. 06 May 2020, 1 commit
    • 6X Backport: Enable parallel writes for Foreign Data Wrappers · 86f6c666
      Committed by Francisco Guerrero
      This commit enables parallel writes for Foreign Data Wrappers. The feature
      has been missing from the FDW framework: parallel scans are supported, but
      parallel writes are not. FDW parallel writes are analogous to writing to
      writable external tables that run on all segments.
      
      One caveat is that in the external table framework, writable tables
      support a distribution policy:
      
          CREATE WRITABLE EXTERNAL TABLE foo (id int)
          LOCATION ('....')
          FORMAT 'CSV'
          DISTRIBUTED BY (id);
      
      In foreign tables, the distribution policy cannot be defined during the
      table definition, so we assume random distribution for all foreign
      tables.
      
      Parallel writes are enabled only when the foreign table's exec_location is
      set to FTEXECLOCATION_ALL_SEGMENTS. For foreign tables that run on the
      master or on any single segment, the current policy behavior remains.
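
      As a rough sketch (the wrapper, server, and table names below are
      placeholders, and we assume the wrapper in use honors Greenplum's
      mpp_execute option, which maps to exec_location):

      ```sql
      -- Sketch only: demo_fdw / demo_srv / foo_ft are placeholders.
      CREATE SERVER demo_srv FOREIGN DATA WRAPPER demo_fdw
          OPTIONS (mpp_execute 'all segments');  -- FTEXECLOCATION_ALL_SEGMENTS

      CREATE FOREIGN TABLE foo_ft (id int) SERVER demo_srv;

      -- With parallel writes enabled, each segment writes its share of the rows
      -- under the assumed random distribution policy for foreign tables.
      INSERT INTO foo_ft SELECT generate_series(1, 100);
      ```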
  6. 01 May 2020, 3 commits
  7. 29 April 2020, 3 commits
  8. 28 April 2020, 6 commits
    • Fix a bug that reader gang always fail due to missing writer gang. (#9828) · 7a665560
      Committed by Paul Guo
      The reason is that a newly created reader gang would fail on the QE due to a
      missing writer gang process in the locking code, and a retry would fail again
      for the same reason, since the cached writer gang is still used: the QD does
      not know or check the real libpq network status. See below for the repro case.
      
      Fix this by checking the error message and resetting all gangs when it is
      seen, similar to the logic that checks the startup/recovery message in the
      gang creation function. Other fixes are possible, e.g. checking the writer
      gang's network status, but those turned out to be ugly after trying them.
      
      create table t1(f1 int, f2 text);
      <kill -9 one idle QE>
      
      insert into t1 values(2),(1),(5);
      ERROR:  failed to acquire resources on one or more segments
      DETAIL:  FATAL:  reader could not find writer proc entry, lock [0,1260] AccessShareLock 0 (lock.c:874)
       (seg0 192.168.235.128:7002)
      
      insert into t1 values(2),(1),(5);
       ERROR:  failed to acquire resources on one or more segments
       DETAIL:  FATAL:  reader could not find writer proc entry, lock [0,1260] AccessShareLock 0 (lock.c:874)
        (seg0 192.168.235.128:7002)
      
      <-- Above query fails again.
      
      Cherry-picked from 24f16417 and a0a5b4d5
    • Let Fts tolerate the in-progress 'starting up' case on primary nodes. · 5222ad86
      Committed by Paul Guo
      Commit d453a4aa implemented that for the crash recovery case (not marking the
      node down and therefore not promoting the mirror). We should do the same for
      the usual "starting up" case (i.e. CAC_STARTUP), in addition to the existing
      "in recovery mode" case (i.e. CAC_RECOVERY).
      
      We've seen fts promote a "starting up" primary during isolation2 testing due
      to 'pg_ctl restart'. In this patch we check recovery progress for both
      CAC_STARTUP and CAC_RECOVERY during the fts probe and thus avoid this.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      
      cherry-picked from d71b3afd
      
      On master the commit message was eliminated by mistake. Added back on gpdb6.
    • Remove forceEos mechanism for TCP interconnect · 7cf0ac40
      Committed by Pengzhou Tang
      In the TCP interconnect, the sender used to force an EOS message to the
      receiver in two cases:
      1. cancelUnfinished is true in mppExecutorFinishup.
      2. an error occurs.
      
      For case 1, the comment says: to finish a cursor, the QD used to send
      a cancel to the QEs; the QEs then set the cancelUnfinished flag and did
      a normal executor finish-up. We now use the QueryFinishPending mechanism
      to stop a cursor, so the case 1 logic has been dead for a long time.
      
      For case 2, the purpose is: when an error occurs, we force an EOS to
      the receiver so that the receiver does not report an interconnect error;
      the QD then checks the dispatch results and reports the errors from the
      QEs. From the interconnect's point of view, it looks as if we have
      selected to the end of the query with no error in the interconnect.
      This logic has two problems:
      1. it doesn't work for initplans: an initplan does not check the dispatch
      results and throw the errors, so when an error occurs in the QEs for
      the initplan, the QD cannot notice it.
      2. it doesn't work for cursors, for example:
         DECLARE c1 cursor for select i from t1 where i / 0 = 1;
         FETCH all from c1;
         FETCH all from c1;
      None of the FETCH commands report an error, which is not expected.
      
      This commit removes the forceEos mechanism. For case 2, the receiver will
      now report an interconnect error without forceEos; this is ok because when
      multiple errors are reported from the QEs, the QD is inclined to report
      the non-interconnect error.
    • Remove redundant 'hasError' flag in TeardownTCPInterconnect · d093c024
      Committed by Pengzhou Tang
      This flag duplicates 'forceEOS', which can also tell whether an error
      occurred.
    • Fix a race condition in flushBuffer · 3e1cc863
      Committed by Pengzhou Tang
      flushBuffer() is used to send packets through the TCP interconnect. Before
      sending, it first checks whether the receiver has stopped or torn down the
      interconnect. However, there is a window between checking and sending: the
      receiver may tear down the interconnect and close the peer in that window,
      so send() will report an error. To resolve this, we recheck whether the
      receiver stopped or tore down the interconnect and don't error out in that
      case.
      Reviewed-by: Jinbao Chen <jinchen@pivotal.io>
      Reviewed-by: Hao Wu <hawu@pivotal.io>
    • Fix interconnect hung issue · 7c90c04f
      Committed by Pengzhou Tang
      We have hit the interconnect hang issue many times in many cases, all with
      the same pattern: the downstream interconnect motion senders keep sending
      tuples, blind to the fact that upstream nodes have finished and quit
      execution earlier; the QD then has enough tuples and waits for all QEs to
      quit, which causes a deadlock.
      
      Many nodes may quit execution early, e.g. LIMIT, Hash Join, Nest Loop. To
      resolve the hang, they need to stop the interconnect stream explicitly by
      calling ExecSquelchNode(); however, we cannot do that for rescan cases, in
      which data might be lost, e.g. commit 2c011ce4. For rescan cases, we tried
      using QueryFinishPending to stop the senders in commit 02213a73 and let the
      senders check this flag and quit, but that commit has its own problems:
      firstly, QueryFinishPending can only be set by the QD, so it doesn't work
      for INSERT or UPDATE cases; secondly, that commit only lets the senders
      detect the flag and quit the loop in a rude way (without sending an EOS to
      the receiver), so the receiver may still be stuck receiving tuples.
      
      This commit first reverts the QueryFinishPending method.
      
      To resolve the hang, we move TeardownInterconnect ahead of
      cdbdisp_checkDispatchResult so it is guaranteed to stop the interconnect
      stream before waiting for and checking the status of the QEs.
      
      For UDPIFC, TeardownInterconnect() removes the ic entries; any packets for
      this interconnect context will be treated as 'past' packets and be acked
      with the STOP flag.
      
      For TCP, TeardownInterconnect() closes all connections with its children;
      the children will treat any readable data in the connection, including the
      closure itself, as a STOP message.
      
      A test case is not included; both commit 2c011ce4 and 02213a73 contain one.
  9. 27 April 2020, 1 commit
    • Fix a bug that two-phase sub-transaction is considered as one-phase. · 48ffabce
      Committed by Paul Guo
      The QD backend should not forget whether a sub-transaction performed writes.
      
      A QD backend process can avoid two-phase commit overhead if it knows that no
      QEs involved in this transaction or any of its sub-transactions performed any
      writes. Previously, if a sub-transaction performed a write on one or more QEs,
      that was remembered in the sub-transaction's global state. However, the
      sub-transaction state was lost after the sub-transaction committed, which
      resulted in the QD not performing two-phase commit at the end of the top
      transaction.
      
      In fact, regardless of the transaction nesting level, we only need to remember
      whether a write was performed by a sub-transaction. Therefore, use a backend
      global variable, instead of the current transaction state, to record this
      information.
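
      A minimal sketch of the scenario (the table name is illustrative): the only
      write happens inside a sub-transaction that has already been released by the
      time the top transaction commits, yet the commit must still be two-phase:

      ```sql
      CREATE TABLE twophase_demo (id int) DISTRIBUTED BY (id);

      BEGIN;
      SAVEPOINT sp1;
      INSERT INTO twophase_demo VALUES (1);  -- QEs perform a write here
      RELEASE SAVEPOINT sp1;                 -- sub-transaction state goes away
      COMMIT;                                -- must still run two-phase commit
      ```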
      Reviewed-by: Gang Xiong <gxiong@pivotal.io>
      Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
      Reviewed-by: Asim R P <apraveen@pivotal.io>
  10. 24 April 2020, 1 commit
  11. 23 April 2020, 1 commit
  12. 21 April 2020, 1 commit
    • Fix getDtxCheckPointInfo to contain all committed transactions (#9942) · 05913102
      Committed by Hao Wu
      Half-committed transactions in shmCommittedGxactArray were omitted. The bug
      could cause data loss or inconsistency: if transaction T1 fails to commit
      prepared on one segment for some reason, while it has already been committed
      on the master and the other segments, T1 is not appended to the checkpoint
      record. The DTX recovery then can't retrieve the transaction and run
      recovery-commit-prepared, so the prepared transaction on that segment is
      aborted.
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
  13. 20 April 2020, 5 commits
    • Improve efficiency in pg_lock_status() · 66e7f812
      Committed by Zhenghua Lyu
      Allocate the memory of CdbPgResults.pg_results with palloc0() instead
      of calloc(), and free the memory after use.
      
      The CdbPgResults.pg_results array that is returned from various dispatch
      functions is allocated by cdbdisp_returnResults() via calloc(), but in
      most cases the memory is not free()-ed after use.
      
      To avoid the memory leak, the array is now allocated with palloc0() and
      recycled with pfree().
      
      Track which row and which result set is being processed in the function
      context in pg_lock_status(), so that an inefficient inner loop can be
      eliminated.
      
      Author: Fang Zheng <zhengfang.xjtu@gmail.com>
    • Do not push Volatile funcs below aggs · 89308890
      Committed by Sambitesh Dash
      Consider the scenario below
      
      ```
      create table tenk1 (c1 int, ten int);
      create temp sequence ts1;
      explain select * from (select distinct ten from tenk1) ss where ten < 10 + nextval('ts1') order by 1;
      ```
      
      The filter outside the subquery is a candidate to be pushed below the
      'distinct' in the sub-query.  But since 'nextval' is a volatile function, we
      should not push it.
      
      Volatile functions give different results with each execution. We don't want
      aggs to use the result of a volatile function before it is necessary. We do
      this for all aggs - DISTINCT and GROUP BY.
      
      Also see commit 6327f25d.
    • Fix memory leak and bug in checkpointer process (#9766) · fa07e427
      Committed by Hao Wu
      1. The CurrentMemoryContext in CreateCheckPoint is a long-lived memory
      context owned by the checkpointer process. The memory context is reset only
      if an error occurs, so leaking in it accumulates into a huge memory leak at
      the OS level.
      2. The prepared transactions are not appended as an extension of the
      checkpoint WAL record, which introduces a bug. It occurs when:
        1) T1 does some DML and is prepared.
        2) T2 runs a checkpoint.
        3) seg0 crashes before transaction T1 successfully runs `COMMIT PREPARED`, but
           all other segments have successfully committed.
        4) When running local recovery, seg0 doesn't know T1 is prepared
           and needs to be committed again.
      This is different from the master branch: master uses files to record
      whether there are any prepared transactions and what they are.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      Reviewed-by: Gang Xiong <gxiong@pivotal.io>
    • Fix a bug when setting DistributedLogShared->oldestXmin · e246d777
      Committed by xiong-gang
      The shared oldestXmin (DistributedLogShared->oldestXmin) may be updated
      concurrently. It should only ever be set to a higher value, because a higher
      xmin can belong to another distributed log segment whose older segments
      might already have been truncated.
      
      For Example: txA and txB call DistributedLog_AdvanceOldestXmin concurrently.
      
      ```
      txA and txB: both hold shared DistributedLogTruncateLock.
      
      txA: set the DistributedLogShared->oldestXmin to XminA. TransactionIdToSegment(XminA) = 0009
      
      txB: set the DistributedLogShared->oldestXmin to XminB. TransactionIdToSegment(XminB) = 0008
      
      txA: truncate segment 0008, 0007...
      ```
      
      After that, DistributedLogShared->oldestXmin == XminB, which is on the removed
      segment 0008. Subsequent GetSnapshotData() calls will fail because
      SimpleLruReadPage will error out.
      Co-authored-by: dh-cloud <60729713+dh-cloud@users.noreply.github.com>
    • Fix CTAS 'with no data' bug · a006571b
      Committed by xiong-gang
      As reported in issue #9790, the 'CTAS ... WITH NO DATA' statement doesn't
      handle the WITH clause: the options in the WITH clause should be added to
      'pg_attribute_encoding'.
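
      For illustration (table names and storage options here are made up), the
      column-encoding options given in the WITH clause should end up in
      pg_attribute_encoding even though no rows are copied:

      ```sql
      CREATE TABLE t_src (a int, b text) DISTRIBUTED BY (a);

      CREATE TABLE t_dst
      WITH (appendonly = true, orientation = column, compresstype = zlib)
      AS SELECT * FROM t_src
      WITH NO DATA;

      -- The per-column encoding options should be recorded here:
      SELECT * FROM pg_attribute_encoding WHERE attrelid = 't_dst'::regclass;
      ```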
  14. 18 April 2020, 1 commit
    • Fix plan when segmentgeneral union all general locus. · ca560132
      Committed by Zhenghua Lyu
      Previously, the planner could not generate a plan when a replicated
      table is UNION ALLed with a general locus scan. A typical case is
      (a runnable version appears at the end of this message):
      
        select a from t_replicate_table
        union all
        select * from generate_series(1, 10);
      
      The root cause is that the function `set_append_path_locus`
      deduces the whole append path's locus to be segmentgeneral,
      which is reasonable. However, the function `cdbpath_create_motion_path`
      fails to handle two cases:
        * segmentgeneral locus to segmentgeneral locus
        * general locus to segmentgeneral locus
      Neither of the above locus changes actually needs a motion.
      
      This commit fixes this by:
        1. adding a check at the very beginning of `cdbpath_create_motion_path`:
           if the subpath's locus is the same as the target locus, just return
        2. adding logic to handle general locus to segmentgeneral locus: just return
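
      A self-contained version of the example above; the replicated table's DDL is
      assumed (any replicated table would do):

      ```sql
      CREATE TABLE t_replicate_table (a int) DISTRIBUTED REPLICATED;

      -- segmentgeneral locus (replicated table) UNION ALL general locus
      -- (generate_series): neither side needs a motion.
      SELECT a FROM t_replicate_table
      UNION ALL
      SELECT * FROM generate_series(1, 10);
      ```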
  15. 15 April 2020, 1 commit
    • Speed up stats derivation for large number of disjunction in ORCA · 085952c1
      Committed by Shreedhar Hardikar
      This bug is particularly evident with queries containing a large array
      IN clause, e.g "a IN (1, 3, 5, ...)".
      
      As a first step to improve optimization times for such queries, this
      commit reduces unnecessary re-allocation of histogram buckets during the
      merging of statistics of disjunctive predicates.
      
      It improves the performance of the target query with 7000 elements in
      the array comparison by around 50%.
  16. 13 April 2020, 1 commit
    • Add check for indexpath when bring_to_singleQE. · 2fbc274b
      Committed by Zhenghua Lyu
      Previously, the function bring_to_singleQE depended on the
      path->param_info field to determine whether a path can be taken into
      consideration, since we cannot pass params across a motion node. But
      this is not enough: for example, an index path's param_info field might
      be NULL while its orderbyclauses reference some outer params. This
      commit fixes the issue by adding more checks for index paths.
      
      See GitHub issue https://github.com/greenplum-db/gpdb/issues/9733
      for details.
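
      A hypothetical sketch of such a plan shape (purely illustrative; this is not
      the reproducer from issue #9733): the inner index scan is ordered by an
      expression that references outer columns, so its orderbyclauses carry outer
      params even when param_info is NULL:

      ```sql
      CREATE TABLE pts (id int, p point) DISTRIBUTED BY (id);
      CREATE INDEX pts_p_gist ON pts USING gist (p);
      CREATE TABLE probes (id int, x float8, y float8) DISTRIBUTED BY (id);

      -- The ORDER BY of the inner index scan refers to pr.x and pr.y from the
      -- outer side, so such a path must not be moved across a motion.
      SELECT pr.id, nn.id AS nearest
      FROM probes pr,
           LATERAL (SELECT id FROM pts
                    ORDER BY pts.p <-> point(pr.x, pr.y)
                    LIMIT 1) nn;
      ```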