1. 16 Jun 2020, 7 commits
    • Fix lateral PANIC issue when subquery contains LIMIT or GROUP BY · 300d3c19
      Committed by Zhenghua Lyu
      Previous commit 62579728 fixes a lateral panic issue, but it does
      not handle all the bad cases because it only checks whether the
      query tree contains a LIMIT clause. For example, if the subquery
      is like `q1 union all (q2 limit 1)`, the whole query tree does
      not contain a LIMIT clause at its top level.
      
      Another bad case is that the lateral subquery may contain a
      GROUP BY, like:
      
          select * from t1_lateral_limit t1 cross join lateral
          (select (c).x+t2.a, sum(t2.a+t2.b) from t2_lateral_limit t2
           group by (c).x+t2.a)x;
      
      When planning the lateral subquery we do not know where the param
      is in the subquery's query tree, so it is a bit complicated to
      resolve this issue precisely and efficiently.
      
      This commit adopts a simple method to fix the panic: it just
      checks the subquery's query tree for any GROUP BY or LIMIT
      clause and, if one is found, forces gathering each relation and
      materializing it. This is not the best plan we might get, but
      let's make it correct first; in the future we should seriously
      consider how to fully and efficiently support LATERAL.
    • Fix a startup process hang issue when primary node has not yet committed/aborted prepare xlog before shutdown checkpoint. (#10164) · ab69cf9e
      Committed by Paul Guo
      
      On the primary, if there is a prepared transaction that has not been
      committed or aborted, restarting in fast mode could lead to a startup
      process hang. That is because the node was cleanly shut down, so there
      is no recovery, yet some variables used by PrescanPreparedTransactions()
      are only initialized during recovery.
      
      The stack of the hanging startup process is shown below:
      
      0x0000000000bfb756 in pg_usleep (microsec=1000) at pgsleep.c:56
      0x00000000005668ad in read_local_xlog_page (state=0x2a9c580, targetPagePtr=201326592, reqLen=32768, targetRecPtr=201571064, cur_page=0x2ab32e0 "\a",
          pageTLI=0x2a9c5c0) at xlogutils.c:829
      0x00000000005646ef in ReadPageInternal (state=0x2a9c580, pageptr=201555968, reqLen=15128) at xlogreader.c:503
      0x0000000000563f86 in XLogReadRecord (state=0x2a9c580, RecPtr=201571064, errormsg=0x7ffe895409d8) at xlogreader.c:226
      0x000000000054c0e0 in PrescanPreparedTransactions (xids_p=0x0, nxids_p=0x0) at twophase.c:1696
      0x0000000000559dcc in StartupXLOG () at xlog.c:7595
      0x00000000008e6c2f in StartupProcessMain () at startup.c:242
      Reviewed-by: Gang Xiong <gxiong@pivotal.io>
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
    • Fix flaky tests exttab1 and pxf_fdw · 2a41201c
      Committed by Hubert Zhang
      The flaky case happens when selecting from an external table with
      the option "fill missing fields". By gdb-ing the QE, we found this
      value is sometimes not false on the QE. In ProcessCopyOptions we
      used intVal(defel->arg) to parse the boolean value, which is not
      correct; use defGetBoolean instead.
      Also fix a pxf_fdw test case, which should set fill_missing_fields
      to true explicitly.
      
      Cherry-picked from: f154e5
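      The fix pattern above — parsing an option value leniently instead of
      assuming it is an integer node — can be sketched in Python. The helper
      below is a hypothetical analogue of Postgres' defGetBoolean(), not the
      actual C code; it accepts the common boolean spellings and raises on
      anything else rather than returning garbage the way a blind integer
      cast would.

```python
def def_get_boolean(value):
    """Leniently parse an option value as a boolean, analogous in spirit
    to Postgres' defGetBoolean(). Accepts bool, 0/1 ints, and common
    string spellings; raises on anything else."""
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        if value in (0, 1):
            return bool(value)
    elif isinstance(value, str):
        v = value.strip().lower()
        if v in ("true", "on", "yes", "1"):
            return True
        if v in ("false", "off", "no", "0"):
            return False
    raise ValueError(f"{value!r} requires a Boolean value")

# A naive integer extraction (the analogue of intVal(defel->arg)) would
# mishandle an option that actually arrives as the string "true".
```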
    • Retry more for replication synchronization waiting to avoid isolation2 test flakiness. (#10281) · 6d829d98
      Committed by Paul Guo
      Some test cases have been failing due to too few retries. Increase
      the retry counts and also create some common UDFs for reuse.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      
      Cherry-picked from ca360700
    • Fix flakiness of "select 1" output after master reset due to injected panic fault before_read_command (#10275) · 8f753424
      Committed by Paul Guo
      
      Several tests inject a panic in before_read_command to trigger a
      master reset. Previously we ran "select 1" after the fault-injection
      query to verify, but the output is sometimes nondeterministic, i.e.
      sometimes we do not see the line
      
      PANIC:  fault triggered, fault name:'before_read_command' fault type:'panic'
      
      This was actually observed in test crash_recovery_redundant_dtx, per
      the commit message and test comment. That test ignores the output of
      "select 1", but we probably still want the output to verify the
      fault is encountered.
      
      It is still mysterious why the PANIC message is sometimes missing. I
      spent some time digging but reckon I cannot root-cause it in a short
      time. One guess is that the PANIC message was sent to the frontend
      in errfinish(), but the kernel-buffered data was dropped after
      abort() due to ereport(PANIC). Another guess is something wrong
      related to the libpq protocol (not saying it is a libpq bug). In any
      case, it does not deserve much time to work on the tests only, so
      simply mask the PANIC message to make the test result deterministic
      without affecting the test's purpose.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      
      Cherry-picked from 02ad1fc4
      
      Note test segwalrep/dtx_recovery_wait_lsn was not backported so the change
      on the test is not cherry-picked.
    • Add a new line feed and fix a bad file name · 0919667b
      Committed by J·Y
    • Fix ICW test if GPDB compiled without ORCA · 01cf8634
      Committed by Chris Hajas
      We need to ignore the output when enabling/disabling an Orca xform:
      if the server is not compiled with Orca there will be a diff (and
      we don't really care about this output).
      
      Additionally, clean up unnecessary/excessive setting of GUCs.
      
      Some of these GUCs were on by default or only intended for a
      specific test. Explicitly setting them caused them to appear at the
      end of `explain verbose` plans, making the expected output more
      difficult to match when the server was built with/without Orca.
  2. 15 Jun 2020, 1 commit
    • Fix flaky test terminate_in_gang_creation · 6584b264
      Committed by Hubert Zhang
      The test case restarts all primaries and expects the old session to
      fail on the next query, since gangs are cached. But the restart may
      last more than 18s, which is the maximum idle time QEs can exist.
      In this case, the new query in the old session will just fetch a
      new gang without the expected errors.
      Set gp_vmem_idle_resource_timeout to 0 to fix this flaky test.
      Cherry-picked from: 63b5adf9
      Reviewed-by: Paul Guo <pguo@pivotal.io>
  3. 12 Jun 2020, 1 commit
  4. 11 Jun 2020, 3 commits
    • Fix bitmap segfault during portal cleanup · c9dc4729
      Committed by Denis Smirnov
      Currently, when we finish a query with a bitmap index scan and
      destroy its portal, we release bitmap resources in the wrong order.
      First we should release the bitmap iterator (a bitmap wrapper) and
      only after that close down the subplans (the bitmap index scan with
      an allocated bitmap). Before this commit, these operations were
      done in the reverse order, which causes access to a freed bitmap in
      the iterator's close function.
      
      Under the hood, pfree() is a wrapper over malloc's free(). free()
      doesn't return memory to the OS in most cases and often doesn't
      immediately corrupt data in a freed chunk, so it is possible to
      access a freed chunk's data right after its deallocation. That is
      why we get a segfault only under concurrent workloads, when
      malloc's arena returns memory to the OS.
    • Change branch for test_gpdb_6X slack command · fa45da99
      Committed by Chris Hajas
      Previously, we used the same slack command for master and 6X. Now we
      have separate slack commands, and need to look in separate branches.
    • docs - graph analytics new page (#10138) · de367f7d
      Committed by Lena Hunter
      * clarifying pg_upgrade note
      
      * graph edits
      
      * graph analytics updates
      
      * menu edits and code spacing
      
      * graph further edits
      
      * insert links for modules
  5. 10 Jun 2020, 4 commits
  6. 09 Jun 2020, 2 commits
  7. 08 Jun 2020, 1 commit
  8. 06 Jun 2020, 5 commits
  9. 05 Jun 2020, 5 commits
  10. 04 Jun 2020, 5 commits
    • Fix plan difference in gporca regression test · ed9b9eea
      Committed by Hans Zeller
    • Add "FILL_MISSING_FIELDS" option for gpload. · 87fef901
      Committed by Wen Lin
    • Support "NDV-preserving" function and op property · 392c2e97
      Committed by Hans Zeller
      Orca uses this property for cardinality estimation of joins.
      For example, a join predicate foo join bar on foo.a = upper(bar.b)
      will have a cardinality estimate similar to foo join bar on foo.a = bar.b.
      
      Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1)
      won't be treated that way, since they are more likely to have a greater
      effect on join cardinalities.
      
      Since this is specific to ORCA, we use logic in the translator to
      determine whether a function or operator is NDV-preserving. Right
      now we consider a very limited set of operators; we may add more
      at a later time.
      
      Let's assume that we join tables R and S and that f is a function or
      expression that refers to a single column and does not preserve
      NDVs. Let's also assume that p is a function or expression that also
      refers to a single column and that does preserve NDVs:
      
      join predicate       card. estimate                         comment
      -------------------  -------------------------------------  -----------------------------
      col1 = col2          |R| * |S| / max(NDV(col1), NDV(col2))  build an equi-join histogram
      f(col1) = p(col2)    |R| * |S| / NDV(col2)                  use NDV-based estimation
      f(col1) = col2       |R| * |S| / NDV(col2)                  use NDV-based estimation
      p(col1) = col2       |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
      p(col1) = p(col2)    |R| * |S| / max(NDV(col1), NDV(col2))  use NDV-based estimation
      otherwise            |R| * |S| * 0.4                        this is an unsupported pred
      Note that adding casts to these expressions is ok, as well as switching left and right side.
      
      Here is a list of expressions that we currently treat as NDV-preserving:
      
      coalesce(col, const)
      col || const
      lower(col)
      trim(col)
      upper(col)
      
      One more note: We need the NDVs of the inner side of Semi and
      Anti-joins for cardinality estimation, so only normal columns and
      NDV-preserving functions are allowed in that case.
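      The estimation rules in the table above can be sketched as a small
      calculation. This is a hypothetical model of the rules as stated,
      not ORCA's actual code: each predicate side is classified as a
      plain column, an NDV-preserving expression ("p"), or a
      non-NDV-preserving expression ("f"), and the divisor is chosen
      accordingly.

```python
# Sketch of the join-cardinality rules from the table above.
# Sides are "col" (plain column, possibly with casts), "p"
# (NDV-preserving expression), or "f" (non-NDV-preserving expression).
UNSUPPORTED_SELECTIVITY = 0.4  # fixed factor for unsupported predicates

def join_card(rows_r, rows_s, left, right, ndv1, ndv2):
    """Estimate |R join S| for predicate left(col1) = right(col2)."""
    preserving = ("col", "p")
    if left in preserving and right in preserving:
        # Both sides preserve NDVs: divide by the larger NDV,
        # as in the equi-join / p(col1) = p(col2) rows of the table.
        return rows_r * rows_s / max(ndv1, ndv2)
    if right in preserving:
        # Only the right side preserves NDVs: f(col1) = col2 / p(col2).
        return rows_r * rows_s / ndv2
    if left in preserving:
        # Symmetric case with the sides switched.
        return rows_r * rows_s / ndv1
    # Neither side preserves NDVs: unsupported predicate.
    return rows_r * rows_s * UNSUPPORTED_SELECTIVITY
```

      For instance, with |R| = 1000, |S| = 2000, NDV(col1) = 10 and
      NDV(col2) = 50, the predicate f(col1) = p(col2) is estimated as
      1000 * 2000 / 50 under this model.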
      
      This is a port of these GPDB 5X and GPOrca PRs:
      https://github.com/greenplum-db/gporca/pull/585
      https://github.com/greenplum-db/gpdb/pull/10090
      
      (cherry picked from commit 3ccd1ebfa1ea949ac77ed3b5d8f5faadfa87affd)
      
      Also updated join.sql expected files with minor motion changes.
    • Docs - update PXF, gpss versions · 235ab4ff
      Committed by David Yozie
    • Docs - update versioning & build for 6.8 release · c1f648ae
      Committed by David Yozie
  11. 03 Jun 2020, 5 commits
    • Remove unnecessary projections from duplicate-sensitive Distribute(s) in ORCA · 3f84adff
      Committed by Shreedhar Hardikar
      Duplicate-sensitive HashDistribute Motions generated by ORCA get
      translated to Result nodes with hashFilter cols set. However, if
      the Motion needs to distribute based on a complex expression
      (rather than just a Var), the expression must be added to the
      targetlist of the Result node and then referenced in
      hashFilterColIdx.
      
      However, this can affect other operators above the Result node. For
      example, a Hash operator expects the targetlist of its child node to
      contain only elements that are to be hashed. Additional expressions here
      can cause issues with memtuple bindings that can lead to errors.
      
      (E.g. the attached test case, when run without this fix, gives the
      error: "invalid input syntax for integer:")
      
      This PR fixes the issue by adding an additional Result node on top
      of the duplicate-sensitive Result node to project only the elements
      from the original targetlist in such cases.
    • Fixed the \dm empty output error · 387bf9cd
      Committed by Jinbao Chen
      The psql client ignored the rel storage when it built the \dm
      command, so the output of \dm was empty. Add the correct rel
      storage check to the command.
    • Revert "Upgrade pgbouncer to 1.13" · 0da92fbc
      Committed by Xiaoran Wang
      This reverts commit 412493b0.
      
      Failed to compile pgbouncer on centos6: can't find libevent.
      pgbouncer 1.13 uses pkg-config to look up libevent instead of
      using --with-libevent.
      
      Another issue is that pgbouncer 1.13 does not support libevent
      version 1.x, but we use libevent 1.4 on centos6.
    • Upgrade pgbouncer to 1.13 · 412493b0
      Committed by Xiaoran Wang
    • Refactoring the DbgPrint and OsPrint methods (#10149) · d9b16e34
      Committed by Hans Zeller
      * Make DbgPrint and OsPrint methods on CRefCount
      
      Create a single DbgPrint() method on the CRefCount class. Also create
      a virtual OsPrint() method, making some objects derived from CRefCount
      easier to print from the debugger.
      
      Note that not all the OsPrint methods had the same signature; some
      additional OsPrintxxx() methods have been generated for that.
      
      * Making print output easier to read, print some stuff on demand
      
      Required columns in required plan properties are always the same
      for a given group. Also, equivalent expressions in required distribution
      properties are important in certain cases, but in most cases they
      disrupt the display and make it harder to read.
      
      Added two trace flags, EopttracePrintRequiredColumns and
      EopttracePrintEquivDistrSpecs, that have to be set to print this
      information. If you want to go back to the old display, use these
      options when running gporca_test: -T 101016 -T 101017
      
      * Add support for printing alternative plans
      
      A new method, CEngine::DbgPrintExpr() can be called from
      COptimizer::PexprOptimize, to allow printing of the best plan
      for different contexts. This is only enabled in debug builds.
      
      To use this:
      
      - run an MDP using gporca_test, using a debug build
      - print out memo after optimization (-T 101006 -T 101010)
      - set a breakpoint near the end of COptimizer::PexprOptimize()
      - if, after looking at the contents of memo, you want to see
        the optimal plan for context c of group g, do the following:
        p eng.DbgPrintExpr(g, c)
      
      You could also get the same info from the memo printout, but it
      would take a lot longer.
      
      (cherry picked from commit b3fdede6)
  12. 02 Jun 2020, 1 commit