Commits · 4354f28c10385c3f4d7b967295d9398a67754419 · Greenplum / Gpdb

24 Sep, 2019 12 commits

Authored by Paul Guo on Sep 24, 2019

That commit is "Necessary legal steps for python subprocess32 shipping."
Forgot to install the required file. Hmm...

4354f28c

Necessary legal steps for python subprocess32 shipping. · a8090c13
Authored by Paul Guo on Sep 24, 2019

a8090c13

Use root's stat info instead of largest child's. · 8ca6c8d1

Authored by Zhenghua Lyu on Sep 24, 2019

Currently, for a partitioned table, we maintain some stat info
for the root table if the GUC optimizer_analyze_root_partition
is set, so that we can use the root's stat info directly.

Previously we used the largest child's stat info for the root partition.
This may lead to a serious issue. Consider a partitioned table t where
all data with a null partition key goes into the default partition, and
that partition happens to be the largest child. Then, for a query that
joins t with another table on the partition key, we will estimate a result
size of 0, because we use the default partition's stat info, which contains
only null partition keys. What is worse, we may broadcast the join result.

This commit fixes this issue but leaves some future work to do:
maintain STATISTIC_KIND_MCELEM and STATISTIC_KIND_DECHIST for the root
table. This commit sets the GUC gp_statistics_pullup_from_child_partition
to false by default. Now the whole logic is:
  * if gp_statistics_pullup_from_child_partition is true, we try to
    use largest child's stat
  * if gp_statistics_pullup_from_child_partition is false, we first
    try to fetch root's stat:
      - if root contains stat info, that's fine, we just use it
      - otherwise, we still try to use largest child's stat
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
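
The selection logic listed in this commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the planner's C code: the function name `choose_stats` and the dictionary stand-ins for stat entries are hypothetical, and only the GUC name comes from the commit message.

```python
from typing import Optional


def choose_stats(pullup_from_child: bool,
                 root_stats: Optional[dict],
                 largest_child_stats: Optional[dict]) -> Optional[dict]:
    """Pick which statistics to use for a root partition.

    pullup_from_child models the GUC gp_statistics_pullup_from_child_partition.
    The two stats arguments are hypothetical stand-ins for catalog stat
    entries; None means "no stats available".
    """
    if pullup_from_child:
        # GUC is true: try the largest child's stats.
        return largest_child_stats
    if root_stats is not None:
        # GUC is false and the root has stats: use them directly.
        return root_stats
    # GUC is false but the root has no stats: fall back to the largest child.
    return largest_child_stats
```

With the GUC off, a root entry wins whenever it exists, so a default partition full of null keys no longer drives the join estimate.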

8ca6c8d1

Omit slice information for SubPlans that are not dispatched separately. · 96c6d318

Authored by Heikki Linnakangas on Sep 24, 2019

Printing the slice information makes sense for Init Plans, which are
dispatched separately, before the main query. But not so much for other
Sub Plans, which are just part of the plan tree; there is no dispatching
or motion involved at such SubPlans. The SubPlan might *contain* Motions,
but we print the slice information for those Motions separately. The slice
information was always just the same as the parent node's, which adds no
information, and can be misleading if it makes the reader think that there
is inter-node communication involved in such SubPlans.

96c6d318

Update the build artifacts path for coverity pipeline · f57a3694
Authored by Tingfang Bao on Sep 24, 2019
```
Authored-by: Tingfang Bao <bbao@pivotal.io>
```
f57a3694
Update the build artifacts path for gpdb_prd pipeline · 69ce9aee
Authored by Tingfang Bao on Sep 24, 2019
```
Authored-by: Tingfang Bao <bbao@pivotal.io>
```
69ce9aee

fix resource definition typo · d2548d1c
Authored by Tingfang Bao on Sep 24, 2019

d2548d1c

Update the gpdb internal build artifacts path (#8678) · e106872b

Authored by Tingfang Bao on Sep 24, 2019

To better maintain the gpdb build process, gp-releng reorganized
the build artifacts storage.

Only the artifacts path changed; the content is still
the same as before.
Authored-by: Tingfang Bao <bbao@pivotal.io>

e106872b

Avoid gp_tablespace_with_faults test failure by pg_switch_xlog() · efd76c4c

Authored by Ashwin Agrawal on Sep 23, 2019

The gp_tablespace_with_faults test writes a no-op record and waits for
the mirror to replay it before deleting the tablespace
directories. This step sometimes fails in CI and causes flaky
behavior. This is due to existing code behavior in the startup and
walreceiver processes. If the primary writes a big xlog record (one
spanning multiple pages), flushes only part of the record due to
XLogBackgroundFlush(), but restarts before committing the transaction,
the mirror receives only the partial record and waits for the complete
record. Meanwhile, after recovery, the no-op record gets written in place of
that big record, and the startup process on the mirror continues to wait to
receive xlog beyond the previously received point before proceeding.

Hence, as a temporary workaround until the actual code problem is
resolved, and to avoid failures for this test, switch xlog before
emitting the no-op xlog record, so that the no-op record is far away from
the previously emitted xlog record.

efd76c4c

docs - replace incorrect refs to PXF_HOME with PXF_CONF (#8663) · f02c136c
Authored by Lisa Owen on Sep 23, 2019

f02c136c

Fix CTAS with gp_use_legacy_hashops GUC · 9040f296

Authored by Jimmy Yih on Sep 10, 2019

When gp_use_legacy_hashops GUC was set, CTAS would not assign the
legacy hash class operator to the new table. This is because CTAS goes
through a different code path and uses the first operator class of the
SELECT's result when no distribution key is provided.

9040f296

Remove plan->nMotionNodes · 00326ca2
Authored by Ashuka Xue on Sep 23, 2019
```
After commit 1c2489d0 nMotionNodes are no longer part of the plan
struct.
```
00326ca2

23 Sep, 2019 8 commits

Make estimate_hash_bucketsize MPP-correct · d6a567b4

Authored by Zhenghua Lyu on Sep 23, 2019

In Greenplum, when estimating costs, most of the time we are
in a global view, but sometimes we should shift to a local
view. Postgres does not suffer from this issue because everything
is in one single segment.

The function `estimate_hash_bucketsize` is from postgres and
it plays a very important role in the cost model of hash join.
It should output a result based on a local view. However, the
input parameters, like the rows in a table and the ndistinct of the
relation, are all taken from a global view (from all segments).
So, we have to do some compensation for it. The logic is:
  1. for a broadcast-like locus, the global ndistinct is the same
     as the local one, so we do the compensation by `ndistinct*=numsegments`.
  2. for the case where the hash key is collocated with the locus, on each
     segment there are `ndistinct/numsegments` distinct groups, so
     there is no need to do the compensation.
  3. otherwise, the locus has to be partitioned and not collocated with
     the hash keys; for these cases, we first estimate the local distinct
     group number, and then do the compensation.
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
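
The three cases above might be sketched as follows. This is an illustration only, in Python rather than the planner's C. The function name, the string-valued `locus` argument, and in particular the formula used for the local distinct-group estimate in case 3 are assumptions (a uniform random-placement model), not taken from the commit.

```python
def compensate_ndistinct(ndistinct: float, rows: float,
                         numsegments: int, locus: str) -> float:
    """Return the compensated ndistinct for a hash bucket-size estimate.

    locus is a hypothetical label: "broadcast", "collocated", or anything
    else for "partitioned but not collocated with the hash keys".
    """
    if ndistinct <= 0 or numsegments <= 0:
        raise ValueError("ndistinct and numsegments must be positive")

    if locus == "broadcast":
        # Case 1: every segment sees all groups, so scale up.
        return ndistinct * numsegments

    if locus == "collocated":
        # Case 2: each segment holds ndistinct/numsegments groups already.
        return ndistinct

    # Case 3: estimate how many groups show up on one segment, then scale.
    # ASSUMPTION: each row lands on a uniformly random segment, so a group
    # with rows/ndistinct rows is present on a given segment with
    # probability 1 - (1 - 1/numsegments) ** (rows/ndistinct).
    rows_per_group = rows / ndistinct
    p_present = 1.0 - (1.0 - 1.0 / numsegments) ** rows_per_group
    local_ndistinct = ndistinct * p_present
    return local_ndistinct * numsegments
```

Under that assumed model, the case 3 result falls between the collocated and broadcast values whenever each group has at least one row on average.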

d6a567b4

Refactor away add_slice_to_motion() function. · 08553cc4

Authored by Heikki Linnakangas on Sep 23, 2019

The function did completely different things for different callers, so
seems better to move the logic to the callers instead.
Reviewed-by: Adam Lee <ali@pivotal.io>
Reviewed-by: Ning Yu <nyu@pivotal.io>

08553cc4

Remove separate Query argument, it's the same as root->parse. · 0f7a9b47

Authored by Heikki Linnakangas on Sep 23, 2019

Remove the Query argument from cdbparallelize(), and its apply_motion()
subroutine. Like most planner functions, these functions are passed a
"PlannerInfo root" which represents the query, and its Query struct is
available at root->parse. Passing a separate Query is confusing because you
might think that you could pass some different query, perhaps a subquery.

0f7a9b47

Handle VALUES expressions in plan_tree_mutator() · 88c99773
Authored by Heikki Linnakangas on Sep 23, 2019
```
Fixes github issue https://github.com/greenplum-db/gpdb/issues/8621
```
88c99773

Remove nMotionNodes and nInitPlans from Plan struct. · 1c2489d0

Authored by Heikki Linnakangas on Sep 23, 2019

It was quite silly to have them in the Plan struct, which all the plan
nodes "inherit", when the fields were actually only used in the topmost
node in a plan tree. The silliness was noted in the comments, along with
"Someday, find a better place to keep it". Today is that day.

In the executor, the natural place for these is the PlannedStmt struct.
PlannedStmt contains information for the plan tree as a whole, and in fact,
we already had copies of the fields there, we were just not always using
them! PlannedStmt is only built in the last steps of planning, though.
During planning, stash them in PlannerGlobal, like many other fields that
are finally copied to PlannedStmt.

There was one little wrinkle in this plan: there was a check in
EvalPlanQual, which checked that EvalPlanQual is not used on a Plan node
that has any Motions in its subtree. Move that check to ExecInitMotion().

1c2489d0

Set error code for "incompatible loci in target inheritance set" error. · e59aff50

Authored by Heikki Linnakangas on Sep 23, 2019

There is a test case that reaches this, in the 'file_fdw' test. With
ERRCODE_INTERNAL_ERROR, the error message includes the source file location
(planner.c:1513). That's problematic, because the line
number changes whenever we touch planner.c.

Since this error is in fact reachable, mark it as FEATURE_NOT_SUPPORTED.

e59aff50

Remove unused 'sliceTable' field from Plan struct. · 0a6312a1
Authored by Heikki Linnakangas on Sep 23, 2019

0a6312a1

Replace planIsParallel by checking Plan->dispatch flag. · c1851b62

Authored by Heikki Linnakangas on Sep 23, 2019

Commit 7d74aa55 introduced a new function, planIsParallel() to check
whether the main plan tree needs the interconnect, by checking whether
it contains any Motion nodes. However, we already determine that, in
cdbparallelize(), by setting the Plan->dispatch flag. We were just not
checking it when deciding whether the interconnect needs to be set up.
Let's just check the 'dispatch' flag, like we did earlier in the
function, instead of introducing another way of determining whether
dispatching is needed.

I'm about to get rid of the Plan->nMotionNodes field soon, which is why
I don't want any new code to rely on it.

c1851b62

21 Sep, 2019 5 commits

Enable Init Plans in queries executed locally in QEs. · 98c8b550

Authored by Heikki Linnakangas on Sep 21, 2019

I've been wondering for some time why we have disabled constructing Init
Plans in queries that are planned in QEs, like in SPI queries that run in
user-defined functions. So I removed the diff vs upstream in
build_subplan() to see what happens. It turns out it was because we always
ran the ExtractParamsFromInitPlans() function in QEs, to get the InitPlan
values that the QD sent with the plan, even for queries that were not
dispatched from the QD but planned locally. Fix the call in InitPlan to
only call ExtractParamsFromInitPlans() for queries that were actually
dispatched from the QD, and allow QE-local queries to build Init Plans.

Include a new test case, for clarity, even though there were some existing
ones that incidentally covered this case.

98c8b550

Split MOTIONTYPE_FIXED into GATHER and BROADCAST. · 49f01ddf

Authored by Heikki Linnakangas on Sep 21, 2019

MOTIONTYPE_FIXED was used for both Gather and Broadcast Motions, and
there was an extra flag to indicate which one it was. There was a comment
that suggested we should have two different MOTIONTYPE codes for them,
instead. I totally agree, Gather and Broadcast motions are quite different,
and practically all the code that checked for MOTIONTYPE_FIXED also had to
check the flag to see which it is, so separating the two makes a lot of
sense.

This doesn't have any user-visible effect, just refactoring to make the
code nicer.

49f01ddf

docs - clarify GUC default_transaction_deferrable has no effect. (#8674) · c6e4f86d
Authored by Mel Kiyama on Sep 20, 2019

c6e4f86d

Make READ/WRITE_*_ARRAY macros safe for more complex arguments. · 3fc5fd74

Authored by Heikki Linnakangas on Sep 20, 2019

If you passed e.g. "1+1" as the 'count' argument and "int" as the 'Type'
argument, the macro would expand the allocation to:

   palloc(1+1*sizeof(int))

when clearly it should be

   palloc((1+1)*sizeof(int))
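
The precedence problem can be reproduced outside C by mimicking the preprocessor's purely textual substitution. The sketch below does that in Python. The helper names are hypothetical, and `sizeof(int)` is assumed to be 4 for illustration; the real macros live in C and are not shown here.

```python
SIZEOF_INT = 4  # ASSUMPTION: sizeof(int) is 4 on the platform in question


def expand_unsafe(count: str) -> str:
    """Mimic a macro body written as `count*sizeof(Type)`: plain text paste."""
    return f"{count}*{SIZEOF_INT}"


def expand_safe(count: str) -> str:
    """Mimic the fixed macro body `(count)*sizeof(Type)`."""
    return f"({count})*{SIZEOF_INT}"


def evaluate(expr: str) -> int:
    """Evaluate a plain arithmetic expression with builtins disabled."""
    return eval(expr, {"__builtins__": {}}, {})


unsafe_bytes = evaluate(expand_unsafe("1+1"))  # parsed as 1 + (1*4)
safe_bytes = evaluate(expand_safe("1+1"))      # parsed as (1+1) * 4
```

The unparenthesized expansion under-allocates, which is why wrapping each macro argument in parentheses is the usual defensive habit.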

3fc5fd74

Replace toolsmiths/ccp with pivotaldata/ccp (#8667) · a69ed682

Authored by Amil Khanzada on Sep 20, 2019

The toolsmiths DockerHub repository is deprecated and replaced by pivotaldata.
Co-authored-by: Amil Khanzada <akhanzada@pivotal.io>
Co-authored-by: Jose Munoz <jmunoz@pivotal.io>

a69ed682

20 Sep, 2019 7 commits

Fix pipeline failures caused by the previous commit. · da724e8d
Authored by Paul Guo on Sep 20, 2019

da724e8d

Ship subprocess32 and replace subprocess with it in python code (#8658) · 9c4a885b

Authored by Paul Guo on Sep 20, 2019

* Ship modified python module subprocess32 again

subprocess32 is preferred over subprocess according to python documentation.
In addition, we long ago modified the code to use vfork() instead of fork() to
avoid some "Cannot allocate memory" kind of error (a false alarm though, since memory
is actually sufficient) in gpdb production environments, which usually have memory
overcommit disabled.  We also compiled and shipped it, but later it was just
compiled but not shipped, somehow due to a makefile change (maybe a regression).
Let's ship it again.

* Replace subprocess with our own subprocess32 in python code.

9c4a885b

Two simple cleanups. (#8659) · 0439ae41

Authored by Paul Guo on Sep 20, 2019

1. checkpoint_segments does not exist since pg9.5. Cleaning up the code that
includes it.

2. GPTest.pm should be cleaned up in src/test/regress/GNUmakefile
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>

0439ae41

Don't Assert() in MemoryAccounting_ConvertIdToAccount() · 3c54abf3

Authored by Adam Lee on Sep 20, 2019

Greenplum's Assert() invokes Trap(), then ExceptionalCondition(), then
errdetail(), MemoryAccounting_Allocate(),
MemoryAccounting_ConvertIdToAccount() and Assert() again.

It will fall into an infinite error handling loop until the process gets
signaled.

Just abort here.
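
The re-entrancy described above can be modeled with a toy recursion. This is a hedged sketch in Python, not the actual C code: `MAX_REENTRIES` is a hypothetical cap that stands in for "until the process gets signaled", and the function only imitates the call chain named in the commit message.

```python
MAX_REENTRIES = 5  # hypothetical cap so the sketch terminates


def convert_id_to_account(account_ok: bool, use_assert: bool,
                          depth: int = 0) -> str:
    """Toy model of MemoryAccounting_ConvertIdToAccount() failure handling."""
    if account_ok:
        return "ok"
    if not use_assert:
        # The fix: abort immediately instead of entering error reporting.
        return f"abort at depth {depth}"
    if depth >= MAX_REENTRIES:
        # The real process would keep looping here until it is signaled.
        return f"still looping at depth {depth}"
    # Assert() -> Trap() -> ExceptionalCondition() -> errdetail()
    # -> MemoryAccounting_Allocate() -> back into this function.
    return convert_id_to_account(account_ok, use_assert, depth + 1)
```

In the model, the assert path only stops because of the artificial cap, while the abort path ends on the first failure.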

3c54abf3

Fix auto_explain_optimizer expected output file. · 3d468888
Authored by Sambitesh Dash on Sep 19, 2019

3d468888

Bump ORCA v3.72.0 and reduce Error logging for ORCA · 13ee8482

Authored by Sambitesh Dash on Sep 10, 2019

- The corresponding ORCA PR is : https://github.com/greenplum-db/gporca/pull/533

- Change GUC value OPTIMIZER_UNEXPECTED_FAIL so that we log only unexpected failures.
Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
Co-authored-by: Sambitesh Dash <sdash@pivotal.io>

13ee8482

Fix miscellaneous warnings when building ORCA translator code · 450e41bc

Authored by Shreedhar Hardikar on Sep 17, 2019

- Fix "unelaborated friend declaration" warnings
- Fix "missing prototype" warnings
- Fix "generalized initializer lists are a C++ extension" warning

CTranslatorQueryToDXL.h:63:10: warning: unelaborated friend declaration is a C++11 extension; specify 'class' to befriend 'gpdxl::CTranslatorScalarToDXL' [-Wc++11-extensions]
                friend CTranslatorScalarToDXL;
                       ^
                       class

funcs.cpp:43:1: warning: no previous prototype for function 'DisableXform' [-Wmissing-prototypes]
DisableXform(PG_FUNCTION_ARGS)
^
funcs.cpp:76:1: warning: no previous prototype for function 'EnableXform' [-Wmissing-prototypes]
EnableXform(PG_FUNCTION_ARGS)
^
funcs.cpp:109:1: warning: no previous prototype for function 'LibraryVersion' [-Wmissing-prototypes]
LibraryVersion()
^
funcs.cpp:123:1: warning: no previous prototype for function 'OptVersion' [-Wmissing-prototypes]
OptVersion()
^
4 warnings generated.

CTranslatorDXLToScalar.cpp:730:9: warning: generalized initializer lists are a C++11 extension [-Wc++11-extensions]
        return { .oid_type = inner_type_oid, .type_modifier = type_modifier};

450e41bc

19 Sep, 2019 8 commits

Allocate and zero out the memory of MyTmGxactLocal · 70c9dbd3
Authored by xiong-gang on Sep 19, 2019

70c9dbd3
The XactLogCommitRecord logic was broken when merging 9.5, doing minor refactor · e80b0212
Authored by Gang Xiong on Sep 02, 2019
```
to fix it.
```
e80b0212

Refactor the dispatch function and error messages. · 9471a04d
Authored by Gang Xiong on Jul 31, 2019

9471a04d

Refactor TMGXACT · c0b60c3f

Authored by Gang Xiong on Jul 31, 2019

- some members of 'MyTmGxact' are only accessed locally, extract them to local
  variable 'MyTmGxactLocal'
- get rid of gid in MyTmGxact and form it with timestamp and gxid if needed.
- get rid of 'currentGxact' and check 'MyTmGxactLocal->state' to see if the
  distributed transaction is started or not.

c0b60c3f

Remove dead code · cae7abe9
Authored by Gang Xiong on Jul 31, 2019

cae7abe9

Check parallel plans correctly · 97ec75a7

Authored by Ning Yu on Sep 11, 2019

In standard_ExecutorStart() we should dispatch a plan if it is parallel.
Currently this is determined by checking whether planTree->dispatch is
DISPATCH_PARALLEL.  However, sometimes a DISPATCH_UNDETERMINED
plan can also be parallel.  For example:

        CREATE TABLE arrtest_f (f0 int, f1 text, f2 float8)
          DISTRIBUTED RANDOMLY;
        EXPLAIN
        SELECT ARRAY(select f2 from arrtest_f order by f2) AS "ARRAY"
          ORDER BY 1;
                             QUERY PLAN
        --------------------------------------------------
         Result
           InitPlan 1 (returns $0)  (slice2)
             ->  Gather Motion 3:1  (slice1; segments: 3)
                   Merge Key: f2
                   ->  Sort
                         Sort Key: f2
                         ->  Seq Scan on arrtest_f
         Optimizer: Postgres query optimizer
        (8 rows)

To fix it we should also check whether the plan contains motions.  Note
that we should only check for the motions of itself, the motions of its
init plans should not be counted.

(cherry picked from commit 7d74aa55)

97ec75a7

Fix tcp interconnect hang · 2fa26b2a

Authored by xiong-gang on Sep 09, 2019

There was a hang like this: when one QE errors out before 'SetupInterconnect',
QD will keep waiting for the incoming connections to be established and doesn't
check the error message from dispatcher. Other QEs are finished and hang in
function 'waitOnOutbound'.
Co-authored-by: Asim R P <apraveen@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
Co-authored-by: Ning Yu <nyu@pivotal.io>

(cherry picked from commit b2101122)

2fa26b2a

Remove unused local variable · 5a1c076c
Authored by Weinan WANG on Sep 19, 2019
```
In DispatchSyncPGVariable, `ListCell *l` is unused. Remove it.
```
5a1c076c