1. 02 Aug 2019, 3 commits
    • Support GIN Indexes with ORCA. · eae823f4
      Bhuvnesh Chaudhary authored
      This commit adds the GPDB-side changes required to support GIN indexes
      with ORCA.  It also adds a new test file, gp_gin_indexes, to test plans
      produced by ORCA and the planner.

      GIN indexes are not supported with index expressions or predicate
      constraints; ORCA does not currently support those for other index
      types either.
    • Unify backend/access/gin unittest infrastructure with other unit tests (#8275) · fc88b4ee
      Ivan Leskin authored
      GPDB has a common pipeline that is used for most unit tests.

      However, the unit tests for src/backend/access/gin introduced by 99360f54
      used a custom implementation of the unit test build script. This led to
      errors, e.g., when a compiler other than GCC was used to build GPDB.
      
      Rewrite the Makefile to unify the test infrastructure with the common
      pattern used in the backend, while retaining test isolation from the
      backend objects.
      
      See also similar Makefile: src/backend/catalog/test/Makefile
      at 122c79f2
      
      Note that the Makefile in src/backend/access/gin/test differs from the
      currently most used version of a backend unit test Makefile. These
      differences and the motivation for them are described in the README.
      
      Run pgindent on ginpostlist_fakes.c
      Reviewed-by: Adam Berlin <aberlin@pivotal.io>
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Ivan Leskin <leskin.in@arenadata.io>
    • Fix aocs table block version mismatch (#8202) · 41fd823a
      David Kimura authored
      ALTER TABLE DROP COLUMN followed by a reorganize loses the column
      encoding settings of the dropped column. When the dropped column's
      compresstype encoding is incorrect, we can hit a block version mismatch
      error later, during the block info validation check of that column.
      
      One idea was to skip dropped columns when constructing AOCSScanDesc.
      However, dropping all columns is a special case that is not easily handled
      because it is not equivalent to deleted rows. Instead, the fix is to preserve
      column encoding settings even for dropped columns.
      Co-authored-by: Soumyadeep Chakraborty <sochakraborty@pivotal.io>
      Co-authored-by: Ivan Leskin <leskin.in@arenadata.io>
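      As a sketch of the approach (illustrative only: the accessor matches the
      9.4-era TupleDesc layout and preserve_column_encoding() is a
      hypothetical helper, not the actual patch):

      ```c
      /*
       * Rebuild per-column encoding entries for a reorganized AOCS table.
       * Note: no "if (att->attisdropped) continue;" here.  Skipping dropped
       * columns is exactly what lost their compresstype settings and caused
       * the block version mismatch during block info validation.
       */
      for (int attno = 0; attno < tupdesc->natts; attno++)
      {
          Form_pg_attribute att = tupdesc->attrs[attno];

          preserve_column_encoding(relid, att->attnum);   /* hypothetical */
      }
      ```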
  2. 01 Aug 2019, 1 commit
  3. 31 Jul 2019, 5 commits
    • Fix gpperfmon GUC definitions · b5729a03
      Asim R P authored
      The GUC gp_enable_gpperfmon is defined to be set only at postmaster
      restart.  Having a check hook that checks whether the process setting
      it has superuser privileges is meaningless.  The check hook is removed.

      The GUC gp_gpperfmon_send_interval is intended to be set only by a
      superuser.  Adjust its definition accordingly and leverage the checks
      for superuser privileges built into the GUC framework.  The check hook
      for this GUC tried to achieve the same but did so incorrectly.  If the
      check hook was invoked by a backend process at the beginning of the
      main query processing loop, it would crash: at that point a transaction
      has not been started yet, and the check hook invokes the superuser()
      interface, which performs catalog access.  Doing so without starting a
      transaction is a recipe for crashing badly.  Such a crash was observed
      in production at least once.
      
      Thank you Jesse Zhang for suggesting the removal of the superuser check.

      The patch doesn't add any tests because, after removing the check
      hooks, the checks built into the GUC framework are used.  That code
      path is well exercised by existing regression tests.
      
      Reviewed-by: Daniel Gustafsson
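      A sketch of the resulting definitions, written here with the
      extension-style GUC API for brevity (the real GUCs are built-in entries
      in GPDB's GUC tables; defaults and descriptions below are illustrative):

      ```c
      /* Settable only at server start: a superuser check hook is moot. */
      DefineCustomBoolVariable("gp_enable_gpperfmon",
                               "Enable gpperfmon monitoring.",
                               NULL,
                               &gp_enable_gpperfmon,
                               false,
                               PGC_POSTMASTER,
                               0,
                               NULL, NULL, NULL);

      /*
       * PGC_SUSET: the GUC framework itself rejects non-superusers, without
       * calling superuser(), which performs catalog access and would crash
       * when invoked before a transaction has started.
       */
      DefineCustomIntVariable("gp_gpperfmon_send_interval",
                              "Interval in seconds between gpperfmon packets.",
                              NULL,
                              &gp_gpperfmon_send_interval,
                              1, 1, 3600,
                              PGC_SUSET,
                              0,
                              NULL, NULL, NULL);
      ```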
    • Convert bin, sbin and doc in gpMgmt to recursive targets · b5aba18b
      Daniel Gustafsson authored
      Installing the management utilities used to be a pretty brute-force
      operation which copied more or less everything over blindly and then
      tried to remove what shouldn't be installed. This is clearly not a
      terribly clean or sustainable solution, as subsequent issues with it
      have proven (editor save files, patch .rej/.orig files, etc. were
      routinely copied and never purged).
      
      This takes a first stab at turning installation of gpMgmt/bin, sbin
      and doc into proper recursive make targets which only install the
      files that were intended to be installed.
      
      Discussion: https://github.com/greenplum-db/gpdb/pull/8179
      Reviewed by Bradford Boyle, Kalen Krempely, Jamie McAtamney and
      many more
    • Create a sub memory context for serialization functions · f780ca11
      Adam Lee authored
      And reset it afterwards, to make sure no memory is leaked there.

      For instance, the deserialized transValues for new entries (which are
      not that temporary) are not in group_buf, and were not freed.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
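      A minimal sketch of the pattern, assuming an illustrative context name
      and placement:

      ```c
      /* One sub-context dedicated to the serial/deserial calls. */
      MemoryContext serial_ctx =
          AllocSetContextCreate(CurrentMemoryContext,
                                "SerializeAggContext",
                                ALLOCSET_DEFAULT_MINSIZE,
                                ALLOCSET_DEFAULT_INITSIZE,
                                ALLOCSET_DEFAULT_MAXSIZE);
      MemoryContext oldctx = MemoryContextSwitchTo(serial_ctx);

      /* ... run the serialization/deserialization functions; anything they
       * allocate (e.g. deserialized transValues that never reach group_buf)
       * now lands in serial_ctx ... */

      MemoryContextSwitchTo(oldctx);
      MemoryContextReset(serial_ctx);   /* reclaims it all, leak or not */
      ```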
    • Use mpool to allocate memory for aggregate transition data · c9258ae3
      Adam Lee authored
      So we can account for its usage via GET_TOTAL_USED_SIZE(hashtable) and
      decide if it's time to spill.

      It is also no longer necessary to pfree() the transValue, since it is
      in the mpool and will be reset after a spill.
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Richard Guo <riguo@pivotal.io>
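      Roughly (a sketch based on the mpool API used by GPDB's hashagg; field
      and helper names are simplified):

      ```c
      /* Allocate the transition value from the hash table's mpool so that
       * GET_TOTAL_USED_SIZE(hashtable) accounts for it. */
      void *newValue = mpool_alloc(hashtable->group_buf, transValueLen);
      memcpy(newValue, transValueBytes, transValueLen);

      /* The spill decision now sees the transition data's footprint. */
      if (GET_TOTAL_USED_SIZE(hashtable) >= max_mem)
          spill_hash_table(hashtable);   /* the mpool is reset after the
                                          * spill, so no per-value pfree() */
      ```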
    • Serialize the aggstates while spilling hash table · 6dadce04
      Adam Lee authored
      AggStates are pointers allocated in aggcontext with type INTERNAL; just
      spilling the pointers does not decrease memory usage, and combining
      states without freeing them can leak memory.

      This commit serializes the aggstates, writes the real data into the
      spill file, and frees the memory.
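      Conceptually, the spill path changes as below (a sketch, not the actual
      code; the pertrans/pergroup field names are approximate):

      ```c
      /* Spilling the raw INTERNAL pointer writes 8 meaningless bytes and
       * leaves the state itself in aggcontext.  Serialize it instead, write
       * the real bytes, and free the in-memory state. */
      bytea *buf = DatumGetByteaP(FunctionCall1(&pertrans->serialfn,
                                                pergroup->transValue));

      BufFileWrite(spill_file, VARDATA(buf), VARSIZE(buf) - VARHDRSZ);

      pfree(DatumGetPointer(pergroup->transValue));  /* really frees memory */
      pergroup->transValue = (Datum) 0;
      ```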
  4. 30 Jul 2019, 1 commit
    • resgroup: hashagg: revert operator memory auto enlarging · 3827546a
      Ning Yu authored
      In resource group mode we once introduced automatic operator memory
      enlargement logic for hashagg; the point was to let hashagg fail on an
      actual OOM instead of on a soft quota check, which helps hashagg run
      successfully with an initially low operator memory.

      However, the auto-enlargement logic introduced a bug: hashagg spilling
      could be disabled in resource group mode by accident.

      On the other hand, we introduced a memory_spill_ratio=0 mode in
      resource groups to use statement_mem for operators, which is the same
      behavior as resource queues.  The statement_mem setting is usually
      large enough and fine-tuned by the users; in such a case we do not need
      the auto-enlargement for hashagg, and it is better to keep the old
      quota-checking behavior.
      
      So we revert the hashagg-related changes from the commits below:
      
      - 40d955d6 Rid resource group on hashagg spill evaluation (#8199)
      - ede74cdc resgroup: reduce log level for operator memory overuse
      - f053e6cd resgroup: allow memory overuse for hashagg spill meta data
      - 90795402 resgroup: allow operators enlarge their memory quota
      Reviewed-by: Adam Lee <ali@pivotal.io>
      Reviewed-by: Weinan WANG <wewang@pivotal.io>

      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/30hiTArxsgo/aydVMcrXBQAJ
  5. 26 Jul 2019, 4 commits
    • Emit warning that an injector of a panic should consider disabling FTS. · fbd0f091
      Adam Berlin authored
      - Adds a hint to the user on how to disable FTS.

      - Registers warnings outside of the core injection framework.

        This allows us to create GPDB-specific warnings, for things like
        FTS, without tainting the core framework, which is in theory
        dependent only on Postgres.
    • CREATE DATABASE two-phase commit safe (#8078) · 7c89cde7
      Weinan WANG authored
      When a failure is raised by `CREATE DATABASE`, the relevant files and
      directories should not be left over.

      Add the new database's information to the `pendingDbDeletes` list, so
      that `CREATE DATABASE` is guaranteed to be a 2PC-safe command.
      Co-authored-by: Asim R P <apraveen@pivotal.io>
    • Save the complete slot data in ExecMaterializeSlot() · 1eeae2e9
      Adam Lee authored
      ExecMaterializeSlot() transformed any tuple into a virtual tuple via
      slot_getallattrs() and then formed a heap tuple from it; the ctid was
      lost here, since virtual tuples have no system columns.

      This commit copies the entire htup directly if we have a regular
      physical tuple that is not locally palloc'd.
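      The gist of the change, as a sketch (slot field and macro names are
      approximate):

      ```c
      /* If the slot holds a regular physical tuple that we did not palloc
       * locally, copy the whole HeapTuple.  heap_copytuple() preserves
       * system columns such as ctid, which the old slot_getallattrs() +
       * heap_form_tuple() path dropped. */
      if (slot->PRIVATE_tts_heaptuple != NULL && !TupShouldFree(slot))
      {
          slot->PRIVATE_tts_heaptuple =
              heap_copytuple(slot->PRIVATE_tts_heaptuple);
          TupSetShouldFree(slot);   /* we own the copy now */
          return slot->PRIVATE_tts_heaptuple;
      }
      ```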
    • Propagate external table format options to PXF headers · 91d4e40a
      Francisco Guerrero authored
      Currently, PXF does not propagate the format options in the external
      table framework. In Foreign Data Wrappers, these options are defined at
      the foreign-table level and are propagated to PXF.

      In order for the PXF Server to better support both FDW and external
      tables, we now consistently pass format information from both clients.
  6. 25 Jul 2019, 6 commits
  7. 24 Jul 2019, 2 commits
  8. 23 Jul 2019, 4 commits
    • Refactor and improve cdbpath_motion_for_join · e2731add
      Zhenghua Lyu authored
      This commit refactors the function `cdbpath_motion_for_join` to make it
      clearer and to generate better plans for some cases.

      In a distributed computing system, gathering distributed data into a
      single QE should always be the last choice. Previously, the code for
      the general and segmentgeneral locus types, when not ok_to_replicate,
      would try to gather the other locus to a single QE. This commit
      improves on that by first trying to add a redistribute motion.
      
      The logic for the join result's locus (outer's locus is general):
        1. if outer is ok to replicate, the result's locus is the same
           as inner's locus
        2. if outer is not ok to replicate (like left join or wts cases)
           2.1 if inner's locus is hashed or hashOJ, we try to redistribute
               outer to match inner; if that fails, make inner singleQE
           2.2 if inner's locus is strewn, we try to redistribute
               outer and inner; if that fails, make inner singleQE
           2.3 otherwise, just return the inner's locus; no motion is needed
      (a condensed code sketch of this branch appears after this entry)
      
      The logic for the join result's locus (outer's locus is segmentgeneral):
      - if both are SegmentGeneral:
           1. if both locus are equal, no motion is needed, simply return
           2. for update cases: if the result relation is SegmentGeneral,
              the update must execute on each segment of the result
              relation; if the result relation's numsegments is larger, the
              only solution is to broadcast the other side
           3. otherwise no motion is needed; change both numsegments to the
              common value
      - if only one of them is SegmentGeneral:
           1. consider the update case: if the result relation is
              SegmentGeneral, the only solution is to broadcast the other
              side
           2. if the other side's locus is singleQE or entry, convert
              SegmentGeneral to the other side's locus
           3. the remaining possibility is that the other side's locus is
              partitioned:
              3.1 if SegmentGeneral is not ok_to_replicate, try to add a
                  redistribute motion; if that fails, gather each side to
                  singleQE
              3.2 if SegmentGeneral's numsegments is larger, just return
                  the other side's locus
              3.3 otherwise, try to add a redistribute motion; if that
                  fails, gather each side to singleQE
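      A condensed sketch of the general-outer branch (locus macros from
      cdbpathlocus.h; control flow heavily simplified, and try_redistribute()
      stands in for the redistribution attempt):

      ```c
      if (CdbPathLocus_IsGeneral(outer.locus))
      {
          if (outer.ok_to_replicate)
              return inner.locus;               /* 1: follow the inner side */

          if (CdbPathLocus_IsHashed(inner.locus) ||
              CdbPathLocus_IsHashedOJ(inner.locus) ||
              CdbPathLocus_IsStrewn(inner.locus))
          {
              /* 2.1/2.2: try to redistribute instead of gathering; only if
               * that fails, fall back to a single QE. */
              if (!try_redistribute(root, &outer, &inner))
                  CdbPathLocus_MakeSingleQE(&inner.locus, numsegments);
          }
          /* 2.3: otherwise inner's locus is returned unchanged */
          return inner.locus;
      }
      ```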
    • Remove Replicated Locus in cdbpath_motion_for_join · 20248a31
      Zhenghua Lyu authored
      Locus type Replicated can only be generated by a join operation, and in
      the function cdbpathlocus_join there is a rule:
          `<any locus type> join <Replicated> => any locus type`

      By proof by contradiction, it can be shown that when the code arrives
      here, it is impossible for either of the two input paths' locus to be
      Replicated. So we add two asserts here, as sketched below.
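      The two asserts amount to the following (using GPDB's locus test
      macros):

      ```c
      /* By the rule above, a Replicated locus is consumed by
       * cdbpathlocus_join and can never survive to this point, so neither
       * input path may carry one. */
      Assert(!CdbPathLocus_IsReplicated(outer.locus));
      Assert(!CdbPathLocus_IsReplicated(inner.locus));
      ```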
    • Re-enable `COPY (query) TO` on utility mode · 41a8cf29
      Adam Lee authored
      It was disabled by accident several months ago while implementing
      `COPY (query) TO ON SEGMENT`; re-enable it.
      
      ```
      commit bad6cebc
      Author: Jinbao Chen <jinchen@pivotal.io>
      Date:   Tue Nov 13 12:37:13 2018 +0800
      
          Support 'copy (select statement) to file on segment' (#6077)
      ```
      
      WARNING: there are no safety protections in utility mode; it is not
      recommended except in disaster recovery situations.
      Co-authored-by: Weinan WANG <wewang@pivotal.io>
    • Rid resource group on hashagg spill evaluation (#8199) · 40d955d6
      Weinan WANG authored
      Resource groups assume memory access is always faster than disk, and
      resource group memory management hooks into the hashagg executor node's
      spill mechanism. If the hash table size overwhelms `max_mem` in
      resource group mode, the hash table does not spill and fan out data;
      the resource group instead grants more memory to the hash table.
      However, this strategy impacts the hash collision rate, causing
      performance regressions in some OLAP queries.

      We no longer consult the resource group GUC when hashagg evaluates
      whether it needs to spill, as sketched below.
      Co-authored-by: Adam Li <ali@pivotal.io>
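      Before and after, as a simplified sketch (names illustrative):

      ```c
      /* Before: under resource groups, never spill; grow the hash table and
       * accept a higher collision rate instead. */
      if (hashtable->mem_used > max_mem && !IsResGroupEnabled())
          spill_hash_table(hashtable);

      /* After: the spill decision depends only on the memory quota, with or
       * without resource groups. */
      if (hashtable->mem_used > max_mem)
          spill_hash_table(hashtable);
      ```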
  9. 22 Jul 2019, 6 commits
    • Fix memory quota calculation of aggregation · 79c83ea1
      Adam Lee authored
      MemoryAccounting_RequestQuotaIncrease() returns a number in bytes, but
      the caller here expects kB.
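      The mismatch in a nutshell (a sketch; the real call's arguments are
      elided):

      ```c
      /* MemoryAccounting_RequestQuotaIncrease() reports bytes ... */
      uint64 quota_bytes = MemoryAccounting_RequestQuotaIncrease(/* ... */);

      /* ... but the aggregation code consumed the value as kB, making the
       * quota off by a factor of 1024.  Convert before use: */
      uint64 quota_kB = quota_bytes / 1024;
      ```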
    • Refactor to remove raw_buf_done from external scan · 585e90b6
      Adam Lee authored
      scan->raw_buf_done was used only for custom external tables; refactor
      to remove the MERGE_FIXME.

      cstate->raw_buf_len is safe to use, since we operate on pstate->raw_buf
      directly in this case.
    • Remove some unnecessary MERGE_FIXMEs · b6afd44c
      Adam Lee authored
      About `isjoininner`: I searched the commit history of the merge branch;
      it was removed upstream in 9.2 by "e2fa76d8 - Use parameterized paths
      to generate inner indexscans more flexibly". That MERGE_FIXME was there
      because, at the time, functions relying on `isjoininner` refused to
      compile.
    • Expand sreh rejected count to int64 · 1f8254a8
      Adam Lee authored
      If there are more than INT_MAX rejected rows, this will overflow. That
      is possible at least if you specify the segment reject limit as a
      percentage.

      Still keep the SEGMENT REJECT LIMIT value itself as an int; expanding
      that would break lots of things, like the catalog, for too little
      benefit.
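      The shape of the change (field names approximate):

      ```c
      typedef struct CdbSreh
      {
          /* ... other fields ... */
          int64   rejectcount;   /* was int: with SEGMENT REJECT LIMIT given
                                  * as a percentage, more than INT_MAX rows
                                  * can be rejected, overflowing 32 bits */
          int     rejectlimit;   /* stays int: widening it would ripple
                                  * through the catalog for little benefit */
      } CdbSreh;
      ```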
    • Place cdbsreh counting into copy function · 1425f036
      Adam Lee authored
      Now the processed and rejected counting happen only in NextCopyFrom(),
      which reads the next tuple from the file; this makes much more sense.
    • Keep the order of reusing idle gangs · 51a7ea27
      Ning Yu authored
      For example: in the same session, query 1 has 3 slices and creates
      gang 1, gang 2 and gang 3. Query 2 has 2 slices; we want it to reuse
      gang 1 and gang 2 rather than, say, gang 3 and gang 2.

      In this way, the two queries can use the same send-receive port pairs.
      This is useful on platforms like Azure, because Azure limits the number
      of distinct send-receive port pairs (a.k.a. flows) within a certain
      time period.
      Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
      Co-authored-by: Paul Guo <pguo@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
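      A sketch of the intent (illustrative, not the actual cdbgang code):

      ```c
      /* Keep idle gangs ordered by creation time and always reuse from the
       * front: after a 3-slice query creates gangs 1..3, a following
       * 2-slice query picks gangs 1 and 2 again, and therefore reuses the
       * same send-receive port pairs. */
      static Gang *
      get_idle_gang(void)
      {
          Gang *gang;

          if (idle_gangs == NIL)
              return NULL;             /* nothing cached; create a new gang */

          gang = (Gang *) linitial(idle_gangs);    /* oldest-created first */
          idle_gangs = list_delete_first(idle_gangs);
          return gang;
      }
      ```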
  10. 18 Jul 2019, 1 commit
    • Consider non direct dispatch cost for cached plan. · 6f936827
      Hubert Zhang authored
      A prepared statement binds parameters for each execution. It needs to
      decide whether to use a cached generic plan without params or a custom
      plan with params. In the past, GPDB used plan cost plus re-plan cost to
      choose between the generic and the custom plan. But a generic plan does
      not contain param values, so unlike a custom plan it cannot be
      direct-dispatched.

      A non-direct-dispatch plan introduces unnecessary QEs, which still need
      to go through the volcano model, do two-phase commit and write prepare
      xlog. So in some cases the cost of failing to generate a direct
      dispatch plan is higher than the re-plan cost, which makes the custom
      plan run faster than the generic plan even though it must re-plan on
      every execute.

      Note that non-direct-dispatch cost is not considered by the planner
      yet. The planner treats direct dispatch as an optimization and always
      enables it when possible. But for prepared statements, a generic plan
      cannot generate a direct dispatch plan at all, and we need to account
      for that cost. As a result, we introduce the non-direct-dispatch cost
      into the total cost only for cached plans, as sketched below.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
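      The idea, sketched against plancache.c's choose_custom_plan() (the
      non_direct_dispatch_cost term is illustrative):

      ```c
      /* A cached generic plan cannot be direct-dispatched, so charge it for
       * the QEs it would needlessly involve; the custom plan instead pays
       * the cost of re-planning on every EXECUTE. */
      Cost generic_total = plansource->generic_cost + non_direct_dispatch_cost;
      Cost custom_total  = avg_custom_cost + replan_cost;

      /* Favor the custom plan whenever direct dispatch saves more than
       * re-planning costs. */
      return custom_total < generic_total;
      ```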
  11. 17 Jul 2019, 1 commit
    • Use the correct time unit in BackoffSweeper backend · 16522314
      Pengzhou Tang authored
      Commit bfd1f46c used the wrong time unit (expected ms, passed µs) in
      the BackoffSweeper backend, which prevents it from re-calculating the
      CPU shares in time; normal backends then sleep for more CPU ticks than
      before in CHECK_FOR_INTERRUPTS, causing a performance downgrade.
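      The bug class, reduced to a sketch (set_timer_interval_ms() is
      hypothetical):

      ```c
      int sweep_interval_us = 500 * 1000;   /* 500ms, held in microseconds */

      /* Bug: passing microseconds where ms are expected stretches the
       * interval 1000x, so CPU shares are recalculated far too late. */
      set_timer_interval_ms(sweep_interval_us);

      /* Fix: convert to the unit the API expects. */
      set_timer_interval_ms(sweep_interval_us / 1000);
      ```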
  12. 15 Jul 2019, 2 commits
    • Pass correct in-progress array pointer for writing to file · e2886364
      Asim R P authored
      In the case of extended queries, snapshot information is shared between
      reader and writer QEs using files.  The writer obtains a snapshot and
      writes it to a file.  This patch fixes a bug that passed an incorrect
      address for the array of in-progress transactions when writing to the
      file.  The bug caused hard-to-reproduce errors in production workloads
      that involved extended queries (bind/execute libpq messages, declare
      cursor statements, and certain PL/* statements such as RETURN and
      EXECUTE in pl/pgsql).

      Reviewed-by: Adam Berlin and Jesse Zhang
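      The bug class, as a sketch (names approximate):

      ```c
      TransactionId *xids = snapshot->inProgressXidArray;

      /* Buggy: writes the bytes of the pointer variable itself. */
      FileWrite(file, (char *) &xids, count * sizeof(TransactionId));

      /* Fixed: writes the in-progress transaction array it points to. */
      FileWrite(file, (char *) xids, count * sizeof(TransactionId));
      ```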
    • Unittest for writing and reading cursor snapshot · 3304ac00
      Asim R P authored
      The functions under test are dumpSharedLocalSnapshot_forCursor() and
      readSharedLocalSnapshot_forCursor().  A significant amount of global
      state needs to be created for the test to be able to invoke the two
      functions.  What we are testing here is whether correct snapshot
      information is included in what is written to the file.  Validation is
      performed by reading the contents of the file and comparing them with
      expected values.

      An implicit rule to run a unit test is defined in src/Makefile.mock.
      This patch overrides it so that the directory required for the cursor
      snapshot file is created before running the test.

      Reviewed-by: Adam Berlin and Jesse Zhang
  13. 13 Jul 2019, 1 commit
    • Disable shareinputscan with outer refs. · b658e3ed
      Richard Guo authored
      Currently shareinputscan doesn't handle rescan properly, and fully
      supporting that would require a lot of code changes. After some
      discussion, we decided to disable shareinputscan with outer refs for
      now.
  14. 12 Jul 2019, 1 commit
    • Skip starting the stats sender process when gpperfmon is not enabled (#8126) · 4a347206
      Wang Hao authored
      Before GP6, the stats sender process served both gpperfmon and the
      metrics collector, so it had to start if either of them was enabled.
      As of GP6, the metrics collector has become a standalone bgworker, so
      this commit fixes the startup condition for the stats sender: it now
      starts only when gpperfmon is enabled.
  15. 11 Jul 2019, 2 commits
    • Skip cdbcomponent_updateCdbComponents for FTS · f37bcae1
      Pengzhou Tang authored
      FTS takes responsibility for updating gp_segment_configuration. In each
      FTS probe cycle, FTS first gets a copy of the current configuration,
      then probes the segments based on it, and finally frees the copy at the
      end. In the probe stage, FTS might start and close transactions many
      times, so FTS should not update its current copy of
      gp_segment_configuration when a new transaction is started.
    • Merge pull request #8109 from magi345/master_bgworker_fix · b6c1b467
      Wang Hao authored
      Auxiliary bgworkers should skip resource group assignment