- 12 Dec 2018, 14 commits
-
-
Committed by Heikki Linnakangas
We have no plans to implement that optimization for distributed transactions any time soon.
-
Committed by Heikki Linnakangas
The PR that it referred to was pushed, and replace_sirvf_rte() looks OK to me now. I don't think there's anything left to do here. I just forgot to remove this before pushing the 9.4 merge commit.
-
Committed by Heikki Linnakangas
Commit e89be84b changed the behavior so that this is not needed anymore.
-
Committed by Heikki Linnakangas
I'm not sure what happened here, but the planner now chooses an index-only scan, like PostgreSQL does, without forcing it. Remove the 'set enable_sort=off' and the FIXME comment about it.
-
Committed by Heikki Linnakangas
We could declare the function as ON SEGMENT, use SPI, and let the planner/executor dispatch it, but there's no particular reason we would need to do it that way. I think dispatching the function call like this is totally OK.
-
Committed by Heikki Linnakangas
I'm not sure what was wrong when this comment was written, but it seems to work now. For good measure, reset iss_RuntimeContext to NULL when it's freed, but 'dpe' is passing even without it.
-
Committed by Daniel Gustafsson
Commit 28dd0152 relaxed the restriction on altering indexed columns, but neglected to update the expected test output for ORCA. This applies the diff from qp_misc_jiras.out to the _optimizer.out file as well.
-
Committed by Heikki Linnakangas
We store the distribution policy in segments nowadays (since commit 7efe3204). This is no longer needed. Reviewed-by: Adam Lee <ali@pivotal.io>
-
Committed by Jinbao Chen

* Make sure ALTER TABLE preserves index tablespaces.

When rebuilding an existing index, ALTER TABLE correctly kept the physical file in the same tablespace, but it messed up the pg_class entry if the index had been in the database's default tablespace and "default_tablespace" was set to some non-default tablespace. This led to an inaccessible index. Fix by changing pg_get_indexdef_string() to always include a tablespace clause, whether or not the index is in the default tablespace. The previous behavior was installed in commit 537e92e4, and I think it just wasn't thought through very clearly; certainly the possible effect of default_tablespace wasn't considered. There's some risk in changing the behavior of this function, but there are no other call sites in the core code. Even if it's being used by some third-party extension, it's fairly hard to envision a usage that is okay with a tablespace clause being appended some of the time but can't handle it being appended all the time. Back-patch to all supported versions. Code fix by me, investigation and test cases by Michael Paquier. Discussion: <1479294998857-5930602.post@n3.nabble.com>

* Enable altering a table column with an index.

Originally, we disabled altering a table column that has an index, because CREATE INDEX is dispatched immediately, which unfortunately breaks the ALTER work queue; the workaround was to disallow the operation. In PostgreSQL 9.1, the is_alter_table parameter was introduced to DefineIndex(), so we now have a chance to re-enable this feature.
-
Committed by Heikki Linnakangas

It had two indexes:

* pg_statlastop_classid_objid_index on (classid oid_ops, objid oid_ops), and
* pg_statlastop_classid_objid_staactionname_index on (classid oid_ops, objid oid_ops, staactionname name_ops)

The first one is completely redundant with the second one. Remove it.

Fixes assertion failure https://github.com/greenplum-db/gpdb/issues/6362. The assertion was added in PostgreSQL 9.1, commit d2f60a3a. The failure happened on "VACUUM FULL pg_stat_last_operation", if the VACUUM FULL itself added a new row to the table. The insertion also inserted entries in the indexes, which tripped the assertion that checks that you don't try to insert entries into an index that's currently being reindexed, or pending reindexing:

> (gdb) bt
> #0 0x00007f02f5189783 in __select_nocancel () from /lib64/libc.so.6
> #1 0x0000000000be76ef in pg_usleep (microsec=30000000) at pgsleep.c:53
> #2 0x0000000000ad75aa in elog_debug_linger (edata=0x11bf760 <errordata>) at elog.c:5293
> #3 0x0000000000acdba4 in errfinish (dummy=0) at elog.c:675
> #4 0x0000000000acc3bf in ExceptionalCondition (conditionName=0xc15798 "!(!ReindexIsProcessingIndex(((indexRelation)->rd_id)))", errorType=0xc156ef "FailedAssertion", fileName=0xc156d0 "indexam.c", lineNumber=215) at assert.c:46
> #5 0x00000000004fded5 in index_insert (indexRelation=0x7f02f6b6daa0, values=0x7ffdb43915e0, isnull=0x7ffdb43915c0 "", heap_t_ctid=0x240bd64, heapRelation=0x24efa78, checkUnique=UNIQUE_CHECK_YES) at indexam.c:215
> #6 0x00000000005bda59 in CatalogIndexInsert (indstate=0x240e5d0, heapTuple=0x240bd60) at indexing.c:136
> #7 0x00000000005bdaaa in CatalogUpdateIndexes (heapRel=0x24efa78, heapTuple=0x240bd60) at indexing.c:162
> #8 0x00000000005b2203 in MetaTrackAddUpdInternal (classid=1259, objoid=6053, relowner=10, actionname=0xc51543 "VACUUM", subtype=0xc5153b "REINDEX", rel=0x24efa78, old_tuple=0x0) at heap.c:744
> #9 0x00000000005b229d in MetaTrackAddObject (classid=1259, objoid=6053, relowner=10, actionname=0xc51543 "VACUUM", subtype=0xc5153b "REINDEX") at heap.c:773
> #10 0x00000000005b2553 in MetaTrackUpdObject (classid=1259, objoid=6053, relowner=10, actionname=0xc51543 "VACUUM", subtype=0xc5153b "REINDEX") at heap.c:856
> #11 0x00000000005bd271 in reindex_index (indexId=6053, skip_constraint_checks=1 '\001') at index.c:3741
> #12 0x00000000005bd418 in reindex_relation (relid=6052, flags=2) at index.c:3870
> #13 0x000000000067ba71 in finish_heap_swap (OIDOldHeap=6052, OIDNewHeap=16687, is_system_catalog=1 '\001', swap_toast_by_content=0 '\000', swap_stats=1 '\001', check_constraints=0 '\000', is_internal=1 '\001', frozenXid=821, cutoffMulti=1) at cluster.c:1667
> #14 0x0000000000679ed5 in rebuild_relation (OldHeap=0x7f02f6b7a6f0, indexOid=0, verbose=0 '\000') at cluster.c:648
> #15 0x0000000000679913 in cluster_rel (tableOid=6052, indexOid=0, recheck=0 '\000', verbose=0 '\000', printError=1 '\001') at cluster.c:461
> #16 0x0000000000717580 in vacuum_rel (onerel=0x0, relid=6052, vacstmt=0x2533c38, lmode=8, for_wraparound=0 '\000') at vacuum.c:2315
> #17 0x0000000000714ce7 in vacuumStatement_Relation (vacstmt=0x2533c38, relid=6052, relations=0x24c12f8, bstrategy=0x24c1220, do_toast=1 '\001', for_wraparound=0 '\000', isTopLevel=1 '\001') at vacuum.c:787
> #18 0x0000000000714303 in vacuum (vacstmt=0x2403260, relid=0, do_toast=1 '\001', bstrategy=0x24c1220, for_wraparound=0 '\000', isTopLevel=1 '\001') at vacuum.c:337
> #19 0x0000000000969cd2 in standard_ProcessUtility (parsetree=0x2403260, queryString=0x24027e0 "vacuum full;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x2403648, completionTag=0x7ffdb4392550 "") at utility.c:804
> #20 0x00000000009691be in ProcessUtility (parsetree=0x2403260, queryString=0x24027e0 "vacuum full;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x2403648, completionTag=0x7ffdb4392550 "") at utility.c:373

In this scenario, we had just reindexed one of the indexes of pg_stat_last_operation, and the metatrack update of that tried to insert a row into the same table. But the second index in the table was pending reindexing, which triggered the assertion. After removing the redundant index, pg_stat_last_operation has only one index, and that scenario no longer happens.

This fix is a bit fragile, because the problem will reappear as soon as you add a second index on the table. But we have no plans to do that, and I believe no harm would be done in production builds with assertions disabled, anyway. So this will do for now.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Shaoqi Bai <sbai@pivotal.io>
Reviewed-by: Jamie McAtamney <jmcatamney@pivotal.io>
-
Committed by Heikki Linnakangas
I'm not sure if anyone cares about it, but this at least makes it compile again. Reviewed-by: Shaoqi Bai <sbai@pivotal.io>
-
Committed by Heikki Linnakangas

This code is performance-critical, as the QD will quickly become the bottleneck when loading data with COPY FROM, so optimizing it is warranted. I did some quick testing on my laptop, loading a table with 50 boolean columns; this commit improved the performance of that by about 10%.

In GPDB 5, we used to postpone calling the input functions for columns that were not needed to determine which segment to send the row to. We lost that optimization during the 9.1 merge, when the COPY code was heavily refactored. We now call the input functions for every column in the QD, which added a lot of overhead to the QD. On the other hand, some other things got faster. Before this commit, the test case with 50 booleans was about 10% slower in GPDB 6 compared to GPDB 5, so this buys back the performance, making it about the same speed again. I'm sure you can find a test case where GPDB 6 is still slower - you can make the input functions arbitrarily slow, after all - but every little helps.

Reviewed-by: Adam Lee <ali@pivotal.io>
-
Committed by Heikki Linnakangas
It's just for debugging purposes, so be lenient. Fixes github issue https://github.com/greenplum-db/gpdb/issues/6298. Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
-
Committed by Jialun
- Add a fault injection to test the rollback process
- Remove the pg_hba.conf update during online expand: only the master connects to segments via libpq, and the master's IP is already in pg_hba.conf
-
- 11 Dec 2018, 6 commits
-
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
Commit 198f701e forgot to remove these.
-
Committed by Heikki Linnakangas

CHECK_FOR_INTERRUPTS() had grown quite bloated, given how frequently it runs. It called IsResQueueEnabled(), which added the overhead of a function call even if resource queues were not enabled. If they were enabled, it would call BackoffBackendTick(), which adds another function call. BackoffBackendTick() checked a lot of other conditions, so it would still often not do anything.

It turns out to be cheaper to always increment the "tick" counter, to avoid the overhead of checking whether we need it or not. Move all the checks for whether the whole backoff mechanism is even needed into a function that only runs when the tick counter expires.

I've seen those calls take up to 5% of CPU overhead when profiling some queries with Linux 'perf'. This should shave off the time spent on that.

Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
Reviewed-by: Ning Yu <nyu@pivotal.io>
-
Committed by Ning Yu
Operators like HashAgg might need more memory quota than granted. In resource group mode this can be allowed, as memory can be allocated from global or per-group shared memory. Supporting this improves the handling of low-end resource groups: as long as there is enough shared memory, a low-end resource group can still run large queries, although the performance might not be the best.
-
Committed by Bhuvnesh Chaudhary
We need to free the temporary HLL counters created during HLL merging; if there are a lot of partitions and each partition holds a lot of data, we may run into OOM if the temporary counters are not released. In the merge_leaf_stats function, a nested for loop leaks a single HLL counter for N^2 iterations (where N = number of partitions). This runs in a single memory context, hence leaking considerable memory and leading to VMEM errors. This patch fixes the leaks. Co-authored-by: Rahul Iyer <riyer@pivotal.io> Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
-
Committed by Sambitesh Dash
-
- 10 Dec 2018, 15 commits
-
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
'subqfromlist' field was removed by commit f16deabd.
-
Committed by Heikki Linnakangas
We had backported this earlier, and got a second copy as part of the upstream merge. It went unnoticed because no one has tried building GPDB on Itanium.
-
Committed by Heikki Linnakangas
Fumbled this in the upstream merge. Harmless, but let's be tidy.
-
Committed by Heikki Linnakangas
Looking at the historical pre-open-sourcing repository, the 'G' message was used by ancient prototype code from 2007-2009. The prototype code was removed in 2009, but this snippet was left over. I don't think it was used in any released version of GPDB. The 'W' type message was removed in commit daf6cdbc, but this was left over.
-
Committed by Heikki Linnakangas
If the actual signature of the function/variable changes, with an extern declaration directly in the other .c file you might not notice. The correct way to use 'extern' is to have only one 'extern' declaration, in a header file.
-
Committed by Heikki Linnakangas
Commit b3c50e40 changed the argument from StringInfo to ErrorData *, but neglected to update the comment in the header file. Fix by copying the up-to-date comment from the .c file.
-
Committed by Heikki Linnakangas
It was unused, and we don't support Windows as a server platform, anyway.
-
Committed by Heikki Linnakangas
I think these functions were meant to implement ALTER TABLE MODIFY RANGE/LIST, but that command doesn't exist anymore. I'm not sure if it was ever fully implemented.
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
It was always passed as 'true'.
-
Committed by Zhenghua Lyu
`CdbTryOpenRelation` might return a NULL pointer, so each time we use its return value we should check whether it is NULL. `parserOpenTable` forgot to do such a check; this commit adds it.
-
Committed by Tang Pengzhou

Commits 8eed4217 & e0b06678 allow us to do a gpexpand of a GPDB cluster without a restart, so we call this strategy "online expand". Those two commits mainly focused on how to avoid restarting the cluster when expanding it; this commit does the remaining work to improve gpexpand, using a few features we merged to master recently.

The first improvement is that it's no longer necessary to change the policy of all non-partitioned tables to random at the beginning of gpexpand. Previously, we couldn't tell the difference between expanded and non-expanded tables, so if their distribution policies were the same, the planner treated their data as co-located and produced an incorrect plan. To avoid this, gpexpand used to change the policy of all tables to random, so the planner could produce a correct but inefficient plan, because a random policy always means costly broadcast motions. Now each table has a 'numsegments' attribute, introduced by 4eb65a53, so GPDB can distinguish expanded from non-expanded tables and produce correct plans. The first improvement is therefore to remove the randomization step.

The second improvement is a brand-new syntax, "ALTER TABLE foo EXPAND TABLE", dedicated to rebalancing table data. Previously, tables were converted to random distribution before rebalancing, and a table's numsegments was always the same as the GPDB cluster size, so we could use the trick "ALTER TABLE foo SET WITH (REORGANIZE = true) DISTRIBUTED BY (original_key)" to rebalance data onto the newly added segments. Now the policy and numsegments are not changed before rebalancing, so expanding such tables no longer matches the concept of "SET DISTRIBUTED BY", and we need a proper syntax focused on table expansion.

Internally, "ALTER TABLE foo EXPAND TABLE" has two methods to do the actual data movement: one is CTAS, the other is RESHUFFLE. Which one is better depends on how much data needs to be moved, whether the table has indexes, and whether the table is an append-only table (analyzed by Heikki).

A drawback of this commit is that we always expand a partition within a transaction; the old behavior was to expand the leaf partitions in parallel, which is faster for a partitioned table. We don't allow a root partition and its leaf partitions to have different numsegments for now, so this commit disables the old behavior temporarily. How to expand partitioned tables in parallel is under discussion, and we wish to bring the ability back properly in the future.

* Refine name quoting. E.g., if the schema name of a table is a.b and the table name is c'd, the current gpexpand can't handle it.
-
- 08 Dec 2018, 5 commits
-
-
Committed by Lisa Owen
-
Committed by Adam Berlin
-
Committed by Lisa Owen
* docs - foreign data wrapper SQL and catalog ref page updates
* misc edits
* clarify mpp_execute 'any' value
-
Committed by Lisa Owen
* docs - remove operator RECHECK; add note, update sys catalog
* add blank line
-
Committed by Chuck Litzell
-