- 07 Dec 2018, 33 commits
-
-
Committed by Daniel Gustafsson
The support for sending alerts via email or SNMP was quite a kludge, and there are much better external tools for managing alerts than what we can supply in core anyway, so this retires the capability. All references to alert sending are removed from the docs, but a section needs to be written in the release notes or a similar location about how to migrate off this feature.
Discussion: https://github.com/greenplum-db/gpdb/pull/6384
-
Committed by Heikki Linnakangas
Extract the first key column's Datum only once, to avoid the memtuple_getattr / heap_getattr overhead. This is the same optimization we have in tuplesort.c.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Heikki Linnakangas
Avoid palloc+memcpy for "TC_WHOLE" tuples.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Heikki Linnakangas
More modern, and faster too.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
Htupfifo is used when parsing incoming messages from the interconnect. Each UDP message consists of a number of tuple chunks, which form a number of tuples. When one incoming message is parsed, the htupfifo holds the tuples formed from that single message. Typically, each message contains roughly the same number of tuples, so it is wasteful to palloc/pfree a list node for every tuple; that is why the FIFO has a free list. However, with narrow tuples one message can contain hundreds of tuples, far more than the built-in maximum free list size (10), so in practice the free list was almost never enough to cover the need. The message size puts a natural limit on how large the FIFO can grow, so I don't think we need a limit on the free list size. Just let it grow as large as needed, and avoid the palloc/pfree overhead.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
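As a rough illustration (this is not the actual htupfifo C code; the names are invented), the unbounded-free-list idea looks like this: nodes popped off the FIFO go onto a free list and are reused by later appends, so a steady stream of similar-sized messages stops allocating after the first one.

```python
class _Node:
    """A list node that can be recycled instead of freed."""
    __slots__ = ("value", "next")

    def __init__(self):
        self.value = None
        self.next = None


class HtupFifo:
    def __init__(self):
        self.head = None
        self.tail = None
        self.free = None       # unbounded free list (the commit removed the cap of 10)
        self.allocations = 0   # count real allocations, for illustration

    def _get_node(self):
        # Reuse a recycled node if one is available; allocate otherwise.
        if self.free is not None:
            node = self.free
            self.free = node.next
            return node
        self.allocations += 1
        return _Node()

    def append(self, tup):
        node = self._get_node()
        node.value, node.next = tup, None
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node

    def pop(self):
        node = self.head
        if node is None:
            return None
        self.head = node.next
        if self.head is None:
            self.tail = None
        tup, node.value = node.value, None
        # Recycle the node onto the free list instead of freeing it.
        node.next, self.free = self.free, node
        return tup
```

With roughly equal-sized messages, the allocation count stays at the size of the largest message seen so far, which is the effect the commit is after.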
-
Committed by Heikki Linnakangas
It was dead code, and we have no plans to resurrect it.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Heikki Linnakangas
The non-MK code was abusing the child's result tuple slot. The corresponding MK code was changed back in 2010 to not do that, but that commit missed the non-MK version. This caused a segfault in the 'subselect' regression test. Apparently no one has run the regression tests with 'gp_enable_motion_mk_sort=off' recently.
Reviewed-by: Pengzhou Tang <ptang@pivotal.io>
Reviewed-by: Gang Xiong <gxiong@pivotal.io>
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
The 9.1 merge removed the code that incremented 'num_consec_csv_err'. Since then it was only ever initialized to 0, so all the related code was dead. I'm not sure how we now behave with the kinds of errors that used to trigger this code, but I don't see a need to treat them specially; the generic error-handling code should cope with them. The GUC that controlled the max CSV line length was already removed in commit 6ac29fdc.
-
Committed by Heikki Linnakangas
-
Committed by Heikki Linnakangas
They were only used to pass the information to cdbCopyStart(). Better to pass them directly as arguments.
-
Committed by Heikki Linnakangas
Simpler to just pick a constant starting size. It is probably faster, too, to let the hash table expand as needed than to work hard upfront on a good guess.
-
Committed by Heikki Linnakangas
-
Committed by Ning Yu
The following WARNING is generated by ANALYZE when some sample tuples come from segments outside the [0, numsegments-1] range; however, this does not indicate that the data distribution is wrong. Take inherited tables for example: when an inherited table has a greater numsegments than its parent, this WARNING is raised, and that is expected. This can happen routinely in the random_numsegments pipeline job, so ignore this WARNING:

    WARNING: table "patest0" contains rows in segment 2, which is outside the # of segments for the table's policy (2 segments)

Added this pattern to init_file to ignore it.
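As a hedged sketch (the actual init_file matching syntax is not shown in this log), the ignore pattern amounts to filtering out any output line that matches the expected WARNING while keeping everything else:

```python
import re

# Hypothetical helper mirroring the init_file idea: a regex for the
# expected ANALYZE warning, so matching test-output lines are dropped
# rather than reported as a diff.
IGNORE_PATTERN = re.compile(
    r'WARNING:\s+table ".+" contains rows in segment \d+, '
    r"which is outside the # of segments for the table's policy \(\d+ segments\)"
)


def filter_expected_warnings(lines):
    """Drop lines matching the expected-warning pattern; keep the rest."""
    return [line for line in lines if not IGNORE_PATTERN.search(line)]
```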
-
Committed by Ning Yu
It was placed under the JOIN group, but as it is a regression test it has been moved to the REGRESSION group.
-
Committed by Ning Yu
This new pipeline job runs the ICG tests with all tables created with a random numsegments; the purpose is to improve test coverage of MxN queries. Some tests are disabled because their behavior and/or output depends on data distribution; in general they fall into these groups:
- tests that check gp_segment_id;
- tests that contain EXCHANGE PARTITION, which requires the two tables to have the same numsegments;
- tests that turn on `test_print_direct_dispatch_info`, whose output depends on numsegments.
Some tests could pass by adding enforcement or alignment of numsegments, but we do not want to include that kind of hack in those tests for now. Some tests can be re-enabled later if we can make them stable without changing them.
-
Committed by Ning Yu
CREATE TABLE always sets numsegments to DEFAULT; however, when there is a DISTRIBUTED BY clause it might already contain a valid numsegments. This cannot happen in a user-typed CREATE TABLE statement because there is no syntax to specify numsegments; so far the only way for it to happen is through the internal command constructed by reorganization, which may be a CTAS or a (CREATE + INSERT), both of which pass the original numsegments via DISTRIBUTED BY. The bug was that we only accepted the numsegments passed by CTAS, not the other. The (CREATE + INSERT) command is only constructed in 3 cases:
1. the original table contains dropped column(s);
2. the original table is AOCO;
3. the original table is AO with index(es).
Fixed and added tests.
-
Committed by Ning Yu
When creating a partition table we want the children to have the same numsegments as the parent. Since they all set their numsegments to DEFAULT, does this meet our expectation? No, because DEFAULT does not always equal itself: when DEFAULT is set to RANDOM, a different value is returned each time. So we have to align numsegments explicitly. Also removed an incorrect assert and comment.
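A toy model (names and modes here are hypothetical, not GPDB internals) of why the children must copy the parent's resolved value rather than each re-evaluating DEFAULT:

```python
import random


def resolve_default_numsegments(mode, cluster_size, rng):
    """Resolve DEFAULT once. Under "RANDOM", each call may differ."""
    if mode == "FULL":
        return cluster_size
    if mode == "RANDOM":
        return rng.randint(1, cluster_size)
    raise ValueError("unknown mode: %s" % mode)


def create_partitioned_table(mode, cluster_size, nchildren, rng):
    # Resolve DEFAULT once for the parent...
    parent = resolve_default_numsegments(mode, cluster_size, rng)
    # ...then align the children with the parent explicitly, instead of
    # letting each child evaluate DEFAULT on its own.
    children = [parent] * nchildren
    return parent, children
```

Had each child called resolve_default_numsegments itself under "RANDOM", the children could end up with numsegments different from the parent, which is exactly the mismatch the commit avoids.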
-
Committed by Daniel Gustafsson
The pass statement is a no-op intended for empty classes and other kinds of stubs. When it follows other statements in a block it has no purpose. Remove all such occurrences.
Reviewed-by: Jacob Champion <pchampion@pivotal.io>
-
Committed by Adam Berlin
-
Committed by Daniel Gustafsson
The dump_tmc() function is just shorthand for running dump_mc() on TopMemoryContext. Since dumping MemoryContexts has no call sites in the code and is only available in a debugger, it seems a bit rich to spend object file space to save a few keystrokes in GDB. This removes dump_tmc() in favour of just using dump_mc().
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Daniel Gustafsson
SwitchedMemoryContext is not an actual new context type, but merely a set of convenience functions wrapping allocation and a context switch into a single call. It was only used at two call sites, and caused pointless merge conflicts with upstream (not to mention making the code harder to read, since it is a new concept compared to what we are all used to). This removes it and replaces it with normal memory context calls.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Daniel Gustafsson
The floor_log2_Size() and ceil_log2_Size() functions were used by the old malloc()-based abstract memory context, which was removed in the 5.x cycle. They are now unused, so remove them.
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Jacob Champion
On platforms where long is 8 bytes, strtoul() is fine. Otherwise we try to use strtoull(), and fail to compile if that doesn't exist.
Co-authored-by: Jim Doty <jdoty@pivotal.io>
-
Committed by Jacob Champion
Per suggestion from Daniel.
Co-authored-by: Jim Doty <jdoty@pivotal.io>
-
Committed by Jacob Champion
The system identifiers copied from the master need to be made unique so that mirrors cannot accidentally connect to the wrong primaries. We do this with the --system-identifier option to pg_resetxlog introduced in the previous commit. We have also added a test to verify that unique system identifiers are assigned to the segments.
Co-authored-by: Shoaib Lari <slari@pivotal.io>
Co-authored-by: Jim Doty <jdoty@pivotal.io>
-
Committed by Jacob Champion
For upgrade, we need to reset the database system identifier in the pg_control file for the segments; the --system-identifier option lets us do that. Per Daniel's suggestion, move all GPDB-specific additions to long options; the -y option is now --binary-upgrade. We based the new long-options machinery on e22b27f0 from upstream, which adds long options to pg_resetwal; this should make it easier to merge later.
Co-authored-by: Jim Doty <jdoty@pivotal.io>
Co-authored-by: Shoaib Lari <slari@pivotal.io>
-
Committed by Abhijit Subramanya
-
Committed by Jesse Zhang
Given a database in which the "public" schema does not have its default OID 2200 (e.g. it was dropped and recreated), pg_upgrade would ignore the previous OID and use the built-in "public" schema at restoration, because creation of the "public" schema is skipped during dump / restore. Add a binary_upgrade field to RestoreOptions to teach pg_dump / pg_restore not to skip the public schema during binary upgrade. This adds a new "--binary-upgrade" option to pg_restore.
Co-authored-by: Jacob Champion <pchampion@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
-
Committed by Ivan Leskin
Use the value of the gpstart '-B' ('--parallel') option to limit the maximum size of the thread pool used by gpsegstart to create segments.
-
Committed by Ivan Leskin
Put a limit on the number of workers in the pool created by gpstart. When there are many segments on one server node, an unlimited number of workers may open an unlimited number of pipes. This can lead to a failure of the 'select()' call in 'gpMgmt/bin/gppylib/gpsubprocess.py' and an inability to start the cluster at all. This commit prevents the described problem from happening.
Closes https://github.com/greenplum-db/gpdb/issues/6176
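A minimal sketch of the idea, with a plain thread pool standing in for gpstart's real worker machinery in gppylib (names here are illustrative): cap the number of concurrent workers so descriptor usage stays bounded no matter how many segments exist.

```python
from multiprocessing.pool import ThreadPool


def start_segments(segments, max_workers):
    """Start all segments using at most max_workers concurrent workers.

    Capping the pool bounds the number of pipes open at any moment,
    instead of spawning one worker (and its pipes) per segment.
    """
    workers = max(1, min(max_workers, len(segments)))
    with ThreadPool(processes=workers) as pool:
        # Each worker would normally launch a segment startup process;
        # here we just echo the segment id.
        return pool.map(lambda seg: ("started", seg), segments)
```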
-
- 06 Dec 2018, 7 commits
-
-
Committed by Daniel Gustafsson
Discussion: https://github.com/greenplum-db/gpdb/pull/6410
Reviewed-by: Heikki Linnakangas
-
Committed by Sambitesh Dash
-
Committed by Ning Yu
GPDB assigns 100KB to each non-memory-intensive operator, and reports an error if there is not enough memory reserved for the operators. In low-memory resource groups it is easy to trigger this error with large queries. Improved by always assigning at least 100KB of memory to every operator, memory-intensive or not; the memory can be allocated from shared memory, and an OOM error is raised only if there is not enough shared memory.
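A simplified model of the improved accounting (this is not the actual resource-group code; the function, names, and units are illustrative): every operator gets at least 100KB, and when the group's own quota cannot cover that minimum, the shortfall is taken from shared memory, failing only when shared memory is also exhausted.

```python
OPERATOR_MIN_KB = 100  # minimum reservation per operator, per the commit


def assign_operator_memory(quota_kb, shared_kb, noperators):
    """Return (per-operator KB, remaining shared KB) for a query's operators.

    If the group quota covers the minimum, split the quota evenly;
    otherwise take the shortfall from shared memory, and raise an OOM
    error only when shared memory cannot cover it either.
    """
    need = OPERATOR_MIN_KB * noperators
    if need <= quota_kb:
        return quota_kb // noperators, shared_kb
    shortfall = need - quota_kb
    if shortfall > shared_kb:
        raise MemoryError("not enough shared memory for operator reservations")
    return OPERATOR_MIN_KB, shared_kb - shortfall
```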
-
Committed by Tang Pengzhou
* Disable CURRENT OF for replicated tables

Let's list the plans using CURRENT OF:

    explain delete from t1 WHERE CURRENT OF c1;
                            QUERY PLAN
    -----------------------------------------------------------
     Delete on t1  (cost=0.00..100.01 rows=1 width=10)
       ->  Tid Scan on t1  (cost=0.00..100.01 rows=1 width=10)
             Output: ctid, gp_segment_id
             TID Cond: CURRENT OF c1
             Filter: CURRENT OF c1

    explain UPDATE r1 SET b = 3 WHERE CURRENT OF c1;
                            QUERY PLAN
    -----------------------------------------------------------
     Update on r1  (cost=0.00..100.01 rows=1 width=18)
       ->  Tid Scan on r1  (cost=0.00..100.01 rows=1 width=18)
             Output: ctid, gp_segment_id
             TID Cond: CURRENT OF c1
             Filter: CURRENT OF c1

(ctid + gp_segment_id) can identify a unique row in a hash-distributed or randomly distributed table, but for a replicated table it only identifies a row in one replica, so the other replicas would not be updated or deleted properly. There is no proper way to make CURRENT OF work for replicated tables, so just disable it for now.

Updatable views work fine because all the operations are designed to execute locally: rows of each replica never flow to other segments, and each replica sees the same data flow. For example:

    create view v_r2 as select * from r2 where c2 = 3;
    explain update v_r2 set c2 = 4 from p1 where p1.c2=v_r2.c2;
                            QUERY PLAN
    -----------------------------------------------------------
     Update on r2
       ->  Nested Loop
             ->  Seq Scan on r2
                   Filter: (c2 = 3)
             ->  Materialize
                   ->  Broadcast Motion 3:3
                         ->  Seq Scan on p1
                               Filter: (c2 = 3)

As the plan shows, p1 is broadcast, so all replicas of the replicated table operate locally.
-
Committed by Lisa Owen
* docs - pxf CLI refactor, add ref pages for pxf, pxf cluster
* display pxf cluster first in ref landing page
* remedy -> solution
* remove java 1.7
* remove commands that source greenplum_path.sh
* edits requested by david
* fix link
* update OSS book for new ref pages, add jdbc, edits
-
Committed by dyozie
-
Committed by Mel Kiyama
* docs - ANALYZE - clarify using analyze on partitioned tables.
* docs - ANALYZE - review comment updates
-