- 31 Jul 2017: 1 commit
-
-
Committed by Ming LI
Support COPY statements that import data files on segments directly, in parallel. This can be used to import data files generated by "COPY ... TO ... ON SEGMENT". This commit also supports all data file formats that "COPY ... TO" supports, processes reject limit numbers, and logs errors accordingly.

Key workflow:
a) For COPY FROM, nothing is changed by this commit: dispatch the modified COPY command to the segments first, then read the data file on the master and dispatch the data to the relevant segment for processing.
b) For COPY FROM ON SEGMENT: on the QD, read a dummy data file, with the other parts unchanged; on the QE, first process the (empty) data stream dispatched from the QD, then re-do the same workflow to read and process the local segment data file.

Signed-off-by: Ming LI <mli@pivotal.io>
Signed-off-by: Adam Lee <ali@pivotal.io>
Signed-off-by: Haozhou Wang <hawang@pivotal.io>
Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
-
- 29 Jul 2017: 6 commits
-
-
Committed by Marbin Tan
- Sort arguments alphabetically
- Add -I and -O descriptions
-
Committed by David Yozie
-
Committed by dyozie
-
Committed by dyozie
-
Committed by David Yozie
-
Committed by Chris Hajas
Signed-off-by: Karen Huddleston <khuddleston@pivotal.io>
-
- 28 Jul 2017: 6 commits
-
-
Committed by Lisa Owen
* docs - resource groups catalog additions/updates
* address review comments from gang, mel
* clarify rsgqueueuduration - for current query, not all
-
Committed by Chuck Litzell
* De-conflate gpperfmon & gpcc. Conditionalize gpcc. Remove unreferenced topics.
* Update GPAdminGuide.ditaval: remove addition to Admin Guide ditaval.
* Fixes from review.
-
Committed by Chris Hajas
-
Committed by Marbin Tan
Signed-off-by: Xin Zhang <xzhang@pivotal.io>
-
Committed by Shoaib Lari
Signed-off-by: Tushar Dadlani <tdadlani@pivotal.io>
Signed-off-by: Larry Hamel <lhamel@pivotal.io>
-
Committed by Larry Hamel
Add a behave test to verify that the checksum configuration is preserved after a segment is recovered using gprecoverseg.
Signed-off-by: Shoaib Lari <slari@pivotal.io>
-
- 27 Jul 2017: 11 commits
-
-
Committed by Kris Macoskey
This allows a user to cancel debug_sleep and be assured that ccp_destroy will still clean up any created clusters.
-
Committed by Ashwin Agrawal
Commit d50f429c added the xlog lock record, but missed tuning it for Greenplum, which needs to add persistent table information. This caused a failure during recovery with the FATAL message "xlog record with zero persistenTID". Use xl_heaptid_set(), which calls `RelationGetPTInfo()`, to make sure PT info is populated for the xlog record.
-
Committed by Pengzhou Tang
The 'insufficient memory reserved' issue has existed for a long time. The root cause is that the default statement_mem (125MB) is not enough for the queries used by the gpcheckcat script when the regression database is huge. This commit adds STATEMENT_MEM in demo_cluster.sh to initialize gpdb with the required statement_mem, and sets statement_mem to 225MB in common.bash.
-
Committed by Adam Lee
It's useful and important for debugging.
-
Committed by Asim R P
-
Committed by Asim R P
The gp_inject_fault() function is now available in pg_regress, so a contrib module is not required. The test was not being run because it trips an assertion, so it is not added to greenplum_schedule.
-
Committed by Asim R P
-
Committed by Asim R P
The function gp_inject_fault() was defined in a test-specific contrib module (src/test/dtm). It is moved to a dedicated contrib module, gp_inject_fault, so all tests can now make use of it. Two pg_regress tests (dispatch and cursor) are modified to demonstrate the usage. The function is modified so that it can inject a fault in any segment, specified by dbid; no more invoking the gpfaultinjector python script from SQL files. The new module is integrated into the top-level build so that it is included in make and make install.
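The idea behind a fault-injection function like this can be sketched in a few lines: tests arm a named fault point, and instrumented code paths check for it at predefined places. The names and API below are purely illustrative, not Greenplum's actual gp_inject_fault interface.

```python
# Hypothetical sketch of the fault-injection pattern: a registry of named
# faults that instrumented code consults at predefined points.
_faults = {}

def inject_fault(name, action="error", occurrences=1):
    """Arm a named fault point (called from test code)."""
    _faults[name] = {"action": action, "left": occurrences}

def check_fault(name):
    """Called at an instrumented point; returns the action if armed, else None."""
    fault = _faults.get(name)
    if fault is None or fault["left"] <= 0:
        return None
    fault["left"] -= 1
    return fault["action"]

def dispatch_query():
    # An instrumented code path: fail here if the fault is armed.
    if check_fault("before_dispatch") == "error":
        raise RuntimeError("injected fault: before_dispatch")
    return "dispatched"

inject_fault("before_dispatch")
try:
    dispatch_query()
except RuntimeError as e:
    print(e)                 # the fault fires once...
print(dispatch_query())      # ...then the path runs normally
```

The per-segment targeting described in the commit (injecting by dbid) would sit on top of such a registry, selecting which process arms the fault.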
-
Committed by Jesse Zhang
SharedInputScan (a.k.a. "Shared Scan" in EXPLAIN) is the operator through which Greenplum implements Common Table Expression execution. It executes in two modes: writer (a.k.a. producer) and reader (a.k.a. consumer). Writers execute the common table expression definition and materialize the output, and readers read the materialized output (potentially in parallel). Because of the parallel nature of Greenplum execution, slices containing Shared Scans need to synchronize among themselves to ensure that readers don't start until writers are finished writing. Specifically, a slice with readers depending on writers on a different slice will block during `ExecutorRun`, before even pulling the first tuple from the executor tree.

Greenplum's Hash Join implementation will skip executing its outer ("probe side") subtree if it detects an empty inner ("hash side"), and declare all motions in the skipped subtree as "stopped" (we call this "squelching"). That means we can potentially squelch a subtree that contains a Shared Scan writer, leaving cross-slice readers waiting forever.

For example, with ORCA enabled, the following query:

```sql
CREATE TABLE foo (a int, b int);
CREATE TABLE bar (c int, d int);
CREATE TABLE jazz(e int, f int);

INSERT INTO bar VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO jazz VALUES (2, 2), (3, 3);

ANALYZE foo;
ANALYZE bar;
ANALYZE jazz;

SET statement_timeout = '15s';

SELECT * FROM
  ( WITH cte AS (SELECT * FROM foo)
    SELECT * FROM (SELECT * FROM cte UNION ALL SELECT * FROM cte) AS X
    JOIN bar ON b = c
  ) AS XY
  JOIN jazz on c = e AND b = f;
```

leads to a plan that will expose this problem:

```
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..2155.00 rows=1 width=24)
   ->  Hash Join  (cost=0.00..2155.00 rows=1 width=24)
         Hash Cond: bar.c = jazz.e AND share0_ref2.b = jazz.f AND share0_ref2.b = jazz.e AND bar.c = jazz.f
         ->  Sequence  (cost=0.00..1724.00 rows=1 width=16)
               ->  Shared Scan (share slice:id 2:0)  (cost=0.00..431.00 rows=1 width=1)
                     ->  Materialize  (cost=0.00..431.00 rows=1 width=1)
                           ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=8)
               ->  Hash Join  (cost=0.00..1293.00 rows=1 width=16)
                     Hash Cond: share0_ref2.b = bar.c
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..862.00 rows=1 width=8)
                           Hash Key: share0_ref2.b
                           ->  Append  (cost=0.00..862.00 rows=1 width=8)
                                 ->  Shared Scan (share slice:id 1:0)  (cost=0.00..431.00 rows=1 width=8)
                                 ->  Shared Scan (share slice:id 1:0)  (cost=0.00..431.00 rows=1 width=8)
                     ->  Hash  (cost=431.00..431.00 rows=1 width=8)
                           ->  Table Scan on bar  (cost=0.00..431.00 rows=1 width=8)
         ->  Hash  (cost=431.00..431.00 rows=1 width=8)
               ->  Table Scan on jazz  (cost=0.00..431.00 rows=1 width=8)
                     Filter: e = f
 Optimizer status: PQO version 2.39.1
(20 rows)
```

where processes executing slice1 on the segments that have an empty `jazz` will hang. We fix this by ensuring we execute the Shared Scan writer even if it's in the subtree that we're squelching.

Signed-off-by: Melanie Plageman <mplageman@pivotal.io>
Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
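The shape of the fix can be sketched abstractly: when a subtree is squelched, walk it anyway and still run any node that produces output for readers in other slices. The classes and names below are ours, a simplified model rather than Greenplum's executor code.

```python
# Illustrative model of "squelch, but still run cross-slice producers":
# skipping a subtree must not skip Shared Scan writers, or readers in
# other slices will wait on them forever.

class Node:
    def __init__(self, name, is_shared_scan_writer=False, children=()):
        self.name = name
        self.is_shared_scan_writer = is_shared_scan_writer
        self.children = list(children)

def squelch(node, executed):
    """Mark a subtree as stopped, but still execute Shared Scan writers."""
    if node.is_shared_scan_writer:
        executed.append(node.name)   # materialize CTE output for readers
    for child in node.children:
        squelch(child, executed)

# The hash side turned out empty, so the join skips its probe side --
# but the writer inside that subtree still runs.
probe_side = Node("Append", children=[
    Node("SharedScan-writer", is_shared_scan_writer=True),
    Node("SharedScan-reader"),
])
ran = []
squelch(probe_side, ran)
print(ran)  # ['SharedScan-writer']
```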
-
Committed by Andres Freund
When heap_update needs to look for a page for the new tuple version, because the current one doesn't have sufficient free space, or when columns have to be processed by the tuple toaster, it has to release the lock on the old page during that. Otherwise there'd be lock ordering and lock nesting issues. To prevent concurrent sessions from trying to update / delete / lock the tuple while the page's content lock is released, the tuple's xmax is set to the current session's xid.

That unfortunately was done without any WAL logging, thereby violating the rule that no XIDs may appear on disk without a corresponding WAL record. If the database were to crash / fail over when the page-level lock is released, and some activity led to the page being written out to disk, the xid could end up being reused, potentially leading to the row becoming invisible. There might be additional risks from not having t_ctid point at the tuple itself, without having set the appropriate lock infomask fields.

To fix, compute the appropriate xmax/infomask combination for locking the tuple, and perform WAL logging using the existing XLOG_HEAP_LOCK record. That allows the fix to be backpatched.

This issue has existed for a long time. There appear to have been partial attempts at preventing such dangers, but these were never fully implemented, and were removed a long time ago, in 11919160 (cf. HEAP_XMAX_UNLOGGED). In master / 9.6, there's an additional issue, namely that the visibilitymap's freeze bit isn't reset at that point yet. Since that's a new issue, introduced only in a892234f, it'll be fixed in a separate commit.

Author: Masahiko Sawada and Andres Freund
Reported-By: Different aspects by Thomas Munro, Noah Misch, and others
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: 9.1/all supported versions
-
Committed by Karen Huddleston
This file contains a list of schema-qualified table names in the backup set. It is not used in the restore process; it is there solely to allow users to determine which tables were dumped in that backup set.
Signed-off-by: Jamie McAtamney <jmcatamney@pivotal.io>
Signed-off-by: Chris Hajas <chajas@pivotal.io>
-
- 26 Jul 2017: 1 commit
-
-
Committed by Chuck Litzell
* Remove statement advertising gp-wlm resource queue support and delete errant file.
* Removes statement about WLM managing resource queues.
-
- 25 Jul 2017: 6 commits
-
-
Committed by Daniel Gustafsson
Make local functions static and include prototypes. This fixes a multitude of clang warnings for missing prototypes like this one:

```
gpcheckcloud.cpp:32:6: warning: no previous prototype for function 'registerSignalHandler' [-Wmissing-prototypes]
void registerSignalHandler() {
     ^
```
-
Committed by Daniel Gustafsson
Since Travis CI was upgraded to use Ubuntu Trusty as the base Linux [0], libevent-dev is no longer available out of the box. Install it manually with the Apt addon, since we don't want to use sudo due to longer instance boot-up times. While at it, remove the macOS section, since we never actually supported Travis for macOS builds (it was a leftover from an attempt).

[0] https://blog.travis-ci.com/2017-07-11-trusty-as-default-linux-is-coming
-
Committed by Ivan Novick
-
Committed by Ning Yu
* Fix the resgroup assert failure on CREATE INDEX CONCURRENTLY syntax. When resgroup is enabled, an assertion failure is encountered with the case below:

```sql
SET gp_create_index_concurrently TO true;
DROP TABLE IF EXISTS concur_heap;
CREATE TABLE concur_heap (f1 text, f2 text, dk text) distributed by (dk);
CREATE INDEX CONCURRENTLY concur_index1 ON concur_heap(f2,f1);
```

The root cause is that we assumed on the QD that a command is dispatched to QEs when assigned to a resgroup, but this is false with CREATE INDEX CONCURRENTLY syntax. To fix it we have to make the necessary checks and cleanup on the QEs.

* Do not assign a resource group in the SIGUSR1 handler. When assigning a resource group on the master, it might call WaitLatch() to wait for a free slot. However, as WaitLatch() expects to be woken by the SIGUSR1 signal, it will run into endless waiting when SIGUSR1 is blocked. One scenario is the catch-up handler: it is triggered and executed directly in the SIGUSR1 handler, so during its execution SIGUSR1 is blocked. And as the catch-up handler begins a transaction, it will try to assign a resource group and trigger the endless waiting. To fix this we add a check to not assign a resource group when running inside the SIGUSR1 handler. As signal handlers are supposed to be light, short, and safe, skipping the resource group in such a case is reasonable.
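The second fix boils down to a guard flag: work that may block waiting on SIGUSR1 must not run while we are inside the SIGUSR1 handler, because the signal is blocked there and the wait would never be woken. A minimal sketch of that pattern (our names, not Greenplum's code):

```python
# Guard-flag pattern: skip blocking work when executing inside the
# signal handler whose signal would be needed to wake the wait.
in_sigusr1_handler = False

def assign_resource_group():
    if in_sigusr1_handler:
        return "skipped"          # don't WaitLatch() with SIGUSR1 blocked
    return "assigned"             # normal path: may wait for a free slot

def catchup_handler():
    """Runs inside the SIGUSR1 handler, so resgroup assignment is skipped."""
    global in_sigusr1_handler
    in_sigusr1_handler = True
    try:
        return assign_resource_group()
    finally:
        in_sigusr1_handler = False

print(assign_resource_group())  # assigned
print(catchup_handler())        # skipped
```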
-
Committed by Jane Beckman
* Initial updates for Best Practices
* Additional comments from review
* Changes from Craig Sylvester
* David's comments from PR, changes to code output from reviewers
* Review tweaks
-
Committed by Chuck Litzell
* gpperfmon overview improvements
* Add a link to the log rotation section.
* Edits from review
-
- 24 Jul 2017: 2 commits
-
-
Committed by xiong-gang
The issue of hanging on recv() in internal_cancel() has been reported several times: the socket status is shown as 'ESTABLISHED' on the master, while the peer process on the segment has already exited. We are not sure exactly how this happens, but we are able to simulate the hang by dropping packets or rebooting the system on the segment. This patch uses poll() to do a non-blocking recv() in internal_cancel(). The timeout of poll() is set to the max value of authentication_timeout to make sure the process on the segment has already exited before attempting another retry, and we expect the retry on connect() to detect the network issue.
Signed-off-by: Ning Yu <nyu@pivotal.io>
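The approach in this patch, simplified: instead of a blocking recv() that can hang forever on a half-dead connection, poll the socket with a timeout and give up if no data arrives. A sketch of that idiom (in Python rather than the C of internal_cancel()):

```python
# Poll-with-timeout before recv(): avoids hanging on a connection whose
# peer has died without the socket being torn down.
import socket
import select

def recv_with_timeout(sock, nbytes, timeout_ms):
    """Return received bytes, or None if the peer sent nothing in time."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    if not poller.poll(timeout_ms):   # no readable event before the timeout
        return None                   # give up instead of blocking in recv()
    return sock.recv(nbytes)

a, b = socket.socketpair()
print(recv_with_timeout(a, 16, 50))   # peer is silent: returns None
b.sendall(b"cancel-ack")
print(recv_with_timeout(a, 16, 50))   # data is ready: returns b'cancel-ack'
a.close(); b.close()
```

In the real patch the timeout is authentication_timeout's maximum, large enough that the segment-side process is guaranteed to have exited before a retry.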
-
Committed by Zhenghua Lyu
In the past, we used the hard-coded path "/sys/fs/cgroup" as the cgroup mount point. This can be wrong when 1) running on old kernels or 2) the customer has special cgroup mount points. Now we detect the mount point at runtime by checking /proc/self/mounts.
Signed-off-by: Ning Yu <nyu@pivotal.io>
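The detection described above amounts to scanning mounts(5)-format lines ("device mountpoint fstype options ...") for cgroup filesystems. A sketch of that parsing, run here on a sample string rather than the live /proc/self/mounts:

```python
# Find cgroup mount points by parsing mounts(5)-format lines instead of
# hard-coding /sys/fs/cgroup. Fields: device, mount point, fstype, options...
def find_cgroup_mounts(mounts_text):
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in ("cgroup", "cgroup2"):
            points.append(fields[1])
    return points

sample = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /custom/cgroup/memory cgroup rw,relatime,memory 0 0
"""
print(find_cgroup_mounts(sample))
# ['/sys/fs/cgroup/cpu', '/custom/cgroup/memory']
```

Real code would read the file with `open("/proc/self/mounts")` and typically filter further by the controller named in the mount options.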
-
- 22 Jul 2017: 7 commits
-
-
Committed by Jim Doty
Remove the file that is not used in the commit. First, although this file will be used very soon, it will not show up in the commit where it is used. Also, the team may want to consider using the pattern demonstrated in run_behave_tests.sh, where the script is refactored to run on the master.
Signed-off-by: Divya Bhargov <dbhargov@pivotal.io>
-
Committed by Jim Doty
The source code for gpdb is being copied to the master node, so there is no need to run scripts in the container that can be run directly on the master node. This makes scripts in the GPDB source tree depend not on the connection the CI system has to the cluster, just on your test running on a cluster.
Signed-off-by: Divya Bhargov <dbhargov@pivotal.io>
-
Committed by Kris Macoskey
The instance size specified in the ccp_options anchor does not have to match the one in the ccp_destroy anchor; the only requirement is that the field exists. With this revelation, there is no need for test-specific anchors defined in ccp_options. Instead, the pattern is to not use a ccp_options anchor, but to explicitly set the desired instance type for a given test.
Signed-off-by: Jingyi Mei <jmei@pivotal.io>
-
Committed by Tom Meyer
Signed-off-by: Karen Huddleston <khuddleston@pivotal.io>
Signed-off-by: Chris Hajas <chajas@pivotal.io>
-
Committed by Asim R P
regress.c cannot include fmgroids.h because that header file is generated during the build process. The ICW jobs in CI check out the gpdb source code and run make from within src/test/regress, which fails to find fmgroids.h. It seems we need a dedicated contrib module for gp_inject_fault.

This reverts commit bd26a268.
-