- 07 December 2018, 4 commits
-
-
Committed by Abhijit Subramanya
-
Committed by Jesse Zhang

Given a database in which the "public" schema does not have its default OID 2200 (e.g. it was dropped and recreated), pg_upgrade would ignore the previous OID and use the built-in "public" schema at restoration. This is because creation of the "public" schema is skipped in dump / restore. Add a binary_upgrade field to RestoreOptions, to teach pg_dump / pg_restore not to skip the public schema during binary upgrade. This adds a new "--binary-upgrade" option to pg_restore. Co-authored-by: Jacob Champion <pchampion@pivotal.io> Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
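A toy sketch of the restore-time decision described above, in Python (all names here are illustrative stand-ins; the real logic is C code inside pg_dump / pg_restore):

```python
# Hypothetical sketch only: mirrors the skip rule described in the commit
# message, not the actual pg_restore implementation.
PUBLIC_SCHEMA_DEFAULT_OID = 2200

def should_create_public_schema(schema_name, binary_upgrade):
    """In a plain restore, 'public' already exists and its creation is
    skipped. During a binary upgrade (--binary-upgrade) it must be recreated
    so it keeps its previous OID instead of the built-in 2200."""
    if schema_name != "public":
        return True        # non-public schemas are always created
    return binary_upgrade  # recreate 'public' only during binary upgrade

# A database whose public schema was dropped and recreated (OID != 2200):
print(should_create_public_schema("public", binary_upgrade=True))   # True
print(should_create_public_schema("public", binary_upgrade=False))  # False
```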
-
Committed by Ivan Leskin

Use the gpstart '-B' ('--parallel') option value to limit the maximum size of the thread pool used by gpsegstart to create segments.
-
Committed by Ivan Leskin

Put a limit on the number of workers in a pool created by gpstart. When there are many segments on one server node, an unlimited number of workers may open an unlimited number of pipes. This can cause the 'select()' call in 'gpMgmt/bin/gppylib/gpsubprocess.py' to fail, making it impossible to start the cluster at all. This commit prevents the described problem. Closes https://github.com/greenplum-db/gpdb/issues/6176
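A minimal sketch of the capping, assuming hypothetical helper names (the real code lives in gpMgmt's Python utilities):

```python
# Hypothetical sketch: cap the worker pool at the gpstart -B value so a host
# with many segments cannot spawn one worker (and its pipes) per segment,
# which could exceed the file-descriptor limit select() can handle.
from multiprocessing.pool import ThreadPool

def start_one_segment(seg):
    return "started %s" % seg

def start_segments(segments, parallel_limit):
    pool_size = min(len(segments), parallel_limit)  # the fix: bounded pool
    with ThreadPool(processes=pool_size) as pool:
        return pool.map(start_one_segment, segments)

if __name__ == "__main__":
    segs = ["seg%d" % i for i in range(100)]
    print(len(start_segments(segs, parallel_limit=16)))  # 100 results, 16 workers
```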
-
- 06 December 2018, 9 commits
-
-
Committed by Daniel Gustafsson
Discussion: https://github.com/greenplum-db/gpdb/pull/6410 Reviewed-by: Heikki Linnakangas
-
Committed by Sambitesh Dash
-
Committed by Ning Yu

GPDB assigns 100KB to each non-memory-intensive operator, and reports an error if there is not enough memory reserved for the operators. In low-memory resource groups it is easy to trigger this error with large queries. Improve this by always assigning at least 100KB of memory to each operator, whether memory-intensive or not; the memory can be allocated from shared memory, and an OOM error is raised if there is not enough shared memory.
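A toy sketch of that assignment policy (names and bookkeeping are hypothetical, not the actual resource group implementation):

```python
# Hypothetical sketch: every operator gets at least the 100KB floor, with any
# shortfall drawn from the group's shared memory rather than erroring out.
OPERATOR_MIN_KB = 100

def assign_operator_memory(requested_kb, quota_left_kb, shared_left_kb):
    grant_kb = max(requested_kb, OPERATOR_MIN_KB)
    if grant_kb <= quota_left_kb:
        return grant_kb, quota_left_kb - grant_kb, shared_left_kb
    shortfall = grant_kb - quota_left_kb
    if shortfall > shared_left_kb:
        raise MemoryError("out of shared resource group memory")  # OOM error
    return grant_kb, 0, shared_left_kb - shortfall

# A group with only 50KB of quota left still grants the 100KB floor:
print(assign_operator_memory(0, quota_left_kb=50, shared_left_kb=1024))
```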
-
Committed by Tang Pengzhou

* Disable CURRENT OF for replicated tables

Let's list the plans using CURRENT OF:

explain delete from t1 WHERE CURRENT OF c1;
                         QUERY PLAN
-----------------------------------------------------------
 Delete on t1  (cost=0.00..100.01 rows=1 width=10)
   ->  Tid Scan on t1  (cost=0.00..100.01 rows=1 width=10)
         Output: ctid, gp_segment_id
         TID Cond: CURRENT OF c1
         Filter: CURRENT OF c1

explain UPDATE r1 SET b = 3 WHERE CURRENT OF c1;
                         QUERY PLAN
-----------------------------------------------------------
 Update on r1  (cost=0.00..100.01 rows=1 width=18)
   ->  Tid Scan on r1  (cost=0.00..100.01 rows=1 width=18)
         Output: ctid, gp_segment_id
         TID Cond: CURRENT OF c1
         Filter: CURRENT OF c1

(ctid + gp_segment_id) can identify a unique row for a hash-distributed or randomly distributed table, but for a replicated table it can only identify a row in one replica, so the other replicas are not updated or deleted properly. There is no proper way to make CURRENT OF work for replicated tables, so just disable it for now.

Updatable views work fine because all the operations are designed to execute locally: rows of each replica never flow to other segments, and each replica has the same data flow. For example:

create view v_r2 as select * from r2 where c2 = 3;
explain update v_r2 set c2 = 4 from p1 where p1.c2=v_r2.c2;
                 QUERY PLAN
-----------------------------------------------------------
 Update on r2
   ->  Nested Loop
         ->  Seq Scan on r2
               Filter: (c2 = 3)
         ->  Materialize
               ->  Broadcast Motion 3:3
                     ->  Seq Scan on p1
                           Filter: (c2 = 3)

As the plan shows, p1 is broadcast, so all replicas of the replicated table operate locally.
-
Committed by Lisa Owen

* docs - pxf CLI refactor, add ref pages for pxf, pxf cluster
* display pxf cluster first in ref landing page
* remedy -> solution
* remove java 1.7
* remove commands that source greenplum_path.sh
* edits requested by david
* fix link
* update OSS book for new ref pages, add jdbc, edits
-
Committed by dyozie
-
Committed by Mel Kiyama

* docs - ANALYZE - clarify using ANALYZE on partitioned tables.
* docs - ANALYZE - review comment updates
-
Committed by Heikki Linnakangas

The 'M' type QD->QE dispatch message includes a serialized version of the current resource group information. That included a 'cpuset' field, which has 1 kB of space reserved for it, and the serialization routine included all the padding zeros in it. (Copying or memsetting that into a local variable in SerializeResGroupInfo() wasn't completely free, either.) Rewrite the serialized format to be less silly. Reviewed-by: Jacob Champion <pchampion@pivotal.io>
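The gist of the format change, sketched in Python (field layout and names are illustrative; the real format is whatever SerializeResGroupInfo() writes):

```python
import struct

def serialize_cpuset_padded(cpuset, reserved=1024):
    # Old scheme: a fixed 1 kB field, padding zeros and all.
    buf = cpuset.encode()
    return buf + b"\x00" * (reserved - len(buf))

def serialize_cpuset_prefixed(cpuset):
    # New-style scheme: a length prefix plus only the bytes actually used.
    buf = cpuset.encode()
    return struct.pack("!I", len(buf)) + buf

print(len(serialize_cpuset_padded("0-3")))    # 1024 bytes on the wire
print(len(serialize_cpuset_prefixed("0-3")))  # 7 bytes on the wire
```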
-
Committed by Heikki Linnakangas

numeric_round() and numeric_trunc() initialized the NumericVar to represent the argument with init_var_from_num(), and then called round_var() on it. That's not cool: init_var_from_num() creates a read-only NumericVar, whose 'digits' points directly to the argument datum, and round_var() scribbles on it. There is a comment in init_var_from_num() warning explicitly not to do exactly that, but we failed to heed the warning. Fixes github issue https://github.com/greenplum-db/gpdb/issues/6383. I also added a test case for this, although it's a bit unlikely for this bug to reappear in exactly the same form. It's not far-fetched that we might re-introduce it in a slightly different form, though, so I also added some extra Asserts in round_var() and trunc_var() for it. Reviewed-by: Ekta Khanna <ekhanna@pivotal.io>
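The aliasing hazard can be shown with a toy Python analogy (the function names mirror the C ones, but the bodies are simplified stand-ins):

```python
# A list stands in for the Numeric datum's digit buffer.
def init_var_from_num(digits):
    return digits          # read-only view: aliases the caller's buffer

def set_var_from_num(digits):
    return list(digits)    # defensive copy: safe to scribble on

datum = [1, 2, 3, 4]
var = init_var_from_num(datum)
var[3] = 0                 # "rounding" in place...
print(datum)               # [1, 2, 3, 0] -- the shared datum was corrupted

datum = [1, 2, 3, 4]
var = set_var_from_num(datum)
var[3] = 0
print(datum)               # [1, 2, 3, 4] -- the original stays intact
```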
-
- 05 December 2018, 7 commits
-
-
Committed by Daniel Gustafsson
Commit e00ca2c4 moved gpfdist into src/bin and also folded it into the main autoconf program for GPDB. The local config/ files were however mistakenly left behind. Remove the files as they are no longer used.
-
Committed by Sambitesh Dash

Co-authored-by: Chris Hajas <chajas@pivotal.io> Co-authored-by: Sambitesh Dash <sdash@pivotal.io>
-
Committed by Daniel Gustafsson

This test suite was mostly testing unsupported append-only table functionality which has long since been implemented, but since the tests were wrapped in ignore blocks it wasn't even testing that. All the tests in this suite are covered by other suites which are better maintained, so remove it to shave a few cycles off ICW time (no wallclock improvement of ICW is expected since it was running in a parallel group). Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/mxld3WyhPZ8 Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
Committed by Heikki Linnakangas
As the comments in code explain, the simple union all pullup optimization is problematic in GPDB. I don't think we have plans to "fix" it any time soon.
-
Committed by Taylor Vesely

The RHEL 5 platform is no longer supported, so remove references to it in the various enterprise build scripts. Authored-by: Taylor Vesely <tvesely@pivotal.io>
-
Committed by Mel Kiyama

* docs - updates for jsonb type
* docs - jsonb docs - review comment updates.
* docs - jsonb docs - remove duplicate text.
-
Committed by Lisa Owen
-
- 04 December 2018, 11 commits
-
-
Committed by Daniel Gustafsson
Commit f8afec06 extended the zstd check to also look for zstd_errors.h, and updated the comment. The comment update was however only committed to configure and not the source file configure.in. This synchronizes the files.
-
Committed by Teng zhang
-
Committed by Heikki Linnakangas

There was a FIXME about that. Add a test case for it, too, since we apparently have no test coverage for gp_enable_motion_mk_sort=off. Reviewed-by: BaiShaoqi <sbai@pivotal.io>
-
Committed by Richard Guo

Previously GPDB would refuse to pull up an ANY sublink if the subquery returned a set-returning function (SRF) in the targetlist, while PostgreSQL does perform that kind of pull-up. This restriction was added in GPDB years ago to work around MPP-7085. That issue has since been fixed elsewhere, so we can remove the restriction now.
-
Committed by Ning Yu

When all the cpu cores are allocated, the default cpuset group should fall back to core 0. However, this fallback logic was only added for CREATE / ALTER RESOURCE GROUP and was missing from the startup logic, so an empty cpu core list "" was written to the cgroup, causing a runtime error: can't write data to file '/sys/fs/cgroup/cpuset/gpdb/1/cpuset.cpus': No space left on device (resgroup-ops-linux.c:916) Fixed by converting "" to "-1" in the startup logic.
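A toy sketch of the startup fix, assuming (per the message above) that "-1" is the sentinel the lower cgroup layer resolves to the fallback core:

```python
# Hypothetical sketch: never hand an empty core list to cpuset.cpus; writing
# "" is what produced the ENOSPC error at startup.
def default_group_cpuset(free_cores):
    cpus = ",".join(str(core) for core in sorted(free_cores))
    return cpus if cpus else "-1"   # fallback sentinel instead of ""

print(default_group_cpuset({0, 1}))  # "0,1"
print(default_group_cpuset(set()))   # "-1" (previously "", which failed)
```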
-
Committed by Taylor Vesely

GDB on centos/rhel6 doesn't recognize DWARF 4 symbols, since support was first implemented in GDB 7.4. Centos 6 ships with GDB 7.2, so it can't read the newer DWARF formats. From the GCC manual:

-gstrict-dwarf
    Disallow using extensions of later DWARF standard version than selected with -gdwarf-version. On most targets using non-conflicting DWARF extensions from later standard versions is allowed.

-gdwarf-[version]
    Produce debugging information in DWARF format (if that is supported). The value of version may be either 2, 3, 4 or 5; the default version for most targets is 4. DWARF Version 5 is only experimental.

Co-authored-by: Taylor Vesely <tvesely@pivotal.io> Co-authored-by: Ben Christel <bchristel@pivotal.io>
-
Committed by David Krieger

This removes the gpexpand_1 and gpexpand_2 concourse jobs in favor of the gpexpand behave tests. Co-authored-by: David Krieger <dkrieger@pivotal.io> Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
-
Committed by Jim Doty

Update the documentation to match the default setting when adding the option in gpMgmt/bin/gppylib/programs/clsRecoverSegment.py:

```
addTo.add_option("-B", None, type="int", default=16,
```

[ci skip] Co-authored-by: David Krieger <dkrieger@pivotal.io> Co-authored-by: Kalen Krempely <kkrempely@pivotal.io> Co-authored-by: Jim Doty <jdoty@pivotal.io>
-
Committed by Heikki Linnakangas

We can use the more straightforward cdbhashrandomseg() instead. Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
-
Committed by Jim Doty

- Some tests were expanding into /tmp, which ran out of space, so now expand into /data/gpdata.
- Consolidate the test that verifies redistribution after expand.
- Actually use the dump of the ICW database in the relevant test.

Co-authored-by: David Krieger <dkrieger@pivotal.io> Co-authored-by: Jim Doty <jdoty@pivotal.io> Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
-
Committed by Jim Doty

Because expand copies the entire master directory, we need to zero out the distributed_xlog, since the segment is not yet part of the cluster, to allow it to see only local transactions. Also fixed the log level to WARN when expanding a cluster with unique indexes. Co-authored-by: David Krieger <dkrieger@pivotal.io> Co-authored-by: Jim Doty <jdoty@pivotal.io> Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
-
- 03 December 2018, 9 commits
-
-
Committed by Heikki Linnakangas

This was previously not implemented, because the gp_distribution_policy catalog was not populated in QE segments. Starting with commit 7efe3204, it is, so we can easily support this now. Reviewed-by: Xiaoran Wang <xiwang@pivotal.io> Reviewed-by: Adam Lee <ali@pivotal.io>
-
Committed by Pengzhou Tang
-
Committed by Pengzhou Tang

Gang management now has a smaller granularity and better management of memory contexts; meanwhile, gangs are no longer pre-assigned for the main plan and init plans. An issue was reported against the old implementation of gang management; although it can no longer be reproduced, it's still necessary to add a test case to avoid regressions in this feature. The test strategy is to prepare a query containing both a main plan and an init plan: the main plan creates an unnamed portal and needs multiple slices, while the init plan allocates multiple slices in a named portal (e.g. a cursor). The expected result is that gang management works fine in such a mixed combination.
-
Committed by Daniel Gustafsson

Calling ereport(ERROR..) will break out of any codepath, never to return, so returning NULL after the invocation is dead code. Resources are also automatically cleaned up on erroring out, so remove the pfree call (which is also dead code) rather than moving it before the ereport. Reviewed-by: Jacob Champion <pchampion@pivotal.io> Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io> Reviewed-by: Jimmy Yih <jyih@pivotal.io>
-
Committed by Heikki Linnakangas

For consistency: this is how we represent column indexes e.g. in Sort, Unique, MergeAppend and many other plan types. Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
-
Committed by Heikki Linnakangas

Create the CdbHash object in the initialization phase, and reuse it for all the tuples. I'm not sure how much performance difference this makes, but it seems cleaner anyway. Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
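The hoisting pattern, sketched with a toy stand-in (the real CdbHash is a C API, and this hash function is not cdbhash's):

```python
import zlib

class ToyCdbHash:
    def __init__(self, numsegments):
        self.numsegments = numsegments

    def segment_for(self, key):
        return zlib.crc32(repr(key).encode()) % self.numsegments

def redistribute(tuples, numsegments):
    h = ToyCdbHash(numsegments)  # built once, in the initialization phase
    return [(h.segment_for(t), t) for t in tuples]  # reused for every tuple

print(redistribute(["a", "b", "c"], numsegments=3))
```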
-
Committed by Heikki Linnakangas

ORCA generated plans where the "hash filter" in the Result node was set to an empty set of columns. That meant "discard all the rows, on all segments, except one segment". This is used at least with set-returning functions, where we don't care where the function is executed, but it only needs to be executed once. (The planner creates a one-to-many Redistribute Motion plan in that scenario, which makes a lot more sense to me, but doing the same in ORCA would require more invasive surgery than what I'm capable of.) Instead of executing the subplan and throwing away the result one row at a time, use a Result plan with a One-Time Filter. That's more efficient. Also, it allows removing the Result.hashFilter boolean flag, because the weird case of a hashFilter with zero columns is gone. You can check "hashList != NIL" directly now. The old method would always choose the same segment, which seems bad for load distribution. The way it was chosen seemed totally accidental, too: we initialized the cdbhash object to the initial constant value, and then reduced that into the target segment number using the jump consistent hash algorithm. We computed that for every row, but the result was always the same. On a three-node cluster, the target was always segment 1. Now, we pick a segment at random when generating the plan. Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
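For reference, a Python rendering of the jump consistent hash reduction mentioned above (Lamping and Veach's published algorithm; the constant is theirs) next to the new plan-time random pick. INITIAL_HASH is a made-up stand-in for cdbhash's initial constant value:

```python
import random

def jump_consistent_hash(key, num_buckets):
    # Lamping & Veach: map a 64-bit key to one of num_buckets buckets.
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

INITIAL_HASH = 0x9E3779B9  # stand-in, not the actual cdbhash constant

# Old behavior: reduce the same constant for every row -> same segment, always.
print(jump_consistent_hash(INITIAL_HASH, 3))  # identical on every call

# New behavior: pick the target segment once, at random, at plan time.
print(random.randrange(3))
```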
-
Committed by Heikki Linnakangas

This was moved to encnames.c in upstream commit 420ea688, but for some reason we failed to remove it from the old location in the 8.4 merge. It's unused.
-
Committed by Heikki Linnakangas
-