Commits · 59abec446236b97debee64c77eea82455dffd0a3 · Greenplum / Gpdb

12 Jan, 2018 3 commits

Fix Filter required properties for correlated subqueries in ORCA · 59abec44

Authored by Shreedhar Hardikar on Jan 03, 2018

This commit brings in ORCA changes that ensure that a Materialize node is not
added under a Filter when its child contains outer references.  Otherwise, the
subplan is not rescanned (because it is under a Material), producing wrong
results. A rescan is necessary because the subplan must be evaluated for each of
the outer-referenced values.

For example:

```
SELECT * FROM A,B WHERE EXISTS (
  SELECT * FROM E WHERE E.j = A.j and B.i NOT IN (
    SELECT E.i FROM E WHERE E.i != 10));
```

For the above query ORCA produces a plan with two nested subplans:

```
Result
  Filter: (SubPlan 2)
  ->  Gather Motion 3:1
        ->  Nested Loop
              Join Filter: true
              ->  Broadcast Motion 3:3
                    ->  Table Scan on a
              ->  Table Scan on b
  SubPlan 2
    ->  Result
          Filter: public.c.j = $0
          ->  Materialize
                ->  Result
                      Filter: (SubPlan 1)
                      ->  Materialize
                            ->  Gather Motion 3:1
                                  ->  Table Scan on c
                      SubPlan 1
                        ->  Materialize
                              ->  Gather Motion 3:1
                                    ->  Table Scan on c
                                          Filter: i <> 10
```

The Materialize node (on top of Filter with Subplan 1) has cdb_strict = true.
The cdb_strict semantics dictate that when the Materialize is rescanned,
instead of destroying its tuplestore, it resets the accessor pointer to the
beginning and the subtree is NOT rescanned.
So the entries from the first scan are returned for all future calls; i.e. the
results depend on the first row output by the cross join. This causes wrong and
non-deterministic results.
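
The stale-result behavior described above can be illustrated with a toy model. This is a minimal Python sketch, not ORCA or executor code: `Scan`, `StrictMaterialize`, and `exists_per_outer_row` are hypothetical names, and the model assumes only what this message states, namely that a strict Materialize rewinds its tuplestore on rescan instead of rescanning its child.

```python
class Scan:
    """Toy subtree whose output depends on an outer reference (a parameter)."""

    def __init__(self, rows):
        self.rows = rows

    def rescan(self, outer_value):
        # A real rescan re-evaluates the subtree for the new outer value.
        return [r for r in self.rows if r == outer_value]


class StrictMaterialize:
    """Toy Materialize that keeps its tuplestore across rescans."""

    def __init__(self, child):
        self.child = child
        self.store = None

    def rescan(self, outer_value):
        if self.store is None:
            # First scan: fill the tuplestore from the child.
            self.store = self.child.rescan(outer_value)
        # Later rescans only rewind; the child is NOT rescanned.
        return list(self.store)


def exists_per_outer_row(node, outer_values):
    # Evaluate an EXISTS-style subplan once per outer row.
    return [bool(node.rescan(v)) for v in outer_values]


inner = [1, 2]
correct = exists_per_outer_row(Scan(inner), [1, 3, 2])
stale = exists_per_outer_row(StrictMaterialize(Scan(inner)), [1, 3, 2])
stale_reordered = exists_per_outer_row(StrictMaterialize(Scan(inner)), [3, 1, 2])
```

In this model `correct` varies per outer row, while `stale` and `stale_reordered` each repeat whatever the first outer row produced, so the answer changes with the order in which the outer rows arrive.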

Also, this commit reinstates this test in qp_correlated_query.sql. It also
fixes another wrong result caused by the same issue. Note that
rangefuncs_optimizer.out changes because ORCA no longer falls back for those
queries. Instead, it produces a plan that is executed on the master (rather than
on the segments, as the planner's plan was), which changes the error messages.

Also bump ORCA version to 2.53.8.
Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>

59abec44

gpstart: fix OOM issue · a0fcfc37

Authored by Shoaib Lari on Jan 05, 2018

gpstart did a cluster-wide check of heap_checksum settings and refused
to start the cluster if this setting was inconsistent. This meant a
round of ssh'ing across the cluster which was causing OOM errors with
large clusters.

This commit moves the heap_checksum validation to gpsegstart.py, and
changes the logic so that only those segments which have the same
heap_checksum setting as master are started.
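
The new start-up rule can be sketched as a simple filter. This is a minimal illustration in Python, not the gpsegstart.py code: `SegmentInfo` and `split_by_checksum` are hypothetical names, and how the real script reads each segment's heap_checksum value is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentInfo:
    # Hypothetical record; not the gppylib data structure.
    dbid: int
    heap_checksum: int


def split_by_checksum(master_checksum, segments):
    """Return (startable, skipped) based on agreement with the master."""
    startable = [s for s in segments if s.heap_checksum == master_checksum]
    skipped = [s for s in segments if s.heap_checksum != master_checksum]
    return startable, skipped


segments = [SegmentInfo(2, 1), SegmentInfo(3, 0), SegmentInfo(4, 1)]
startable, skipped = split_by_checksum(1, segments)
```

Here the segment with dbid 3 disagrees with the master and is left out, while the other two are started.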

Author: Nadeem Ghani <nghani@pivotal.io>
Author: Shoaib Lari <slari@pivotal.io>

a0fcfc37

docs - typo fix · 0310216b
Authored by dyozie on Jan 11, 2018

0310216b

11 Jan, 2018 7 commits
- Revert "Rename user group "supergroup" to "gpadmin"" · c770669f
  Authored by Larry Hamel on Jan 10, 2018
```
This reverts commit c4062620.
```
  c770669f
- Rename user group "supergroup" to "gpadmin" · c4062620
  Authored by Goutam Tadi on Jan 08, 2018
```
- In the script used for creating user "gpadmin", instead of using a
nonstandard group named "supergroup" that doesn't seem to be used
anywhere, do the more standard practice of creating a group "gpadmin" that matches
the username.
Signed-off-by: Larry Hamel <lhamel@pivotal.io>
```
  c4062620
- docs: updates for CREATE FUNCTION attributes EXECUTE ON (#4182) · acc30f03
  Authored by Mel Kiyama on Jan 10, 2018
```
* docs: updates to create function attribute execute on

* docs: review comment updates for CREATE FUNCTION
```
  acc30f03
- Fix conan install command after 1.0.0 upgrade · c7457ea0
  Authored by Jimmy Yih on Jan 10, 2018
```
Some of our OS container images have been updated to have conan 1.0.0.
This new version changes the conan install command a bit.
```
  c7457ea0
- docs - correct the reference to parquet data type fixed_len_byte_array (#4284) · 5fbe8880
  Authored by Lisa Owen on Jan 10, 2018
  
  5fbe8880
- docs - add ALTER TYPE name RENAME TO new_name (#4283) · c0bcb379
  Authored by Lisa Owen on Jan 10, 2018
  
  c0bcb379
- docs - type SET DEFAULT ENCODING (#4235) · 8a70d17c
  Authored by Lisa Owen on Jan 10, 2018
```
* docs - type SET DEFAULT ENCODING

* edits requested by david

* add gp-specific clauses to summary of greenplum features page
```
  8a70d17c
10 Jan, 2018 6 commits
- Fix pg_total_relation_size() function for AO tables. · 4c9fe2aa
  Authored by Heikki Linnakangas on Jan 10, 2018
```
It didn't handle relation forks correctly, and counted the main fork's size
three times.
```
  4c9fe2aa
- Fix typo on "gpaddmirrors" in docs. · 1b8a081c
  Authored by Heikki Linnakangas on Jan 10, 2018
  
  1b8a081c
- Remove obsolete GUCs related to old window functions implementation. · 6e34115f
  Authored by Heikki Linnakangas on Jan 10, 2018
```
Now that we use the upstream implementation for window functions, the
'gp_enable_sequential_window_plans' and 'gp_idf_deduplicate' GUCs are no
longer needed.
```
  6e34115f
- docs - identify enum type support for GUCs (#4248) · 97fc33cf
  Authored by Lisa Owen on Jan 09, 2018
  
  97fc33cf
- Remove redundant macro `HASHJOIN_IS_OUTER` · 344874fe
  Authored by Dhanashree Kashid on Jan 09, 2018
```
HASHJOIN_IS_OUTER already existed in nodeHashJoin.c, but was added again
by commit 321c0529.
This commit removes the redundant copy.
```
  344874fe
- Move segwalrep binary to its own binary directory · cccd1c40
  Authored by Jimmy Yih on Jan 09, 2018
```
This avoids a collision with the nonsegwalrep binary.
```
  cccd1c40
09 Jan, 2018 4 commits

gppylib: refactor SegmentPair to not support multiple mirrors · a19f7327

Authored by Shoaib Lari on Dec 13, 2017

Long ago, we thought we might need to support multiple mirrors. But we
don't, and don't foresee it coming soon. Simplify the code to only ever
have one mirror, but still allow for the possibility of no mirrors.
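
The resulting shape (one primary, at most one mirror) can be sketched as follows. This is a minimal illustration in Python and an assumption throughout: `SegmentPairSketch`, `add_mirror`, and `members` are hypothetical names and do not reproduce the gppylib `SegmentPair` API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SegmentPairSketch:
    # Hypothetical stand-in; the real gppylib SegmentPair has a different API.
    primary: str
    mirror: Optional[str] = None

    def add_mirror(self, mirror):
        # Only one mirror is ever allowed.
        if self.mirror is not None:
            raise ValueError("segment pair already has a mirror")
        self.mirror = mirror

    def members(self):
        # The primary is always present; the mirror is optional.
        if self.mirror is None:
            return [self.primary]
        return [self.primary, self.mirror]


pair = SegmentPairSketch(primary="seg0-primary")
unmirrored = pair.members()
pair.add_mirror("seg0-mirror")
mirrored = pair.members()
```

Holding the mirror as a single optional field, rather than a list, makes the "zero or one" rule part of the data model instead of a convention callers must remember.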

Author: Shoaib Lari <slari@pivotal.io>
Author: C.J. Jameson <cjameson@pivotal.io>

a19f7327

gppylib: Rename gpArray variables and classes · bbc47080

Authored by Marbin Tan on Dec 12, 2017

The gpArray use of GpDB and Segment classes was confusing. This change renames
GpDB to Segment and Segment to SegmentPair to clarify usage. It's a big diff, but
a simple, repeating change.

Author: Shoaib Lari <slari@pivotal.io>
Author: Marbin Tan <mtan@pivotal.io>
Author: C.J. Jameson <cjameson@pivotal.io>

bbc47080

Have a separate docker file for gpadmin user (#4236) · 53278041
Authored by Lav Jain on Jan 08, 2018
```
* Have a separate docker file for gpadmin user

* Add indent package for centos6
```
53278041

isolation2: stop littering <stdout>.pid files on failure · cbab1ee6

Authored by Jacob Champion on Jan 07, 2018

The temporary file hack to deal with pygresql output oddities didn't
ensure that those files were always removed. Instead of a named file we
don't need, just use a TemporaryFile that will go away as soon as we
release it.

Added a FIXME to revisit this once we upgrade pygresql.
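
The difference from a named file can be shown with the standard library alone. This is a minimal Python sketch, not the isolation2 code: `capture_output` and its `write_fn` callback are hypothetical, and how the real test driver routes pygresql output into the file is not modeled.

```python
import tempfile


def capture_output(write_fn):
    """Collect text written by write_fn without leaving a file behind."""
    # TemporaryFile keeps no lasting name on disk and is removed on close,
    # even if write_fn raises.
    with tempfile.TemporaryFile(mode="w+") as tmp:
        write_fn(tmp)
        tmp.seek(0)
        return tmp.read()


captured = capture_output(lambda f: f.write("1 row\n"))
```

Because cleanup is tied to closing the file object, a failure between writing and reading cannot strand a file the way a manually named and manually deleted file can.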

cbab1ee6

08 Jan, 2018 1 commit

Update pg_proc.sql for changes made to generated file pg_proc_gp.h. · e942ccdc

Authored by Heikki Linnakangas on Jan 08, 2018

pg_proc_gp.h is generated from pg_proc.sql, but a few recent commits
updated only pg_proc_gp.h, not pg_proc.sql. As a result, if you ran the
perl script, the gp_replication_error() function vanished, and the obsolete
PT verification functions reappeared. Update pg_proc.sql, so that
pg_proc_gp.h is reproduced by the perl script again.

e942ccdc

06 Jan, 2018 5 commits
- Faithfully translate a cast over param for subplan testexpr · b83a17bd
  Authored by Sambitesh Dash on Dec 20, 2017
```
Instead of assuming that casts are always binary coercible (and hence that we
could get away with just dropping them), translate casts in ORCA plans into
either a RelabelType or a FuncExpr.
Signed-off-by: Sambitesh Dash <sdash@pivotal.io>
```
  b83a17bd
- docs: gpbackup update - metadata backed up into a single file, add incompatibility warning (#4257) · e92cb58e
  Authored by Mel Kiyama on Jan 05, 2018
```
PR for 5X_STABLE
Will be ported to MAIN
```
  e92cb58e
- docs - gphdfs file name and gp_hadoop_target_version updates (#4177) · 63d65cf9
  Authored by Lisa Owen on Jan 05, 2018
  
  63d65cf9
- docs: change GUC default: optimizer_join_arity_for_associativity_comm… (#4242) · 62a0cf12
  Authored by Mel Kiyama on Jan 05, 2018
```
* docs: change GUC default: optimizer_join_arity_for_associativity_commutativity=18

* docs: GUC removed ZSTD (zstandard)  from gp_default_storage_options (added by mistake)
```
  62a0cf12
- Fixes macOS readme so make create-demo-cluster works · c322279a
  Authored by Todd Sedano on Jan 03, 2018
```
The correct order is as follows:
1. modify ~/.bash_profile
2. ssh to localhost
3. make create-demo-cluster

See https://github.com/greenplum-db/gpdb/issues/3704 for more detail
```
  c322279a
05 Jan, 2018 7 commits

Minimize the race condition in BackoffSweeper() · ab74e1c6

Authored by Pengzhou Tang on Jan 03, 2018

There is a long-standing race condition in BackoffSweeper() that
triggers an error, which in turn triggers an assertion failure because
sweeperInProgress is not reset to false.

This commit doesn't fundamentally resolve the race condition with a
lock or another mechanism, because the backoff mechanism as a whole
does not require precise control, so skipping some sweeps should
be fine for now. We also downgrade the log level to DEBUG because
restarting the sweeper backend is unnecessary.

ab74e1c6

Add test using WITH RECURSIVE in a correlated subquery · e8dc3ea4
Authored by Kavinder Dhaliwal on Oct 18, 2017

e8dc3ea4

Reinstate dropping schema for gporca test suite · db1ecd3c

Authored by Jesse Zhang on Jan 04, 2018

Partition tables hard-code the operator '=' lookup to namespace
'pg_catalog', which means that in this test we had to put our
user-defined operator into that special system namespace. This works
fine, until we try to pg_dump the resulting database: pg_catalog is not
dumped by default. That led to an incomplete dump that will fail to
restore.

This commit reinstates the dropping of the schema at the end of `gporca`
test to get the pipelines back to green (broken after c7ab6924).
Backpatch to 5X_STABLE.

db1ecd3c

Remove dead contrib module xlogviewer · a76607c2

Authored by Daniel Gustafsson on Jan 04, 2018

xlogviewer is disconnected from the build, and hasn't built since
we merged the HOT patch during the PostgreSQL 8.3 merge in the 5.x
development cycle. Since we have contrib/xlogdump, remove rather
than trying to resurrect it.

Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/ov5Sxrd1JL4

a76607c2

Fixes broken compile aix remote (Detached HEAD) · de3a8845

Authored by Divya Bhargov on Jan 04, 2018

Since the docs folder is ignored, it is possible for the gpdb_src to get
to a detached HEAD state if there was a docs commit after a given
commit. We don't really need the branch to check out the commit in
question.
Signed-off-by: Ed Espino <edespino@pivotal.io>
Signed-off-by: Divya Bhargov <dbhargov@pivotal.io>

de3a8845

set search_path and stop dropping schema in gporca test · c7ab6924

Authored by Jesse Zhang on Jan 02, 2018

The `gporca` regression test suite uses a schema but doesn't really
switch `search_path` to the schema that's meant to encapsulate most of
the objects it uses. This has led to multiple instances where we:
  1. Either used a table from another namespace by accident;
  2. Or we leaked objects into the public namespace that other tests in
  turn accidentally depended on.

As we were about to add a few user-defined types and casts to the test
suite, we want to (at last) ensure that all future additions are scoped
to the namespace.
Signed-off-by: Sambitesh Dash <sdash@pivotal.io>

Closes #4238

c7ab6924

behave: reset the faultinjector explicitly · 8fcf01c2
Authored by C.J. Jameson on Jan 03, 2018
```
Author: C.J. Jameson <cjameson@pivotal.io>
```
8fcf01c2

04 Jan, 2018 5 commits

docs: pl/container updates - plcontainer command, configuration file (#4223) · 38e6d5f8

Authored by Mel Kiyama on Jan 03, 2018

*  docs: pl/container updates - plcontainer command, configuration file

-removed experimental warning.
-replaced plcontainer command with new options.
-updated configuration file - changed element names, added setting element.
-updated default shared volume.

PR for 5X_STABLE
Will be ported to MAIN

* docs: pl/container - updates based on review comments.

* docs: pl/container - more updates based on additional review comments.

* docs: pl/container - minor edit

* docs: pl/container - another minor fix.

38e6d5f8

docs - add info about greenplum database release versioning (#4193) · 4916d30e

Authored by Lisa Owen on Jan 03, 2018

* docs - add info about greenplum database release versioning

* misc edits requested by david

* update subnav titles

* clarify that deprecated features removed only in major release

4916d30e

docs: gpstop new option --host (#4069) · 18185d41

Authored by Mel Kiyama on Jan 03, 2018

* docs: gpstop new option --host

* docs: gpstop - update/clarify --host description based on review comments.

* docs: gpstop --host. updates based on review comments.

* docs: gpstop - added information on restoring segments after using --host.

* docs: gpstop --host. corrected name of utility to recover segments : gprecoverseg.

18185d41

atmsort: try to find the end of \d tables correctly · d89fef52

Authored by Jacob Champion on Jan 03, 2018

Several \d variants don't put a row count at the end of their tables,
which means that atmsort doesn't stop sorting output until it finds a
row count somewhere later. Some tests are having their diff output
horribly mangled because of this, which makes debugging difficult.

When we see a \d command, try to apply more heuristics for finding the
end of a table. In addition to ending at a row count, end table sorting
whenever we find a line that doesn't have the same number of column
separators as the table header. If we don't have a table header, still
attempt to end table sorting at a blank line.
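
The heuristics above can be sketched as a single function. This is a minimal illustration in Python, not the atmsort implementation: `table_body_end` and `ROW_COUNT` are hypothetical names, and the sketch assumes it is handed only the lines that follow the table header and that columns are separated by `|`.

```python
import re

# Matches psql-style footers such as "(1 row)" or "(12 rows)".
ROW_COUNT = re.compile(r"^\(\d+ rows?\)$")


def table_body_end(lines, header=None):
    """Return the index of the first line that is past the table body."""
    expected = header.count("|") if header is not None else None
    for i, line in enumerate(lines):
        if ROW_COUNT.match(line.strip()):
            return i  # the usual terminator
        if expected is not None and line.count("|") != expected:
            return i  # column separators no longer match the header
        if expected is None and line.strip() == "":
            return i  # no header: fall back to a blank line
    return len(lines)


header = " Column | Type "
body = [" a | integer", " b | text", "Distributed by: (a)", ""]
end_with_header = table_body_end(body, header)
end_without_header = table_body_end([" x | y", "", "(1 row)"])
end_at_row_count = table_body_end([" 1 | 2", "(1 row)"], " a | b ")
```

With a header, sorting stops at the `Distributed by` line because its separator count differs; without one, the blank line ends the table before the later row count is reached.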

extprotocol's test output order must be fixed as a result. Put the
"External options" line where psql actually prints it, after "Format
options".

d89fef52

gpperfmon: Document installing a fork of sigar on CentOS 7 · f9f6d2fd

Authored by David Sharp on Dec 22, 2017

https://github.com/hyperic/sigar is not under active development.

https://github.com/boundary/sigar is a fork that has been somewhat updated. In particular, it has a fix to allow it to compile with GCC 5 and 6.

f9f6d2fd

03 Jan, 2018 2 commits

Enable optimizer for tests with qp_olap_windowerr · 177a52fd

Authored by Shreedhar Hardikar on Dec 21, 2017

* Fix 4 (out of 64) windowerr tests that use row_number() to be non-deterministic

* Fix remaining 58 (out of 64) tests.

There was a difference in the results between planner and optimizer due
to different value of row_number assigned.

row_number() is inherently non-deterministic in GPDB. For example, for
the following query:

  select row_number() over (partition by b) from foo;

Let's say that foo was not distributed by b. In this case, to compute
the WindowAgg, we would first have to redistribute the table on b (or
gather all the tuples on master). Thus, for rows having the same b
value, the row_number assigned depends on the order in which they are
received by the WindowAgg - which is non-deterministic.

In qp_olap_windowerr.sql tests, we mitigate this by forcing an order on
ord column, which is unique in this context, making it easier to compare
test results.
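
The effect of arrival order, and of forcing an order on a unique column, can be illustrated with a small simulation. This is a minimal Python sketch, not GPDB code: `row_numbers` is a hypothetical helper, and the rows are made-up `(b, ord)` pairs standing in for tuples that reach the WindowAgg after a redistribution.

```python
def row_numbers(rows, order_key=None):
    """Assign row_number() within each partition b for (b, ord) rows."""
    if order_key is not None:
        # Equivalent to adding ORDER BY inside the OVER clause.
        rows = sorted(rows, key=order_key)
    counters = {}
    numbered = {}
    for b, ord_ in rows:
        counters[b] = counters.get(b, 0) + 1
        numbered[(b, ord_)] = counters[b]
    return numbered


def by_ord(row):
    return row[1]


arrival_1 = [(1, 10), (1, 20), (2, 30)]
arrival_2 = [(1, 20), (1, 10), (2, 30)]  # same rows, different arrival order

unordered_differs = row_numbers(arrival_1) != row_numbers(arrival_2)
ordered_matches = row_numbers(arrival_1, by_ord) == row_numbers(arrival_2, by_ord)
```

Without an ordering, the two arrival orders number the rows in partition 1 differently; ordering by the unique `ord` value makes both runs agree.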

* Remove FIXME comment and enable optimizer_trace_fallback
Signed-off-by: Shreedhar Hardikar <shardikar@pivotal.io>

177a52fd

Pipeline generation production updates and fix for SLES jobs. · 851984f9

Authored by Ed Espino on Dec 22, 2017

o This moves the validation (subprocess call) of the pipelines release
  jobs into the tool which generates the production pipeline. The
  corresponding task validate_pipeline.yml and its corresponding job
  are removed.

o Output production fly commands for both "gpdb_master" and
  "gpdb_master_without_asserts" pipelines. This will help engineers
  update both production pipelines.

o Fix the icw sles jobs in the development pipelines.

o Update README.md with usage examples.

o Remove validate_pipeline from pr_pipeline.yml as this validation is
  moving to gen_pipeline.py.

851984f9