Commits · b8545d57a5d85b434ab8a797888db81a008bb84f · Greenplum / Gpdb

31 Aug, 2018 6 commits

Rename "prelim function" to "combine function", to match upstream. · b8545d57

Authored by Heikki Linnakangas on Aug 31, 2018

The GPDB "prelim" functions did the same things as the "combine"
functions introduced in PostgreSQL 9.6. This commit includes just the
catalog changes, to essentially search & replace "prelim" with
"combine". I did not pick the planner and executor changes that were
made as part of this in the upstream, yet.

Also replace the GPDB implementation of float8_amalg() and
float8_regr_amalg(), with the upstream float8_combine() and
float8_regr_combine(). They do the same thing, but let's use upstream
functions where possible.

Upstream commits:
commit a7de3dc5
Author: Robert Haas <rhaas@postgresql.org>
Date:   Wed Jan 20 13:46:50 2016 -0500

    Support multi-stage aggregation.

    Aggregate nodes now have two new modes: a "partial" mode where they
    output the unfinalized transition state, and a "finalize" mode where
    they accept unfinalized transition states rather than individual
    values as input.

    These new modes are not used anywhere yet, but they will be necessary
    for parallel aggregation.  The infrastructure also figures to be
    useful for cases where we want to aggregate local data and remote
    data via the FDW interface, and want to bring back partial aggregates
    from the remote side that can then be combined with locally generated
    partial aggregates to produce the final value.  It may also be useful
    even when neither FDWs nor parallelism are in play, as explained in
    the comments in nodeAgg.c.

    David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
    Linnakangas, Haribabu Kommi, and me.

commit af025eed
Author: Robert Haas <rhaas@postgresql.org>
Date:   Fri Apr 8 13:44:50 2016 -0400

    Add combine functions for various floating-point aggregates.

    This allows parallel aggregation to use them.  It may seem surprising
    that we use float8_combine for both float4_accum and float8_accum
    transition functions, but that's because those functions differ only
    in the type of the non-transition-state argument.

    Haribabu Kommi, reviewed by David Rowley and Tomas Vondra
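As a rough illustration of what a combine function does, here is a minimal Python sketch. It is not the C implementation from either code base. It assumes the transition state is the triple (N, sum(X), sum(X*X)), as used by the float8 aggregates in PostgreSQL 9.6, and the names `accum`, `combine` and `var_pop_final`, as well as the per-segment data, are hypothetical.

```python
from functools import reduce

def accum(state, x):
    """Transition step: fold one input value into (n, sum_x, sum_x2)."""
    n, sx, sxx = state
    return (n + 1, sx + x, sxx + x * x)

def combine(a, b):
    """Combine step: merge two partial states that were built independently."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def var_pop_final(state):
    """Final step: population variance computed from the merged state."""
    n, sx, sxx = state
    return (n * sxx - sx * sx) / (n * n)

INIT = (0, 0.0, 0.0)
segment_1 = [1.0, 2.0, 3.0]   # rows held by one segment (made-up data)
segment_2 = [4.0, 5.0]        # rows held by another segment (made-up data)

# Each segment aggregates its own rows; the partial states are merged later.
partial_1 = reduce(accum, segment_1, INIT)
partial_2 = reduce(accum, segment_2, INIT)
merged = combine(partial_1, partial_2)

# Aggregating all rows in one pass gives the same state.
single_pass = reduce(accum, segment_1 + segment_2, INIT)
```

The point of the sketch is only that merging partial states must give the same result as a single pass over all rows, which is what lets the work be split across segments or parallel workers.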

b8545d57

Improve some O(N^2) behavior in window function evaluation. · 4d084270

Authored by Tom Lane on Apr 13, 2014

Repositioning the tuplestore seek pointer in window_gettupleslot() turns
out to be a very significant expense when the window frame is sizable and
the frame end can move. To fix, introduce a tuplestore function for
skipping an arbitrary number of tuples in one call, parallel to the one we
introduced for tuplesort objects in commit 8d65da1f. This reduces the cost
of window_gettupleslot() to O(1) if the tuplestore has not spilled to disk.
As in the previous commit, I didn't try to do any real optimization of
tuplestore_skiptuples for the case where the tuplestore has spilled to
disk. There is probably no practical way to get the cost to less than O(N)
anyway, but perhaps someone can think of something later.

Also fix PersistHoldablePortal() to make use of this API now that we have
it.

Based on a suggestion by Dean Rasheed, though this turns out not to look
much like his patch.
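The cost difference described in this commit message can be pictured with a toy Python model. This is a sketch under stated assumptions, not the tuplestore code: `InMemoryStore`, `skip_one_by_one` and `skip` are hypothetical names, and the store is modeled as a plain list that has not spilled to disk.

```python
class InMemoryStore:
    """Toy stand-in for a tuplestore that has not spilled to disk."""

    def __init__(self, tuples):
        self.tuples = list(tuples)
        self.pos = 0      # read pointer: index of the next tuple to return
        self.steps = 0    # counts pointer moves, to compare the two approaches

    def get_next(self):
        """Return the next tuple, or None at the end of the store."""
        if self.pos >= len(self.tuples):
            return None
        tup = self.tuples[self.pos]
        self.pos += 1
        self.steps += 1
        return tup

    def skip_one_by_one(self, n):
        """Reposition by fetching and discarding n tuples: O(n) per call."""
        for _ in range(n):
            if self.get_next() is None:
                return False
        return True

    def skip(self, n):
        """Reposition in one call by adjusting the read pointer: O(1)."""
        self.steps += 1
        if self.pos + n > len(self.tuples):
            self.pos = len(self.tuples)
            return False
        self.pos += n
        return True


slow = InMemoryStore(range(1000))
slow.skip_one_by_one(900)   # 900 pointer moves

fast = InMemoryStore(range(1000))
fast.skip(900)              # a single pointer adjustment
```

When such a reposition happens once per row of a large, moving frame, the one-by-one variant is what produces the quadratic behavior.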

4d084270

Remove unused invtransfn and invprelimfn from aggregates. · 96d4f2ed

Authored by Heikki Linnakangas on Aug 31, 2018

These were left unused by the Window Aggregates rewrite last year. The
"inverse" functions would be a very useful optimization, but if we
want to have that, we should cherry-pick the same feature from the
upstream, so these GPDB-specific functions are dead in any case. (The
corresponding upstream feature is called "moving aggregate support",
added in PostgreSQL 9.4. See commit a9d9acbf, and follow-up commits.)
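To show why an "inverse" function is a useful optimization for window aggregates, here is a small Python sketch of a moving sum. It illustrates the general idea only; it is neither the removed GPDB code nor the upstream implementation, and `moving_sum` and `moving_sum_naive` are hypothetical names.

```python
def moving_sum(values, width):
    """Sum over a sliding frame of `width` rows, using an inverse step."""
    results = []
    state = 0
    for i, v in enumerate(values):
        state += v                      # forward transition: a row enters the frame
        if i >= width:
            state -= values[i - width]  # inverse transition: a row leaves the frame
        results.append(state)
    return results


def moving_sum_naive(values, width):
    """Reference version: recompute every frame from scratch."""
    return [sum(values[max(0, i - width + 1): i + 1]) for i in range(len(values))]
```

With the inverse step, each row is added once and removed once, instead of being re-aggregated for every frame that contains it.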

96d4f2ed

Enable 'triggers' test · db913af7

Authored by xiong-gang on Aug 31, 2018

Quite a few cases in 'triggers' don't apply to GPDB: non-SELECT statements
in trigger functions, modifications on views, INSTEAD OF triggers, etc. But
we can still use some of the test cases. For example, FOR STATEMENT triggers
started acting differently when we changed INSERT/DELETE/UPDATE to use
ModifyTable in the 8.5 merge, and that should have been caught by the tests.

db913af7

Set motion node's rescannable field to false. · 624d7491

Authored by ZhangJackey on Aug 31, 2018

This commit fixes GitHub issue #5628.

We cannot create a Motion directly in the RHS of a nested loop
join, because a Motion node is not rescannable.

Previously, we copied the rescannable field from the subpath,
even if the RHS was SegmentGeneral or SingleQE. Fix this by
always setting the Motion's rescannable field to false.

Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>

624d7491

Default gp_external_enable_filter_pushdown GUC to true and fix to handle both... · 3ac93fb4

Authored by Alexander Denissov on Aug 30, 2018

Default gp_external_enable_filter_pushdown GUC to true and fix to handle both planner & orca qualifiers (#5470)

* Changed default for gp_external_enable_filter_pushdown GUC to true.
Co-authored-by: Shivram Mani <smani@pivotal.io>
Co-authored-by: Alex Denissov <adenissov@pivotal.io>

* Added default value to config array.
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* Fixed the PXF filter serialized expression to handle both planner and orca query tree

* Updated logic to deal with missing AND operators in qualifiers with
planner
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* set gp_external_enable_filter_pushdown default to true
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* removed redundant stuff
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* Added unittests to test filterstring with pxf predicate pushdown

* Added unittests to test filterstring with pxf predicate pushdown

* added new test case for serialization of qualifiers
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* added missing function to the test

* fixed expected string value

* Added unit tests for serializePxfFilterQuals
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

* minor fixes
Co-authored-by: Alex Denissov <adenissov@pivotal.io>
Co-authored-by: Shivram Mani <smani@pivotal.io>

3ac93fb4

30 Aug, 2018 6 commits

Remove a GPDB_91_MERGE_FIXME · 17fdaa63

Authored by xiong-gang on Aug 30, 2018

'tgenabled' was a boolean type before and it's a multi-value char now,
so use TriggerEnabled() to test 'tgenabled'.

17fdaa63

Remove dead code in gp_bash_functions · 7f0cdef5

Authored by yanchaozhong on Aug 22, 2018

The following code has been removed in the new version of
function PROCESS_QE in commit ce4d96b6, but the function
UPDATE_MPP definition used in it has not been deleted.

7f0cdef5

Completely remove interconnect kernel module related test case · cd139eeb

Authored by Pengzhou Tang on Aug 30, 2018

We used to use a kernel module named ickm to do functional tests of the
interconnect. For example: ickm captures a UDP packet in the kernel; if
the packet is a data packet, ickm drops it, then captures another packet
and tests that it is a resent packet.

Those kernel module related cases have many defects:

* they rely on tinc to do kernel module installation/deletion
* they need root permission
* compiling the module is sensitive to the kernel version
* the cases themselves are not stable

In the current pipeline, those jobs are actually skipped due to a
platform mismatch, but they still cause trouble when we want to
integrate the interconnect job on another platform like Oracle7,
so this commit removes those kernel module related tests completely.

Next step:
We need similar functional tests for the interconnect that do not rely
on a kernel module, to make them more dependable and stable.

cd139eeb

Remove unused function 'isPrimaryWriterGangAlive' · d58b9816

Authored by Pengzhou Tang on Jul 11, 2018

It was originally added to test whether the writer gang is alive
(for example, after a segment was killed with signal 9) by calling
an actual recv() syscall. It is removed because calling a syscall
for each segment is heavy, and there is no need to optimize for
such a rare case; the error will surface when the command is
actually sent.

d58b9816

docs - add pseudo-type anytable to datatypes topic (#5485) · faffdff9

Authored by Chuck Litzell on Aug 29, 2018

* docs - add pseudo-type anytable to datatypes topic

* Move GPPC out of conditional section and ack GPPC can also access table

faffdff9

Docs: updating and correcting gppkg commands for madlib · 68b6dd8f

Authored by dyozie on Aug 29, 2018

68b6dd8f

29 Aug, 2018 11 commits

Remove fixmes on function get_eclass_for_sort_expr (#5604) · 440269ae

Authored by Jinbao Chen on Aug 29, 2018

The param 'rel' is only used when making and finding ECs in child
rels. But I think that situation does not arise when adding cdb paths.

440269ae

Make create rule on update with from clause work (#5605) · 38ab5650

Authored by Jinbao Chen on Aug 29, 2018

After the postgres90 merge, UpdateStmt has a fromClause list, but the
out and read functions do not handle it. Add it to them.

38ab5650

Remove unused argument 'root' from set_cheapest and add_path. · 9ea947d2

Authored by Richard Guo on Aug 29, 2018

This keeps them the same as in PostgreSQL.

Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>

9ea947d2

Add memory tuning and dirty page management guidelines to install guide (#5442) · 568e4e76

Authored by Tyler Ramer on Aug 28, 2018

These settings are in line with recommendations from Red Hat and are
already documented here:
https://community.pivotal.io/s/article/Overview-of-Memory-Tuning-Best-Practices-for-Greenplum-Database

See Red Hat's documentation for an Oracle database here:
https://access.redhat.com/solutions/39188

As noted in the knowledge base, these tunables encourage more active
and frequent writing of dirty pages to disk. This is important to
prevent a large accumulation of dirty pages during a busy workload,
behavior that can quickly lead to an IO bottleneck and subsequent
slowdown or stall of the database or other running processes.

For Greenplum specifically, this tuning has been shown to assist in
mitigating segment failures due to the FTS probe being unable to write
to disk in a timely manner. Fewer blocked task messages are generally
noted in the logs on high workload systems, and in general database
performance should be improved.

These values are most important to apply on the segment nodes, but it is
safe to apply them on the master as well, and best to do so for
consistency.

Authored by Tyler Ramer <tyaramer@gmail.com>

568e4e76

Clarify comments around const-evaluation of Params. · ce32a0db

Authored by Heikki Linnakangas on Aug 28, 2018

GPDB has a concept of "one-off" plans. In PostgreSQL, the planner doesn't
normally const-evaluate stable functions, but GPDB does that. Such plans
cannot be reused across invocations, so they're marked as "one-off".

Before the PostgreSQL 9.2 merge, the same "one-off" flag was used to mark
plans where query Params were aggressively const-evaluated, for similar
reasons. That mechanism was obsoleted in 9.2, however, when the concept
of "custom" and "generic" plans was introduced. The planner will now try
to create "custom" plans for the specific query parameters, without any
GPDB-added code.

The snippet in eval_const_expressions that this removes was a no-op.
There used to be an extra condition, before the 9.2 merge, so that it
const-evaluated Params more aggressively. But that was removed as part
of the merge, and as a result, the removed if()'s condition was always
false. We don't need the more aggressive stance here, because the upstream
custom plans mechanism does the same. Move the comment explaining the
aggressive const-evaluation of stable functions to evaluate_function(),
that seems like a better place for it, anyway.

Remove the GPDB_92_MERGE_FIXME comment in BuildCachedPlan(). I'm not sure
why we had a check for boundParams there before. AFAICS, marking the plans
as one-off in eval_const_expressions_mutator (the code that was removed),
should've been enough, and the check for boundParams in plancache.c was
redundant (and sub-optimal, because some plans were marked not-for-reuse,
which could've been reused safely). In any case, there isn't anything
special about query parameters in this code anymore, it's all upstream
code, so we don't need an extra boundParams check either.

Add a test case for the const-evaluation of query Params, too. It was
passing before these changes as well, but seems good to have a test for it.

ce32a0db

Don't emit extra ALTER FUNCTION OWNER TO commands after CREATE LANGUAGE. · b5eb21ba

Authored by Heikki Linnakangas on Aug 28, 2018

In pg_dump, we had a GPDB-specific hack to change the owners of the
handler and validator functions of the language, after dumping the CREATE
LANGUAGE command. This was added back in 2008, with commit message:

    MPP-3774: Change pg_dump to emit ALTER FUNCTION ... OWNER statements
    for functions underlying created languages for languages defined
    using pg_pltemplate-specified languages.

I looked into MPP-3774, but I couldn't make much sense of it.

Title:
    pg_dumpall doesn't dump Language function owners
Description:
    pg_dumpall doesn't dump the permissions for the CREATE LANGUAGE. need to
    modify the function owners. so ALTER FUNCTION ... OWNER TO ...

PostgreSQL never had that code. I don't think it is relevant anymore,
there's been a lot of change in the way languages are handled. Firstly,
they are usually created as part of extensions, these days, so the
"language template" mechanism is obsolete. Secondly, when this change was
made in GPDB, languages didn't have owners. That was added later.

All in all, I don't think this is needed anymore, so remove it, and dump
languages and their functions just like in the upstream.

PS. If this was still valid and needed, I think we'd also need to emit the
ALTER FUNCTION commands for the 'laninline' function.

b5eb21ba

Fix non-text format EXPLAIN of GPDB-specific plan nodes. · f41af643

Authored by Heikki Linnakangas on Aug 28, 2018

This was a bug-of-omission when the upstream support for XML and JSON
formats was merged. The initialization of the 'sname' variable was added
for every node type handled in the switch-case statement, but the GPDB
addons didn't get the same treatment. As a result, non-text format
EXPLAIN of one of these GPDB-specific nodes produced garbage output, or
a segfault.

f41af643

Avoid calling GPDB backend functions directly from ORCA translator. · 3ec049a5

Authored by Heikki Linnakangas on Aug 28, 2018

It was safe in this case, because numeric_is_nan() doesn't throw an
error, but let's follow the usual convention and go through a wrapper
function in gpdbwrappers.cpp. For the sake of consistency, and safety in
the unlikely case that numeric_is_nan() is modified in the future to
do more than a simple comparison.

3ec049a5

docs - add pg_statistic stainherit column (#5308) · 5193a384

Authored by Lisa Owen on Aug 28, 2018

5193a384

Remove unused upstream macro. · 3f89ad25

Authored by Heikki Linnakangas on Aug 28, 2018

This isn't currently used in GPDB, and it wouldn't work as it is anyway,
because TupleTableSlots are quite different in GPDB. There's no point in
keeping this around as a FIXME, so let's just remove it.

3f89ad25

Remove some unused code. · 30d23f6a

Authored by Heikki Linnakangas on Aug 28, 2018

30d23f6a

28 Aug, 2018 10 commits

pg_upgrade: fix GPDB-specific long options · 6e137747

Authored by Jacob Champion on Aug 27, 2018

The --progress, --add-checksum, and --remove-checksum options weren't
being recognized because their option.vals were set to the characters
'2', '3', and '4' (integer values 50, 51, and 52, respectively), while
the option handling was comparing to the literal values 2, 3, and 4.
Switch back to integers.
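The mismatch described above can be reproduced in a few lines of Python. This is a sketch of the failure mode only, not the pg_upgrade option-parsing code: the dictionaries and the `handle` function are hypothetical stand-ins for the long-option table and the option-handling switch.

```python
# Hypothetical option tables: long option name -> value the parser hands back.
# The broken table stores character codes; the fixed one stores plain integers.
BROKEN_VALS = {
    "progress": ord("2"),         # 50
    "add-checksum": ord("3"),     # 51
    "remove-checksum": ord("4"),  # 52
}
FIXED_VALS = {"progress": 2, "add-checksum": 3, "remove-checksum": 4}


def handle(option_val):
    """Dispatch on the parsed value, comparing against the integers 2, 3, 4."""
    if option_val == 2:
        return "progress"
    if option_val == 3:
        return "add-checksum"
    if option_val == 4:
        return "remove-checksum"
    return None  # option not recognized
```

With the broken table, `handle(BROKEN_VALS["progress"])` falls through to the unrecognized case, because 50 never equals 2; with the fixed table each option is dispatched as intended.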

6e137747

Set default JAVA_HOME for all users (#5601) · 5f098c9e

Authored by Lav Jain on Aug 27, 2018

5f098c9e

Fix psql \dE command to display both foreign and external tables. · 4a6353f7

Authored by Heikki Linnakangas on Aug 27, 2018

It would previously not display foreign tables, because they were filtered
out by the GPDB-added condition on relstorage. Fix, and add test case.

4a6353f7

Remove FIXME comment; on second thoughts, the query is fine. · 304de73e

Authored by Heikki Linnakangas on Aug 27, 2018

In GPDB5, we used to get an "IS NOT NULL" condition in this query, but lost
it as part of the 9.0 merge. The plan difference was flagged as a FIXME,
because the IS NOT NULL filter could save some time, by eliminating rows
early on, that we know can't match the join qual. On second thoughts, I
think we should just live with it, even if it might be a performance
regression in some queries, because:

* Even in GPDB 5, you only got the extra Filter with the Postgres planner.
  ORCA never bothered with it.

* The same Filter could be added to most joins, but we were only doing it
  for NOT IN queries. That suggests that it wasn't done as a performance
  optimization in the first place, but it was probably required for
  correctness for some plans. (Subquery planning has changed heavily in
  the Postgres merges, so whatever the reason for it originally was, I'm
  pretty sure it's not needed anymore.)

* Eliminating the rows early doesn't matter much in the typical case that
  you have a Join node immediately on top of the Scan node. (The main case
  where it helps is if there is a Motion or Sort node in between.)

* It's rare in practice, to do joins on a column that contains a lot of
  NULLs.

All in all, I think it would be an interesting little optimization to do,
but we should do it consistently for all joins. And we should do that in
PostgreSQL community first, as it's not really MPP-related. So for now,
in GPDB, just remove the comment and accept the plan as it is, without
the Filter.

304de73e

Clean up max, flags, and docs of 'track_activity_query_size' setting. · d4892f81

Authored by Heikki Linnakangas on Aug 27, 2018

In GPDB 5, this setting was called 'pgstat_track_activity_query_size',
but in the 8.4 merge, it was renamed to just 'track_activity_query_size',
to match upstream. Clean it up to match upstream:

* Change the maximum from INT_MAX to 100 kB. 100 kB should be enough for
  everybody.

* It is in the sample configuration file now, so remove the
  GUC_NOT_IN_SAMPLE flag.

* Fix documentation. I just replaced 'pgstat_track_activity_query_size'
  with 'track_activity_query_size', and moved the paragraphs to keep
  alphabetical order. Other than the name, the old text still seems valid.

d4892f81

Remove unused behave steps and files · fb9ddd8b

Authored by Chris Hajas on Jul 03, 2018

Authored-by: Chris Hajas <chajas@pivotal.io>

fb9ddd8b

Remove unused and unmaintained behave feature files · 376ca136

Authored by Chris Hajas on Aug 07, 2018

The gpstop, gpstart, and gpdeletesystem behave tests haven't been
touched in years and aren't being run.

Authored-by: Chris Hajas <chajas@pivotal.io>

376ca136

Autotrigger stats merge for root partition when leaf is analyzed · 9a113c21

Authored by Omer Arap on Aug 08, 2018

This commit automatically triggers the root partition statistics merge
operation when a leaf partition is analyzed individually. It triggers
the merge only if the other partitions are already analyzed and their
stats are in place.

Remove analyze MERGE

As the previous commit triggers the root partition merge automatically,
we do not need the separate MERGE keyword for merging root partition
statistics.
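The trigger condition can be read as a simple predicate. The following Python sketch is an assumption about the rule as stated in this commit message, not the actual analyze code; `should_merge_root_stats` and its arguments are hypothetical.

```python
def should_merge_root_stats(leaf_has_stats, just_analyzed):
    """Decide whether root stats can be merged after analyzing one leaf.

    leaf_has_stats: dict mapping leaf partition name -> True if stats exist.
    just_analyzed:  name of the leaf partition analyzed in this run.
    """
    status = dict(leaf_has_stats)
    status[just_analyzed] = True   # the leaf we just analyzed now has stats
    # Merge only when every leaf, including the other ones, has stats in place.
    return all(status.values())
```

For example, analyzing the last leaf that lacked stats would trigger the merge, while analyzing one leaf of a table whose other leaves were never analyzed would not.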

9a113c21

Define RELCACHE_FORCE_RELEASE for --enable-cassert. · 42c79eea

Authored by Ashwin Agrawal on Jul 04, 2018

Many issues are surfaced by RELCACHE_FORCE_RELEASE, so it is best to enable it
with --enable-cassert, along with the already enabled CLOBBER_FREED_MEMORY.

There was a concern that RELCACHE_FORCE_RELEASE would increase test time
significantly. But when trying it out, ICG time was only marginally affected,
going from 20 mins to 21 mins on my laptop. Hence enable it with --enable-cassert.

42c79eea

docs - add best practices page for resource groups (#5430) · 1551d1e5

Authored by Lisa Owen on Aug 27, 2018

* docs - add resource group best practices page

* assign non-admin roles a resource group

* edits requested by david

* address comments from ning

* remove workload terminology from command center bullet

* move example calculation section below configuring RGs

1551d1e5

27 Aug, 2018 6 commits

Fix gpperfmon bug that ATExecSetDistributedBy statement produce wrong tsubmit time (#5583) · ad8b9a0e

Authored by Wenlin Zhang on Aug 27, 2018

* Fix gpperfmon bug where an ATExecSetDistributedBy statement produces a wrong tsubmit.

Bypass the gpmon start packet for ExecutorStart called from ATExecSetDistributedBy,
so that such queries are no longer recorded in gpperfmon.

Co-authored-by: Wenlin Zhang <wzhang@pivotal.io>
Co-authored-by: Ru Yang <ruyang@pivotal.io>
Co-authored-by: Renyuan Wang <rewang@pivotal.io>

ad8b9a0e

Fix compilation error in print_path() · 77181805

Authored by Paul Guo on Aug 24, 2018

print_path() is useful for planner debugging.

Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Paul Guo <paulguo@gmail.com>

77181805

Fix rechecking index quals with lossy index operators on AO tables. · 83d055aa

Authored by Heikki Linnakangas on Aug 26, 2018

In 8.4, a 'recheck' flag was added to TBMIterateResult. But the bitmap
AO table scan methods didn't get the memo. Because of that, they would
fail to recheck the original index quals, when the index operator was
lossy. Test case included to demonstrate the bug.

Unrelatedly, remove a FIXME comment about the need to have EvalPlanQual
recheck functions for bitmap AO table scans. That 'recheck' method is
in fact unnecessary for AO tables, as explained in the comment that
replaces the FIXME. This is confusing, so pay attention: the 'recheck'
functionality that this commit fixes, is unrelated to the FIXME comment
and the Recheck "access methods". But I found the bug with the
TBMIterateResult.recheck field when I started to look into that. I'm glad
I was confused about that myself, or I would not have tested the thing
that was actually broken :-)

As a side-note, it's confusing that we have essentially duplicated the
bitmap heap/AO/AOCS scan code so many times. We have Bitmap Table Scan for
heap, AO and AOCS tables, when ORCA is used. And separate Bitmap Heap Scan
and Bitmap AO/AOCS Scan nodes when the Postgres planner is used. We
should refactor that, but not in this patch.
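The 'recheck' behavior fixed by this commit can be illustrated with a toy bitmap scan in Python. This is a simplified sketch and not the bitmap AO scan code: the triples in `bitmap_result` are only loosely modeled on TBMIterateResult, and `bitmap_scan`, `honor_recheck` and the sample data are hypothetical.

```python
def bitmap_scan(pages, bitmap_result, qual, honor_recheck=True):
    """Return the tuples named by the bitmap, re-testing quals when asked to.

    pages:         dict of page number -> list of tuple values
    bitmap_result: list of (page_no, offsets, recheck) entries; recheck=True
                   means the index may have reported false positives
    qual:          the original index qual, as a predicate on a tuple value
    """
    out = []
    for page_no, offsets, recheck in bitmap_result:
        for off in offsets:
            tup = pages[page_no][off]
            if honor_recheck and recheck and not qual(tup):
                continue  # false positive from a lossy index operator
            out.append(tup)
    return out


pages = {0: [5, 12, 7], 1: [20, 3]}          # made-up table contents
qual = lambda v: v > 10                       # the original index qual
# The lossy operator also reported offset 2 on page 0 (value 7) as a candidate.
bitmap_result = [(0, [1, 2], True), (1, [0], False)]

correct = bitmap_scan(pages, bitmap_result, qual)                      # rechecks
buggy = bitmap_scan(pages, bitmap_result, qual, honor_recheck=False)   # ignores flag
```

Ignoring the flag, as the AO scan methods effectively did, lets the false positive through; honoring it filters the candidate out again.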

83d055aa

Make 'incremental_analyze' more resilient with different settings. · 3472fe60

Authored by Heikki Linnakangas on Aug 26, 2018

1. Force lc_monetary='C' in the test database, like pg_regress does for
   the 'regression' database. Otherwise the money values might come out
   e.g. as '£1,234.00' while the expected output has '$1,234.00'

2. Set log_statement='none'. Otherwise, because the test sets
   client_min_messages='log', the output depends on the original
   log_statement setting.

3472fe60

Remove obsolete backwards-compatibility code in hba parsing. · e9f506e6

Authored by Heikki Linnakangas on Aug 26, 2018

This chunk was introduced back in 2009. The commit message said:
(from the old git repository, long before GPDB was open sourced)

    Some backwards compatability fixes.
    Allow database name to come before arguments in psql
    Allow old pg_hba syntax " local all userid indent sameuser" as well as new syntax  of " local sameuser userid ident".

Back then, there was a comment in the code too, that said:

    XXX: attempt to do some backwards compatible parsing here?

That comment was removed in 2010, when the hba.c code was synced with
PostgreSQL 9.0.

I don't think we need to maintain compatibility with ancient pg_hba.conf
files anymore. If anyone still has a config file with those fields swapped,
it's time to fix the file.

e9f506e6

Remove obsolete, commented-out, "devenv" call in Windows build script. · aa044eb9

Authored by Heikki Linnakangas on Aug 26, 2018

The upstream doesn't have this, and the Windows build in Concourse is happy
without it, so apparently we don't need it.

aa044eb9

25 Aug, 2018 1 commit

Exclude partition table roots in gp_toolkit.gp_bloat_expected_pages view · f6d698fe

Authored by Jimmy Yih on Aug 22, 2018

The gp_toolkit.gp_bloat_expected_pages view could mistakenly report a
partition table root as bloated even though partition table roots do
not contain data. This could deceive a user into running a costly
VACUUM FULL on the partition table when it was not needed.

f6d698fe