1. 24 Sep 2019, 1 commit
    • Omit slice information for SubPlans that are not dispatched separately. · 96c6d318
      Committed by Heikki Linnakangas
      Printing the slice information makes sense for Init Plans, which are
      dispatched separately, before the main query. But not so much for other
      SubPlans, which are just part of the plan tree; there is no dispatching
      or motion involved in such SubPlans. The SubPlan might *contain* Motions,
      but we print the slice information for those Motions separately. The slice
      information was always just the same as the parent node's, which adds no
      information, and can be misleading if it makes the reader think that there
      is inter-node communication involved in such SubPlans.
  2. 12 Aug 2019, 1 commit
    • Init plan with input param follows upstream execution flow · f6f1f9bb
      Committed by Weinan WANG
      In GPDB, we pre-execute init plans in ExecutorStart to reduce the number
      of slices. However, init plans that require params from nodes at the same
      level should follow the upstream execution flow.
      
      To recognize this initplan pattern, record `extParam` when creating the
      `subplan` object.
      
      Fix issue: #6953
  3. 25 Jul 2019, 1 commit
  4. 14 Mar 2019, 1 commit
  5. 15 Feb 2019, 1 commit
  6. 01 Feb 2019, 1 commit
    • Use normal hash operator classes for data distribution. · 242783ae
      Committed by Heikki Linnakangas
      Replace the use of the built-in hashing support for built-in datatypes, in
      cdbhash.c, with the normal PostgreSQL hash functions. Now is a good time
      to do this, since we've already made the change to use jump consistent
      hashing in GPDB 6, so we'll need to deal with the upgrade problems
      associated with changing the hash functions, anyway.
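
      As background, the jump consistent hashing mentioned above can be sketched
      compactly. The following is an illustrative Python port of the published
      Lamping/Veach algorithm, not GPDB's actual C implementation in cdbhash.c;
      the constant and loop structure follow the paper.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets).

    Defining property: when num_buckets grows from n to n + 1, a key
    either keeps its bucket or moves to the new bucket n, so only about
    1/n of the data moves when a segment is added.
    """
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit LCG step; the mask emulates C uint64 wraparound semantics
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

      The minimal-movement property in the docstring is what makes this scheme
      attractive for data distribution, at the cost of the upgrade work this
      commit deals with.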
      
      It is no longer enough to track which columns/expressions are used to
      distribute data. You also need to know the hash function used. For that,
      a new field is added to gp_distribution_policy, to record the hash
      operator class used for each distribution key column. In the planner,
      a new opfamily field is added to DistributionKey, to track that throughout
      the planning.
      
      Normally, if you do "CREATE TABLE ... DISTRIBUTED BY (column)", the
      default hash operator class for the datatype is used. But this patch
      extends the syntax so that you can specify the operator class explicitly,
      like "... DISTRIBUTED BY (column opclass)". This is similar to how an
      operator class can be specified for each column in CREATE INDEX.
      
      To support upgrade, the old hash functions have been converted to special
      (non-default) operator classes, named cdbhash_*_ops. For example, if you
      want to use the old hash function for an integer column, you could do
      "DISTRIBUTED BY (intcol cdbhash_int4_ops)". The old hard-coded whitelist
      of operators that have "compatible" cdbhash functions has been replaced
      by putting the compatible hash opclasses in the same operator family. For
      example, the legacy integer operator classes cdbhash_int2_ops,
      cdbhash_int4_ops and cdbhash_int8_ops are all part of the
      cdbhash_integer_ops operator family.
      
      This removes the pg_database.hashmethod field. The hash method is now
      tracked on a per-table and per-column basis, using the opclasses, so it's
      not needed anymore.
      
      To help with upgrade from GPDB 5, this introduces a new GUC called
      'gp_use_legacy_hashops'. If it's set, CREATE TABLE uses the legacy hash
      opclasses, instead of the default hash opclasses, if the opclass is not
      specified explicitly. pg_upgrade will set the new GUC, to force the use of
      legacy hashops, when restoring the schema dump. It will also set the GUC
      on all upgraded databases, as a per-database option, so any new tables
      created after upgrade will also use the legacy opclasses. It seems better
      to be consistent after upgrade, so that collocation between old and new
      tables work for example. The idea is that some time after the upgrade, the
      admin can reorganize all tables to use the default opclasses instead. At
      that point, he should also clear the GUC on the converted databases. (Or
      rather, the automated tool that hasn't been written yet, should do that.)
      
      ORCA doesn't know about hash operator classes, or the possibility that we
      might need to use a different hash function for two columns with the same
      datatype. Therefore, it cannot produce correct plans for queries that mix
      different distribution hash opclasses for the same datatype, in the same
      query. There are checks in the Query->DXL translation, to detect that
      case, and fall back to planner. As long as you stick to the default
      opclasses in all tables, we let ORCA create the plan without any regard
      to them, and use the default opclasses when translating the DXL plan to a
      Plan tree. We also allow the case that all tables in the query use the
      "legacy" opclasses, so that ORCA works after pg_upgrade. But a mix of the
      two, or using any non-default opclasses, forces ORCA to fall back.
      
      One curiosity with this is the "int2vector" and "aclitem" datatypes. They
      have a hash opclass, but no b-tree operators. GPDB 4 used to allow them
      as DISTRIBUTED BY columns, but we forbid that in GPDB 5, in commit
      56e7c16b. Now they are allowed again, so you can specify an int2vector
      or aclitem column in DISTRIBUTED BY, but it's still pretty useless,
      because the planner still can't form EquivalenceClasses on it, and will
      treat it as "strewn" distribution, and won't co-locate joins.
      
      Abstime, reltime, tinterval datatypes don't have default hash opclasses.
      They are being removed completely on PostgreSQL v12, and users shouldn't
      be using them in the first place, so instead of adding hash opclasses for
      them now, we accept that they can't be used as distribution key columns
      anymore. Add a check to pg_upgrade, to refuse upgrade if they are used
      as distribution keys in the old cluster. Do the same for 'money' datatype
      as well, although that's not being removed in upstream.
      
      The legacy hashing code for anyarray in GPDB 5 was actually broken. It
      could produce a different hash value for two arrays that are considered
      equal, according to the = operator, if there were differences in e.g.
      whether the null bitmap was stored or not. Add a check to pg_upgrade, to
      reject the upgrade if array types were used as distribution keys. The
      upstream hash opclass for anyarray works, though, so it is OK to use
      arrays as distribution keys in new tables. We just don't support binary
      upgrading them from GPDB 5. (See github issue
      https://github.com/greenplum-db/gpdb/issues/5467). The legacy hashing of
      'anyrange' had the same problem, but that was new in GPDB 6, so we don't
      need a pg_upgrade check for that.
      
      This also tightens the checks ALTER TABLE ALTER COLUMN and CREATE UNIQUE
      INDEX, so that you can no longer create a situation where a non-hashable
      column becomes the distribution key. (Fixes github issue
      https://github.com/greenplum-db/gpdb/issues/6317)
      
      Discussion: https://groups.google.com/a/greenplum.org/forum/#!topic/gpdb-dev/4fZVeOpXllQ
      Co-authored-by: Mel Kiyama <mkiyama@pivotal.io>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Co-authored-by: Pengzhou Tang <ptang@pivotal.io>
      Co-authored-by: Chris Hajas <chajas@pivotal.io>
      Reviewed-by: Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      Reviewed-by: Simon Gao <sgao@pivotal.io>
      Reviewed-by: Jesse Zhang <jzhang@pivotal.io>
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
      Reviewed-by: Yandong Yao <yyao@pivotal.io>
  7. 11 Sep 2018, 1 commit
    • Make planner generate redistribute-motion. · a4cbf586
      Committed by ZhangJackey
      When doing an inner join, we test whether we can use a redistribute motion
      with the function cdbpath_partkeys_from_preds. But if a_partkey is NIL (it
      is NIL at the beginning of the function), we append nothing to it, so the
      function can only return false. As a result, the planner can only generate
      a broadcast motion for the inner relation.
      
      We fix this with the same logic as for an outer join.
      
      A WTS node is immovable; this commit adds some code to handle it.
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  8. 18 Aug 2018, 1 commit
  9. 03 Aug 2018, 1 commit
  10. 02 Aug 2018, 1 commit
    • Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Committed by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap access.
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints now are performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plan was supported for specific parameter values even when using
        prepared statements.
      
      * API for FDW was improved to provide multiple access "paths" for their tables,
        allowing more flexibility in join planning.
      
      * The security_barrier option was added for views, to prevent optimizations
        that might allow view-protected data to be exposed to users.
      
      * Range data type was added to store a lower and upper bound belonging to its
        base data type.
      
      * CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility statement. The
        SELECT query is planned during the execution of the utility. To conform to
        this change, GPDB executes the utility statement only on QD and dispatches
        the plan of the SELECT query to QEs.
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  11. 12 Mar 2018, 1 commit
  12. 01 Mar 2018, 1 commit
    • Give a better error message, if preparing an xact fails. · b3c50e40
      Committed by Heikki Linnakangas
      If an error happens in the prepare phase of two-phase commit, relay the
      original error back to the client, instead of the fairly opaque
      "Abort [Prepared] broadcast failed to one or more segments" message you
      got previously. A lot of things happen during the prepare phase that
      can legitimately fail, like checking deferred constraints, as in the
      'constraints' regression test. But even without that, there can be
      triggers, ON COMMIT actions, etc., any of which can fail.
      
      This commit consists of several parts:
      
      * Pass 'true' for the 'raiseError' argument when dispatching the prepare
        dtx command in doPrepareTransaction(), so that the error is emitted to
        the client.
      
      * Bubble up an ErrorData struct, with as many fields intact as possible,
        to the caller when dispatching a dtx command (instead of constructing
        a message in a StringInfo), so that we can re-throw the message to
        the client with its original formatting.
      
      * Don't throw an error in performDtxProtocolCommand(), if we try to abort
        a prepared transaction that doesn't exist. That is business-as-usual,
        if a transaction throws an error before finishing the prepare phase.
      
      * Suppress the "NOTICE: Releasing segworker groups to retry broadcast."
        message, when aborting a prepared transaction.
      
      Put together, the effect is that if an error happens during the prepare
      client receives a message that is largely indistinguishable from the
      message you'd get if the same failure happened while running a normal
      statement.
      
      Fixes github issue #4530.
  13. 09 Feb 2018, 1 commit
    • Fix more whitespace in tests, mostly in expected output. · 93b92ca4
      Committed by Heikki Linnakangas
      Commit ce3153fa, about to be merged from PostgreSQL 9.0 soon, removes
      the -w option from pg_regress's "diff" invocation. That commit will fix
      all the PostgreSQL regression tests to pass without it, but we need to
      also fix all the GPDB tests. That's what this commit does.
      
      I did much of this in commit 06a2bb64, but now that we're about to
      actually merge that, more cases popped up.
      
      Co-Author: Daniel Gustafsson <dgustafsson@pivotal.io>
  14. 13 Dec 2017, 1 commit
    • Rename querytree_safe_for_segment to querytree_safe_for_qe · 32f099fd
      Committed by Shreedhar Hardikar
      The original name was deceptive because this check is also done for QE
      slices that run on master. For example:
      
      EXPLAIN SELECT * FROM func1_nosql_vol(5), foo;
      
                                               QUERY PLAN
      --------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice2; segments: 3)  (cost=0.30..1.37 rows=4 width=12)
         ->  Nested Loop  (cost=0.30..1.37 rows=2 width=12)
               ->  Seq Scan on foo  (cost=0.00..1.01 rows=1 width=8)
               ->  Materialize  (cost=0.30..0.33 rows=1 width=4)
                     ->  Broadcast Motion 1:3  (slice1)  (cost=0.00..0.30 rows=3 width=4)
                           ->  Function Scan on func1_nosql_vol  (cost=0.00..0.26 rows=1 width=4)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (8 rows)
      
      Note that in the plan, the function func1_nosql_vol() will be executed on a
      master slice with Gp_role as GP_ROLE_EXECUTE.
      
      Also, update output files.
      Signed-off-by: Jesse Zhang <sbjesse@gmail.com>
  15. 27 Sep 2017, 2 commits
    • CDB-specific changes, other fix-ups after merging e549722a · e5f6e826
      Committed by Shreedhar Hardikar
      0. Fix up post-join dedup logic after cherry-pick
      0. Fix pull_up_sublinks_jointree_recurse returning garbage relids
      0. Update gporca, rangefuncs, eagerfree answer files
      	1. gporca
      	Previously we were generating a Hash Inner Join with an
      	HashAggregate for deduplication. Now we generate a Hash
      	Semi Join in which case we do not need to deduplicate the
      	inner.
      
      	2. rangefuncs
      	We updated this answer file during the cherry-pick of
      	e006a24a since there was a change in plan.
      	After these cherry-picks, we are back to the original
      	plan as master. Hence we see the original error.
      
      	3. eagerfree
      	We are generating a not-very-useful subquery scan node
      	with this change. This is not producing wrong results,
      	but this subquery scan needs to be removed.
      	We will file a follow-up chore to investigate and fix this.
      
      0. We no longer need the helper function `hasSemiJoin()` to check whether
      the SpecialJoinInfo list has any SpecialJoinInfos constructed for Semi Join
      (IN/EXISTS sublink). We have moved that check inside
      `cdb_set_cheapest_dedup()`.
      
      0. We are not exercising the pre-join-deduplication code path after
      this cherry-pick. Before this merge, we had three CDB-specific
      fields in `InClauseInfo` in which we recorded information for
      pre-join dedup in the case of simple uncorrelated IN sublinks:
      `try_join_unique`, `sub_targetlist` and `InOperators`.
      Since we now have `SpecialJoinInfo` instead of `InClauseInfo`, we need
      to devise a way to record this information in `SpecialJoinInfo`.
      We have filed a follow-up story for this.
      
      Ref [#142356521]
      Signed-off-by: Dhanashree Kashid <dkashid@pivotal.io>
    • Update test output files and indentation · 102295b8
      Committed by Ekta Khanna and Omer Arap
      This commit updates test answer files after merging with e006a24a.
      
      0. Replace `EXISTS Join` with `Semi Join`
      0. Replace `Left Anti Semi Join` with `Anti Join`
      0. Updated plan for `table_functions` for IN queries as we do not pull
      up the sublink and convert it into a join when sublink testexpr does not
      contain Vars of parent query.
      0. Updated the error in `rangefuncs.out` since we now create a different
      plan for the following query :
      ```
      CREATE TABLE foorescan (fooid int, foosubid int, fooname text, primary key(fooid,foosubid));
      INSERT INTO foorescan values(5000,1,'abc.5000.1');
      INSERT INTO foorescan values(5001,1,'abc.5001.1');
      CREATE FUNCTION foorescan(int,int) RETURNS setof foorescan AS 'SELECT * FROM foorescan WHERE fooid >= $1 and fooid < $2 ;' LANGUAGE SQL;
      SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
      ```
      Plan before fix :
      ```
                                                    QUERY PLAN
      ------------------------------------------------------------------------------------------------------
       Sort  (cost=270.41..270.54 rows=50 width=50)
         Sort Key: f.fooid, f.foosubid
         ->  HashAggregate  (cost=268.50..269.00 rows=50 width=50)
               Group By: f.ctid::bigint, f.gp_segment_id
               ->  Hash Join  (cost=5.12..268.25 rows=50 width=29)
                     Hash Cond: foorescan.fooid = f.fooid
                     ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
                     ->  Hash  (cost=4.50..4.50 rows=17 width=29)
                           ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..4.50 rows=50 width=29)
                                 ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=29)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (12 rows)
      
      ```
      Here the function scan is done on the master, and since the function is
      accessing a distributed relation, `cdbdisp_dispatchToGang()`
      errors out.
      
      Plan after fix:
      ```
       explain SELECT * FROM foorescan f WHERE f.fooid IN (SELECT fooid FROM foorescan(5002,5004)) ORDER BY 1,2;
                                                    QUERY PLAN
      ------------------------------------------------------------------------------------------------------
       Gather Motion 3:1  (slice3; segments: 3)  (cost=299.16..299.29 rows=50 width=19)
         Merge Key: f.fooid, f.foosubid
         ->  Sort  (cost=299.16..299.29 rows=17 width=19)
               Sort Key: f.fooid, f.foosubid
               ->  Hash Semi Join  (cost=292.50..297.75 rows=17 width=19)
                     Hash Cond: f.fooid = foorescan.fooid
                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..4.50 rows=17 width=19)
                           Hash Key: f.fooid
                           ->  Seq Scan on foorescan f  (cost=0.00..3.50 rows=17 width=19)
                     ->  Hash  (cost=280.00..280.00 rows=334 width=4)
                           ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..280.00 rows=1000 width=4)
                                 Hash Key: foorescan.fooid
                                 ->  Function Scan on foorescan  (cost=0.00..260.00 rows=1000 width=4)
       Settings:  optimizer=off
       Optimizer status: legacy query optimizer
      (15 rows)
      ```
      With this new plan, the function scan is executed on a segment, in which
      case `init_sql_fcache()` first walks the query tree and checks whether it
      is safe to be planned and executed on the segment, using
      `querytree_safe_for_segment_walker()`.
      `querytree_safe_for_segment_walker()` errors out since the function is
      accessing a distributed table. Both the new and old errors test the same
      scenario, but due to the plan change, the place where we bail out is different.
      
      Ref [#142355175]
      Signed-off-by: Ekta Khanna <ekhanna@pivotal.io>
  16. 18 Nov 2016, 1 commit
    • Use proper error code for errors. · 0bf31cd6
      Committed by Heikki Linnakangas
      Attach a suitable error code for many errors that were previously reported
      as "internal errors". GPDB's elog.c prints a source file name and line
      number for any internal errors, which is a bit ugly for errors that are
      in fact unexpected internal errors, but user-facing errors that happen
      as a result of e.g. an invalid query.
      
      To make sure we don't accumulate more of these, adjust the regression tests
      to not ignore the source file and line number in error messages. There are
      a few exceptions, which are listed explicitly.
  17. 06 Jun 2016, 1 commit
    • Backport b153c092 from PostgreSQL 8.4 · 78b0a42e
      Committed by Heikki Linnakangas
      This is a partial backport of a larger body of work, parts of which have
      already been backported.
      
      Remove the GPDB-specific "breadcrumbs" mechanism from the parser. It is
      made obsolete by the upstream mechanism. We lose context information from
      a few errors, which is unfortunate, but seems acceptable. Upstream doesn't
      have context information for those errors either.
      
      The backport was originally done by Daniel Gustafsson, on top of the
      PostgreSQL 8.3 merge. I tweaked it to apply it to master, before the
      merge.
      
      Upstream commit:
      
        commit b153c092
        Author: Tom Lane <tgl@sss.pgh.pa.us>
        Date:   Mon Sep 1 20:42:46 2008 +0000
      
          Add a bunch of new error location reports to parse-analysis error messages.
          There are still some weak spots around JOIN USING and relation alias lists,
          but most errors reported within backend/parser/ now have locations.
  18. 28 Oct 2015, 1 commit
  19. 12 Mar 2015, 1 commit
    • Support flattening of empty-FROM subqueries and one-row VALUES tables. · f4abd024
      Committed by Tom Lane
      We can't handle this in the general case due to limitations of the
      planner's data representations; but we can allow it in many useful cases,
      by being careful to flatten only when we are pulling a single-row subquery
      up into a FROM (or, equivalently, inner JOIN) node that will still have at
      least one remaining relation child.  Per discussion of an example from
      Kyotaro Horiguchi.
  20. 21 Oct 2014, 2 commits
    • Fix mishandling of FieldSelect-on-whole-row-Var in nested lateral queries. · 33343b86
      Committed by Tom Lane
      If an inline-able SQL function taking a composite argument is used in a
      LATERAL subselect, and the composite argument is a lateral reference,
      the planner could fail with "variable not found in subplan target list",
      as seen in bug #11703 from Karl Bartel.  (The outer function call used in
      the bug report and in the committed regression test is not really necessary
      to provoke the bug --- you can get it if you manually expand the outer
      function into "LATERAL (SELECT inner_function(outer_relation))", too.)
      
      The cause of this is that we generate the reltargetlist for the referenced
      relation before doing eval_const_expressions() on the lateral sub-select's
      expressions (cf find_lateral_references()), so what's scheduled to be
      emitted by the referenced relation is a whole-row Var, not the simplified
      single-column Var produced by optimizing the function's FieldSelect on the
      whole-row Var.  Then setrefs.c fails to match up that lateral reference to
      what's available from the outer scan.
      
      Preserving the FieldSelect optimization in such cases would require either
      major planner restructuring (to recursively do expression simplification
      on sub-selects much earlier) or some amazingly ugly kluge to change the
      reltargetlist of a possibly-already-planned relation.  It seems better
      just to skip the optimization when the Var is from an upper query level;
      the case is not so common that it's likely anyone will notice a few
      wasted cycles.
      
      AFAICT this problem only occurs for uplevel LATERAL references, so
      back-patch to 9.3 where LATERAL was added.
    • Fix mishandling of FieldSelect-on-whole-row-Var in nested lateral queries. · f330a6d1
      Committed by Tom Lane
      If an inline-able SQL function taking a composite argument is used in a
      LATERAL subselect, and the composite argument is a lateral reference,
      the planner could fail with "variable not found in subplan target list",
      as seen in bug #11703 from Karl Bartel.  (The outer function call used in
      the bug report and in the committed regression test is not really necessary
      to provoke the bug --- you can get it if you manually expand the outer
      function into "LATERAL (SELECT inner_function(outer_relation))", too.)
      
      The cause of this is that we generate the reltargetlist for the referenced
      relation before doing eval_const_expressions() on the lateral sub-select's
      expressions (cf find_lateral_references()), so what's scheduled to be
      emitted by the referenced relation is a whole-row Var, not the simplified
      single-column Var produced by optimizing the function's FieldSelect on the
      whole-row Var.  Then setrefs.c fails to match up that lateral reference to
      what's available from the outer scan.
      
      Preserving the FieldSelect optimization in such cases would require either
      major planner restructuring (to recursively do expression simplification
      on sub-selects much earlier) or some amazingly ugly kluge to change the
      reltargetlist of a possibly-already-planned relation.  It seems better
      just to skip the optimization when the Var is from an upper query level;
      the case is not so common that it's likely anyone will notice a few
      wasted cycles.
      
      AFAICT this problem only occurs for uplevel LATERAL references, so
      back-patch to 9.3 where LATERAL was added.
  21. 01 May 2014, 1 commit
    • Fix indentation of JOIN clauses in rule/view dumps. · d166eed3
      Committed by Tom Lane
      The code attempted to outdent JOIN clauses further left than the parent
      FROM keyword, which was odd in any case, and led to inconsistent formatting
      since in simple cases the clauses couldn't be moved any further left than
      that.  And it left a permanent decrement of the indentation level, causing
      subsequent lines to be much further left than they should be (again, this
      couldn't be seen in simple cases for lack of indentation to give up).
      
      After a little experimentation I chose to make it indent JOIN keywords
      two spaces from the parent FROM, which is one space more than the join's
      lefthand input in cases where that appears on a different line from FROM.
      
      Back-patch to 9.3.  This is a purely cosmetic change, and the bug is quite
      old, so that may seem arbitrary; but we are going to be making some other
      changes to the indentation behavior in both HEAD and 9.3, so it seems
      reasonable to include this in 9.3 too.  I committed this one first because
      its effects are more visible in the regression test results as they
      currently stand than they will be later.
  22. 09 Jan 2014, 1 commit
    • Fix "cannot accept a set" error when only some arms of a CASE return a set. · 080b7db7
      Committed by Tom Lane
      In commit c1352052, I implemented an
      optimization that assumed that a function's argument expressions would
      either always return a set (ie multiple rows), or always not.  This is
      wrong however: we allow CASE expressions in which some arms return a set
      of some type and others just return a scalar of that type.  There may be
      other examples as well.  To fix, replace the run-time test of whether an
      argument returned a set with a static precheck (expression_returns_set).
      This adds a little bit of query startup overhead, but it seems barely
      measurable.
      
      Per bug #8228 from David Johnston.  This has been broken since 8.0,
      so patch all supported branches.
  23. 10 Dec 2013, 1 commit
    • Rename TABLE() to ROWS FROM(). · 53685d79
      Committed by Noah Misch
      SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and
      other collection types.  This feature, however, deals with set-returning
      functions.  Use a different syntax for this feature to keep open the
      possibility of implementing the standard TABLE().
  24. 22 Nov 2013, 1 commit
    • Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. · 784e762e
      Committed by Tom Lane
      This patch adds the ability to write TABLE( function1(), function2(), ...)
      as a single FROM-clause entry.  The result is the concatenation of the
      first row from each function, followed by the second row from each
      function, etc; with NULLs inserted if any function produces fewer rows than
      others.  This is believed to be a much more useful behavior than what
      Postgres currently does with multiple SRFs in a SELECT list.
      
      This syntax also provides a reasonable way to combine use of column
      definition lists with WITH ORDINALITY: put the column definition list
      inside TABLE(), where it's clear that it doesn't control the ordinality
      column as well.
      
      Also implement SQL-compliant multiple-argument UNNEST(), by turning
      UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
      
      The SQL standard specifies TABLE() with only a single function, not
      multiple functions, and it seems to require an implicit UNNEST() which is
      not what this patch does.  There may be something wrong with that reading
      of the spec, though, because if it's right then the spec's TABLE() is just
      a pointless alternative spelling of UNNEST().  After further review of
      that, we might choose to adopt a different syntax for what this patch does,
      but in any case this functionality seems clearly worthwhile.
      
      Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
      significantly revised by me
      784e762e
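      A sketch of the feature, written with the ROWS FROM() spelling that the syntax was later renamed to (per the 10 Dec 2013 entry above):

      ```sql
      -- Functions advance in lockstep; shorter results are padded with NULLs.
      SELECT * FROM ROWS FROM (generate_series(1, 3),
                               generate_series(10, 11)) AS t(a, b);
      --  a |  b
      --  1 | 10
      --  2 | 11
      --  3 | (null)

      -- Multi-argument UNNEST unnests each array in parallel the same way:
      SELECT * FROM unnest(ARRAY[1, 2], ARRAY['x', 'y', 'z']) AS u(n, s);
      ```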
  25. 24 Aug 2013, 1 commit
    • T
      In locate_grouping_columns(), don't expect an exact match of Var typmods. · fcf9ecad
      Committed by Tom Lane
      It's possible that inlining of SQL functions (or perhaps other changes?)
      has exposed typmod information not known at parse time.  In such cases,
      Vars generated by query_planner might have valid typmod values while the
      original grouping columns only have typmod -1.  This isn't a semantic
      problem since the behavior of grouping only depends on type not typmod,
      but it breaks locate_grouping_columns' use of tlist_member to locate the
      matching entry in query_planner's result tlist.
      
      We can fix this without an excessive amount of new code or complexity by
      relying on the fact that locate_grouping_columns only gets called when
      make_subplanTargetList has set need_tlist_eval == false, and that can only
      happen if all the grouping columns are simple Vars.  Therefore we only need
      to search the sub_tlist for a matching Var, and we can reasonably define a
      "match" as being a match of the Var identity fields
      varno/varattno/varlevelsup.  The code still Asserts that vartype matches,
      but ignores vartypmod.
      
      Per bug #8393 from Evan Martin.  The added regression test case is
      basically the same as his example.  This has been broken for a very long
      time, so back-patch to all supported branches.
      fcf9ecad
  26. 29 Jul 2013, 1 commit
  27. 27 Jan 2013, 1 commit
    • T
      Make LATERAL implicit for functions in FROM. · 2378d79a
      Committed by Tom Lane
      The SQL standard does not have general functions-in-FROM, but it does
      allow UNNEST() there (see the <collection derived table> production),
      and the semantics of that are defined to include lateral references.
      So spec compliance requires allowing lateral references within UNNEST()
      even without an explicit LATERAL keyword.  Rather than making UNNEST()
      a special case, it seems best to extend this flexibility to any
      function-in-FROM.  We'll still allow LATERAL to be written explicitly
      for clarity's sake, but it's now a noise word in this context.
      
      In theory this change could result in a change in behavior of existing
      queries, by allowing what had been an outer reference in a function-in-FROM
      to be captured by an earlier FROM-item at the same level.  However, all
      pre-9.3 PG releases have a bug that causes them to match variable
      references to earlier FROM-items in preference to outer references (and
      then throw an error).  So no previously-working query could contain the
      type of ambiguity that would risk a change of behavior.
      
      Per a suggestion from Andrew Gierth, though I didn't use his patch.
      2378d79a
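      A sketch of what this makes legal (table and columns are hypothetical):

      ```sql
      CREATE TABLE docs (id int, tags text[]);

      -- unnest() references d.tags from an earlier FROM item without an
      -- explicit LATERAL keyword; LATERAL is now a noise word here.
      SELECT d.id, t.tag
      FROM docs d, unnest(d.tags) AS t(tag);
      ```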
  28. 08 Aug 2012, 1 commit
    • T
      Implement SQL-standard LATERAL subqueries. · 5ebaaa49
      Committed by Tom Lane
      This patch implements the standard syntax of LATERAL attached to a
      sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM,
      since set-returning function calls are expected to be one of the principal
      use-cases.
      
      The main change here is a rewrite of the mechanism for keeping track of
      which relations are visible for column references while the FROM clause is
      being scanned.  The parser "namespace" lists are no longer lists of bare
      RTEs, but are lists of ParseNamespaceItem structs, which carry an RTE
      pointer as well as some visibility-controlling flags.  Aside from
      supporting LATERAL correctly, this lets us get rid of the ancient hacks
      that required rechecking subqueries and JOIN/ON and function-in-FROM
      expressions for invalid references after they were initially parsed.
      Invalid column references are now always correctly detected on sight.
      
      In passing, remove assorted parser error checks that are now dead code by
      virtue of our having gotten rid of add_missing_from, as well as some
      comments that are obsolete for the same reason.  (It was mainly
      add_missing_from that caused so much fudging here in the first place.)
      
      The planner support for this feature is very minimal, and will be improved
      in future patches.  It works well enough for testing purposes, though.
      
      catversion bump forced due to new field in RangeTblEntry.
      5ebaaa49
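      A sketch of the standard syntax this enables (customers/orders are hypothetical tables); the sub-SELECT can reference columns of FROM items to its left:

      ```sql
      -- "Top 3 most recent orders per customer": the LATERAL sub-SELECT
      -- is re-evaluated for each row of c, referencing c.id.
      SELECT c.name, recent.*
      FROM customers c,
           LATERAL (SELECT o.id, o.placed_at
                    FROM orders o
                    WHERE o.customer_id = c.id
                    ORDER BY o.placed_at DESC
                    LIMIT 3) AS recent;
      ```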
  29. 05 Jul 2012, 1 commit
  30. 27 Jun 2012, 1 commit
  31. 08 Oct 2011, 1 commit
    • T
      Support index-only scans using the visibility map to avoid heap fetches. · a2822fb9
      Committed by Tom Lane
      When a btree index contains all columns required by the query, and the
      visibility map shows that all tuples on a target heap page are
      visible-to-all, we don't need to fetch that heap page.  This patch depends
      on the previous patches that made the visibility map reliable.
      
      There's a fair amount left to do here, notably trying to figure out a less
      chintzy way of estimating the cost of an index-only scan, but the core
      functionality seems ready to commit.
      
      Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.
      a2822fb9
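      A sketch of triggering the feature (table and data are made up; on a tiny table the planner may still prefer a sequential scan):

      ```sql
      CREATE TABLE events (id int, kind text, payload text);
      CREATE INDEX ON events (kind, id);
      VACUUM events;  -- sets visibility-map bits for all-visible pages

      -- Only indexed columns are referenced, so when the pages are
      -- all-visible the plan can show "Index Only Scan" with no heap fetch.
      EXPLAIN SELECT kind, id FROM events WHERE kind = 'click';
      ```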
  32. 01 Dec 2010, 2 commits
    • T
      Prevent inlining a SQL function with multiple OUT parameters. · 0d45e8c5
      Committed by Tom Lane
      There were corner cases in which the planner would attempt to inline such
      a function, which would result in a failure at runtime due to loss of
      information about exactly what the result record type is.  Fix by disabling
      inlining when the function's recorded result type is RECORD.  There might
      be some sub-cases where inlining could still be allowed, but this is a
      simple and backpatchable fix, so leave refinements for another day.
      Per bug #5777 from Nate Carson.
      
      Back-patch to all supported branches.  8.1 happens to avoid a core-dump
      here, but it still does the wrong thing.
      0d45e8c5
    • T
      Prevent inlining a SQL function with multiple OUT parameters. · 225f0aa3
      Committed by Tom Lane
      There were corner cases in which the planner would attempt to inline such
      a function, which would result in a failure at runtime due to loss of
      information about exactly what the result record type is.  Fix by disabling
      inlining when the function's recorded result type is RECORD.  There might
      be some sub-cases where inlining could still be allowed, but this is a
      simple and backpatchable fix, so leave refinements for another day.
      Per bug #5777 from Nate Carson.
      
      Back-patch to all supported branches.  8.1 happens to avoid a core-dump
      here, but it still does the wrong thing.
      225f0aa3
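      A sketch of the class of function affected (names are hypothetical): multiple OUT parameters give the function a recorded result type of RECORD, so after this fix the planner no longer inlines it.

      ```sql
      -- SQL function with two OUT parameters; its prorettype is "record".
      CREATE FUNCTION minmax(a int, b int, OUT lo int, OUT hi int)
      AS $$ SELECT least(a, b), greatest(a, b) $$ LANGUAGE sql;

      SELECT * FROM minmax(3, 1);  -- lo = 1, hi = 3
      ```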
  33. 13 Jul 2010, 1 commit
    • T
      Make NestLoop plan nodes pass outer-relation variables into their inner · 53e75768
      Committed by Tom Lane
      relation using the general PARAM_EXEC executor parameter mechanism, rather
      than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
      The previous method was hard to understand and could never be extended to
      handle parameters coming from multiple join levels.  This patch doesn't
      change the set of possible plans nor have any significant performance effect,
      but it's necessary infrastructure for future generalization of the concept
      of an inner indexscan plan.
      
      ExecReScan's second parameter is now unused, so it's removed.
      53e75768
  34. 19 Apr 2010, 1 commit
    • R
      Add an 'enable_material' GUC. · 5b89ef38
      Committed by Robert Haas
      The logic for determining whether to materialize has been significantly
      overhauled for 9.0.  In case there should be any doubt about whether
      materialization is a win in any particular case, this should provide a
      convenient way of seeing what happens without it; but even with enable_material
      turned off, we still materialize in cases where it is required for
      correctness.
      
      Thanks to Tom Lane for the review.
      5b89ef38
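      Typical usage when comparing plans (the joined tables are hypothetical); as the message notes, Material nodes required for correctness are kept even when the GUC is off:

      ```sql
      SET enable_material = off;       -- discourage Material nodes
      EXPLAIN SELECT *                 -- inspect the plan without them
      FROM t1 JOIN t2 ON t1.a < t2.b;  -- rescan-heavy join, a Material candidate
      RESET enable_material;
      ```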
  35. 14 Dec 2009, 1 commit
    • T
      Fix a bug introduced when set-returning SQL functions were made inline-able: · a620d500
      Committed by Tom Lane
      we have to cope with the possibility that the declared result rowtype contains
      dropped columns.  This fails in 8.4, as per bug #5240.
      
      While at it, be more paranoid about inserting binary coercions when inlining.
      The pre-8.4 code did not really need to worry about that because it could not
      inline at all in any case where an added coercion could change the behavior
      of the function's statement.  However, when inlining a SRF we allow sorting,
      grouping, and set-ops such as UNION.  In these cases, modifying one of the
      targetlist entries that the sort/group/setop depends on could conceivably
      change the behavior of the function's statement --- so don't inline when
      such a case applies.
      a620d500
  36. 08 Oct 2009, 1 commit
  37. 12 Jun 2009, 1 commit
    • T
      Fix things so that you can still do "select foo()" where foo is a SQL · 0c19f058
      Committed by Tom Lane
      function returning setof record.  This used to work, more or less
      accidentally, but I had broken it while extending the code to allow
      materialize-mode functions to be called in select lists.  Add a regression
      test case so it doesn't get broken again.  Per gripe from Greg Davidson.
      0c19f058
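      A sketch of the usage the fix restored (function name and body are made up):

      ```sql
      CREATE FUNCTION pairs() RETURNS SETOF record
      AS $$ SELECT x, x * x FROM generate_series(1, 3) AS x $$ LANGUAGE sql;

      -- In FROM, a column definition list is required:
      SELECT * FROM pairs() AS p(n int, sq int);

      -- The select-list form, which this commit un-broke:
      SELECT pairs();
      ```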