1. 07 Dec 2018, 1 commit
  2. 04 Dec 2018, 1 commit
  3. 03 Dec 2018, 1 commit
  4. 29 Nov 2018, 7 commits
  5. 14 Nov 2018, 1 commit
    • Avoid freeing memory in error path · de09e863
      Authored by Daniel Gustafsson
      Erroring out via ereport(ERROR ..) will clean up resources allocated
      during execution, so explicitly freeing right before is not useful
      (unless the allocation is in the TopMemoryContext).  Remove pfree()
      calls for lower allocations, and reorder one to happen just after a
      conditional ereport instead to make for slightly easier debugging
      when breaking on the error.
      Reviewed-by: Jacob Champion <pchampion@pivotal.io>
  6. 13 Nov 2018, 1 commit
    • Support 'copy (select statement) to file on segment' (#6077) · bad6cebc
      Authored by Jinbao Chen
      In 'copy (select statement) to file', we generate a query plan, set
      its dest receiver to copy_dest_receiver, and run the dest receiver on the QD.
      In 'copy (select statement) to file on segment', we modify the query plan,
      delete the gather motion, and let the dest receiver run on the QEs.
      
      Change 'isCtas' in Query to 'parentStmtType' to be able to mark the upper
      utility statement type. Add a CopyIntoClause node to store copy
      information. Add copyIntoClause to PlannedStmt.
      
      In PostgreSQL, we don't need to make a different query plan for the
      query in a utility statement, but in Greenplum we do. So we use a field
      to indicate whether the query is contained in a utility statement, and
      the type of that utility statement.
      
      The behavior of 'copy (select statement) to file on segment' is very
      similar to 'SELECT ... INTO ...' and 'CREATE TABLE ... AS SELECT ...'.
      We use the distribution policy inherent in the query result as the final
      data distribution policy; if there is none, we use the first column in
      the target list as the key and redistribute. The only difference is that
      we use 'copy_dest_receiver' instead of 'intorel_dest_receiver'.
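
      A minimal usage sketch (the table, predicate and path are illustrative;
      '<SEGID>' follows the existing ON SEGMENT convention of being replaced
      with each segment's ID):

      ```sql
      -- Each QE writes its own part of the query result to a local file,
      -- instead of funneling all rows through the QD.
      COPY (SELECT * FROM sales WHERE amount > 100)
          TO '/tmp/sales_part_<SEGID>.csv' ON SEGMENT;
      ```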
  7. 06 Nov 2018, 1 commit
  8. 05 Nov 2018, 1 commit
  9. 29 Oct 2018, 1 commit
  10. 25 Oct 2018, 1 commit
    • Unify the way to fetch/manage the number of segments (#6034) · 8eed4217
      Authored by Tang Pengzhou
      * Don't use GpIdentity.numsegments directly for the number of segments
      
      Use getgpsegmentCount() instead.
      
      * Unify the way to fetch/manage the number of segments
      
      Commit e0b06678 lets us expand a GPDB cluster without a restart, so
      the number of segments may change during a transaction and we need
      to take care of numsegments.
      
      We now have two ways to get the number of segments: 1) from
      GpIdentity.numsegments, and 2) from gp_segment_configuration
      (cdb_component_dbs), which the dispatcher uses to decide the range of
      segments to dispatch to. We did a lot of work in e0b06678 to keep
      GpIdentity.numsegments up to date, which made the management of segments
      more complicated; now we want a simpler approach:
      
      1. We only allow getting segment info (including the number of segments)
      through gp_segment_configuration, which always has the newest segment info
      (see the query sketch after this list). There is no need to update
      GpIdentity.numsegments any more; it is left only for debugging and it can
      be removed entirely in the future.
      
      2. Each global transaction fetches/updates the newest snapshot of
      gp_segment_configuration and never changes it until the end of the
      transaction unless a writer gang is lost, so a global transaction sees a
      consistent state of segments. We used to use gxidDispatched to do the
      same thing; now it can be removed.
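
      The catalog query below is a sketch of what "getting the segment count
      through gp_segment_configuration" looks like (primary segments are the
      rows with content >= 0 and role = 'p'):

      ```sql
      -- Number of primary segments, derived from the catalog rather than
      -- from GpIdentity.numsegments.
      SELECT count(*) AS numsegments
        FROM gp_segment_configuration
       WHERE content >= 0 AND role = 'p';
      ```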
      
      * Remove GpIdentity.numsegments
      
      GpIdentity.numsegments has no effect now, so remove it. This commit does
      not remove gp_num_contents_in_cluster because that requires modifying
      utilities like gpstart, gpstop and gprecoverseg; let's do that cleanup
      work in another PR.
      
      * Exchange the default UP/DOWN value in fts cache
      
      Previously, the FTS prober read gp_segment_configuration, checked the
      status of segments and then set the status of segments in the shared
      memory array ftsProbeInfo->fts_status[], so other components (mainly
      the dispatcher) could detect that a segment was down.
      
      All segments were initialized as DOWN and then updated to UP in the most
      common case, which brings two problems:

      1. fts_status is invalid until FTS completes its first loop, so the QD
      needs to check ftsProbeInfo->fts_statusVersion > 0.
      2. When gpexpand adds a new segment to gp_segment_configuration, the
      newly added segment may be treated as DOWN if FTS hasn't scanned it
      yet.
      
      This commit changes the default value from DOWN to UP, which resolves
      the problems mentioned above.
      
      * FTS should not be used to notify backends that a gpexpand has occurred

      As Ashwin mentioned in PR#5679, "I don't think giving FTS responsibility to
      provide new segment count is right. FTS should only be responsible for HA
      of the segments. The dispatcher should independently figure out the count
      based on catalog. gp_segment_configuration should be the only way to get
      the segment count", so FTS should be decoupled from gpexpand.
      
      * Access gp_segment_configuration inside a transaction
      
      * Upgrade the log level from ERROR to FATAL if the expand version has changed

      * Modify gpexpand test cases according to the new design
  11. 19 Oct 2018, 1 commit
    • Fix error handling in "COPY <table> TO <file>". · eab449f8
      Authored by Heikki Linnakangas
      If an error occurred in the segments, in a "COPY <table> TO <file>"
      command, the COPY was stopped, but the error was not reported to the user.
      That gave the false impression that it finished successfully, but what you
      actually got was an incomplete file.
      
      A test case is included. It uses a little helper output function that
      sometimes throws an error. Output functions are fairly unlikely to fail,
      but it could happen e.g. because of an out of memory error, or a disk
      failure. The "COPY (SELECT ...) TO <file>" variant did not suffer from
      this (otherwise, a query that throws an error would've been a much simpler
      way to test this.)
      
      The reason for this was that the code in cdbCopyGetData() that called
      PQgetResult(), and extracted the error message from the result, didn't
      indicate to the caller in any way that the error happened. To fix, delay
      the call to PQgetResult(), to a later call to cdbCopyEnd(). cdbCopyEnd()
      already had the logic to extract the error information from the PGresult,
      and throw it to the user. While we're at it, refactor cdbCopyEnd a
      little bit, to give the callers a nicer function signature.
      
      I also changed a few places that used 32-bit int to store rejected row
      counts, to use int64 instead. There was a FIXME comment about that. I
      didn't fix all the places that do that, though, so I moved the FIXME to
      one of the remaining places.
      
      Apply to master branch only. GPDB 5 didn't handle this too well, either;
      with the included test case, you got an error like this:
      
      postgres=# copy broken_type_test to '/tmp/x';
      ERROR:  missing error text
      
      That's not very nice, but at least you get an error, even if it's not a very
      good one. The code looks quite different in 5X_STABLE, so I'm not going to
      attempt improving that.
      Reviewed-by: Adam Lee <ali@pivotal.io>
  12. 28 Sep 2018, 1 commit
    • Allow tables to be distributed on a subset of segments · 4eb65a53
      Authored by ZhangJackey
      There was an assumption in GPDB that a table's data is always distributed
      on all segments. However, this is not always true: for example, when a
      cluster is expanded from M segments to N (N > M), all the tables are still
      on M segments. To work around the problem we used to have to alter all the
      hash-distributed tables to randomly distributed in order to get correct
      query results, at the cost of bad performance.
      
      Now we support distributing a table's data on a subset of segments.

      A new column `numsegments` is added to the catalog table
      `gp_distribution_policy` to record how many segments a table's data is
      distributed on.  By doing so we can allow DML on tables that live on M
      segments, and joins between M-segment and N-segment tables are also
      supported.
      
      ```sql
      -- t1 and t2 are both distributed on (c1, c2),
      -- one on 1 segment, the other on 2 segments
      select localoid::regclass, attrnums, policytype, numsegments
          from gp_distribution_policy;
       localoid | attrnums | policytype | numsegments
      ----------+----------+------------+-------------
       t1       | {1,2}    | p          |           1
       t2       | {1,2}    | p          |           2
      (2 rows)
      
      -- t1 and t1 have exactly the same distribution policy,
      -- join locally
      explain select * from t1 a join t1 b using (c1, c2);
                         QUERY PLAN
      ------------------------------------------------
       Gather Motion 1:1  (slice1; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Seq Scan on t1 b
       Optimizer: legacy query optimizer
      
      -- t1 and t2 are both distributed on (c1, c2),
      -- but as they have different numsegments,
      -- one has to be redistributed
      explain select * from t1 a join t2 b using (c1, c2);
                                QUERY PLAN
      ------------------------------------------------------------------
       Gather Motion 1:1  (slice2; segments: 1)
         ->  Hash Join
               Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
               ->  Seq Scan on t1 a
               ->  Hash
                     ->  Redistribute Motion 2:1  (slice1; segments: 2)
                           Hash Key: b.c1, b.c2
                           ->  Seq Scan on t2 b
       Optimizer: legacy query optimizer
      ```
  13. 06 Sep 2018, 1 commit
    • Fix command tag in "COPY (SELECT ...) TO <file>". · f5130c20
      Authored by Heikki Linnakangas
      It used to always say "COPY 0", instead of the number of rows copied. This
      source line was added in PostgreSQL 9.0 (commit 8ddc05fb), but it was
      missed in the merge. Add a test case to check the command tags of
      different variants of COPY, including this one.
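
      A small illustration of the fixed behaviour (path and row count are
      arbitrary): after this change psql reports the real tag, e.g. "COPY 3"
      rather than "COPY 0".

      ```sql
      COPY (SELECT generate_series(1, 3)) TO '/tmp/three_rows.txt';
      -- expected command tag: COPY 3
      ```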
  14. 05 Sep 2018, 1 commit
  15. 15 Aug 2018, 1 commit
  16. 14 Aug 2018, 1 commit
    • Refine dispatching of COPY command · a1b6b2ae
      Authored by Pengzhou Tang
      Previously, COPY used CdbDispatchUtilityStatement directly to
      dispatch 'COPY' statements to all QEs and then sent/received
      data through the primaryWriterGang. This happened to work because
      the primaryWriterGang is not recycled when a dispatcher state is
      destroyed, but it is nasty because at that point the COPY command
      has logically finished.
      
      This commit splits the COPY dispatching logic into two parts to
      make it more reasonable.
  17. 03 Aug 2018, 2 commits
  18. 02 Aug 2018, 1 commit
    • Merge with PostgreSQL 9.2beta2. · 4750e1b6
      Authored by Richard Guo
      This is the final batch of commits from PostgreSQL 9.2 development,
      up to the point where the REL9_2_STABLE branch was created, and 9.3
      development started on the PostgreSQL master branch.
      
      Notable upstream changes:
      
      * Index-only scan was included in the batch of upstream commits. It
        allows queries to retrieve data only from indexes, avoiding heap access.
      
      * Group commit was added to work effectively under heavy load. Previously,
        batching of commits became ineffective as the write workload increased,
        because of internal lock contention.
      
      * A new fast-path lock mechanism was added to reduce the overhead of
        taking and releasing certain types of locks which are taken and released
        very frequently but rarely conflict.
      
      * The new "parameterized path" mechanism was added. It allows inner index
        scans to use values from relations that are more than one join level up
        from the scan. This can greatly improve performance in situations where
        semantic restrictions (such as outer joins) limit the allowed join orderings.
      
      * SP-GiST (Space-Partitioned GiST) index access method was added to support
        unbalanced partitioned search structures. For suitable problems, SP-GiST can
        be faster than GiST in both index build time and search time.
      
      * Checkpoints are now performed by a dedicated background process. Formerly
        the background writer did both dirty-page writing and checkpointing. Separating
        this into two processes allows each goal to be accomplished more predictably.
      
      * Custom plans can now be generated for specific parameter values even when
        using prepared statements.

      * The FDW API was improved so that FDWs can provide multiple access "paths"
        for their tables, allowing more flexibility in join planning.

      * The security_barrier option was added for views to prevent optimizations
        that might allow view-protected data to be exposed to users (see the
        sketch after this list).

      * Range data types were added to store a lower and an upper bound of a
        base data type.

      * CTAS (CREATE TABLE AS/SELECT INTO) is now treated as a utility statement. The
        SELECT query is planned during execution of the utility statement. To conform
        to this change, GPDB executes the utility statement only on the QD and
        dispatches the plan of the SELECT query to the QEs.
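
      As a quick, hedged illustration of two of the new 9.2 features mentioned
      above (the accounts table and its columns are hypothetical):

      ```sql
      -- Range types: does the int4 range [10,20) contain 15?
      SELECT int4range(10, 20) @> 15;

      -- security_barrier views: prevents qual pushdown that could leak hidden rows.
      CREATE VIEW visible_accounts WITH (security_barrier) AS
          SELECT id, owner FROM accounts WHERE NOT hidden;
      ```
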
      Co-authored-by: Adam Lee <ali@pivotal.io>
      Co-authored-by: Alexandra Wang <lewang@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Co-authored-by: Haozhou Wang <hawang@pivotal.io>
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
      Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
      Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
      Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
      Co-authored-by: Paul Guo <paulguo@gmail.com>
      Co-authored-by: Richard Guo <guofenglinux@gmail.com>
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
  19. 11 Jul 2018, 1 commit
    • Improve handling of rd_cdbpolicy. · 0bfc7251
      Authored by Ashwin Agrawal
      Pointers from a Relation object need to be handled with special care, as
      holding a refcount on the object doesn't mean the object is not modified.
      When a cache invalidation message is handled, the Relation object gets
      *rebuilt*. The only guarantee maintained during the rebuild is that the
      Relation object's address will not change; the memory addresses inside the
      Relation object get freed, freshly allocated and populated with the latest
      data from the catalog.
      
      For example, the code sequence below is dangerous:
      
          rel->rd_cdbpolicy = original_policy;
          GpPolicyReplace(RelationGetRelid(rel), original_policy);
      
      If a relcache invalidation message is served after the assignment to
      rd_cdbpolicy, the rebuild will free the memory behind rd_cdbpolicy (which is
      original_policy) and replace it with the current contents of
      gp_distribution_policy. GpPolicyReplace(), called with original_policy, will
      then access freed memory. Plus, rd_cdbpolicy will hold the stale value in the
      cache rather than the intended refreshed value. This issue was hit in CI a few
      times and reproduces with higher frequency with `-DRELCACHE_FORCE_RELEASE`.
      
      Hence this patch fixes all uses of rd_cdbpolicy to use the rd_cdbpolicy
      pointer directly from the Relation object, and to update the catalog first
      before assigning the value to rd_cdbpolicy.
  20. 10 Jul 2018, 1 commit
  21. 04 Jul 2018, 1 commit
  22. 27 Jun 2018, 1 commit
  23. 19 Jun 2018, 2 commits
  24. 11 Jun 2018, 1 commit
    • Fix external table with non-UTF8 encoding data · 6822104f
      Authored by Adam Lee
      1. Pass the external table's encoding to COPY's options, then set
      cstate->file_encoding to it, for both reading and writing.

      2. After the merge, the copy state no longer has a client-encoding member,
      which used to be set to the target encoding so that the converted data was
      received as a client would receive it; now pass the file encoding (from the
      copy options) to convert directly.
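
      A usage sketch under assumed names (host, port, file and columns are
      placeholders); the ENCODING clause is the value that now flows into the
      COPY options as cstate->file_encoding:

      ```sql
      CREATE READABLE EXTERNAL TABLE ext_latin1 (id int, name text)
          LOCATION ('gpfdist://etlhost:8081/data_latin1.txt')
          FORMAT 'TEXT' (DELIMITER '|')
          ENCODING 'LATIN1';

      -- Rows are converted from LATIN1 while being read.
      SELECT * FROM ext_latin1;
      ```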
  25. 25 May 2018, 1 commit
    • Fix an issue with COPY FROM for partition tables · 01a22423
      Authored by Jimmy Yih
      The Postgres 9.1 merge introduced a problem where issuing a COPY FROM
      to a partition table could result in an unexpected error, "ERROR:
      extra data after last expected column", even though the input file was
      correct. This would happen if the partition table had partitions where
      the relnatts were not all the same (e.g. ALTER TABLE DROP COLUMN,
      ALTER TABLE ADD COLUMN, and then ALTER TABLE EXCHANGE PARTITION). The
      internal COPY logic would always use the COPY state's relation, the
      partition root, instead of the actual partition's relation to obtain
      the relnatts value. In fact, the only reason this is seen only
      intermittently is that the COPY logic, when working on a leaf partition
      relation that has a different relnatts value, was reading beyond a
      boolean array's allocated memory and got a phony value that happened to
      evaluate to TRUE.
      Co-authored-by: Jimmy Yih <jyih@pivotal.io>
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
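
      A hypothetical reproduction sketch of the relnatts mismatch (all names,
      paths and values are made up; the exchanged table lacks the dropped-column
      placeholder, so its relnatts differs from the older partitions'):

      ```sql
      CREATE TABLE pt (a int, b int, c text)
          DISTRIBUTED BY (a)
          PARTITION BY RANGE (b) (START (0) END (10) EVERY (5));
      ALTER TABLE pt DROP COLUMN c;
      ALTER TABLE pt ADD COLUMN c text;

      -- A freshly created table has no dropped-column placeholder.
      CREATE TABLE ex (a int, b int, c text) DISTRIBUTED BY (a);
      ALTER TABLE pt EXCHANGE PARTITION FOR (0) WITH TABLE ex;

      -- This used to fail intermittently with
      -- "ERROR: extra data after last expected column".
      COPY pt FROM '/tmp/pt_data.txt';
      ```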
  26. 17 May 2018, 1 commit
    • COPY: expand the type of numcompleted to 64 bits · 8d40268b
      Authored by Adam Lee
      Without this, integer overflow occurs when more than 2^31 rows are copied
      in `COPY ON SEGMENT` mode.

      Errors happen when the value is cast to uint64, the type of `processed` in
      `CopyStateData`: a third-party Postgres driver, which takes it as an
      int64, fails with an out-of-range error.
  27. 08 May 2018, 1 commit
  28. 29 Mar 2018, 1 commit
    • Support replicated table in GPDB · 7efe3204
      Authored by Pengzhou Tang
      * Support replicated table in GPDB
      
      Currently, tables in GPDB are distributed across all segments by hash or
      randomly. There is a requirement for a new table type, called a replicated
      table, in which every segment holds a full and identical copy of the table
      data.
      
      To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to
      mark a replicated table, and a new locus type named CdbLocusType_SegmentGeneral to
      describe the distribution of a replicated table's tuples.  CdbLocusType_SegmentGeneral
      implies the data is generally available on all segments but not on the qDisp, so a plan
      node with this locus type can be flexibly planned to execute on either a single QE or
      all QEs. It is similar to CdbLocusType_General; the only difference is that a
      CdbLocusType_SegmentGeneral node can't be executed on the qDisp. To guarantee this, we
      try our best to add a gather motion on top of a CdbLocusType_SegmentGeneral node when
      planning motions for a join, even if the other rel has a bottleneck locus type. A
      problem is that such a motion may be redundant if the single QE is not promoted to
      execute on the qDisp in the end, so we need to detect that case and omit the redundant
      motion at the end of apply_motion(). We don't reuse CdbLocusType_Replicated since it
      always implies a broadcast motion below it, and it's not easy to plan such a node as
      direct dispatch to avoid getting duplicate data.
      
      We don't support replicated tables with an INHERITS or PARTITION BY clause yet; the
      main problem is that update/delete on multiple result relations can't work correctly
      now. We can fix this later.
      
      * Allow spi_* to access replicated table on QE
      
      Previously, GPDB didn't allow a QE to access non-catalog tables because the
      data is incomplete; we can remove this limitation now if the QE only accesses
      replicated tables.

      One problem is that a QE needs to know whether a table is replicated.
      Previously, QEs didn't maintain the gp_distribution_policy catalog, so we
      need to pass the policy info to the QEs for replicated tables.
      
      * Change schema of gp_distribution_policy to identify replicated table
      
      Previously, we used a magic number, -128, in the gp_distribution_policy table
      to identify a replicated table, which was quite a hack, so we add a new column
      to gp_distribution_policy to identify replicated tables and partitioned
      tables.

      This commit also abandons the old way that used a 1-length NULL list and a
      2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED
      FULLY clauses.

      Besides, this commit refactors the code to make the decision-making on
      distribution policy clearer.
      
      * Support COPY for replicated tables
      
      * Disable the row ctid unique path for replicated tables.
        Previously, GPDB used a special Unique path on rowid to handle queries
        like "x IN (subquery)". For example, for
        select * from t1 where t1.c2 in (select c2 from t2), the plan looks
        like:
         ->  HashAggregate
               Group By: t1.ctid, t1.gp_segment_id
               ->  Hash Join
                     Hash Cond: t2.c2 = t1.c2
                     ->  Seq Scan on t2
                     ->  Hash
                           ->  Seq Scan on t1

        Obviously, the plan is wrong if t1 is a replicated table because ctid
        + gp_segment_id can't identify a tuple: in a replicated table, a logical
        row may have a different ctid and gp_segment_id on each segment. So we
        disable such plans for replicated tables temporarily. This is not the
        best way, because the rowid unique path may be cheaper than a normal
        hash semi-join, so we left a FIXME for later optimization.
      
      * ORCA related fix
        Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
        Fall back to the legacy query optimizer for queries over replicated tables
      
      * Adapt pg_dump/gpcheckcat to replicated table
        gp_distribution_policy is no longer a master-only catalog; do the
        same check as for other catalogs.
      
      * Support gpexpand on replicated table && alter the dist policy of replicated table
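
      A minimal sketch, assuming the DISTRIBUTED REPLICATED clause that exposes
      this policy in DDL (table and column names are made up):

      ```sql
      -- Every segment stores a full copy of the table.
      CREATE TABLE dim_country (code text, name text) DISTRIBUTED REPLICATED;

      -- The policy type is visible in the catalog (see the gp_distribution_policy
      -- example shown earlier in this log).
      SELECT policytype FROM gp_distribution_policy
       WHERE localoid = 'dim_country'::regclass;
      ```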
  29. 02 Mar 2018, 1 commit
  30. 01 Feb 2018, 1 commit
    • Fix COPY PROGRAM issues · d6bd4ac4
      Authored by Adam Lee
      1. The pipes might not exist in close_program_pipes(); check for that.
         For instance, if the relation doesn't exist, the copy workflow fails
         before executing the program and "cstate->program_pipes->pid"
         dereferences NULL.

      2. The program might still be running, or hung, when copy exits; kill it.
         This covers cases like the program hanging, not taking signals, and
         the user trying to cancel. Since it's already the end of the copy,
         and the program was started by copy, it should be safe to kill it to
         clean up.
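
      Usage sketch of the COPY ... PROGRAM form that these fixes harden (table t,
      the programs and the paths are illustrative):

      ```sql
      -- The external program is started by COPY; if it hangs or the relation
      -- doesn't exist, the fixes above ensure it is killed / not dereferenced.
      COPY t FROM PROGRAM 'cat /tmp/data.txt';
      COPY t TO PROGRAM 'gzip > /tmp/t.txt.gz';
      ```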
  31. 30 Jan 2018, 2 commits
    • Add hook to handle query info · 49b9bbc8
      Authored by Wang Hao
      The hook is called for
       - each query Submit/Start/Finish/Abort/Error
       - each plan node, on executor Init/Start/Finish
      
      Author: Wang Hao <haowang@pivotal.io>
      Author: Zhang Teng <tezhang@pivotal.io>
    • Alloc Instrumentation in Shmem · 67db4274
      Authored by Wang Hao
      On postmaster start, additional space in Shmem is allocated for Instrumentation
      slots and a header. The number of slots is controlled by a cluster-level GUC;
      the default is 5MB (approximately 30K slots). The default number is estimated
      as 250 concurrent queries * 120 nodes per query. If the slots are exhausted,
      instruments are allocated in local memory as a fallback.
      
      These slots are organized as a free list:
        - The header points to the first free slot.
        - Each free slot points to the next free slot.
        - The last free slot's next pointer is NULL.
      
      ExecInitNode calls GpInstrAlloc to pick an empty slot from the free list:
        - The free slot pointed to by the header is picked.
        - The picked slot's next pointer is assigned to the header.
        - A spin lock is held on the header to prevent concurrent writes.
        - When GUC gp_enable_query_metrics is off, Instrumentation will
          be allocated in local memory.
      
      Slots are recycled by a resource owner callback function.
      
      A benchmark with TPC-DS shows the performance impact of this commit is less than 0.1%.
      To improve the performance of instrumenting, the following optimizations are added:
        - Introduce instrument_option to skip CDB info collection
        - Optimize tuplecount in Instrumentation from double to uint64
        - Replace instrument tuple entry/exit function with macro
        - Add need_timer to Instrumentation, to allow eliminating of timing overhead.
          This is porting part of upstream commit:
      ------------------------------------------------------------------------
      commit af7914c6
      Author: Robert Haas <rhaas@postgresql.org>
      Date:   Tue Feb 7 11:23:04 2012 -0500
      
      Add TIMING option to EXPLAIN, to allow eliminating of timing overhead.
      ------------------------------------------------------------------------
      
      Author: Wang Hao <haowang@pivotal.io>
      Author: Zhang Teng <tezhang@pivotal.io>