- Sep 28, 2018 (1 commit)
-
-
Committed by ZhangJackey
There was an assumption in GPDB that a table's data is always distributed on all segments. However, this is not always true: for example, when a cluster is expanded from M segments to N (N > M), all the tables are still on M segments. To work around the problem we used to have to alter all the hash-distributed tables to randomly distributed to get correct query results, at the cost of bad performance. Now we support table data distributed on a subset of segments.

A new column `numsegments` is added to the catalog table `gp_distribution_policy` to record how many segments a table's data is distributed on. By doing so we can allow DML on M-segment tables, and joins between M-segment and N-segment tables are also supported.

```sql
-- t1 and t2 are both distributed on (c1, c2),
-- one on 1 segment, the other on 2 segments
select localoid::regclass, attrnums, policytype, numsegments
  from gp_distribution_policy;
 localoid | attrnums | policytype | numsegments
----------+----------+------------+-------------
 t1       | {1,2}    | p          |           1
 t2       | {1,2}    | p          |           2
(2 rows)

-- t1 and t1 have exactly the same distribution policy,
-- join locally
explain select * from t1 a join t1 b using (c1, c2);
                   QUERY PLAN
------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   ->  Hash Join
         Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
         ->  Seq Scan on t1 a
         ->  Hash
               ->  Seq Scan on t1 b
 Optimizer: legacy query optimizer

-- t1 and t2 are both distributed on (c1, c2),
-- but as they have different numsegments,
-- one has to be redistributed
explain select * from t1 a join t2 b using (c1, c2);
                            QUERY PLAN
------------------------------------------------------------------
 Gather Motion 1:1  (slice2; segments: 1)
   ->  Hash Join
         Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
         ->  Seq Scan on t1 a
         ->  Hash
               ->  Redistribute Motion 2:1  (slice1; segments: 2)
                     Hash Key: b.c1, b.c2
                     ->  Seq Scan on t2 b
 Optimizer: legacy query optimizer
```
-
- Sep 06, 2018 (1 commit)
-
-
Committed by Heikki Linnakangas
It used to always say "COPY 0", instead of the number of rows copied. This source line was added in PostgreSQL 9.0 (commit 8ddc05fb), but it was missed in the merge. Add a test case to check the command tags of different variants of COPY, including this one.
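For context, the command tag is just the propagated row count formatted into a string; a tiny standalone illustration (not the GPDB code) of how that count becomes a "COPY n" tag, so a count that is never carried back always reads "COPY 0":

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Standalone illustration (not the GPDB code): the tag shown to the client
 * is the propagated row count formatted into a string. */
int main(void)
{
    uint64_t processed = 42;     /* rows actually copied */
    char     tag[64];

    snprintf(tag, sizeof(tag), "COPY %" PRIu64, processed);
    puts(tag);                   /* prints: COPY 42 */
    return 0;
}
```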
-
- Sep 05, 2018 (1 commit)
-
-
Committed by Jim Doty
The "unknown" type has an attlen of -2, which signifies that the actual length is determined by strlen(). We weren't handling this case, so handle it now.

Co-authored-by: Jacob Champion <pchampion@pivotal.io>
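A minimal sketch (not the actual code touched here) of how an attlen value is conventionally interpreted, with the -2 case measured by strlen():

```c
#include <stddef.h>
#include <string.h>

/* Illustration only: positive attlen means a fixed-width type, -1 means a
 * varlena whose size comes from its length header, and -2 (used by
 * "unknown" and cstring) means a NUL-terminated string measured with
 * strlen(). */
static size_t
attribute_value_size(int attlen, const void *value, size_t varlena_size)
{
    if (attlen > 0)
        return (size_t) attlen;                  /* fixed width */
    if (attlen == -1)
        return varlena_size;                     /* varlena header length */
    return strlen((const char *) value) + 1;     /* attlen == -2 */
}
```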
-
- Aug 15, 2018 (1 commit)
-
-
Committed by xiong-gang
* Remove ERRCODE_GP_FEATURE_NOT_SUPPORTED and use ERRCODE_FEATURE_NOT_SUPPORTED instead
* Remove ERROR_INVALID_WINDOW_FRAME_PARAMETER and use ERRCODE_WINDOWING_ERROR instead

Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
-
- Aug 14, 2018 (1 commit)
-
-
Committed by Pengzhou Tang
Previously, COPY used CdbDispatchUtilityStatement directly to dispatch 'COPY' statements to all QEs and then sent/received data through primaryWriterGang. This happened to work because primaryWriterGang is not recycled when a dispatcher state is destroyed, but it is nasty because the COPY command has logically finished at that point. This commit splits the COPY dispatching logic into two parts to make it more reasonable.
-
- Aug 03, 2018 (2 commits)
-
-
Committed by Daniel Gustafsson
The definitions of PQArgBlock in libpq.h and libpq-fe.h were in conflict with each other when including both. The definition in libpq.h was superfluous and removed in 23c7b583, so remove the redefines to clean up the code.

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
Committed by Karen Huddleston
This reverts commit 4750e1b6.
-
- Aug 02, 2018 (1 commit)
-
-
Committed by Richard Guo
This is the final batch of commits from PostgreSQL 9.2 development, up to the point where the REL9_2_STABLE branch was created and 9.3 development started on the PostgreSQL master branch.

Notable upstream changes:

* Index-only scans were included in this batch of upstream commits. They allow queries to retrieve data only from indexes, avoiding heap access.
* Group commit was added to work effectively under heavy load. Previously, batching of commits became ineffective as the write workload increased, because of internal lock contention.
* A new fast-path lock mechanism was added to reduce the overhead of taking and releasing certain types of locks which are taken and released very frequently but rarely conflict.
* The new "parameterized path" mechanism was added. It allows inner index scans to use values from relations that are more than one join level up from the scan. This can greatly improve performance in situations where semantic restrictions (such as outer joins) limit the allowed join orderings.
* The SP-GiST (Space-Partitioned GiST) index access method was added to support unbalanced partitioned search structures. For suitable problems, SP-GiST can be faster than GiST in both index build time and search time.
* Checkpoints are now performed by a dedicated background process. Formerly the background writer did both dirty-page writing and checkpointing. Separating this into two processes allows each goal to be accomplished more predictably.
* Custom plans are supported for specific parameter values even when using prepared statements.
* The FDW API was improved so that foreign data wrappers can provide multiple access "paths" for their tables, allowing more flexibility in join planning.
* The security_barrier option was added for views to prevent optimizations that might allow view-protected data to be exposed to users.
* Range data types were added to store a lower and upper bound belonging to their base data type.
* CTAS (CREATE TABLE AS / SELECT INTO) is now treated as a utility statement. The SELECT query is planned during the execution of the utility. To conform to this change, GPDB executes the utility statement only on the QD and dispatches the plan of the SELECT query to the QEs.

Co-authored-by: Adam Lee <ali@pivotal.io>
Co-authored-by: Alexandra Wang <lewang@pivotal.io>
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
Co-authored-by: Asim R P <apraveen@pivotal.io>
Co-authored-by: Daniel Gustafsson <dgustafsson@pivotal.io>
Co-authored-by: Gang Xiong <gxiong@pivotal.io>
Co-authored-by: Haozhou Wang <hawang@pivotal.io>
Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Co-authored-by: Jesse Zhang <sbjesse@gmail.com>
Co-authored-by: Jinbao Chen <jinchen@pivotal.io>
Co-authored-by: Joao Pereira <jdealmeidapereira@pivotal.io>
Co-authored-by: Melanie Plageman <mplageman@pivotal.io>
Co-authored-by: Paul Guo <paulguo@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
Co-authored-by: Taylor Vesely <tvesely@pivotal.io>
Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
-
- Jul 11, 2018 (1 commit)
-
-
Committed by Ashwin Agrawal
Pointers from a Relation object need to be handled with special care, as holding a refcount on the object doesn't mean the object is not modified. When a cache invalidation message is handled, the Relation object gets *rebuilt*. The only guarantee maintained during a rebuild is that the Relation object's address will not change; the memory addresses inside the Relation object are freed, freshly allocated, and repopulated with the latest data from the catalog.

For example, the following code sequence is dangerous:

    rel->rd_cdbpolicy = original_policy;
    GpPolicyReplace(RelationGetRelid(rel), original_policy);

If a relcache invalidation message is served after assigning the value to rd_cdbpolicy, the rebuild frees the memory for rd_cdbpolicy (which means original_policy) and replaces it with the current contents of gp_distribution_policy. So GpPolicyReplace(), called with original_policy, would access freed memory. Plus, rd_cdbpolicy would hold a stale value in the cache rather than the intended refreshed value.

This issue was hit in CI a few times and reproduces with higher frequency with `-DRELCACHE_FORCE_RELEASE`. Hence this patch fixes all uses of rd_cdbpolicy to use the rd_cdbpolicy pointer directly from the Relation object, and to update the catalog first before assigning the value to rd_cdbpolicy.
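A minimal sketch of the safer ordering described above, assuming GPDB's relcache and policy declarations; the helper function is illustrative, not the exact patch:

```c
/*
 * Sketch only: write the catalog first, and only then update the relcache
 * entry, always through rel->rd_cdbpolicy.
 */
static void
set_distribution_policy_safely(Relation rel, GpPolicy *new_policy)
{
    /* 1. Persist the policy to gp_distribution_policy first. */
    GpPolicyReplace(RelationGetRelid(rel), new_policy);

    /* 2. Only now install the pointer in the relcache entry, so a relcache
     *    rebuild between the two steps can no longer leave us calling the
     *    catalog routine with memory it has already freed. */
    rel->rd_cdbpolicy = new_policy;
}
```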
-
- Jul 10, 2018 (1 commit)
-
-
Committed by Daniel Gustafsson
Make sure to include all required header files to silence compilers that are picky about that.
-
- Jul 04, 2018 (1 commit)
-
-
Committed by Adam Lee
The map was missed by mistake; all AO loading actions need it.
-
- Jun 27, 2018 (1 commit)
-
-
Committed by Adam Lee
Unloading doesn't need it, and neither does checking the distribution policy.
-
- Jun 19, 2018 (2 commits)
- Jun 11, 2018 (1 commit)
-
-
Committed by Adam Lee
1. Pass the external table encoding to COPY's options, then set cstate->file_encoding to it, for both reading and writing.

2. After the merge, the COPY state no longer has a client-encoding member, which used to be set to the target encoding so that the converted data was obtained as a client would; now the file encoding (from the COPY options) is passed to convert directly.
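A minimal sketch, assuming PostgreSQL's encoding-conversion API from mb/pg_wchar.h (not the exact change), of converting a data line from the file encoding carried in the COPY options into the server encoding:

```c
/*
 * Sketch only: convert a line read from the external source, using the
 * file encoding carried in the COPY options, into the server encoding
 * before further processing.
 */
static char *
convert_line_to_server_encoding(char *line, int len, int file_encoding)
{
    return (char *) pg_do_encoding_conversion((unsigned char *) line,
                                              len,
                                              file_encoding,
                                              GetDatabaseEncoding());
}
```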
-
- May 25, 2018 (1 commit)
-
-
Committed by Jimmy Yih
The Postgres 9.1 merge introduced a problem where issuing a COPY FROM to a partition table could result in an unexpected error, "ERROR: extra data after last expected column", even though the input file was correct. This would happen if the partition table had partitions where the relnatts were not all the same (e.g. ALTER TABLE DROP COLUMN, ALTER TABLE ADD COLUMN, and then ALTER TABLE EXCHANGE PARTITION). The internal COPY logic would always use the COPY state's relation, the partition root, instead of the actual partition's relation to obtain the relnatts value. In fact, the only reason this is intermittently seen is because the COPY logic, when working on the leaf partition's relation that has a different relnatts value, was looking beyond a boolean array's allocated memory and got a phony value that would evaluate to TRUE.

Co-authored-by: Jimmy Yih <jyih@pivotal.io>
Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
-
- May 17, 2018 (1 commit)
-
-
Committed by Adam Lee
Without this change, integer overflow occurs when more than 2^31 rows are copied under `COPY ON SEGMENT` mode. Errors happen when the overflowed value is cast to uint64, the type of `processed` in `CopyStateData`: a third-party Postgres driver, which reads it as an int64, then fails with an out-of-range error.
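A standalone illustration (not the GPDB code) of why a wrapped 32-bit row counter breaks an int64-based client once it is widened to the uint64 `processed` field:

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Illustration only: a signed 32-bit counter that has wrapped negative,
 * when converted to uint64, becomes a huge value that an int64-based
 * driver rejects as out of range.  Keeping the counter 64-bit end to end
 * avoids the wrap. */
int main(void)
{
    int32_t  wrapped = INT32_MIN;              /* what 2^31 increments leave behind */
    uint64_t processed = (uint64_t) wrapped;   /* becomes 18446744071562067968 */

    printf("wrapped=%" PRId32 " processed=%" PRIu64 "\n", wrapped, processed);
    return 0;
}
```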
-
- May 08, 2018 (1 commit)
-
-
Committed by Adam Lee
ExecBRInsertTriggers() uses the per-tuple memory context, which might be reset; a later pfree() of memory allocated there then SEGVs.
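A hazard sketch, assuming PostgreSQL's executor and memory-context APIs (illustrative of the failure mode, not the actual fix):

```c
/*
 * Sketch only: a pointer palloc'd in the per-tuple memory context is valid
 * only until that context is reset, so it must not be pfree'd (or even
 * dereferenced) afterwards.
 */
static void
per_tuple_context_hazard(EState *estate, HeapTuple origtuple)
{
    MemoryContext oldcxt =
        MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
    HeapTuple tuple = heap_copytuple(origtuple);  /* allocated per tuple */

    MemoryContextSwitchTo(oldcxt);
    ResetPerTupleExprContext(estate);  /* frees the per-tuple allocations */

    /* pfree(tuple);   <-- would crash: tuple now points at freed memory */
}
```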
-
- Mar 29, 2018 (1 commit)
-
-
Committed by Pengzhou Tang
* Support replicated table in GPDB

  Currently, tables are distributed across all segments by hash or randomly in GPDB. There is a requirement for a new table type where all segments hold a full duplicate of the table data, called a replicated table.

  To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark a replicated table and a new locus type named CdbLocusType_SegmentGeneral to specify the distribution of tuples of a replicated table. CdbLocusType_SegmentGeneral implies that data is generally available on all segments but not available on the qDisp, so a plan node with this locus type can be flexibly planned to execute on either a single QE or all QEs. It is similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral node can't be executed on the qDisp. To guarantee this, we try our best to add a gather motion on top of a CdbLocusType_SegmentGeneral node when planning motion for a join, even if the other rel has a bottleneck locus type. One problem is that such a motion may be redundant if the single QE is ultimately not promoted to execute on the qDisp, so we need to detect that case and omit the redundant motion at the end of apply_motion(). We don't reuse CdbLocusType_Replicated since it always implies a broadcast motion below it, and it's not easy to plan such a node as direct dispatch to avoid getting duplicate data.

  We don't support replicated tables with an inherit/partition-by clause yet; the main problem is that update/delete on multiple result relations can't work correctly now. We can fix this later.

* Allow spi_* to access replicated tables on QEs

  Previously, GPDB didn't allow a QE to access a non-catalog table because the data is incomplete; we can remove this limitation now if it only accesses replicated tables. One problem is that the QE needs to know whether a table is replicated; previously, QEs didn't maintain the gp_distribution_policy catalog, so we need to pass the policy info to QEs for replicated tables.

* Change the schema of gp_distribution_policy to identify replicated tables

  Previously, we used a magic number -128 in the gp_distribution_policy table to identify replicated tables, which is quite a hack, so we add a new column in gp_distribution_policy to identify replicated tables and partitioned tables. This commit also abandons the old way of using a 1-length NULL list and a 2-length NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED FULLY clauses. Besides, this commit refactors the code to make the decision-making around distribution policy clearer.

* Support COPY for replicated tables

* Disable the row-ctid unique path for replicated tables

  Previously, GPDB used a special Unique path on rowid to address queries like "x IN (subquery)". For example: select * from t1 where t1.c2 in (select c2 from t3), the plan looks like:

      -> HashAggregate
           Group By: t1.ctid, t1.gp_segment_id
           -> Hash Join
                Hash Cond: t2.c2 = t1.c2
                -> Seq Scan on t2
                -> Hash
                     -> Seq Scan on t1

  Obviously, the plan is wrong if t1 is a replicated table because ctid + gp_segment_id can't identify a tuple: in a replicated table, a logical row may have different ctid and gp_segment_id values. So we disable such plans for replicated tables temporarily. This is not the best answer, because the rowid-unique path may be cheaper than a normal hash semi join, so we left a FIXME for later optimization.

* ORCA related fix

  Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>. Fall back to the legacy query optimizer for queries over replicated tables.

* Adapt pg_dump/gpcheckcat to replicated tables

  gp_distribution_policy is no longer a master-only catalog; do the same checks as for other catalogs.

* Support gpexpand on replicated tables && altering the dist policy of replicated tables
-
- Mar 02, 2018 (1 commit)
-
-
Committed by Ming LI
We need to close the pipe files even if the COPY PROGRAM has already exited before the parent process asks the child process to exit; otherwise this leads to a file descriptor leak in a long-running session.
-
- Feb 01, 2018 (1 commit)
-
-
Committed by Adam Lee
1. The pipes might not exist in close_program_pipes(); check for that. For instance, if the relation doesn't exist, the COPY workflow fails before executing the program, and "cstate->program_pipes->pid" dereferences NULL.

2. The program might still be running, or hung, when COPY exits; kill it. This covers cases where the program hangs, doesn't take signals, and the user is trying to cancel. Since it's already the end of COPY, and the program was started by COPY, it should be safe to kill it to clean up.
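A generic POSIX sketch (not the exact close_program_pipes() change) of tolerating a missing child and terminating one that is still running:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Sketch only: tolerate a child that was never started, then terminate a
 * still-running (or hung) one and reap it, so the backend does not leak the
 * process or its pipe file descriptors.
 */
static void
cleanup_program(pid_t pid)
{
    int status;

    if (pid <= 0)
        return;                                  /* program was never started */

    if (waitpid(pid, &status, WNOHANG) == 0)     /* still running or hung */
    {
        kill(pid, SIGTERM);                      /* ask it to stop ... */
        waitpid(pid, &status, 0);                /* ... and reap it */
    }
}
```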
-
- Jan 30, 2018 (2 commits)
-
-
Committed by Wang Hao
The hook is called for:
- each query Submit/Start/Finish/Abort/Error
- each plan node, on executor Init/Start/Finish

Author: Wang Hao <haowang@pivotal.io>
Author: Zhang Teng <tezhang@pivotal.io>
-
Committed by Wang Hao
On postmaster start, additional space in shmem is allocated for Instrumentation slots and a header. The number of slots is controlled by a cluster-level GUC; the default is 5MB (approximately 30K slots), estimated from 250 concurrent queries * 120 nodes per query. If the slots are exhausted, instruments are allocated in local memory as a fallback.

These slots are organized as a free list:
- The header points to the first free slot.
- Each free slot points to the next free slot.
- The last free slot's next pointer is NULL.

ExecInitNode calls GpInstrAlloc to pick an empty slot from the free list:
- The free slot pointed to by the header is picked.
- The picked slot's next pointer is assigned to the header.
- A spin lock on the header prevents concurrent writing.
- When the GUC gp_enable_query_metrics is off, Instrumentation is allocated in local memory.

Slots are recycled by a resource owner callback function. Benchmark results with TPC-DS show the performance impact of this commit is less than 0.1%.

To improve the performance of instrumenting, the following optimizations are added:
- Introduce instrument_option to skip CDB info collection
- Optimize tuplecount in Instrumentation from double to uint64
- Replace the instrument tuple entry/exit functions with macros
- Add need_timer to Instrumentation, to allow eliminating of timing overhead

This ports part of the upstream commit:
------------------------------------------------------------------------
commit af7914c6
Author: Robert Haas <rhaas@postgresql.org>
Date:   Tue Feb 7 11:23:04 2012 -0500

    Add TIMING option to EXPLAIN, to allow eliminating of timing overhead.
------------------------------------------------------------------------

Author: Wang Hao <haowang@pivotal.io>
Author: Zhang Teng <tezhang@pivotal.io>
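A minimal sketch of the slot allocation described above, assuming PostgreSQL's spinlock and palloc APIs; the struct and function names are illustrative, not the actual GPDB ones:

```c
/* Sketch only: pop a slot from a shared free list under a spinlock, and
 * fall back to local memory when the shared slots are exhausted. */
typedef struct InstrSlot
{
    struct InstrSlot *next;      /* next free slot, NULL at the end */
    /* ... instrumentation payload ... */
} InstrSlot;

typedef struct InstrHeader
{
    slock_t    lock;             /* protects the free list head */
    InstrSlot *free_head;        /* first free slot in shared memory */
} InstrHeader;

static InstrSlot *
instr_alloc(InstrHeader *hdr)
{
    InstrSlot *slot;

    SpinLockAcquire(&hdr->lock);
    slot = hdr->free_head;
    if (slot != NULL)
        hdr->free_head = slot->next;         /* pop the first free slot */
    SpinLockRelease(&hdr->lock);

    if (slot == NULL)
        slot = palloc0(sizeof(InstrSlot));   /* exhausted: local memory fallback */

    return slot;
}
```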
-
- Dec 28, 2017 (1 commit)
-
-
Committed by Adam Lee
There are two places where the QD keeps trying to get data, ignores SIGINT, and does not send a signal to the QEs. If the program on a segment has no input/output, the COPY command hangs. To fix it, this commit:

1. lets the QD wait until connections are readable before PQgetResult(), and cancels the queries if it gets an interrupt signal while waiting
2. sets DF_CANCEL_ON_ERROR when dispatching in cdbcopy.c
3. completes COPY error handling

    -- prepare
    create table test(t text);
    copy test from program 'yes|head -n 655360';

    -- could be canceled
    copy test from program 'sleep 100 && yes test';
    copy test from program 'sleep 100 && yes test<SEGID>' on segment;
    copy test from program 'yes test';
    copy test to '/dev/null';
    copy test to program 'sleep 100 && yes test';
    copy test to program 'sleep 100 && yes test<SEGID>' on segment;

    -- should fail
    copy test from program 'yes test<SEGID>' on segment;
    copy test to program 'sleep 0.1 && cat > /dev/nulls';
    copy test to program 'sleep 0.1<SEGID> && cat > /dev/nulls' on segment;
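A minimal sketch using plain libpq and select() (not the actual cdbcopy.c change; the `interrupted` flag is a stand-in for the backend's interrupt handling):

```c
#include <sys/select.h>
#include <libpq-fe.h>

/*
 * Sketch only: wait until the connection is readable before PQgetResult(),
 * and cancel the query if an interrupt arrives while waiting, instead of
 * blocking forever on a program that never produces output.
 */
static PGresult *
get_result_interruptibly(PGconn *conn, volatile int *interrupted)
{
    int sock = PQsocket(conn);
    int sent_cancel = 0;

    while (PQisBusy(conn))
    {
        fd_set readmask;

        if (*interrupted && !sent_cancel)
        {
            char      errbuf[256];
            PGcancel *cancel = PQgetCancel(conn);

            if (cancel)
            {
                PQcancel(cancel, errbuf, sizeof(errbuf));  /* ask the QE to stop */
                PQfreeCancel(cancel);
            }
            sent_cancel = 1;
        }

        FD_ZERO(&readmask);
        FD_SET(sock, &readmask);
        if (select(sock + 1, &readmask, NULL, NULL, NULL) < 0)
            continue;                      /* e.g. EINTR: re-check the flag */

        if (!PQconsumeInput(conn))         /* connection broken */
            break;
    }
    return PQgetResult(conn);              /* data is buffered; won't block */
}
```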
-
- Dec 12, 2017 (2 commits)
-
-
Committed by Daniel Gustafsson
These error codes were marked as deprecated in September 2007 but the code didn't get the memo. Extend the deprecation into the code and actually replace the usage. Ten years seems long enough notice so also remove the renames, the odds of anyone using these in code which compiles against a 6X tree should be low (and easily fixed).
-
Committed by Xiaoran Wang
Upstream commit:

commit 95c238d9
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   Sat Mar 8 01:16:26 2008 +0000

    Improve efficiency of attribute scanning in CopyReadAttributesCSV.
    The loop is split into two parts, inside quotes, and outside quotes,
    saving some instructions in both parts.

Author: Max Yang <myang@pivotal.io>
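A minimal sketch of the split-loop idea (illustrative only; it ignores escaped quotes and is not the CopyReadAttributesCSV code):

```c
/* Sketch only: instead of testing an "in quotes" flag for every character,
 * run one tight loop while outside quotes and another while inside quotes,
 * switching between them on the quote character. */
static const char *
scan_csv_field(const char *p, char quote, char delim)
{
    for (;;)
    {
        /* outside quotes: stop at the delimiter or end of input */
        for (;;)
        {
            char c = *p;

            if (c == '\0' || c == delim)
                return p;
            p++;
            if (c == quote)
                break;                  /* switch to the in-quotes loop */
        }

        /* inside quotes: delimiters are ordinary data here */
        for (;;)
        {
            char c = *p;

            if (c == '\0')
                return p;               /* unterminated quoted field */
            p++;
            if (c == quote)
                break;                  /* back to the out-of-quotes loop */
        }
    }
}
```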
-
- Nov 06, 2017 (1 commit)
-
-
Committed by Heikki Linnakangas
This is mostly in preparation for changes soon to be merged from PostgreSQL 8.4, commit a77eaa6a to be more precise. Currently GPDB's ExecInsert uses the ExecSlotFetch*() functions to get the tuple from the slot, while in the upstream it makes a modifiable copy with ExecMaterializeSlot(). That's OK as the code stands, because there's always a "junk filter" that ensures that the slot doesn't point directly to an on-disk tuple. But commit a77eaa6a will change that, so we have to start being more careful.

This does fix an existing bug, namely that if you UPDATE an AO table with OIDs, the OIDs currently change (github issue #3732). Add a test case for that.

More detailed breakdown of the changes:

* In ExecInsert, create a writeable copy of the tuple when we're about to modify it, so that we don't accidentally modify an existing on-disk tuple. By calling ExecMaterializeSlot().
* In ExecInsert, track the OID of the tuple we're about to insert in a local variable when we call the BEFORE ROW triggers, because we don't have a "tuple" yet.
* Add the ExecMaterializeSlot() function, like in the upstream, because we now need it in ExecInsert. Refactor ExecFetchSlotHeapTuple to use ExecMaterializeSlot(), like in upstream.
* Cherry-pick bug fix commit 3d02cae3 from upstream. We would get that soon anyway as part of the merge, but we'll soon have test failures if we don't fix it immediately.
* Change the API of appendonly_insert() so that it takes the new OID as an argument, instead of extracting it from the passed-in MemTuple. With this change, appendonly_insert() is guaranteed not to modify the passed-in MemTuple, so we don't need the equivalent of ExecMaterializeSlot() for MemTuples.
* Also change the API of appendonly_insert() so that it returns the new OID of the inserted tuple, like heap_insert() does. Most callers ignore the return value, so this way they don't need to pass a dummy pointer argument.
* Add a test case for the case that a BEFORE ROW trigger sets the OID of a tuple we're about to insert.

This is based on earlier patches against the 8.4 merge iteration3 branch by Jacob and Max.
-
- Nov 04, 2017 (1 commit)
-
-
Committed by Abhijit Subramanya
The macro is taken from the upstream commit 40f908bd. This commit fixes issues with the CLUSTER and COPY commands where they would not generate the necessary XLOG records when streaming replication is enabled. With the correct use of XLogIsNeeded() this is now fixed. This also cleans up the XLog_CanBypassWal() and XLog_UnconvertedCanBypassWal() functions by replacing their usage with XLogIsNeeded().

Signed-off-by: Taylor Vesely <tvesely@pivotal.io>
Signed-off-by: Asim R P <apraveen@pivotal.io>
-
- Nov 02, 2017 (1 commit)
-
-
Committed by Heikki Linnakangas
The code in ExecInsert, ExecUpdate, and CopyFrom was confused about what kind of a tuple the "tuple" variable might hold at different times. In particular, if you had a trigger on an append-only table, they would pass a MemTuple to the Exec*Triggers() functions, which expect a HeapTuple. To fix, refactor the code so that it's always clear what kind of a tuple we're dealing with. The compiler will now throw warnings if they are conflated.

We cannot, in fact, support ON UPDATE or ON DELETE triggers on AO tables in a sane way. GetTupleForTrigger() is hopelessly heap-specific. We could perhaps change it to do a lookup in the append-only table's visibility map instead of looking at a heap tuple's xmin/xmax, but looking up the original tuple in an AO table would be fairly expensive anyway. As far as I can see, that never worked correctly, but let's add a check in CREATE TRIGGER to forbid that.

ON INSERT triggers now work, also on AOCS tables. There were previously checks to throw an error if an AOCS table had a trigger, but I see no reason to forbid that particular case.

Fixes github issue #3680.
-
- Oct 11, 2017 (1 commit)
-
-
Committed by Xiaoran Wang
COPY FROM ... ON SEGMENT will not check the distribution policy when the table is distributed randomly.

Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
-
- Sep 25, 2017 (1 commit)
-
-
Committed by Adam Lee
Replace popen() with popen_with_stderr(), which is also used by external web tables to collect the stderr output of the program. Since popen_with_stderr() forks a `sh` process it is almost always successful, so this commit catches errors that happen in fwrite(). Also pass the same variables as external web tables do.

Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
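A minimal sketch, using plain stdio rather than the popen_with_stderr() API, of catching the fwrite() failure instead of silently losing the data:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>

/*
 * Sketch only: since launching the program via a shell almost always
 * "succeeds", the real failure shows up when writing to the pipe, so check
 * fwrite() and report the error instead of losing it.
 */
static int
write_to_program(FILE *pipe_fp, const char *buf, size_t len)
{
    if (fwrite(buf, 1, len, pipe_fp) != len)
    {
        fprintf(stderr, "could not write to COPY program: %s\n",
                strerror(errno));
        return -1;      /* caller can attach the program's captured stderr */
    }
    return 0;
}
```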
-
- Sep 20, 2017 (1 commit)
-
-
Committed by Ming LI
Because we don't know the data location of the result of a SELECT query, ON SEGMENT is forbidden.
-
- Sep 15, 2017 (1 commit)
-
-
Committed by Ming LI
-
- Sep 13, 2017 (1 commit)
-
-
Committed by Adam Lee
commit 3d009e45
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   Wed Feb 27 18:17:21 2013 +0200

    Add support for piping COPY to/from an external program.

    This includes backend "COPY TO/FROM PROGRAM '...'" syntax, and
    corresponding psql \copy syntax. Like with reading/writing files, the
    backend version is superuser-only, and in the psql version, the program
    is run in the client.

    In the passing, the psql \copy STDIN/STDOUT syntax is subtly changed: if
    the stdin/stdout is quoted, it's now interpreted as a filename. For
    example, "\copy foo from 'stdin'" now reads from a file called 'stdin',
    not from standard input. Before this, there was no way to specify a
    filename called stdin, stdout, pstdin or pstdout.

    This creates a new function in pgport, wait_result_to_str(), which can
    be used to convert the exit status of a process, as returned by wait(3),
    to a human-readable string.

    Etsuro Fujita, reviewed by Amit Kapila.

Signed-off-by: Adam Lee <ali@pivotal.io>
Signed-off-by: Ming LI <mli@apache.org>
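A minimal sketch in the spirit of wait_result_to_str() (illustrative; the message wording is assumed, not the pgport implementation):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Sketch only: turn a wait(3)/pclose(3) status into a readable message so
 * COPY ... PROGRAM failures can be reported to the user. */
static char *
describe_wait_status(int status)
{
    char *buf = malloc(64);

    if (buf == NULL)
        return NULL;
    if (WIFEXITED(status))
        snprintf(buf, 64, "child process exited with exit code %d",
                 WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, 64, "child process was terminated by signal %d",
                 WTERMSIG(status));
    else
        snprintf(buf, 64, "child process exited with unrecognized status %d",
                 status);
    return buf;
}
```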
-
- Sep 08, 2017 (2 commits)
-
-
Committed by Adam Lee
The other one is <SEG_DATA_DIR>; they should keep the same style.
- Sep 04, 2017 (2 commits)
-
-
Committed by Xiaoran Wang
The same code for computing the target segment exists in both CopyFrom and CopyFromDispatch. Extract it into separate functions.

Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
-
Committed by Daniel Gustafsson
-
- Sep 01, 2017 (2 commits)
-
-
Committed by Daniel Gustafsson
This bumps the copyright years to the appropriate years after not having been updated for some time. Also reformats existing code headers to match the upstream style to ensure consistency.
-
Committed by Heikki Linnakangas
* Use ereport() with a proper error code, rather than elog(), so that you don't get the source file name and line number in the message, and the serious-looking backtrace in the log.

* Remove the hint that advised "SET gp_enable_segment_copy_checking=off" when a row failed the check that it's being loaded to the correct segment. Ignoring the mismatch seems like a very bad idea, because if your rows are in incorrect segments, all bets are off, and you'll likely get incorrect query results when you try to query the table.
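A small usage sketch of the difference (standard PostgreSQL error-reporting API; the error code and message text here are illustrative, not the exact GPDB wording):

```c
/* Sketch only: ereport() with an errcode() gives the client a categorized
 * error without the file/line details and scary backtrace that
 * elog(ERROR, ...) produces for what looks like an internal error. */
ereport(ERROR,
        (errcode(ERRCODE_DATA_EXCEPTION),
         errmsg("value of distribution key does not belong to this segment")));

/* versus the old style it replaces: */
/* elog(ERROR, "segment id mismatch during COPY"); */
```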
-