- 07 December 2018, 1 commit
Committed by Heikki Linnakangas
-
- 29 October 2018, 1 commit
Committed by Heikki Linnakangas
Most callers were passing CurrentMemoryContext, so this makes most callers slightly simpler. The few places that needed to pass a different context now switch to the correct one before calling the GpPolicy*() function.

Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
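A minimal sketch of the caller-side pattern described above, assuming PostgreSQL's standard memory-context API; the `fetch` callback stands in for whichever GpPolicy*() function is involved and is not the actual GPDB code.

```c
#include "postgres.h"
#include "utils/memutils.h"

/*
 * Illustrative only: when the result must not live in CurrentMemoryContext,
 * the caller switches to the target context around the call and back again.
 * 'fetch' stands in for a GpPolicy*() function that now allocates in the
 * current memory context.
 */
static void *
fetch_in_context(MemoryContext target, void *(*fetch) (Oid), Oid relid)
{
    MemoryContext oldcontext = MemoryContextSwitchTo(target);
    void       *result = fetch(relid);

    MemoryContextSwitchTo(oldcontext);
    return result;
}
```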
-
- 19 October 2018, 2 commits
Committed by Heikki Linnakangas
If an error occurred in the segments, in a "COPY <table> TO <file>" command, the COPY was stopped, but the error was not reported to the user. That gave the false impression that it finished successfully, but what you actually got was an incomplete file.

A test case is included. It uses a little helper output function that sometimes throws an error. Output functions are fairly unlikely to fail, but it could happen e.g. because of an out of memory error, or a disk failure. The "COPY (SELECT ...) TO <file>" variant did not suffer from this (otherwise, a query that throws an error would've been a much simpler way to test this.)

The reason for this was that the code in cdbCopyGetData() that called PQgetResult(), and extracted the error message from the result, didn't indicate to the caller in any way that the error happened. To fix, delay the call to PQgetResult(), to a later call to cdbCopyEnd(). cdbCopyEnd() already had the logic to extract the error information from the PGresult, and throw it to the user. While we're at it, refactor cdbCopyEnd a little bit, to give the callers a nicer function signature.

I also changed a few places that used 32-bit int to store rejected row counts, to use int64 instead. There was a FIXME comment about that. I didn't fix all the places that do that, though, so I moved the FIXME to one of the remaining places.

Apply to master branch only. GPDB 5 didn't handle this too well, either; with the included test case, you got an error like this:

    postgres=# copy broken_type_test to '/tmp/x';
    ERROR: missing error text

That's not very nice, but at least you get an error, even if it's not a very good one. The code looks quite different in 5X_STABLE, so I'm not going to attempt improving that.

Reviewed-by: Adam Lee <ali@pivotal.io>
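A minimal libpq sketch of the end-of-COPY handling the fix concentrates in cdbCopyEnd(): drain the PGresults from the connection and surface any error instead of silently dropping it. The function and variable names here are illustrative, not the actual GPDB code.

```c
#include <stdio.h>
#include <stdbool.h>
#include <libpq-fe.h>

/*
 * Illustrative only: once the COPY data stream has ended, fetch the
 * remaining PGresults and report an error if the segment sent one,
 * so the caller can raise it to the user instead of ignoring it.
 */
static bool
finish_copy_and_check(PGconn *conn, char *errbuf, size_t errbuflen)
{
    PGresult   *res;
    bool        ok = true;

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            snprintf(errbuf, errbuflen, "%s", PQresultErrorMessage(res));
            ok = false;
        }
        PQclear(res);
    }
    return ok;
}
```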
-
Committed by Heikki Linnakangas
* 'segdb_state' and 'err_context' fields in CdbCopy were unused, remove.
* 'failedSegDBs' in processCopyEndResults was unused, remove.
* Plus some other cosmetic cleanup, for better readability.
-
- 28 September 2018, 1 commit
Committed by ZhangJackey
There was an assumption in GPDB that a table's data is always distributed on all segments. However, this is not always true: for example, when a cluster is expanded from M segments to N (N > M), all the tables are still on M segments. To work around the problem we used to have to alter all the hash-distributed tables to randomly distributed to get correct query results, at the cost of bad performance.

Now we support table data being distributed on a subset of segments. A new column `numsegments` is added to the catalog table `gp_distribution_policy` to record how many segments a table's data is distributed on. By doing so we can allow DML on the M-segment tables, and joins between M-segment and N-segment tables are also supported.

```sql
-- t1 and t2 are both distributed on (c1, c2),
-- one on 1 segment, the other on 2 segments
select localoid::regclass, attrnums, policytype, numsegments from gp_distribution_policy;
 localoid | attrnums | policytype | numsegments
----------+----------+------------+-------------
 t1       | {1,2}    | p          |           1
 t2       | {1,2}    | p          |           2
(2 rows)

-- t1 and t1 have exactly the same distribution policy,
-- join locally
explain select * from t1 a join t1 b using (c1, c2);
                   QUERY PLAN
------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   ->  Hash Join
         Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
         ->  Seq Scan on t1 a
         ->  Hash
               ->  Seq Scan on t1 b
 Optimizer: legacy query optimizer

-- t1 and t2 are both distributed on (c1, c2),
-- but as they have different numsegments,
-- one has to be redistributed
explain select * from t1 a join t2 b using (c1, c2);
                            QUERY PLAN
------------------------------------------------------------------
 Gather Motion 1:1  (slice2; segments: 1)
   ->  Hash Join
         Hash Cond: a.c1 = b.c1 AND a.c2 = b.c2
         ->  Seq Scan on t1 a
         ->  Hash
               ->  Redistribute Motion 2:1  (slice1; segments: 2)
                     Hash Key: b.c1, b.c2
                     ->  Seq Scan on t2 b
 Optimizer: legacy query optimizer
```
-
- 27 September 2018, 1 commit
Committed by Tang Pengzhou
* Change the type of db_descriptors to SegmentDatabaseDescriptor **

  A new gang definition may consist of cached segdbDesc and newly created segdbDesc, so there is no need to palloc all segdbDesc structs as new.

* Remove an unnecessary allocate-gang unit test.

* Manage idle segment dbs using CdbComponentDatabases instead of available* lists.

  To support variable-size gangs, we now need to manage segment dbs at a lower granularity. Previously, idle QEs were managed by a bunch of lists like availablePrimaryWriterGang and availableReaderGangsN; this restricted the dispatcher to only creating N-size (N = number of segments) or 1-size gangs.

  CdbComponentDatabases is a snapshot of the segment components within the current cluster; it now maintains a freelist for each segment component. When creating a gang, the dispatcher makes up the gang from each segment component (from the freelist, or by creating a new segment db). When cleaning up a gang, the dispatcher returns idle segment dbs to each segment component. CdbComponentDatabases provides a few functions to manipulate segment dbs (SegmentDatabaseDescriptor *):

  * cdbcomponent_getCdbComponents
  * cdbcomponent_destroyCdbComponents
  * cdbcomponent_allocateIdleSegdb
  * cdbcomponent_recycleIdleSegdb
  * cdbcomponent_cleanupIdleSegdbs

  CdbComponentDatabases is also FTS-version sensitive, so once the FTS version changes, CdbComponentDatabases destroys all idle segment dbs and allocates QEs on the newly promoted segment. This provides transparent mirror failover to users.

  Since segment dbs (SegmentDatabaseDescriptor *) are managed by CdbComponentDatabases now, we can simplify the memory context management by replacing GangContext & perGangContext with DispatcherContext & CdbComponentsContext.

* Postpone the error handling when creating a gang.

  Now we have AtAbort_DispatcherState; one advantage of it is that we can postpone gang error handling to this function and make the code cleaner.

* Handle FTS version changes correctly.

  In some cases, when the FTS version changes, we can't update the current snapshot of segment components; more specifically, we can't destroy the current writer segment dbs and create new segment dbs. These cases include:

  * the session has a temp table created.
  * the query needs two-phase commit and the gxid has been dispatched to segments.

* Replace the <gangId, sliceId> map with a <qeIdentifier, sliceId> map.

  We used to dispatch a <gangId, sliceId> map along with the query to segment dbs so segment dbs could know which slice they should execute. Now gangId is useless for a segment db because a segment db can be reused by different gangs, so we need a new way to pass this info to segment dbs. To resolve this, CdbComponentDatabases assigns a unique identifier to each segment db and builds, for each slice, a bitmap set consisting of segment identifiers; segment dbs can then go through the slice table and find the right slice to execute.

* Allow the dispatcher to create variable-size gangs and refine AssignGangs().

  Previously, the dispatcher could only create N-size gangs for GANGTYPE_PRIMARY_WRITER or GANGTYPE_PRIMARY_READER. This restricted the dispatcher in many ways. One example is direct dispatch: it always created an N-size gang even when it only dispatched the command to one segment. Another example is that some operations may be able to use an N+ size gang, like hash join: if both the inner and outer plan are redistributed, the hash join node can be associated with an N+ size gang to execute.

This commit changes the API of createGang() so the caller can specify a list of segments (partial or even duplicate segments); CdbComponentDatabases will guarantee each segment has only one writer in a session. With this it also resolves another pain point of AssignGangs(): the caller doesn't need to promote a GANGTYPE_PRIMARY_READER to GANGTYPE_PRIMARY_WRITER, or promote a GANGTYPE_SINGLETON_READER to GANGTYPE_PRIMARY_WRITER for a replicated table (see FinalizeSliceTree()). With this commit, AssignGangs() is very clear now.
-
- 23 September 2018, 1 commit
Committed by Daniel Gustafsson
There is already an assertion in getgpsegmentCount() testing the count to be > 0 (and 0 can only be returned in utility mode, which still holds this assertion always true).

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
- 21 September 2018, 1 commit
Committed by Adam Lee
It happens if the copy command errors out before assigning dispatcherState. Initialize the dispatcherState as NULL to fix it, and use palloc0() to avoid issues with members added in the future. 5X has no such problem.

```
(gdb) c
Continuing.
Detaching after fork from child process 25843.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000aa04dd in getCdbCopyPrimaryGang (c=0x23d4150) at cdbcopy.c:44
44          return (Gang *)linitial(c->dispatcherState->allocatedGangs);
(gdb) bt
#0  0x0000000000aa04dd in getCdbCopyPrimaryGang (c=0x23d4150) at cdbcopy.c:44
#1  0x0000000000aa12d8 in cdbCopyEndAndFetchRejectNum (c=0x23d4150, total_rows_completed=0x0,
    abort_msg=0xd0c8f8 "aborting COPY in QE due to error in QD") at cdbcopy.c:642
#...
(gdb) p c->dispatcherState
$1 = (struct CdbDispatcherState *) 0x100000000
```
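A minimal sketch of the fix's idea, assuming PostgreSQL's palloc0(); the struct layout and function name are illustrative, not the actual cdbcopy.c definitions.

```c
#include "postgres.h"

/* Illustrative stand-in for the real CdbCopy struct in cdbcopy.c. */
typedef struct CdbCopySketch
{
    struct CdbDispatcherState *dispatcherState;    /* stays NULL until dispatch starts */
    /* ... other members ... */
} CdbCopySketch;

static CdbCopySketch *
makeCdbCopySketch(void)
{
    /*
     * palloc0() zeroes the whole allocation, so pointer members such as
     * dispatcherState start out NULL.  Callers can then test for NULL even
     * if COPY errors out before a dispatcher state was ever assigned, and
     * members added later are zero-initialized automatically.
     */
    return (CdbCopySketch *) palloc0(sizeof(CdbCopySketch));
}
```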
-
- 06 September 2018, 1 commit
Committed by Tang Pengzhou
* Simplify the AssignGangs() logic for init plans.

Previously, AssignGangs() assigned gangs for both the main plan and init plans in one shot. Because init plans and the main plan are executed sequentially, the gangs can be reused between the main plan and init plans; the function AccumSliceReq() was designed for this. This process can be simplified: since we already know the root slice index will be adjusted to the corresponding init plan id, each init plan only needs to assign its own slices.

* Integrate Gang management from the portal into the Dispatcher.

Previously, Gangs were managed by the portal; freeGangsForPortal() was used to clean up gang resources. DTM-related commands also needed a gang to dispatch commands outside of a portal, and used freeGangsForPortal() too. There might be multiple commands/plans/utilities executed within one portal; all of them relied on a dispatcher routine like CdbDispatchCommand / CdbDispatchPlan / CdbDispatchUtility... to dispatch. Gangs were created by each dispatcher routine, but not recycled or destroyed when a routine finished, except for the primary writer gang. One defect of this is that gang resources cannot be reused between dispatcher routines.

GPDB already had an optimization for init plans: if a plan contained init plans, AssignGangs() was called before execution of any of them; it went through the whole slice tree and created the maximum gang that both the main plan and init plans needed. This was doable because init plans and the main plan were executed sequentially, but it also made the AssignGangs() logic complex; meanwhile, reusing an unclean gang was not safe.

Another confusing thing was that the gang and the dispatcher were managed separately, which caused context inconsistencies, for example: when a dispatcher state was destroyed, the gang was not recycled; when a gang was destroyed by the portal, the dispatcher state was still in use and might refer to the context of a destroyed gang.

As described above, this commit integrates gang management with the dispatcher: a dispatcher state is responsible for creating and tracking gangs as needed and destroying them when the dispatcher state is destroyed.

* Handle the case when the primary writer gang has gone.

When members of the primary writer gang are gone, the writer gang is destroyed immediately (primaryWriterGang is set to NULL) when a dispatcher routine (e.g. CdbDispatchCommand) finishes. So when dispatching a two-phase-DTM/DTX related command, the QD doesn't know the writer gang has gone, and it may get unexpected errors like 'savepoint not exist', 'subtransaction level not match', 'temp file not exist'. Previously, primaryWriterGang was not reset when DTM/DTX commands started even if it was pointing to invalid segments, so those DTM/DTX commands were not actually sent to the segments; a normal error reported on the QD looked like 'could not connect to segment: initialization of segworker'.

So we need a way to inform the global transaction that its writer gang has been lost, so that when aborting the transaction, the QD can:

1. disconnect all reader gangs; this is useful to skip dispatching "ABORT_NO_PREPARE".
2. reset the session and drop temp files, because the temp files on the segment are gone.
3. report an error when dispatching the "rollback savepoint" DTX, because the savepoint on the segment is gone.
4. report an error when dispatching the "abort subtransaction" DTX, because the subtransaction is rolled back when the writer segment is down.
-
- 14 August 2018, 1 commit
Committed by Pengzhou Tang
Previously, COPY used CdbDispatchUtilityStatement directly to dispatch 'COPY' statements to all QEs and then sent/received data from primaryWriterGang. This happened to work because primaryWriterGang is not recycled when a dispatcher state is destroyed, which is nasty because the COPY command has logically finished. This commit splits the COPY dispatching logic into two parts to make it more reasonable.
-
- 03 August 2018, 1 commit
Committed by Daniel Gustafsson
The recent COPY refactoring left cdbCopyEnd() unused, so remove the function as it's now dead code. Also clean up a comment which erroneously was referring to it.

Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
-
- 17 May 2018, 1 commit
Committed by Adam Lee
Without this, integer overflow occurs when more than 2^31 rows are copied in `COPY ON SEGMENT` mode. Errors happen when the value is cast to uint64, the type of `processed` in `CopyStateData`: a third-party Postgres driver, which reads it as an int64, fails with an out-of-range error.
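A small standalone C sketch of the failure mode described above; the variable names are illustrative and not the actual COPY code.

```c
#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    int64_t  actual_rows = ((int64_t) 1 << 31) + 10;    /* > 2^31 rows copied */

    /* A 32-bit counter cannot hold the count: the conversion wraps it to a
     * negative value (implementation-defined truncation on conversion). */
    int32_t  rows32 = (int32_t) actual_rows;

    /* Widening the wrapped value to the uint64 'processed' field yields a
     * huge bogus number, which a driver reading it as int64 rejects as out
     * of range.  A 64-bit counter avoids the problem entirely. */
    uint64_t processed = (uint64_t) rows32;

    printf("actual rows:               %lld\n", (long long) actual_rows);
    printf("32-bit counter after wrap: %ld\n", (long) rows32);
    printf("as uint64 'processed':     %llu\n", (unsigned long long) processed);
    return 0;
}
```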
-
- 29 March 2018, 1 commit
Committed by Pengzhou Tang
* Support replicated tables in GPDB.

Currently, tables are distributed across all segments by hash or randomly in GPDB. There is a requirement to introduce a new table type where all segments have a full duplicate of the table's data, called a replicated table.

To implement it, we added a new distribution policy named POLICYTYPE_REPLICATED to mark a replicated table, and added a new locus type named CdbLocusType_SegmentGeneral to specify the distribution of tuples of a replicated table. CdbLocusType_SegmentGeneral implies data is generally available on all segments but not available on qDisp, so a plan node with this locus type can be flexibly planned to execute on either a single QE or all QEs. It is similar to CdbLocusType_General; the only difference is that a CdbLocusType_SegmentGeneral node can't be executed on qDisp. To guarantee this, we try our best to add a gather motion on top of a CdbLocusType_SegmentGeneral node when planning motion for a join, even if the other rel has a bottleneck locus type. A problem is that such a motion may be redundant if the single QE is not promoted to execute on qDisp in the end, so we need to detect such cases and omit the redundant motion at the end of apply_motion().

We don't reuse CdbLocusType_Replicated since it always implies a broadcast motion below, and it's not easy to plan such a node as direct dispatch to avoid getting duplicate data.

We don't support replicated tables with inherit/partition-by clauses for now; the main problem is that update/delete on multiple result relations can't work correctly yet. We can fix this later.

* Allow spi_* to access replicated tables on the QE.

Previously, GPDB didn't allow a QE to access non-catalog tables because the data is incomplete; we can remove this limitation now if it only accesses replicated tables. One problem is that the QE needs to know whether a table is a replicated table; previously the QE didn't maintain the gp_distribution_policy catalog, so we need to pass policy info to the QE for replicated tables.

* Change the schema of gp_distribution_policy to identify replicated tables.

Previously, we used a magic number -128 in the gp_distribution_policy table to identify replicated tables, which was quite a hack, so we add a new column in gp_distribution_policy to identify replicated tables and partitioned tables. This commit also abandons the old way that used a 1-length-NULL list and a 2-length-NULL list to identify the DISTRIBUTED RANDOMLY and DISTRIBUTED FULLY clauses. Besides, this commit refactors the code to make the decision-making of distribution policy clearer.

* Support COPY for replicated tables.

* Disable the row ctid unique path for replicated tables.

Previously, GPDB used a special Unique path on rowid to address queries like "x IN (subquery)". For example, for select * from t1 where t1.c2 in (select c2 from t3), the plan looks like:

    ->  HashAggregate
          Group By: t1.ctid, t1.gp_segment_id
          ->  Hash Join
                Hash Cond: t2.c2 = t1.c2
                ->  Seq Scan on t2
                ->  Hash
                      ->  Seq Scan on t1

Obviously, the plan is wrong if t1 is a replicated table because ctid + gp_segment_id can't identify a tuple; in a replicated table, a logical row may have different ctid and gp_segment_id values. So we disable such plans for replicated tables temporarily. It's not the best way, because the rowid unique path may be cheaper than a normal hash semi join, so we left a FIXME for later optimization.

* ORCA related fix

Reported and added by Bhuvnesh Chaudhary <bchaudhary@pivotal.io>
Fall back to the legacy query optimizer for queries over replicated tables.

* Adapt pg_dump/gpcheckcat to replicated tables: gp_distribution_policy is no longer a master-only catalog, so do the same check as for other catalogs.

* Support gpexpand on replicated tables && altering the distribution policy of replicated tables.
-
- 28 March 2018, 1 commit
Committed by Asim R P
The command "COPY enumtest FROM stdin;" hit an infinite loop on merge branch. Code indicates that the issue can happen on master as well. QD backend went into infinite loop when the connection was already closed from QE end. The TCP connection was in CLOSE_WAIT state. Libpq connection status was CONNECTION_BAD and asyncStatus was PGASYNC_BUSY. Fix the infinite loop by checking libpq connection status in each iteration.
-
- 23 March 2018, 1 commit
Committed by Ashwin Agrawal
-
- 28 December 2017, 1 commit
Committed by Adam Lee
There are two places where the QD keeps trying to get data, ignores SIGINT, and does not send a signal to the QEs. If the program on the segment has no input/output, the copy command hangs.

To fix it, this commit:

1. lets the QD wait for connections to become readable before PQgetResult(), and cancels the queries if it gets interrupt signals while waiting
2. sets DF_CANCEL_ON_ERROR when dispatching in cdbcopy.c
3. completes the copy error handling

```
-- prepare
create table test(t text);
copy test from program 'yes|head -n 655360';

-- could be canceled
copy test from program 'sleep 100 && yes test';
copy test from program 'sleep 100 && yes test<SEGID>' on segment;
copy test from program 'yes test';
copy test to '/dev/null';
copy test to program 'sleep 100 && yes test';
copy test to program 'sleep 100 && yes test<SEGID>' on segment;

-- should fail
copy test from program 'yes test<SEGID>' on segment;
copy test to program 'sleep 0.1 && cat > /dev/nulls';
copy test to program 'sleep 0.1<SEGID> && cat > /dev/nulls' on segment;
```
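A minimal libpq sketch of point 1, assuming a plain select()-based wait and a SIGINT handler that sets a flag; the function is illustrative and not the actual dispatcher code.

```c
#include <signal.h>
#include <sys/select.h>
#include <libpq-fe.h>

static volatile sig_atomic_t got_interrupt = 0;     /* set by a SIGINT handler */

/*
 * Illustrative only: wait until data is readable before PQgetResult(), and
 * cancel the remote query if an interrupt arrives while waiting, so the QD
 * does not hang when the program on the segment produces no input/output.
 */
static PGresult *
get_result_cancellable(PGconn *conn)
{
    while (PQisBusy(conn))
    {
        fd_set          rfds;
        struct timeval  tv = {1, 0};                /* wake up periodically */
        int             sock = PQsocket(conn);

        if (sock < 0)
            return NULL;

        if (got_interrupt)
        {
            char        errbuf[256];
            PGcancel   *cancel = PQgetCancel(conn);

            if (cancel != NULL)
            {
                PQcancel(cancel, errbuf, sizeof(errbuf));
                PQfreeCancel(cancel);
            }
            got_interrupt = 0;      /* the cancelled query still returns an error result */
        }

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        if (select(sock + 1, &rfds, NULL, NULL, &tv) < 0)
            return NULL;
        if (FD_ISSET(sock, &rfds) && !PQconsumeInput(conn))
            return NULL;
    }
    return PQgetResult(conn);
}
```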
-
- 30 October 2017, 1 commit
Committed by Adam Lee
Signed-off-by: Adam Lee <ali@pivotal.io>
-
- 10 October 2017, 1 commit
Committed by Ashwin Agrawal
-
- 25 September 2017, 1 commit
Committed by Adam Lee
Replace popen() with popen_with_stderr(), which is also used in external web tables to collect the stderr output of the program. Since popen_with_stderr() forks a `sh` process, it is almost always successful; this commit catches errors that happen in fwrite(). Also pass variables the same way the external web table does.

Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
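A small standalone sketch of why checking the writes matters: launching `sh` via popen() rarely fails, so errors surface later, when writing to the pipe or when the child exits. Names here are illustrative, not the GPDB implementation.

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Illustrative only: report short writes instead of silently losing data. */
static bool
write_row_to_program(FILE *pipe, const char *row)
{
    size_t len = strlen(row);

    return fwrite(row, 1, len, pipe) == len;
}

int
main(void)
{
    FILE *pipe = popen("cat > /dev/null", "w");     /* starting `sh` almost always works */

    if (pipe == NULL)
        return 1;
    if (!write_row_to_program(pipe, "hello world\n"))
        fprintf(stderr, "write to program failed\n");
    if (pclose(pipe) != 0)
        fprintf(stderr, "program exited with a failure status\n");
    return 0;
}
```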
-
- 01 September 2017, 1 commit
Committed by Daniel Gustafsson
This bumps the copyright years to the appropriate years after not having been updated for some time. Also reformats existing code headers to match the upstream style to ensure consistency.
-
- 28 August 2017, 2 commits
Committed by Adam Lee
Don't send nonsense '\n' characters just for counting, let segments report how many rows are processed instead.

Signed-off-by: Ming LI <mli@apache.org>
-
Committed by Xiaoran Wang
When using the `COPY FROM ON SEGMENT` command, we copy data from a local file to the table on the segment directly. When copying data, we need to apply the distribution policy to each record to compute the target segment. If the target segment ID isn't equal to the current segment ID, we report an error to keep the distribution key restriction.

Because the segment has no metadata about the table's distribution policy and partition policy, we copy the distribution policy of the main table from the master to the segment in the query plan. When the parent table and a partitioned sub-table have different distribution policies, it is difficult to check all of the distribution key restrictions in all sub-tables; in this case, we report an error. In case the partitioned table's distribution policy is RANDOMLY and differs from the parent table, the user can use the GUC `gp_enable_segment_copy_checking` to disable this check.

Check the distribution key restriction as follows:

1) Table isn't partitioned: compute the data's target segment. If the data doesn't belong to the segment, report an error.
2) Table is partitioned and the distribution policy of the partitioned table is the same as the main table: compute the data's target segment. If the data doesn't belong to the segment, report an error.
3) Table is partitioned and the distribution policy of the partitioned table is different from the main table: checking is not supported, report an error.

Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
Signed-off-by: Ming LI <mli@apache.org>
Signed-off-by: Adam Lee <ali@pivotal.io>
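A toy standalone sketch of the check in case 1: hash the distribution key and compare the target segment with the local segment ID. The real code uses GPDB's cdbhash machinery over the distribution key; the plain hash and names here are assumptions for illustration.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/*
 * Illustrative only: decide whether a row read from the local file actually
 * belongs to this segment.  'key_hash' stands in for the cdbhash value of
 * the row's distribution key.
 */
static bool
row_belongs_to_segment(uint32_t key_hash, int numsegments, int my_segid)
{
    int target = (int) (key_hash % (uint32_t) numsegments);

    return target == my_segid;
}

int
main(void)
{
    /* pretend the distribution key of a row hashed to 0x2a on a 4-segment cluster */
    if (!row_belongs_to_segment(0x2a, 4, 1))
        fprintf(stderr, "ERROR: row does not belong to this segment\n");
    return 0;
}
```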
-
- 11 August 2017, 1 commit
Committed by Heikki Linnakangas
-
- 09 August 2017, 1 commit
Committed by Pengzhou Tang
cf7cddf7 has a conflict with cc38f526: struct PQExpBufferData is needed by the structure SegmentDatabaseDescriptor, so bring gp-libpq-int.h back.
-
- 08 August 2017, 1 commit
Committed by Heikki Linnakangas
StringInfo is more appropriate in backend code. (Unless the buffer needs to be used in a thread.) In passing, rename the 'conn' static variable in cdbfilerepconnclient.c. It seemed overly generic.
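A minimal sketch of the StringInfo idiom referred to above, using PostgreSQL's stringinfo API; the function and message are made up for illustration.

```c
#include "postgres.h"
#include "lib/stringinfo.h"

/*
 * Illustrative only: unlike PQExpBuffer, a StringInfo buffer is palloc'd in
 * the current memory context, which is what backend code normally wants
 * (but it is not safe to use from a separate thread).
 */
static char *
build_message(const char *relname, int64 rows)
{
    StringInfoData buf;

    initStringInfo(&buf);
    appendStringInfo(&buf, "copied " INT64_FORMAT " rows from %s", rows, relname);
    return buf.data;            /* palloc'd; freed with the containing context */
}
```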
-
- 31 July 2017, 1 commit
Committed by Ming LI
Support a COPY statement that imports data files on the segments directly, in parallel. It can be used to import data files generated by "COPY ... TO ... ON SEGMENT". This commit also supports all of the data file formats that "COPY ... TO" supports, processes reject limit numbers, and logs errors accordingly.

Key workflow:

a) For COPY FROM, nothing is changed by this commit: dispatch the modified COPY command to the segments first, then read the data file on the master and dispatch the data to the relevant segment to process.
b) For COPY FROM ON SEGMENT: on the QD, read a dummy data file, with the other parts unchanged; on the QE, first process the (empty) data stream dispatched from the QD, then re-do the same workflow to read and process the local segment data file.

Signed-off-by: Ming LI <mli@pivotal.io>
Signed-off-by: Adam Lee <ali@pivotal.io>
Signed-off-by: Haozhou Wang <hawang@pivotal.io>
Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
-
- 20 January 2017, 1 commit
Committed by alldefector
Binary COPY was previously disabled in Greenplum; this commit re-enables the binary mode by incorporating the upstream code from PostgreSQL.

Patch by GitHub user alldefector with additional hacking by Daniel Gustafsson
-
- 24 November 2016, 1 commit
Committed by Daniel Gustafsson
The COPY process isn't finished until PQgetCopyData() returns -1, so we must consume all rows sent until we get -1. A single call should in most cases do the trick, but there are no guarantees, so loop around the consumption until end-of-COPY is signalled. In case of error, break out and continue operation to keep the current flow; investigating the error is probably warranted, but handling the cases that might fall out is a bigger question than this isolated patch. Also use PQfreemem() for the buffer, per the manual. This is the same as just free() on Linux/UNIX (while critically different on Windows), but we might as well follow the set API to reduce confusion.

Original report by Coverity
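A minimal libpq sketch of the loop described above, not the GPDB code itself: keep consuming rows until PQgetCopyData() returns -1 (or -2 on error) and free each buffer with PQfreemem().

```c
#include <stdio.h>
#include <libpq-fe.h>

/*
 * Illustrative only: consume COPY OUT data until PQgetCopyData() signals
 * end-of-COPY (-1) or an error (-2), releasing each buffer with PQfreemem()
 * as the libpq manual requires.
 */
static int
drain_copy_out(PGconn *conn, FILE *out)
{
    char   *buf;
    int     len;

    while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
    {
        fwrite(buf, 1, (size_t) len, out);
        PQfreemem(buf);
    }
    if (len == -2)
    {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        return -1;
    }
    /* len == -1: COPY done; the final status comes from PQgetResult() */
    return 0;
}
```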
-
- 07 November 2016, 1 commit
Committed by Heikki Linnakangas
Instead of carrying a "new OID" field in all the structs that represent CREATE statements, introduce a generic mechanism for capturing the OIDs of all created objects, dispatching them to the QEs, and using those same OIDs when the corresponding objects are created in the QEs. This allows removing a lot of scattered changes in DDL command handling that were previously needed to ensure that objects are assigned the same OIDs in all the nodes.

This also provides the groundwork for pg_upgrade to dictate the OIDs to use for upgraded objects. The upstream has mechanisms for pg_upgrade to dictate the OIDs for a few objects (relations and types, at least), but in GPDB, we need to preserve the OIDs of almost all object types.
-
- 04 November 2016, 1 commit
Committed by xiong-gang
Signed-off-by: Kenan Yao <kyao@pivotal.io>
-
- 18 August 2016, 1 commit
Committed by Heikki Linnakangas
I found these with "callcatcher", written by Caolán McNamara. Many thanks for the tool! See https://www.skynet.ie/~caolan/Packages/callcatcher.html
-
- 25 July 2016, 1 commit
Committed by Pengzhou Tang
Refactor CdbDispatchUtilityStatement() to make it flexible for cdbCopyStart() and dispatchVacuum() to call directly. Introduce flags like DF_NEED_TWO_SNAPSHOT, DF_WITH_SNAPSHOT, and DF_CANCEL_ON_ERROR to make the function calls much clearer.
-
- 28 June 2016, 1 commit
Committed by Kenan Yao
-
- 19 May 2016, 1 commit
Committed by Pengzhou Tang
Move dispatcher code into the dispatcher/ directory. This commit has no logic change; it just moves code across files, to make the dispatcher code clearer and easier to unit test.

Signed-off-by: Kenan Yao
-
- 28 October 2015, 1 commit