1. 30 Oct 2017 (1 commit)
  2. 28 Oct 2017 (1 commit)
    • H
      When dispatching, send ActiveSnapshot along, not some random snapshot. · 4a95afc1
      Committed by Heikki Linnakangas
      If the caller specifies DF_WITH_SNAPSHOT, so that the command is dispatched
      to the segments with a snapshot, but there is currently no active snapshot
      in the QD itself, that seems like a mistake.
      
      In qdSerializeDtxContextInfo(), the comment talked about which snapshot to
      use when the transaction has already been aborted. I didn't quite
      understand that. I don't think the function is used to dispatch the "ABORT"
      statement itself, and we shouldn't be dispatching anything else in an
      already-aborted transaction.
      
      This makes it more clear which snapshot is dispatched along with the
      command. In theory, the latest or serializable snapshot can be different
      from the one being used when the command is dispatched, although I'm not
      sure if there are any such cases in practice.
      
      In the upcoming 8.4 merge, there are more changes coming up to snapshot
      management, which make it more difficult to get hold of the latest acquired
      snapshot in the transaction, so changing this now will ease the pain of
      merging that.
      
      I don't know why, but after making the change in qdSerializeDtxContextInfo,
      I started to get a lot of "Too many distributed transactions for snapshot
      (maxCount %d, count %d)" errors. Looking at the code, I don't understand
      how it ever worked. I don't see any guarantee that the array in
      TempQDDtxContextInfo or TempDtxContextInfo was pre-allocated correctly.
      Or maybe it got allocated big enough to hold max_prepared_xacts, which
      was always large enough, but it seemed rather haphazard to me. So in
      the spirit of "if you don't understand it, rewrite it until you do", I
      changed the way the allocation of the inProgressXidArray array works.
      In statically allocated snapshots, i.e. SerializableSnapshot and
      LatestSnapshot, the array is malloc'd. In a snapshot copied with
      CopySnapshot(), it points to a part of the palloc'd space for the
      snapshot. Nothing new so far, but I changed CopySnapshot() to set
      "maxCount" to -1 to indicate that it's not malloc'd. Then I modified
      DistributedSnapshot_Copy and DistributedSnapshot_Deserialize to not give up
      if the target array is not large enough, but enlarge it as needed. Finally,
      I made a little optimization in GetSnapshotData() when running in a QE, to
      move the copying of the distributed snapshot data to outside the section
      guarded by ProcArrayLock. ProcArrayLock can be heavily contended, so that's
      a nice little optimization anyway, but especially now that
      DistributedSnapshot_Copy() might need to realloc the array.
      4a95afc1
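The allocation scheme described above can be sketched in plain C. This is a simplified toy model, not the actual GPDB code; the struct and function names are illustrative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model of a distributed snapshot's in-progress XID array.
 * maxCount == -1 marks storage that belongs to a palloc'd snapshot copy
 * and therefore must not be free()'d or realloc()'d here. */
typedef struct DtxSnapshot
{
    int  count;                /* number of in-progress distributed XIDs */
    int  maxCount;             /* allocated slots, or -1 if not malloc'd */
    int *inProgressXidArray;
} DtxSnapshot;

/* Copy src into dst, enlarging the target array as needed instead of
 * giving up with a "Too many distributed transactions" error. */
static void
dtx_snapshot_copy(DtxSnapshot *dst, const DtxSnapshot *src)
{
    if (dst->maxCount == -1 || dst->maxCount < src->count)
    {
        if (dst->maxCount != -1)
            free(dst->inProgressXidArray);   /* malloc'd by us: safe to free */
        dst->inProgressXidArray = malloc(src->count * sizeof(int));
        dst->maxCount = src->count;
    }
    memcpy(dst->inProgressXidArray, src->inProgressXidArray,
           src->count * sizeof(int));
    dst->count = src->count;
}
```

The -1 sentinel lets the copy routine distinguish "too small but resizable" from "not ours to resize", which is the crux of the enlarge-as-needed change.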
  3. 10 Oct 2017 (1 commit)
  4. 15 Sep 2017 (1 commit)
    • H
      Rewrite the way a DTM initialization error is logged, to retain file & lineno. · c6f931fe
      Committed by Heikki Linnakangas
      While working on the 8.4 merge, I had a bug that tripped an Insist inside
      the PG_TRY-CATCH. That was very difficult to track down, because of the way
      the error is logged here: using ereport() includes the filename and line
      number where the error is re-emitted, not the original location. So all I
      got was "Unexpected internal error" in the log, with a meaningless filename
      & lineno.
      
      This rewrites the way the error is reported so that it preserves the
      original filename and line number. It will also use the original error
      level and will preserve all the other fields.
      c6f931fe
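The idea can be sketched in a toy model. ErrorData here is a stand-in for PostgreSQL's real error record, and the formatting function is illustrative, not the actual rewrite:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the fields of a caught error that matter here. */
typedef struct ErrorData
{
    int         elevel;     /* original error level */
    const char *filename;   /* file where the error was originally raised */
    int         lineno;     /* ...and the line number there */
    const char *message;
} ErrorData;

/* Re-reporting via ereport() would stamp the file/line of the re-emit
 * site.  Instead, format the saved record so that the original location
 * and level survive into the log. */
static void
report_preserving_location(const ErrorData *edata, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "level=%d %s:%d: %s",
             edata->elevel, edata->filename, edata->lineno, edata->message);
}
```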
  5. 01 Sep 2017 (1 commit)
  6. 30 Aug 2017 (1 commit)
  7. 11 Aug 2017 (4 commits)
  8. 09 Aug 2017 (1 commit)
    • P
      Do not include gp-libpq-fe.h and gp-libpq-int.h in cdbconn.h · cf7cddf7
      Committed by Pengzhou Tang
      The whole cdb directory was shipped to end users, and all header files
      that cdb*.h includes also need to be shipped to make checkinc.py
      pass. However, exposing gp_libpq_fe/*.h would confuse customers because
      they are almost the same as libpq/*. Per Heikki's suggestion, we should
      keep gp_libpq_fe/* unchanged. So to make the system work, we include
      gp-libpq-fe.h and gp-libpq-int.h directly in the .c files that need them.
      cf7cddf7
  9. 07 Jul 2017 (1 commit)
    • A
      Remove unused variable in checkpoint record. · f737c2d2
      Committed by Ashwin Agrawal
      The segmentCount variable in the TMGXACT_CHECKPOINT structure is unused,
      hence remove it. Also remove the union in the fspc_agg_state,
      tspc_agg_state and dbdir_agg_state structures, as there is no apparent
      reason for having it.
      f737c2d2
  10. 20 Jun 2017 (1 commit)
    • A
      Remove tmlock test and add an assert instead. · 944306d7
      Committed by Abhijit Subramanya
      The test used to validate that the tmlock is not held after completing the DTM
      recovery. The root cause for not releasing the lock was that in case of an
      error during recovery `elog_demote(WARNING)` was called which would demote the
      error to a warning. This would cause the abort processing code to not get
      executed and hence the lock would not be released. Adding a simple assert in
      the code once DTM recovery is complete is sufficient to make sure that the lock
      is released.
      944306d7
  11. 02 Jun 2017 (1 commit)
    • X
      Remove subtransaction information from SharedLocalSnapshotSlot · b52ca70f
      Committed by Xin Zhang
      Originally, the reader kept copies of subtransaction information in
      two places.  First, it copied SharedLocalSnapshotSlot to share between
      writer and reader.  Second, reader kept another copy in subxbuf for
      better performance.  Due to lazy xid, subtransaction information can
      change in the writer asynchronously with respect to the reader.  This
      caused the reader's subtransaction information to become out of date.
      
      This fix removes those copies of subtransaction information in the
      reader and adds a reference to the writer's PGPROC to
      SharedLocalSnapshotSlot.  The reader should refer to subtransaction
      information through the writer's PGPROC and pg_subtrans.
      
      Also added is a lwlock per shared snapshot slot.  The lock protects
      shared snapshot information between a writer and readers belonging to
      the same session.
      
      Fixes github issues #2269 and #2284.
      Signed-off-by: Asim R P <apraveen@pivotal.io>
      b52ca70f
  12. 01 Jun 2017 (1 commit)
    • A
      Optimize DistributedSnapshot check and refactor to simplify. · 3c21b7d8
      Committed by Ashwin Agrawal
      Before this commit, a snapshot stored the distributed in-progress
      transactions (populated during snapshot creation) and their corresponding
      localXids (found later during tuple visibility checks and used as a cache,
      via reverse mapping) in a single tightly coupled data structure,
      DistributedSnapshotMapEntry. Storing the information this way posed a
      couple of problems:
      
      1] Only one localXid can be cached per distributedXid. For
      sub-transactions, the same distribXid can be associated with multiple
      localXids, but since only one can be cached, the other local xids
      associated with the distributedXid require consulting the distributed_log.
      
      2] While performing a tuple visibility check, the code must always first
      loop over the full size of the distributed in-progress array to check
      whether a cached localXid can be used to avoid the reverse mapping.
      
      Now the distributed in-progress array is decoupled from the local xid
      cache. This allows storing multiple localXids per distributedXid. It also
      allows scanning the localXid cache only when the tuple's xid is relevant
      to it, and scanning only as many elements as are actually cached, instead
      of always the full size of the distributed in-progress array even when
      nothing was cached.
      
      Along the way, refactored relevant code a bit as well to simplify further.
      3c21b7d8
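The decoupling can be illustrated with a toy model in C. The structs and sizes below are illustrative, not the real GPDB code; the point is that the cache is scanned by its own count, not by the in-progress array's size:

```c
#include <assert.h>

#define MAX_CACHE 8

/* One cached (dxid -> localXid) pair; several entries may share a dxid,
 * which is what allows multiple local xids per distributed xid. */
typedef struct DxidCacheEntry { int dxid; int localXid; } DxidCacheEntry;

typedef struct DtxSnapshotCache
{
    int            cachedCount;       /* scan only this many entries */
    DxidCacheEntry cache[MAX_CACHE];
} DtxSnapshotCache;

/* Cache another localXid for a dxid (e.g. a subtransaction's xid). */
static void
cache_local_xid(DtxSnapshotCache *c, int dxid, int localXid)
{
    if (c->cachedCount < MAX_CACHE)
    {
        c->cache[c->cachedCount].dxid = dxid;
        c->cache[c->cachedCount].localXid = localXid;
        c->cachedCount++;
    }
}

/* Visibility-check helper: is this local xid cached under the dxid?
 * The loop bound is cachedCount, not the in-progress array size. */
static int
cached_has(const DtxSnapshotCache *c, int dxid, int localXid)
{
    for (int i = 0; i < c->cachedCount; i++)
        if (c->cache[i].dxid == dxid && c->cache[i].localXid == localXid)
            return 1;
    return 0;
}
```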
  13. 28 Apr 2017 (1 commit)
    • A
      Correct calculation of xminAllDistributedSnapshots and set it on QE's. · d887fe0c
      Committed by Ashwin Agrawal
      For vacuum, page pruning and freezing to perform their jobs correctly on
      QEs, they need to know, globally, the lowest dxid that any transaction in
      the full cluster can still see. Hence the QD must calculate that info and
      send it to the QEs. For this purpose, we use logic similar to the
      calculation of globalxmin from local snapshots. TMGXACT for global
      transactions serves a role similar to PROC, and hence it is leveraged to
      provide the lowest gxid for its snapshot. Further, using its array,
      shmGxactArray, we can easily find the lowest across all global snapshots
      and pass it down to the QEs via the snapshot.
      
      Adding unit test for createDtxSnapshot along with the change.
      d887fe0c
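The "lowest across all global snapshots" step is a simple minimum scan. A toy sketch (the function name and parameters are illustrative; in the real code the values come from the TMGXACT entries in shmGxactArray):

```c
#include <assert.h>

/* Each in-progress global transaction records the xmin of its distributed
 * snapshot; the lowest across all of them is what vacuum/pruning on QEs
 * must respect.  With no snapshots in progress, fall back to the latest
 * completed gxid (a stand-in for the "nothing older is needed" case). */
static int
compute_xmin_all_distributed_snapshots(const int *gxactXmins, int n,
                                       int latestGxid)
{
    int xmin = latestGxid;
    for (int i = 0; i < n; i++)
        if (gxactXmins[i] < xmin)
            xmin = gxactXmins[i];
    return xmin;
}
```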
  14. 01 Apr 2017 (2 commits)
    • A
      Cleanup LocalDistribXactData related code. · 8c20bc94
      Committed by Ashwin Agrawal
      Commit fb86c90d "Simplify management of
      distributed transactions." cleaned up a lot of code for
      LocalDistribXactData and introduced LocalDistribXactData in PROC for
      debugging purposes. But it is only correctly maintained on QEs; the QD
      never populated LocalDistribXactData in MyProc. Instead, TMGXACT also had
      a LocalDistribXactData which was set initially on the QD but never updated
      later, and it confused more than it served a purpose. Hence remove
      LocalDistribXactData from TMGXACT, as TMGXACT already has other fields
      which provide the required information. Also clean up the QD-related
      states, as even in PROC only QEs use LocalDistribXactData.
      8c20bc94
    • A
      Fully enable lazy XID allocation in GPDB. · 0932453d
      Committed by Ashwin Agrawal
      As part of the 8.3 merge, upstream commit 295e6398
      "Implement lazy XID allocation" was merged. But transactionIds were still
      allocated in StartTransaction, as the code changes required to make it
      work for GPDB with distributed transactions were pending, so the feature
      remained disabled. Some progress was made by commit
      a54d84a3 "Avoid assigning an XID to
      DTX_CONTEXT_QE_AUTO_COMMIT_IMPLICIT queries." This commit now addresses
      the pending work needed to handle deferred xid allocation correctly with
      distributed transactions and fully enables the feature.
      
      Important highlights of changes:
      
      1] Modify the xlog write and replay records for DISTRIBUTED_COMMIT. Even
      if a transaction is read-only on the master and no xid is allocated to it,
      it can still be a distributed transaction and hence needs to persist
      itself in such a case. So, write an xlog record even if no local xid is
      assigned but the transaction is prepared. Similarly, during xlog replay of
      the XLOG_XACT_DISTRIBUTED_COMMIT type, perform distributed commit recovery
      while ignoring the local commit. That also means not committing to the
      distributed log in this case, as it is only used to perform the reverse
      map from localxid to distributed xid.
      
      2] Remove localXID from gxact, as it no longer needs to be maintained and
      used.
      
      3] Refactor the code for QE reader StartTransaction. There used to be a
      wait-loop with sleep, checking whether SharedLocalSnapshotSlot has the
      same distributed XID as the reader, in order to assign the reader the
      writer's xid for SET-type commands until the reader actually performs
      GetSnapshotData(). Since now a) the writer will not have a valid xid until
      it performs some write, the writer's transactionId here always turns out
      to be InvalidTransactionId, and b) read operations like SET don't need an
      xid any more, the need for this wait is gone.
      
      4] Throw an error if using a distributed transaction without a distributed
      xid. Earlier, AssignTransactionId() was called for this case in
      StartTransaction(), but such a scenario doesn't exist, hence convert it to
      an ERROR.
      
      5] The QD, during snapshot creation in createDtxSnapshot(), was earlier
      able to assign the localXid in inProgressEntryArray corresponding to the
      distribXid, as the localXid was known by that time. That is no longer the
      case, and the localXid will mostly get assigned after the snapshot is
      taken. Hence now, even for the QD, as for QEs, the localXid is not
      populated at snapshot creation time but found later in
      DistributedSnapshotWithLocalMapping_CommittedTest(). There is a chance to
      optimize and somewhat match the earlier behavior by populating the gxact
      in AssignTransactionId() once the localXid is known, but currently that
      seems not worth it, as QEs have to perform the lookups anyway.
      0932453d
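The core of lazy XID allocation, as described above, can be sketched in a few lines of toy C (global variables stand in for backend-local and shared state; names mirror the upstream functions but the bodies are illustrative):

```c
#include <assert.h>

/* Toy sketch of lazy XID allocation: a transaction's local xid is
 * assigned on first use (first write), not at StartTransaction. */
#define InvalidTransactionId 0

static int nextXid = 100;            /* stand-in for the shared xid counter */
static int currentTransactionId = InvalidTransactionId;

static void
StartTransaction(void)
{
    currentTransactionId = InvalidTransactionId;   /* no eager assignment */
}

static int
GetCurrentTransactionId(void)
{
    if (currentTransactionId == InvalidTransactionId)
        currentTransactionId = nextXid++;   /* AssignTransactionId() path */
    return currentTransactionId;
}
```

This is why a read-only QD transaction can still be a distributed transaction with no local xid, the case point 1] above has to persist in the xlog.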
  15. 07 Mar 2017 (1 commit)
    • A
      Fix checkpoint wait for CommitTransaction. · 787992e4
      Committed by Ashwin Agrawal
      `MyProc->inCommit` protects against a checkpoint running concurrently with
      transactions that are in commit.
      
      However, `MyProc->lxid` has to be valid because `GetVirtualXIDsDelayingChkpt()`
      and `HaveVirtualXIDsDelayingChkpt()` require `VirtualTransactionIdIsValid()` in
      addition to `inCommit` to block the checkpoint process.
      
      In this fix, we defer clearing `inCommit` and `lxid` to `CommitTransaction()`.
      787992e4
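The invariant the fix restores can be shown with a toy model (struct and function names are illustrative; in PostgreSQL the real checks live in GetVirtualXIDsDelayingChkpt()/HaveVirtualXIDsDelayingChkpt()):

```c
#include <assert.h>

/* A checkpoint waits for a backend only while BOTH inCommit is set AND
 * its virtual transaction id (lxid) is still valid.  If lxid is cleared
 * before inCommit, the checkpoint stops waiting too early, which is why
 * the fix defers clearing both to CommitTransaction(). */
typedef struct Proc
{
    int inCommit;
    int lxid;       /* 0 means invalid, mirroring VirtualTransactionIdIsValid */
} Proc;

static int
delays_checkpoint(const Proc *p)
{
    return p->inCommit && p->lxid != 0;
}
```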
  16. 29 Dec 2016 (1 commit)
    • A
      Fix interrupt count issue in DTM. · e1cac369
      Committed by Ashwin Agrawal
      A PG_TRY - PG_CATCH block was added to the distributed transaction commit
      prepared and abort prepared calls as part of commit c6320c. This call,
      though, happens to be inside a HOLD_INTERRUPTS - RESUME_INTERRUPTS block.
      Hence we need to maintain the interrupts counter correctly, as any ERROR
      sets InterruptHoldoffCount to 0 in the elog code; due to this issue we
      were hitting the PANIC "Resume interrupt holdoff count is bad (0)".
      e1cac369
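The bookkeeping pattern can be reproduced in miniature with setjmp/longjmp standing in for PG_TRY/PG_CATCH (everything here is a toy model; the variable name matches the real counter but the mechanics are simplified):

```c
#include <assert.h>
#include <setjmp.h>

/* elog(ERROR) zeroes the holdoff counter, so a catch block inside
 * HOLD_INTERRUPTS/RESUME_INTERRUPTS must restore the saved count before
 * resuming, or the counter underflows and the "Resume interrupt holdoff
 * count is bad (0)" PANIC fires. */
static int InterruptHoldoffCount = 0;
static jmp_buf catchBuf;

static void
raise_error(void)
{
    InterruptHoldoffCount = 0;    /* what elog's error path does */
    longjmp(catchBuf, 1);
}

static void
commit_prepared_with_catch(void)
{
    InterruptHoldoffCount++;                  /* HOLD_INTERRUPTS() */
    volatile int savedHoldoff = InterruptHoldoffCount;
    if (setjmp(catchBuf) == 0)                /* PG_TRY */
        raise_error();                        /* the dispatch fails */
    else
        InterruptHoldoffCount = savedHoldoff; /* PG_CATCH: restore count */
    InterruptHoldoffCount--;                  /* RESUME_INTERRUPTS() */
}
```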
  17. 13 Dec 2016 (1 commit)
    • A
      Refactor distributed transaction phase 2 retry logic. · c6320c13
      Committed by Asim R P
      Refactor the phase 2 retry logic of distributed transactions so that the
      retry happens immediately after a failure instead of happening inside
      EndCommand(). The patch also increases the number of retries in case of
      failure to 2 and introduces a GUC called dtx_phase2_retry_count to control
      the number of retries.
      c6320c13
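The retry shape described above can be sketched as follows. This is a toy model: the function names and the flaky dispatcher are illustrative, and only the GUC name and default come from the commit message:

```c
#include <assert.h>

static int dtx_phase2_retry_count = 2;   /* GUC; default per the commit */

/* Retry phase 2 immediately on failure, up to dtx_phase2_retry_count
 * additional attempts beyond the first. */
static int
do_phase2_with_retry(int (*dispatch)(void *), void *arg)
{
    for (int attempt = 0; attempt <= dtx_phase2_retry_count; attempt++)
        if (dispatch(arg))
            return 1;          /* succeeded, possibly after retries */
    return 0;                  /* exhausted all retries */
}

/* Example flaky dispatcher: fails until the counter in arg hits zero. */
static int
flaky_dispatch(void *arg)
{
    int *failuresLeft = (int *) arg;
    if (*failuresLeft > 0)
    {
        (*failuresLeft)--;
        return 0;
    }
    return 1;
}
```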
  18. 24 Nov 2016 (1 commit)
    • D
      Guard against possible NULL pointer dereferencing · 280416b7
      Committed by Daniel Gustafsson
      Improves the defensiveness of programming around pointer dereferencing to
      ensure that we don't risk a NULL pointer. Most of these are quite
      straightforward; those of note are discussed below.
      
      In doDispatchDtxProtocolCommand() we relied on the result data being
      created in zeroed-out memory in CdbDispatchDtxProtocolCommand(), which
      isn't guaranteed for every compiler. Explicitly set numResults to zero
      and also check the results for NULL.
      
      Per multiple reports by Coverity
      280416b7
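The defensive pattern for the doDispatchDtxProtocolCommand() case can be sketched in a toy model (the function names echo the commit but the signatures and types are illustrative):

```c
#include <assert.h>
#include <stddef.h>

typedef struct PGresult { int status; } PGresult;

/* Stand-in for CdbDispatchDtxProtocolCommand(): may return NULL, and on
 * failure it may never touch *numResults. */
static PGresult *
dispatch_dtx_command(int *numResults, int fail)
{
    if (fail)
        return NULL;
    static PGresult ok = {1};
    *numResults = 1;
    return &ok;
}

static int
do_dispatch(int fail)
{
    int numResults = 0;        /* explicit zero: don't trust zeroed memory */
    PGresult *results = dispatch_dtx_command(&numResults, fail);
    if (results == NULL)       /* guard the dereference */
        return 0;
    return numResults;
}
```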
  19. 04 Nov 2016 (1 commit)
  20. 26 Aug 2016 (1 commit)
    • H
      Silence compiler warning. · ed004c4b
      Committed by Heikki Linnakangas
      Gcc 6.1 complains about "tautological compare". Per the comment, the
      intention here is to unconditionally fail the assertion, so use a more
      straightforward Assert(false) to do that.
      ed004c4b
  21. 18 Aug 2016 (1 commit)
  22. 25 Jul 2016 (2 commits)
    • P
      Refactor command dispatch related function, · f7078db2
      Committed by Pengzhou Tang
      The original cdbdisp_dispatchRMCommand() and CdbDoCommand() were easily
      confused. This commit combines them into one, and meanwhile pushes error
      handling down to make coding easier.
      f7078db2
    • P
      Refactor utility statement dispatch interfaces · 01769ada
      Committed by Pengzhou Tang
      Refactor CdbDispatchUtilityStatement() to make it flexible for
      cdbCopyStart() and dispatchVacuum() to call directly. Introduce flags like
      DF_NEED_TWO_SNAPSHOT, DF_WITH_SNAPSHOT and DF_CANCEL_ON_ERROR to make the
      function calls much clearer.
      01769ada
  23. 16 Jul 2016 (2 commits)
    • H
      Simplify management of distributed transactions. · fb86c90d
      Committed by Heikki Linnakangas
      We used to have a separate array of LocalDistributedXactData instances,
      and a reference in PGPROC to its associated LocalDistributedXact. That's
      unnecessarily complicated: we can store the LocalDistributedXact
      information directly in the PGPROC entry, and get rid of the auxiliary
      array and the bookkeeping needed to manage that array.
      
      This doesn't affect the backend-private cache of committed Xids that also
      lives in cdblocaldistribxact.c.
      
      Now that the PGPROC->localDistributedXactData fields are never accessed
      by other backends, don't protect it with ProcArrayLock anymore. This makes
      the code simpler, and potentially improves performance too (ProcArrayLock
      can be very heavily contended on a busy system).
      fb86c90d
    • H
      Remove mechanism to poll QEs for max distributed XID at QD startup. · 2914c24f
      Committed by Heikki Linnakangas
      There's no need to try to make the dXIDs unique across restarts, because
      we always carry the QD startup timestamp along with dXIDs, which
      disambiguates the same dXID before and after restart.
      
      Per Asim RP's comments.
      2914c24f
  24. 04 Jul 2016 (1 commit)
    • D
      Use SIMPLE_FAULT_INJECTOR() macro where possible · 38741b45
      Committed by Daniel Gustafsson
      Callers of FaultInjector_InjectFaultIfSet() which pass neither a
      databasename nor a tablename and that use DDLNotSpecified can instead
      use the convenience macro SIMPLE_FAULT_INJECTOR(), which cuts down on
      the boilerplate in the code. This commit does not bring any changes
      in functionality, merely readability.
      38741b45
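The boilerplate reduction can be illustrated with a toy model. The wrapped function here is a stub with an illustrative signature, not the real fault-injector API; only the macro's general shape (fill in DDLNotSpecified and empty names) follows the commit message:

```c
#include <assert.h>
#include <string.h>

typedef enum { DDLNotSpecified = 0 } DDLType;

static const char *lastFault;   /* records the last injected fault name */

/* Stub standing in for the full-argument fault injector entry point. */
static int
FaultInjector_InjectFaultIfSet(const char *faultName, DDLType ddl,
                               const char *databaseName,
                               const char *tableName)
{
    (void) ddl; (void) databaseName; (void) tableName;
    lastFault = faultName;
    return 0;
}

/* The common case needs only the fault name. */
#define SIMPLE_FAULT_INJECTOR(faultName) \
    FaultInjector_InjectFaultIfSet((faultName), DDLNotSpecified, "", "")
```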
  25. 13 Jun 2016 (1 commit)
    • K
      Dispatch exactly same text string for all slices. · 4b360942
      Committed by Kenan Yao
      Include a map from sliceIndex to gang_id in the dispatched string,
      and remove the localSlice field, so the QE now gets the localSlice
      from the map. This way, we avoid duplicating and modifying
      the dispatch text string slice by slice, and each QE of a sliced
      dispatch now gets the same contents.
      
      The extra space cost is sizeof(int) * SliceNumber bytes, and the extra
      computing cost is iterating the SliceNumber-size array. Compared with
      memcpy of text string for each slice in previous implementation, this
      way is much cheaper, because SliceNumber is much smaller than the size
      of dispatch text string. Also, since SliceNumber is so small, we just
      use an array for the map instead of a hash table.
      
      Also, clean up some dead code in dispatcher, including:
      (1) Remove primary_gang_id field of Slice struct and DispatchCommandDtxProtocolParms
      struct, since dispatch agent is deprecated now;
      (2) Remove redundant logic in cdbdisp_dispatchX;
      (3) Clean up buildGpDtxProtocolCommand;
      4b360942
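The lookup on the QE side is a linear scan over the small map array, which is why an array beats a hash table here. A toy sketch (the function name and the direction of the lookup are illustrative):

```c
#include <assert.h>

/* The dispatched string carries one array mapping sliceIndex -> gang_id;
 * a QE recovers its localSlice by finding the slice whose gang id matches
 * its own, instead of each slice receiving a modified copy of the
 * dispatch text.  numSlices is small, so a plain loop suffices. */
static int
find_local_slice(const int *sliceToGang, int numSlices, int myGangId)
{
    for (int i = 0; i < numSlices; i++)
        if (sliceToGang[i] == myGangId)
            return i;
    return -1;    /* no slice assigned to this gang */
}
```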
  26. 21 May 2016 (1 commit)
    • G
      refactor gang management code · 46dfa750
      Committed by Gang Xiong
      1) add one new type of gang: singleton reader gang.
      2) change interface of allocateGang.
      3) handling exceptions during gang creation: segment down and segment reset.
      4) cleanup some dead code.
      46dfa750
  27. 19 May 2016 (1 commit)
  28. 13 May 2016 (1 commit)
    • H
      Clean up the way the results array is allocated in cdbdisp_returnResults(). · 6a28c978
      Committed by Heikki Linnakangas
      I saw the "nresults < nslots" assertion fail, while hacking on something
      else. It happened when a Distributed Prepare command failed, and there were
      several error result sets from a segment. I'm not sure how normal it is to
      receive multiple ERROR responses to a single query, but the protocol
      certainly allows it, and I don't see any explanation for why the code used
      to assume that there can be at most 2 result sets from each segment.
      
      Remove that assumption, and make the code cope with more than two result
      sets from a segment, by calculating the required size of the array
      accurately.
      
      In passing, remove the NULL terminator from the array, and change the
      callers that depended on it to use the returned size variable instead.
      This makes the loops in the callers look less funky.
      6a28c978
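The accurate-sizing idea can be sketched in a toy model (the function and its parameters are illustrative; the real code gathers PGresult sets from segment connections):

```c
#include <assert.h>
#include <stdlib.h>

/* Instead of assuming at most two result sets per segment, count the
 * result sets first and allocate exactly; return the size via *sizeOut
 * rather than NULL-terminating the array. */
static int *
collect_results(const int *perSegmentCounts, int numSegments, int *sizeOut)
{
    int total = 0;
    for (int i = 0; i < numSegments; i++)
        total += perSegmentCounts[i];     /* may be >2, e.g. several ERRORs */

    int *results = malloc(total * sizeof(int));
    int k = 0;
    for (int i = 0; i < numSegments; i++)
        for (int j = 0; j < perSegmentCounts[i]; j++)
            results[k++] = i;             /* tag each result with its segment */

    *sizeOut = total;                     /* callers loop on this, not NULL */
    return results;
}
```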
  29. 10 May 2016 (1 commit)
  30. 22 Mar 2016 (1 commit)
  31. 12 Feb 2016 (2 commits)
    • H
      Misc header file cleanup · 442c105e
      Committed by Heikki Linnakangas
      Remove unnecessary #includes, add #includes that are actually needed by
      some headers.
      442c105e
    • H
      Replace "uint" type with uint32 or unsigned int. · ce33af22
      Committed by Heikki Linnakangas
      "uint" is not a standard C type, so it might not be available on all
      platforms. Indeed, we had a typedef for WIN32 for that. But there's no reason
      to use "uint", might as well just use the C standard "unsigned int", or the
      PostgreSQL-specific uint32. Makes the intention more clear too, IMHO.
      ce33af22
  32. 09 Feb 2016 (1 commit)
    • A
      Fix race condition while preparing transaction instead of serializing prepares. · 75b2d55d
      Committed by Ashwin Agrawal
      Removed the locking and made some more cleanups.
      Avoid looping again in FinishPreparedTransaction: the prepare_lsn needed
      to commit the transaction can be found using the gxact which we have
      locked; it seems pointless to loop around again to scan the list.
      
      This is modified patch for GPDB based on postgres patch:
      https://github.com/postgres/postgres/commit/bb38fb0d43c8d7ff54072bfd8bd63154e536b384#diff-3ed77c70e54e7f56eff48f6157aba91e
      Original Patch commit message:
      To lock a prepared transaction's shared memory entry, we used to mark it
      with the XID of the backend. When the XID was no longer active according
      to the proc array, the entry was implicitly considered as not locked
      anymore. However, when preparing a transaction, the backend's proc array
      entry was cleared before transfering the locks (and some other state) to
      the prepared transaction's dummy PGPROC entry, so there was a window where
      another backend could finish the transaction before it was in fact fully
      prepared.
      
      To fix, rewrite the locking mechanism of global transaction entries. Instead
      of an XID, just have simple locked-or-not flag in each entry (we store the
      locking backend's backend id rather than a simple boolean, but that's just
      for debugging purposes). The backend is responsible for explicitly unlocking
      the entry, and to make sure that that happens, install a callback to unlock
      it on abort or process exit.
      75b2d55d
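The reworked locking described in the upstream message can be sketched as a toy model (field and function names are illustrative; the real patch also installs an abort/exit callback to guarantee the explicit unlock):

```c
#include <assert.h>

#define InvalidBackendId (-1)

/* Each global transaction entry carries the locking backend's id
 * (InvalidBackendId when unlocked) instead of an XID whose validity is
 * inferred from the proc array.  The backend unlocks explicitly, so
 * there is no window where clearing the proc array entry implicitly
 * releases a transaction that is not yet fully prepared. */
typedef struct GlobalTransaction
{
    int lockingBackendId;    /* InvalidBackendId when not locked */
} GlobalTransaction;

static int
lock_gxact(GlobalTransaction *gxact, int myBackendId)
{
    if (gxact->lockingBackendId != InvalidBackendId)
        return 0;                       /* another backend holds it */
    gxact->lockingBackendId = myBackendId;
    return 1;
}

static void
unlock_gxact(GlobalTransaction *gxact)
{
    gxact->lockingBackendId = InvalidBackendId;
}
```

Storing the backend id rather than a plain boolean costs nothing and, as the upstream message notes, helps debugging.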
  33. 28 Oct 2015 (1 commit)