1. 28 Oct, 2017 1 commit
    • H
      When dispatching, send ActiveSnapshot along, not some random snapshot. · 4a95afc1
      Committed by Heikki Linnakangas
      If the caller specifies DF_WITH_SNAPSHOT, so that the command is dispatched
      to the segments with a snapshot, but it currently has no active snapshot in
      the QD itself, that seems like a mistake.
      
      In qdSerializeDtxContextInfo(), the comment talked about which snapshot to
      use when the transaction has already been aborted. I didn't quite
      understand that. I don't think the function is used to dispatch the "ABORT"
      statement itself, and we shouldn't be dispatching anything else in an
      already-aborted transaction.
      
      This makes it more clear which snapshot is dispatched along with the
      command. In theory, the latest or serializable snapshot can be different
      from the one being used when the command is dispatched, although I'm not
      sure if there are any such cases in practice.
      
      In the upcoming 8.4 merge, there are more changes coming up to snapshot
      management, which make it more difficult to get hold of the latest acquired
      snapshot in the transaction, so changing this now will ease the pain of
      merging that.
      
      I don't know why, but after making the change in qdSerializeDtxContextInfo,
      I started to get a lot of "Too many distributed transactions for snapshot
      (maxCount %d, count %d)" errors. Looking at the code, I don't understand
      how it ever worked. I don't see any guarantee that the array in
      TempQDDtxContextInfo or TempDtxContextInfo was pre-allocated correctly.
      Or maybe it got allocated big enough to hold max_prepared_xacts, which
      was always large enough, but it seemed rather haphazard to me. So in
      the spirit of "if you don't understand it, rewrite it until you do", I
      changed the way the allocation of the inProgressXidArray array works.
      In statically allocated snapshots, i.e. SerializableSnapshot and
      LatestSnapshot, the array is malloc'd. In a snapshot copied with
      CopySnapshot(), it points to a part of the palloc'd space for the
      snapshot. Nothing new so far, but I changed CopySnapshot() to set
      "maxCount" to -1 to indicate that it's not malloc'd. Then I modified
      DistributedSnapshot_Copy and DistributedSnapshot_Deserialize to not give up
      if the target array is not large enough, but enlarge it as needed. Finally,
      I made a little optimization in GetSnapshotData() when running in a QE, to
      move the copying of the distributed snapshot data to outside the section
      guarded by ProcArrayLock. ProcArrayLock can be heavily contended, so that's
      a nice little optimization anyway, but especially now that
      DistributedSnapshot_Copy() might need to realloc the array.
      4a95afc1
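The "enlarge it as needed" behavior described in this commit message can be sketched roughly as follows. This is only an illustration, not the GPDB code: `DemoDistributedSnapshot` and `demo_snapshot_copy` are hypothetical names, and the only detail taken from the message is the convention that `maxCount == -1` marks an array that is not malloc'd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a distributed snapshot. */
typedef struct DemoDistributedSnapshot
{
    int       count;     /* number of in-progress xids in use */
    int       maxCount;  /* malloc'd capacity, or -1 if the array is not malloc'd */
    uint32_t *inProgressXidArray;
} DemoDistributedSnapshot;

/*
 * Copy src into dst, growing dst's malloc'd array when it is too small
 * instead of giving up. Returns 0 on success, -1 if out of memory.
 */
static int
demo_snapshot_copy(DemoDistributedSnapshot *dst,
                   const DemoDistributedSnapshot *src)
{
    /* The target must own a malloc'd array (possibly empty), never a -1 one. */
    assert(dst->maxCount >= 0);

    if (src->count > dst->maxCount)
    {
        /* realloc(NULL, n) behaves like malloc(n), so an empty target works. */
        uint32_t *bigger = realloc(dst->inProgressXidArray,
                                   (size_t) src->count * sizeof(uint32_t));

        if (bigger == NULL)
            return -1;
        dst->inProgressXidArray = bigger;
        dst->maxCount = src->count;
    }

    if (src->count > 0)
        memcpy(dst->inProgressXidArray, src->inProgressXidArray,
               (size_t) src->count * sizeof(uint32_t));
    dst->count = src->count;
    return 0;
}
```

Because the copy may now call realloc, it makes sense to run it outside any section guarded by a heavily contended lock, which matches the last point of the message.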
  2. 25 Aug, 2017 1 commit
    • H
      Use ereport, rather than elog, for performance. · 01dff3ba
      Committed by Heikki Linnakangas
      ereport() has one subtle but important difference to elog: it doesn't
      evaluate its arguments, if the log level says that the message doesn't
      need to be printed. This makes a small but measurable difference in
      performance, if the arguments contain more complicated expressions, like
      function calls.
      
      While performance testing a workload with very short queries, I saw some
      CPU time being used in DtxContextToString. Those calls were coming from the
      arguments to elog() statements, and the result was always thrown away,
      because the log level was not high enough to actually log anything. Turn
      those elog()s into ereport()s, for speed.
      
      The problematic case here was a few elogs containing DtxContextToString
      calls, in hot codepaths, but I changed a few surrounding ones too, for
      consistency.
      
      Simplify the mock test, to not bother mocking elog(), while we're at it.
      The real elog/ereport work just fine in the mock environment.
      01dff3ba
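The difference in argument evaluation can be illustrated with a small stand-alone sketch. It does not use the real elog/ereport definitions: `demo_log_eager`, `DEMO_LOG_LAZY`, `demo_min_level` and `demo_expensive_to_string` are hypothetical names that only mimic a function-style logger versus a macro that checks the level first.

```c
#include <assert.h>
#include <stdio.h>

static int demo_min_level = 2;        /* hypothetical log threshold */
static int demo_expensive_calls = 0;  /* counts evaluations of the argument */

/* Stand-in for an expensive formatting helper such as DtxContextToString. */
static const char *
demo_expensive_to_string(int value)
{
    static char buf[32];

    demo_expensive_calls++;
    snprintf(buf, sizeof(buf), "context-%d", value);
    return buf;
}

/* Function-style logger: the caller evaluates msg before the level check. */
static void
demo_log_eager(int level, const char *msg)
{
    if (level >= demo_min_level)
        fprintf(stderr, "%s\n", msg);
}

/* Macro-style logger: msg_expr is evaluated only if the level check passes. */
#define DEMO_LOG_LAZY(level, msg_expr) \
    do { \
        if ((level) >= demo_min_level) \
            fprintf(stderr, "%s\n", (msg_expr)); \
    } while (0)
```

With the threshold at 2, `demo_log_eager(1, demo_expensive_to_string(1))` still pays for the string conversion, while `DEMO_LOG_LAZY(1, demo_expensive_to_string(1))` does not call the helper at all.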
  3. 07 Jun, 2017 1 commit
    • P
      restore TCP interconnect · 353a937d
      Committed by Pengzhou Tang
      This commit restores the TCP interconnect and fixes some hang issues.
      
      * Restore the TCP interconnect code.
      * Add a GUC called gp_interconnect_tcp_listener_backlog for TCP, to control the backlog parameter of the listen call.
      * Use memmove instead of memcpy, because the memory areas do overlap.
      * Call checkForCancelFromQD() for the TCP interconnect if no data has arrived for a while; this keeps the QD from getting stuck.
      * Revert the cancelUnfinished-related modification in 8d251945; otherwise some queries get stuck.
      * Move and rename the fault injector "cursor_qe_reader_after_snapshot" to make test cases pass under the TCP interconnect.
      353a937d
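The memmove point in the list above can be illustrated with a generic buffer-compaction sketch. The function `demo_compact_buffer` is hypothetical and is not taken from the interconnect code; it only shows a case where source and destination overlap, so memcpy would be undefined behavior while memmove is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Shift the unread tail of a receive buffer to the front.
 * The source (buf + consumed) and the destination (buf) overlap whenever
 * more bytes remain than were consumed, so memmove is required.
 * Returns the number of bytes that remain in the buffer.
 */
static size_t
demo_compact_buffer(char *buf, size_t used, size_t consumed)
{
    size_t remaining;

    assert(consumed <= used);
    remaining = used - consumed;
    memmove(buf, buf + consumed, remaining);
    return remaining;
}
```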
  4. 02 Jun, 2017 1 commit
    • X
      Remove subtransaction information from SharedLocalSnapshotSlot · b52ca70f
      Committed by Xin Zhang
      Originally, the reader kept copies of subtransaction information in
      two places.  First, it copied SharedLocalSnapshotSlot to share between
      writer and reader.  Second, reader kept another copy in subxbuf for
      better performance.  Due to lazy xid, subtransaction information can
      change in the writer asynchronously with respect to the reader.  This
      caused the reader's subtransaction information to go out of date.
      
      This fix removes those copies of subtransaction information in the
      reader and adds a reference to the writer's PGPROC to
      SharedLocalSnapshotSlot.  Reader should refer to subtransaction
      information through writer's PGPROC and pg_subtrans.
      
      Also added is a lwlock per shared snapshot slot.  The lock protects
      shared snapshot information between a writer and readers belonging to
      the same session.
      
      Fixes github issues #2269 and #2284.
      Signed-off-by: Asim R P <apraveen@pivotal.io>
      b52ca70f
  5. 01 Jun, 2017 1 commit
    • A
      Optimize DistributedSnapshot check and refactor to simplify. · 3c21b7d8
      Committed by Ashwin Agrawal
      Before this commit, a snapshot stored the distributed in-progress
      transactions, populated during snapshot creation, together with their
      corresponding localXids, found later by reverse mapping during tuple
      visibility checks and used as a cache, in a single tightly coupled data
      structure, DistributedSnapshotMapEntry. Storing the information this way
      posed a couple of problems:
      
      1] Only one localXid can be cached per distributedXid. With
      sub-transactions, the same distribXid can be associated with multiple
      localXids, but since only one can be cached, the other local xids
      associated with the distributedXid require consulting the distributed_log.
      
      2] While performing a tuple visibility check, the code must always first
      loop over the full distributed in-progress array to check whether a cached
      localXid can be used to avoid the reverse mapping.
      
      This commit decouples the distributed in-progress array from the local xid
      cache. This allows multiple local xids to be stored per distributedXid. It
      also allows the localXid cache to be scanned only when the tuple xid is
      relevant to it, and the scan covers only the number of cached elements
      rather than the full size of the distributed in-progress array, which was
      previously scanned even when nothing was cached.
      
      Along the way, refactored the relevant code a bit to simplify it further.
      3c21b7d8
  6. 28 Apr, 2017 1 commit
    • A
      Correct calculation of xminAllDistributedSnapshots and set it on QE's. · d887fe0c
      Committed by Ashwin Agrawal
      For vacuum, page pruning and freezing to do their jobs correctly on QEs,
      they need to know the lowest dxid that any transaction in the whole cluster
      can still see. Hence the QD must calculate that value and send it to the
      QEs. For this purpose, the commit uses logic similar to the calculation of
      globalxmin from local snapshots. TMGXACT serves a role for global
      transactions similar to that of PROC, and hence it is leveraged to provide
      the lowest gxid for its snapshot. Using its array, shmGxactArray, the lowest
      value across all global snapshots can easily be found and passed down to the
      QEs via the snapshot.
      
      Adds a unit test for createDtxSnapshot along with the change.
      d887fe0c
  7. 13 Apr, 2017 1 commit
    • A
      Fix dereference after null check in ProcArrayEndTransaction. · 7a9af586
      Committed by Ashwin Agrawal
      Coverity reported: Either the check against null is unnecessary, or there may be
      a null pointer dereference. In ProcArrayEndTransaction: Pointer is checked
      against null but then dereferenced anyway.
      
      While it's not an actual issue, since in the commit case the pointer is never
      null, simplify the code and stop using the pointer itself here.
      7a9af586
  8. 01 Apr, 2017 3 commits
    • A
      Cleanup LocalDistribXactData related code. · 8c20bc94
      Committed by Ashwin Agrawal
      Commit fb86c90d "Simplify management of
      distributed transactions." cleaned up a lot of code for LocalDistribXactData and
      introduced LocalDistribXactData in PROC for debugging purposes. But it is only
      correctly maintained for QEs; the QD never populated LocalDistribXactData in
      MyProc. Instead, TMGXACT also had a LocalDistribXactData, which was set
      initially for the QD but never updated later, and it confused more than it
      served its purpose. Hence remove LocalDistribXactData from TMGXACT, as it already has
      other fields which provide the required information. Also, clean up QD-related
      states, as even in PROC only QEs use LocalDistribXactData.
      8c20bc94
    • A
      Fully enable lazy XID allocation in GPDB. · 0932453d
      Committed by Ashwin Agrawal
      As part of the 8.3 merge, upstream commit 295e6398
      "Implement lazy XID allocation" was merged. But transaction IDs were still
      allocated in StartTransaction, as the code changes required to make it work for GPDB
      with distributed transactions were pending, so the feature remained
      disabled. Some progress was made by commit
      a54d84a3 "Avoid assigning an XID to
      DTX_CONTEXT_QE_AUTO_COMMIT_IMPLICIT queries." Now this commit addresses the
      pending work needed for handling deferred xid allocation correctly with
      distributed transactions and fully enables the feature.
      
      Important highlights of changes:
      
      1] Modify the xlog write and xlog replay record for DISTRIBUTED_COMMIT. Even if a
      transaction is read-only on the master and no xid is allocated to it, it can still
      be a distributed transaction and hence needs to persist itself in such a case. So,
      write the xlog record even if no local xid is assigned but the transaction is
      prepared. Similarly, during xlog replay of the XLOG_XACT_DISTRIBUTED_COMMIT type,
      perform distributed commit recovery, ignoring the local commit. This also means that in
      this case nothing is committed to the distributed log, as it is only used to perform the
      reverse mapping of local xid to distributed xid.
      
      2] Remove localXID from gxact, as it no longer needs to be maintained or used.
      
      3] Refactor the code for QE reader StartTransaction. There used to be a wait loop with
      sleep, checking whether SharedLocalSnapshotSlot has the same distributed XID as
      the READER, in order to assign the reader the same xid as the writer for SET-type
      commands until the READER actually performs GetSnapshotData(). Now that a) the writer
      does not have a valid xid until it performs some write, so the writer's transactionId
      always turns out to be InvalidTransaction here, and b) read operations like SET no
      longer need an xid, the need for this wait is gone.
      
      4] Throw an error if a distributed transaction is used without a distributed xid. Earlier,
      AssignTransactionId() was called for this case in StartTransaction(), but such a
      scenario doesn't exist, hence convert it to an ERROR.
      
      5] Earlier, during snapshot creation in createDtxSnapshot(), the QD was able to assign
      the localXid in inProgressEntryArray corresponding to each distribXid, as the localXid was
      known by that time. That is no longer the case; the localXid will mostly get
      assigned after the snapshot is taken. Hence now, even on the QD, as on QEs, the localXid is
      not populated at snapshot creation time but found later in
      DistributedSnapshotWithLocalMapping_CommittedTest(). There is a chance to optimize
      and try to match the earlier behavior somewhat by populating gxact in
      AssignTransactionId() once the localXid is known, but currently it does not seem
      worth it, as QEs have to perform the lookups anyway.
      0932453d
    • A
      Optimize distributed xact commit check. · 692be1a1
      Committed by Ashwin Agrawal
      Leverage the fact that inProgressEntryArray is sorted based on distribXid while
      creating the snapshot in createDtxSnapshot. So, can break out fast in function
      DistributedSnapshotWithLocalMapping_CommittedTest().
      692be1a1
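The early exit that a sorted array makes possible can be sketched as follows. This is a simplified illustration with a hypothetical function name, `demo_xid_in_progress`; it assumes ascending order and ignores xid wraparound, and it is not the body of DistributedSnapshotWithLocalMapping_CommittedTest().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if xid appears in an array sorted in ascending order.
 * As soon as an entry larger than xid is seen, no later entry can match,
 * so the loop stops without scanning the rest of the array.
 */
static bool
demo_xid_in_progress(const uint32_t *sorted, int count, uint32_t xid)
{
    assert(count >= 0);

    for (int i = 0; i < count; i++)
    {
        if (sorted[i] == xid)
            return true;
        if (sorted[i] > xid)
            break;              /* sorted input: give up early */
    }
    return false;
}
```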
  9. 07 Mar, 2017 2 commits
    • A
      Fix checkpoint wait for CommitTransaction. · 787992e4
      Committed by Ashwin Agrawal
      `MyProc->inCommit` exists to prevent a checkpoint from running while a
      transaction is in commit.
      
      However, `MyProc->lxid` has to be valid because `GetVirtualXIDsDelayingChkpt()`
      and `HaveVirtualXIDsDelayingChkpt()` require `VirtualTransactionIdIsValid()` in
      addition to `inCommit` to block the checkpoint process.
      
      In this fix, we defer clearing `inCommit` and `lxid` to `CommitTransaction()`.
      787992e4
    • A
      Use VXIDs instead of xid for checkpoint delay. · a02b9e99
      Committed by Ashwin Agrawal
      Originally, the checkpoint checked for the xid. However, the xid is used to control
      transaction visibility, and it's crucial to clear this xid once the process is done
      with the commit and before it releases locks.
      
      However, the checkpoint needs to wait for `AtExat_smgr()` to clean up persistent
      table information, which happens after the locks are released, by which point `xid` is
      already cleared.
      
      Hence, we use the VXID, which has no impact on visibility.
      
      NOTE: See upstream PostgreSQL commit f21bb9cf for a
      similar fix.
      a02b9e99
  10. 10 Feb, 2017 1 commit
  11. 25 Jan, 2017 1 commit
    • A
      Stop ignoring Lazy vacuum from RecentXmin calculation. · 7383c2b0
      Committed by Ashwin Agrawal
      As part of the 8.3 merge, upstream commit
      92c2ecc1 introduced code to ignore lazy vacuum when
      calculating RecentXmin and RecentGlobalXmin.
      
      In GPDB, as part of lazy vacuum, a reindex is performed for bitmap indexes, which
      generates tuples in pg_class with the lazy vacuum's transaction ID. Ignoring lazy
      vacuum in RecentXmin and RecentGlobalXmin during GetSnapshotData caused
      hint bits to be incorrectly set to `HEAP_XMAX_INVALID` for tuples intended to be
      deleted by the lazy vacuum, breaking the HOT chain. This transaction visibility issue
      was encountered in CI many times, with the parallel schedule `bitmap_index, analyze`
      failing with the error `could not find pg_class tuple for index` at commit time of
      the lazy vacuum. Hence this commit stops tracking lazy vacuum in MyProc and
      performing any specific action related to it.
      7383c2b0
  12. 21 Dec, 2016 1 commit
    • A
      Update SharedLocalSnapshot correctly for subtransactions · 46d9521b
      Committed by Ashwin Agrawal
      The QE reader leverages SharedLocalSnapshot to perform visibility checks. The QE writer
      is responsible for keeping the SharedLocalSnapshot up to date. Before this fix,
      SharedLocalSnapshot was only updated by the writer while acquiring the snapshot. But
      if a transaction id was assigned to a subtransaction after the snapshot had been taken,
      it was not reflected. Because of this, when the QE reader called
      TransactionIdIsCurrentTransactionId, it could sometimes return false, depending on timing,
      for subtransaction ids used by the QE writer to insert/update tuples. To fix
      this, SharedLocalSnapshot is now updated when a transaction id is assigned,
      and the id is deregistered if the subtransaction aborts.
      
      Also, add a faultinjector to suspend the cursor QE reader, instead of the guc/sleep used
      in the past. Move the cursor tests from bugbuster to ICG and add a deterministic test
      to exercise the behavior.
      
      Fixes #1276, reported by @pengzhout
      46d9521b
  13. 16 Jul, 2016 1 commit
    • H
      Simplify management of distributed transactions. · fb86c90d
      Committed by Heikki Linnakangas
      We used to have a separate array of LocalDistributedXactData instances, and
      a reference in PGPROC to its associated LocalDistributedXact. That's
      unnecessarily complicated: we can store the LocalDistributedXact information
      directly in the PGPROC entry, and get rid of the auxiliary array and the
      bookkeeping needed to manage that array.
      
      This doesn't affect the backend-private cache of committed Xids that also
      lives in cdblocaldistribxact.c.
      
      Now that the PGPROC->localDistributedXactData fields are never accessed
      by other backends, don't protect it with ProcArrayLock anymore. This makes
      the code simpler, and potentially improves performance too (ProcArrayLock
      can be very heavily contended on a busy system).
      fb86c90d
  14. 04 Jul, 2016 1 commit
    • D
      Use SIMPLE_FAULT_INJECTOR() macro where possible · 38741b45
      Committed by Daniel Gustafsson
      Callers to FaultInjector_InjectFaultIfSet() which pass neither
      databasename nor tablename and that use DDLNotSpecified can instead
      use the convenient macro SIMPLE_FAULT_INJECTOR() which cuts down on
      the boilerplate in the code. This commit does not bring any changes
      in functionality, merely readability.
      38741b45
  15. 28 Jun, 2016 1 commit
  16. 10 May, 2016 1 commit
  17. 09 Dec, 2015 1 commit
    • H
      Backport the 5 second wait if a database is in use. · f042e3e8
      Committed by Heikki Linnakangas
      As promised in previous commit. Upstream patch:
      
      commit bd0a2609
      Author: Tom Lane <tgl@sss.pgh.pa.us>
      Date:   Fri Jun 1 19:38:07 2007 +0000
      
          Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other backends
          will exit before failing because of conflicting DB usage.  Per discussion,
          this seems a good idea to help mask the fact that backend exit takes nonzero
          time.  Remove a couple of thereby-obsoleted sleeps in contrib and PL
          regression test sequences.
      f042e3e8
  18. 28 Oct, 2015 1 commit
  19. 29 Jul, 2009 1 commit
  20. 31 Mar, 2009 1 commit
    • H
      Fix a rare race condition when commit_siblings > 0 and a transaction commits · 6bd98835
      Committed by Heikki Linnakangas
      at the same instant as a new backend is spawned. Since CountActiveBackends()
      doesn't hold ProcArrayLock, it needs to be prepared for the case that a
      pointer at the end of the proc array is still NULL even though numProcs says
      it should be valid, since it doesn't hold ProcArrayLock. Backpatch to 8.1.
      8.0 and earlier had this right, but it was broken in the split of PGPROC and
      sinval shared memory arrays.
      
      Per report and proposal by Marko Kreen.
      6bd98835
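The defensive check described in this commit message can be sketched in a simplified, single-threaded form. `DemoProc`, `DemoProcArray` and `demo_count_active` are hypothetical stand-ins, not the real PGPROC or CountActiveBackends() definitions; the sketch only shows a scan that tolerates a NULL slot below `numProcs`, and it does not reproduce the actual concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the shared process array. */
typedef struct DemoProc
{
    int pid;                    /* 0 means the entry is not an active backend */
} DemoProc;

typedef struct DemoProcArray
{
    int       numProcs;         /* may be bumped before the last slot is filled */
    DemoProc *procs[8];
} DemoProcArray;

/*
 * Count active entries without holding a lock. Since a concurrent writer may
 * have increased numProcs before storing the new pointer, a NULL slot is
 * skipped instead of being dereferenced.
 */
static int
demo_count_active(const DemoProcArray *arr)
{
    int count = 0;

    assert(arr->numProcs >= 0 && arr->numProcs <= 8);

    for (int i = 0; i < arr->numProcs; i++)
    {
        const DemoProc *proc = arr->procs[i];

        if (proc == NULL)
            continue;           /* slot not filled in yet */
        if (proc->pid != 0)
            count++;
    }
    return count;
}
```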
  21. 02 Jan, 2009 1 commit
  22. 05 Aug, 2008 1 commit
    • T
      Improve CREATE/DROP/RENAME DATABASE so that when failing because the source · 4abd7b49
      Committed by Tom Lane
      or target database is being accessed by other users, it tells you whether
      the "other users" are live sessions or uncommitted prepared transactions.
      (Indeed, it tells you exactly how many of each, but that's mostly just
      because it was easy to do so.)  This should help forestall the gotcha of
      not realizing that a prepared transaction is what's blocking the command.
      Per discussion.
      4abd7b49
  23. 11 Jul, 2008 1 commit
  24. 13 May, 2008 1 commit
    • A
      Improve snapshot manager by keeping explicit track of snapshots. · 5da9da71
      Committed by Alvaro Herrera
      There are two ways to track a snapshot: there's the "registered" list, which
      is used for arbitrary long-lived snapshots; and there's the "active stack",
      which is used for the snapshot that is considered "active" at any time.
      This also allows users of snapshots to stop worrying about snapshot memory
      allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
      assignment.  This is all done automatically now.
      
      As a consequence, this allows us to reset MyProc->xmin when there are no
      more snapshots registered in the current backend, reducing the impact that
      long-running transactions have on VACUUM.
      5da9da71
  25. 27 Mar, 2008 2 commits
  26. 12 Mar, 2008 1 commit
    • T
      Make TransactionIdIsInProgress check transam.c's single-item XID status cache · 611b4393
      Committed by Tom Lane
      before it goes groveling through the ProcArray.  In situations where the same
      recently-committed transaction ID is checked repeatedly by tqual.c, this saves
      a lot of shared-memory searches.  And it's cheap enough that it shouldn't
      hurt noticeably when it doesn't help.
      Concept and patch by Simon, some minor tweaking and comment-cleanup by Tom.
      611b4393
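The idea of consulting a one-entry cache before scanning shared state can be sketched as follows. All names here (`demo_cached_finished_xid`, `demo_remember_finished`, `demo_xid_is_in_progress`) are hypothetical, and the linear scan stands in for the ProcArray search; this is an illustration of the pattern, not the transam.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INVALID_XID 0u

/* One-entry cache: the last xid known to have finished. */
static uint32_t demo_cached_finished_xid = DEMO_INVALID_XID;
static int      demo_array_scans = 0;   /* counts the expensive scans */

static void
demo_remember_finished(uint32_t xid)
{
    demo_cached_finished_xid = xid;
}

/* The expensive path: search the array of running transactions. */
static bool
demo_scan_running(const uint32_t *running, int count, uint32_t xid)
{
    demo_array_scans++;
    for (int i = 0; i < count; i++)
    {
        if (running[i] == xid)
            return true;
    }
    return false;
}

/* Check the cheap cache first and scan only on a cache miss. */
static bool
demo_xid_is_in_progress(const uint32_t *running, int count, uint32_t xid)
{
    assert(xid != DEMO_INVALID_XID);

    if (xid == demo_cached_finished_xid)
        return false;           /* already known to be finished */
    return demo_scan_running(running, count, xid);
}
```

A cache miss costs one extra comparison, which fits the message's remark that the check is cheap enough not to hurt when it doesn't help.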
  27. 10 Jan, 2008 1 commit
  28. 02 Jan, 2008 1 commit
  29. 01 Dec, 2007 1 commit
    • T
      Avoid incrementing the CommandCounter when CommandCounterIncrement is called · 895a94de
      Committed by Tom Lane
      but no database changes have been made since the last CommandCounterIncrement.
      This should result in a significant improvement in the number of "commands"
      that can typically be performed within a transaction before hitting the 2^32
      CommandId size limit.  In particular this buys back (and more) the possible
      adverse consequences of my previous patch to fix plan caching behavior.
      
      The implementation requires tracking whether the current CommandCounter
      value has been "used" to mark any tuples.  CommandCounter values stored into
      snapshots are presumed not to be used for this purpose.  This requires some
      small executor changes, since the executor used to conflate the curcid of
      the snapshot it was using with the command ID to mark output tuples with.
      Separating these concepts allows some small simplifications in executor APIs.
      
      Something for the TODO list: look into having CommandCounterIncrement not do
      AcceptInvalidationMessages.  It seems fairly bogus to be doing it there,
      but exactly where to do it instead isn't clear, and I'm disinclined to mess
      with asynchronous behavior during late beta.
      895a94de
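The "used" tracking described in this commit message can be sketched with a small counter model. The names `demo_get_command_id` and `demo_command_counter_increment` are hypothetical, and the sketch leaves out everything else the real CommandCounterIncrement does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t demo_current_cid = 0;   /* current command id */
static bool     demo_cid_used = false;  /* has it been used to mark tuples? */

/*
 * Fetch the current command id. Callers that will stamp tuples with it pass
 * used = true; callers that only store it in a snapshot pass used = false.
 */
static uint32_t
demo_get_command_id(bool used)
{
    if (used)
        demo_cid_used = true;
    return demo_current_cid;
}

/* Advance the counter only if the current value was used to mark tuples. */
static void
demo_command_counter_increment(void)
{
    if (demo_cid_used)
    {
        assert(demo_current_cid < UINT32_MAX);  /* 2^32 CommandId limit */
        demo_current_cid++;
        demo_cid_used = false;
    }
}
```

Read-only commands therefore no longer consume command ids, which is where the headroom against the 2^32 limit comes from.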
  30. 16 Nov, 2007 1 commit
  31. 25 Oct, 2007 1 commit
  32. 24 Sep, 2007 1 commit
  33. 22 Sep, 2007 1 commit
    • T
      Make some simple performance improvements in TransactionIdIsInProgress(). · da072ab2
      Committed by Tom Lane
      For XIDs of our own transaction and subtransactions, it's cheaper to ask
      TransactionIdIsCurrentTransactionId() than to look in shared memory.
      Also, the xids[] work array is always the same size within any given
      process, so malloc it just once instead of doing a palloc/pfree on every
      call; aside from being faster this lets us get rid of some goto's, since
      we no longer have any end-of-function pfree to do.  Both ideas by Heikki.
      da072ab2
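The "malloc it just once" idea can be sketched as a lazily allocated, process-lifetime work array. `demo_get_work_array` is a hypothetical helper, and the sketch assumes, as the message states for xids[], that the required size never changes within a process.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Work array allocated on first use and kept for the life of the process. */
static uint32_t *demo_xids_work = NULL;

/*
 * Return the shared work array, allocating it on the first call only.
 * max_entries is assumed to be the same on every call. Returns NULL only if
 * the initial allocation fails.
 */
static uint32_t *
demo_get_work_array(size_t max_entries)
{
    assert(max_entries > 0);

    if (demo_xids_work == NULL)
        demo_xids_work = malloc(max_entries * sizeof(uint32_t));
    return demo_xids_work;
}
```

Since nothing has to be freed at the end of each call, the function using the array can return from anywhere, which is why the goto-based cleanup becomes unnecessary.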
  34. 09 Sep, 2007 1 commit
    • T
      Replace the former method of determining snapshot xmax --- to wit, calling · 6bd4f401
      Committed by Tom Lane
      ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid"
      variable that is updated during transaction commit or abort.  Since
      latestCompletedXid is written only in places that had to lock ProcArrayLock
      exclusively anyway, and is read only in places that had to lock ProcArrayLock
      shared anyway, it adds no new locking requirements to the system despite being
      cluster-wide.  Moreover, removing ReadNewTransactionId from snapshot
      acquisition eliminates the need to take both XidGenLock and ProcArrayLock at
      the same time.  Since XidGenLock is sometimes held across I/O this can be a
      significant win.  Some preliminary benchmarking suggested that this patch has
      no effect on average throughput but can significantly improve the worst-case
      transaction times seen in pgbench.  Concept by Florian Pflug, implementation
      by Tom Lane.
      6bd4f401
  35. 08 Sep, 2007 1 commit
    • T
      Don't take ProcArrayLock while exiting a transaction that has no XID; there is · 0a51e707
      Committed by Tom Lane
      no need for serialization against snapshot-taking because the xact doesn't
      affect anyone else's snapshot anyway.  Per discussion.  Also, move various
      info about the interlocking of transactions and snapshots out of code comments
      and into a hopefully-more-cohesive discussion in access/transam/README.
      
      Also, remove a couple of now-obsolete comments about having to force some WAL
      to be written to persuade RecordTransactionCommit to do its thing.
      0a51e707
  36. 07 Sep, 2007 1 commit