1. 01 Sep, 2016 1 commit
    • A
      Check for zero persistentTID in xlog record · 9b2eb416
      Ashwin Agrawal committed
      Refactor code to use a common routine to fetch PT info for xlogging. A check can
      be easily added at this common place to validate that persistent info is
      available. Also add a check during recovery for a zero persistentTID. With
      postgres upstream merges, it is possible that the function to populate persistent info is
      not called at all, so this check will not hit during xlog record construction,
      but it at least gives a clear clue during recovery.
      9b2eb416
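A minimal sketch of the kind of zero-TID check this commit describes, assuming a simplified TID layout. `DemoTid`, `demo_tid_is_zero`, and `demo_check_record` are hypothetical names for illustration, not Greenplum's actual structs or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a persistent TID (block + offset). */
typedef struct DemoTid {
    uint32_t block;
    uint16_t offset;
} DemoTid;

/* A TID whose fields are all zero was never populated. */
static bool demo_tid_is_zero(const DemoTid *tid)
{
    return tid->block == 0 && tid->offset == 0;
}

/*
 * Common validation point: returns true if the record may be processed,
 * false if the persistent TID was never filled in.
 */
static bool demo_check_record(const DemoTid *tid)
{
    return !demo_tid_is_zero(tid);
}
```

Putting the test in one shared routine means both record construction and recovery can call it, which is the design point the commit message makes.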
  2. 16 Aug, 2016 1 commit
  3. 02 Aug, 2016 1 commit
    • H
      Store pg_appendonly tuple in relcache. · a4fbb150
      Heikki Linnakangas committed
      This way, you don't need to always fetch it from the system catalogs,
      which makes things simpler, and is marginally faster too.
      
      To make all the fields in pg_appendonly accessible by direct access to
      the Form_pg_appendonly struct, change 'compresstype' field from text to
      name. "No compression" is now represented by an empty string, rather than
      NULL. I hope there are no applications out there that will get confused
      by this.
      
      The GetAppendOnlyEntry() function used to take a Snapshot as argument,
      but that seems unnecessary. The data in pg_appendonly doesn't change for
      a table after it's been created. Except when it's ALTERed, or rewritten
      by TRUNCATE or CLUSTER, but those operations invalidate the relcache, and
      we're never interested in the old version.
      
      There's not much need for the AppendOnlyEntry struct and the
      GetAppendOnlyEntry() function anymore; you can just as easily just access
      the Form_pg_appendonly struct directly. I'll remove that as a separate
      commit, though, to keep this one more readable.
      a4fbb150
  4. 16 Jun, 2016 1 commit
    • H
      Revert relcache invalidation at EOX to work like before the merge. · 3b8a4fe2
      Heikki Linnakangas committed
      The other FIXME comment that the removed comment refers to was reverted
      earlier, before we pushed the 8.3 merge. Should've uncommented this back
      then, but I missed the need for that because I've been building with
      --enable-cassert. This fixes the regression failure on gpdtm_plpgsql test
      case, when built without assertions.
      3b8a4fe2
  5. 28 May, 2016 1 commit
    • H
      Remove Debug_check_for_invalid_persistent_tid option. · b3ec6f00
      Heikki Linnakangas committed
      The checks for invalid TIDs are very cheap, a few CPU instructions. It's
      better to catch bugs involving invalid TID early, so let's always check
      for them.
      
      The LOGs in storeAttrDefault() that were also tied to this GUC seemed
      oddly specific. They were probably added a long time ago to hunt for some
      particular bug, and don't seem generally useful, so I just removed them.
      b3ec6f00
  6. 06 May, 2016 1 commit
  7. 19 Mar, 2016 2 commits
    • A
      Validate pg_class pg_type tuple after index fetch. · 0ff952e7
      Ashwin Agrawal committed
      While fetching the pg_class or pg_type tuple using an index, perform a sanity
      check to make sure the tuple read is the tuple the index was meant to
      point to. This is just a sanity check to make sure the index is not broken
      and returning an incorrect tuple, so that any damage is contained.
      0ff952e7
    • A
      Validate gp_relation_node tuple after index fetch. · c8a21e0d
      Ashwin Agrawal committed
      This commit makes sure that while accessing gp_relation_node through its
      index, a sanity check is always performed to verify that the tuple being
      operated on is the intended tuple. If for any reason the index is broken
      and provides a bad tuple, fail the operation instead of causing damage.
      For some scenarios, like deleting a gp_relation_node tuple, it adds an extra tuple
      deform call which was not done earlier, but that doesn't seem too heavy to
      be performed in a DDL operation.
      c8a21e0d
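The two validation commits above share one idea: after an index lookup, compare the key stored in the fetched tuple with the key that was requested. A minimal sketch, assuming a toy array-backed heap and index; `DemoTuple`, `DemoIndexEntry`, and `demo_fetch_validated` are hypothetical and not the actual catalog access code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical catalog row keyed by an object ID. */
typedef struct DemoTuple {
    uint32_t oid;
    const char *name;
} DemoTuple;

/* Hypothetical index entry: maps a key to a slot in the heap array. */
typedef struct DemoIndexEntry {
    uint32_t key;
    size_t heap_slot;
} DemoIndexEntry;

/*
 * Look up `key` through the index, then verify that the tuple the index
 * points at really carries that key. Returns NULL if the key is absent
 * or the index is inconsistent with the heap.
 */
static const DemoTuple *
demo_fetch_validated(const DemoIndexEntry *index, size_t nindex,
                     const DemoTuple *heap, size_t nheap, uint32_t key)
{
    for (size_t i = 0; i < nindex; i++) {
        if (index[i].key != key)
            continue;
        if (index[i].heap_slot >= nheap)
            return NULL;                /* index points outside the heap */
        const DemoTuple *tup = &heap[index[i].heap_slot];
        if (tup->oid != key)
            return NULL;                /* broken index: wrong tuple */
        return tup;
    }
    return NULL;                        /* key not indexed */
}
```

The extra comparison costs one field read per fetch, which matches the commit's remark that the added work is small for DDL paths.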
  8. 12 Feb, 2016 1 commit
    • H
      Remove unnecessary #includes. · 9aa7a22f
      Heikki Linnakangas committed
      In cdbcat.h, include only the header files that are actually needed for
      the single function prototype in that file. And don't include cdbcat.h
      unnecessarily. A couple of .c files were including cdbcat.h to get
      GpPolicy, but that's actually defined in catalog/gp_policy.h, so #include
      that directly instead where needed.
      9aa7a22f
  9. 12 Jan, 2016 1 commit
    • H
      Make functions in gp_toolkit to execute on all nodes as intended. · 246f7510
      Heikki Linnakangas committed
      Moving the installation of gp_toolkit.sql into initdb, in commit f8910c3c,
      broke all the functions that are supposed to execute on all nodes, like
      gp_toolkit.__gp_localid. After that change, gp_toolkit.sql was executed
      in utility mode, and the gp_distribution_policy entries for those functions
      were not created as a result.
      
      To fix, change the code so that gp_distribution_policy entries are never
      created, or consulted, for EXECUTE-type external tables. They have
      more fine-grained information in pg_exttable.location field anyway, so rely
      on that instead. With this change, there is no difference in whether an
      EXECUTE-type external table is created in utility mode or not. We would
      still have similar problems if gp_toolkit contained other kinds of
      external tables, but it doesn't.
      
      This removes the isMasterOnly() function and changes all its callers to
      call GpPolicyFetch() directly instead. Some places used GpPolicyFetch()
      directly to check if a table is distributed, so this just makes that the
      canonical way to do it. The checks for system schemas that used to be in
      isMasterOnly() are no longer performed, but they should've been unnecessary in
      the first place. System tables don't have gp_distribution_policy entries,
      so they'll be treated as master-only even without that check.
      246f7510
  10. 19 Nov, 2015 1 commit
    • H
      Remove gp_upgrade_mode and related machinery. · d9b60cd2
      Heikki Linnakangas committed
      The current plan is to use something like pg_upgrade for future in-place
      upgrades. The gpupgrade mechanism will not scale to the kind of drastic
      catalog and other data directory layout changes that are coming as we
      merge with later PostgreSQL releases.
      
      Kept gpupgrademirror for now. Need to check if there's some logic that's
      worth saving, for a possible pg_upgrade based solution later.
      d9b60cd2
  11. 28 Oct, 2015 1 commit
  12. 30 Nov, 2012 1 commit
    • T
      Fix assorted bugs in CREATE INDEX CONCURRENTLY. · 5c8c7c7c
      Tom Lane committed
      This patch changes CREATE INDEX CONCURRENTLY so that the pg_index
      flag changes it makes without exclusive lock on the index are made via
      heap_inplace_update() rather than a normal transactional update.  The
      latter is not very safe because moving the pg_index tuple could result in
      concurrent SnapshotNow scans finding it twice or not at all, thus possibly
      resulting in index corruption.
      
      In addition, fix various places in the code that ought to check to make
      sure that the indexes they are manipulating are valid and/or ready as
      appropriate.  These represent bugs that have existed since 8.2, since
      a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
      index behind, and we ought not try to do anything that might fail with
      such an index.
      
      Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
      columns that are allowed to change after initial creation.  Previously we
      could have been left with stale values of some fields in an index relcache
      entry.  It's not clear whether this actually had any user-visible
      consequences, but it's at least a bug waiting to happen.
      
      This is a subset of a patch already applied in 9.2 and HEAD.  Back-patch
      into all earlier supported branches.
      
      Tom Lane and Andres Freund
      5c8c7c7c
  13. 17 Aug, 2011 1 commit
    • T
      Fix race condition in relcache init file invalidation. · 8407a11c
      Tom Lane committed
      The previous code tried to synchronize by unlinking the init file twice,
      but that doesn't actually work: it leaves a window wherein a third process
      could read the already-stale init file but miss the SI messages that would
      tell it the data is stale.  The result would be bizarre failures in catalog
      accesses, typically "could not read block 0 in file ..." later during
      startup.
      
      Instead, hold RelCacheInitLock across both the unlink and the sending of
      the SI messages.  This is more straightforward, and might even be a bit
      faster since only one unlink call is needed.
      
      This has been wrong since it was put in (in 2002!), so back-patch to all
      supported releases.
      8407a11c
  14. 23 Mar, 2011 1 commit
    • T
      Avoid potential deadlock in InitCatCachePhase2(). · cf735470
      Tom Lane committed
      Opening a catcache's index could require reading from that cache's own
      catalog, which of course would acquire AccessShareLock on the catalog.
      So the original coding here risks locking index before heap, which could
      deadlock against another backend trying to get exclusive locks in the
      normal order.  Because InitCatCachePhase2 is only called when a backend
      has to start up without a relcache init file, the deadlock was seldom seen
      in the field.  (And by the same token, there's no need to worry about any
      performance disadvantage; so not much point in trying to distinguish
      exactly which catalogs have the risk.)
      
      Bug report, diagnosis, and patch by Nikhil Sontakke.  Additional commentary
      by me.  Back-patch to all supported branches.
      cf735470
  15. 02 Sep, 2010 1 commit
    • T
      Fix up flushing of composite-type typcache entries to be driven directly by · a15e220a
      Tom Lane committed
      SI invalidation events, rather than indirectly through the relcache.
      
      In the previous coding, we had to flush a composite-type typcache entry
      whenever we discarded the corresponding relcache entry.  This caused problems
      at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report
      from Jeff Davis, and might result in real-world problems given the kind of
      unexpected relcache flush that that test mechanism is intended to model.
      
      The new coding decouples relcache and typcache management, which is a good
      thing anyway from a structural perspective.  The cost is that we have to
      search the typcache linearly to find entries that need to be flushed.  There
      are a couple of ways we could avoid that, but at the moment it's not clear
      it's worth any extra trouble, because the typcache contains very few entries
      in typical operation.
      
      Back-patch to 8.2, the same as some other recent fixes in this general area.
      The patch could be carried back to 8.0 with some additional work, but given
      that it's only hypothetical whether we're fixing any problem observable in
      the field, it doesn't seem worth the work now.
      a15e220a
  16. 15 Apr, 2010 1 commit
    • T
      Fix a problem introduced by my patch of 2010-01-12 that revised the way · 32616fb1
      Tom Lane committed
      relcache reload works.  In the patched code, a relcache entry in process of
      being rebuilt doesn't get unhooked from the relcache hash table; which means
      that if a cache flush occurs due to sinval queue overrun while we're
      rebuilding it, the entry could get blown away by RelationCacheInvalidate,
      resulting in crash or misbehavior.  Fix by ensuring that an entry being
      rebuilt has positive refcount, so it won't be seen as a target for removal
      if a cache flush occurs.  (This will mean that the entry gets rebuilt twice
      in such a scenario, but that's okay.)  It appears that the problem can only
      arise within a transaction that has previously reassigned the relfilenode of
      a pre-existing table, via TRUNCATE or a similar operation.  Per bug #5412
      from Rusty Conover.
      
      Back-patch to 8.2, same as the patch that introduced the problem.
      I think that the failure can't actually occur in 8.2, since it lacks the
      rd_newRelfilenodeSubid optimization, but let's make it work like the later
      branches anyway.
      
      Patch by Heikki, slightly editorialized on by me.
      32616fb1
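The fix in this commit relies on a pinned entry (positive refcount) surviving a cache flush. A minimal sketch of that idea, assuming a toy array-based cache; `DemoEntry`, `demo_flush_all`, and `demo_rebuild` are hypothetical names, and the flush is simulated by a flag rather than by real sinval traffic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified cache entry. */
typedef struct DemoEntry {
    int refcount;   /* > 0 means someone is using the entry */
    bool in_table;  /* still present in the cache's hash table */
    bool valid;     /* contents are up to date */
} DemoEntry;

/* Full cache flush: drop unreferenced entries, only invalidate pinned ones. */
static void demo_flush_all(DemoEntry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].refcount > 0)
            entries[i].valid = false;    /* keep it, but mark for rebuild */
        else
            entries[i].in_table = false; /* safe to remove */
    }
}

/*
 * Rebuild one entry. The pin taken here keeps a flush that arrives
 * mid-rebuild (simulated by flush_during_rebuild) from removing the entry.
 * Returns the number of rebuild passes that were needed.
 */
static int demo_rebuild(DemoEntry *entries, size_t n, size_t target,
                        bool flush_during_rebuild)
{
    DemoEntry *e = &entries[target];
    int passes = 0;

    e->refcount++;                       /* pin */
    do {
        e->valid = true;
        passes++;
        if (flush_during_rebuild) {
            demo_flush_all(entries, n);  /* invalidates e, does not drop it */
            flush_during_rebuild = false;
        }
    } while (!e->valid);                 /* rebuilt twice if flushed */
    e->refcount--;                       /* unpin */
    return passes;
}
```

As the commit message notes, the price of the pin is that an entry flushed mid-rebuild gets rebuilt a second time, which the sketch reports through its return value.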
  17. 14 Jan, 2010 1 commit
    • T
      When loading critical system indexes into the relcache, ensure we lock the · 8a6a40de
      Tom Lane committed
      underlying catalog not only the index itself.  Otherwise, if the cache
      load process touches the catalog (which will happen for many though not
      all of these indexes), we are locking index before parent table, which can
      result in a deadlock against processes that are trying to lock them in the
      normal order.  Per today's failure on buildfarm member gothic_moth; it's
      surprising the problem hadn't been identified before.
      
      Back-patch to 8.2.  Earlier releases didn't have the issue because they
      didn't try to lock these indexes during load (instead assuming that they
      couldn't change schema at all during multiuser operation).
      8a6a40de
  18. 13 Jan, 2010 1 commit
    • T
      Fix relcache reload mechanism to be more robust in the face of errors · d4b7cf06
      Tom Lane committed
      occurring during a reload, such as query-cancel.  Instead of zeroing out
      an existing relcache entry and rebuilding it in place, build a new relcache
      entry, then swap its contents with the old one, then free the new entry.
      This avoids problems with code believing that a previously obtained pointer
      to a cache entry must still reference a valid entry, as seen in recent
      failures on buildfarm member jaguar.  (jaguar is using CLOBBER_CACHE_ALWAYS
      which raises the probability of failure substantially, but the problem
      could occur in the field without that.)  The previous design was okay
      when it was made, but subtransactions and the ResourceOwner mechanism
      make it unsafe now.
      
      Also, make more use of the already existing rd_isvalid flag, so that we
      remember that the entry requires rebuilding even if the first attempt fails.
      
      Back-patch as far as 8.2.  Prior versions have enough issues around relcache
      reload anyway (due to inadequate locking) that fixing this one doesn't seem
      worthwhile.
      d4b7cf06
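The build-then-swap reload described in this commit can be sketched as follows, assuming a toy entry with a single heap-allocated field. `DemoRel`, `demo_build`, `demo_free`, and `demo_reload` are hypothetical; the real relcache entry has many more fields and its own memory contexts.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified cache entry with one heap-allocated field. */
typedef struct DemoRel {
    int version;
    char *name;
} DemoRel;

static DemoRel *demo_build(int version, const char *name)
{
    DemoRel *rel = malloc(sizeof(*rel));
    if (rel == NULL)
        return NULL;
    rel->name = malloc(strlen(name) + 1);
    if (rel->name == NULL) {
        free(rel);
        return NULL;
    }
    strcpy(rel->name, name);
    rel->version = version;
    return rel;
}

static void demo_free(DemoRel *rel)
{
    if (rel != NULL) {
        free(rel->name);
        free(rel);
    }
}

/*
 * Reload `old` in place without ever leaving it half-built: build a
 * complete replacement first, swap the struct contents, then free the
 * temporary (which now holds the stale contents). Returns 0 on success,
 * -1 if the build failed, in which case `old` is untouched.
 */
static int demo_reload(DemoRel *old, int version, const char *name)
{
    DemoRel *fresh = demo_build(version, name);
    if (fresh == NULL)
        return -1;

    DemoRel tmp = *old;
    *old = *fresh;
    *fresh = tmp;

    demo_free(fresh);
    return 0;
}
```

Because the address of `old` never changes and its contents are complete at every point, callers holding a pointer obtained before the reload still see a usable entry, even if the build step fails partway.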
  19. 27 Sep, 2009 1 commit
    • T
      Fix RelationCacheInitializePhase2 (Phase3, in HEAD) to cope with the · 8b720b57
      Tom Lane committed
      possibility of shared-inval messages causing a relcache flush while it tries
      to fill in missing data in preloaded relcache entries.  There are actually
      two distinct failure modes here:
      
      1. The flush could delete the next-to-be-processed cache entry, causing
      the subsequent hash_seq_search calls to go off into the weeds.  This is
      the problem reported by Michael Brown, and I believe it also accounts
      for bug #5074.  The simplest fix is to restart the hashtable scan after
      we've read any new data from the catalogs.  It appears that pre-8.4
      branches have not suffered from this failure, because by chance there were
      no other catalogs sharing the same hash chains with the catalogs that
      RelationCacheInitializePhase2 had work to do for.  However that's obviously
      pretty fragile, and it seems possible that derivative versions with
      additional system catalogs might be vulnerable, so I'm back-patching this
      part of the fix anyway.
      
      2. The flush could delete the *current* cache entry, in which case the
      pointer to the newly-loaded data would end up being stored into an
      already-deleted Relation struct.  As long as it was still deleted, the only
      consequence would be some leaked space in CacheMemoryContext.  But it seems
      possible that the Relation struct could already have been recycled, in
      which case this represents a hard-to-reproduce clobber of cached data
      structures, with unforeseeable consequences.  The fix here is to pin the
      entry while we work on it.
      
      In passing, also change RelationCacheInitializePhase2 to Assert that
      formrdesc() set up the relation's cached TupleDesc (rd_att) with the
      correct type OID and hasoids values.  This is more appropriate than
      silently updating the values, because the original tupdesc might already
      have been copied into the catcache.  However this part of the patch is
      not in HEAD because it fails due to some questionable recent changes in
      formrdesc :-(.  That will be cleaned up in a subsequent patch.
      8b720b57
  20. 30 Dec, 2008 1 commit
  21. 11 Aug, 2008 1 commit
    • T
      Fix corner-case bug introduced with HOT: if REINDEX TABLE pg_class (or a · e9ec4bbf
      Tom Lane committed
      REINDEX DATABASE including same) is done before a session has done any other
      update on pg_class, the pg_class relcache entry was left with an incorrect
      setting of rd_indexattr, because the indexed-attributes set would be first
      demanded at a time when we'd forced a partial list of indexes into the
      pg_class entry, and it would remain cached after that.  This could result
      in incorrect decisions about HOT-update safety later in the same session.
      In practice, since only pg_class_relname_nsp_index would be missed out,
      only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a problem.
      Per report and test case from Ondrej Jirman.
      e9ec4bbf
  22. 17 Apr, 2008 1 commit
    • T
      Fix LOAD_CRIT_INDEX() macro to take out AccessShareLock on the system index · 95b7a876
      Tom Lane committed
      it is trying to build a relcache entry for.  This is an oversight in my 8.2
      patch that tried to ensure we always took a lock on a relation before trying
      to build its relcache entry.  The implication is that if someone committed a
      reindex of a critical system index at about the same time that some other
      backend were starting up without a valid pg_internal.init file, the second one
      might PANIC due to not seeing any valid version of the index's pg_class row.
      Improbable case, but definitely not impossible.
      95b7a876
  23. 01 Apr, 2008 1 commit
    • T
      Fix an oversight I made in a cleanup patch over a year ago: · e3a47483
      Tom Lane committed
      eval_const_expressions needs to be passed the PlannerInfo ("root") structure,
      because in some cases we want it to substitute values for Param nodes.
      (So "constant" is not so constant as all that ...)  This mistake partially
      disabled optimization of unnamed extended-Query statements in 8.3: in
      particular the LIKE-to-indexscan optimization would never be applied if the
      LIKE pattern was passed as a parameter, and constraint exclusion depending
      on a parameter value didn't work either.
      e3a47483
  24. 28 Feb, 2008 1 commit
  25. 02 Jan, 2008 1 commit
  26. 29 Nov, 2007 1 commit
    • T
      Improve test coverage of CLOBBER_CACHE_ALWAYS by having it also force · 03ffc4d6
      Tom Lane committed
      reloading of operator class information on each use of LookupOpclassInfo.
      Had this been in place a year ago, it would have helped me find a bug
      in the then-new 'operator family' code.  Now that we have a build farm
      member testing CLOBBER_CACHE_ALWAYS on a regular basis, it seems worth
      expending a little bit of effort here.
      03ffc4d6
  27. 16 Nov, 2007 1 commit
  28. 21 Sep, 2007 1 commit
    • T
      HOT updates. When we update a tuple without changing any of its indexed · 282d2a03
      Tom Lane committed
      columns, and the new version can be stored on the same heap page, we no longer
      generate extra index entries for the new version.  Instead, index searches
      follow the HOT-chain links to ensure they find the correct tuple version.
      
      In addition, this patch introduces the ability to "prune" dead tuples on a
      per-page basis, without having to do a complete VACUUM pass to recover space.
      VACUUM is still needed to clean up dead index entries, however.
      
      Pavan Deolasee, with help from a bunch of other people.
      282d2a03
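To illustrate the HOT-chain idea from this commit, here is a minimal sketch in which the index stores only a chain's root slot and a search walks same-page links to the current version. Everything here (`DemoVersion`, `demo_hot_search`, `demo_hot_update`, the single `visible` flag) is a hypothetical simplification; real visibility depends on snapshots and transaction state, which this toy model ignores.

```c
#include <stdbool.h>

#define DEMO_NO_NEXT (-1)

/* Hypothetical, simplified tuple version stored on one heap page. */
typedef struct DemoVersion {
    int value;
    bool visible;   /* visible to the current snapshot (toy model) */
    int next;       /* slot of the newer version on this page, or DEMO_NO_NEXT */
} DemoVersion;

/*
 * The index stores only the chain's root slot. Walk the chain from there
 * and return the slot of the first visible version, or DEMO_NO_NEXT.
 */
static int demo_hot_search(const DemoVersion *page, int nslots, int root)
{
    int slot = root;
    int steps = 0;

    while (slot >= 0 && slot < nslots && steps++ < nslots) {
        if (page[slot].visible)
            return slot;
        slot = page[slot].next;
    }
    return DEMO_NO_NEXT;
}

/*
 * HOT update: append a new version on the same page and link it from the
 * old one. No index entry is added. Returns the new slot, or DEMO_NO_NEXT
 * if the page is full (a real system would fall back to a normal update).
 */
static int demo_hot_update(DemoVersion *page, int *nslots, int capacity,
                           int old_slot, int new_value)
{
    if (*nslots >= capacity)
        return DEMO_NO_NEXT;
    int new_slot = (*nslots)++;
    page[new_slot] = (DemoVersion){ new_value, true, DEMO_NO_NEXT };
    page[old_slot].visible = false;
    page[old_slot].next = new_slot;
    return new_slot;
}
```

After two updates of the same row, a search starting from the unchanged root slot still reaches the newest version, which is why no extra index entries are needed.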
  29. 26 Jul, 2007 1 commit
    • T
      Arrange to put TOAST tables belonging to temporary tables into special schemas · 82eed4db
      Tom Lane committed
      named pg_toast_temp_nnn, alongside the pg_temp_nnn schemas used for the temp
      tables themselves.  This allows low-level code such as the relcache to
      recognize that these tables are indeed temporary, which enables various
      optimizations such as not WAL-logging changes and using local rather than
      shared buffers for access.  Aside from obvious performance benefits, this
      provides a solution to bug #3483, in which other backends unexpectedly held
      open file references to temporary tables.  The scheme preserves the property
      that TOAST tables are not in any schema that's normally in the search path,
      so they don't conflict with user table names.
      
      initdb forced because of changes in system view definitions.
      82eed4db
  30. 27 May, 2007 1 commit
    • T
      Fix up pgstats counting of live and dead tuples to recognize that committed · 77947c51
      Tom Lane committed
      and aborted transactions have different effects; also teach it not to assume
      that prepared transactions are always committed.
      
      Along the way, simplify the pgstats API by tying counting directly to
      Relations; I cannot detect any redeeming social value in having stats
      pointers in HeapScanDesc and IndexScanDesc structures.  And fix a few
      corner cases in which counts might be missed because the relation's
      pgstat_info pointer hadn't been set.
      77947c51
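The distinction this commit draws between committed and aborted transactions can be sketched with a toy accounting model. `DemoXactCounts`, `DemoTableStats`, and `demo_apply_xact` are hypothetical, and the model assumes every deleted tuple existed before the transaction started; it is not the actual pgstats bookkeeping.

```c
/* Hypothetical per-table counters accumulated inside one transaction. */
typedef struct DemoXactCounts {
    long inserted;
    long deleted;   /* an UPDATE counts as one delete plus one insert */
} DemoXactCounts;

/* Hypothetical long-lived table statistics. */
typedef struct DemoTableStats {
    long live;
    long dead;
} DemoTableStats;

/*
 * Fold a finished transaction into the table statistics.
 * Commit: inserted tuples become live, deleted tuples become dead.
 * Abort:  inserted tuples are dead on arrival, deleted tuples stay live.
 */
static void demo_apply_xact(DemoTableStats *stats,
                            const DemoXactCounts *xact, int committed)
{
    if (committed) {
        stats->live += xact->inserted - xact->deleted;
        stats->dead += xact->deleted;
    } else {
        stats->dead += xact->inserted;
    }
}
```

Treating both outcomes the same would overcount live tuples after every rollback, which is the kind of drift the commit sets out to remove.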
  31. 03 May, 2007 1 commit
    • T
      Fix things so that when CREATE INDEX CONCURRENTLY sets pg_index.indisvalid · 8ec94385
      Tom Lane committed
      true at the very end of its processing, the update is broadcast via a
      shared-cache-inval message for the index; without this, existing backends that
      already have relcache entries for the index might never see it become valid.
      Also, force a relcache inval on the index's parent table at the same time,
      so that any cached plans for that table are re-planned; this ensures that
      the newly valid index will be used if appropriate.  Aside from making
      C.I.C. behave more reasonably, this is necessary infrastructure for some
      aspects of the HOT patch.  Pavan Deolasee, with a little further stuff from
      me.
      8ec94385
  32. 29 Mar, 2007 1 commit
  33. 20 Mar, 2007 1 commit
    • J
      Change pg_trigger and extend pg_rewrite in order to allow triggers and · 0fe16500
      Jan Wieck committed
      rules to be defined with different, per session controllable, behaviors
      for replication purposes.
      
      This will allow replication systems like Slony-I and, as has been stated
      on pgsql-hackers, other products to control the firing mechanism of
      triggers and rewrite rules without modifying the system catalog directly.
      
      The firing mechanisms are controlled by a new superuser-only GUC
      variable, session_replication_role, together with a change to
      pg_trigger.tgenabled and a new column pg_rewrite.ev_enabled. Both
      columns are a single char data type now (tgenabled was a bool before).
      The possible values in these attributes are:
      
           'O' - Trigger/Rule fires when session_replication_role is "origin"
                 (default) or "local". This is the default behavior.
      
           'D' - Trigger/Rule is disabled and never fires
      
           'A' - Trigger/Rule fires always regardless of the setting of
                 session_replication_role
      
           'R' - Trigger/Rule fires when session_replication_role is "replica"
      
      The GUC variable can only be changed as long as the system does not have
      any cached query plans. This will prevent changing the session role and
      accidentally executing stored procedures or functions that have plans
      cached that expand to the wrong query set due to differences in the rule
      firing semantics.
      
      The SQL syntax for changing a trigger's or rule's firing semantics is
      
           ALTER TABLE <tabname> <when> TRIGGER|RULE <name>;
      
           <when> ::= ENABLE | ENABLE ALWAYS | ENABLE REPLICA | DISABLE
      
      psql's \d command as well as pg_dump are extended in a backward
      compatible fashion.
      
      Jan
      0fe16500
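The firing rules listed in this commit message reduce to a small decision table. A minimal sketch, assuming the four flag characters and three role names given in the message; `DemoRole` and `demo_should_fire` are hypothetical names, and treating an unknown flag as disabled is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical enum mirroring the three session_replication_role settings. */
typedef enum DemoRole {
    DEMO_ROLE_ORIGIN,
    DEMO_ROLE_REPLICA,
    DEMO_ROLE_LOCAL
} DemoRole;

/*
 * Decide whether a trigger or rule fires, given its enabled flag
 * ('O', 'D', 'A', or 'R') and the session's replication role.
 */
static bool demo_should_fire(char enabled, DemoRole role)
{
    switch (enabled) {
    case 'O':
        return role == DEMO_ROLE_ORIGIN || role == DEMO_ROLE_LOCAL;
    case 'D':
        return false;
    case 'A':
        return true;
    case 'R':
        return role == DEMO_ROLE_REPLICA;
    default:
        return false;   /* unknown flag: treat as disabled (an assumption) */
    }
}
```

For example, a trigger marked 'O' is skipped in a session whose role is "replica", while one marked 'A' fires in every role.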
  34. 04 Mar, 2007 1 commit
  35. 28 Feb, 2007 1 commit
    • T
      Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len). · 234a02b2
      Tom Lane committed
      Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
      VARSIZE and VARDATA, and as a consequence almost no code was using the
      longer names.  Rename the length fields of struct varlena and various
      derived structures to catch anyplace that was accessing them directly;
      and clean up various places so caught.  In itself this patch doesn't
      change any behavior at all, but it is necessary infrastructure if we hope
      to play any games with the representation of varlena headers.
      Greg Stark and Tom Lane
      234a02b2
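The point of this commit is that callers set and read the length through macros rather than touching the header field. A minimal sketch of that pattern with a fixed-size toy struct; the `DEMO_` macros, `DemoVarlena`, and `demo_store` are hypothetical and deliberately simplified, and do not reproduce PostgreSQL's actual varlena definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variable-length value with a fixed-capacity payload. */
typedef struct DemoVarlena {
    uint32_t vl_len_;   /* awkward name discourages direct access */
    char vl_dat[16];
} DemoVarlena;

#define DEMO_VARHDRSZ ((uint32_t) sizeof(uint32_t))
#define DEMO_SET_VARSIZE(ptr, len) ((ptr)->vl_len_ = (uint32_t) (len))
#define DEMO_VARSIZE(ptr) ((ptr)->vl_len_)
#define DEMO_VARDATA(ptr) ((ptr)->vl_dat)

/*
 * Store `text` (without its terminator) and record the total size,
 * header included, through the accessor macro. Returns 0 on success,
 * -1 if the text does not fit.
 */
static int demo_store(DemoVarlena *v, const char *text)
{
    size_t n = strlen(text);
    if (n > sizeof(v->vl_dat))
        return -1;
    memcpy(DEMO_VARDATA(v), text, n);
    DEMO_SET_VARSIZE(v, DEMO_VARHDRSZ + n);
    return 0;
}
```

With all accesses funnelled through the macros, the header encoding could later change in one place without editing every caller, which is the infrastructure argument the commit makes.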
  36. 25 Jan, 2007 1 commit
  37. 09 Jan, 2007 1 commit
    • T
      Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LAST · 44317582
      Tom Lane committed
      per-column options for btree indexes.  The planner's support for this is still
      pretty rudimentary; it does not yet know how to plan mergejoins with
      nondefault ordering options.  The documentation is pretty rudimentary, too.
      I'll work on improving that stuff later.
      
      Note incompatible change from prior behavior: ORDER BY ... USING will now be
      rejected if the operator is not a less-than or greater-than member of some
      btree opclass.  This prevents less-than-sane behavior if an operator that
      doesn't actually define a proper sort ordering is selected.
      44317582
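The per-column ordering options in this commit treat sort direction and NULL placement as independent choices. A minimal sketch of a comparator with that property, using the standard `qsort`; `DemoDatum`, `demo_compare`, and `demo_sort` are hypothetical, and the file-level option variables exist only because `qsort` takes no context argument.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical nullable integer column value. */
typedef struct DemoDatum {
    bool is_null;
    int value;
} DemoDatum;

/* Sort options for this sketch, file-level because qsort has no context. */
static bool demo_descending = false;
static bool demo_nulls_first = false;

static int demo_compare(const void *a, const void *b)
{
    const DemoDatum *x = a;
    const DemoDatum *y = b;

    /* NULL placement is decided independently of ASC/DESC. */
    if (x->is_null || y->is_null) {
        if (x->is_null && y->is_null)
            return 0;
        if (x->is_null)
            return demo_nulls_first ? -1 : 1;
        return demo_nulls_first ? 1 : -1;
    }

    int cmp = (x->value > y->value) - (x->value < y->value);
    return demo_descending ? -cmp : cmp;
}

static void demo_sort(DemoDatum *vals, size_t n, bool descending, bool nulls_first)
{
    demo_descending = descending;
    demo_nulls_first = nulls_first;
    qsort(vals, n, sizeof(vals[0]), demo_compare);
}
```

Calling `demo_sort(vals, n, false, true)` corresponds to ASC NULLS FIRST, and `demo_sort(vals, n, true, false)` to DESC NULLS LAST.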
  38. 06 Jan, 2007 1 commit
  39. 01 Jan, 2007 1 commit
    • T
      Found the problem with my operator-family changes: by fetching from · 0b56be83
      Tom Lane committed
      pg_opclass during LookupOpclassInfo(), I'd turned pg_opclass_oid_index
      into a critical system index.  However the problem could only manifest
      during a backend's first attempt to load opclass data, and then only
      if it had successfully loaded pg_internal.init and subsequently received
      a relcache flush; which made it impossible to reproduce in sequential
      tests and darn hard even in parallel tests.  Memo to self: when
      exercising cache flush scenarios, must disable LookupOpclassInfo's
      internal cache too.
      0b56be83