1. 25 Feb, 2012 5 commits
    • Avoid repeated creation/freeing of per-subre DFAs during regex search. · 58735947
      Committed by Tom Lane
      In nested sub-regex trees, lower-level nodes created DFAs and then
      destroyed them again before exiting, which is a bit dumb considering that
      the recursive search is likely to call those nodes again later.  Instead
      cache each created DFA until the end of pg_regexec().  This is basically a
      space for time tradeoff, in that it might increase the maximum memory
      usage.  However, in most regex patterns there are not all that many subre
      nodes, so not that many DFAs --- and in any case, the peak usage occurs
      when reaching the bottom recursion level, and except for alternation cases
      that's going to be the same anyway.
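The caching idea in this commit can be sketched roughly as follows. This is an illustrative Python model, not the engine's C code: `SubreNode`, `SearchContext`, `get_dfa`, and `release_all` are hypothetical names, and the "DFA" is just a placeholder object.

```python
class SubreNode:
    """Hypothetical stand-in for one node of a sub-regex tree."""

    def __init__(self, node_id):
        self.node_id = node_id


class SearchContext:
    """Per-search state; cached DFAs live until the search ends."""

    def __init__(self):
        self.dfa_cache = {}   # node id -> DFA built for that node
        self.dfas_built = 0   # counts how often we paid the build cost

    def get_dfa(self, node):
        # Build the DFA only on first use, then reuse it on later calls.
        dfa = self.dfa_cache.get(node.node_id)
        if dfa is None:
            dfa = self._build_dfa(node)
            self.dfa_cache[node.node_id] = dfa
        return dfa

    def _build_dfa(self, node):
        # Placeholder for the expensive construction step.
        self.dfas_built += 1
        return {"for_node": node.node_id}

    def release_all(self):
        # Called once at the end of the search instead of per node exit.
        self.dfa_cache.clear()
```

The trade-off matches the one described in the message: memory is held for the whole search, in exchange for not rebuilding the same structure each time a node is revisited.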
    • Remove useless "retry memory" logic within regex engine. · 3cbfe485
      Committed by Tom Lane
      Apparently some primordial version of Spencer's engine needed cdissect()
      and child functions to be able to continue matching from a previous
      position when re-called.  That is dead code, though, since trivial
      inspection shows that cdissect can never be entered without having
      previously done zapmem which resets the relevant retry counter.  I have
      also verified experimentally that no case in the Tcl regression tests
      reaches cdissect with a nonzero retry value.  Accordingly, remove that
      logic.  This doesn't really save any noticeable number of cycles in itself,
      but it is one step towards making dissect() and cdissect() equivalent,
      which will allow removing hundreds of lines of near-duplicated code.
      
      Since struct subre's "retry" field is no longer particularly related to
      any kind of retry, rename it to "id".  As of this commit it's only used
      for identifying a subre node in debug printouts, so you might think we
      should get rid of the field entirely; but I have a plan for another use.
    • Mention original ctags option name. · 1fbacbf9
      Committed by Bruce Momjian
    • Update src/tools/make_ctags to avoid Exuberant tags option · 7c19f9d1
      Committed by Bruce Momjian
      that has been renamed and undocumented since 2003;  instead, use the
      documented option.  Add comments.
    • P
      3aa42c25
  2. 24 Feb, 2012 8 commits
    • Add some enumeration commas, for consistency · 9cfd800a
      Committed by Peter Eisentraut
    • Fix the general case of quantified regex back-references. · 173e29aa
      Committed by Tom Lane
      Cases where a back-reference is part of a larger subexpression that
      is quantified have never worked in Spencer's regex engine, because
      he used a compile-time transformation that neglected the need to
      check the back-reference match in iterations before the last one.
      (That was okay for capturing parens, and we still do it if the
      regex has *only* capturing parens ... but it's not okay for backrefs.)
      
      To make this work properly, we have to add an "iteration" node type
      to the regex engine's vocabulary of sub-regex nodes.  Since this is a
      moderately large change with a fair risk of introducing new bugs of its
      own, apply to HEAD only, even though it's a fix for a longstanding bug.
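To illustrate what "checking the back-reference match in iterations before the last one" means, here is a small sketch using Python's `re` module rather than the PostgreSQL engine. The pattern and the helper name `matches` are chosen for illustration only; the point is that every iteration of the quantified group must satisfy its own back-reference.

```python
import re

# A back-reference inside a larger quantified subexpression:
# each iteration captures one of "a"/"b" and must then repeat it.
pattern = re.compile(r"(?:([ab])\1)+")


def matches(text):
    """Return True if the whole string is a run of doubled letters."""
    return pattern.fullmatch(text) is not None
```

With this pattern, `matches("aabb")` is true, while `matches("abaa")` is false because the first iteration ("ab") violates the back-reference even though the last one ("aa") satisfies it.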
    • Correctly handle NULLs in JSON output. · 0c9e5d5e
      Committed by Andrew Dunstan
      Error reported by David Wheeler.
    • Last-minute release note updates. · b2ce6070
      Committed by Tom Lane
      Security: CVE-2012-0866, CVE-2012-0867, CVE-2012-0868
    • Convert newlines to spaces in names written in pg_dump comments. · 89e0bac8
      Committed by Tom Lane
      pg_dump was incautious about sanitizing object names that are emitted
      within SQL comments in its output script.  A name containing a newline
      would at least render the script syntactically incorrect.  Maliciously
      crafted object names could present a SQL injection risk when the script
      is reloaded.
      
      Reported by Heikki Linnakangas, patch by Robert Haas
      
      Security: CVE-2012-0868
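The risk described here can be shown with a minimal sketch. This is not pg_dump's code: `sanitize_name_for_comment` and `comment_line` are hypothetical helpers, the `-- Name:` prefix is only an example layout, and treating carriage returns like newlines is an assumption on top of what the message states.

```python
def sanitize_name_for_comment(name):
    """Replace line breaks so the name cannot escape a one-line SQL comment."""
    # Assumption: carriage returns are neutralized along with newlines.
    return name.replace("\r", " ").replace("\n", " ")


def comment_line(name):
    """Build a single-line SQL comment that mentions an object name."""
    return "-- Name: " + sanitize_name_for_comment(name)
```

Without the replacement, a name such as `"t\nDROP TABLE users;"` would end the comment at the newline and leave the rest of the name as a live statement in the script.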
    • Remove arbitrary limitation on length of common name in SSL certificates. · 077711c2
      Committed by Tom Lane
      Both libpq and the backend would truncate a common name extracted from a
      certificate at 32 bytes.  Replace that fixed-size buffer with dynamically
      allocated string so that there is no hard limit.  While at it, remove the
      code for extracting peer_dn, which we weren't using for anything; and
      don't bother to store peer_cn longer than we need it in libpq.
      
      This limit was not so terribly unreasonable when the code was written,
      because we weren't using the result for anything critical, just logging it.
      But now that there are options for checking the common name against the
      server host name (in libpq) or using it as the user's name (in the server),
      this could result in undesirable failures.  In the worst case it even seems
      possible to spoof a server name or user name, if the correct name is
      exactly 32 bytes and the attacker can persuade a trusted CA to issue a
      certificate in which that string is a prefix of the certificate's common
      name.  (To exploit this for a server name, he'd also have to send the
      connection astray via phony DNS data or some such.)  The case that this is
      a realistic security threat is a bit thin, but nonetheless we'll treat it
      as one.
      
      Back-patch to 8.4.  Older releases contain the faulty code, but it's not
      a security problem because the common name wasn't used for anything
      interesting.
      
      Reported and patched by Heikki Linnakangas
      
      Security: CVE-2012-0867
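The prefix-spoofing scenario can be modeled in a few lines. This sketch is only an illustration of the reasoning in the message: `legacy_cn_matches` and `full_cn_matches` are hypothetical functions, and the host names are made up.

```python
LEGACY_LIMIT = 32  # the fixed truncation length mentioned in the message


def legacy_cn_matches(cert_cn, expected):
    """Model of the old behavior: compare only the first 32 bytes of the CN."""
    truncated = cert_cn.encode("utf-8")[:LEGACY_LIMIT]
    return truncated == expected.encode("utf-8")


def full_cn_matches(cert_cn, expected):
    """Model of the fixed behavior: compare the complete common name."""
    return cert_cn.encode("utf-8") == expected.encode("utf-8")
```

If the legitimate name is exactly 32 bytes long, a certificate whose common name merely starts with that string passes the truncated comparison but fails the full one.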
    • Require execute permission on the trigger function for CREATE TRIGGER. · 891e6e7b
      Committed by Tom Lane
      This check was overlooked when we added function execute permissions to the
      system years ago.  For an ordinary trigger function it's not a big deal,
      since trigger functions execute with the permissions of the table owner,
      so they couldn't do anything the user issuing the CREATE TRIGGER couldn't
      have done anyway.  However, if a trigger function is SECURITY DEFINER,
      that is not the case.  The lack of checking would allow another user to
      install it on his own table and then invoke it with, essentially, forged
      input data; which the trigger function is unlikely to realize, so it might
      do something undesirable, for instance insert false entries in an audit log
      table.
      
      Reported by Dinesh Kumar, patch by Robert Haas
      
      Security: CVE-2012-0866
    • Allow MinGW builds to use standardly-named OpenSSL libraries. · 74e29162
      Committed by Tom Lane
      In the Fedora variant of MinGW, the openssl libraries have their normal
      names, not libeay32 and libssleay32.  Adjust configure probes to allow
      that, per bug #6486.
      
      Tomasz Ostrowski
  3. 23 Feb, 2012 9 commits
  4. 22 Feb, 2012 4 commits
    • Don't clear btpo_cycleid during _bt_vacuum_one_page. · 593a9631
      Committed by Tom Lane
      When "vacuuming" a single btree page by removing LP_DEAD tuples, we are not
      actually within a vacuum operation, but rather in an ordinary insertion
      process that could well be running concurrently with a vacuum.  So clearing
      the cycleid is incorrect, and could cause the concurrent vacuum to miss
      removing tuples that it needs to remove.  This is a longstanding bug
      introduced by commit e6284649 of
      2006-07-25.  I believe it explains Maxim Boguk's recent report of index
      corruption, and probably some other previously unexplained reports.
      
      In 9.0 and up this is a one-line fix; before that we need to introduce a
      flag to tell _bt_delitems what to do.
    • Cosmetic cleanup for commit a760893d. · 9789c99d
      Committed by Tom Lane
      Mostly, fixing overlooked comments.
    • Avoid double close of file handle in syslogger on win32 · c2a2f751
      Committed by Magnus Hagander
      This causes an exception when running under a debugger or in particular
      when running on a debug version of Windows.
      
      Patch from MauMau
    • Fix typo, noticed by Will Crawford. · 6b044cb8
      Committed by Andrew Dunstan
  5. 21 Feb, 2012 3 commits
  6. 20 Feb, 2012 4 commits
    • Fix regex back-references that are directly quantified with *. · 5223f96d
      Committed by Tom Lane
      The syntax "\n*", that is a backref with a * quantifier directly applied
      to it, has never worked correctly in Spencer's library.  This has been an
      open bug in the Tcl bug tracker since 2005:
      https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894
      
      The core of the problem is in parseqatom(), which first changes "\n*" to
      "\n+|" and then applies repeat() to the NFA representing the backref atom.
      repeat() thinks that any arc leading into its "rp" argument is part of the
      sub-NFA to be repeated.  Unfortunately, since parseqatom() already created
      the arc that was intended to represent the empty bypass around "\n+", this
      arc gets moved too, so that it now leads into the state loop created by
      repeat().  Thus, what was supposed to be an "empty" bypass gets turned into
      something that represents zero or more repetitions of the NFA representing
      the backref atom.  In the original example, in place of
      	^([bc])\1*$
      we now have something that acts like
      	^([bc])(\1+|[bc]*)$
      At runtime, the branch involving the actual backref fails, as it's supposed
      to, but then the other branch succeeds anyway.
      
      We could no doubt fix this by some rearrangement of the operations in
      parseqatom(), but that code is plenty ugly already, and what's more the
      whole business of converting "x*" to "x+|" probably needs to go away to fix
      another problem I'll mention in a moment.  Instead, this patch suppresses
      the *-conversion when the target is a simple backref atom, leaving the case
      of m == 0 to be handled at runtime.  This makes the patch in regcomp.c a
      one-liner, at the cost of having to tweak cbrdissect() a little.  In the
      event I went a bit further than that and rewrote cbrdissect() to check all
      the string-length-related conditions before it starts comparing characters.
      It seems a bit stupid to possibly iterate through many copies of an
      n-character backreference, only to fail at the end because the target
      string's length isn't a multiple of n --- we could have found that out
      before starting.  The existing coding could only be a win if integer
      division is hugely expensive compared to character comparison, but I don't
      know of any modern machine where that might be true.
      
      This does not fix all the problems with quantified back-references.  In
      particular, the code is still broken for back-references that appear within
      a larger expression that is quantified (so that direct insertion of the
      quantification limits into the BACKREF node doesn't apply).  I think fixing
      that will take some major surgery on the NFA code, specifically introducing
      an explicit iteration node type instead of trying to transform iteration
      into concatenation of modified regexps.
      
      Back-patch to all supported branches.  In HEAD, also add a regression test
      case for this.  (It may seem a bit silly to create a regression test file
      for just one test case; but I'm expecting that we will soon import a whole
      bunch of regex regression tests from Tcl, so might as well create the
      infrastructure now.)
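Two points from this message lend themselves to a short sketch. The first part uses Python's `re` module (not Spencer's engine) to contrast the intended pattern with the pattern the message says the buggy transformation behaved like. The second part models the "check lengths before comparing characters" idea; `repeated_backref_matches` and its parameters are hypothetical and are not the signature of `cbrdissect()`.

```python
import re

# The intended pattern and the pattern the bug effectively produced.
intended = re.compile(r"^([bc])\1*$")
buggy_equivalent = re.compile(r"^([bc])(\1+|[bc]*)$")


def repeated_backref_matches(target, ref, min_reps, max_reps=None):
    """Check whether target is ref repeated between min_reps and max_reps times.

    All length-related conditions are tested first, so no characters are
    compared when the lengths alone already rule out a match.
    """
    n = len(ref)
    if n == 0:
        # An empty back-reference can only match an empty target.
        return len(target) == 0
    if len(target) % n != 0:
        return False          # length is not a multiple of the backref length
    reps = len(target) // n
    if reps < min_reps:
        return False
    if max_reps is not None and reps > max_reps:
        return False
    # Only now compare the actual characters, one repetition at a time.
    return all(target[i:i + n] == ref for i in range(0, len(target), n))
```

For the input `bc`, the intended pattern does not match, whereas the buggy-equivalent pattern does, because its second branch accepts any run of `b` or `c`.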
    • Add caching of ctype.h/wctype.h results in regc_locale.c. · e00f68e4
      Committed by Tom Lane
      While this doesn't save a huge amount of runtime, it still seems worth
      doing, especially since I realized that the data copying I did in my first
      draft was quite unnecessary.  In this version, once we have the results
      cached, getting them back for re-use is really very cheap.
      
      Also, remove the hard-wired limitation to not consider wctype.h results for
      character codes above 255.  It turns out that we can't push the limit as
      far up as I'd originally hoped, because the regex colormap code is not
      efficient enough to cope very well with character classes containing many
      thousand letters, which a Unicode locale is entirely capable of producing.
      Still, we can push it up to U+7FF (which I chose as the limit of 2-byte
      UTF8 characters), which will at least make Eastern Europeans happy pending
      a better solution.  Thus, this commit resolves the specific complaint in
      bug #6457, but not the more general issue that letters of non-western
      alphabets are mostly not recognized as matching [[:alpha:]].
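A rough model of the caching and the U+7FF cutoff, written in Python for illustration. The cache layout and the name `cached_class_members` are assumptions, and Python's `str.isalpha` stands in for the locale's `wctype.h` functions; only the U+7FF limit comes from the message.

```python
MAX_CACHED_CODEPOINT = 0x7FF  # limit named in the message (2-byte UTF-8)

_class_cache = {}  # class name -> frozenset of matching code points


def cached_class_members(name, predicate):
    """Return the code points up to the limit for which predicate is true.

    The scan over all code points runs once per class name; later calls
    return the stored set without copying it.
    """
    members = _class_cache.get(name)
    if members is None:
        members = frozenset(
            cp for cp in range(MAX_CACHED_CODEPOINT + 1) if predicate(chr(cp))
        )
        _class_cache[name] = members
    return members
```

U+7FF is the last code point that UTF-8 encodes in two bytes; U+800 already needs three. Letters beyond the limit, such as CJK ideographs, are left out of the cached class in this model.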
    • Create the beginnings of internals documentation for the regex code. · 27af9143
      Committed by Tom Lane
      Create src/backend/regex/README to hold an implementation overview of
      the regex package, and fill it in with some preliminary notes about
      the code's DFA/NFA processing and colormap management.  Much more to
      do there of course.
      
      Also, improve some code comments around the colormap and cvec code.
      No functional changes except to add one missing assert.
    • Improve pretty printing of viewdefs. · 2f582f76
      Committed by Andrew Dunstan
      Some line feeds are added to target lists and from lists to make
      them more readable. By default they wrap at 80 columns if possible,
      but the wrap column is also selectable - if 0 it wraps after every
      item.
      
      Andrew Dunstan, reviewed by Hitoshi Harada.
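The wrapping rule described above might look like the following sketch. It is a simplified model, not the server's implementation: `wrap_items` is a hypothetical helper, and the trailing comma on a wrapped line is not counted against the column limit.

```python
def wrap_items(items, wrap_column=80, separator=", "):
    """Join list items into lines, breaking when a line would pass wrap_column.

    A wrap_column of 0 puts every item on its own line. An item longer than
    the column still gets a line to itself, since it cannot be split.
    """
    lines = []
    current = ""
    for item in items:
        if not current:
            current = item
        elif wrap_column > 0 and len(current) + len(separator) + len(item) <= wrap_column:
            current += separator + item
        else:
            # Close the current line, keeping the comma but not the space.
            lines.append(current + separator.rstrip())
            current = item
    if current:
        lines.append(current)
    return lines
```

For example, `wrap_items(["a", "b", "c"], 0)` yields one item per line, while the default column of 80 keeps short lists on a single line.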
  7. 19 Feb, 2012 3 commits
  8. 18 Feb, 2012 3 commits
  9. 17 Feb, 2012 1 commit
    • Fix longstanding error in contrib/intarray's int[] & int[] operator. · 06d9afa6
      Committed by Tom Lane
      The array intersection code would give wrong results if the first entry of
      the correct output array would be "1".  (I think only this value could be
      at risk, since the previous word would always be a lower-bound entry with
      that fixed value.)
      
      Problem spotted by Julien Rouhaud, initial patch by Guillaume Lelarge,
      cosmetic improvements by me.
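As a reading of the message, the faulty duplicate check appears to have looked at the word just before the output array when the output was still empty. The sketch below shows a sorted-array intersection in Python with that case handled explicitly; `intersect_sorted` is a hypothetical name and this is not the contrib/intarray C code.

```python
def intersect_sorted(a, b):
    """Intersect two sorted integer lists, emitting each common value once."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            # Compare against the previous output element only if one exists;
            # the first match is always emitted, whatever its value.
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
    return result
```

Checking `not result` before looking at `result[-1]` is what keeps a leading value of 1 from being mistaken for a duplicate.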