1. 08 May, 2012 2 commits
    • fix ugly bugs in TRE regex parser · d7a90b35
      Committed by Rich Felker
      1. * in BRE is not special at the beginning of the regex or a
      subexpression. this broke ncurses' build scripts.
      
      2. \\( in BRE is a literal \ followed by a literal (, not a literal \
      followed by a subexpression opener.
      
      3. the ^ in \\(^ in BRE is a literal ^; ^ is an anchor only at the
      beginning of the entire BRE. POSIX allows treating it as an anchor at
      the beginning of a subexpression, but TRE's code for checking if it
      was at the beginning of a subexpression was wrong, and fixing it for
      the sake of supporting a non-portable usage was too much trouble when
      just removing this non-portable behavior was much easier.
      
      this patch also moved lots of the ugly logic for empty atom checking
      out of the default/literal case and into new cases for the relevant
      characters. this should make parsing faster and make the code smaller.
      if nothing else it's a lot more readable/logical.
      
      at some point i'd like to revisit and overhaul lots of this code...
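
The first two points of the commit message above can be checked against any POSIX `regcomp`. The following is a minimal sketch, not code from the patch: `bre_full_match` is a hypothetical helper introduced here for illustration, and the patterns are the ones the message describes. Point 3 is left out because POSIX allows either behavior there.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of musl or TRE): compile `pat` as a BRE
 * (cflags == 0) and report whether the match found covers all of `s`.
 * Returns 1 for a full match, 0 otherwise, -1 if the pattern does not
 * compile. */
static int bre_full_match(const char *pat, const char *s)
{
    regex_t re;
    regmatch_t m[1];

    if (regcomp(&re, pat, 0) != 0)
        return -1;

    int rc = regexec(&re, s, 1, m, 0);
    int full = (rc == 0 && m[0].rm_so == 0 &&
                (size_t)m[0].rm_eo == strlen(s));

    regfree(&re);
    return full;
}

/* Cases from the commit message, written as C string literals:
 *
 *   bre_full_match("*a", "*a")        point 1: leading * is a literal *
 *   bre_full_match("\\(*a\\)", "*a")  point 1: same at the start of \( ... \)
 *   bre_full_match("\\\\(", "\\(")    point 2: BRE \\( is a literal \ then (
 */
```

On a conforming implementation each of the three calls listed in the comment should return 1, while `bre_full_match("*a", "aa")` should return 0 because the leading `*` has to match a literal asterisk.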
  2. 14 April, 2012 1 commit
    • remove invalid code from TRE · 386b34a0
      Committed by Rich Felker
      TRE wants to treat + and ? after a +, ?, or * as special; ? means
      ungreedy and + is reserved for future use. however, this is
      non-conformant. although redundant, these redundant characters have
      well-defined (no-op) meaning for POSIX ERE, and are actually _literal_
      characters (which TRE is wrongly ignoring) in POSIX BRE mode.
      
      the simplest fix is to simply remove the unneeded nonstandard
      functionality. as a plus, this shaves off a small amount of bloat.
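
The difference described above can be observed through the standard `regcomp`/`regexec` interface. This is a rough sketch under the reading given in the commit message, not code from the patch; `full_match` is a hypothetical helper introduced here for illustration. An ungreedy reading of `a+?` would stop after the first `a` and fail the full-match check.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of musl or TRE): compile `pat` with the
 * given cflags (0 for BRE, REG_EXTENDED for ERE) and report whether the
 * match found covers all of `s`. Returns 1 for a full match, 0 otherwise,
 * -1 if the pattern does not compile. */
static int full_match(const char *pat, int cflags, const char *s)
{
    regex_t re;
    regmatch_t m[1];

    if (regcomp(&re, pat, cflags) != 0)
        return -1;

    int rc = regexec(&re, s, 1, m, 0);
    int full = (rc == 0 && m[0].rm_so == 0 &&
                (size_t)m[0].rm_eo == strlen(s));

    regfree(&re);
    return full;
}

/* Cases following the commit message:
 *
 *   full_match("a*?", 0, "aa?")             BRE: ? after * is a literal ?
 *   full_match("a*+", 0, "a+")              BRE: + after * is a literal +
 *   full_match("a+?", REG_EXTENDED, "aaa")  ERE: the extra ? is redundant,
 *                                           so the match stays greedy
 */
```

With the nonstandard handling removed, each call listed in the comment is expected to return 1, and `full_match("a*?", 0, "aa")` is expected to return 0 because the literal `?` is missing from the input.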
  3. 21 March, 2012 1 commit
    • upgrade to latest upstream TRE regex code (0.8.0) · ad47d45e
      Committed by Rich Felker
      the main practical results of this change are
      1. the regex code is no longer subject to LGPL; it's now 2-clause BSD
      2. most (all?) popular nonstandard regex extensions are supported
      
      I hesitate to call this a "sync" since both the old and new code are
      heavily modified. in one sense, the old code was "more severely"
      modified, in that it was actively hostile to non-strictly-conforming
      expressions. on the other hand, the new code has eliminated the
      useless translation of the entire regex string to wchar_t prior to
      compiling, and now only converts multibyte character literals as
      needed.
      
      in the future i may use this modified TRE as a basis for writing the
      long-planned new regex engine that will avoid multibyte-to-wide
      character conversion entirely by compiling multibyte bracket
      expressions specific to UTF-8.
  4. 17 June, 2011 1 commit
  5. 12 February, 2011 1 commit