Commits · 628cbb50ba80c83917b07a7609ddec12cda172d0 · Greenplum / Gpdb

11 Jul, 2012 1 commit

Re-implement extraction of fixed prefixes from regular expressions. · 628cbb50

Tom Lane authored Jul 10, 2012

To generate btree-indexable conditions from regex WHERE conditions (such as
WHERE indexed_col ~ '^foo'), we need to be able to identify any fixed
prefix that a regex might have; that is, find any string that must be a
prefix of all strings satisfying the regex. We used to do that with
entirely ad-hoc code that looked at the source text of the regex. It
didn't know very much about regex syntax, which mostly meant that it would
fail to identify some optimizable cases; but Viktor Rosenfeld reported that
it would produce actively wrong answers for quantified parenthesized
subexpressions, such as '^(foo)?bar'. Rather than trying to extend the
ad-hoc code to cover this, let's get rid of it altogether in favor of
identifying prefixes by examining the compiled form of a regex.

To do this, I've added a new entry point "pg_regprefix" to the regex library;
hopefully it is defined in a sufficiently general fashion that it can remain
in the library when/if that code gets split out as a standalone project.

Since this bug has been there for a very long time, this fix needs to get
back-patched. However it depends on some other recent commits (particularly
the addition of wchar-to-database-encoding conversion), so I'll commit this
separately and then go to work on back-porting the necessary fixes.
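
To make the notion of a "fixed prefix" concrete, here is a minimal Python sketch. It is not the PostgreSQL implementation (which is C code working on the compiled regex); the helper names `prefix_holds` and `prefix_range` are hypothetical, the check only looks at sample strings, and the range computation assumes plain code-point ordering, which a real collation need not follow.

```python
import re
from typing import List, Tuple

def prefix_holds(pattern: str, prefix: str, samples: List[str]) -> bool:
    """Check, on the given samples only, that every match starts with prefix."""
    return all(s.startswith(prefix) for s in samples if re.search(pattern, s))

def prefix_range(prefix: str) -> Tuple[str, str]:
    """Return (lo, hi) with lo <= s < hi for every s that starts with prefix.

    Assumes code-point ordering and a non-empty prefix whose last
    character is not the maximum code point.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

samples = ["foo", "foobar", "bar", "food"]
print(prefix_holds(r"^foo", "foo", samples))        # every match starts with "foo"
print(prefix_holds(r"^(foo)?bar", "foo", samples))  # "bar" matches but lacks the prefix
print(prefix_range("foo"))                          # bounds usable as col >= lo AND col < hi
```

The second call shows why '^(foo)?bar' has no usable fixed prefix: the string "bar" satisfies the regex without starting with "foo".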

10 Jul, 2012 1 commit

Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns. · 00dac600

Tom Lane authored Jul 09, 2012

Previously, pattern_fixed_prefix() was defined to return whatever fixed
prefix it could extract from the pattern, plus the "rest" of the pattern.
That definition was sensible for LIKE patterns, but not so much for
regexes, where reconstituting a valid pattern minus the prefix could be
quite tricky (certainly the existing code wasn't doing that correctly).
Since the only thing that callers ever did with the "rest" of the pattern
was to pass it to like_selectivity() or regex_selectivity(), let's cut out
the middle-man and just have pattern_fixed_prefix's subroutines do this
directly. Then pattern_fixed_prefix can return a simple selectivity
number, and the question of how to cope with partial patterns is removed
from its API specification.

While at it, adjust the API spec so that callers who don't actually care
about the pattern's selectivity (which is a lot of them) can pass NULL for
the selectivity pointer to skip doing the work of computing a selectivity
estimate.

This patch is only an API refactoring that doesn't actually change any
processing, other than allowing a little bit of useless work to be skipped.
However, it's necessary infrastructure for my upcoming fix to regex prefix
extraction, because after that change there won't be any simple way to
identify the "rest" of the regex, not even to the low level of fidelity
needed by regex_selectivity. We can cope with that if regex_fixed_prefix
and regex_selectivity communicate directly, but not if we have to work
within the old API. Hence, back-patch to all active branches.
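
The API shape described above (return the prefix plus a selectivity for the remainder, and skip the estimate when the caller does not want it) can be sketched in Python for the simpler LIKE case. This is an illustration under stated assumptions, not PostgreSQL's code: `like_fixed_prefix` and its `want_selectivity` flag are hypothetical stand-ins for the C function and its nullable pointer argument, the constants are made up, and backslash escapes are ignored.

```python
from typing import Optional, Tuple

# Illustrative constants only; they are not PostgreSQL's values.
LITERAL_CHAR_SEL = 0.2
ANY_CHAR_SEL = 0.9

def like_fixed_prefix(pattern: str,
                      want_selectivity: bool = True) -> Tuple[str, Optional[float]]:
    """Split off the literal prefix of a LIKE pattern (escapes ignored).

    Returns (prefix, rest_selectivity). rest_selectivity is None when the
    caller did not ask for it, so no estimation work is done.
    """
    cut = 0
    while cut < len(pattern) and pattern[cut] not in "%_":
        cut += 1
    prefix = pattern[:cut]
    if not want_selectivity:
        return prefix, None

    # Toy estimate for the remainder: '%' is neutral, '_' and literal
    # characters each narrow the match by a fixed factor.
    sel = 1.0
    for ch in pattern[cut:]:
        if ch == "%":
            continue
        sel *= ANY_CHAR_SEL if ch == "_" else LITERAL_CHAR_SEL
    return prefix, sel

print(like_fixed_prefix("abc%", want_selectivity=False))
print(like_fixed_prefix("ab_d"))
```

The caller never sees the "rest" of the pattern, only a number, which is the point of the refactoring.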

09 Jul, 2012 1 commit

Fix planner to pass correct collation to operator selectivity estimators. · e7ef6d7e

Tom Lane authored Jul 08, 2012

We can do this without creating an API break for estimation functions
by passing the collation using the existing fmgr functionality for
passing an input collation as a hidden parameter.

The need for this was foreseen at the outset, but we didn't get around to
making it happen in 9.1 because of the decision to sort all pg_statistic
histograms according to the database's default collation. That meant that
selectivity estimators generally need to use the default collation too,
even if they're estimating for an operator that will do something
different. The reason it's suddenly become more interesting is that
regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE
property), and we no longer want to use the wrong collation when examining
regexps during planning. It's not that the selectivity estimate is likely
to change much from this; rather that we are thinking of caching compiled
regexps during planner estimation, and we won't get the intended benefit
if we cache them with a different collation than the executor will use.

Back-patch to 9.1, both because the regexp change is likely to get
back-patched and because we might as well get this right in all
collation-supporting branches, in case any third-party code wants to
rely on getting the collation. The patch turns out to be minuscule
now that I've done it ...
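
The caching argument above can be illustrated with a small Python sketch. It assumes a cache keyed by (pattern, collation); the `RegexCache` class is hypothetical, and since Python's `re` module takes no collation argument, the collation only participates in the key here.

```python
import re
from typing import Any, Dict, Tuple

class RegexCache:
    """Cache compiled patterns keyed by (pattern, collation)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, pattern: str, collation: str) -> Any:
        key = (pattern, collation)
        compiled = self._entries.get(key)
        if compiled is None:
            # The collation is modeled by the key alone; re.compile ignores it.
            self.misses += 1
            compiled = self._entries[key] = re.compile(pattern)
        else:
            self.hits += 1
        return compiled

cache = RegexCache()
cache.get("^foo", "default")   # planner using the wrong collation
cache.get("^foo", "de_DE")     # executor: a second compile, no reuse
cache.get("^foo", "de_DE")     # same collation on both sides: reuse
print(cache.hits, cache.misses)
```

If the planner and the executor disagree on the collation, the entry compiled during planning is never reused, which is the lost benefit the message refers to.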

08 Jul, 2012 1 commit

Simplify and document regex library's compact-NFA representation. · c6aae304

Tom Lane authored Jul 07, 2012

The previous coding abused the first element of a cNFA state's arcs list
to hold a per-state flag bit, which was confusing, undocumented, and not
even particularly efficient. Get rid of that in favor of a separate
"stflags" vector. Since there's only one bit in use, I chose to allocate a
char per state; we could possibly replace this with a bitmap at some point,
but that would make accesses a little slower. It's already about 8X
smaller than before, so let's not get overly tense.

Also document the representation better than it was before, which is to say
not at all.

This patch is a byproduct of investigations towards extracting a "fixed
prefix" string from the compact-NFA representation of regex patterns.
Might need to back-patch it if we decide to back-patch that fix, but for
now it's just code cleanup so I'll just put it in HEAD.
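
The layout change described above, per-state flags moved out of the arcs list into their own vector with one byte per state, might look like this in a Python sketch. The class, the method names, and the flag name `FLAG_NOPROGRESS` are hypothetical; the real structure is a C struct in the regex library.

```python
from typing import List, Tuple

FLAG_NOPROGRESS = 0x01  # hypothetical name for the single flag bit in use

class CompactNFA:
    def __init__(self, nstates: int) -> None:
        # One list of (color, target_state) arcs per state; arcs only,
        # with no flag smuggled into the first element.
        self.arcs: List[List[Tuple[int, int]]] = [[] for _ in range(nstates)]
        # Separate per-state flags vector: one byte per state.
        self.stflags = bytearray(nstates)

    def add_arc(self, state: int, color: int, target: int) -> None:
        self.arcs[state].append((color, target))

    def set_flag(self, state: int, flag: int) -> None:
        self.stflags[state] |= flag

    def has_flag(self, state: int, flag: int) -> bool:
        return bool(self.stflags[state] & flag)

nfa = CompactNFA(3)
nfa.add_arc(0, 1, 2)
nfa.set_flag(1, FLAG_NOPROGRESS)
print(nfa.has_flag(1, FLAG_NOPROGRESS), nfa.has_flag(0, FLAG_NOPROGRESS))
```

Keeping a full byte per state trades a little memory for simple indexing; a bitmap would need shifts and masks on every access.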

07 Jul, 2012 4 commits
- Convert libpq regress script to Perl · a184e4db
  Alvaro Herrera authored Jul 06, 2012
```
This should ease its use on the Windows build environment.
```
- Update libpq test expected output · adb9b7d5
  Alvaro Herrera authored Jul 06, 2012
```
Commit 2b443063 changed wording for some of the error messages, but
neglected updating the regress output to match.
```
- Run updated copyright.pl on HEAD and 9.2 trees, updating the psql · 3c9b4064
  Bruce Momjian authored Jul 06, 2012
```
\copyright output to 2012.

Backpatch to 9.2.
```
- Have copyright.pl skip updating something that is just the current year, · d17c0135
  Bruce Momjian authored Jul 06, 2012
```
to avoid producing dups, e.g. 2012-2012

Backpatch to 9.2.
```
06 Jul, 2012 10 commits

Modify copyright.pl so all lines are processed, not just the first · 95203e08
Bruce Momjian authored Jul 06, 2012
```
match, so files that contain embedded copyrights are updated, e.g.
pgsql/help.c.

Backpatch to 9.2.
```
Fix copyright.pl to properly skip the .git directory by adding a · 5198ae89
Bruce Momjian authored Jul 06, 2012
```
basename() qualification.
```
Fix spacing in copyright.pl after being run with missing regex slash · b9eb808b
Bruce Momjian authored Jul 06, 2012
```
(now added).

Backpatch to 9.2.
```
Update pg_upgrade comments for recent configpath fix. · c742d1db
Bruce Momjian authored Jul 06, 2012

Fix failure of new wchar->mb functions to advance from pointer. · f6a05fd9
Robert Haas authored Jul 05, 2012
```
Bug spotted by Tom Lane.
```
Fix PGDATAOLD and PGDATANEW to properly set pgconfig location, per · 2eeb5eb2
Bruce Momjian authored Jul 05, 2012
```
report from Tom.

Backpatch to 9.2.
```
Don't try to trim "../" in join_path_components(). · 85254199

Tom Lane authored Jul 05, 2012

join_path_components() tried to remove leading ".." components from its
tail argument, but it was not nearly bright enough to do so correctly
unless the head argument was (a) absolute and (b) canonicalized.
Rather than try to fix that logic, let's just get rid of it: there is no
correctness reason to remove "..", and cosmetic concerns can be taken
care of by a subsequent canonicalize_path() call. Per bug #6715 from
Greg Davidson.

Back-patch to all supported branches. It appears that pre-9.2, this
function is only used with absolute paths as head arguments, which is why
we'd not noticed the breakage before. However, third-party code might be
expecting this function to work in more general cases, so it seems wise
to back-patch.

In HEAD and 9.2, also make some minor cosmetic improvements to callers.
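
The failure mode described above can be reproduced with a short Python sketch. `naive_join` is a hypothetical over-eager join written for illustration, not a transcription of the old C code, and `posixpath.normpath` stands in for a later canonicalization step; its rules are not claimed to be identical to those of canonicalize_path().

```python
import posixpath

def naive_join(head: str, tail: str) -> str:
    """Hypothetical over-eager join: for each leading "../" in tail,
    drop the last component of head. Only safe when head is absolute
    and already canonical."""
    while tail.startswith("../"):
        head = head.rpartition("/")[0]
        tail = tail[3:]
    return f"{head}/{tail}" if head else tail

def plain_join(head: str, tail: str) -> str:
    """Join without interpreting "..", then canonicalize the result."""
    return posixpath.normpath(posixpath.join(head, tail))

# Absolute, canonical head: both approaches agree.
print(naive_join("/usr/local", "../bin"), plain_join("/usr/local", "../bin"))

# Relative head: trimming silently discards a "..".
print(naive_join("..", "../foo"), plain_join("..", "../foo"))
```

With a relative head of "..", the naive version yields "foo" while the joined-then-canonicalized path is "../../foo", which points somewhere else entirely.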

Revert part of the previous patch that avoided using PLy_elog(). · de479e2e

Heikki Linnakangas authored Jul 05, 2012

That caused the plpython_unicode regression test to fail on SQL_ASCII
encoding, as evidenced by the buildfarm. The reason is that with the patch,
you don't get the detail in the error message that you got before. That
detail is actually very informative, so rather than just adjust the expected
output, let's revert that part of the patch for now to make the buildfarm
green again, and figure out some other way to avoid the recursion of
PLy_elog() that doesn't lose the detail.

Fix mapping of PostgreSQL encodings to Python encodings. · b66de4c6

Heikki Linnakangas authored Jul 05, 2012

Windows encodings, "win1252" and so forth, are named differently in Python,
like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails
for some reason, use a plain ereport(), not a PLy_elog(), to report that
error. That avoids recursion and crash, if PLy_elog() tries to call
PLyUnicode_Bytes() again.

This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that
plpython didn't even try these conversions.

Jan Urbański, with minor comment improvements by me.
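
The naming mismatch mentioned above ("win1252" on the PostgreSQL side, "cp1252" on the Python side) can be sketched as follows. The functions `python_codec_name` and `encode_for_server` are hypothetical, and the simple "win" to "cp" rule is an assumption for illustration; the actual patch may map names differently, and other PostgreSQL encoding names (SQL_ASCII, for example) have no direct Python codec at all.

```python
import codecs

def python_codec_name(pg_encoding: str) -> str:
    """Translate a PostgreSQL encoding name to a Python codec name."""
    name = pg_encoding.lower()
    # "win1252" -> "cp1252", "win866" -> "cp866", and so on.
    if name.startswith("win") and name[3:].isdigit():
        return "cp" + name[3:]
    return name

def encode_for_server(text: str, pg_encoding: str) -> bytes:
    codec = python_codec_name(pg_encoding)
    codecs.lookup(codec)  # raises LookupError if Python has no such codec
    return text.encode(codec)

print(python_codec_name("WIN1252"))
print(encode_for_server("é", "WIN1252"))
```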

Remove support for using wait3() in place of waitpid(). · fc548b22

Tom Lane authored Jul 05, 2012

All Unix-oid platforms that we currently support should have waitpid(),
since it's in V2 of the Single Unix Spec. Our git history shows that
the wait3 code was added to support NextStep, which we officially dropped
support for as of 9.2. So get rid of the configure test, and simplify the
macro spaghetti in reaper(). Per suggestion from Fujii Masao.

05 Jul, 2012 12 commits
- pg_upgrade: abstract out copying of files from old cluster to new · 666d494d
  Alvaro Herrera authored Jul 05, 2012
```
Currently only pg_clog is copied, but some other directories could need
the same treatment as well, so create a subroutine to do it.

Extracted from my (somewhat larger) FOR KEY SHARE patch.
```
- Fix function argument tab completion for schema-qualified or quoted function names · 3644a639
  Magnus Hagander authored Jul 05, 2012
```
Dean Rasheed, reviewed by Josh Kupershmidt
```
- Fix missing regex slash that caused perltidy to get confused on · 539d3875
  Bruce Momjian authored Jul 04, 2012
```
copyright.pl.

Backpatch to 9.2.
```
- Run newly-configured perltidy script on Perl files. · 042d9ffc
  Bruce Momjian authored Jul 04, 2012
```
Run on HEAD and 9.2.
```
- Reduce messages about implicit indexes and sequences to DEBUG1. · d7c73484
  Robert Haas authored Jul 04, 2012
```
Per recent discussion on pgsql-hackers, these messages are too
chatty for most users.
```
- Have pg_dump in binary-upgrade mode properly drop user-created · 3e00d332
  Bruce Momjian authored Jul 04, 2012
```
extensions that might exist in the new empty cluster databases, like
plpgsql.

Backpatch to 9.2.
```
- Fix sample INSTR function to return 0 if third arg is 0. · 0fc32c00
  Robert Haas authored Jul 04, 2012
```
Albe Laurenz, per a report by Greg Smith that our sample function
doesn't quite match Oracle's behavior.
```
- Add wchar -> mb conversion routines. · 72dd6291
  Robert Haas authored Jul 04, 2012
```
This is infrastructure for Alexander Korotkov's work on indexing regular
expression searches.

Alexander Korotkov, with a bit of further hackery on the MULE conversion
by me
```
- More doc cleanups for recent shared memory changes. · 248b5fce
  Robert Haas authored Jul 04, 2012
```
Josh Kupershmidt
```
- Documentation cleanups for recent shared memory changes. · 390bfc64
  Robert Haas authored Jul 04, 2012
- Increase the maximum initdb-configured value for shared_buffers to 128MB. · f3584282
  Robert Haas authored Jul 04, 2012
```
The old value of 32MB has been around for a very long time, and in the
meantime typical system memories have become vastly larger.  Also, now
that we no longer depend on being able to fit the entirety of our
shared memory segment into the system's limit on System V shared
memory, there's a much better chance of the higher limit actually
proving productive.

Per recent discussion on pgsql-hackers.
```
- Make oid2name, pgbench, and vacuumlo set fallback_application_name. · 17676c78
  Robert Haas authored Jul 04, 2012
```
Amit Kapila, reviewed by Shigeru Hanada and Peter Eisentraut,
with some modifications by me.
```
04 Jul, 2012 10 commits

Remove duplicate, unnecessary, variable declaration · 10e0dd8f
Magnus Hagander authored Jul 04, 2012

Set the write location in the pg_receivexlog status messages · dbc6fcf3

Magnus Hagander authored Jul 04, 2012

This makes it possible for the master to track how much data has
actually been written by pg_receivexlog - and not just how much
has been sent towards it.

Always treat a standby returning an invalid flush location as async · 0c4b4686

Magnus Hagander authored Jul 04, 2012

This ensures that a standby such as pg_receivexlog will not be selected
as sync standby - which would cause the master to block waiting for
a location that could never happen.

Fujii Masao
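
The selection rule described above can be sketched in Python. Everything here is hypothetical and simplified: the `Standby` class, the use of `None` for an invalid flush location, and the convention that a lower positive priority is preferred are assumptions made for the example, not the server's actual data structures.

```python
from typing import List, Optional

class Standby:
    def __init__(self, name: str, priority: int, flush_lsn: Optional[int]) -> None:
        self.name = name
        self.priority = priority    # lower is preferred; 0 means never synchronous
        self.flush_lsn = flush_lsn  # None models an invalid flush location

def pick_sync_standby(standbys: List[Standby]) -> Optional[Standby]:
    """Pick the synchronous standby, skipping any that cannot report a flush."""
    candidates = [s for s in standbys
                  if s.priority > 0 and s.flush_lsn is not None]
    return min(candidates, key=lambda s: s.priority, default=None)

standbys = [Standby("pg_receivexlog", 1, None), Standby("replica", 2, 100)]
chosen = pick_sync_standby(standbys)
print(chosen.name if chosen else None)
```

Without the `flush_lsn is not None` filter, the first entry would win on priority and the master would wait for a flush confirmation that never arrives.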

Remove reference to default wal_buffers being 8 · 817d870c

Magnus Hagander authored Jul 04, 2012

This hasn't been true since 9.1, when the default was changed to -1.
Remove the reference completely, keeping the discussion of the parameter
and its shared memory effects on the config page.

Remove references to pgfoundry as recommended hosting platform · 51fc4068
Magnus Hagander authored Jul 04, 2012
```
pgfoundry is deprecated and no longer accepting new projects,
so we really shouldn't be directing people there.
```

Remove references to PostgreSQL bundled on Solaris · d80785e6

Magnus Hagander authored Jul 04, 2012

Also remove special references to downloads off pgfoundry since they are
not correct - downloads are done through the main website.

Improve documentation about MULE encoding. · 09022de1

Tom Lane authored Jul 04, 2012

This commit improves the comments in pg_wchar.h and creates #define symbols
for some formerly hard-coded values.  No substantive code changes.

Tatsuo Ishii and Tom Lane

Forgot an #include in the previous patch :-( · 47a2adc8
Alvaro Herrera authored Jul 03, 2012

Have REASSIGN OWNED work on extensions, too · 0c7b9dc7

Alvaro Herrera authored Jul 03, 2012

Per bug #6593, REASSIGN OWNED fails when the affected role has created
an extension.  Even though the user related to the extension is not
nominally the owner, its OID appears on pg_shdepend and thus causes
problems when the user is to be dropped.

This commit adds code to change the "ownership" of the extension itself,
not of the contained objects.  This is fine because it's currently only
called from REASSIGN OWNED, which would also modify the ownership of the
contained objects.  However, this is not sufficient for a working ALTER
OWNER implementation for extensions.

Back-patch to 9.1, where extensions were introduced.

Bug #6593 reported by Emiliano Leporati.

Have copyright tool mention that certain files should be updated in back branches. · b33385b8
Bruce Momjian authored Jul 03, 2012