- 10 Jul, 2012 5 commits
-
Tom Lane authored
Also some cosmetic improvements for wchar-to-mblen patch.
-
Alvaro Herrera authored
When in SQL_ASCII encoding, strings passed around are not necessarily UTF8-safe. We had already fixed this in some places, but it looks like we missed some. I had to backpatch Peter Eisentraut's a8b92b60 to 9.1 in order for this patch to cherry-pick more cleanly. Patch from Alex Hunsaker, tweaked by Kyotaro HORIGUCHI and myself. Some desultory cleanup and comment addition by me, during patch review. Per bug report from Christoph Berg in 20120209102116.GA14429@msgid.df7cb.de
-
Alvaro Herrera authored
-
Tom Lane authored
To generate btree-indexable conditions from regex WHERE conditions (such as WHERE indexed_col ~ '^foo'), we need to be able to identify any fixed prefix that a regex might have; that is, find any string that must be a prefix of all strings satisfying the regex. We used to do that with entirely ad-hoc code that looked at the source text of the regex. It didn't know very much about regex syntax, which mostly meant that it would fail to identify some optimizable cases; but Viktor Rosenfeld reported that it would produce actively wrong answers for quantified parenthesized subexpressions, such as '^(foo)?bar'. Rather than trying to extend the ad-hoc code to cover this, let's get rid of it altogether in favor of identifying prefixes by examining the compiled form of a regex. To do this, I've added a new entry point "pg_regprefix" to the regex library; hopefully it is defined in a sufficiently general fashion that it can remain in the library when/if that code gets split out as a standalone project. Since this bug has been there for a very long time, this fix needs to get back-patched. However it depends on some other recent commits (particularly the addition of wchar-to-database-encoding conversion), so I'll commit this separately and then go to work on back-porting the necessary fixes.
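To illustrate why scanning the regex source text is fragile, here is a minimal Python sketch. It is not the PostgreSQL code: `naive_prefix` is a hypothetical scanner invented for this example, and Python's `re` module stands in for the regex library. It checks a candidate prefix against the definition above, namely that every string satisfying the regex must start with it.

```python
import re


def naive_prefix(pattern: str) -> str:
    """Hypothetical source-text scanner (not PostgreSQL's old code):
    copy literal characters after '^', skipping parentheses, and stop
    at the first other metacharacter."""
    if not pattern.startswith("^"):
        return ""
    out = []
    for ch in pattern[1:]:
        if ch in "()":
            continue  # the flaw: group boundaries are ignored
        if ch in r".*+?[]{}|\$":
            break
        out.append(ch)
    return "".join(out)


def is_valid_prefix(pattern: str, prefix: str, samples) -> bool:
    """A fixed prefix is valid only if every matching sample starts with it."""
    rx = re.compile(pattern)
    return all(s.startswith(prefix) for s in samples if rx.search(s))


samples = ["foobar", "bar", "foo", "barfoo"]

# The quantified group makes "foo" optional, so "bar" alone matches.
print(naive_prefix("^(foo)?bar"))                       # foo
print(is_valid_prefix("^(foo)?bar", "foo", samples))    # False
print(is_valid_prefix("^(foo)?bar", "", samples))       # True
print(is_valid_prefix("^foo", "foo", samples))          # True
```

For '^(foo)?bar' the only safe fixed prefix is the empty string, which is why a text-level scanner that does not understand quantified subexpressions can produce a wrong index condition.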
-
Tom Lane authored
Previously, pattern_fixed_prefix() was defined to return whatever fixed prefix it could extract from the pattern, plus the "rest" of the pattern. That definition was sensible for LIKE patterns, but not so much for regexes, where reconstituting a valid pattern minus the prefix could be quite tricky (certainly the existing code wasn't doing that correctly). Since the only thing that callers ever did with the "rest" of the pattern was to pass it to like_selectivity() or regex_selectivity(), let's cut out the middle-man and just have pattern_fixed_prefix's subroutines do this directly. Then pattern_fixed_prefix can return a simple selectivity number, and the question of how to cope with partial patterns is removed from its API specification. While at it, adjust the API spec so that callers who don't actually care about the pattern's selectivity (which is a lot of them) can pass NULL for the selectivity pointer to skip doing the work of computing a selectivity estimate. This patch is only an API refactoring that doesn't actually change any processing, other than allowing a little bit of useless work to be skipped. However, it's necessary infrastructure for my upcoming fix to regex prefix extraction, because after that change there won't be any simple way to identify the "rest" of the regex, not even to the low level of fidelity needed by regex_selectivity. We can cope with that if regex_fixed_prefix and regex_selectivity communicate directly, but not if we have to work within the old API. Hence, back-patch to all active branches.
-
- 09 Jul, 2012 1 commit
-
Tom Lane authored
We can do this without creating an API break for estimation functions by passing the collation using the existing fmgr functionality for passing an input collation as a hidden parameter. The need for this was foreseen at the outset, but we didn't get around to making it happen in 9.1 because of the decision to sort all pg_statistic histograms according to the database's default collation. That meant that selectivity estimators generally need to use the default collation too, even if they're estimating for an operator that will do something different. The reason it's suddenly become more interesting is that regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE property), and we no longer want to use the wrong collation when examining regexps during planning. It's not that the selectivity estimate is likely to change much from this; rather that we are thinking of caching compiled regexps during planner estimation, and we won't get the intended benefit if we cache them with a different collation than the executor will use. Back-patch to 9.1, both because the regexp change is likely to get back-patched and because we might as well get this right in all collation-supporting branches, in case any third-party code wants to rely on getting the collation. The patch turns out to be minuscule now that I've done it ...
-
- 07 Jul, 2012 1 commit
-
Tom Lane authored
The previous coding abused the first element of a cNFA state's arcs list to hold a per-state flag bit, which was confusing, undocumented, and not even particularly efficient. Get rid of that in favor of a separate "stflags" vector. Since there's only one bit in use, I chose to allocate a char per state; we could possibly replace this with a bitmap at some point, but that would make accesses a little slower. It's already about 8X smaller than before, so let's not get overly tense. Also document the representation better than it was before, which is to say not at all. This patch is a byproduct of investigations towards extracting a "fixed prefix" string from the compact-NFA representation of regex patterns. Might need to back-patch it if we decide to back-patch that fix, but for now it's just code cleanup so I'll just put it in HEAD.
-
- 06 Jul, 2012 10 commits
-
Alvaro Herrera authored
This should ease its use on the Windows build environment.
-
Alvaro Herrera authored
Commit 2b443063 changed wording for some of the error messages, but neglected updating the regress output to match.
-
Bruce Momjian authored
\copyright output to 2012. Backpatch to 9.2.
-
Bruce Momjian authored
to avoid producing dups, e.g. 2012-2012 Backpatch to 9.2.
-
Bruce Momjian authored
match, so files that contain embedded copyrights are updated, e.g. pgsql/help.c. Backpatch to 9.2.
-
Bruce Momjian authored
basename() qualification.
-
Bruce Momjian authored
(now added). Backpatch to 9.2.
-
Bruce Momjian authored
-
Robert Haas authored
Bug spotted by Tom Lane.
-
Bruce Momjian authored
report from Tom. Backpatch to 9.2.
-
- 05 Jul, 2012 9 commits
-
Tom Lane authored
join_path_components() tried to remove leading ".." components from its tail argument, but it was not nearly bright enough to do so correctly unless the head argument was (a) absolute and (b) canonicalized. Rather than try to fix that logic, let's just get rid of it: there is no correctness reason to remove "..", and cosmetic concerns can be taken care of by a subsequent canonicalize_path() call. Per bug #6715 from Greg Davidson. Back-patch to all supported branches. It appears that pre-9.2, this function is only used with absolute paths as head arguments, which is why we'd not noticed the breakage before. However, third-party code might be expecting this function to work in more general cases, so it seems wise to back-patch. In HEAD and 9.2, also make some minor cosmetic improvements to callers.
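The approach described above (join the components as they are, then canonicalize) can be sketched in Python. This is only an analogy: `join_then_canonicalize` is a hypothetical helper, and `posixpath.normpath` is used as a stand-in for canonicalize_path(), whose exact rules may differ.

```python
import posixpath


def join_then_canonicalize(head: str, tail: str) -> str:
    """Join without touching '..' in the tail, then clean up afterwards."""
    joined = posixpath.join(head, tail)   # no special handling of ".."
    return posixpath.normpath(joined)     # stand-in for canonicalize_path()


# Absolute, already-canonical head: the easy case.
print(join_then_canonicalize("/usr/local", "../lib"))   # /usr/lib

# Relative head: still correct, because nothing was trimmed before joining.
print(join_then_canonicalize("a/b", "../c"))            # a/c

# Non-canonical head: the ".." in the head and in the tail are both
# resolved in one pass over the joined path.
print(join_then_canonicalize("a/..", "../c"))           # ../c
```

Deferring all ".." handling to one canonicalization pass over the joined path avoids having to reason about the head and the tail separately.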
-
Heikki Linnakangas authored
That caused the plpython_unicode regression test to fail on SQL_ASCII encoding, as evidenced by the buildfarm. The reason is that with the patch, you don't get the detail in the error message that you got before. That detail is actually very informative, so rather than just adjust the expected output, let's revert that part of the patch for now to make the buildfarm green again, and figure out some other way to avoid the recursion of PLy_elog() that doesn't lose the detail.
-
Heikki Linnakangas authored
Windows encodings, "win1252" and so forth, are named differently in Python, like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails for some reason, use a plain ereport(), not a PLy_elog(), to report that error. That avoids recursion and crash, if PLy_elog() tries to call PLyUnicode_Bytes() again. This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that plpython didn't even try these conversions. Jan Urbański, with minor comment improvements by me.
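The naming difference is easy to see from the Python side. The sketch below is an illustration only, not the plpython patch (which is C code): `to_python_codec` is a hypothetical helper that rewrites a "winNNNN"-style name to the "cpNNNN" form and leaves every other name untouched.

```python
import codecs


def to_python_codec(pg_name: str) -> str:
    """Map a 'winNNNN'-style encoding name to Python's 'cpNNNN' spelling.

    Hypothetical helper for illustration; other names pass through as-is.
    """
    name = pg_name.lower()
    if name.startswith("win") and name[3:].isdigit():
        return "cp" + name[3:]
    return name


print(to_python_codec("WIN1252"))                        # cp1252
print(codecs.lookup(to_python_codec("win1252")).name)    # cp1252

# With the translated name the text can actually be encoded.
print("é".encode(to_python_codec("win1252")))            # b'\xe9'
```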
-
Tom Lane authored
All Unix-oid platforms that we currently support should have waitpid(), since it's in V2 of the Single Unix Spec. Our git history shows that the wait3 code was added to support NextStep, which we officially dropped support for as of 9.2. So get rid of the configure test, and simplify the macro spaghetti in reaper(). Per suggestion from Fujii Masao.
-
Alvaro Herrera authored
Currently only pg_clog is copied, but some other directories could need the same treatment as well, so create a subroutine to do it. Extracted from my (somewhat larger) FOR KEY SHARE patch.
-
Magnus Hagander authored
Dean Rasheed, reviewed by Josh Kupershmidt
-
Bruce Momjian authored
copyright.pl. Backpatch to 9.2.
-
Bruce Momjian authored
Run on HEAD and 9.2.
-
Robert Haas authored
Per recent discussion on pgsql-hackers, these messages are too chatty for most users.
-
- 04 Jul, 2012 14 commits
-
Bruce Momjian authored
extensions that might exist in the new empty cluster databases, like plpgsql. Backpatch to 9.2.
-
Robert Haas authored
Albe Laurenz, per a report by Greg Smith that our sample function doesn't quite match Oracle's behavior.
-
Robert Haas authored
This is infrastructure for Alexander Korotkov's work on indexing regular expression searches. Alexander Korotkov, with a bit of further hackery on the MULE conversion by me
-
Robert Haas authored
Josh Kupershmidt
-
Robert Haas authored
-
Robert Haas authored
The old value of 32MB has been around for a very long time, and in the meantime typical system memories have become vastly larger. Also, now that we no longer depend on being able to fit the entirety of our shared memory segment into the system's limit on System V shared memory, there's a much better chance of the higher limit actually proving productive. Per recent discussion on pgsql-hackers.
-
Robert Haas authored
Amit Kapila, reviewed by Shigeru Hanada and Peter Eisentraut, with some modifications by me.
-
Magnus Hagander authored
-
Magnus Hagander authored
This makes it possible for the master to track how much data has actually been written by pg_receivexlog - and not just how much has been sent towards it.
-
Magnus Hagander authored
This ensures that a standby such as pg_receivexlog will not be selected as sync standby - which would cause the master to block waiting for a location that could never happen. Fujii Masao
-
Magnus Hagander authored
This hasn't been true since 9.1, when the default was changed to -1. Remove the reference completely, keeping the discussion of the parameter and its shared memory effects on the config page.
-
Magnus Hagander authored
pgfoundry is deprecated and no longer accepting new projects, so we really shouldn't be directing people there.
-
Magnus Hagander authored
Also remove special references to downloads off pgfoundry since they are not correct - downloads are done through the main website.
-
Tom Lane authored
This commit improves the comments in pg_wchar.h and creates #define symbols for some formerly hard-coded values. No substantive code changes. Tatsuo Ishii and Tom Lane
-