- 09 Dec, 2010 2 commits
-
-
Simon Riggs authored
Hot Standby conflicts only with tuples that were visible at some point. So ignore tuples from aborted transactions or for tuples updated/deleted during the inserting transaction when generating the conflict transaction ids. Following detailed analysis and test case by Noah Misch. Original report covered btree delete records, correctly observed by Heikki Linnakangas that this applies to other cases also. Fix covers all sources of cleanup records via common code.
-
Tom Lane authored
Recent versions of the Linux system header files cause xlogdefs.h to believe that open_datasync should be the default sync method, whereas formerly fdatasync was the default on Linux. open_datasync is a bad choice, first because it doesn't actually outperform fdatasync (in fact the reverse), and second because we try to use O_DIRECT with it, causing failures on certain filesystems (e.g., ext4 with data=journal option). This part of the patch is largely per a proposal from Marti Raudsepp. More extensive changes are likely to follow in HEAD, but this is as much change as we want to back-patch. Also clean up confusing code and incorrect documentation surrounding the fsync_writethrough option. Those changes shouldn't result in any actual behavioral change, but I chose to back-patch them anyway to keep the branches looking similar in this area. In 9.0 and HEAD, also do some copy-editing on the WAL Reliability documentation section. Back-patch to all supported branches, since any of them might get used on modern Linux versions.
-
- 08 Dec, 2010 1 commit
-
-
Simon Riggs authored
First, avoid scanning the whole ProcArray once we know there are at least commit_siblings active; second, skip the check altogether if commit_siblings = 0. Greg Smith
-
- 07 Dec, 2010 2 commits
-
-
Heikki Linnakangas authored
an old transaction running in the master, and a lot of transactions have started and finished since, and a WAL-record is written in the gap between the creating the running-xacts snapshot and WAL-logging it, recovery will fail with "too many KnownAssignedXids" error. This bug was reported by Joachim Wieland on Nov 19th. In the same scenario, when fewer transactions have started so that all the xids fit in KnownAssignedXids despite the first bug, a more serious bug arises. We incorrectly initialize the clog code with the oldest still running transaction, and when we see the WAL record belonging to a transaction with an XID larger than one that committed already before the checkpoint we're recovering from, we zero the clog page containing the already committed transaction, leading to data loss. In hindsight, trying to track xids in the known-assigned-xids array before seeing the running-xacts record was too complicated. To fix that, hold XidGenLock while the running-xacts snapshot is taken and WAL-logged. That ensures that no transaction can begin or end in that gap, so that in recvoery we know that the snapshot contains all transactions running at that point in WAL.
-
Tom Lane authored
There are some code paths, such as SPI_execute(), where we invoke copyObject() on raw parse trees before doing parse analysis on them. Since the bison grammar is capable of building heavily nested parsetrees while itself using only minimal stack depth, this means that copyObject() can be the front-line function that hits stack overflow before anything else does. Accordingly, it had better have a check_stack_depth() call. I did a bit of performance testing and found that this slows down copyObject() by only a few percent, so the hit ought to be negligible in the context of complete processing of a query. Per off-list report from Toshihide Katayama. Back-patch to all supported branches.
-
- 06 Dec, 2010 3 commits
-
-
Andrew Dunstan authored
This doesn't involve any user-visible change in behavior, but will be useful when the COPY routines are exposed to allow their use by Foreign Data Wrapper routines, which will be able to use these routines to read irregular CSV files, for example.
-
Heikki Linnakangas authored
-
Peter Eisentraut authored
-
- 05 Dec, 2010 1 commit
-
-
Tom Lane authored
Avoid eating quite so much memory for large inheritance trees, by reclaiming the space used by temporary copies of the original parsetree and range table, as well as the workspace needed during planning. The cost is needing to copy the finished plan trees out of the child memory context. Although this looks like it ought to slow things down, my testing shows it actually is faster, apparently because fewer interactions with malloc() are needed and/or we can do the work within a more readily cacheable amount of memory. That result might be platform-dependent, but I'll take it. Per a gripe from John Papandriopoulos, in which it was pointed out that the memory consumption actually grew as O(N^2) for sufficiently many child tables, since we were creating N copies of the N-element range table.
-
- 04 Dec, 2010 7 commits
-
-
Tom Lane authored
1. Complain, rather than silently doing nothing, if an "invalid" tuple is found on a leaf page. Per off-list discussion with Heikki. 2. Fix oversight in code that removes a GISTSearchItem from the search queue: we have to reset lastHeap if this was the last heap item in the parent GISTSearchTreeItem. Otherwise subsequent additions will do the wrong thing. This was probably masked in early testing because in typical cases the parent item would now be completely empty and would be deleted on next call. You'd need a queued non-leaf page at exactly the same distance as a heap tuple to expose the bug.
-
Peter Eisentraut authored
run_schedule() and run_single_test() were using different output widths, which would show up in bigcheck/bigtest, for example.
-
Tom Lane authored
-
Tom Lane authored
Teodor Sigaev, with some revision by Tom
-
Tom Lane authored
-
Tom Lane authored
On reflection it's a bad idea for the KNNGIST patch to have removed that. We don't want it silently returning incorrect answers.
-
Tom Lane authored
This commit represents a rather heavily editorialized version of Teodor's builtin_knngist_itself-0.8.2 and builtin_knngist_proc-0.8.1 patches. I redid the opclass API to add a separate Distance method instead of turning the Consistent method into an illogical mess, fixed some bit-rot in the rbtree interfaces, and generally worked over the code style and comments. There's still no non-code documentation to speak of, but I'll work on that separately. Some contrib-module changes are also yet to come (right now, point <-> point is the only KNN-ified operator). Teodor Sigaev and Tom Lane
-
- 03 Dec, 2010 6 commits
-
-
Robert Haas authored
-
Robert Haas authored
Noted by Itagaki Takahiro.
-
Robert Haas authored
This eliminates some crufty, special-purpose code and, as a non-trivial side benefit, allows recovery.conf parameters to be unquoted. Dimitri Fontaine, with review and cleanup by Alvaro Herrera, Itagaki Takahiro, and me.
-
Heikki Linnakangas authored
the "END OF FORMAT CALLBACKS" comment, because they are format callbacks too.
-
Itagaki Takahiro authored
We can directly verify the unterminated input with pg_verify_mbstr_len.
-
Tom Lane authored
This is a heavily revised version of builtin_knngist_core-0.9. The ordering operators are no longer mixed in with actual quals, which would have confused not only humans but significant parts of the planner. Instead, ordering operators are carried separately throughout planning and execution. Since the API for ambeginscan and amrescan functions had to be changed anyway, this commit takes the opportunity to rationalize that a bit. RelationGetIndexScan no longer forces a premature index_rescan call; instead, callers of index_beginscan must call index_rescan too. Aside from making the AM-side initialization logic a bit less peculiar, this has the advantage that we do not make a useless extra am_rescan call when there are runtime key values. AMs formerly could not assume that the key values passed to amrescan were actually valid; now they can. Teodor Sigaev and Tom Lane
-
- 02 Dec, 2010 5 commits
-
-
Alvaro Herrera authored
Keep only the typedef in the header file.
-
Alvaro Herrera authored
-
Alvaro Herrera authored
-
Alvaro Herrera authored
-
Heikki Linnakangas authored
to make it easier to reuse that code. There is no user-visible changes. This is in preparation for the patch to add a new archive format, a directory, to perform a custom-like dump but with each table being dumped to a separate file (that in turn is a prerequisite for parallel pg_dump). This also makes it easier to add new compression methods in the future, and makes the pg_backup_custom.c code easier to read, when the compression-related code is factored out. Joachim Wieland, with heavy editorialization by me.
-
- 01 Dec, 2010 1 commit
-
-
Tom Lane authored
There were corner cases in which the planner would attempt to inline such a function, which would result in a failure at runtime due to loss of information about exactly what the result record type is. Fix by disabling inlining when the function's recorded result type is RECORD. There might be some sub-cases where inlining could still be allowed, but this is a simple and backpatchable fix, so leave refinements for another day. Per bug #5777 from Nate Carson. Back-patch to all supported branches. 8.1 happens to avoid a core-dump here, but it still does the wrong thing.
-
- 29 Nov, 2010 4 commits
-
-
Tom Lane authored
Formerly we looked up the operators associated with each index (caching them in relcache) and then the planner looked up the btree opfamily containing such operators in order to build the btree-centric pathkey representation that describes the index's sort order. This is quite pointless for btree indexes: we might as well just use the index's opfamily information directly. That saves syscache lookup cycles during planning, and furthermore allows us to eliminate the relcache's caching of operators altogether, which may help in reducing backend startup time. I added code to plancat.c to perform the same type of double lookup on-the-fly if it's ever faced with a non-btree amcanorder index AM. If such a thing actually becomes interesting for production, we should replace that logic with some more-direct method for identifying the corresponding btree opfamily; but it's not worth spending effort on now. There is considerably more to do pursuant to my recent proposal to get rid of sort-operator-based representations of sort orderings, but this patch grabs some of the low-hanging fruit. I'll look at the remainder of that work after the current commitfest.
-
Heikki Linnakangas authored
Christoph Berg.
-
Robert Haas authored
Fujii Masao
-
Simon Riggs authored
removing an infrequently occurring race condition in Hot Standby. An xid must be assigned before a lock appears in shared memory, rather than immediately after, else GetRunningTransactionLocks() may see InvalidTransactionId, causing assertion failures during lock processing on standby. Bug report and diagnosis by Fujii Masao, fix by me.
-
- 27 Nov, 2010 7 commits
-
-
Tom Lane authored
Per gripe from Andreas Scherbaum.
-
Bruce Momjian authored
something that can be documented.
-
Robert Haas authored
KaiGai Kohei, with a few changes by me.
-
Tom Lane authored
-
Tom Lane authored
The pg_fe_sendauth code might fail if it can't handle the authentication request message type --- if so, ping should still say the server is up.
-
Tom Lane authored
Basically, we want to distinguish all cases where the connection was not made from those where it was. A convenient proxy for this is to see if we got a message with a SQLSTATE code back from the postmaster. This presumes that the postmaster will always send us a SQLSTATE in a failure message, which is true for 7.4 and later postmasters in every case except fork failure. (We could possibly complicate the postmaster code to do something about that, but it seems not worth the trouble, especially since pg_ctl's response for that case should be to keep waiting anyway.) If we did get a SQLSTATE from the postmaster, there are basically only two cases, as per last week's discussion: ERRCODE_CANNOT_CONNECT_NOW and everything else. Any other error code implies that the postmaster is in principle willing to accept connections, it just didn't like or couldn't handle this particular request. We want to make a special case for ERRCODE_CANNOT_CONNECT_NOW so that "pg_ctl start -w" knows it should keep waiting. In passing, pick names for the enum constants that are a tad less likely to present collision hazards in future.
-
Tom Lane authored
Newly added code was supposing that "struct sockaddr_in" applies to IPv6.
-
- 26 Nov, 2010 1 commit
-
-
Tom Lane authored
1. Don't #include postgres.h in a frontend build. 2. Don't assume that the backend's symbol PGSQL_AF_INET6 has anything to do with the constant that will be used by system library functions (because, in point of fact, it usually doesn't). Fortunately, PGSQL_AF_INET is equal to AF_INET, so we can just cater for both sets of values in one case construct without fear of conflict.
-