- 29 Aug, 2019 1 commit
- 
- 
Etsuro Fujita authored
 
- 
- 28 Aug, 2019 5 commits
- 
- 
Tom Lane authored
- 
Heikki Linnakangas authoredThe comment did not match what the code actually did for integers with the 43rd bit set. You get an integer like that, if you have a posting list with two adjacent TIDs that are more than 2^31 blocks apart. According to the comment, we would store that in 6 bytes, with no continuation bit on the 6th byte, but in reality, the code encodes it using 7 bytes, with a continuation bit on the 6th byte as normal. The decoding routine also handled these 7-byte integers correctly, except for an overflow check that assumed that one integer needs at most 6 bytes. Fix the overflow check, and fix the comment to match what the code actually does. Also fix the comment that claimed that there are 17 unused bits in the 64-bit representation of an item pointer. In reality, there are 64-32-11=21. Fitting any item pointer into max 6 bytes was an important property when this was written, because in the old pre-9.4 format, item pointers were stored as plain arrays, with 6 bytes for every item pointer. The maximum of 6 bytes per integer in the new format guaranteed that we could convert any page from the old format to the new format after upgrade, so that the new format was never larger than the old format. But we hardly need to worry about that anymore, and running into that problem during upgrade, where an item pointer is expanded from 6 to 7 bytes such that the data doesn't fit on a page anymore, is implausible in practice anyway. Backpatch to all supported versions. This also includes a little test module to test these large distances between item pointers, without requiring a 16 TB table. It is not backpatched, I'm including it more for the benefit of future development of new posting list formats. Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi Reviewed-by: Masahiko Sawada, Alexander Korotkov 
- 
Thomas Munro authoredRelationAllowsEarlyPruning() performed a catalog scan, but is used in two contexts where that was a bad idea: 1. In heap_page_prune_opt(), which runs very frequently in some large scans. This caused major performance problems in a field report that was easy to reproduce. 2. In TestForOldSnapshot(), which runs while we hold a buffer content lock. It's not clear if this was guaranteed to be free of buffer deadlock risk. The check was introduced in commit 2cc41acd and defended against a real problem: 9.6's hash indexes have no page LSN and so we can't allow early pruning (ie the snapshot-too-old feature). We can remove the check from all later releases though: hash indexes are now logged, and there is no way to create UNLOGGED indexes on regular logged tables. If a future release allows such a combination, it might need to put a similar check in place, but it'll need some more thought. Back-patch to 10. Author: Thomas Munro Reviewed-by: Tom Lane, who spotted the second problem Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
- 
Michael Paquier authoredcheck_float4_val() checks after underflow and overflow of values converted from float8 to float4, but there has never been any regression tests for that. This brings the coverage of float.h to 100%. Author: Movead Li Discussion: https://postgr.es/m/20190822174636998766188@highgo.ca 
- 
Michael Paquier authoredIn this case, the transfer uses a libpq connection, which is subject to the timeout parameters set at system level, and this can make the rewind operation suddenly canceled which is not good for automation. One workaround to such issues would be to use PGOPTIONS to enforce the wanted timeout parameters, but that's annoying, and for example pg_dump, which can run potentially long-running queries disables all types of timeouts. lock_timeout and statement_timeout are the ones which can cause problems now. Note that pg_rewind does not use transactions, so disabling idle_in_transaction_session_timeout is optional, but it feels safer to do so for the future. This is back-patched down to 9.5. idle_in_transaction_session_timeout is only present since 9.6. Author: Alexander Kukushkin Discussion: https://postgr.es/m/CAFh8B=krcVXksxiwVQh1SoY+ziJ-JC=6FcuoBL3yce_40Es5_g@mail.gmail.com Backpatch-through: 9.5 
 
- 
- 27 Aug, 2019 8 commits
- 
- 
Tom Lane authoredGive it an explanatory para like the other default roles have. Don't imply that it can send any signal whatever. In passing, reorder the table entries and explanatory paras for the default roles into some semblance of consistency. Ian Barwick, tweaked a bit by me. Discussion: https://postgr.es/m/89907e32-76f3-7282-a89c-ea19c722fe5d@2ndquadrant.com 
- 
Tom Lane authoredTurns out that returning "unrecognized signal" is confusing. Make it explicit that the platform lacks any support for signal names. (At least of the machines in the buildfarm, only HPUX lacks it.) Back-patch to v12 where we invented this function. Discussion: https://postgr.es/m/3067.1566870481@sss.pgh.pa.us 
- 
Peter Geoghegan authoredCommit efada2b8, which made the nbtree page deletion algorithm more robust, removed the concept of a half-dead internal page. Remove a comment about half dead parent pages that was overlooked. 
- 
Tom Lane authoredSection 4.2.7 says that unless otherwise specified, built-in aggregates ignore rows in which any input is null. This is not true of the JSON aggregates, but it wasn't documented. Fix that. Of the other entries in table 9.55, some were explicit about ignoring nulls, and some weren't; for consistency and self-contained-ness, make them all say it explicitly. Per bug #15884 from Tim Möhlmann. Back-patch to all supported branches. Discussion: https://postgr.es/m/15884-c32d848f787fcae3@postgresql.org 
- 
Tom Lane authoredDaniel Gustafsson Discussion: https://postgr.es/m/F2FB03F2-B112-4E51-842E-12C50DCA2F4A@yesql.se 
- 
Tom Lane authoredAn empty file name or subdirectory name leads join_path_components() to just produce the parent directory name, which leads to weird failures or recursive inclusions. Let's throw a specific error for that. It takes only slightly more code to detect all-blank names, so do so. Also, detect direct recursion, ie a file calling itself. As coded this will also detect recursion via "include_dir '.'", which is perhaps more likely than explicitly including the file itself. Detecting indirect recursion would require API changes for guc-file.l functions, which seems not worth it since extensions might call them. The nesting depth limit will catch such cases eventually, just not with such an on-point error message. In passing, adjust the example usages in postgresql.conf.sample to perhaps eliminate the problem at the source: there's no reason for the examples to suggest that an empty value is valid. Per a trouble report from Brent Bates. Back-patch to 9.5; the issue is old, but the code in 9.4 is enough different that the patch doesn't apply easily, and it doesn't seem worth the trouble to fix there. Ian Barwick and Tom Lane Discussion: https://postgr.es/m/8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com 
- 
Michael Paquier authoredFD_SETSIZE needs to be declared before winsock2.h, or it is possible to run into buffer overflow issues when using --jobs. This is similar to pgbench's solution done in a23c6415. This has been introduced by 71d84efb, and older versions have been using the default value of FD_SETSIZE, defined at 64. Per buildfarm member jacana, but this impacts all Windows animals running the TAP tests. I have reproduced the failure locally to check the patch. Author: Michael Paquier Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/20190826054000.GE7005@paquier.xyz Backpatch-through: 9.5 
 
- 26 Aug, 2019 7 commits
- 
- 
Tom Lane authoredIf a test case tried to set an invalid value of synchronous_standby_names, the test script didn't detect that, which seems like a bad idea. Noticed while testing a proposed patch that broke some of these test cases. 
- 
Tom Lane authoredA report from Alvaro Herrera shows that if we're in PM_STARTUP state, and we spawn a dead_end child to reject some incoming connection request, and that child dies with an unexpected exit code, the postmaster does not respond well. We correctly send SIGQUIT to the startup process, but then: * if the startup process exits with nonzero exit code, as expected, we thought that that indicated a crash and aborted startup. * if the startup process exits with zero exit code, which is possible due to the inherent race condition, we'd advance to PM_RUN state which is fine --- but the code forgot that AbortStartTime would be nonzero in this situation. We'd either die on the Asserts saying that it was zero, or perhaps misbehave later on. (A quick look suggests that the only misbehavior might be busy-waiting due to DetermineSleepTime doing the wrong thing.) To fix the first point, adjust the state-machine logic to recognize that a nonzero exit code is expected after sending SIGQUIT, and have it transition to a state where we can restart the startup process. To fix the second point, change the Asserts to clear the variable rather than just claiming it should be clear already. Perhaps we could improve this further by not treating a crash of a dead_end child as a reason for panic'ing the database. However, since those child processes are connected to shared memory, that seems a bit risky. There are few good reasons for a dead_end child to report failure anyway (the cause of this in Alvaro's report is quite unclear). On balance, therefore, a minimal fix seems best. This is an oversight in commit 45811be9. While that was back-patched, I'm hesitant to back-patch this change. The lack of reasons for a dead_end child to fail suggests that the case should be very rare in the field, which squares with the lack of reports; so it seems like this might not be worth the risk of introducing new issues. In any case we can let it bake awhile in HEAD before considering a back-patch. Discussion: https://postgr.es/m/20190615160950.GA31378@alvherre.pgsql 
- 
Tom Lane authoredIncompletely quoting an API spec does nobody any good. Noted by Paul Jungwirth. Looks like the discrepancy was my fault originally :-( Discussion: https://postgr.es/m/CA+renyU_J8TU_d3Kr0PkuOgFbpypextendu7a+_d5NOfVdvDeA@mail.gmail.com 
- 
Peter Eisentraut authoredIn cc8d4151, the arguments of warn_or_exit_horribly() were changed but this was not updated. 
- 
Andrew Dunstan authoredPreviously 'uname -r' on Msys2 reported a kernele release starting with 2. The latest version starts with 3. In commit 1638623f we specifically looked for one starting with 2. This is now changed to look for any digit between 2 and 9. backpatch to release 10. 
- 
Andrew Dunstan authoredOn msys2, 'uname -s' reports a string starting MSYS instead on MINGW as happens on msys1. Treat these both the same way. This reverts 608a7101 in favor of a more general solution. Backpatch to all live branches. 
- 
Michael Paquier authoredWhen trying to use a high number of jobs, vacuumdb (and more recently reindexdb) has only checked for a maximum number of jobs used, causing confusing failures when running out of file descriptors when the jobs open connections to Postgres. This commit changes the error handling so as we do not check anymore for a maximum number of allowed jobs when parsing the option value with FD_SETSIZE, but check instead if a file descriptor is within the supported range when opening the connections for the jobs so as this is detected at the earliest time possible. Also, improve the error message to give a hint about the number of jobs recommended, using a wording given by the reviewers of the patch. Reported-by: Andres Freund Author: Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de Backpatch-through: 9.5 
 
- 
- 25 Aug, 2019 3 commits
- 
- 
Tom Lane authoredPOSIX permits getopt() to advance optind beyond argc when the last argv entry is an option that requires an argument and hasn't got one. It seems that no major platforms actually do that, but musl does, so that something like "psql -f" would crash with that libc. Add a check that optind is in range before trying to look at the possibly-bogus option. Report and fix by Quentin Rameau. Back-patch to all supported branches. Discussion: https://postgr.es/m/20190825100617.GA6087@fifth.space 
- 
Tom Lane authoredWe were setting extra_float_digits = 0 to avoid platform-dependent output in this test, but that's still able to expose platform-specific roundoff behavior in some new test cases added by commit a3d28448, as reported by Peter Eisentraut. Reduce it to -1 to hide that. (Over in geometry.sql, we're using -3, which is an ancient decision dating to 337f73b1. I wonder whether that's overkill now. But there's probably little value in trying to change it.) Back-patch to v12 where a3d28448 came in; there's no evidence that we have any platform-dependent issues here before that. Discussion: https://postgr.es/m/15551268-e224-aa46-084a-124b64095ee3@2ndquadrant.com 
- 
Thomas Munro authoredBleeding-edge LLVM has stopped supplying replacements for various C++14 library features, for people on older C++ versions. Since we're not ready to require C++14 yet, just use plain old new instead of make_unique. As revealed by buildfarm animal seawasp. Back-patch to 11. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com 
 
- 
- 24 Aug, 2019 4 commits
- 
- 
Michael Paquier authored989d23b0 has caused its tests to be broken as the module defines unused steps, turning the buildfarm red. 
- 
Peter Geoghegan authoredThe Postgres approach to coupling locks during an ascent of the tree is slightly different to the approach taken by Lehman and Yao. Add a new paragraph to the "Differences to the Lehman & Yao algorithm" section of the nbtree README that explains the similarities and differences. 
- 
Michael Paquier authoredThis is useful for developers to find out if an isolation spec is over-engineered or if it needs more work by warning at the end of a test run if a step is not used, generating a failure with extra diffs. While on it, clean up all the specs which include steps not used in any permutations to simplify them. Author: Michael Paquier Reviewed-by: Asim Praveen, Melanie Plageman Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz 
- 
Michael Paquier authoredThe original purpose of the dry-run mode is to be able to print all the possible permutations from a spec file, but it has become less useful since isolation tests has improved regarding deadlock detection as one step not wanted by the author could block indefinitely now (originally the step blocked would have been detected rather quickly). Per discussion, let's remove it. Author: Michael Paquier Reviewed-by: Asim Praveen, Melanie Plageman Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz 
 
- 
- 23 Aug, 2019 1 commit
- 
- 
Michael Paquier authoredThis adds a section for heap-related functions. These were previously mixed with functions having a more general purpose, leading to confusion. While on it, add a query example for fsm_page_contents. Backpatch down to 10, where b5e3942f introduced the subsections for function types in pageinspect documentation. Author: Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoDyM7E1+cK3-aWejxKTGC-wVVP2B+RnJhN6inXyeRmqzw@mail.gmail.com Backpatch-through: 10 
 
- 
- 22 Aug, 2019 3 commits
- 
- 
Peter Eisentraut authoredT612 has been fully supported since the major window function enhancements in PostgreSQL 11, but it wasn't updated at the time. 
- 
Peter Eisentraut authoredThere were some minor differences that didn't seem necessary. Discussion: https://www.postgresql.org/message-id/flat/86b67eef-bb26-c97d-3e35-64f1fbd4f9fe%402ndquadrant.com 
- 
Michael Paquier authoredThe "Express" flavor of Visual Studio exists up to 2017, and the documentation referred to "Express" for Visual Studio 2019. Author: Takuma Hoshiai Discussion: https://postgr.es/m/20190820120231.f905542e685140258ca73d82@sraoss.co.jp Backpatch-through: 9.4 
 
- 
- 21 Aug, 2019 5 commits
- 
- 
Peter Geoghegan authoredAdjust the struct comment that describes how page splits use their descent stack to cascade up the tree from the leaf level. In passing, fix up some unrelated nbtree comments that had typos or were obsolete. 
- 
Peter Eisentraut authoredcrypt() hasn't been needed since crypt detection was removed from PostgreSQL, so these configure checks are not necessary. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/21f88934-f00c-27f6-a9d8-7ea06d317781%402ndquadrant.com 
- 
Alvaro Herrera authoredIn early development patches, "replication origins" were called "identifiers"; almost everything was renamed, but these references to the old terminology went unnoticed. Reported-by: Craig Ringer 
- 
Tom Lane authoredUsing pg_pltemplate as test data was probably not very forward-looking, considering we've had many discussions around removing that catalog altogether. Use a nearby temp table instead, to make these two test scripts more self-contained. This is a better test case anyway, since it exercises the scenario where the entries in the anyarray column actually vary in type intra-query. 
- 
Peter Eisentraut authoredAuthor: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/E393EC88-377F-4C59-A67A-69F2A38D17C7@yesql.se 
 
- 
- 20 Aug, 2019 3 commits
- 
- 
Peter Eisentraut authoredCorrect the comment for read_any_attr(). Give a clearer error message when parsing at the end of the string, when the client-final-message does not contain a "p" attribute (for some reason). Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/2fb8a15b-de35-682d-a77b-edcc9c52fa12%402ndquadrant.com 
- 
Alvaro Herrera authoredAuthor: Alexander Lakhin Discussion: https://postgr.es/m/20190819072244.GE18166@paquier.xyz 
- 
Michael Paquier authoredFD_SETSIZE is included in sys/select.h per POSIX, and this header inclusion has been moved to scripts_parallel.c as of 5f384037 without moving the variable, causing a compilation failure on recent versions of OpenBSD (6.6 was the version used in the report). In order to take care of the failure, move FD_SETSIZE directly to scripts_parallel.c with a wrapper controlling the maximum number of parallel slots supported, based on a suggestion by Andres Freund. While on it, reduce the maximum number to be less than FD_SETSIZE, leaving some room for stdin, stdout and such as they consume some file descriptors. The buildfarm did not complain about that, as it happens to only be an issue on recent versions of OpenBSD and there is no coverage in this area. 51c3e9fa fixed a similar set of issues. Bug: #15964 Reported-by: Sean Farrell Discussion: https://postgr.es/m/15964-c1753bdfed722e04@postgresql.org 
 
-