- 07 Apr, 2018 16 commits
-
-
Alvaro Herrera authored
Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
-
Alvaro Herrera authored
This works very much like the existing bms_last_member function, only it traverses through the Bitmapset in the opposite direction from the most significant bit down to the least significant bit. A special prevbit value of -1 may be used to have the function determine the most significant bit. This is useful for starting a loop. When there are no members less than prevbit, the function returns -2 to indicate there are no more members. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com
-
Andres Freund authored
When an update moves a row between partitions (supported since 2f178441), our normal logic for following update chains in READ COMMITTED mode doesn't work anymore. Cross partition updates are modeled as an delete from the old and insert into the new partition. No ctid chain exists across partitions, and there's no convenient space to introduce that link. Not throwing an error in a partitioned context when one would have been thrown without partitioning is obviously problematic. This commit introduces infrastructure to detect when a tuple has been moved, not just plainly deleted. That allows to throw an error when encountering a deletion that's actually a move, while attempting to following a ctid chain. The row deleted as part of a cross partition update is marked by pointing it's t_ctid to an invalid block, instead of self as a normal update would. That was deemed to be the least invasive and most future proof way to represent the knowledge, given how few infomask bits are there to be recycled (there's also some locking issues with using infomask bits). External code following ctid chains should be updated to check for moved tuples. The most likely consequence of not doing so is a missed error. Author: Amul Sul, editorialized by me Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com
-
Teodor Sigaev authored
This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
-
Teodor Sigaev authored
Missed in 1c1791e0 commit
-
Teodor Sigaev authored
Jsonb has a complex nature so there isn't best-for-everything way to convert it to tsvector for full text search. Current to_tsvector(json(b)) suggests to convert only string values, but it's possible to index keys, numerics and even booleans value. To solve that json(b)_to_tsvector has a second required argument contained a list of desired types of json fields. Second argument is a jsonb scalar or array right now with possibility to add new options in a future. Bump catalog version Author: Dmitry Dolgov with some editorization by me Reviewed by: Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com
-
Peter Eisentraut authored
We need to wait for the initial sync of all subscriptions. On some (faster?) machines, this didn't make a difference, but the (slower?) buildfarm machines are upset.
-
Andres Freund authored
They've been broken for days, and prevent other tests from being run. The plan is to revert their addition later. Discussion: https://postgr.es/m/20180407162252.wfo5aorjrjw2n5ws@alap3.anarazel.de
-
Peter Eisentraut authored
Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
-
Peter Eisentraut authored
Add a new WAL record type for TRUNCATE, which is only used when wal_level >= logical. (For physical replication, TRUNCATE is already replicated via SMGR records.) Add new callback for logical decoding output plugins to receive TRUNCATE actions. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
-
Teodor Sigaev authored
Hash index searches acquire predicate locks on the primary page of a bucket. It acquires a lock on both the old and new buckets for scans that happen concurrently with page splits. During a bucket split, a predicate lock is copied from the primary page of an old bucket to the primary page of a new bucket. Author: Shubham Barai, Amit Kapila Reviewed by: Amit Kapila, Alexander Korotkov, Thomas Munro Discussion: https://www.postgresql.org/message-id/flat/CALxAEPvNsM2GTiXdRgaaZ1Pjd1bs+sxfFsf7Ytr+iq+5JJoYXA@mail.gmail.com
-
Alvaro Herrera authored
Author: Amit Langote Reviewed-by: Álvaro Herrera, David Rowley Discussion: https://postgr.es/m/CA+HiwqGzq4D6z=8R0AP+XhbTFCQ-4Ct+t2ekqjE9Fpm84_JUGg@mail.gmail.com
-
Andres Freund authored
The failure appears to solely be caused by the changed partition pruning logic. Author: Andres Freund Discussion: https://postgr.es/m/20180406210330.wmqw42wqgiicktli@alap3.anarazel.de
-
Andres Freund authored
The tests added as part of 9fdb675f yield differing results depending on endianess, causing buildfarm failures. As the differences are expected, split the hash partitioning tests into a different file and maintain alternative output. The separate file is so the amount of duplicated output is reduced. David produced the alternative output without a machine to test on, so it's possible this'll require a buildfarm cycle or two to get right. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f-6f4c2Qhuipe-GY7BKmFd0FMBobRnLS7hVCoAmTszsBg@mail.gmail.com
-
Andres Freund authored
The atomics fallback implementation for pg_atomic_flag was broken, returning the inverted value from pg_atomic_test_set_flag(). This was unnoticed because a) atomic flags were unused until recently b) the test code wasn't run when the fallback implementation was in use (because it didn't allow to test for some edge cases). Fix the bug, and improve the fallback so it has the same behaviour as the non-fallback implementation in the problematic edge cases. That breaks ABI compatibility in the back branches when fallbacks are in use, but given they were broken until now... Author: Andres Freund Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de Backpatch: 9.5-, where the atomics abstraction was introduced.
-
- 06 Apr, 2018 18 commits
-
-
Robert Haas authored
Report and proposed fix by David Rowley, put in patch form by Peter Geoghegan. Discussion: http://postgr.es/m/CAKJS1f91kq1wfYR8rnRRfKtxyhU2woEA+=whd640UxMyU+O0EQ@mail.gmail.com
-
Robert Haas authored
Also enable this for postgres_fdw. Etsuro Fujita, based on an earlier patch by Amit Langote. The larger patch series of which this is a part has been reviewed by Amit Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost, and me. Minor documentation changes to the final version by me. Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp
-
Tom Lane authored
This has no functional impact whatsoever. However, it causes these unnecessary quote marks to disappear from the generated postgres.bki file, making it easier to verify that the upcoming bootstrap data conversion patch doesn't change the generated file.
-
Alvaro Herrera authored
Noted by Vik Fearing and Robert Haas
-
Magnus Hagander authored
These tests accummulated almost a gigabyte of data during the test which was then removed at the end. Instead, remove output that's no longer needed between the individual tests, to keep the total disk usage down lower. Author: Michael Banck
-
Magnus Hagander authored
Author: Michael Banck
-
Alvaro Herrera authored
Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
-
Stephen Frost authored
This provides a newer version of adminpack which works with the newly added default roles to support GRANT'ing to non-superusers access to read and write files, along with related functions (unlinking files, getting file length, renaming/removing files, scanning the log file directory) which are supported through adminpack. Note that new versions of the functions are required because an environment might have an updated version of the library but still have the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to PUBLIC for the functions). This patch also removes the long-deprecated alternative names for functions that adminpack used to include and which are now included in the backend, in adminpack v1.1. Applications using the deprecated names should be updated to use the backend functions instead. Existing installations which continue to use adminpack v1.0 should continue to function until/unless adminpack is upgraded. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
-
Stephen Frost authored
This patch adds new default roles named 'pg_read_server_files', 'pg_write_server_files', 'pg_execute_server_program' which allow an administrator to GRANT to a non-superuser role the ability to access server-side files or run programs through PostgreSQL (as the user the database is running as). Having one of these roles allows a non-superuser to use server-side COPY to read, write, or with a program, and to use file_fdw (if installed by a superuser and GRANT'd USAGE on it) to read from files or run a program. The existing misc file functions are also changed to allow a user with the 'pg_read_server_files' default role to read any files on the filesystem, matching the privileges given to that role through COPY and file_fdw from above. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
-
Stephen Frost authored
This removes the explicit superuser checks in the various file-access functions in the backend, specifically pg_ls_dir(), pg_read_file(), pg_read_binary_file(), and pg_stat_file(). Instead, EXECUTE is REVOKE'd from public for these, meaning that only a superuser is able to run them by default, but access to them can be GRANT'd to other roles. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
-
Peter Eisentraut authored
Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
-
Peter Eisentraut authored
MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
-
Robert Haas authored
The previous coding inadvertently checked the constraints for the partitioned table rather than the target partition, which could lead to data in a partition that fails to satisfy some constraint on that partition. This problem seems to date back to when table partitioning was introduced; prior to that, there was only one target table for a COPY, so the problem didn't occur, and the code just didn't get updated. Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp
-
Robert Haas authored
Etsuro Fujita. The larger patch series of which this is a part has been reviewed by Amit Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost, and me. Discussion: http://postgr.es/m/5A95487E.9050808@lab.ntt.co.jp
-
Peter Eisentraut authored
We don't actually need the insert-or-update logic, so it's clearer to have separate functions for the inserting and updating. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
-
Peter Eisentraut authored
In case the subscription is removed before the worker is fully started, give a specific error message instead of the generic "cache lookup" error. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
-
Peter Eisentraut authored
-
Simon Riggs authored
Separation of parser data structures from executor, as requested by Tom Lane. Further improvements possible. While there, implement error for multiple VALUES clauses via parser to allow line number of error, as requested by Andres Freund. Author: Pavan Deolasee Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com
-
- 05 Apr, 2018 6 commits
-
-
Magnus Hagander authored
S_ISLNK doesn't exist on Win32, instead we should use pgwin32_is_junction().
-
Magnus Hagander authored
This makes it possible to turn checksums on in a live cluster, without the previous need for dump/reload or logical replication (and to turn it off). Enabling checkusm starts a background process in the form of a launcher/worker combination that goes through the entire database and recalculates checksums on each and every page. Only when all pages have been checksummed are they fully enabled in the cluster. Any failure of the process will revert to checksums off and the process has to be started. This adds a new WAL record that indicates the state of checksums, so the process works across replicated clusters. Authors: Magnus Hagander and Daniel Gustafsson Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin
-
Bruce Momjian authored
Discussion: https://postgr.es/m/CAF4Au4xYem_W3KOuxcKct7=G4j8Z3uO9j3DUKTFJqUsfp_9pQg@mail.gmail.com Author: Oleg Bartunov Backpatch-through: 9.3
-
Simon Riggs authored
Reported-by: Andrew Gierth
-
Peter Eisentraut authored
A normal SQL command run inside PL/pgSQL acquires a snapshot, but SET TRANSACTION does not work anymore if a snapshot is set. So we have to handle this separately. Reviewed-by: Alexander Korotkov <a.korotkov@postgrespro.ru> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
-
Simon Riggs authored
Use of a C++ keyword as a function name caused problems Reported-by: Álvaro Herrera
-