1. 07 Apr, 2016 1 commit
    • Stephen Frost's avatar
      Add new catalog called pg_init_privs · 6c268df1
      Stephen Frost authored
      This new catalog holds the privileges which the system was
      initialized with at initdb time, along with any permissions set
      by extensions at CREATE EXTENSION time.  This allows pg_dump
      (and any other similar use-cases) to detect when the privileges
      set on initdb-created or extension-created objects have been
      changed from what they were set to at initdb/extension-creation
      time and handle those changes appropriately.
      
      Reviews by Alexander Korotkov, Jose Luis Tallon
      6c268df1
  2. 06 Apr, 2016 8 commits
    • Teodor Sigaev's avatar
      Add jsonb_insert · 0b62fd03
      Teodor Sigaev authored
      It inserts a new value into an jsonb array at arbitrary position or
      a new key to jsonb object.
      
      Author: Dmitry Dolgov
      Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan
      0b62fd03
    • Peter Eisentraut's avatar
      pg_dump: Add table qualifications to some tags · 3b3fcc4e
      Peter Eisentraut authored
      Some object types have names that are only unique for one table.  But
      for those we generally didn't put the table name into the dump TOC tag.
      So it was impossible to identify these objects if the same name was used
      for multiple tables.  This affects policies, column defaults,
      constraints, triggers, and rules.
      
      Fix by adding the table name to the TOC tag, so that it now reads
      "$schema $table $object".
      Reviewed-by: default avatarMichael Paquier <michael.paquier@gmail.com>
      3b3fcc4e
    • Tom Lane's avatar
      Run pgindent on a batch of (mostly-planner-related) source files. · de94e2af
      Tom Lane authored
      Getting annoyed at the amount of unrelated chatter I get from pgindent'ing
      Rowley's unique-joins patch.  Re-indent all the files it touches.
      de94e2af
    • Simon Riggs's avatar
      d25379eb
    • Fujii Masao's avatar
      Use proper format specifier %X/%X for LSN, again. · ead9963c
      Fujii Masao authored
      Commit cee31f5f fixed this problem, but commit 989be081 accidentally
      reverted the fix.
      
      Thomas Munro
      ead9963c
    • Simon Riggs's avatar
      Revert bf08f229 · cac0e366
      Simon Riggs authored
      Remove recent changes to logging XLOG_RUNNING_XACTS by request.
      cac0e366
    • Simon Riggs's avatar
      Generic Messages for Logical Decoding · 3fe3511d
      Simon Riggs authored
      API and mechanism to allow generic messages to be inserted into WAL that are
      intended to be read by logical decoding plugins. This commit adds an optional
      new callback to the logical decoding API.
      
      Messages are either text or bytea. Messages can be transactional, or not, and
      are identified by a prefix to allow multiple concurrent decoding plugins.
      
      (Not to be confused with Generic WAL records, which are intended to allow crash
      recovery of extensible objects.)
      
      Author: Petr Jelinek and Andres Freund
      Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs
      Discussion: 5685F999.6010202@2ndquadrant.com
      3fe3511d
    • Fujii Masao's avatar
      Support multiple synchronous standby servers. · 989be081
      Fujii Masao authored
      Previously synchronous replication offered only the ability to confirm
      that all changes made by a transaction had been transferred to at most
      one synchronous standby server.
      
      This commit extends synchronous replication so that it supports multiple
      synchronous standby servers. It enables users to consider one or more
      standby servers as synchronous, and increase the level of transaction
      durability by ensuring that transaction commits wait for replies from
      all of those synchronous standbys.
      
      Multiple synchronous standby servers are configured in
      synchronous_standby_names which is extended to support new syntax of
      'num_sync ( standby_name [ , ... ] )', where num_sync specifies
      the number of synchronous standbys that transaction commits need to
      wait for replies from and standby_name is the name of a standby
      server.
      
      The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
      is also still supported. It's the same as new syntax with num_sync=1.
      
      This commit doesn't include "quorum commit" feature which was discussed
      in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
      synchronous_standby_names determines the priority of each standby for
      being chosen as a synchronous standby. The standbys whose names appear
      earlier in the list are given higher priority and will be considered as
      synchronous. Other standby servers appearing later in this list
      represent potential synchronous standbys.
      
      The regression test for multiple synchronous standbys is not included
      in this commit. It should come later.
      
      Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
      Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
      Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
      Rajeev Rastogi
      
      Many thanks to the various individuals who were involved in
      discussing and developing this feature.
      989be081
  3. 05 Apr, 2016 9 commits
    • Alvaro Herrera's avatar
      Fix broken ALTER INDEX documentation · 2143f5e1
      Alvaro Herrera authored
      Commit b8a91d9d put the description of the new IF EXISTS clause in the
      wrong place -- move it where it belongs.
      
      Backpatch to 9.2.
      2143f5e1
    • Alvaro Herrera's avatar
      Support ALTER THING .. DEPENDS ON EXTENSION · f2fcad27
      Alvaro Herrera authored
      This introduces a new dependency type which marks an object as depending
      on an extension, such that if the extension is dropped, the object
      automatically goes away; and also, if the database is dumped, the object
      is included in the dump output.  Currently the grammar supports this for
      indexes, triggers, materialized views and functions only, although the
      utility code is generic so adding support for more object types is a
      matter of touching the parser rules only.
      
      Author: Abhijit Menon-Sen
      Reviewed-by: Alexander Korotkov, Álvaro Herrera
      Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
      f2fcad27
    • Robert Haas's avatar
      Fix parallel-safety code for parallel aggregation. · 41ea0c23
      Robert Haas authored
      has_parallel_hazard() was ignoring the proparallel markings for
      aggregates, which is no good.  Fix that.  There was no way to mark
      an aggregate as actually being parallel-safe, either, so add a
      PARALLEL option to CREATE AGGREGATE.
      
      Patch by me, reviewed by David Rowley.
      41ea0c23
    • Robert Haas's avatar
      Align all shared memory allocations to cache line boundaries. · 09adc9a8
      Robert Haas authored
      Experimentation shows this only costs about 6kB, which seems well
      worth it given the major performance effects that can be caused
      by insufficient alignment, especially on larger systems.
      
      Discussion: 14166.1458924422@sss.pgh.pa.us
      09adc9a8
    • Tom Lane's avatar
      Fix PL/Python for recursion and interleaved set-returning functions. · 1d2fe56e
      Tom Lane authored
      PL/Python failed if a PL/Python function was invoked recursively via SPI,
      since arguments are passed to the function in its global dictionary
      (a horrible decision that's far too ancient to undo) and it would delete
      those dictionary entries on function exit, leaving the outer recursion
      level(s) without any arguments.  Not deleting them would be little better,
      since the outer levels would then see the innermost level's arguments.
      
      Since PL/Python uses ValuePerCall mode for evaluating set-returning
      functions, it's possible for multiple executions of the same SRF to be
      interleaved within a query.  PL/Python failed in such a case, because
      it stored only one iterator per function, directly in the function's
      PLyProcedure struct.  Moreover, one interleaved instance of the SRF
      would see argument values that should belong to another.
      
      Hence, invent code for saving and restoring the argument entries.  To fix
      the recursion case, we only need to save at recursive entry and restore
      at recursive exit, so the overhead in non-recursive cases is negligible.
      To fix the SRF case, we have to save when suspending a SRF and restore
      when resuming it, which is potentially not negligible; but fortunately
      this is mostly a matter of manipulating Python object refcounts and
      should not involve much physical data copying.
      
      Also, store the Python iterator and saved argument values in a structure
      associated with the SRF call site rather than the function itself.  This
      requires adding a memory context deletion callback to ensure that the SRF
      state is cleaned up if the calling query exits before running the SRF to
      completion.  Without that we'd leak a refcount to the iterator object in
      such a case, resulting in session-lifespan memory leakage.  (In the
      pre-existing code, there was no memory leak because there was only one
      iterator pointer, but what would happen is that the previous iterator
      would be resumed by the next query attempting to use the SRF.  Hardly the
      semantics we want.)
      
      We can buy back some of whatever overhead we've added by getting rid of
      PLy_function_delete_args(), which seems a useless activity: there is no
      need to delete argument entries from the global dictionary on exit,
      since the next time anyone would see the global dict is on the next
      fresh call of the PL/Python function, at which time we'd overwrite those
      entries with new arg values anyway.
      
      Also clean up some really ugly coding in the SRF implementation, including
      such gems as returning directly out of a PG_TRY block.  (The only reason
      that failed to crash hard was that all existing call sites immediately
      exited their own PG_TRY blocks, popping the dangling longjmp pointer before
      there was any chance of it being used.)
      
      In principle this is a bug fix; but it seems a bit too invasive relative to
      its value for a back-patch, and besides the fix depends on memory context
      callbacks so it could not go back further than 9.5 anyway.
      
      Alexey Grishchenko and Tom Lane
      1d2fe56e
    • Robert Haas's avatar
      Add parallel query support functions for assorted aggregates. · 11c8669c
      Robert Haas authored
      This lets us use parallel aggregate for a variety of useful cases
      that didn't work before, like sum(int8), sum(numeric), several
      versions of avg(), and various other functions.
      
      Add some regression tests, as well, testing the general sanity of
      these and future catalog entries.
      
      David Rowley, reviewed by Tomas Vondra, with a few further changes
      by me.
      11c8669c
    • Magnus Hagander's avatar
      Implement backup API functions for non-exclusive backups · 71176854
      Magnus Hagander authored
      Previously non-exclusive backups had to be done using the replication protocol
      and pg_basebackup. With this commit it's now possible to make them using
      pg_start_backup/pg_stop_backup as well, as long as the backup program can
      maintain a persistent connection to the database.
      
      Doing this, backup_label and tablespace_map are returned as results from
      pg_stop_backup() instead of being written to the data directory. This makes
      the server safe from a crash during an ongoing backup, which can be a problem
      with exclusive backups.
      
      The old syntax of the functions remain and work exactly as before, but since the
      new syntax is safer this should eventually be deprecated and removed.
      
      Only reference documentation is included. The main section on backup still needs
      to be rewritten to cover this, but since that is already scheduled for a separate
      large rewrite, it's not included in this patch.
      
      Reviewed by David Steele and Amit Kapila
      71176854
    • Magnus Hagander's avatar
      Fix typo · 9457b591
      Magnus Hagander authored
      Etsuro Fujita
      9457b591
    • Peter Eisentraut's avatar
      Fix error message from wal_level value renaming · 4dcd4da9
      Peter Eisentraut authored
      found by Ian Barwick
      4dcd4da9
  4. 04 Apr, 2016 12 commits
    • Tom Lane's avatar
      Disallow newlines in parameter values to be set in ALTER SYSTEM. · 99f3b561
      Tom Lane authored
      As noted by Julian Schauder in bug #14063, the configuration-file parser
      doesn't support embedded newlines in string literals.  While there might
      someday be a good reason to remove that restriction, there doesn't seem
      to be one right now.  However, ALTER SYSTEM SET could accept strings
      containing newlines, since many of the variable-specific value-checking
      routines would just see a newline as whitespace.  This led to writing a
      postgresql.auto.conf file that was broken and had to be removed manually.
      
      Pending a reason to work harder, just throw an error if someone tries this.
      
      In passing, fix several places in the ALTER SYSTEM logic that failed to
      provide an errcode() for an ereport(), and thus would falsely log the
      failure as an internal XX000 error.
      
      Back-patch to 9.4 where ALTER SYSTEM was introduced.
      99f3b561
    • Alvaro Herrera's avatar
      Display WAL pointer in rm_redo error callback · 890614d2
      Alvaro Herrera authored
      This makes it easier to identify the source of a recovery problem
      in case of a bug or data corruption.
      890614d2
    • Tom Lane's avatar
      Add a few comments about ANALYZE's strategy for collecting MCVs. · 3c69b33f
      Tom Lane authored
      Alex Shulgin complained that the underlying strategy wasn't all that
      apparent, particularly not the fact that we intentionally have two
      code paths depending on whether we think the column has a limited set
      of possible values or not.  Try to make it clearer.
      3c69b33f
    • Tom Lane's avatar
      Partially revert commit 3d3bf62f. · 391159e0
      Tom Lane authored
      On reflection, the pre-existing logic in ANALYZE is specifically meant to
      compare the frequency of a candidate MCV against the estimated frequency of
      a random distinct value across the whole table.  The change to compare it
      against the average frequency of values actually seen in the sample doesn't
      seem very principled, and if anything it would make us less likely not more
      likely to consider a value an MCV.  So revert that, but keep the aspect of
      considering only nonnull values, which definitely is correct.
      
      In passing, rename the local variables in these stanzas to
      "ndistinct_table", to avoid confusion with the "ndistinct" that appears at
      an outer scope in compute_scalar_stats.
      391159e0
    • Alvaro Herrera's avatar
      Silence compiler warning · c9ff752a
      Alvaro Herrera authored
      Reported by Peter Eisentraut to occur on 32bit systems
      c9ff752a
    • Tom Lane's avatar
      Add a \gexec command to psql for evaluation of computed queries. · 2bbe9112
      Tom Lane authored
      \gexec executes the just-entered query, like \g, but instead of printing
      the results it takes each field as a SQL command to send to the server.
      Computing a series of queries to be executed is a fairly common thing,
      but up to now you always had to resort to kluges like writing the queries
      to a file and then inputting the file.  Now it can be done with no
      intermediate step.
      
      The implementation is fairly straightforward except for its interaction
      with FETCH_COUNT.  ExecQueryUsingCursor isn't capable of being called
      recursively, and even if it were, its need to create a transaction
      block interferes unpleasantly with the desired behavior of \gexec after
      a failure of a generated query (i.e., that it can continue).  Therefore,
      disable use of ExecQueryUsingCursor when doing the master \gexec query.
      We can still apply it to individual generated queries, however, and there
      might be some value in doing so.
      
      While testing this feature's interaction with single-step mode, I (tgl) was
      led to conclude that SendQuery needs to recognize SIGINT (cancel_pressed)
      as a negative response to the single-step prompt.  Perhaps that's a
      back-patchable bug fix, but for now I just included it here.
      
      Corey Huinker, reviewed by Jim Nasby, Daniel Vérité, and myself
      2bbe9112
    • Tom Lane's avatar
      Introduce a LOG_SERVER_ONLY ereport level, which is never sent to client. · 66229ac0
      Tom Lane authored
      This elevel is useful for logging audit messages and similar information
      that should not be passed to the client.  It's equivalent to LOG in terms
      of decisions about logging priority in the postmaster log, but messages
      with this elevel will never be sent to the client.
      
      In the current implementation, it's just an alias for the longstanding
      COMMERROR elevel (or more accurately, we've made COMMERROR an alias for
      this).  At some point it might be interesting to allow a LOG_ONLY flag to
      be attached to any elevel, but that would be considerably more complicated,
      and it's not clear there's enough use-cases to justify the extra work.
      For now, let's just take the easy 90% solution.
      
      David Steele, reviewed by Fabien Coelho, Petr Jelínek, and myself
      66229ac0
    • Tom Lane's avatar
      Fix latent portability issue in pgwin32_dispatch_queued_signals(). · 58666ed2
      Tom Lane authored
      The first iteration of the signal-checking loop would compute sigmask(0)
      which expands to 1<<(-1) which is undefined behavior according to the
      C standard.  The lack of field reports of trouble suggest that it
      evaluates to 0 on all existing Windows compilers, but that's hardly
      something to rely on.  Since signal 0 isn't a queueable signal anyway,
      we can just make the loop iterate from 1 instead, and save a few cycles
      as well as avoiding the undefined behavior.
      
      In passing, avoid evaluating the volatile expression UNBLOCKED_SIGNAL_QUEUE
      twice in a row; there's no reason to waste cycles like that.
      
      Noted by Aleksander Alekseev, though this isn't his proposed fix.
      Back-patch to all supported branches.
      58666ed2
    • Teodor Sigaev's avatar
      Fix typo · eb7308d2
      Teodor Sigaev authored
      Michael Paquier
      eb7308d2
    • Teodor Sigaev's avatar
      fix typo · 9b27aebe
      Teodor Sigaev authored
      Andreas Ulbrich
      9b27aebe
    • Dean Rasheed's avatar
      Improve estimate of distinct values in estimate_num_groups(). · 84f9a35e
      Dean Rasheed authored
      When adjusting the estimate for the number of distinct values from a
      rel in a grouped query to take into account the selectivity of the
      rel's restrictions, use a formula that is less likely to produce
      under-estimates.
      
      The old formula simply multiplied the number of distinct values in the
      rel by the restriction selectivity, which would be correct if the
      restrictions were fully correlated with the grouping expressions, but
      can produce significant under-estimates in cases where they are not
      well correlated.
      
      The new formula is based on the random selection probability, and so
      assumes that the restrictions are not correlated with the grouping
      expressions. This is guaranteed to produce larger estimates, and of
      course risks over-estimating in cases where the restrictions are
      correlated, but that has less severe consequences than
      under-estimating, which might lead to a HashAgg that consumes an
      excessive amount of memory.
      
      This could possibly be improved upon in the future by identifying
      correlated restrictions and using a hybrid of the old and new
      formulae.
      
      Author: Tomas Vondra, with some hacking be me
      Reviewed-by: Mark Dilger, Alexander Korotkov, Dean Rasheed and Tom Lane
      Discussion: http://www.postgresql.org/message-id/flat/56CD0381.5060502@2ndquadrant.com
      84f9a35e
    • Simon Riggs's avatar
      Avoid archiving XLOG_RUNNING_XACTS on idle server · bf08f229
      Simon Riggs authored
      If archive_timeout > 0 we should avoid logging XLOG_RUNNING_XACTS if idle.
      
      Bug 13685 reported by Laurence Rowe, investigated in detail by Michael Paquier,
      though this is not his proposed fix.
      20151016203031.3019.72930@wrigleys.postgresql.org
      
      Simple non-invasive patch to allow later backpatch to 9.4 and 9.5
      bf08f229
  5. 03 Apr, 2016 6 commits
    • Tom Lane's avatar
      Clean up dubious code in contrib/seg. · a75a418d
      Tom Lane authored
      The restore() function assumed that the result of sprintf() with %e format
      would necessarily contain an 'e', which is false: what if the supplied
      number is an infinity or NaN?  If that did happen, we'd get a
      null-pointer-dereference core dump.  The case appears impossible currently,
      because seg_in() does not accept such values, and there are no seg-creating
      functions that would create one.  But it seems unwise to rely on it never
      happening in future.
      
      Quite aside from that, the code was pretty ugly: it relied on modifying a
      static format string when it could use a "*" precision argument, and it
      used strtok() entirely gratuitously, and it stripped off trailing spaces
      by hand instead of just not asking for them to begin with.
      
      Coverity noticed the potential null pointer dereference (though I wonder
      why it didn't complain years ago, since this code is ancient).
      
      Since this is just code cleanup and forestalling a hypothetical future
      bug, there seems no need for back-patching.
      a75a418d
    • Tom Lane's avatar
      Fix contrib/bloom to not fail under CLOBBER_CACHE_ALWAYS. · 8f75fd1f
      Tom Lane authored
      The code was supposing that rd_amcache wouldn't disappear from under it
      during a scan; which is wrong.  Copy the data out of the relcache rather
      than trying to reference it there.
      8f75fd1f
    • Tom Lane's avatar
      Clean up some stuff in new contrib/bloom module. · a9284849
      Tom Lane authored
      Coverity complained about implicit sign-extension in the
      BloomPageGetFreeSpace macro, probably because sizeOfBloomTuple isn't wide
      enough for size calculations.  No overflow is really possible as long as
      maxoff and sizeOfBloomTuple are small enough to represent a realistic
      situation, but it seems like a good idea to declare sizeOfBloomTuple as
      Size not int32.
      
      Add missing check on BloomPageAddItem() result, again from Coverity.
      
      Avoid core dump due to not allocating so->sign array when
      scan->numberOfKeys is zero.  Also thanks to Coverity.
      
      Use FLEXIBLE_ARRAY_MEMBER rather than declaring an array as size 1
      when it isn't necessarily.
      
      Very minor beautification of related code.
      
      Unfortunately, none of the Coverity-detected mistakes look like they
      could account for the remaining buildfarm unhappiness with this
      module.  It's barely possible that the FLEXIBLE_ARRAY_MEMBER mistake
      does account for that, if it's enabling bogus compiler optimizations;
      but I'm not terribly optimistic.  We probably still have bugs to
      find here.
      a9284849
    • Simon Riggs's avatar
      Avoid pin scan for replay of XLOG_BTREE_VACUUM in all cases · 3e4b7d87
      Simon Riggs authored
      Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require
      complex interlocking that matched the requirements on the master. This required
      an O(N) operation that became a significant problem with large indexes, causing
      replication delays of seconds or in some cases minutes while the
      XLOG_BTREE_VACUUM was replayed.
      
      This commit skips the pin scan that was previously required, by observing in
      detail when and how it is safe to do so, with full documentation. The pin
      scan is skipped only in replay; the VACUUM code path on master is not
      touched here and WAL is identical.
      
      The current commit applies in all cases, effectively replacing commit
      687f2cd7.
      3e4b7d87
    • Tom Lane's avatar
      Add psql \errverbose command to see last server error at full verbosity. · 3cc38ca7
      Tom Lane authored
      Often, upon getting an unexpected error in psql, one's first wish is that
      the verbosity setting had been higher; for example, to be able to see the
      schema-name field or the server code location info.  Up to now the only way
      has been to adjust the VERBOSITY variable and repeat the failing query.
      That's a pain, and it doesn't work if the error isn't reproducible.
      
      This commit adds a psql feature that redisplays the most recent server
      error at full verbosity, without needing to make any variable changes or
      re-execute the failed command.  We just need to hang onto the latest error
      PGresult in case the user executes \errverbose, and then apply libpq's
      new PQresultVerboseErrorMessage() function to it.  This will consume
      some trivial amount of psql memory, but otherwise the cost when the
      feature isn't used should be negligible.
      
      Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
      3cc38ca7
    • Tom Lane's avatar
      Add libpq support for recreating an error message with different verbosity. · e3161b23
      Tom Lane authored
      Often, upon getting an unexpected error in psql, one's first wish is that
      the verbosity setting had been higher; for example, to be able to see the
      schema-name field or the server code location info.  Up to now the only way
      has been to adjust the VERBOSITY variable and repeat the failing query.
      That's a pain, and it doesn't work if the error isn't reproducible.
      
      This commit adds support in libpq for regenerating the error message for
      an existing error PGresult at any desired verbosity level.  This is almost
      just a matter of refactoring the existing code into a subroutine, but there
      is one bit of possibly-needed information that was not getting put into
      PGresults: the text of the last query sent to the server.  We must add that
      string to the contents of an error PGresult.  But we only need to save it
      if it might be used, which with the existing error-formatting code only
      happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is
      probably pretty rare for errors in production situations.  So really the
      overhead when the feature isn't used should be negligible.
      
      Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
      e3161b23
  6. 02 Apr, 2016 4 commits