  1. 01 May, 2017 8 commits
    • Improve function header comment for create_singleton_array(). · 54affb41
      Tom Lane authored
      Mentioning the caller is neither future-proof nor an adequate substitute
      for giving an API specification.  Per gripe from Neha Khatri, though
      I changed the patch around some.
      
      Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
    • Reduce semijoins with unique inner relations to plain inner joins. · 92a43e48
      Tom Lane authored
      If the inner relation can be proven unique, that is it can have no more
      than one matching row for any row of the outer query, then we might as
      well implement the semijoin as a plain inner join, allowing substantially
      more freedom to the planner.  This is a form of outer join strength
      reduction, but it can't be implemented in reduce_outer_joins() because
      we don't have enough info about the individual relations at that stage.
      Instead do it much like remove_useless_joins(): once we've built base
      relations, we can make another pass over the SpecialJoinInfo list and
      get rid of any entries representing reducible semijoins.
      
      This is essentially a follow-on to the inner-unique patch (commit 9c7f5229)
      and makes use of the proof machinery that that patch created.  We need only
      minor refactoring of innerrel_is_unique's API to support this usage.
      
      Per performance complaint from Teodor Sigaev.
      
      Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
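
The equivalence this commit relies on can be illustrated outside the planner. The following is only a minimal sketch with made-up rows and hypothetical helper names, not PostgreSQL code: when the inner side has at most one match per outer row, a semijoin and a plain inner join (projected onto the outer columns) return the same rows, and they diverge as soon as the inner side contains duplicates.

```python
# Hypothetical data: "inner" is unique on its join key "id".
outer = [
    {"oid": 1, "ref": 10},
    {"oid": 2, "ref": 20},
    {"oid": 3, "ref": 10},
    {"oid": 4, "ref": 99},
]
inner = [{"id": 10}, {"id": 20}, {"id": 30}]


def semijoin(outer_rows, inner_rows):
    """Emit each outer row once if at least one inner row matches it."""
    return [o for o in outer_rows
            if any(o["ref"] == i["id"] for i in inner_rows)]


def inner_join_outer_columns(outer_rows, inner_rows):
    """Plain inner join, keeping only the outer columns of each match."""
    return [o for o in outer_rows for i in inner_rows if o["ref"] == i["id"]]


def is_unique(inner_rows):
    """True if no join key appears more than once on the inner side."""
    keys = [i["id"] for i in inner_rows]
    return len(keys) == len(set(keys))


semi = semijoin(outer, inner)
plain = inner_join_outer_columns(outer, inner)

# With a duplicate inner key the two join types no longer agree.
dup_inner = inner + [{"id": 10}]
semi_dup = semijoin(outer, dup_inner)
plain_dup = inner_join_outer_columns(outer, dup_inner)
```

With the unique inner side both joins return the same three outer rows; with the duplicated key the inner join repeats the matching rows while the semijoin does not, which is why the uniqueness proof is required before the reduction is safe.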
    • Fix mis-optimization of semijoins with more than one LHS relation. · 2057a58d
      Tom Lane authored
      The inner-unique patch (commit 9c7f5229) supposed that if we're
      considering a JOIN_UNIQUE_INNER join path, we can always set inner_unique
      for the join, because the inner path produced by create_unique_path should
      be unique relative to the outer relation.  However, that's true only if
      we're considering joining to the whole outer relation --- otherwise we may
      be applying only some of the join quals, and so the inner path might be
      non-unique from the perspective of this join.  Adjust the test to only
      believe that we can set inner_unique if we have the whole semijoin LHS on
      the outer side.
      
      There is more that can be done in this area, but this commit is only
      intended to provide the minimal fix needed to get correct plans.
      
      Per report from Teodor Sigaev.  Thanks to David Rowley for preliminary
      investigation.
      
      Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
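
The adjusted test described above amounts to a subset check. This is a rough sketch using plain Python sets in place of the planner's relation-ID sets; the function and parameter names are hypothetical and are not taken from the patch.

```python
def can_set_inner_unique(semijoin_lhs_relids, outer_relids):
    """Trust inner uniqueness only if the whole semijoin LHS is on the outer side.

    Both arguments are sets of relation identifiers (hypothetical stand-ins
    for the planner's relid sets).
    """
    return set(semijoin_lhs_relids) <= set(outer_relids)


lhs = {"a", "b"}                                        # semijoin LHS has two relations
partial = can_set_inner_unique(lhs, {"a"})              # joining to only part of the LHS
whole = can_set_inner_unique(lhs, {"a", "b"})           # joining to the whole LHS
superset = can_set_inner_unique(lhs, {"a", "b", "c"})   # LHS plus another relation
```

Only the partial case is rejected: joining against just one of the LHS relations applies only some of the join quals, so the inner path may not be unique for that join.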
    • Update time zone data files to tzdata release 2017b. · 74a20d0a
      Tom Lane authored
      DST law changes in Chile, Haiti, and Mongolia.  Historical corrections for
      Ecuador, Kazakhstan, Liberia, and Spain.
      
      The IANA crew continue their campaign to replace invented time zone
      abbreviations with numeric GMT offsets.  This update changes numerous zones
      in South America, the Pacific and Indian oceans, and some Asian and Middle
      Eastern zones.  I kept these abbreviations in the tznames/ data files,
      however, so that we will still accept them for input.  (We may want to
      start trimming those files someday, but I think we should wait for the
      upstream dust to settle before deciding what to do.)
      
      In passing, add MESZ (Mitteleuropaeische Sommerzeit) to the tznames lists;
      since we accept MEZ (Mitteleuropaeische Zeit) it seems rather strange not
      to take the other one.  And fix some incorrect, or at least obsolete,
      comments that certain abbreviations are not traceable to the IANA data.
    • libpq: Fix inadvertent change in .pgpass lookup behavior. · bdac9836
      Robert Haas authored
      Commit 274bb2b3 caused password file
      lookups to use the hostaddr in preference to the host, but that was
      not intended and the documented behavior is the opposite.
      
      Report and patch by Kyotaro Horiguchi.
      
      Discussion: http://postgr.es/m/20170428.165432.60857995.horiguchi.kyotaro@lab.ntt.co.jp
    • Allow vcregress.pl to run an arbitrary TAP test set · fed6df48
      Andrew Dunstan authored
      Currently the only provision is for running the bin checks in a single
      step. Now these tests can be run individually, as well as tests in other
      locations (e.g. src/test/recovery).
      
      Also provide for suppressing unnecessary temp installs by setting the
      NO_TEMP_INSTALL environment variable just as the Makefiles do.
      
      Backpatch to 9.4.
    • Fix logical replication launcher wake up and reset · 9414e41e
      Peter Eisentraut authored
      After the logical replication launcher was told to wake up at
      commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake
      up was not reset, so it would be woken up at every following commit as
      well.  So fix that by resetting the flag.
      
      Also, we don't need to wake up anything if the transaction was rolled
      back.  Just reset the flag in that case.
      
      Author: Masahiko Sawada <sawada.mshk@gmail.com>
      Reported-by: Fujii Masao <masao.fujii@gmail.com>
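
The flag handling this commit describes can be sketched as a tiny state machine. The class and method names below are hypothetical illustrations, not the identifiers used in the PostgreSQL source: the flag is consumed at commit, and simply cleared at rollback.

```python
class LauncherWakeupFlag:
    """Hypothetical stand-in for the launcher's wake-at-commit flag."""

    def __init__(self):
        self.wakeup_at_commit = False
        self.wakeups_sent = 0

    def request_wakeup(self):
        # Called by a command such as CREATE SUBSCRIPTION.
        self.wakeup_at_commit = True

    def at_end_of_transaction(self, committed):
        # Wake the launcher only for a committed transaction that asked for it.
        if committed and self.wakeup_at_commit:
            self.wakeups_sent += 1
        # Reset on commit and on rollback, so later commits do not wake it again.
        self.wakeup_at_commit = False


flag = LauncherWakeupFlag()
flag.request_wakeup()
flag.at_end_of_transaction(committed=True)    # wakes the launcher once
flag.at_end_of_transaction(committed=True)    # no request pending: no wakeup
flag.request_wakeup()
flag.at_end_of_transaction(committed=False)   # rolled back: flag is just reset
flag.at_end_of_transaction(committed=True)    # still no wakeup
```

Without the unconditional reset at the end of `at_end_of_transaction`, every later commit would count as another wakeup, which is the behavior the commit message reports.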
    • Fire per-statement triggers on partitioned tables. · e180c8aa
      Robert Haas authored
      Even though no actual tuples are ever inserted into a partitioned
      table (the actual tuples are in the partitions, not the partitioned
      table itself), we still need to have a ResultRelInfo for the
      partitioned table, or per-statement triggers won't get fired.
      
      Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.
      
      Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
  2. 30 Apr, 2017 3 commits
    • Sync our copy of the timezone library with IANA release tzcode2017b. · e18b2c48
      Tom Lane authored
      zic no longer mishandles some transitions in January 2038 when it
      attempts to work around Qt bug 53071.  This fixes a bug affecting
      Pacific/Tongatapu that was introduced in zic 2016e.  localtime.c
      now contains a workaround, useful when loading a file generated by
      a buggy zic.
      
      There are assorted cosmetic changes as well, notably relocation
      of a bunch of #defines.
    • Fix possible null pointer dereference or invalid warning message. · 12d11432
      Tom Lane authored
      Thinko in commit de438971: this warning message references the wrong
      "LogicalRepWorker *" variable.  This would often result in a core dump,
      but if it didn't, the message would show the wrong subscription OID.
      
      In passing, adjust the message text to format a subscription OID
      similarly to how that's done elsewhere in the function; and fix
      grammatical issues in some nearby messages.
      
      Per Coverity testing.
    • Micro-optimize some slower queries in the opr_sanity regression test. · c2384421
      Tom Lane authored
      Convert the binary_coercible() and physically_coercible() functions from
      SQL to plpgsql.  It's not that plpgsql is inherently better at doing
      queries; if you simply convert the previous single SQL query into one
      RETURN expression, it's no faster.  The problem with the existing code
      is that it fools the plancache into deciding that it's worth re-planning
      the query every time, since constant-folding with a concrete value for $2
      allows elimination of at least one sub-SELECT.  In reality that's using the
      planner to do the equivalent of a few runtime boolean tests, causing the
      function to run much slower than it should.  Splitting the AND/OR logic
      into separate plpgsql statements allows each if-expression to acquire a
      static plan.
      
      Also, get rid of some uses of obj_description() in favor of explicitly
      joining to pg_description, allowing the joins to be optimized better.
      (Someday we might improve the SQL-function-inlining logic enough that
      this happens automatically, but today is not that day.)
      
      Together, these changes reduce the runtime of the opr_sanity regression
      test by about a factor of two on one of my slower machines.  They don't
      seem to help as much on a fast machine, but this should at least benefit
      the buildfarm.
  3. 28 Apr, 2017 9 commits
  4. 27 Apr, 2017 12 commits
    • Don't build full initial logical decoding snapshot if NOEXPORT_SNAPSHOT. · ab9c4338
      Andres Freund authored
      Earlier commits (56e19d93 and 2bef06d5) make it cheaper to
      create a logical slot if not exporting the initial snapshot.  If
      NOEXPORT_SNAPSHOT is specified, we can skip the overhead, not just
      when creating a slot via sql (which can't export snapshots).  As
      NOEXPORT_SNAPSHOT has only recently been introduced, this shouldn't be
      backpatched.
    • Don't use on-disk snapshots for exported logical decoding snapshot. · 56e19d93
      Andres Freund authored
      Logical decoding stores historical snapshots on disk, so that logical
      decoding can restart without having to reconstruct a snapshot from
      scratch (for which the resources are not guaranteed to be present
      anymore).  These serialized snapshots were also used when creating a
      new slot via the walsender interface, which can export a "full"
      snapshot (i.e. one that can read all tables, not just catalog ones).
      
      The problem is that the serialized snapshots are only useful for
      catalogs and not for normal user tables.  Thus the use of such a
      serialized snapshot could result in an inconsistent snapshot being
      exported, which could lead to queries returning wrong data.  This
      would only happen if logical slots are created while another logical
      slot already exists.
      
      Author: Petr Jelinek
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
      Backport: 9.4, where logical decoding was introduced.
    • Avoid slow shutdown of pg_basebackup. · 7834d20b
      Tom Lane authored
      pg_basebackup's child process did not pay any attention to the pipe
      from its parent while waiting for input from the source server.
      If no server data was arriving, it would only wake up and check the
      pipe every standby_message_timeout or so.  This creates a problem
      since the parent process might determine and send the desired stop
      position only after the server has reached end-of-WAL and stopped
      sending data.  In the src/test/recovery regression tests, the timing
      is repeatably such that it takes nearly 10 seconds for the child
      process to realize that it should shut down.  It's not clear how
      often that would happen in real-world cases, but it sure seems like
      a bug --- and if the user turns off standby_message_timeout or sets
      it very large, the delay could be a lot worse.
      
      To fix, expand the StreamCtl API to allow the pipe input FD to be
      passed down to the low-level wait routine, and watch both sockets
      when sleeping.
      
      (Note: AFAICS this issue doesn't affect the Windows port, since
      it doesn't rely on a pipe to transfer the stop position to the
      child thread.)
      
      Discussion: https://postgr.es/m/6456.1493263884@sss.pgh.pa.us
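
The idea of watching both descriptors while sleeping can be sketched with select(2). This is a minimal illustration for Unix-like systems only; the socket pair stands in for the server connection, the pipe for the parent-to-child channel, and every name here is hypothetical rather than part of the StreamCtl API.

```python
import os
import select
import socket
import time


def wait_for_input(server_sock, stop_pipe_fd, timeout):
    """Sleep until the server socket or the parent's pipe becomes readable."""
    readable, _, _ = select.select([server_sock, stop_pipe_fd], [], [], timeout)
    return readable


# Stand-ins: an idle "server" connection and a pipe from the parent process.
server_side, client_side = socket.socketpair()
pipe_read, pipe_write = os.pipe()

# The parent sends the stop position while the server is sending nothing.
os.write(pipe_write, b"stop")

start = time.monotonic()
ready = wait_for_input(client_side, pipe_read, timeout=10.0)
elapsed = time.monotonic() - start

# Clean up the descriptors used by the sketch.
server_side.close()
client_side.close()
os.close(pipe_read)
os.close(pipe_write)
```

Because the pipe is part of the wait set, the call returns as soon as the parent writes, instead of sleeping until the timeout expires with no server data arriving.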
    • Fix bug so logical rep launcher saves correctly time of last startup of worker. · 9f11fcec
      Fujii Masao authored
      Previously the logical replication launcher stored the timestamp of the
      last worker start in the local variable "last_start_time", in order to
      check whether wal_retrieve_retry_interval had elapsed since the last
      worker startup. If it has elapsed, the launcher reads pg_subscription
      and starts a new worker if necessary. This limits worker startup to
      once per wal_retrieve_retry_interval.
      
      The bug was that the variable "last_start_time" was defined and
      always initialized with 0 at the beginning of the launcher's main loop.
      So even if it's set to the last timestamp in later phase of the loop,
      it's always reset to 0. Therefore the launcher could not check
      correctly whether wal_retrieve_retry_interval elapsed since
      the last startup.
      
      This patch moves the variable "last_start_time" outside the main loop
      so that it will not be reset.
      
      Reviewed-by: Petr Jelinek
      Discussion: http://postgr.es/m/CAHGQGwGJrPO++XM4mFENAwpy1eGXKsGdguYv43GUgLgU-x8nTQ@mail.gmail.com
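
The effect of the variable's placement can be reproduced in a few lines. This is a simplified sketch with integer clock ticks instead of timestamps; the function name and the `reset_each_iteration` switch are hypothetical and exist only to contrast the old and new behavior.

```python
def launcher_loop(ticks, retry_interval, reset_each_iteration):
    """Count how many times a worker would be started over the given ticks."""
    starts = 0
    last_start_time = 0               # fixed version: initialized once, outside the loop
    for now in ticks:
        if reset_each_iteration:
            last_start_time = 0       # the bug: re-initialized on every iteration
        if now - last_start_time >= retry_interval:
            starts += 1
            last_start_time = now
    return starts


ticks = range(10, 20)                 # ten consecutive loop iterations
buggy_starts = launcher_loop(ticks, retry_interval=5, reset_each_iteration=True)
fixed_starts = launcher_loop(ticks, retry_interval=5, reset_each_iteration=False)
```

In the buggy variant the interval check always compares against 0, so every iteration starts a worker; in the fixed variant starts are spaced at least `retry_interval` apart.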
    • Cope with glibc too old to have epoll_create1(). · 82ebbeb0
      Tom Lane authored
      Commit fa31b6f4 supposed that we didn't have to worry about that
      anymore, but it seems that RHEL5 is like that, and that's still
      a supported platform.  Put back the prior coding under an #ifdef,
      adding an explicit fcntl() to retain the desired CLOEXEC property.
      
      Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us
    • Preserve required !catalog tuples while computing initial decoding snapshot. · 2bef06d5
      Andres Freund authored
      The logical decoding machinery already preserved all the required
      catalog tuples, which is sufficient in the course of normal logical
      decoding, but did not guarantee that non-catalog tuples were preserved
      during computation of the initial snapshot when creating a slot over
      the replication protocol.
      
      This could cause a corrupted initial snapshot to be exported.  The
      time window for issues is usually not terribly large, but on a busy
      server it's perfectly possible to hit it.  Ongoing decoding is not
      affected by this bug.
      
      To avoid increased overhead for the SQL API, only retain additional
      tuples when a logical slot is being created over the replication
      protocol.  To do so this commit changes the signature of
      CreateInitDecodingContext(), but it seems unlikely that it's being
      used in an extension, so that's probably ok.
      
      In a drive-by fix, fix handling of
      ReplicationSlotsComputeRequiredXmin's already_locked argument, which
      should only apply to ProcArrayLock, not ReplicationSlotControlLock.
      
      Reported-By: Erik Rijkers
      Analyzed-By: Petr Jelinek
      Author: Petr Jelinek, heavily editorialized by Andres Freund
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
      Backport: 9.4, where logical decoding was introduced.
    • Make latch.c more paranoid about child-process cases. · fa31b6f4
      Tom Lane authored
      Although the postmaster doesn't currently create a self-pipe or any
      latches, there's discussion of it doing so in future.  It's also
      conceivable that a shared_preload_libraries extension would try to
      create such a thing in the postmaster process today.  In that case
      the self-pipe FDs would be inherited by forked child processes.
      latch.c was entirely unprepared for such a case and could suffer an
      assertion failure, or worse try to use the inherited pipe if somebody
      called WaitLatch without having called InitializeLatchSupport in that
      process.  Make it keep track of whether InitializeLatchSupport has been
      called in the *current* process, and do the right thing if state has
      been inherited from a parent.
      
      Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe,
      as well as epoll event sets).  This ensures that child processes spawned
      in backends, the archiver, etc cannot accidentally or intentionally mess
      with these FDs.  It also ensures that we end up with the right state
      for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't
      know to close the postmaster's self-pipe FDs.
      
      Back-patch to 9.6, mainly to keep latch.c looking similar in all branches
      it exists in.
      
      Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us
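
Setting close-on-exec on a pipe's descriptors can be sketched with fcntl(2). This is a Unix-only illustration with hypothetical helper names, not the latch.c code. One assumption to note: Python already creates descriptors as non-inheritable, so the sketch first clears that to mimic what a plain C pipe() call would return.

```python
import fcntl
import os


def set_cloexec(fd):
    """Add FD_CLOEXEC to a descriptor's flags, keeping any other flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Report whether FD_CLOEXEC is currently set on a descriptor."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


read_fd, write_fd = os.pipe()

# Mimic a plain C pipe(): make both ends inheritable across exec.
for fd in (read_fd, write_fd):
    os.set_inheritable(fd, True)
before = [is_cloexec(fd) for fd in (read_fd, write_fd)]

# Apply close-on-exec so programs spawned via exec cannot touch these FDs.
for fd in (read_fd, write_fd):
    set_cloexec(fd)
after = [is_cloexec(fd) for fd in (read_fd, write_fd)]

os.close(read_fd)
os.close(write_fd)
```

The same read-modify-write on the descriptor flags is what an explicit fcntl() call provides on platforms where a descriptor cannot be created with close-on-exec already set.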
    • doc: PG10 release note typo fix · a311d2a0
      Bruce Momjian authored
      Reported-by: daniel.westermann
    • doc PG10rel: adjust hash index commits and add parallel subquery · f8ab08ad
      Bruce Momjian authored
      Reported-by: Amit Kapila
    • Rework handling of subtransactions in 2PC recovery · 49e92815
      Simon Riggs authored
      The bug fixed by 0874d4f3
      caused us to question and rework the handling of
      subtransactions in 2PC during and at end of recovery.
      Patch adds checks and tests to ensure no further bugs.
      
      This effectively removes the temporary measure put in place
      by 546c13e1.
      
      Author: Simon Riggs
      Reviewed-by: Tom Lane, Michael Paquier
      Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com
    • Additional tests for subtransactions in recovery · 0352c15e
      Simon Riggs authored
      Tests for normal and prepared transactions
      
      Author: Nikhil Sontakke, placed in new test file by me
    • Fix typo in comment · 6c9bd27a
      Peter Eisentraut authored
      Author: Masahiko Sawada <sawada.mshk@gmail.com>
  5. 26 Apr, 2017 8 commits
    • Allow multiple bgworkers to be launched per postmaster iteration. · aa1351f1
      Tom Lane authored
      Previously, maybe_start_bgworker() would launch at most one bgworker
      process per call, on the grounds that the postmaster might otherwise
      neglect its other duties for too long.  However, that seems overly
      conservative, especially since bad effects only become obvious when
      many hundreds of bgworkers need to be launched at once.  On the other
      side of the coin is that the existing logic could result in substantial
      delay of bgworker launches, because ServerLoop isn't guaranteed to
      iterate immediately after a signal arrives.  (My attempt to fix that
      by using pselect(2) encountered too many portability question marks,
      and in any case could not help on platforms without pselect().)
      One could also question the wisdom of using an O(N^2) processing
      method if the system is intended to support so many bgworkers.
      
      As a compromise, allow that function to launch up to 100 bgworkers
      per call (and in consequence, rename it to maybe_start_bgworkers).
      This will allow any normal parallel-query request for workers
      to be satisfied immediately during sigusr1_handler, avoiding the
      question of whether ServerLoop will be able to launch more promptly.
      
      There is talk of rewriting the postmaster to use a WaitEventSet to
      avoid the signal-response-delay problem, but I'd argue that this change
      should be kept even after that happens (if it ever does).
      
      Backpatch to 9.6 where parallel query was added.  The issue exists
      before that, but previous uses of bgworkers typically aren't as
      sensitive to how quickly they get launched.
      
      Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
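
The compromise described above, a bounded batch per call, can be sketched as follows. The function name and the limit of 100 come from the commit message; the signature, the `launch` callback, and the return value are hypothetical simplifications rather than the postmaster's actual interface.

```python
MAX_BGWORKERS_TO_LAUNCH = 100   # per-call cap stated in the commit message


def maybe_start_bgworkers(pending, launch, limit=MAX_BGWORKERS_TO_LAUNCH):
    """Launch up to `limit` pending workers; return True if more remain.

    `pending` is a list of worker requests and `launch` a callable that
    starts one of them (both are illustrative stand-ins).
    """
    launched = 0
    while pending and launched < limit:
        launch(pending.pop(0))
        launched += 1
    return bool(pending)


pending = list(range(250))      # 250 hypothetical worker requests
started = []

more_after_first = maybe_start_bgworkers(pending, started.append)
count_after_first = len(started)
more_after_second = maybe_start_bgworkers(pending, started.append)
more_after_third = maybe_start_bgworkers(pending, started.append)
```

A typical parallel-query request needs far fewer than 100 workers, so it is satisfied in a single call, while a very large backlog is still spread over several calls so the caller can attend to its other duties in between.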
    • fda4fec5
      Bruce Momjian authored
    • pg_get_partkeydef: return NULL for non-partitions · 0c76c246
      Stephen Frost authored
      Our general rule for pg_get_X(oid) functions is to simply return NULL
      when passed an invalid or inappropriate OID.  Teach pg_get_partkeydef to
      do this also, making it easier for users to use this function when
      querying against tables with both partitions and non-partitions (such as
      pg_class).
      
      As a concrete example, this makes pg_dump's life a little easier.
      
      Author: Amit Langote
    • Silence compiler warning induced by commit de438971. · 49da0067
      Tom Lane authored
      Smarter compilers can see that "slot" can't be used uninitialized,
      but some popular ones cannot.  Noted by Jeff Janes.
    • doc: ALTER SUBSCRIPTION documentation fixes · e315346d
      Peter Eisentraut authored
      WITH is optional for REFRESH PUBLICATION.  Also, remove a spurious
      bracket and fix some punctuation.
      
      Author: Euler Taveira <euler@timbira.com.br>
    • Fix query that gets remote relation info · 61ecc90b
      Peter Eisentraut authored
      The publisher relation can be chosen incorrectly if more than one
      relation with the same name exists in different schemas.
      
      Author: Euler Taveira <euler@timbira.com.br>
    • Spelling fixes in code comments · e495c168
      Peter Eisentraut authored
      Author: Euler Taveira <euler@timbira.com.br>
    • Fix typo in comment. · 1f8b0601
      Fujii Masao authored
      Author: Masahiko Sawada