1. 20 Dec, 2016 3 commits
  2. 19 Dec, 2016 6 commits
    • Robert Haas's avatar
      Provide a DSA area for all parallel queries. · e13029a5
      Robert Haas authored
      This will allow future parallel query code to dynamically allocate
      storage shared by all participants.
      
      Thomas Munro, with assorted changes by me.
      e13029a5
    • Tom Lane's avatar
      Fix handling of phrase operator removal while removing tsquery stopwords. · 26044384
      Tom Lane authored
      The distance of a removed phrase operator should propagate up to a
      parent phrase operator if there is one, but this only worked correctly
      in left-deep trees.  Throwing in a few parentheses confused it completely,
      as indeed was illustrated by bizarre results in existing regression test
      cases.
      
      To fix, track unaccounted-for distances that should propagate to the left
      and to the right of the current node, rather than trying to make it work
      with only one returned distance.
      
      Also make some adjustments to behave as well as we can for cases of
      intermixed phrase and regular (AND/OR) operators.  I don't think it's
      possible to be 100% correct for that without a rethinking of the tsquery
      representation; for example, maybe we should just not drop stopword nodes
      at all underneath phrase operators.  But this is better than it was,
      and changing tsquery representation wouldn't be safely back-patchable.
      
      While at it, I simplified the API of the clean_fakeval_intree function
      a bit by getting rid of the "char *result" output parameter; that wasn't
      doing anything that wasn't redundant with whether the result node is
      NULL or not, and testing for NULL seems a lot clearer/safer.
      
      This is part of a larger project to fix various infelicities in the
      phrase-search implementation, but this part seems comittable on its own.
      
      Back-patch to 9.6 where phrase operators were introduced.
      
      Discussion: https://postgr.es/m/28215.1481999808@sss.pgh.pa.us
      Discussion: https://postgr.es/m/26706.1482087250@sss.pgh.pa.us
      26044384
    • Robert Haas's avatar
      Fix locking problem in _hash_squeezebucket() / _hash_freeovflpage(). · dd728826
      Robert Haas authored
      A bucket squeeze operation needs to lock each page of the bucket
      before releasing the prior page, but the previous coding fumbled the
      locking when freeing an overflow page during a bucket squeeze
      operation.  Commit 6d46f478
      introduced this bug.
      
      Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after
      an initial trouble report by Jeff Janes.  Reviewed by me.  I also
      fixed a problem with a comment.
      dd728826
    • Robert Haas's avatar
      Remove unused file. · 668dbbec
      Robert Haas authored
      This was added in 10540974, but has
      never been used for anything as far as I can tell.  There seems to
      be no reason to keep it.
      668dbbec
    • Fujii Masao's avatar
      Support quorum-based synchronous replication. · 3901fd70
      Fujii Masao authored
      This feature is also known as "quorum commit" especially in discussion
      on pgsql-hackers.
      
      This commit adds the following new syntaxes into synchronous_standby_names
      GUC. By using FIRST and ANY keywords, users can specify the method to
      choose synchronous standbys from the listed servers.
      
        FIRST num_sync (standby_name [, ...])
        ANY num_sync (standby_name [, ...])
      
      The keyword FIRST specifies a priority-based synchronous replication
      which was available also in 9.6 or before. This method makes transaction
      commits wait until their WAL records are replicated to num_sync
      synchronous standbys chosen based on their priorities.
      
      The keyword ANY specifies a quorum-based synchronous replication
      and makes transaction commits wait until their WAL records are
      replicated to *at least* num_sync listed standbys. In this method,
      the values of sync_state.pg_stat_replication for the listed standbys
      are reported as "quorum". The priority is still assigned to each standby,
      but not used in this method.
      
      The existing syntaxes having neither FIRST nor ANY keyword are still
      supported. They are the same as new syntax with FIRST keyword, i.e.,
      a priorirty-based synchronous replication.
      
      Author: Masahiko Sawada
      Reviewed-By: Michael Paquier, Amit Kapila and me
      Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com>
      
      Many thanks to the various individuals who were involved in
      discussing and developing this feature.
      3901fd70
    • Magnus Hagander's avatar
      Fix base backup rate limiting in presence of slow i/o · 10238fad
      Magnus Hagander authored
      When source i/o on disk was too slow compared to the rate limiting
      specified, the system could end up with a negative value for sleep that
      it never got out of, which caused rate limiting to effectively be
      turned off.
      
      Discussion: https://postgr.es/m/CABUevEy_-e0YvL4ayoX8bH_Ja9w%2BBHoP6jUgdxZuG2nEj3uAfQ%40mail.gmail.com
      
      Analysis by me, patch by Antonin Houska
      10238fad
  3. 18 Dec, 2016 2 commits
    • Noah Misch's avatar
      MSVC: Position MSBFLAGS after flags it might override. · cc07e06b
      Noah Misch authored
      Christian Ullrich
      cc07e06b
    • Tom Lane's avatar
      In contrib/uuid-ossp, #include headers needed for ntohl() and ntohs(). · 4a0a34b5
      Tom Lane authored
      Oversight in commit b8cc8f94.  I just noticed this causes compiler
      warnings on FreeBSD, and it really ought to cause warnings elsewhere too:
      all references I can find say that <arpa/inet.h> is required for these.
      We have a lot of code elsewhere that thinks that both <netinet/in.h>
      and <arpa/inet.h> should be included for these functions, so do it that
      way here too, even though <arpa/inet.h> ought to be sufficient according
      to the references I consulted.
      
      Back-patch to 9.4 where the previous commit landed.
      4a0a34b5
  4. 17 Dec, 2016 3 commits
    • Tom Lane's avatar
      Fix FK-based join selectivity estimation for semi/antijoins. · 7fa93eec
      Tom Lane authored
      This case wasn't thought through sufficiently in commit 100340e2.
      It's true that the FK proves that every outer row has a match in the
      inner table, but we forgot that some of the inner rows might be filtered
      away by WHERE conditions located within the semijoin's RHS.
      
      If the RHS is just one table, we can reasonably take the semijoin
      selectivity as equal to the fraction of the referenced table's rows
      that are expected to survive its restriction clauses.
      
      If the RHS is a join, it's not clear how much of the referenced table
      might get through the join, so fall back to the same rule we were
      already using for other outer-join cases: use the minimum of the
      regular per-clause selectivity estimates.  This gives the same result
      as if we hadn't considered the FK at all when there's a single FK
      column, but it should still help for multi-column FKs, which is the
      case that 100340e2 is really meant to help with.
      
      Back-patch to 9.6 where the previous commit came in.
      
      Discussion: https://postgr.es/m/16149.1481835103@sss.pgh.pa.us
      7fa93eec
    • Peter Eisentraut's avatar
      doc: Remove some trailing whitespace · b645a05f
      Peter Eisentraut authored
      Per discussion, we will not at this time remove trailing whitespace in
      psql output displays where it is part of the actual psql output.
      
      From: Vladimir Rusinov <vrusinov@google.com>
      b645a05f
    • Magnus Hagander's avatar
      Fix typos in comments · 01776a07
      Magnus Hagander authored
      Michael Paquier
      01776a07
  5. 16 Dec, 2016 9 commits
    • Robert Haas's avatar
      Fix outdated comment in lwlock.c · 591ccb66
      Robert Haas authored
      Commit 3761fe3c should have made
      this change, but didn't.
      
      Reported by Álvaro Herrera.
      591ccb66
    • Fujii Masao's avatar
      Ensure that num_sync is greater than zero in synchronous_standby_names. · 93eb619c
      Fujii Masao authored
      Previously num_sync could be set to zero and this setting caused
      an assertion failure. This means that multiple synchronous standbys
      code should assume that num_sync is greater than zero.
      Also setting num_sync to zero is nonsense because it's basically
      the configuration for synchronous replication. If users want not to
      make transaction commits wait for any standbys,
      synchronous_standby_names should be emptied to disable synchronous
      replication instead of setting num_sync to zero.
      
      This patch forbids users from setting num_sync to zero in
      synchronous_standby_names. If zero is specified, an error will
      happen during processing the parameter settings.
      
      Back-patch to 9.6 where multiple synchronous standbys feature was added.
      
      Patch by me. Reviewed by Tom Lane.
      Discussion: <CAHGQGwHWB3izc6cXuFLh5kOcAbFXaRhhgwd-X5PeN9TEjxqXwg@mail.gmail.com>
      93eb619c
    • Tom Lane's avatar
      Improve documentation around TS_execute(). · 23c75b55
      Tom Lane authored
      I got frustrated by the lack of commentary in this area, so here is some
      reverse-engineered documentation, along with minor stylistic cleanup.
      No code changes more significant than removal of unused variables.
      
      Back-patch to 9.6, not because that's useful in itself, but because
      we have some bugs to fix in phrase search and this would cause merge
      failures if it's only in HEAD.
      23c75b55
    • Robert Haas's avatar
      Simplify LWLock tranche machinery by removing array_base/array_stride. · 3761fe3c
      Robert Haas authored
      array_base and array_stride were added so that we could identify the
      offset of an LWLock within a tranche, but this facility is only very
      marginally used apart from the main tranche.  So, give every lock in
      the main tranche its own tranche ID and get rid of array_base,
      array_stride, and all that's attached.  For debugging facilities
      (Trace_lwlocks and LWLOCK_STATS) print the pointer address of the
      LWLock using %p instead of the offset.  This is arguably more useful,
      and certainly a lot cheaper.  Drop the offset-within-tranche from
      the information reported to dtrace and from one can't-happen message
      inside lwlock.c.
      
      The main user-visible impact of this change is that pg_stat_activity
      will now report all waits for LWLocks as "LWLock" rather than
      reporting some as "LWLockTranche" and others as "LWLockNamed".
      
      The main motivation for this change is that the need to specify an
      array_base and an array_stride is awkward for parallel query.  There
      is only a very limited supply of tranche IDs so we can't just keep
      allocating new ones, and if we try to use the same tranche IDs every
      time then we run into trouble when multiple parallel contexts are
      use simultaneously.  So if we didn't get rid of this mechanism we'd
      have to make it even more complicated.  By simplifying it in this
      way, we instead reduce the size of the generated code for lwlock.c
      by about 5%.
      
      Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
      3761fe3c
    • Fujii Masao's avatar
      Add missing documentation for effective_io_concurrency tablespace option. · 4e344c2c
      Fujii Masao authored
      The description of effective_io_concurrency option was missing in ALTER
      TABLESPACE docs though it's included in CREATE TABLESPACE one.
      
      Back-patch to 9.6 where effective_io_concurrency tablespace option was added.
      
      Michael Paquier, reported by Marc-Olaf Jaschke
      4e344c2c
    • Robert Haas's avatar
      Unbreak Finalize HashAggregate over Partial HashAggregate. · b81b5a96
      Robert Haas authored
      Commit 5dfc1981 introduced the use
      of a new type of hash table with linear reprobing for hash aggregates.
      Such a hash table behaves very poorly if keys are inserted in hash
      order, which does in fact happen in the case where a query use a
      Finalize HashAggregate node fed (via Gather) by a Partial
      HashAggregate node.  In fact, queries with this type of plan tend
      to run effectively forever.
      
      Fix that by seeding the hash value differently in each worker
      (and in the leader, if it participates).
      
      Andres Freund and Robert Haas
      b81b5a96
    • Robert Haas's avatar
      Fix more hash index bugs around marking buffers dirty. · 6a4fe112
      Robert Haas authored
      In _hash_freeovflpage(), if we're freeing the overflow page that
      immediate follows the page to which tuples are being moved (the
      confusingly-named "write buffer"), don't forget to mark that
      page dirty after updating its hasho_nextblkno.
      
      In _hash_squeezebucket(), it's not necessary to mark the primary
      bucket page dirty if there are no overflow pages, because there's
      nothing to squeeze in that case.
      
      Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after
      an initial trouble report by Jeff Janes.
      6a4fe112
    • Robert Haas's avatar
      Remove _hash_wrtbuf() in favor of calling MarkBufferDirty(). · 25216c98
      Robert Haas authored
      The whole concept of _hash_wrtbuf() is that we need to know at the
      time we're releasing the buffer lock (and pin) whether we dirtied the
      buffer, but this is easy to get wrong.  This patch actually fixes one
      non-obvious bug of that form: hashbucketcleanup forgot to signal
      _hash_squeezebucket, which gets the primary bucket page already
      locked, as to whether it had already dirtied the page.  Calling
      MarkBufferDirty() at the places where we dirty the buffer is more
      intuitive and lets us simplify the code in various places as well.
      
      On top of all that, the ultimate goal here is to make hash indexes
      WAL-logged, and as the comments to _hash_wrtbuf() note, it should
      go away when that happens.  Making it go away a little earlier than
      that seems like a good preparatory step.
      
      Report by Jeff Janes.  Diagnosis by Amit Kapila, Kuntal Ghosh,
      and Dilip Kumar.  Patch by me, after studying an alternative patch
      submitted by Amit Kapila.
      
      Discussion: http://postgr.es/m/CAA4eK1Kf6tOY0oVz_SEdngiNFkeXrA3xUSDPPORQvsWVPdKqnA@mail.gmail.com
      25216c98
    • Heikki Linnakangas's avatar
      Fix off-by-one in memory allocation for quote_literal_cstr(). · 4f5182e1
      Heikki Linnakangas authored
      The calculation didn't take into account the NULL terminator. That lead
      to overwriting the palloc'd buffer by one byte, if the input consists
      entirely of backslashes. For example "format('%L', E'\\')".
      
      Fixes bug #14468. Backpatch to all supported versions.
      
      Report: https://www.postgresql.org/message-id/20161216105001.13334.42819%40wrigleys.postgresql.org
      4f5182e1
  6. 15 Dec, 2016 3 commits
  7. 13 Dec, 2016 7 commits
    • Tom Lane's avatar
      Improve handling of array elements as getdiag_targets and cursor_variables. · 55caaaeb
      Tom Lane authored
      There's no good reason why plpgsql's GET DIAGNOSTICS statement can't
      support an array element as target variable, since the execution code
      already uses the generic exec_assign_value() function to assign to it.
      Hence, refactor the grammar to allow that, by making getdiag_target
      depend on the assign_var production.
      
      Ideally we'd also let a cursor_variable expand to an element of a
      refcursor[] array, but that's substantially harder since those statements
      also have to handle bound-cursor-variable cases.  For now, just make sure
      the reported error is sensible, ie "cursor variable must be a simple
      variable" not "variable must be of type cursor or refcursor".  The latter
      was quite confusing from the user's viewpoint, since what he wrote
      satisfies the claimed restriction.
      
      Per bug #14463 from Zhou Digoal.  Given the lack of previous complaints,
      I see no need for a back-patch.
      
      Discussion: https://postgr.es/m/20161213152548.14897.81245@wrigleys.postgresql.org
      55caaaeb
    • Tom Lane's avatar
      Prevent planagg.c from failing on queries containing CTEs. · 1f542a2e
      Tom Lane authored
      The existing tests in preprocess_minmax_aggregates() usually prevent it
      from trying to do anything with queries containing CTEs, but there's an
      exception: a CTE could be present as a member of an appendrel, if we
      flattened a UNION ALL that contains CTE references.  If it did try to
      generate an optimized path for a query using a CTE, it failed with
      "could not find plan for CTE", as reported by Torsten Förtsch.
      
      The proximate cause is an unwise decision in commit 3fc6e2d7 to clear
      subroot->cte_plan_ids in build_minmax_path().  That left the subroot's
      cte_plan_ids list out of step with its parse->cteList.
      
      Removing the "subroot->cte_plan_ids = NIL;" assignment is enough to let
      the case work again, but really it's pretty silly to be expending any
      cycles at all in this module when there are CTEs: we always treat their
      outputs as unordered so there's no way for the optimization to win.
      Hence, also add an early-exit test so we don't waste time like that.
      
      Back-patch to 9.6 where the misbehavior was introduced.
      
      Report: https://postgr.es/m/CAKkG4_=gjY5QiHtqSZyWMwDuTd_CftKoTaCqxjJ7uUz1-Gw=qw@mail.gmail.com
      1f542a2e
    • Robert Haas's avatar
      Fix bug in hashbulkdelete. · 501c7b94
      Robert Haas authored
      Commit 6d46f478 failed to account for
      the possibility that hashbulkdelete() might encounter a bucket that
      has been split since it began scanning the bucket array.  Repair.
      
      Extracted from a larger pathc by Amit Kapila; I rewrote the comment.
      501c7b94
    • Robert Haas's avatar
      Fix bugs in RelationGetPartitionDispatchInfo. · a2566508
      Robert Haas authored
      The previous coding was not quite right for cases involving multiple
      levels of partitioning.
      
      Amit Langote
      a2566508
    • Robert Haas's avatar
      Clean up code, comments, and formatting for table partitioning. · 4b9a98e1
      Robert Haas authored
      Amit Langote, plus pgindent-ing by me.  Inspired in part by review
      comments from Tomas Vondra.
      4b9a98e1
    • Robert Haas's avatar
      Update typedefs.list · acddbe22
      Robert Haas authored
      So developers can more easily run pgindent locally
      acddbe22
    • Robert Haas's avatar
      doc: Improve documentation related to table partitioning feature. · a1a4459c
      Robert Haas authored
      Commit f0e44751 implemented table
      partitioning, but failed to mention the "no row movement"
      restriction in the documentation.  Fix that and a few other issues.
      
      Amit Langote, with some additional wordsmithing by me.
      a1a4459c
  8. 12 Dec, 2016 7 commits
    • Robert Haas's avatar
      Remove should_free arguments to tuplesort routines. · 3856cf96
      Robert Haas authored
      Since commit e94568ec, the answer is
      always "false", and we do not need to complicate the API by arranging
      to return a constant value.
      
      Peter Geoghegan
      
      Discussion: http://postgr.es/m/CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com
      3856cf96
    • Tom Lane's avatar
      Catversion bump for temporary replication slots. · 9b3d02c2
      Tom Lane authored
      Missed in commit a924c327.
      Per Fujii Masao.
      9b3d02c2
    • Tom Lane's avatar
      Fix race condition in test_decoding "slot" test. · 23f722ba
      Tom Lane authored
      This test, just added in commit a924c327, sometimes fails because
      the old backend hasn't finished dropping the temporary replication slot
      when the new backend looks.  Borrow the previously-invented methodology
      for waiting for the old process to disappear from pg_stat_activity.
      
      Petr Jelinek
      
      Discussion: https://postgr.es/m/62935e6f-4f1b-c433-e0fa-7f936a38b3e5@2ndquadrant.com
      23f722ba
    • Robert Haas's avatar
      doc: Fix purported type of pg_am.amhandler to match reality. · b4630e01
      Robert Haas authored
      Joel Jacobson
      b4630e01
    • Tom Lane's avatar
      Make the different Unix-y semaphore implementations ABI-compatible. · be7b2848
      Tom Lane authored
      Previously, the "sem" field of PGPROC varied in size depending on which
      kernel semaphore API we were using.  That was okay as long as there was
      only one likely choice per platform, but in the wake of commit ecb0d20a,
      that assumption seems rather shaky.  It doesn't seem out of the question
      anymore that an extension compiled against one API choice might be loaded
      into a postmaster built with another choice.  Moreover, this prevents any
      possibility of selecting the semaphore API at postmaster startup, which
      might be something we want to do in future.
      
      Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix
      semaphore APIs, and turn the pointed-to data into an opaque struct whose
      contents are only known within the responsible modules.
      
      For the SysV and unnamed-POSIX APIs, the pointed-to data has to be
      allocated elsewhere in shared memory, which takes a little bit of
      rejiggering of the InitShmemAllocation code sequence.  (I invented a
      ShmemAllocUnlocked() function to make that a little cleaner than it used
      to be.  That function is not meant for any uses other than the ones it
      has now, but it beats having InitShmemAllocation() know explicitly about
      allocation of space for semaphores and spinlocks.)  This change means an
      extra indirection to access the semaphore data, but since we only touch
      that when blocking or awakening a process, there shouldn't be any
      meaningful performance penalty.  Moreover, at least for the unnamed-POSIX
      case on Linux, the sem_t type is quite a bit wider than a pointer, so this
      reduces sizeof(PGPROC) which seems like a good thing.
      
      For the named-POSIX API, there's effectively no change: the PGPROC.sem
      field was and still is a pointer to something returned by sem_open() in
      the postmaster's memory space.  Document and check the pre-existing
      limitation that this case can't work in EXEC_BACKEND mode.
      
      It did not seem worth unifying the Windows semaphore ABI with the Unix
      cases, since there's no likelihood of needing ABI compatibility much less
      runtime switching across those cases.  However, we can simplify the Windows
      code a bit if we define PGSemaphore as being directly a HANDLE, rather than
      pointer to HANDLE, so let's do that while we're here.  (This also ends up
      being no change in what's physically stored in PGPROC.sem.  We're just
      moving the HANDLE fetch from callees to callers.)
      
      It would take a bunch of additional code shuffling to get to the point of
      actually choosing a semaphore API at postmaster start, but the effects
      of that would now be localized in the port/XXX_sema.c files, so it seems
      like fit material for a separate patch.  The need for it is unproven as
      yet, anyhow, whereas the ABI risk to extensions seems real enough.
      
      Discussion: https://postgr.es/m/4029.1481413370@sss.pgh.pa.us
      be7b2848
    • Robert Haas's avatar
      psql: Fix incorrect version check for table partitining. · 06e18487
      Robert Haas authored
      Table partitioning was added in 10, not 9.6.
      
      Fabrízio de Royes Mello, per report from Jeff Janes
      06e18487
    • Tom Lane's avatar
      Fix creative, but unportable, spelling of "ptr != NULL". · 563d575f
      Tom Lane authored
      Or at least I suppose that's what was really meant here.  But even
      aside from the not-per-project-style use of "0" to mean "NULL",
      I doubt it's safe to assume that all valid pointers are > NULL.
      Per buildfarm member pademelon.
      563d575f