1. 16 Dec, 2016 5 commits
    • Fujii Masao's avatar
      Add missing documentation for effective_io_concurrency tablespace option. · 4e344c2c
      Fujii Masao authored
      The description of effective_io_concurrency option was missing in ALTER
      TABLESPACE docs though it's included in CREATE TABLESPACE one.
      
      Back-patch to 9.6 where effective_io_concurrency tablespace option was added.
      
      Michael Paquier, reported by Marc-Olaf Jaschke
      4e344c2c
    • Robert Haas's avatar
      Unbreak Finalize HashAggregate over Partial HashAggregate. · b81b5a96
      Robert Haas authored
      Commit 5dfc1981 introduced the use
      of a new type of hash table with linear reprobing for hash aggregates.
      Such a hash table behaves very poorly if keys are inserted in hash
      order, which does in fact happen in the case where a query use a
      Finalize HashAggregate node fed (via Gather) by a Partial
      HashAggregate node.  In fact, queries with this type of plan tend
      to run effectively forever.
      
      Fix that by seeding the hash value differently in each worker
      (and in the leader, if it participates).
      
      Andres Freund and Robert Haas
      b81b5a96
    • Robert Haas's avatar
      Fix more hash index bugs around marking buffers dirty. · 6a4fe112
      Robert Haas authored
      In _hash_freeovflpage(), if we're freeing the overflow page that
      immediate follows the page to which tuples are being moved (the
      confusingly-named "write buffer"), don't forget to mark that
      page dirty after updating its hasho_nextblkno.
      
      In _hash_squeezebucket(), it's not necessary to mark the primary
      bucket page dirty if there are no overflow pages, because there's
      nothing to squeeze in that case.
      
      Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after
      an initial trouble report by Jeff Janes.
      6a4fe112
    • Robert Haas's avatar
      Remove _hash_wrtbuf() in favor of calling MarkBufferDirty(). · 25216c98
      Robert Haas authored
      The whole concept of _hash_wrtbuf() is that we need to know at the
      time we're releasing the buffer lock (and pin) whether we dirtied the
      buffer, but this is easy to get wrong.  This patch actually fixes one
      non-obvious bug of that form: hashbucketcleanup forgot to signal
      _hash_squeezebucket, which gets the primary bucket page already
      locked, as to whether it had already dirtied the page.  Calling
      MarkBufferDirty() at the places where we dirty the buffer is more
      intuitive and lets us simplify the code in various places as well.
      
      On top of all that, the ultimate goal here is to make hash indexes
      WAL-logged, and as the comments to _hash_wrtbuf() note, it should
      go away when that happens.  Making it go away a little earlier than
      that seems like a good preparatory step.
      
      Report by Jeff Janes.  Diagnosis by Amit Kapila, Kuntal Ghosh,
      and Dilip Kumar.  Patch by me, after studying an alternative patch
      submitted by Amit Kapila.
      
      Discussion: http://postgr.es/m/CAA4eK1Kf6tOY0oVz_SEdngiNFkeXrA3xUSDPPORQvsWVPdKqnA@mail.gmail.com
      25216c98
    • Heikki Linnakangas's avatar
      Fix off-by-one in memory allocation for quote_literal_cstr(). · 4f5182e1
      Heikki Linnakangas authored
      The calculation didn't take into account the NULL terminator. That lead
      to overwriting the palloc'd buffer by one byte, if the input consists
      entirely of backslashes. For example "format('%L', E'\\')".
      
      Fixes bug #14468. Backpatch to all supported versions.
      
      Report: https://www.postgresql.org/message-id/20161216105001.13334.42819%40wrigleys.postgresql.org
      4f5182e1
  2. 15 Dec, 2016 3 commits
  3. 13 Dec, 2016 7 commits
    • Tom Lane's avatar
      Improve handling of array elements as getdiag_targets and cursor_variables. · 55caaaeb
      Tom Lane authored
      There's no good reason why plpgsql's GET DIAGNOSTICS statement can't
      support an array element as target variable, since the execution code
      already uses the generic exec_assign_value() function to assign to it.
      Hence, refactor the grammar to allow that, by making getdiag_target
      depend on the assign_var production.
      
      Ideally we'd also let a cursor_variable expand to an element of a
      refcursor[] array, but that's substantially harder since those statements
      also have to handle bound-cursor-variable cases.  For now, just make sure
      the reported error is sensible, ie "cursor variable must be a simple
      variable" not "variable must be of type cursor or refcursor".  The latter
      was quite confusing from the user's viewpoint, since what he wrote
      satisfies the claimed restriction.
      
      Per bug #14463 from Zhou Digoal.  Given the lack of previous complaints,
      I see no need for a back-patch.
      
      Discussion: https://postgr.es/m/20161213152548.14897.81245@wrigleys.postgresql.org
      55caaaeb
    • Tom Lane's avatar
      Prevent planagg.c from failing on queries containing CTEs. · 1f542a2e
      Tom Lane authored
      The existing tests in preprocess_minmax_aggregates() usually prevent it
      from trying to do anything with queries containing CTEs, but there's an
      exception: a CTE could be present as a member of an appendrel, if we
      flattened a UNION ALL that contains CTE references.  If it did try to
      generate an optimized path for a query using a CTE, it failed with
      "could not find plan for CTE", as reported by Torsten Förtsch.
      
      The proximate cause is an unwise decision in commit 3fc6e2d7 to clear
      subroot->cte_plan_ids in build_minmax_path().  That left the subroot's
      cte_plan_ids list out of step with its parse->cteList.
      
      Removing the "subroot->cte_plan_ids = NIL;" assignment is enough to let
      the case work again, but really it's pretty silly to be expending any
      cycles at all in this module when there are CTEs: we always treat their
      outputs as unordered so there's no way for the optimization to win.
      Hence, also add an early-exit test so we don't waste time like that.
      
      Back-patch to 9.6 where the misbehavior was introduced.
      
      Report: https://postgr.es/m/CAKkG4_=gjY5QiHtqSZyWMwDuTd_CftKoTaCqxjJ7uUz1-Gw=qw@mail.gmail.com
      1f542a2e
    • Robert Haas's avatar
      Fix bug in hashbulkdelete. · 501c7b94
      Robert Haas authored
      Commit 6d46f478 failed to account for
      the possibility that hashbulkdelete() might encounter a bucket that
      has been split since it began scanning the bucket array.  Repair.
      
      Extracted from a larger pathc by Amit Kapila; I rewrote the comment.
      501c7b94
    • Robert Haas's avatar
      Fix bugs in RelationGetPartitionDispatchInfo. · a2566508
      Robert Haas authored
      The previous coding was not quite right for cases involving multiple
      levels of partitioning.
      
      Amit Langote
      a2566508
    • Robert Haas's avatar
      Clean up code, comments, and formatting for table partitioning. · 4b9a98e1
      Robert Haas authored
      Amit Langote, plus pgindent-ing by me.  Inspired in part by review
      comments from Tomas Vondra.
      4b9a98e1
    • Robert Haas's avatar
      Update typedefs.list · acddbe22
      Robert Haas authored
      So developers can more easily run pgindent locally
      acddbe22
    • Robert Haas's avatar
      doc: Improve documentation related to table partitioning feature. · a1a4459c
      Robert Haas authored
      Commit f0e44751 implemented table
      partitioning, but failed to mention the "no row movement"
      restriction in the documentation.  Fix that and a few other issues.
      
      Amit Langote, with some additional wordsmithing by me.
      a1a4459c
  4. 12 Dec, 2016 13 commits
    • Robert Haas's avatar
      Remove should_free arguments to tuplesort routines. · 3856cf96
      Robert Haas authored
      Since commit e94568ec, the answer is
      always "false", and we do not need to complicate the API by arranging
      to return a constant value.
      
      Peter Geoghegan
      
      Discussion: http://postgr.es/m/CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com
      3856cf96
    • Tom Lane's avatar
      Catversion bump for temporary replication slots. · 9b3d02c2
      Tom Lane authored
      Missed in commit a924c327.
      Per Fujii Masao.
      9b3d02c2
    • Tom Lane's avatar
      Fix race condition in test_decoding "slot" test. · 23f722ba
      Tom Lane authored
      This test, just added in commit a924c327, sometimes fails because
      the old backend hasn't finished dropping the temporary replication slot
      when the new backend looks.  Borrow the previously-invented methodology
      for waiting for the old process to disappear from pg_stat_activity.
      
      Petr Jelinek
      
      Discussion: https://postgr.es/m/62935e6f-4f1b-c433-e0fa-7f936a38b3e5@2ndquadrant.com
      23f722ba
    • Robert Haas's avatar
      doc: Fix purported type of pg_am.amhandler to match reality. · b4630e01
      Robert Haas authored
      Joel Jacobson
      b4630e01
    • Tom Lane's avatar
      Make the different Unix-y semaphore implementations ABI-compatible. · be7b2848
      Tom Lane authored
      Previously, the "sem" field of PGPROC varied in size depending on which
      kernel semaphore API we were using.  That was okay as long as there was
      only one likely choice per platform, but in the wake of commit ecb0d20a,
      that assumption seems rather shaky.  It doesn't seem out of the question
      anymore that an extension compiled against one API choice might be loaded
      into a postmaster built with another choice.  Moreover, this prevents any
      possibility of selecting the semaphore API at postmaster startup, which
      might be something we want to do in future.
      
      Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix
      semaphore APIs, and turn the pointed-to data into an opaque struct whose
      contents are only known within the responsible modules.
      
      For the SysV and unnamed-POSIX APIs, the pointed-to data has to be
      allocated elsewhere in shared memory, which takes a little bit of
      rejiggering of the InitShmemAllocation code sequence.  (I invented a
      ShmemAllocUnlocked() function to make that a little cleaner than it used
      to be.  That function is not meant for any uses other than the ones it
      has now, but it beats having InitShmemAllocation() know explicitly about
      allocation of space for semaphores and spinlocks.)  This change means an
      extra indirection to access the semaphore data, but since we only touch
      that when blocking or awakening a process, there shouldn't be any
      meaningful performance penalty.  Moreover, at least for the unnamed-POSIX
      case on Linux, the sem_t type is quite a bit wider than a pointer, so this
      reduces sizeof(PGPROC) which seems like a good thing.
      
      For the named-POSIX API, there's effectively no change: the PGPROC.sem
      field was and still is a pointer to something returned by sem_open() in
      the postmaster's memory space.  Document and check the pre-existing
      limitation that this case can't work in EXEC_BACKEND mode.
      
      It did not seem worth unifying the Windows semaphore ABI with the Unix
      cases, since there's no likelihood of needing ABI compatibility much less
      runtime switching across those cases.  However, we can simplify the Windows
      code a bit if we define PGSemaphore as being directly a HANDLE, rather than
      pointer to HANDLE, so let's do that while we're here.  (This also ends up
      being no change in what's physically stored in PGPROC.sem.  We're just
      moving the HANDLE fetch from callees to callers.)
      
      It would take a bunch of additional code shuffling to get to the point of
      actually choosing a semaphore API at postmaster start, but the effects
      of that would now be localized in the port/XXX_sema.c files, so it seems
      like fit material for a separate patch.  The need for it is unproven as
      yet, anyhow, whereas the ABI risk to extensions seems real enough.
      
      Discussion: https://postgr.es/m/4029.1481413370@sss.pgh.pa.us
      be7b2848
    • Robert Haas's avatar
      psql: Fix incorrect version check for table partitining. · 06e18487
      Robert Haas authored
      Table partitioning was added in 10, not 9.6.
      
      Fabrízio de Royes Mello, per report from Jeff Janes
      06e18487
    • Tom Lane's avatar
      Fix creative, but unportable, spelling of "ptr != NULL". · 563d575f
      Tom Lane authored
      Or at least I suppose that's what was really meant here.  But even
      aside from the not-per-project-style use of "0" to mean "NULL",
      I doubt it's safe to assume that all valid pointers are > NULL.
      Per buildfarm member pademelon.
      563d575f
    • Peter Eisentraut's avatar
      Add support for temporary replication slots · a924c327
      Peter Eisentraut authored
      This allows creating temporary replication slots that are removed
      automatically at the end of the session or on error.
      
      From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
      a924c327
    • Heikki Linnakangas's avatar
      Refactor the code for verifying user's password. · e7f051b8
      Heikki Linnakangas authored
      Split md5_crypt_verify() into three functions:
      * get_role_password() to fetch user's password from pg_authid, and check
        its expiration.
      * md5_crypt_verify() to check an MD5 authentication challenge
      * plain_crypt_verify() to check a plaintext password.
      
      get_role_password() will be needed as a separate function by the upcoming
      SCRAM authentication patch set. Most of the remaining functionality in
      md5_crypt_verify() was different for MD5 and plaintext authentication, so
      split that for readability.
      
      While we're at it, simplify the *_crypt_verify functions by using
      stack-allocated buffers to hold the temporary MD5 hashes, instead of
      pallocing.
      
      Reviewed by Michael Paquier.
      
      Discussion: https://www.postgresql.org/message-id/3029e460-d47c-710e-507e-d8ba759d7cbb@iki.fi
      e7f051b8
    • Heikki Linnakangas's avatar
      Further cleanup from the strong-random patch. · 58445c5c
      Heikki Linnakangas authored
      Also use the new facility for generating RADIUS authenticator requests,
      and salt in chkpass extension.
      
      Reword the error messages to be nicer. Fix bogus error code used in the
      message in BackendStartup.
      58445c5c
    • Heikki Linnakangas's avatar
      Fix pgcrypto compilation with OpenSSL 1.1.0. · 9bbbf029
      Heikki Linnakangas authored
      Was broken by the switch to using OpenSSL's EVP interface for ciphers, in
      commit 5ff4a67f.
      
      Reported by Andres Freund. Fix by Michael Paquier with some kibitzing by me.
      
      Discussion: https://www.postgresql.org/message-id/20161201014826.ic72tfkahmevpwz7@alap3.anarazel.de
      9bbbf029
    • Heikki Linnakangas's avatar
      Fix two thinkos related to strong random keys. · 41493bac
      Heikki Linnakangas authored
      pg_backend_random() is used for MD5 salt generation, but it can fail, and
      no checks were done on its status code.
      
      Fix memory leak, if generating a random number for a cancel key failed.
      
      Both issues were spotted by Coverity. Fix by Michael Paquier.
      41493bac
    • Heikki Linnakangas's avatar
      ad365b2f
  5. 11 Dec, 2016 2 commits
    • Tom Lane's avatar
      Use "%option prefix" to set API names in ecpg's lexer. · 92fb6498
      Tom Lane authored
      Clean up some technical debt left behind by commit 72b1e3a2: instead of
      quickly hacking the name of base_yylex() with a #define, set it properly
      with "%option prefix".  This causes the names of pgc.l's other exported
      symbols to change as well, so run around and modify the outside references
      to them as needed.  Similarly, make pgc.l's external references to
      base_yylval use that variable's true name instead of a macro.
      
      The reason for doing this now is that the quick-hack solution will fail
      with future versions of flex, as reported by Дилян Палаузов.
      Hence, back-patch into 9.6 where the previous commit appeared, since
      it's likely people will build 9.6 with newer flex versions during
      its lifetime.
      
      Discussion: https://postgr.es/m/d845c1af-e18d-6651-178f-9f08cdf37e10@aegee.org
      92fb6498
    • Tom Lane's avatar
      Prevent crash when ts_rewrite() replaces a non-top-level subtree with null. · 0eaaaf00
      Tom Lane authored
      When ts_rewrite()'s replacement argument is an empty tsquery, it's supposed
      to simplify any operator nodes whose operand(s) become NULL; but it failed
      to do that reliably, because dropvoidsubtree() only examined the top level
      of the result tree.  Rather than make a second recursive pass, let's just
      give the responsibility to dofindsubquery() to simplify while it's doing
      the main replacement pass.  Per report from Andreas Seltenreich.
      
      Artur Zakirov, with some cosmetic changes by me.  Back-patch to all
      supported branches.
      
      Discussion: https://postgr.es/m/8737i01dew.fsf@credativ.de
      0eaaaf00
  6. 09 Dec, 2016 2 commits
    • Tom Lane's avatar
      Be more careful about Python refcounts while creating exception objects. · 9cda81f0
      Tom Lane authored
      PLy_generate_spi_exceptions neglected to do Py_INCREF on the new exception
      objects, evidently supposing that PyModule_AddObject would do that --- but
      it doesn't.  This left us in a situation where a Python garbage collection
      cycle could result in deletion of exception object(s), causing server
      crashes or wrong answers if the exception objects are used later in the
      session.
      
      In addition, PLy_generate_spi_exceptions didn't bother to test for
      a null result from PyErr_NewException, which at best is inconsistent
      with the code in PLy_add_exceptions.  And PLy_add_exceptions, while it
      did do Py_INCREF on the exceptions it makes, waited to do that till
      after some PyModule_AddObject calls, creating a similar risk for
      failure if garbage collection happened within those calls.
      
      To fix, refactor to have just one piece of code that creates an
      exception object and adds it to the spiexceptions module, bumping the
      refcount first.
      
      Also, let's add an additional refcount to represent the pointer we're
      going to store in a C global variable or hash table.  This should only
      matter if the user does something weird like delete the spiexceptions
      Python module, but lack of paranoia has caused us enough problems in
      PL/Python already.
      
      The fact that PyModule_AddObject doesn't do a Py_INCREF of its own
      explains the need for the Py_INCREF added in commit 4c966d92, so we
      can improve the comment about that; also, this means we really want
      to do that before not after the PyModule_AddObject call.
      
      The missing Py_INCREF in PLy_generate_spi_exceptions was reported and
      diagnosed by Rafa de la Torre; the other fixes by me.  Back-patch
      to all supported branches.
      
      Discussion: https://postgr.es/m/CA+Fz15kR1OXZv43mDrJb3XY+1MuQYWhx5kx3ea6BRKQp6ezGkg@mail.gmail.com
      9cda81f0
    • Alvaro Herrera's avatar
      Fix crasher bug in array_position(s) · a73491e5
      Alvaro Herrera authored
      array_position and its cousin array_positions were caching the element
      type equality function's FmgrInfo without being careful enough to put it
      in a long-lived context.  This is obviously broken but it didn't matter
      in most cases; only when using arrays of records (involving record_eq)
      it becomes a problem.  The fix is to ensure that the type's equality
      function's FmgrInfo is cached in the array_position's flinfo->fn_mcxt
      rather than the current memory context.
      
      Apart from record types, the only other case that seems complex enough
      to possibly cause the same problem are range types.  I didn't find a way
      to reproduce the problem with those, so I only include the test case
      submitted with the bug report as regression test.
      
      Bug report and patch: Junseok Yang
      Discussion: https://postgr.es/m/CAE+byMupUURYiZ6bKYgMZb9pgV1CYAijJGqWj-90W=nS7uEOeA@mail.gmail.com
      Backpatch to 9.5, where array_position appeared.
      a73491e5
  7. 08 Dec, 2016 8 commits
    • Heikki Linnakangas's avatar
      Fix thinko in safeguard for negative availMem. · 64bc26f9
      Heikki Linnakangas authored
      Also, use pass read_buffer_size * numInputTapes rather than just availMem
      to USEMEM, to be neat.
      
      Peter Geoghegan.
      64bc26f9
    • Robert Haas's avatar
      Fix bogus comment. · 01ae881e
      Robert Haas authored
      Commit 4212cb73 rendered a comment
      in execMain.c incorrect.  Per complaint from Tom Lane, repair.
      
      Patch from Amit Kapila, per wording suggested by Tom Lane and me.
      01ae881e
    • Robert Haas's avatar
      Silence compiler warning. · ab4575dc
      Robert Haas authored
      Per report from Stephen Frost.
      ab4575dc
    • Robert Haas's avatar
      Log the creation of an init fork unconditionally. · fa0f466d
      Robert Haas authored
      Previously, it was thought that this only needed to be done for the
      benefit of possible standbys, so wal_level = minimal skipped it.
      But that's not safe, because during crash recovery we might replay
      XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record which recursively
      removes the directory that contains the new init fork.  So log it
      always.
      
      The user-visible effect of this bug is that if you create a database
      or tablespace, then create an unlogged table, then crash without
      checkpointing, then restart, accessing the table will fail, because
      the it won't have been properly reset.  This commit fixes that.
      
      Michael Paquier, per a report from Konstantin Knizhnik.  Wording of
      the comments per a suggestion from me.
      fa0f466d
    • Tom Lane's avatar
      Fix reporting of column typmods for multi-row VALUES constructs. · 0b78106c
      Tom Lane authored
      expandRTE() and get_rte_attribute_type() reported the exprType() and
      exprTypmod() values of the expressions in the first row of the VALUES as
      being the column type/typmod returned by the VALUES RTE.  That's fine for
      the data type, since we coerce all expressions in a column to have the same
      common type.  But we don't coerce them to have a common typmod, so it was
      possible for rows after the first one to return values that violate the
      claimed column typmod.  This leads to the incorrect result seen in bug
      #14448 from Hassan Mahmood, as well as some other corner-case misbehaviors.
      
      The desired behavior is the same as we use in other type-unification
      cases: report the common typmod if there is one, but otherwise return -1
      indicating no particular constraint.  It's cheap for transformValuesClause
      to determine the common typmod while transforming a multi-row VALUES, but
      it'd be less cheap for expandRTE() and get_rte_attribute_type() to
      re-determine that info every time they're asked --- possibly a lot less
      cheap, if the VALUES has many rows.  Therefore, the best fix is to record
      the common typmods explicitly in a list in the VALUES RTE, as we were
      already doing for column collations.  This looks quite a bit like what
      we're doing for CTE RTEs, so we can save a little bit of space and code by
      unifying the representation for those two RTE types.  They both now share
      coltypes/coltypmods/colcollations fields.  (At some point it might seem
      desirable to populate those fields for all RTE types; but right now it
      looks like constructing them for other RTE types would add more code and
      cycles than it would save.)
      
      The RTE change requires a catversion bump, so this fix is only usable
      in HEAD.  If we fix this at all in the back branches, the patch will
      need to look quite different.
      
      Report: https://postgr.es/m/20161205143037.4377.60754@wrigleys.postgresql.org
      Discussion: https://postgr.es/m/27429.1480968538@sss.pgh.pa.us
      0b78106c
    • Heikki Linnakangas's avatar
      Fix quoting and a compiler warning in dumping partitions. · 2560d244
      Heikki Linnakangas authored
      Partition name needs to be quoted in the ATTACH PARTITION command
      constructed in binary-upgrade mode.
      
      Silence compiler warning about set but unused variable, without
      --enable-cassert.
      2560d244
    • Heikki Linnakangas's avatar
      Clean up password authentication code a bit. · fe7bdf0b
      Heikki Linnakangas authored
      Commit fe0a0b59, which moved code to do MD5 authentication to a separate
      CheckMD5Auth() function, left behind a comment that really belongs inside
      the function, too. Also move the check for db_user_namespace inside the
      function, seems clearer that way.
      
      Now that the md5 salt is passed as argument to md5_crypt_verify, it's a bit
      silly that it peeks into the Port struct to see if MD5 authentication was
      used. Seems more straightforward to treat it as an MD5 authentication, if
      the md5 salt argument is given. And after that, md5_crypt_verify only used
      the Port argument to look at port->user_name, but that is redundant,
      because it is also passed as a separate 'role' argument. So remove the Port
      argument altogether.
      fe7bdf0b
    • Heikki Linnakangas's avatar
      Fix accounting of memory needed for merge heap. · f7d54f4f
      Heikki Linnakangas authored
      We allegedly allocated all remaining memory for the read buffers of the
      sort tapes, but we allocated the merge heap only after that. That means
      that the allocation of the merge heap was guaranteed to go over the memory
      limit. Fix by allocating the merge heap first. This makes little difference
      in practice, because the merge heap is tiny, but let's tidy.
      
      While we're at it, add a safeguard for the case that we are already over
      the limit when allocating the read buffers. That shouldn't happen, but
      better safe than sorry.
      
      The memory accounting error was reported off-list by Peter Geoghegan.
      f7d54f4f