1. 13 Jun, 2017 1 commit
    • Assert that we don't invent relfilenodes or type OIDs in binary upgrade. · 7332c3cb
      Tom Lane authored
      During pg_upgrade's restore run, all relfilenode choices should be
      overridden by commands in the dump script.  If we ever find ourselves
      choosing a relfilenode in the ordinary way, someone blew it.  Likewise for
      pg_type OIDs.  Since pg_upgrade might well succeed anyway, if there happens
      not to be a conflict during the regression test run, we need assertions
      here to keep us on the straight and narrow.
      
      We might someday be able to remove the assertion in GetNewRelFileNode,
      if pg_upgrade is rewritten to remove its assumption that old and new
      relfilenodes always match.  But it's hard to see how to get rid of the
      pg_type OID constraint, since those OIDs are embedded in user tables
      in some cases.
      
      Back-patch as far as 9.5, because of the risk of back-patches breaking
      something here even if it works in HEAD.  I'd prefer to go back further,
      but 9.4 fails both assertions due to get_rel_infos()'s use of a temporary
      table.  We can't use the later-branch solution of a CTE for compatibility
      reasons (cf commit 5d16332e), and it doesn't seem worth inventing some
      other way to do the query.  (I did check, by dint of changing the Asserts
      to elog(WARNING), that there are no other cases of unwanted OID assignments
      during 9.4's regression test run.)
      
      Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us
  2. 12 Jun, 2017 9 commits
  3. 11 Jun, 2017 2 commits
    • Handle unqualified SEQUENCE NAME options properly in parse_utilcmd.c. · 51893985
      Tom Lane authored
      generateSerialExtraStmts() was sloppy about handling the case where
      SEQUENCE NAME is given with a not-schema-qualified name.  It was generating
      a CreateSeqStmt with an unqualified sequence name, and an AlterSeqStmt
      whose "owned_by" DefElem contained a T_String Value with a null string
      pointer in the schema-name position.  The generated nextval() argument was
      also underqualified.  This accidentally failed to fail at runtime, but only
      so long as the current default creation namespace at runtime is the right
      namespace.  That's bogus; the parse-time transformation is supposed to be
      inserting the right schema name in all cases, so as to avoid any possible
      skew in that selection.  I'm not sure this could fail in pg_dump's usage,
      but it's still wrong; we have had real bugs in this area before adopting
      the policy that parse_utilcmd.c should generate only fully-qualified
      auxiliary commands.  A slightly lesser problem, which is what led me to
      notice this in the first place, is that pprint() dumped core on the
      AlterSeqStmt because of the bogus T_String.
      
      Noted while poking into the open problem with ALTER SEQUENCE breaking
      pg_upgrade.
    • Apply RLS policies to partitioned tables. · 4f7a95be
      Joe Conway authored
      The new partitioned table capability added a new relkind, namely
      RELKIND_PARTITIONED_TABLE. Update fireRIRrules() to apply RLS
      policies to RELKIND_PARTITIONED_TABLE as it does for RELKIND_RELATION.
      
      In addition, add RLS regression test coverage for partitioned tables.
      
      Issue raised by Fakhroutdinov Evgenievich and patch by Mike Palmiotto.
      Regression test editorializing by me.
      
      Discussion: https://postgr.es/m/flat/20170601065959.1486.69906@wrigleys.postgresql.org
  4. 10 Jun, 2017 2 commits
  5. 09 Jun, 2017 10 commits
  6. 08 Jun, 2017 9 commits
    • Use standard interrupt handling in logical replication launcher. · 2c48f5db
      Andres Freund authored
      Previously the exit handling was only able to exit from within the
      main loop, and not from within the backend code it calls.  Fix that by
      using the standard die() SIGTERM handler, and adding the necessary
      CHECK_FOR_INTERRUPTS() call.
      
      This requires adding yet another process-type-specific branch to
      ProcessInterrupts(), which hints that we probably should generalize
      that handling.  But that's work for another day.
      
      Author: Petr Jelinek
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/fe072153-babd-3b5d-8052-73527a6eb657@2ndquadrant.com
    • Again report a useful error message when walreceiver's connection closes. · 5fd56b9f
      Andres Freund authored
      Since 7c4f5240 (merged in v10), a shutdown master is reported as
        FATAL:  unexpected result after CommandComplete: server closed the connection unexpectedly
      by walsender. It used to be
        LOG:  replication terminated by primary server
        FATAL:  could not send end-of-streaming message to primary: no COPY in progress
      While the old message clearly is not perfect, it's definitely
      better than what's reported now.
      
      The change comes from the attempt to handle finished COPYs without
      erroring out, needed for the new logical replication, which wasn't
      needed before.
      
      There's probably better ways to handle this, but for now just
      explicitly check for a closed connection.
      
      Author: Petr Jelinek
      Reviewed-By: Andres Freund
      Discussion: https://postgr.es/m/f7c7dd08-855c-e4ed-41f4-d064a6c0665a@2ndquadrant.com
      Backpatch: -
    • Update key words table for version 10 · 5c4109f2
      Peter Eisentraut authored
    • Mark to_tsvector(regconfig,json[b]) functions immutable · f7e6853e
      Andrew Dunstan authored
      This makes them consistent with the text function and means they can
      be used in functional indexes.
      
      Catalog version bumped.
      
      Per gripe from Josh Berkus.
    • Fix bit-rot in pg_upgrade's test.sh, and improve documentation. · 5bab1985
      Tom Lane authored
      Doing a cross-version upgrade test with test.sh evidently hasn't been
      tested since circa 9.2, because the script lacked case branches for
      old-version servers newer than 9.1.  Future-proof that a bit, and
      clean up breakage induced by our recent drop of V0 function call
      protocol (namely that oldstyle_length() isn't in the regression
      suite anymore).
      
      (This isn't enough to make the test work perfectly cleanly across
      versions, but at least it finishes and provides dump files that
      you can diff manually.  One issue I didn't touch is that we might
      want to execute the "reindex_hash.sql" file in the new DB before
      dumping it, so that the hash indexes don't vanish from the dump.)
      
      Improve the TESTING doc file: put the tl;dr version at the top not
      the bottom, and bring its explanation of how to run a cross-version
      test up to speed, since the installcheck target isn't there and won't
      be resurrected.  Improve the comment in the Makefile about why not.
      
      In passing, teach .gitignore and "make clean" about a couple more
      junk output files.
      
      Discussion: https://postgr.es/m/14058.1496892482@sss.pgh.pa.us
    • Improve authentication error messages. · e3df8f8b
      Heikki Linnakangas authored
      Most of the improvements were in the new SCRAM code:
      
      * In SCRAM protocol violation messages, use errdetail to provide the
        details.
      
      * If pg_backend_random() fails, throw an ERROR rather than just LOG. We
        shouldn't continue authentication if we can't generate a random nonce.
      
      * Use ereport() rather than elog() for the "invalid SCRAM verifier"
        messages. They shouldn't happen, if everything works, but it's not
        inconceivable that someone would have invalid scram verifiers in
        pg_authid, e.g. if a broken client application was used to generate the
        verifier.
      
      But this change applied to old code:
      
      * Use ERROR rather than COMMERROR for protocol violation errors. There's
        no reason not to tell the client what they did wrong. The client might be
        confused already, so that it cannot read and display the error correctly,
        but let's at least try. In the "invalid password packet size" case, we
        used to actually continue with authentication anyway, but that is now a
        hard error.
      
      Patch by Michael Paquier and me. Thanks to Daniel Varrazzo for spotting
      the typo in one of the messages that spurred the discussion and these
      larger changes.
      
      Discussion: https://www.postgresql.org/message-id/CA%2Bmi_8aZYLhuyQi1Jo0hO19opNZ2OEATEOM5fKApH7P6zTOZGg%40mail.gmail.com
    • 7ff9812f
      Peter Eisentraut authored
    • Add statistics subdirectory to Makefile. · 0eac8e7f
      Robert Haas authored
      Commit 7b504eb2 overlooked this.
      
      Report and patch by Kyotaro Horiguchi
      
      Discussion: http://postgr.es/m/20170608.145852.54673832.horiguchi.kyotaro@lab.ntt.co.jp
    • Fix contrib/sepgsql regr tests for tup-routing constraint check change. · 06c0afe5
      Joe Conway authored
      Commit 15ce775f changed tuple-routing constraint checking logic.
      This affects the expected output for contrib/sepgsql, because
      there's no longer LOG entries reporting allowance of int4eq()
      execution. Per buildfarm.
  7. 07 Jun, 2017 7 commits
    • Docs: improve CREATE TABLE ref page's discussion of partition bounds. · 0198c277
      Tom Lane authored
      Clarify in the syntax synopsis that partition bound values must be
      exactly numeric literals or string literals; previously it
      said "bound_literal" which was defined nowhere.
      
      Replace confusing --- and, I think, incorrect in detail --- definition
      of how range bounds work with a reference to row-wise comparison plus
      a concrete example (which I stole from Robert Haas).
      
      Minor copy-editing in the same area.
      
      Discussion: https://postgr.es/m/30475.1496005465@sss.pgh.pa.us
      Discussion: https://postgr.es/m/28106.1496041449@sss.pgh.pa.us
    • postgres_fdw: Allow cancellation of transaction control commands. · ae9bfc5d
      Robert Haas authored
      Commit f039eaac, later back-patched
      with commit 1b812afb, allowed many of
      the queries issued by postgres_fdw to fetch remote data to respond to
      cancel interrupts in a timely fashion.  However, it didn't do anything
      about the transaction control commands, which remained
      noninterruptible.
      
      Improve the situation by changing do_sql_command() to retrieve query
      results using pgfdw_get_result(), which uses the asynchronous
      interface to libpq so that it can check for interrupts every time
      libpq returns control.  Since this might result in a situation
      where we can no longer be sure that the remote transaction state
      matches the local transaction state, add a facility to force all
      levels of the local transaction to abort if we've lost track of
      the remote state; without this, an apparently-successful commit of
      the local transaction might fail to commit changes made on the
      remote side.  Also, add a 60-second timeout for queries issued during
      transaction abort; if that expires, give up and mark the state of
      the connection as unknown.  Drop all such connections when we exit
      the local transaction.  Together, these changes mean that if we're
      aborting the local toplevel transaction anyway, we can just drop the
      remote connection in lieu of waiting (possibly for a very long time)
      for it to complete an abort.
      
      This still leaves quite a bit of room for improvement.  PQcancel()
      has no asynchronous interface, so if we get stuck sending the cancel
      request we'll still hang.  Also, PQsetnonblocking() is not used, which
      means we could block uninterruptibly when sending a query.  There
      might be some other optimizations possible as well.  Nonetheless,
      this allows us to escape a wait for an unresponsive remote server
      quickly in many more cases than previously.
      
      Report by Suraj Kharage.  Patch by me and Rafia Sabih.  Review
      and testing by Amit Kapila and Tushar Ahuja.
      
      Discussion: http://postgr.es/m/CAF1DzPU8Kx+fMXEbFoP289xtm3bz3t+ZfxhmKavr98Bh-C0TqQ@mail.gmail.com
    • Fix updating of pg_subscription_rel from workers · 644ea35f
      Peter Eisentraut authored
      A logical replication worker should not insert new rows into
      pg_subscription_rel, only update existing rows, so that there are no
      races if a concurrent refresh removes rows.  Adjust the API to be able
      to choose that behavior.
      
      Author: Masahiko Sawada <sawada.mshk@gmail.com>
      Reported-by: tushar <tushar.ahuja@enterprisedb.com>
    • Prevent BEFORE triggers from violating partitioning constraints. · 15ce775f
      Robert Haas authored
      Since tuple-routing implicitly checks the partitioning constraints
      at least for the levels of the partitioning hierarchy it traverses,
      there's normally no need to revalidate the partitioning constraint
      after performing tuple routing.  However, if there's a BEFORE trigger
      on the target partition, it could modify the tuple, causing the
      partitioning constraint to be violated.  Catch that case.
      
      Also, instead of checking the root table's partition constraint after
      tuple-routing, check it beforehand.  Otherwise, the rules for when
      the partitioning constraint gets checked get too complicated, because
      you sometimes have to check part of the constraint but not all of it.
      This effectively reverts commit 39162b20
      in favor of a different approach altogether.
      
      Report by me.  Initial debugging by Jeevan Ladhe.  Patch by Amit
      Langote, reviewed by me.
      
      Discussion: http://postgr.es/m/CA+Tgmoa9DTgeVOqopieV8d1QRpddmP65aCdxyjdYDoEO5pS5KA@mail.gmail.com
    • Clear auth context correctly when re-connecting after failed auth attempt. · e6c33d59
      Heikki Linnakangas authored
      If authentication over an SSL connection fails, with sslmode=prefer,
      libpq will reconnect without SSL and retry. However, we did not clear
      the variables related to GSS, SSPI, and SASL authentication state, when
      reconnecting. Because of that, the second authentication attempt would
      always fail with a "duplicate GSS/SASL authentication request" error.
      pg_SSPI_startup did not check for duplicate authentication requests like
      the corresponding GSS and SASL functions, so with SSPI, you would leak
      some memory instead.
      
      Another way this could manifest itself, on version 10, is if you list
      multiple hostnames in the "host" parameter. If the first server requests
      Kerberos or SCRAM authentication, but it fails, the attempts to connect to
      the other servers will also fail with "duplicate authentication request"
      errors.
      
      To fix, move the clearing of authentication state from closePGconn to
      pgDropConnection, so that it is cleared also when re-connecting.
      
      Patch by Michael Paquier, with some kibitzing by me.
      
      Backpatch down to 9.3. 9.2 has the same bug, but the code around closing
      the connection is somewhat different, so that this patch doesn't apply.
      To fix this in 9.2, I think we would need to back-port commit 210eb9b7
      first, and then apply this patch. However, given that we only bumped into
      this in our own testing, we haven't heard any reports from users about
      this, and that 9.2 will be end-of-lifed in a couple of months anyway, it
      doesn't seem worth the risk and trouble.
      
      Discussion: https://www.postgresql.org/message-id/CAB7nPqRuOUm0MyJaUy9L3eXYJU3AKCZ-0-03=-aDTZJGV4GyWw@mail.gmail.com
    • Fix double-free bug in GSS authentication. · 3344582e
      Heikki Linnakangas authored
      The logic to free the buffer after the gss_init_sec_context() call was
      always a bit wonky. Because gss_init_sec_context() sets the GSS context
      variable, conn->gctx, we would in fact always attempt to free the buffer.
      That only works, because previously conn->ginbuf.value was initialized to
      NULL, and free(NULL) is a no-op. Commit 61bf96ca refactored things so
      that the GSS input token buffer is allocated locally in pg_GSS_continue,
      and not held in the PGconn object. After that, the now-local ginbuf.value
      variable isn't initialized when it's not used, so we pass a bogus pointer
      to free().
      
      To fix, only try to free the input buffer if we allocated it. That was the
      intention, certainly after the refactoring, and probably even before that.
      But because there's no live bug before the refactoring, I refrained from
      backpatching this.
      
      The bug was also independently reported by Graham Dutton, as bug #14690.
      Patch reviewed by Michael Paquier.
      
      Discussion: https://www.postgresql.org/message-id/6288d80e-a0bf-d4d3-4e12-7b79c77f1771%40iki.fi
      Discussion: https://www.postgresql.org/message-id/20170605130954.1438.90535%40wrigleys.postgresql.org
    • Consistently use subscription name as application name · d4bfc06e
      Peter Eisentraut authored
      The logical replication apply worker uses the subscription name as
      application name, except for table sync.  This was incorrectly set to
      use the replication slot name, which might be different, in one case.
      Also add a comment why the other case is different.