1. 09 Jun, 2015 3 commits
  2. 08 Jun, 2015 6 commits
    • Alvaro Herrera's avatar
      Fix typos · 94232c90
      Alvaro Herrera authored
      tablesapce -> tablespace
      there -> their
      
      These were introduced in 72d422a5, so no need to backpatch.
      94232c90
    • Fujii Masao's avatar
      Refactor WAL segment copying code. · 7abc6859
      Fujii Masao authored
      * Remove unused argument "dstfname" and related code from XLogFileCopy().
      
      * Previously XLogFileCopy() returned a pstrdup'd string so that
      InstallXLogFileSegment() used it later. Since the pstrdup'd string was never
      free'd, there could be a risk of memory leak. It was almost harmless because
      the startup process exited just after calling XLogFileCopy(), it existed.
      This commit changes XLogFileCopy() so that it directly calls
      InstallXLogFileSegment() and doesn't call pstrdup() at all. Which fixes that
      memory leak problem.
      
      * Extend InstallXLogFileSegment() so that the caller can specify the log level.
      Which allows us to emit an error when InstallXLogFileSegment() fails a disk
      file access like link() and rename(). Previously it was always logged with
      LOG level and additionally needed to be logged with ERROR when we wanted
      to treat it as an error.
      
      Michael Paquier
      7abc6859
    • Andres Freund's avatar
      Allow HotStandbyActiveInReplay() to be called in single user mode. · d1b95821
      Andres Freund authored
      HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL
      replay to happen in the startup process, missing the single user case.
      
      This buglet is fairly harmless as it only causes problems when single
      user mode in an assertion enabled build is used to replay a btree vacuum
      record.
      
      Backpatch to 9.2. 061b079f was backpatched further, but the assertion
      was not.
      d1b95821
    • Andrew Dunstan's avatar
      Clarify documentation of jsonb - text · 94d6727d
      Andrew Dunstan authored
      Peter Geoghegan
      94d6727d
    • Andrew Dunstan's avatar
      Desupport jsonb subscript deletion on objects · b81c7b40
      Andrew Dunstan authored
      Supporting deletion of JSON pairs within jsonb objects using an
      array-style integer subscript allowed for surprising outcomes.  This was
      mostly due to the implementation-defined ordering of pairs within
      objects for jsonb.
      
      It also seems desirable to make jsonb integer subscript deletion
      consistent with the 9.4 era general purpose integer subscripting
      operator for jsonb (although that operator returns NULL when an object
      is encountered, while we prefer here to throw an error).
      
      Peter Geoghegan, following discussion on -hackers.
      b81c7b40
    • Peter Eisentraut's avatar
      doc: Fix broken links in FOP build · d23a3a60
      Peter Eisentraut authored
      FOP doesn't handle links to table rows, so put the link to a cell
      instead.
      d23a3a60
  3. 07 Jun, 2015 1 commit
    • Tom Lane's avatar
      Use a safer method for determining whether relcache init file is stale. · f3b5565d
      Tom Lane authored
      When we invalidate the relcache entry for a system catalog or index, we
      must also delete the relcache "init file" if the init file contains a copy
      of that rel's entry.  The old way of doing this relied on a specially
      maintained list of the OIDs of relations present in the init file: we made
      the list either when reading the file in, or when writing the file out.
      The problem is that when writing the file out, we included only rels
      present in our local relcache, which might have already suffered some
      deletions due to relcache inval events.  In such cases we correctly decided
      not to overwrite the real init file with incomplete data --- but we still
      used the incomplete initFileRelationIds list for the rest of the current
      session.  This could result in wrong decisions about whether the session's
      own actions require deletion of the init file, potentially allowing an init
      file created by some other concurrent session to be left around even though
      it's been made stale.
      
      Since we don't support changing the schema of a system catalog at runtime,
      the only likely scenario in which this would cause a problem in the field
      involves a "vacuum full" on a catalog concurrently with other activity, and
      even then it's far from easy to provoke.  Remarkably, this has been broken
      since 2002 (in commit 78634044), but we had
      never seen a reproducible test case until recently.  If it did happen in
      the field, the symptoms would probably involve unexpected "cache lookup
      failed" errors to begin with, then "could not open file" failures after the
      next checkpoint, as all accesses to the affected catalog stopped working.
      Recovery would require manually removing the stale "pg_internal.init" file.
      
      To fix, get rid of the initFileRelationIds list, and instead consult
      syscache.c's list of relations used in catalog caches to decide whether a
      relation is included in the init file.  This should be a tad more efficient
      anyway, since we're replacing linear search of a list with ~100 entries
      with a binary search.  It's a bit ugly that the init file contents are now
      so directly tied to the catalog caches, but in practice that won't make
      much difference.
      
      Back-patch to all supported branches.
      f3b5565d
  4. 05 Jun, 2015 3 commits
    • Tom Lane's avatar
      Get rid of a //-style comment. · 1497369e
      Tom Lane authored
      Not sure how "//XXX" got into a committed patch in the first place,
      as it's both content-free and against project style.  pgindent made a
      bit of a hash of it, too.
      
      Going forward, we should have at least one buildfarm member using
      "gcc -ansi" to catch such things, at least till such time as we
      decide the project target language isn't C90 any more.  I've turned
      this option on on dromedary.
      1497369e
    • Tom Lane's avatar
      Fix incorrect order of database-locking operations in InitPostgres(). · ac23b711
      Tom Lane authored
      We should set MyProc->databaseId after acquiring the per-database lock,
      not beforehand.  The old way risked deadlock against processes trying to
      copy or delete the target database, since they would first acquire the lock
      and then wait for processes with matching databaseId to exit; that left a
      window wherein an incoming process could set its databaseId and then block
      on the lock, while the other process had the lock and waited in vain for
      the incoming process to exit.
      
      CountOtherDBBackends() would time out and fail after 5 seconds, so this
      just resulted in an unexpected failure not a permanent lockup, but it's
      still annoying when it happens.  A real-world example of a use-case is that
      short-duration connections to a template database should not cause CREATE
      DATABASE to fail.
      
      Doing it in the other order should be fine since the contract has always
      been that processes searching the ProcArray for a database ID must hold the
      relevant per-database lock while searching.  Thus, this actually removes
      the former race condition that required an assumption that storing to
      MyProc->databaseId is atomic.
      
      It's been like this for a long time, so back-patch to all active branches.
      ac23b711
    • Robert Haas's avatar
      Cope with possible failure of the oldest MultiXact to exist. · 068cfadf
      Robert Haas authored
      Recent commits, mainly b69bf30b and
      53bb309d, introduced mechanisms to
      protect against wraparound of the MultiXact member space: the number
      of multixacts that can exist at one time is limited to 2^32, but the
      total number of members in those multixacts is also limited to 2^32,
      and older code did not take care to enforce the second limit,
      potentially allowing old data to be overwritten while it was still
      needed.
      
      Unfortunately, these new mechanisms failed to account for the fact
      that the code paths in which they run might be executed during
      recovery or while the cluster was in an inconsistent state.  Also,
      they failed to account for the fact that users who used pg_upgrade
      to upgrade a PostgreSQL version between 9.3.0 and 9.3.4 might have
      might oldestMultiXid = 1 in the control file despite the true value
      being larger.
      
      To fix these problems, first, avoid unnecessarily examining the
      mmembers of MultiXacts when the cluster is not known to be consistent.
      TruncateMultiXact has done this for a long time, and this patch does
      not fix that.  But the new calls used to prevent member wraparound
      are not needed until we reach normal running, so avoid calling them
      earlier.  (SetMultiXactIdLimit is actually called before InRecovery
      is set, so we can't rely on that; we invent our own multixact-specific
      flag instead.)
      
      Second, make failure to look up the members of a MultiXact a non-fatal
      error.  Instead, if we're unable to determine the member offset at
      which wraparound would occur, postpone arming the member wraparound
      defenses until we are able to do so.  If we're unable to determine the
      member offset that should force autovacuum, force it continuously
      until we are able to do so.  If we're unable to deterine the member
      offset at which we should truncate the members SLRU, log a message and
      skip truncation.
      
      An important consequence of these changes is that anyone who does have
      a bogus oldestMultiXid = 1 value in pg_control will experience
      immediate emergency autovacuuming when upgrading to a release that
      contains this fix.  The release notes should highlight this fact.  If
      a user has no pg_multixact/offsets/0000 file, but has oldestMultiXid = 1
      in the control file, they may wish to vacuum any tables with
      relminmxid = 1 prior to upgrading in order to avoid an immediate
      emergency autovacuum after the upgrade.  This must be done with a
      PostgreSQL version 9.3.5 or newer and with vacuum_multixact_freeze_min_age
      and vacuum_multixact_freeze_table_age set to 0.
      
      This patch also adds an additional log message at each database server
      startup, indicating either that protections against member wraparound
      have been engaged, or that they have not.  In the latter case, once
      autovacuum has advanced oldestMultiXid to a sane value, the message
      indicating that the guards have been engaged will appear at the next
      checkpoint.  A few additional messages have also been added at the DEBUG1
      level so that the correct operation of this code can be properly audited.
      
      Along the way, this patch fixes another, related bug in TruncateMultiXact
      that has existed since PostgreSQL 9.3.0: when no MultiXacts exist at
      all, the truncation code looks up NextMultiXactId, which doesn't exist
      yet.  This can lead to TruncateMultiXact removing every file in
      pg_multixact/offsets instead of keeping one around, as it should.
      This in turn will cause the database server to refuse to start
      afterwards.
      
      Patch by me.  Review by Álvaro Herrera, Andres Freund, Noah Misch, and
      Thomas Munro.
      068cfadf
  5. 04 Jun, 2015 11 commits
    • Robert Haas's avatar
      99cfd5e1
    • Robert Haas's avatar
      docs: Fix list of object types pg_table_is_visible() can handle. · 1c645da8
      Robert Haas authored
      Materialized views and foreign tables were missing from the list,
      probably because they are newer than the other object types that were
      mentioned.
      
      Etsuro Fujita
      1c645da8
    • Tom Lane's avatar
      Second try at stabilizing query plans in rowsecurity regression test. · 1d278425
      Tom Lane authored
      This reverts commit 5cdf25e1,
      which was almost immediately proven insufficient by the buildfarm.
      
      On second thought, the tables involved are not large enough that
      autovacuum or autoanalyze would notice them; what seems far more
      likely to be the culprit is the database-wide "vacuum analyze"
      in the concurrent gist test.  That thing has given us one headache
      too many, so get rid of it in favor of targeted vacuuming of that
      test's own tables only.
      1d278425
    • Tom Lane's avatar
      Fix brin regression test so it actually tests cidr. · 1676e438
      Tom Lane authored
      The problem noted in my previous commit was simpler than I thought:
      we weren't getting an index plan because the column wasn't indexed.
      1676e438
    • Tom Lane's avatar
      Tighten the per-operator testing done in brin regression test. · 79454c69
      Tom Lane authored
      Verify that the number of matches is exactly what it should be, not just
      that it not be zero.  This should help us detect any environment-dependent
      issues.
      
      Also, verify that we're getting the expected type of scan plan (either
      bitmap or seqscan as appropriate).  Right now, this is failing on the
      cidrcol test cases, as shown in the output file.  I'll look into that
      in a bit, but it seems good to commit this as-is temporarily to verify
      that it behaves as expected on the buildfarm.
      79454c69
    • Tom Lane's avatar
      Fix brin "char" test to actually test what it meant to test. · 78e72794
      Tom Lane authored
      Casting to char, without quotes, does not give the same results as casting
      to "char".  That meant we were not testing the brin "char" paths at all,
      since we ended up with a text operator not a "char" operator.
      78e72794
    • Tom Lane's avatar
      Stabilize results of brin regression test. · bac99475
      Tom Lane authored
      This test used seqscans on tenk1, with LIMIT, to build test data.
      That works most of the time, but if the synchronized-seqscan logic
      kicks in, we get varying test data.  This seems likely to explain
      the erratic test failures on buildfarm member chipmunk, which uses
      smaller-than-default shared_buffers.  To fix, add ORDER BY clauses to
      force the ordering to be what it was implicitly being assumed to be.
      
      Peter Geoghegan had noticed this with respect to one of the trouble
      spots, though not the ones actually causing the chipmunk issue.
      bac99475
    • Tom Lane's avatar
      Stabilize query plans in rowsecurity regression test. · 5cdf25e1
      Tom Lane authored
      Some recent buildfarm failures can be explained by supposing that
      autovacuum or autoanalyze fired on the tables created by this test,
      resulting in plan changes.  Do a proactive VACUUM ANALYZE on the
      test's principal tables to try to forestall such changes.
      5cdf25e1
    • Fujii Masao's avatar
      Remove -i/--ignore-version option from pg_dump, pg_dumpall and pg_restore. · 232cd63b
      Fujii Masao authored
      The commit c22ed3d5 turned
      the -i/--ignore-version options into no-ops and marked as deprecated.
      Considering we shipped that in 8.4, it's time to remove all trace of
      those switches, per discussion. We'd still have to wait a couple releases
      before it'd be safe to use -i for something else, but it'd be a start.
      232cd63b
    • Fujii Masao's avatar
      Fix some issues in pg_class.relminmxid and pg_database.datminmxid documentation. · 38d500ac
      Fujii Masao authored
      - Correct the name of directory which those catalog columns allow to be shrunk.
      - Correct the name of symbol which is used as the value of pg_class.relminmxid
        when the relation is not a table.
      - Fix "ID ID" typo.
      
      Backpatch to 9.3 where those cataog columns were introduced.
      38d500ac
    • Peter Eisentraut's avatar
      doc: Fix PDF build with FOP · afae1f78
      Peter Eisentraut authored
      Because of a bug in the DocBook XSL FO style sheet, an xref to a
      varlistentry whose term includes an indexterm fails to build.  One such
      instance was introduced in commit
      5086dfce.  Fix by adding the upstream
      bug fix to our customization layer.
      afae1f78
  6. 03 Jun, 2015 3 commits
    • Tom Lane's avatar
      Fix some questionable edge-case behaviors in add_path() and friends. · 3b0f7760
      Tom Lane authored
      add_path_precheck was doing exact comparisons of path costs, but it really
      needs to do them fuzzily to be sure it won't reject paths that could
      survive add_path's comparisons.  (This can only matter if the initial cost
      estimate is very close to the final one, but that turns out to often be
      true.)
      
      Also, it should ignore startup cost for this purpose if and only if
      compare_path_costs_fuzzily would do so.  The previous coding always ignored
      startup cost for parameterized paths, which is wrong as of commit
      3f59be83; it could result in improper early rejection of paths that
      we care about for SEMI/ANTI joins.  It also always considered startup cost
      for unparameterized paths, which is just as wrong though the only effect is
      to waste planner cycles on paths that can't survive.  Instead, it should
      consider startup cost only when directed to by the consider_startup/
      consider_param_startup relation flags.
      
      Likewise, compare_path_costs_fuzzily should have symmetrical behavior
      for parameterized and unparameterized paths.  In this case, the best
      answer seems to be that after establishing that total costs are fuzzily
      equal, we should compare startup costs whether or not the consider_xxx
      flags are on.  That is what it's always done for unparameterized paths,
      so let's make the behavior for parameterized  paths match.
      
      These issues were noted while developing the SEMI/ANTI join costing fix
      of commit 3f59be83, but we chose not to back-patch these fixes,
      because they can cause changes in the planner's choices among
      nearly-same-cost plans.  (There is in fact one minor change in plan choice
      within the core regression tests.)  Destabilizing plan choices in back
      branches without very clear improvements is frowned on, so we'll just fix
      this in HEAD.
      3b0f7760
    • Tom Lane's avatar
      Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans. · 3f59be83
      Tom Lane authored
      When the inner side of a nestloop SEMI or ANTI join is an indexscan that
      uses all the join clauses as indexquals, it can be presumed that both
      matched and unmatched outer rows will be processed very quickly: for
      matched rows, we'll stop after fetching one row from the indexscan, while
      for unmatched rows we'll have an indexscan that finds no matching index
      entries, which should also be quick.  The planner already knew about this,
      but it was nonetheless charging for at least one full run of the inner
      indexscan, as a consequence of concerns about the behavior of materialized
      inner scans --- but those concerns don't apply in the fast case.  If the
      inner side has low cardinality (many matching rows) this could make an
      indexscan plan look far more expensive than it actually is.  To fix,
      rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
      don't add the inner scan cost until we've inspected the indexquals, and
      then we can add either the full-run cost or just the first tuple's cost as
      appropriate.
      
      Experimentation with this fix uncovered another problem: add_path and
      friends were coded to disregard cheap startup cost when considering
      parameterized paths.  That's usually okay (and desirable, because it thins
      the path herd faster); but in this fast case for SEMI/ANTI joins, it could
      result in throwing away the desired plain indexscan path in favor of a
      bitmap scan path before we ever get to the join costing logic.  In the
      many-matching-rows cases of interest here, a bitmap scan will do a lot more
      work than required, so this is a problem.  To fix, add a per-relation flag
      consider_param_startup that works like the existing consider_startup flag,
      but applies to parameterized paths, and set it for relations that are the
      inside of a SEMI or ANTI join.
      
      To make this patch reasonably safe to back-patch, care has been taken to
      avoid changing the planner's behavior except in the very narrow case of
      SEMI/ANTI joins with inner indexscans.  There are places in
      compare_path_costs_fuzzily and add_path_precheck that are not terribly
      consistent with the new approach, but changing them will affect planner
      decisions at the margins in other cases, so we'll leave that for a
      HEAD-only fix.
      
      Back-patch to 9.3; before that, the consider_startup flag didn't exist,
      meaning that the second aspect of the patch would be too invasive.
      
      Per a complaint from Peter Holzer and analysis by Tomas Vondra.
      3f59be83
    • Fujii Masao's avatar
      Minor improvement to txid_current() documentation. · 37013621
      Fujii Masao authored
      Michael Paquier, reviewed by Christoph Berg and Naoya Anzai
      37013621
  7. 01 Jun, 2015 5 commits
  8. 31 May, 2015 1 commit
    • Peter Eisentraut's avatar
      Make Python tests more portable · 75f9d176
      Peter Eisentraut authored
      Newer Python versions randomize the hash seed for dictionaries,
      resulting in a random output order, which messes up the regression test
      diffs.
      
      Instead, use Python assert to compare the dictionaries with their
      expected value.
      75f9d176
  9. 29 May, 2015 6 commits
    • Bruce Momjian's avatar
      ac6f2295
    • Tom Lane's avatar
      initdb -S should now have an explicit check that $PGDATA is valid. · 1943c000
      Tom Lane authored
      The fsync code from the backend essentially assumes that somebody's already
      validated PGDATA, at least to the extent of it being a readable directory.
      That's safe enough for initdb's normal code path too, but "initdb -S"
      doesn't have any other processing at all that touches the target directory.
      To have reasonable error-case behavior, add a pg_check_dir call.
      Per gripe from Peter E.
      1943c000
    • Tom Lane's avatar
      Remove special cases for ETXTBSY from new fsync'ing logic. · 57e1138b
      Tom Lane authored
      The argument that this is a sufficiently-expected case to be silently
      ignored seems pretty thin.  Andres had brought it up back when we were
      still considering that most fsync failures should be hard errors, and it
      probably would be legit not to fail hard for ETXTBSY --- but the same is
      true for EROFS and other cases, which is why we gave up on hard failures.
      ETXTBSY is surely not a normal case, so logging the failure seems fine
      from here.
      57e1138b
    • Tom Lane's avatar
      Check that all aliases of a built-in function have same leakproof property. · 1c8c656b
      Tom Lane authored
      opr_sanity.sql has a test checking that relevant properties of built-in
      functions match when the same C function is referenced by multiple pg_proc
      entries.  The test neglected to check proleakproof, though, and when
      I added that condition it exposed that xideqint4 hadn't been updated to
      match xideq.  So fix that as well, and in consequence bump catversion.
      
      This isn't very critical, so no need to worry about fixing back branches.
      1c8c656b
    • Tom Lane's avatar
      Adjust initdb to also not consider fsync'ing failures fatal. · c07d8c96
      Tom Lane authored
      Make initdb's version of this logic look as much like the backend's
      as possible.  This is much less critical than in the backend since not
      so many people use "initdb -S", but we want the same corner-case error
      handling in both cases.
      
      Back-patch to 9.3 where initdb -S option was introduced.  Before that,
      initdb only had to deal with freshly-created data directories, wherein
      no failures should be expected.
      
      Abhijit Menon-Sen
      c07d8c96
    • Tom Lane's avatar
      Revert exporting of internal GUC variable "data_directory". · da33a389
      Tom Lane authored
      This undoes a poorly-thought-out choice in commit 970a1868, namely
      to export guc.c's internal variable data_directory.  The authoritative
      variable so far as C code is concerned is DataDir; there is no reason for
      anything except specific bits of GUC code to look at the GUC variable.
      
      After yesterday's commits fixing the fsync-on-restart patch, the only
      remaining misuse of data_directory was in AlterSystemSetConfigFile(),
      which would be much better off just using a relative path anyhow: it's
      less code and it doesn't break if the DBA moves the data directory of a
      running system, which is a case we've taken some pains over in the past.
      
      This is mostly cosmetic, so no need for a back-patch (and I'd be hesitant
      to remove a global variable in stable branches anyway).
      da33a389
  10. 28 May, 2015 1 commit
    • Tom Lane's avatar
      Fix fsync-at-startup code to not treat errors as fatal. · d8179b00
      Tom Lane authored
      Commit 2ce439f3 introduced a rather serious
      regression, namely that if its scan of the data directory came across any
      un-fsync-able files, it would fail and thereby prevent database startup.
      Worse yet, symlinks to such files also caused the problem, which meant that
      crash restart was guaranteed to fail on certain common installations such
      as older Debian.
      
      After discussion, we agreed that (1) failure to start is worse than any
      consequence of not fsync'ing is likely to be, therefore treat all errors
      in this code as nonfatal; (2) we should not chase symlinks other than
      those that are expected to exist, namely pg_xlog/ and tablespace links
      under pg_tblspc/.  The latter restriction avoids possibly fsync'ing a
      much larger part of the filesystem than intended, if the user has left
      random symlinks hanging about in the data directory.
      
      This commit takes care of that and also does some code beautification,
      mainly moving the relevant code into fd.c, which seems a much better place
      for it than xlog.c, and making sure that the conditional compilation for
      the pre_sync_fname pass has something to do with whether pg_flush_data
      works.
      
      I also relocated the call site in xlog.c down a few lines; it seems a
      bit silly to be doing this before ValidateXLOGDirectoryStructure().
      
      The similar logic in initdb.c ought to be made to match this, but that
      change is noncritical and will be dealt with separately.
      
      Back-patch to all active branches, like the prior commit.
      
      Abhijit Menon-Sen and Tom Lane
      d8179b00