1. 11 Jan, 2014 7 commits
    • Tom Lane's avatar
      Fix possible crashes due to using elog/ereport too early in startup. · 910bac59
      Tom Lane authored
      Per reports from Andres Freund and Luke Campbell, a server failure during
      set_pglocale_pgservice results in a segfault rather than a useful error
      message, because the infrastructure needed to use ereport hasn't been
      initialized; specifically, MemoryContextInit hasn't been called.
      One known cause of this is starting the server in a directory it
      doesn't have permission to read.
      
      We could try to prevent set_pglocale_pgservice from using anything that
      depends on palloc or elog, but that would be messy, and the odds of future
      breakage seem high.  Moreover there are other things being called in main.c
      that look likely to use palloc or elog too --- perhaps those things
      shouldn't be there, but they are there today.  The best solution seems to
      be to move the call of MemoryContextInit to very early in the backend's
      real main() function.  I've verified that an elog or ereport occurring
      immediately after that is now capable of sending something useful to
      stderr.
      
      I also added code to elog.c to print something intelligible rather than
      just crashing if MemoryContextInit hasn't created the ErrorContext.
      This could happen if MemoryContextInit itself fails (due to malloc
      failure), and provides some future-proofing against someone trying to
      sneak in new code even earlier in server startup.
      
      Back-patch to all supported branches.  Since we've only heard reports of
      this type of failure recently, it may be that some recent change has made
      it more likely to see a crash of this kind; but it sure looks like it's
      broken all the way back.
      910bac59
    • Bruce Momjian's avatar
      Revert fd2ace80 · d84c584e
      Bruce Momjian authored
      Seems we want to document '=' plpgsql assignment instead.
      d84c584e
    • Tom Lane's avatar
      Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. · 62865262
      Tom Lane authored
      The standard typanalyze functions skip over values whose detoasted size
      exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during
      ANALYZE.  However, we (I think I, actually :-() failed to consider the
      possibility that *every* non-null value in a column is too wide.  While
      compute_minimal_stats() seems to behave reasonably anyway in such a case,
      compute_scalar_stats() just fell through and generated no pg_statistic
      entry at all.  That's unnecessarily pessimistic: we can still produce
      valid stanullfrac and stawidth values in such cases, since we do include
      too-wide values in the average-width calculation.  Furthermore, since the
      general assumption in this code is that too-wide values are probably all
      distinct from each other, it seems reasonable to set stadistinct to -1
      ("all distinct").
      
      Per complaint from Kadri Raudsepp.  This has been like this since roughly
      neolithic times, so back-patch to all supported branches.
      62865262
    • Bruce Momjian's avatar
      docs: remove undocumented assign syntax in plpgsql examples · fd2ace80
      Bruce Momjian authored
      Pavel Stehule
      fd2ace80
    • Tom Lane's avatar
      Add another regression test cross-checking operator and function comments. · 28233ffa
      Tom Lane authored
      Add a query that lists all the functions that are operator implementation
      functions and have a SQL comment that doesn't just say "implementation of
      XYZ operator".  (Note that the preceding test checks that such functions'
      comments exactly match the corresponding operators' comments.)
      
      While it's not forbidden to add more functions to this list, that should
      only be done when we're encouraging users to use either the function or
      operator syntax for the functionality, which is a fairly rare situation.
      28233ffa
    • Andrew Dunstan's avatar
      Remove DESCR entries for json operator functions. · 11829ff8
      Andrew Dunstan authored
      Per -hackers discussion.
      11829ff8
    • Bruce Momjian's avatar
  2. 10 Jan, 2014 2 commits
    • Bruce Momjian's avatar
      Move username lookup functions from /port to /common · 111022ea
      Bruce Momjian authored
      Per suggestion from Peter E and Alvaro
      111022ea
    • Alvaro Herrera's avatar
      Accept pg_upgraded tuples during multixact freezing · 423e1211
      Alvaro Herrera authored
      The new MultiXact freezing routines introduced by commit 8e9a16ab8f7
      neglected to consider tuples that came from a pg_upgrade'd database; a
      vacuum run that tried to freeze such tuples would die with an error such
      as
      ERROR: MultiXactId 11415437 does no longer exist -- apparent wraparound
      
      To fix, ensure that GetMultiXactIdMembers is allowed to return empty
      multis when the infomask bits are right, as is done in other callsites.
      
      Per trouble report from F-Secure.
      
      In passing, fix a copy&paste bug reported by Andrey Karpov from VIVA64
      from their PVS-Studio static checked, that instead of setting relminmxid
      to Invalid, we were setting relfrozenxid twice.  Not an important
      mistake because that code branch is about relations for which we don't
      use the frozenxid/minmxid values at all in the first place, but seems to
      warrants a fix nonetheless.
      423e1211
  3. 09 Jan, 2014 7 commits
    • Tom Lane's avatar
      Remove unnecessary local variables to work around an icc optimization bug. · faab7a95
      Tom Lane authored
      Buildfarm member dunlin has been crashing since commit 8b49a604, but other
      machines seem fine with that code.  It turns out that removing the local
      variables in ordered_set_startup() that are copies of fields in "qstate"
      dodges the problem.  This might cost a few cycles on register-rich
      machines, but it's probably a wash on others, and in any case this code
      isn't performance-critical.  Thanks to Jeremy Drake for off-list
      investigation.
      faab7a95
    • Michael Meskes's avatar
    • Michael Meskes's avatar
      Fix descriptor output in ECPG. · d685e242
      Michael Meskes authored
      While working on most platforms the old way sometimes created alignment
      problems. This should fix it. Also the regresion tests were updated to test for
      the reported case.
      
      Report and fix by MauMau <maumau307@gmail.com>
      d685e242
    • Heikki Linnakangas's avatar
      Refactor checking whether we've reached the recovery target. · c945af80
      Heikki Linnakangas authored
      Makes the replay loop slightly more readable, by separating the concerns of
      whether to stop and whether to delay, and how to extract the timestamp from
      a record.
      
      This has the user-visible change that the timestamp of the last applied
      record is now updated after actually applying it. Before, it was updated
      just before applying it. That meant that pg_last_xact_replay_timestamp()
      could return the timestamp of a commit record that is in process of being
      replayed, but not yet applied. Normally the difference is small, but if
      min_recovery_apply_delay is set, there could be a significant delay between
      reading a record and applying it.
      
      Another behavioral change is that if you recover to a restore point, we stop
      after the restore point record, not before it. It makes no difference as far
      as running queries on the server is concerned, as applying a restore point
      record changes nothing, but if examine the timeline history you will see
      that the new timeline branched off just after the restore point record, not
      before it. One practical consequence is that if you do PITR to the new
      timeline, and set recovery target to the same named restore point again, it
      will find and stop recovery at the same restore point. Conceptually, I think
      it makes more sense to consider the restore point as part of the new
      timeline's history than not.
      
      In principle, setting the last-replayed timestamp before actually applying
      the record was a bug all along, but it doesn't seem worth the risk to
      backpatch, since min_recovery_apply_delay was only added in 9.4.
      c945af80
    • Peter Eisentraut's avatar
      pgcrypto: Make header files stand alone · 10a3b165
      Peter Eisentraut authored
      pgp.h used to require including mbuf.h and px.h first.  Include those in
      pgp.h, so that it can be used without prerequisites.  Remove mbuf.h
      inclusions in .c files where mbuf.h features are not used
      directly.  (px.h was always used.)
      10a3b165
    • Tom Lane's avatar
      We don't need to include pg_sema.h in s_lock.h anymore. · 220b3433
      Tom Lane authored
      Minor improvement to commit daa7527a:
      s_lock.h no longer has any need to mention PGSemaphoreData, so we can
      rip out the #include that supplies that.  In a non-HAVE_SPINLOCKS
      build, this doesn't really buy much since we still need the #include
      in spin.h --- but everywhere else, this reduces #include footprint by
      some trifle, and helps keep the different locking facilities separate.
      220b3433
    • Tom Lane's avatar
      Fix "cannot accept a set" error when only some arms of a CASE return a set. · 080b7db7
      Tom Lane authored
      In commit c1352052, I implemented an
      optimization that assumed that a function's argument expressions would
      either always return a set (ie multiple rows), or always not.  This is
      wrong however: we allow CASE expressions in which some arms return a set
      of some type and others just return a scalar of that type.  There may be
      other examples as well.  To fix, replace the run-time test of whether an
      argument returned a set with a static precheck (expression_returns_set).
      This adds a little bit of query startup overhead, but it seems barely
      measurable.
      
      Per bug #8228 from David Johnston.  This has been broken since 8.0,
      so patch all supported branches.
      080b7db7
  4. 08 Jan, 2014 7 commits
    • Robert Haas's avatar
      Reduce the number of semaphores used under --disable-spinlocks. · daa7527a
      Robert Haas authored
      Instead of allocating a semaphore from the operating system for every
      spinlock, allocate a fixed number of semaphores (by default, 1024)
      from the operating system and multiplex all the spinlocks that get
      created onto them.  This could self-deadlock if a process attempted
      to acquire more than one spinlock at a time, but since processes
      aren't supposed to execute anything other than short stretches of
      straight-line code while holding a spinlock, that shouldn't happen.
      
      One motivation for this change is that, with the introduction of
      dynamic shared memory, it may be desirable to create spinlocks that
      last for less than the lifetime of the server.  Without this change,
      attempting to use such facilities under --disable-spinlocks would
      quickly exhaust any supply of available semaphores.  Quite apart
      from that, it's desirable to contain the quantity of semaphores
      needed to run the server simply on convenience grounds, since using
      too many may make it harder to get PostgreSQL running on a new
      platform, which is mostly the point of --disable-spinlocks in the
      first place.
      
      Patch by me; review by Tom Lane.
      daa7527a
    • Heikki Linnakangas's avatar
      Fix pause_at_recovery_target + recovery_target_inclusive combination. · 3739e5ab
      Heikki Linnakangas authored
      If pause_at_recovery_target is set, recovery pauses *before* applying the
      target record, even if recovery_target_inclusive is set. If you then
      continue with pg_xlog_replay_resume(), it will apply the target record
      before ending recovery. In other words, if you log in while it's paused
      and verify that the database looks OK, ending recovery changes its state
      again, possibly destroying data that you were tring to salvage with PITR.
      
      Backpatch to 9.1, this has been broken since pause_at_recovery_target was
      added.
      3739e5ab
    • Heikki Linnakangas's avatar
      If multiple recovery_targets are specified, use the latest one. · 815d71de
      Heikki Linnakangas authored
      The docs say that only one of recovery_target_xid, recovery_target_time, or
      recovery_target_name can be specified. But the code actually did something
      different, so that a name overrode time, and xid overrode both time and name.
      Now the target specified last takes effect, whether it's an xid, time or
      name.
      
      With this patch, we still accept multiple recovery_target settings, even
      though docs say that only one can be specified. It's a general property of
      the recovery.conf file parser that you if you specify the same option twice,
      the last one takes effect, like with postgresql.conf.
      815d71de
    • Tom Lane's avatar
      Avoid extra AggCheckCallContext() checks in ordered-set aggregates. · 847e46ab
      Tom Lane authored
      In the transition functions, we don't really need to recheck this after the
      first call.  I had been feeling paranoid about possibly getting a non-null
      argument value in some other context; but it's probably game over anyway
      if we have a non-null "internal" value that's not what we are expecting.
      
      In the final functions, the general convention in pre-existing final
      functions seems to be that an Assert() is good enough, so do it like that
      here too.
      
      This seems to save a few tenths of a percent of overall query runtime,
      which isn't much, but still it's just overhead if there's not a plausible
      case where the checks would fire.
      847e46ab
    • Tom Lane's avatar
      Save a few cycles in advance_transition_function(). · e6336b8b
      Tom Lane authored
      Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and
      re-use that at each row instead of doing InitFunctionCallInfoData each
      time.  This saves only half a dozen assignments and maybe some stack
      manipulation, and yet that seems to be good for a percent or two of the
      overall query run time for simple aggregates such as count(*).  The cost
      is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit
      machines) stays allocated for the duration of the query instead of being
      short-lived stack data.  But we're already paying an equivalent space cost
      for each regular FuncExpr or OpExpr node, so I don't feel bad about paying
      it for aggregate functions.  The code seems a little cleaner this way too,
      since the number of things passed to advance_transition_function decreases.
      e6336b8b
    • Heikki Linnakangas's avatar
      Fix bug in determining when recovery has reached consistency. · d59ff6c1
      Heikki Linnakangas authored
      When starting WAL replay from an online checkpoint, the last replayed WAL
      record variable was initialized using the checkpoint record's location, even
      though the records between the REDO location and the checkpoint record had
      not been replayed yet. That was noted as "slightly confusing" but harmless
      in the comment, but in some cases, it fooled CheckRecoveryConsistency to
      incorrectly conclude that we had already reached a consistent state
      immediately at the beginning of WAL replay. That caused the system to accept
      read-only connections in hot standby mode too early, and also PANICs with
      message "WAL contains references to invalid pages".
      
      Fix by initializing the variables to the REDO location instead.
      
      In 9.2 and above, change CheckRecoveryConsistency() to use
      lastReplayedEndRecPtr variable when checking if backup end location has
      been reached. It was inconsistently using EndRecPtr for that check, but
      lastReplayedEndRecPtr when checking min recovery point. It made no
      difference before this patch, because in all the places where
      CheckRecoveryConsistency was called the two variables were the same, but
      it was always an accident waiting to happen, and would have been wrong
      after this patch anyway.
      
      Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0,
      where hot standby was introduced.
      d59ff6c1
    • Peter Eisentraut's avatar
      pg_upgrade: Fix fatal error handling · ca607b15
      Peter Eisentraut authored
      Restore exiting when pg_log(PG_FATAL) is called directly instead of
      calling pg_fatal().  Fault introduced in
      264aa14a.
      ca607b15
  5. 07 Jan, 2014 6 commits
    • Bruce Momjian's avatar
      Update copyright for 2014 · 7e04792a
      Bruce Momjian authored
      Update all files in head, and files COPYRIGHT and legal.sgml in all back
      branches.
      7e04792a
    • Tom Lane's avatar
      Fix LATERAL references to target table of UPDATE/DELETE. · 0c051c90
      Tom Lane authored
      I failed to think much about UPDATE/DELETE when implementing LATERAL :-(.
      The implemented behavior ended up being that subqueries in the FROM or
      USING clause (respectively) could access the update/delete target table as
      though it were a lateral reference; which seems fine if they said LATERAL,
      but certainly ought to draw an error if they didn't.  Fix it so you get a
      suitable error when you omit LATERAL.  Per report from Emre Hasegeli.
      0c051c90
    • Heikki Linnakangas's avatar
      Silence compiler warning on MSVC. · f68220df
      Heikki Linnakangas authored
      MSVC doesn't know that elog(ERROR) doesn't return, and gives a warning about
      missing return. Silence that.
      
      Amit Kapila
      f68220df
    • Magnus Hagander's avatar
      Move permissions check from do_pg_start_backup to pg_start_backup · 9544cc0d
      Magnus Hagander authored
      And the same for do_pg_stop_backup. The code in do_pg_* is not allowed
      to access the catalogs. For manual base backups, the permissions
      check can be handled in the calling function, and for streaming
      base backups only users with the required permissions can get past
      the authentication step in the first place.
      
      Reported by Antonin Houska, diagnosed by Andres Freund
      9544cc0d
    • Magnus Hagander's avatar
      Avoid including tablespaces inside PGDATA twice in base backups · b168c5ef
      Magnus Hagander authored
      If a tablespace was crated inside PGDATA it was backed up both as part
      of the PGDATA backup and as the backup of the tablespace. Avoid this
      by skipping any directory inside PGDATA that contains one of the active
      tablespaces.
      
      Dimitri Fontaine and Magnus Hagander
      b168c5ef
    • Peter Eisentraut's avatar
      Add more use of psprintf() · edc43458
      Peter Eisentraut authored
      edc43458
  6. 06 Jan, 2014 1 commit
  7. 05 Jan, 2014 1 commit
    • Tom Lane's avatar
      Cache catalog lookup data across groups in ordered-set aggregates. · 8b49a604
      Tom Lane authored
      The initial commit of ordered-set aggregates just did all the setup work
      afresh each time the aggregate function is started up.  But in a GROUP BY
      query, the catalog lookups need not be repeated for each group, since the
      column datatypes and sort information won't change.  When there are many
      small groups, this makes for a useful, though not huge, performance
      improvement.  Per suggestion from Andrew Gierth.
      
      Profiling of these cases suggests that it might be profitable to avoid
      duplicate lookups within tuplesort startup as well; but changing the
      tuplesort APIs would have much broader impact, so I left that for
      another day.
      8b49a604
  8. 04 Jan, 2014 3 commits
    • Tom Lane's avatar
      Fix translatability markings in psql, and add defenses against future bugs. · 92459e7a
      Tom Lane authored
      Several previous commits have added columns to various \d queries without
      updating their translate_columns[] arrays, leading to potentially incorrect
      translations in NLS-enabled builds.  Offenders include commit 89368676
      (added prosecdef to \df+), c9ac00e6 (added description to \dc+) and
      3b17efdf (added description to \dC+).  Fix those cases back to 9.3 or
      9.2 as appropriate.
      
      Since this is evidently more easily missed than one would like, in HEAD
      also add an Assert that the supplied array is long enough.  This requires
      an API change for printQuery(), so it seems inappropriate for back
      branches, but presumably all future changes will be tested in HEAD anyway.
      
      In HEAD and 9.3, also clean up a whole lot of sloppiness in the emitted
      SQL for \dy (event triggers): lack of translatability due to failing to
      pass words-to-be-translated through gettext_noop(), inadequate schema
      qualification, and sloppy formatting resulting in unnecessarily ugly
      -E output.
      
      Peter Eisentraut and Tom Lane, per bug #8702 from Sergey Burladyan
      92459e7a
    • Tom Lane's avatar
      Fix header comment for bitncmp(). · 5858cf8a
      Tom Lane authored
      The result is an int less than, equal to, or greater than zero, in the
      style of memcmp (and, in fact, exactly the output of memcmp in some cases).
      This comment previously said -1, 1, or 0, which was an overspecification,
      as noted by Emre Hasegeli.  All of the existing callers appear to be fine
      with the actual behavior, so just fix the comment.
      
      In passing, improve infelicitous formatting of some call sites.
      5858cf8a
    • Tom Lane's avatar
      Fix typo in comment. · 99299756
      Tom Lane authored
      classifyClauses was renamed to classifyConditions somewhere along the
      line, but this comment didn't get the memo.
      
      Ian Barwick
      99299756
  9. 03 Jan, 2014 3 commits
    • Alvaro Herrera's avatar
      Restore some comments lost during 15732b34 · 1a3e82a7
      Alvaro Herrera authored
      Michael Paquier
      1a3e82a7
    • Tom Lane's avatar
      Ooops, should use double not single quotes in StaticAssertStmt(). · a3b4aeec
      Tom Lane authored
      That's what I get for testing this on an older compiler.
      a3b4aeec
    • Tom Lane's avatar
      Fix calculation of maximum statistics-message size. · a7ef273e
      Tom Lane authored
      The PGSTAT_NUM_TABENTRIES macro should have been updated when new fields
      were added to struct PgStat_MsgTabstat in commit 64482890, but it wasn't.
      Fix that.
      
      Also, add a static assertion that we didn't overrun the intended size limit
      on stats messages.  This will not necessarily catch every mistake in
      computing the maximum array size for stats messages, but it will catch ones
      that have practical consequences.  (The assertion in fact doesn't complain
      about the aforementioned error in PGSTAT_NUM_TABENTRIES, because that was
      not big enough to cause the array length to increase.)
      
      No back-patch, as there's no actual bug in existing releases; this is just
      in the nature of future-proofing.
      
      Mark Dilger and Tom Lane
      a7ef273e
  10. 02 Jan, 2014 3 commits
    • Alvaro Herrera's avatar
      Handle 5-char filenames in SlruScanDirectory · 638cf09e
      Alvaro Herrera authored
      Original users of slru.c were all producing 4-digit filenames, so that
      was all that that code was prepared to handle.  Changes to multixact.c
      in the course of commit 0ac5ad51 made pg_multixact/members create
      5-digit filenames once a certain threshold was reached, which
      SlruScanDirectory wasn't prepared to deal with; in particular,
      5-digit-name files were not removed during truncation.  Change that
      routine to make it aware of those files, and have it process them just
      like any others.
      
      Right now, some pg_multixact/members directories will contain a mixture
      of 4-char and 5-char filenames.  A future commit is expected fix things
      so that each slru.c user declares the correct maximum width for the
      files it produces, to avoid such unsightly mixtures.
      
      Noticed while investigating bug #8673 reported by Serge Negodyuck.
      638cf09e
    • Alvaro Herrera's avatar
      Wrap multixact/members correctly during extension · a50d9762
      Alvaro Herrera authored
      In the 9.2 code for extending multixact/members, the logic was very
      simple because the number of entries in a members page was a proper
      divisor of 2^32, and thus at 2^32 wraparound the logic for page switch
      was identical than at any other page boundary.  In commit 0ac5ad51 I
      failed to realize this and introduced code that was not able to go over
      the 2^32 boundary.  Fix that by ensuring that when we reach the last
      page of the last segment we correctly zero the initial page of the
      initial segment, using correct uint32-wraparound-safe arithmetic.
      
      Noticed while investigating bug #8673 reported by Serge Negodyuck, as
      diagnosed by Andres Freund.
      a50d9762
    • Alvaro Herrera's avatar
      Handle wraparound during truncation in multixact/members · 722acf51
      Alvaro Herrera authored
      In pg_multixact/members, relying on modulo-2^32 arithmetic for
      wraparound handling doesn't work all that well.  Because we don't
      explicitely track wraparound of the allocation counter for members, it
      is possible that the "live" area exceeds 2^31 entries; trying to remove
      SLRU segments that are "old" according to the original logic might lead
      to removal of segments still in use.  To fix, have the truncation
      routine use a tailored SlruScanDirectory callback that keeps track of
      the live area in actual use; that way, when the live range exceeds 2^31
      entries, the oldest segments still live will not get removed untimely.
      
      This new SlruScanDir callback needs to take care not to remove segments
      that are "in the future": if new SLRU segments appear while the
      truncation is ongoing, make sure we don't remove them.  This requires
      examination of shared memory state to recheck for false positives, but
      testing suggests that this doesn't cause a problem.  The original coding
      didn't suffer from this pitfall because segments created when truncation
      is running are never considered to be removable.
      
      Per Andres Freund's investigation of bug #8673 reported by Serge
      Negodyuck.
      722acf51