1. 20 Sep, 2010 1 commit
  2. 11 Sep, 2010 1 commit
    • SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99
      Joe Conway authored
      transaction snapshots, i.e. a snapshot registered at the beginning of
      a transaction. Change variable naming and comments to reflect this reality
      in preparation for a future, truly serializable mode, e.g.
      Serializable Snapshot Isolation (SSI).
      
      For the moment transaction snapshots are still used to implement
      SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
      Grittner and Dan Ports with review and some minor wording changes by me.
  3. 05 Aug, 2010 1 commit
    • Standardize get_whatever_oid functions for object types with · 2a6ef344
      Robert Haas authored
      unqualified names.
      
      - Add a missing_ok parameter to get_tablespace_oid.
      - Avoid duplicating get_tablespace_oid guts in objectNamesToOids.
      - Add a missing_ok parameter to get_database_oid.
      - Replace get_roleid and get_role_checked with get_role_oid.
      - Add get_namespace_oid, get_language_oid, get_am_oid.
      - Refactor existing code to use new interfaces.
      
      Thanks to KaiGai Kohei for the review.
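The missing_ok convention in the commit above can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: `INVALID_OID`, the `TABLESPACES` dictionary, and `get_object_oid` are hypothetical names, and the OID values are illustrative.

```python
INVALID_OID = 0  # stand-in for an "invalid OID" sentinel (assumption)

# Hypothetical in-memory catalog mapping unqualified names to OIDs.
TABLESPACES = {"pg_default": 1663, "pg_global": 1664}


def get_object_oid(catalog, name, missing_ok=False):
    """Return the OID for name. If the name is absent, return INVALID_OID
    when missing_ok is true; otherwise raise LookupError."""
    oid = catalog.get(name, INVALID_OID)
    if oid == INVALID_OID and not missing_ok:
        raise LookupError(f'object "{name}" does not exist')
    return oid
```

A caller that can tolerate a missing object passes `missing_ok=True` and tests the result against the sentinel; every other caller gets a uniform error without writing its own check.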
  4. 25 Jul, 2010 1 commit
  5. 22 Jul, 2010 1 commit
    • Centralize DML permissions-checking logic. · b8c6c71d
      Robert Haas authored
      Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
      call ExecCheckRTPerms with a manufactured RangeTblEntry.
      This is intended to make it feasible for an enhanced security provider
      to actually make use of ExecutorCheckPerms_hook, but also has the
      advantage that RI_Initial_Check can allow use of the fast-path when
      column-level but not table-level permissions are present.
      
      KaiGai Kohei.  Reviewed (in an earlier version) by Stephen Frost, and by me.
      Some further changes to the comments by me.
  6. 12 Jul, 2010 1 commit
    • Make NestLoop plan nodes pass outer-relation variables into their inner · 53e75768
      Tom Lane authored
      relation using the general PARAM_EXEC executor parameter mechanism, rather
      than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
      The previous method was hard to understand and could never be extended to
      handle parameters coming from multiple join levels.  This patch doesn't
      change the set of possible plans nor have any significant performance effect,
      but it's necessary infrastructure for future generalization of the concept
      of an inner indexscan plan.
      
      ExecReScan's second parameter is now unused, so it's removed.
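The idea of handing outer-relation values to the inner side through a parameter slot, rather than pushing the outer tuple down on rescan, can be sketched with a toy model. This is an illustration in Python under simplifying assumptions, not the executor's implementation: `nested_loop`, `make_index_scan`, and the `params` list are hypothetical.

```python
def nested_loop(outer_rows, inner_scan, params, param_id):
    """Join outer_rows against inner_scan, publishing each outer key in a
    shared parameter array instead of passing the outer row directly."""
    results = []
    for outer in outer_rows:
        params[param_id] = outer["key"]    # publish the outer value
        for inner in inner_scan(params):   # inner side reads it on rescan
            results.append((outer, inner))
    return results


def make_index_scan(table, param_id):
    """Return an inner scan that looks up rows matching params[param_id]."""
    def scan(params):
        return [row for row in table if row["key"] == params[param_id]]
    return scan
```

Because the inner scan only depends on a numbered slot, a scan nested several join levels down could read slots filled by different outer levels, which is the generalization the commit message points toward.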
  7. 09 Jul, 2010 1 commit
  8. 28 Apr, 2010 1 commit
    • Introduce wal_level GUC to explicitly control if information needed for · 9b8a7332
      Heikki Linnakangas authored
      archival or hot standby should be WAL-logged, instead of deducing that from
      other options like archive_mode. This replaces recovery_connections GUC in
      the primary, where it now has no effect, but it's still used in the standby
      to enable/disable hot standby.
      
      Remove the WAL-logging of "unlogged operations", like creating an index
      without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
      the wal_level setting and the settings that affect how much shared memory a
      hot standby server needs to track master transactions (max_connections,
      max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
      change, at server restart, write a WAL record noting the new settings and
      update pg_control. This allows us to notice the change in those settings in
      the standby at the right moment; they used to be included in checkpoint
      records, but that meant that a changed value was not reflected in the
      standby until the first checkpoint after the change.
      
      Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
      the sequence it used to follow, before hot standby and subsequent patches
      changed it to 0x9003.
  9. 26 Feb, 2010 1 commit
  10. 20 Feb, 2010 1 commit
    • Clean up handling of XactReadOnly and RecoveryInProgress checks. · 05d8a561
      Tom Lane authored
      Add some checks that seem logically necessary, in particular let's make
      real sure that HS slave sessions cannot create temp tables.  (If they did
      they would think that temp tables belonging to the master's session with
      the same BackendId were theirs.  We *must* not allow myTempNamespace to
      become set in a slave session.)
      
      Change setval() and nextval() so that they are only allowed on temp sequences
      in a read-only transaction.  This seems consistent with what we allow for
      table modifications in read-only transactions.  Since an HS slave can't have a
      temp sequence, this also provides a nicer cure for the setval PANIC reported
      by Erik Rijkers.
      
      Make the error messages more uniform, and have them mention the specific
      command being complained of.  This seems worth the trifling amount of extra
      code, since people are likely to see such messages a lot more than before.
  11. 09 Feb, 2010 1 commit
    • Fix up rickety handling of relation-truncation interlocks. · cbe9d6be
      Tom Lane authored
      Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr
      relation entries, so that they will get reset to InvalidBlockNumber whenever
      an smgr-level flush happens.  Because we now send smgr invalidation messages
      immediately (not at end of transaction) when a relation truncation occurs,
      this ensures that other backends will reset their values before they next
      access the relation.  We no longer need the unreliable assumption that a
      VACUUM that's doing a truncation will hold its AccessExclusive lock until
      commit --- in fact, we can intentionally release that lock as soon as we've
      completed the truncation.  This patch therefore reverts (most of) Alvaro's
      patch of 2009-11-10, as well as my marginal hacking on it yesterday.  We can
      also get rid of assorted no-longer-needed relcache flushes, which are far more
      expensive than an smgr flush because they kill a lot more state.
      
      In passing this patch fixes smgr_redo's failure to perform visibility-map
      truncation, and cleans up some rather dubious assumptions in freespace.c and
      visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of
      date.
  12. 07 Feb, 2010 1 commit
    • Create a "relation mapping" infrastructure to support changing the relfilenodes · b9b8831a
      Tom Lane authored
      of shared or nailed system catalogs.  This has two key benefits:
      
      * The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.
      
      * We no longer have to use an unsafe reindex-in-place approach for reindexing
        shared catalogs.
      
      CLUSTER on nailed catalogs now works too, although I left it disabled on
      shared catalogs because the resulting pg_index.indisclustered update would
      only be visible in one database.
      
      Since reindexing shared system catalogs is now fully transactional and
      crash-safe, the former special cases in REINDEX behavior have been removed;
      shared catalogs are treated the same as non-shared.
      
      This commit does not do anything about the recently-discussed problem of
      deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
      concurrent queries; will address that in a separate patch.  As a stopgap,
      parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
      such failures during the regression tests.
  13. 03 Feb, 2010 1 commit
  14. 28 Jan, 2010 1 commit
  15. 15 Jan, 2010 1 commit
    • Introduce Streaming Replication. · 40f908bd
      Heikki Linnakangas authored
      This includes two new kinds of postmaster processes, walsenders and
      walreceiver. Walreceiver is responsible for connecting to the primary server
      and streaming WAL to disk, while walsender runs in the primary server and
      streams WAL from disk to the client.
      
      Documentation still needs work, but the basics are there. We will probably
      pull the replication section to a new chapter later on, as well as the
      sections describing file-based replication. But let's do that as a separate
      patch, so that it's easier to see what has been added/changed. This patch
      also adds a new section to the chapter about FE/BE protocol, documenting the
      protocol used by walsender/walreceiver.
      
      Bump catalog version because of two new functions,
      pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
      monitoring the progress of replication.
      
      Fujii Masao, with additional hacking by me
  16. 08 Jan, 2010 1 commit
    • Fix oversight in EvalPlanQualFetch: after failing to lock a tuple because · 217dc525
      Tom Lane authored
      someone else has just updated it, we have to set priorXmax to that tuple's
      xmax (ie, the XID of the other xact that updated it) before looping back to
      examine the next tuple.  Obviously, the next tuple in the update chain should
      have that XID as its xmin, not the same xmin as the preceding tuple that we
      had been trying to lock.  The mismatch would cause the EvalPlanQual logic to
      decide that the tuple chain ended in a deletion, when actually there was a
      live tuple that should have been found.
      
      I inserted this error when recently adding logic to EvalPlanQual to make it
      lock tuples before returning them (as opposed to the old method in which the
      lock would occur much later, causing a great deal of work to be wasted if we
      only then discover someone else updated it).  Sigh.  Per today's report from
      Takahiro Itagaki of inconsistent results during pgbench runs.
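The xmin/xmax chaining rule behind this fix can be shown with a simplified model. This is a sketch in Python, not the EvalPlanQualFetch code: `TupleVersion` and `find_latest_version` are hypothetical, and locking is omitted entirely, so the "updated by someone else" case stands in for the failed-lock path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    xmin: int                 # XID that created this version
    xmax: Optional[int]       # XID that updated it, or None if still live
    next: Optional["TupleVersion"] = None  # successor in the update chain


def find_latest_version(start, prior_xmax):
    """Follow the update chain from start and return the live version,
    or None if the chain appears to end in a deletion."""
    version = start
    while version is not None:
        # Each version must have been created by the transaction that
        # updated its predecessor; a mismatch means the chain is broken.
        if version.xmin != prior_xmax:
            return None
        if version.xmax is None:
            return version  # nobody updated it: this is the live tuple
        # Someone else updated it: remember *their* XID before moving on.
        # Leaving prior_xmax unchanged here models the oversight.
        prior_xmax = version.xmax
        version = version.next
    return None
```

If the assignment to `prior_xmax` is dropped, the second version in any chain fails the xmin check and the function reports a deletion even though a live tuple exists further along, which matches the symptom in the commit message.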
  17. 06 Jan, 2010 1 commit
    • Preserve relfilenodes: · f98fbc78
      Bruce Momjian authored
      Add support to pg_dump --binary-upgrade to preserve all relfilenodes,
      for use by pg_migrator.
  18. 02 Jan, 2010 1 commit
  19. 15 Dec, 2009 1 commit
  20. 11 Dec, 2009 1 commit
    • Ensure that the result tuple of an EvalPlanQual cycle gets materialized · d8e511fa
      Tom Lane authored
      before we zap the input tuple.  Otherwise, pass-by-reference columns of
      the result slot are likely to contain just references to the input
      tuple, leading to big trouble if the pfree'd space is reused.  Per
      trouble report from Jaime Casanova.  This is a new bug in the recent
      rewrite of EvalPlanQual, so nothing to back-patch.
  21. 09 Dec, 2009 1 commit
    • Prevent indirect security attacks via changing session-local state within · 62aba765
      Tom Lane authored
      an allegedly immutable index function.  It was previously recognized that
      we had to prevent such a function from executing SET/RESET ROLE/SESSION
      AUTHORIZATION, or it could trivially obtain the privileges of the session
      user.  However, since there is in general no privilege checking for changes
      of session-local state, it is also possible for such a function to change
      settings in a way that might subvert later operations in the same session.
      Examples include changing search_path to cause an unexpected function to
      be called, or replacing an existing prepared statement with another one
      that will execute a function of the attacker's choosing.
      
      The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
      these threats, which are the same places previously deemed to need protection
      against the SET ROLE issue.  GUC changes are still allowed, since there are
      many useful cases for that, but we prevent security problems by forcing a
      rollback of any GUC change after completing the operation.  Other cases are
      handled by throwing an error if any change is attempted; these include temp
      table creation, closing a cursor, and creating or deleting a prepared
      statement.  (In 7.4, the infrastructure to roll back GUC changes doesn't
      exist, so we settle for rejecting changes of "search_path" in these contexts.)
      
      Original report and patch by Gurjeet Singh, additional analysis by
      Tom Lane.
      
      Security: CVE-2009-4136
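The "allow the change, then force a rollback" approach to settings can be sketched as follows. This is a toy Python model, not the GUC machinery: `session_settings`, `settings_rolled_back`, and `untrusted_index_function` are hypothetical names with illustrative values.

```python
from contextlib import contextmanager

# Hypothetical session-local settings (illustrative values).
session_settings = {"search_path": "public", "work_mem": "4MB"}


@contextmanager
def settings_rolled_back(settings):
    """Run a block, then restore settings to their prior values,
    whether the block succeeded or raised."""
    saved = dict(settings)
    try:
        yield settings
    finally:
        settings.clear()
        settings.update(saved)


def untrusted_index_function(settings):
    # Stand-in for an "immutable" function that tampers with session state.
    settings["search_path"] = "attacker_schema, public"
    return 42
```

Wrapping the maintenance operation in `with settings_rolled_back(session_settings):` lets the function change settings while it runs, but nothing it changed survives into later statements of the session.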
  22. 20 Nov, 2009 1 commit
    • Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be · 7fc0f062
      Tom Lane authored
      checked to determine whether the trigger should be fired.
      
      For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
      triggers it can provide a noticeable performance improvement, since queuing of
      a deferred trigger event and re-fetching of the row(s) at end of statement can
      be short-circuited if the trigger does not need to be fired.
      
      Takahiro Itagaki, reviewed by KaiGai Kohei.
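The saving for AFTER triggers comes from never queuing an event for a row that fails the condition. A minimal Python sketch of that filter, with a hypothetical `queue_after_trigger_events` helper and rows modeled as dictionaries:

```python
def queue_after_trigger_events(changes, when=None):
    """Return the (old, new) row pairs for which an AFTER trigger event
    would be queued; rows failing the WHEN condition are never queued."""
    return [(old, new) for old, new in changes
            if when is None or when(old, new)]
```

With a condition such as `lambda old, new: old["balance"] != new["balance"]`, updates that leave the column unchanged produce no queued event, so there is nothing to re-fetch at end of statement.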
  23. 26 Oct, 2009 1 commit
    • Re-implement EvalPlanQual processing to improve its performance and eliminate · 9f2ee8f2
      Tom Lane authored
      a lot of strange behaviors that occurred in join cases.  We now identify the
      "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
      UPDATE/SHARE queries.  If an EvalPlanQual recheck is necessary, we jam the
      appropriate row into each scan node in the rechecking plan, forcing it to emit
      only that one row.  The former behavior could rescan the whole of each joined
      relation for each recheck, which was terrible for performance, and what's much
      worse could result in duplicated output tuples.
      
      Also, the original implementation of EvalPlanQual could not re-use the recheck
      execution tree --- it had to go through a full executor init and shutdown for
      every row to be tested.  To avoid this overhead, I've associated a special
      runtime Param with each LockRows or ModifyTable plan node, and arranged to
      make every scan node below such a node depend on that Param.  Thus, by
      signaling a change in that Param, the EPQ machinery can just rescan the
      already-built test plan.
      
      This patch also adds a prohibition on set-returning functions in the
      targetlist of SELECT FOR UPDATE/SHARE.  This is needed to avoid the
      duplicate-output-tuple problem.  It seems fairly reasonable since the
      other restrictions on SELECT FOR UPDATE are meant to ensure that there
      is a unique correspondence between source tuples and result tuples,
      which an output SRF destroys as much as anything else does.
  24. 12 Oct, 2009 1 commit
    • Move the handling of SELECT FOR UPDATE locking and rechecking out of · 0adaf4cb
      Tom Lane authored
      execMain.c and into a new plan node type LockRows.  Like the recent change
      to put table updating into a ModifyTable plan node, this increases planning
      flexibility by allowing the operations to occur below the top level of the
      plan tree.  It's necessary in any case to restore the previous behavior of
      having FOR UPDATE locking occur before ModifyTable does.
      
      This partially refactors EvalPlanQual to allow multiple rows-under-test
      to be inserted into the EPQ machinery before starting an EPQ test query.
      That isn't sufficient to fix EPQ's general bogosity in the face of plans
      that return multiple rows per test row, though.  Since this patch is
      mostly about getting some plan node infrastructure in place and not about
      fixing ten-year-old bugs, I will leave EPQ improvements for another day.
      
      Another behavioral change that we could now think about is doing FOR UPDATE
      before LIMIT, but that too seems like it should be treated as a follow-on
      patch.
  25. 10 Oct, 2009 1 commit
    • Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. · 8a5849b7
      Tom Lane authored
      They are now handled by a new plan node type called ModifyTable, which is
      placed at the top of the plan tree.  In itself this change doesn't do much,
      except perhaps make the handling of RETURNING lists and inherited UPDATEs a
      tad less klugy.  But it is necessary preparation for the intended extension of
      allowing RETURNING queries inside WITH.
      
      Marko Tiikkaja
  26. 08 Oct, 2009 1 commit
    • Remove very ancient tuple-counting infrastructure (IncrRetrieved() and · c970292a
      Tom Lane authored
      friends).  This code has all been ifdef'd out for many years, and doesn't
      seem to have any prospect of becoming any more useful in the future.
      EXPLAIN ANALYZE is what people use in practice, and I think if we did want
      process-wide counters we'd be more likely to put in dtrace events for that
      than try to resurrect this code.  Get rid of it so as to have one less detail
      to worry about while refactoring execMain.c.
  27. 05 Oct, 2009 1 commit
    • Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust · 249724cb
      Tom Lane authored
      the privileges that will be applied to subsequently-created objects.
      
      Such adjustments are always per owning role, and can be restricted to objects
      created in particular schemas too.  A notable benefit is that users can
      override the traditional default privilege settings, eg, the PUBLIC EXECUTE
      privilege traditionally granted by default for functions.
      
      Petr Jelinek
  28. 27 Sep, 2009 1 commit
    • Replace the array-style TupleTable data structure with a simple List of · f92e8a4b
      Tom Lane authored
      TupleTableSlot nodes.  This eliminates the need to count in advance
      how many Slots will be needed, which seems more than worth the small
      increase in the amount of palloc traffic during executor startup.
      
      The ExecCountSlots infrastructure is now all dead code, but I'll remove it
      in a separate commit for clarity.
      
      Per a comment from Robert Haas.
  29. 26 Sep, 2009 1 commit
    • Extend the BKI infrastructure to allow system catalogs to be given · 49856352
      Tom Lane authored
      hand-assigned rowtype OIDs, even when they are not "bootstrapped" catalogs
      that have handmade type rows in pg_type.h.  Give pg_database such an OID.
      Restore the availability of C macros for the rowtype OIDs of the bootstrapped
      catalogs.  (These macros are now in the individual catalogs' .h files,
      though, not in pg_type.h.)
      
      This commit doesn't do anything especially useful by itself, but it's
      necessary infrastructure for reverting some ill-considered changes in
      relcache.c.
  30. 29 Jul, 2009 1 commit
    • Support deferrable uniqueness constraints. · 25d9bf2e
      Tom Lane authored
      The current implementation fires an AFTER ROW trigger for each tuple that
      looks like it might be non-unique according to the index contents at the
      time of insertion.  This works well as long as there aren't many conflicts,
      but won't scale to massive unique-key reassignments.  Improving that case
      is a TODO item.
      
      Dean Rasheed
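The "queue a recheck only for entries that look non-unique at insertion time" strategy can be modeled in a few lines. This is a toy Python sketch, not the index or trigger code: `DeferredUniqueIndex` and its methods are hypothetical.

```python
from collections import Counter


class DeferredUniqueIndex:
    """Toy unique index whose violations are checked at commit time."""

    def __init__(self):
        self.keys = Counter()   # key -> number of live entries
        self.pending = []       # keys that looked non-unique at insert time

    def insert(self, key):
        if self.keys[key] > 0:
            self.pending.append(key)  # possible conflict: recheck later
        self.keys[key] += 1

    def delete(self, key):
        self.keys[key] -= 1

    def commit(self):
        """Recheck only the queued keys; raise if any is still duplicated."""
        for key in self.pending:
            if self.keys[key] > 1:
                raise ValueError(f"duplicate key {key!r}")
        self.pending.clear()
```

Shifting keys 1 and 2 up to 2 and 3 queues a single recheck for key 2, which passes at commit because the old entry is gone by then. The cost grows with the number of transient conflicts, which is why the commit message flags massive key reassignments as a scaling problem.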
  31. 11 Jun, 2009 2 commits
  32. 07 May, 2009 1 commit
    • Add an option to AlterTableCreateToastTable() to allow its caller to force · 1e06ed1a
      Tom Lane authored
      a toast table to be built, even if the sum-of-column-widths calculation
      indicates one isn't needed.  This is needed by pg_migrator because if the
      old table has a toast table, we have to migrate over the toast table since
      it might contain some live data, even though subsequent column drops could
      mean that no recently-added rows could require toasting.
  33. 08 Feb, 2009 1 commit
    • Ensure that INSERT ... SELECT into a table with OIDs never copies row OIDs · 3d02cae3
      Tom Lane authored
      from the source table.  This could never happen anyway before 8.4 because
      the executor invariably applied a "junk filter" to rows due to be inserted;
      but now that we skip doing that when it's not necessary, the case can occur.
      Problem noted 2008-11-27 by KaiGai Kohei, though I misunderstood what he
      was on about at the time (the opacity of the patch he proposed didn't help).
  34. 02 Feb, 2009 1 commit
  35. 22 Jan, 2009 1 commit
  36. 01 Jan, 2009 1 commit
  37. 30 Nov, 2008 1 commit
    • Clean up the API for DestReceiver objects by eliminating the assumption · c1f30733
      Tom Lane authored
      that a Portal is a useful and sufficient additional argument for
      CreateDestReceiver --- it just isn't, in most cases.  Instead formalize
      the approach of passing any needed parameters to the receiver separately.
      
      One unexpected benefit of this change is that we can declare typedef Portal
      in a less surprising location.
      
      This patch is just code rearrangement and doesn't change any functionality.
      I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
  38. 19 Nov, 2008 1 commit
    • Some infrastructure changes for the upcoming auto-explain contrib module: · cd35e9d7
      Tom Lane authored
      * Refactor explain.c slightly to export a convenient-to-use subroutine
      for printing EXPLAIN results.
      
      * Provide hooks for plugins to get control at ExecutorStart and ExecutorEnd
      as well as ExecutorRun.
      
      * Add some minimal support for tracking the total runtime of ExecutorRun.
      This code won't actually do anything unless a plugin prods it to.
      
      * Change the API of the DefineCustomXXXVariable functions to allow nonzero
      "flags" to be specified for a custom GUC variable.  While at it, also make
      the "bootstrap" default value for custom GUCs be explicitly specified as a
      parameter to these functions.  This is to eliminate confusion over where the
      default comes from, as has been expressed in the past by some users of the
      custom-variable facility.
      
      * Refactor GUC code a bit to ensure that a custom variable gets initialized to
      something valid (like its default value) even if the placeholder value was
      invalid.
  39. 16 Nov, 2008 1 commit
    • Modify UPDATE/DELETE WHERE CURRENT OF to use the FOR UPDATE infrastructure to · 18004101
      Tom Lane authored
      locate the target row, if the cursor was declared with FOR UPDATE or FOR
      SHARE.  This approach is more flexible and reliable than digging through the
      plan tree; for instance it can cope with join cursors.  But we still provide
      the old code for use with non-FOR-UPDATE cursors.  Per gripe from Robert Haas.