Commits · b9b8831ad60f6e4bd580fe6dbe9749359298a3c4 · Abuhujair Javed / Postgres FD Implementation

07 Feb, 2010 1 commit

Create a "relation mapping" infrastructure to support changing the relfilenodes · b9b8831a

Tom Lane authored 15 years ago

of shared or nailed system catalogs.  This has two key benefits:

* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.

* We no longer have to use an unsafe reindex-in-place approach for reindexing
  shared catalogs.

CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.

Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.

This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch.  As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.

b9b8831a

03 Feb, 2010 1 commit

Move the responsibility of writing an "unlogged WAL operation" record from · 9de778b2

Heikki Linnakangas authored 15 years ago

heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.

9de778b2

28 Jan, 2010 1 commit

Type table feature · e7b3349a

Peter Eisentraut authored 15 years ago

This adds the CREATE TABLE name OF type command, per SQL standard.

e7b3349a

15 Jan, 2010 1 commit

Introduce Streaming Replication. · 40f908bd

Heikki Linnakangas authored 15 years ago

This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me

40f908bd

08 Jan, 2010 1 commit

Fix oversight in EvalPlanQualFetch: after failing to lock a tuple because · 217dc525

Tom Lane authored 15 years ago

someone else has just updated it, we have to set priorXmax to that tuple's
xmax (i.e., the XID of the other xact that updated it) before looping back to
examine the next tuple. Obviously, the next tuple in the update chain should
have that XID as its xmin, not the same xmin as the preceding tuple that we
had been trying to lock. The mismatch would cause the EvalPlanQual logic to
decide that the tuple chain ended in a deletion, when actually there was a
live tuple that should have been found.

I inserted this error when recently adding logic to EvalPlanQual to make it
lock tuples before returning them (as opposed to the old method in which the
lock would occur much later, causing a great deal of work to be wasted if we
only then discover someone else updated it). Sigh. Per today's report from
Takahiro Itagaki of inconsistent results during pgbench runs.
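
The chain-following rule described in this message can be illustrated with a small model. The sketch below is not PostgreSQL code: `TupleVersion`, `find_latest`, and the `carry_xmax` switch are hypothetical names, XIDs are plain integers, and locking and visibility checks are left out. It only shows why the expected xmin has to advance to the previous version's xmax at each step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    """One version of a row in a simplified update chain (hypothetical model)."""
    xmin: int                               # XID that created this version
    xmax: Optional[int] = None              # XID that replaced it, if any
    next: Optional["TupleVersion"] = None   # newer version, if any


def find_latest(start, prior_xmax, carry_xmax=True):
    """Follow the chain from `start` and return the newest live version.

    Returns None when the chain appears to end in a deletion. With
    carry_xmax=False the function models the oversight: the expected
    xmin is never advanced, so the second hop always looks broken.
    """
    tup = start
    while tup is not None:
        # A version belongs to the chain only if it was created by the
        # transaction that replaced its predecessor.
        if tup.xmin != prior_xmax:
            return None
        if tup.xmax is None:
            return tup                      # live version found
        if carry_xmax:
            prior_xmax = tup.xmax           # the step the fix adds
        tup = tup.next
    return None
```

With a chain whose versions were created by XIDs 200 and 300, starting at the first version with `prior_xmax=200` reaches the live version only when `carry_xmax` is true; otherwise the walk reports a deletion, which matches the symptom described above.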

217dc525

06 Jan, 2010 1 commit

Preserve relfilenodes: · f98fbc78

Bruce Momjian authored 15 years ago

Add support to pg_dump --binary-upgrade to preserve all relfilenodes,
for use by pg_migrator.

f98fbc78

02 Jan, 2010 1 commit

Update copyright for the year 2010. · 02398008

Bruce Momjian authored 15 years ago

02398008

15 Dec, 2009 1 commit

Add an EXPLAIN (BUFFERS) option to show buffer-usage statistics. · cddca5ec

Robert Haas authored 15 years ago

This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.

Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.

cddca5ec

11 Dec, 2009 1 commit

Ensure that the result tuple of an EvalPlanQual cycle gets materialized · d8e511fa

Tom Lane authored 15 years ago

before we zap the input tuple. Otherwise, pass-by-reference columns of
the result slot are likely to contain just references to the input
tuple, leading to big trouble if the pfree'd space is reused. Per
trouble report from Jaime Casanova. This is a new bug in the recent
rewrite of EvalPlanQual, so nothing to back-patch.

d8e511fa

09 Dec, 2009 1 commit

Prevent indirect security attacks via changing session-local state within · 62aba765

Tom Lane authored 15 years ago

an allegedly immutable index function. It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user. However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.

The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue. GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation. Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement. (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)
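
The two policies in this paragraph (roll back setting changes, reject other session-state changes) can be sketched as follows. This is a toy Python model under stated assumptions, not the server's implementation: `Session`, `security_restricted`, and `RestrictedOperationError` are hypothetical names, and only prepared statements stand in for the "other cases".

```python
from contextlib import contextmanager


class RestrictedOperationError(Exception):
    """Raised when a forbidden session-state change is attempted (hypothetical)."""


class Session:
    def __init__(self):
        self.settings = {"search_path": "public"}
        self.prepared = {}
        self.restricted = False

    def set(self, name, value):
        # Setting changes stay allowed; they are undone when the
        # restricted operation finishes.
        self.settings[name] = value

    def prepare(self, name, sql):
        # Changes that cannot be rolled back are rejected outright.
        if self.restricted:
            raise RestrictedOperationError("cannot PREPARE in a restricted operation")
        self.prepared[name] = sql

    @contextmanager
    def security_restricted(self):
        saved = dict(self.settings)
        previous = self.restricted
        self.restricted = True
        try:
            yield self
        finally:
            # Forced rollback of every setting change, even after an error.
            self.restricted = previous
            self.settings = saved
```

A function run inside `with session.security_restricted():` can change `search_path` for its own use, but the old value is back once the block exits.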

Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.

Security: CVE-2009-4136

62aba765

20 Nov, 2009 1 commit

Add a WHEN clause to CREATE TRIGGER, allowing a boolean expression to be · 7fc0f062

Tom Lane authored 15 years ago

checked to determine whether the trigger should be fired.

For BEFORE triggers this is mostly a matter of spec compliance; but for AFTER
triggers it can provide a noticeable performance improvement, since queuing of
a deferred trigger event and re-fetching of the row(s) at end of statement can
be short-circuited if the trigger does not need to be fired.
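
The saving for AFTER triggers can be pictured with a minimal model in which the condition is tested before an event is queued. The function below is a hypothetical Python sketch (`run_after_row_triggers` and its arguments are illustrative names), not the trigger manager's actual code.

```python
def run_after_row_triggers(changes, trigger_fn, when=None):
    """Queue one event per changed row, then fire the queue at end of statement.

    `changes` is a list of (old_row, new_row) pairs. If `when` is given,
    rows that fail the condition are never queued, so no work is spent
    on them later. Returns the number of events that were queued.
    """
    queue = []
    for old, new in changes:
        if when is None or when(old, new):
            queue.append((old, new))
    # End of statement: fire the deferred events.
    for old, new in queue:
        trigger_fn(old, new)
    return len(queue)
```

For example, a condition such as `lambda old, new: old != new` would skip rows that an UPDATE left unchanged.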

Takahiro Itagaki, reviewed by KaiGai Kohei.

7fc0f062

26 Oct, 2009 1 commit

Re-implement EvalPlanQual processing to improve its performance and eliminate · 9f2ee8f2

Tom Lane authored 15 years ago

a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.

Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.

This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.

9f2ee8f2

12 Oct, 2009 1 commit

Move the handling of SELECT FOR UPDATE locking and rechecking out of · 0adaf4cb

Tom Lane authored 15 years ago

execMain.c and into a new plan node type LockRows. Like the recent change
to put table updating into a ModifyTable plan node, this increases planning
flexibility by allowing the operations to occur below the top level of the
plan tree. It's necessary in any case to restore the previous behavior of
having FOR UPDATE locking occur before ModifyTable does.

This partially refactors EvalPlanQual to allow multiple rows-under-test
to be inserted into the EPQ machinery before starting an EPQ test query.
That isn't sufficient to fix EPQ's general bogosity in the face of plans
that return multiple rows per test row, though. Since this patch is
mostly about getting some plan node infrastructure in place and not about
fixing ten-year-old bugs, I will leave EPQ improvements for another day.

Another behavioral change that we could now think about is doing FOR UPDATE
before LIMIT, but that too seems like it should be treated as a follow-on
patch.

0adaf4cb

10 Oct, 2009 1 commit

Split the processing of INSERT/UPDATE/DELETE operations out of execMain.c. · 8a5849b7

Tom Lane authored 15 years ago

They are now handled by a new plan node type called ModifyTable, which is
placed at the top of the plan tree.  In itself this change doesn't do much,
except perhaps make the handling of RETURNING lists and inherited UPDATEs a
tad less klugy.  But it is necessary preparation for the intended extension of
allowing RETURNING queries inside WITH.

Marko Tiikkaja

8a5849b7

08 Oct, 2009 1 commit

Remove very ancient tuple-counting infrastructure (IncrRetrieved() and · c970292a

Tom Lane authored 15 years ago

friends). This code has all been ifdef'd out for many years, and doesn't
seem to have any prospect of becoming any more useful in the future.
EXPLAIN ANALYZE is what people use in practice, and I think if we did want
process-wide counters we'd be more likely to put in dtrace events for that
than try to resurrect this code. Get rid of it so as to have one less detail
to worry about while refactoring execMain.c.

c970292a

05 Oct, 2009 1 commit

Create an ALTER DEFAULT PRIVILEGES command, which allows users to adjust · 249724cb

Tom Lane authored 15 years ago

the privileges that will be applied to subsequently-created objects.

Such adjustments are always per owning role, and can be restricted to objects
created in particular schemas too.  A notable benefit is that users can
override the traditional default privilege settings, e.g., the PUBLIC EXECUTE
privilege traditionally granted by default for functions.

Petr Jelinek

249724cb

27 Sep, 2009 1 commit

Replace the array-style TupleTable data structure with a simple List of · f92e8a4b

Tom Lane authored 15 years ago

TupleTableSlot nodes.  This eliminates the need to count in advance
how many Slots will be needed, which seems more than worth the small
increase in the amount of palloc traffic during executor startup.

The ExecCountSlots infrastructure is now all dead code, but I'll remove it
in a separate commit for clarity.

Per a comment from Robert Haas.

f92e8a4b

26 Sep, 2009 1 commit

Extend the BKI infrastructure to allow system catalogs to be given · 49856352

Tom Lane authored 15 years ago

hand-assigned rowtype OIDs, even when they are not "bootstrapped" catalogs
that have handmade type rows in pg_type.h.  Give pg_database such an OID.
Restore the availability of C macros for the rowtype OIDs of the bootstrapped
catalogs.  (These macros are now in the individual catalogs' .h files,
though, not in pg_type.h.)

This commit doesn't do anything especially useful by itself, but it's
necessary infrastructure for reverting some ill-considered changes in
relcache.c.

49856352

29 Jul, 2009 1 commit

Support deferrable uniqueness constraints. · 25d9bf2e

Tom Lane authored 15 years ago

The current implementation fires an AFTER ROW trigger for each tuple that
looks like it might be non-unique according to the index contents at the
time of insertion.  This works well as long as there aren't many conflicts,
but won't scale to massive unique-key reassignments.  Improving that case
is a TODO item.
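
The strategy above (flag possible conflicts at insertion time, recheck only those later) can be sketched with a toy index. This Python model is an illustration under simplifying assumptions: `DeferredUniqueIndex` is a hypothetical name, index entries are plain key counts, and the recheck stands in for the per-tuple AFTER ROW trigger.

```python
from collections import Counter


class DeferredUniqueIndex:
    """Toy model of a deferrable uniqueness check (not PostgreSQL code)."""

    def __init__(self):
        self.counts = Counter()   # live entries per key
        self.suspects = set()     # keys that looked non-unique when inserted

    def insert(self, key):
        if self.counts[key] > 0:
            # Possible conflict: remember it instead of failing now.
            self.suspects.add(key)
        self.counts[key] += 1

    def delete(self, key):
        self.counts[key] -= 1

    def check_deferred(self):
        """Recheck only the flagged keys, as the deferred check would."""
        violations = sorted(k for k in self.suspects if self.counts[k] > 1)
        self.suspects.clear()
        if violations:
            raise ValueError(f"duplicate keys: {violations}")
```

Shifting keys 1 and 2 up by one flags key 2 while the statement runs, yet the final recheck passes because the conflict is gone by then. The cost is one recheck per flagged key, which is why many conflicts scale poorly.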

Dean Rasheed

25d9bf2e

11 Jun, 2009 2 commits

Revisit AlterTableCreateToastTable's API once again, hoping to make it what · 44aa60fa

Tom Lane authored 15 years ago

pg_migrator actually needs and not just a partial solution.  We have to be
able to specify the OID that the new toast table should be created with.

44aa60fa

8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list · d7471402

Bruce Momjian authored 15 years ago

provided by Andrew.

d7471402

07 May, 2009 1 commit

Add an option to AlterTableCreateToastTable() to allow its caller to force · 1e06ed1a

Tom Lane authored 15 years ago

a toast table to be built, even if the sum-of-column-widths calculation
indicates one isn't needed. This is needed by pg_migrator because if the
old table has a toast table, we have to migrate over the toast table since
it might contain some live data, even though subsequent column drops could
mean that no recently-added rows could require toasting.

1e06ed1a

08 Feb, 2009 1 commit

Ensure that INSERT ... SELECT into a table with OIDs never copies row OIDs · 3d02cae3

Tom Lane authored 16 years ago

from the source table. This could never happen anyway before 8.4 because
the executor invariably applied a "junk filter" to rows due to be inserted;
but now that we skip doing that when it's not necessary, the case can occur.
Problem noted 2008-11-27 by KaiGai Kohei, though I misunderstood what he
was on about at the time (the opacity of the patch he proposed didn't help).

3d02cae3

02 Feb, 2009 1 commit

Allow reloption names to have qualifiers, initially supporting a TOAST · 3a5b7737

Alvaro Herrera authored 16 years ago

qualifier, and add support for this in pg_dump.

This allows TOAST tables to have user-defined fillfactor, and will also
enable us to move the autovacuum parameters to reloptions without taking
away the possibility of setting values for TOAST tables.

3a5b7737

22 Jan, 2009 1 commit

Support column-level privileges, as required by SQL standard. · 3cb5d658

Tom Lane authored 16 years ago

Stephen Frost, with help from KaiGai Kohei and others

3cb5d658

01 Jan, 2009 1 commit

Update copyright for 2009. · 511db38a

Bruce Momjian authored 16 years ago

511db38a

30 Nov, 2008 1 commit

Clean up the API for DestReceiver objects by eliminating the assumption · c1f30733

Tom Lane authored 16 years ago

that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases. Instead formalize
the approach of passing any needed parameters to the receiver separately.

One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.

This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.

c1f30733

19 Nov, 2008 1 commit

Some infrastructure changes for the upcoming auto-explain contrib module: · cd35e9d7

Tom Lane authored 16 years ago

* Refactor explain.c slightly to export a convenient-to-use subroutine
for printing EXPLAIN results.

* Provide hooks for plugins to get control at ExecutorStart and ExecutorEnd
as well as ExecutorRun.

* Add some minimal support for tracking the total runtime of ExecutorRun.
This code won't actually do anything unless a plugin prods it to.

* Change the API of the DefineCustomXXXVariable functions to allow nonzero
"flags" to be specified for a custom GUC variable.  While at it, also make
the "bootstrap" default value for custom GUCs be explicitly specified as a
parameter to these functions.  This is to eliminate confusion over where the
default comes from, as has been expressed in the past by some users of the
custom-variable facility.

* Refactor GUC code a bit to ensure that a custom variable gets initialized to
something valid (like its default value) even if the placeholder value was
invalid.
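
The hook pattern in the second and third bullets can be modeled as below. This is a hedged Python sketch, not the backend's C interface: `Executor`, `install_timer`, and the phase names are hypothetical, and the "standard" behavior just records that a phase ran. It shows a plugin saving any previously installed hook and calling through to it, so several plugins could stack.

```python
import time


class Executor:
    """Toy executor with one optional hook per phase (hypothetical model)."""

    PHASES = ("start", "run", "end")

    def __init__(self):
        self.hooks = {phase: None for phase in self.PHASES}
        self.trace = []

    def standard(self, phase, query):
        # Stand-in for the built-in behavior of each phase.
        self.trace.append((phase, query))

    def call(self, phase, query):
        hook = self.hooks[phase]
        if hook is None:
            self.standard(phase, query)
        else:
            hook(self, query)

    def execute(self, query):
        for phase in self.PHASES:
            self.call(phase, query)


def install_timer(executor, timings):
    """Plugin that measures time from the start phase to the end phase."""
    prev_start = executor.hooks["start"]
    prev_end = executor.hooks["end"]
    started = {}

    def on_start(ex, query):
        started[query] = time.perf_counter()
        # Chain to the earlier hook if there was one, else the standard code.
        if prev_start is None:
            ex.standard("start", query)
        else:
            prev_start(ex, query)

    def on_end(ex, query):
        if prev_end is None:
            ex.standard("end", query)
        else:
            prev_end(ex, query)
        timings.append((query, time.perf_counter() - started.pop(query)))

    executor.hooks["start"] = on_start
    executor.hooks["end"] = on_end
```

Nothing is timed unless a plugin installs itself, which mirrors the note that the runtime-tracking code does nothing until a plugin prods it.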

cd35e9d7

16 Nov, 2008 1 commit

Modify UPDATE/DELETE WHERE CURRENT OF to use the FOR UPDATE infrastructure to · 18004101

Tom Lane authored 16 years ago

locate the target row, if the cursor was declared with FOR UPDATE or FOR
SHARE. This approach is more flexible and reliable than digging through the
plan tree; for instance it can cope with join cursors. But we still provide
the old code for use with non-FOR-UPDATE cursors. Per gripe from Robert Haas.

18004101

15 Nov, 2008 1 commit

Make SELECT FOR UPDATE/SHARE work on inheritance trees, by having the plan · 0656ed3d

Tom Lane authored 16 years ago

return the tableoid as well as the ctid for any FOR UPDATE targets that
have child tables. All child tables are listed in the ExecRowMark list,
but the executor just skips the ones that didn't produce the current row.

Curiously, this longstanding restriction doesn't seem to have been documented
anywhere; so no doc changes.

0656ed3d

06 Nov, 2008 1 commit

Improve bulk-insert performance by keeping the current target buffer pinned · 85e2cedf

Tom Lane authored 16 years ago

(but not locked, as that would risk deadlocks).  Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.
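
A rough model of the two ideas, keeping the target buffer pinned between rows and cycling through a small ring, is sketched below. It is a simplified Python illustration with hypothetical names (`BulkInsertState` here is a toy class, and `ring_size` and `rows_per_page` are made-up parameters), not the buffer manager's real logic.

```python
class BulkInsertState:
    """Toy model: one pin per page instead of one per row (not PostgreSQL code)."""

    def __init__(self, ring_size, rows_per_page):
        self.ring_size = ring_size
        self.rows_per_page = rows_per_page
        self.current_page = None    # page whose buffer is currently pinned
        self.rows_on_page = 0
        self.pin_count = 0          # how many times a buffer had to be pinned
        self.slots_used = set()     # distinct ring slots ever touched

    def insert(self, row):
        page_full = self.rows_on_page == self.rows_per_page
        if self.current_page is None or page_full:
            # Move to the next page: release the old pin, pin a ring slot.
            self.current_page = 0 if self.current_page is None else self.current_page + 1
            self.rows_on_page = 0
            self.pin_count += 1
            self.slots_used.add(self.current_page % self.ring_size)
        # Otherwise the buffer is still pinned and is reused directly.
        self.rows_on_page += 1
```

Inserting ten rows at four rows per page pins a buffer three times rather than ten, and with a ring of two slots only two buffers are ever touched.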

Robert Haas, after an idea of Simon Riggs'.

85e2cedf

31 Oct, 2008 1 commit

Simplify ExecutorRun's API and save some trivial number of cycles by having · df5a9961

Tom Lane authored 16 years ago

it just return void instead of sometimes returning a TupleTableSlot. SQL
functions don't need that anymore, and noplace else does either. Eliminating
the return value also means one less hassle for the ExecutorRun hook functions
that will be supported beginning in 8.4.

df5a9961

25 Aug, 2008 1 commit

Move exprType(), exprTypmod(), expression_tree_walker(), and related routines · e5536e77

Tom Lane authored 16 years ago

into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend.  There's probably more that should be done along this line,
but this is a start anyway.

e5536e77

08 Aug, 2008 1 commit

Install checks in executor startup to ensure that the tuples produced by an · 30fd8ec7

Tom Lane authored 16 years ago

INSERT or UPDATE will match the target table's current rowtype. In pre-8.3
releases inconsistency can arise with stale cached plans, as reported by
Merlin Moncure. (We patched the equivalent hazard on the SELECT side in Feb
2007; I'm not sure why we thought there was no risk on the insertion side.)
In 8.3 and HEAD this problem should be impossible due to plan cache
invalidation management, but it seems prudent to make the check anyway.

Back-patch as far as 8.0. 7.x versions lack ALTER COLUMN TYPE, so there
seems no way to abuse a stale plan comparably.
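
The kind of startup check described here can be illustrated with a short comparison of what a plan produces against the table's current column types. The function is a hypothetical Python sketch (`check_plan_matches_rowtype` and the type-name strings are illustrative) and ignores details such as dropped columns.

```python
def check_plan_matches_rowtype(plan_types, table_types):
    """Raise TypeError if the plan's output no longer fits the table.

    Both arguments are lists of type names in column order. A stale
    cached plan built before an ALTER COLUMN TYPE would fail here
    instead of writing mismatched tuples.
    """
    if len(plan_types) != len(table_types):
        raise TypeError(
            f"plan produces {len(plan_types)} columns, table has {len(table_types)}"
        )
    for pos, (produced, expected) in enumerate(zip(plan_types, table_types), start=1):
        if produced != expected:
            raise TypeError(
                f"column {pos}: plan produces {produced}, table expects {expected}"
            )
```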

30fd8ec7

26 Jul, 2008 1 commit

As noted by Andrew Gierth, there's really no need any more to force a junk · a77eaa6a

Tom Lane authored 16 years ago

filter to be used when INSERT or SELECT INTO has a plan that returns raw
disk tuples. The virtual-tuple-slot optimizations that were put in place
a while ago mean that ExecInsert has to do ExecMaterializeSlot, and that
already copies the tuple if it's raw (and does so more efficiently than
a junk filter, too). So get rid of that logic. This in turn means that
we can throw away ExecMayReturnRawTuples, which wasn't used for any other
purpose, and was always a kluge anyway.

In passing, move a couple of SELECT-INTO-specific fields out of EState
and into the private state of the SELECT INTO DestReceiver, as was foreseen
in an old comment there. Also make intorel_receive use ExecMaterializeSlot
not ExecCopySlotTuple, for consistency with ExecInsert and to possibly save
a tuple copy step in some cases.

a77eaa6a

18 Jul, 2008 1 commit

Provide a function hook to let plug-ins get control around ExecutorRun. · 6cc88f0a

Tom Lane authored 16 years ago

ITAGAKI Takahiro

6cc88f0a

12 May, 2008 2 commits

Improve snapshot manager by keeping explicit track of snapshots. · 5da9da71

Alvaro Herrera authored 16 years ago

There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment. This is all done automatically now.

As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
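
The two tracking structures and the xmin reset can be modeled in a few lines. This Python sketch is a simplification with hypothetical names (`SnapshotManager`, `proc_xmin`), snapshots are reduced to their xmin values, and it assumes the reset happens only once both the registered list and the active stack are empty.

```python
class SnapshotManager:
    """Toy model of explicit snapshot tracking (not PostgreSQL code)."""

    def __init__(self):
        self.registered = []      # arbitrary long-lived snapshots
        self.active_stack = []    # top of the stack is the active snapshot
        self.proc_xmin = None     # stand-in for the backend's advertised xmin

    def _track(self, xmin):
        if self.proc_xmin is None or xmin < self.proc_xmin:
            self.proc_xmin = xmin

    def _maybe_reset(self):
        # With no snapshots left, nothing needs old rows any more.
        if not self.registered and not self.active_stack:
            self.proc_xmin = None

    def register(self, xmin):
        self.registered.append(xmin)
        self._track(xmin)

    def unregister(self, xmin):
        self.registered.remove(xmin)
        self._maybe_reset()

    def push_active(self, xmin):
        self.active_stack.append(xmin)
        self._track(xmin)

    def pop_active(self):
        xmin = self.active_stack.pop()
        self._maybe_reset()
        return xmin

    def get_active(self):
        return self.active_stack[-1] if self.active_stack else None
```

In this model the advertised xmin is not advanced while any snapshot remains; it is cleared only when the last one goes away.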

5da9da71

Restructure some header files a bit, in particular heapam.h, by removing some · f8c4d7db

Alvaro Herrera authored 16 years ago

unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.

f8c4d7db

09 May, 2008 1 commit

Change the rules for inherited CHECK constraints to be essentially the same · cd902b33

Tom Lane authored 16 years ago

as those for inherited columns; that is, it's no longer allowed for a child
table to not have a check constraint matching one that exists on a parent.
This satisfies the principle of least surprise (rows selected from the parent
will always appear to meet its check constraints) and eliminates some
longstanding bogosity in pg_dump, which formerly had to guess about whether
check constraints were really inherited or not.

The implementation involves adding conislocal and coninhcount columns to
pg_constraint (paralleling attislocal and attinhcount in pg_attribute)
and refactoring various ALTER TABLE actions to be more like those for
columns.
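
The bookkeeping implied by the two new columns can be sketched as follows. This is a toy Python model with hypothetical names (`Table`, `Constraint`, `add_check`, `drop_check`); the fields `is_local` and `inh_count` only mimic the roles of conislocal and coninhcount, and the real ALTER TABLE code handles many more cases.

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    expr: str
    is_local: bool       # defined directly on this table
    inh_count: int = 0   # number of parents it is inherited from


@dataclass
class Table:
    name: str
    children: list = field(default_factory=list)
    checks: dict = field(default_factory=dict)


def add_check(table, cname, expr, inherited=False):
    existing = table.checks.get(cname)
    if existing is None:
        table.checks[cname] = Constraint(expr, not inherited, 1 if inherited else 0)
        # Only a newly created constraint needs to be pushed to children.
        for child in table.children:
            add_check(child, cname, expr, inherited=True)
    elif existing.expr != expr:
        raise ValueError(f"conflicting definition for {cname}")
    elif inherited:
        existing.inh_count += 1     # merge with the matching constraint
    else:
        existing.is_local = True


def drop_check(table, cname, from_parent=False):
    con = table.checks[cname]
    if from_parent:
        con.inh_count -= 1
    elif con.inh_count > 0:
        raise ValueError(f"cannot drop inherited constraint {cname}")
    else:
        con.is_local = False
    # Remove the constraint once nothing justifies it any longer.
    if con.inh_count == 0 and not con.is_local:
        del table.checks[cname]
        for child in table.children:
            drop_check(child, cname, from_parent=True)
```

A child that also declared the constraint locally keeps it after the parent's copy is dropped, while a purely inherited copy disappears with the parent's.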

Alex Hunsaker, Nikhil Sontakke, Tom Lane

cd902b33

21 Apr, 2008 1 commit

Fix a couple of places in execMain that erroneously assumed that SELECT FOR · f593f623

Tom Lane authored 16 years ago

UPDATE/SHARE couldn't occur as a subquery in a query with a non-SELECT
top-level operation. Symptoms included outright failure (as in report from
Mark Mielke) and silently neglecting to take the requested row locks.

Back-patch to 8.3, because the visible failure in the INSERT ... SELECT case
is a regression from 8.2. I'm a bit hesitant to back-patch further given the
lack of field complaints.

f593f623