Commits · 9f2e211386931f7aee48ffbc2fcaef1632d8329f · Abuhujair Javed / Postgres FD Implementation

20 Sep, 2010 1 commit
- Remove cvs keywords from all files. · 9f2e2113
  Magnus Hagander authored 14 years ago
  
  9f2e2113
11 Sep, 2010 1 commit

SERIALIZABLE transactions are actually implemented beneath the covers with · 5eb15c99

Joe Conway authored 14 years ago

transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).

For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.

5eb15c99

26 Aug, 2010 1 commit

Fix ExecMakeTableFunctionResult to verify that all rows returned by a SRF · db2d9c60

Tom Lane authored 14 years ago

returning "record" actually do have the same rowtype. This is needed because
the parser can't realistically enforce that they will all have the same typmod,
as seen in a recent example from David Wheeler.

Back-patch to 8.0, which is as far back as we have the notion of RECORD
subtypes being distinguished by typmod. Wheeler's example depends on
8.4-and-up features, but I suspect there may be ways to provoke similar
failures before 8.4.

db2d9c60

18 Aug, 2010 1 commit

Reset the per-output-tuple exprcontext each time through the main loop in · 3573c834

Tom Lane authored 14 years ago

ExecModifyTable(). This avoids memory leakage when trigger functions leave
junk behind in that context (as they more or less must). Problem and solution
identified by Dean Rasheed.

I'm a bit concerned about the longevity of this solution --- once a plan can
have multiple ModifyTable nodes, we are very possibly going to have to do
something different. But it should hold up for 9.0.

3573c834

05 Aug, 2010 1 commit

Standardize get_whatever_oid functions for object types with · 2a6ef344

Robert Haas authored 14 years ago

unqualified names.

- Add a missing_ok parameter to get_tablespace_oid.
- Avoid duplicating get_tablespace_oid guts in objectNamesToOids.
- Add a missing_ok parameter to get_database_oid.
- Replace get_roleid and get_role_checked with get_role_oid.
- Add get_namespace_oid, get_language_oid, get_am_oid.
- Refactor existing code to use new interfaces.

Thanks to KaiGai Kohei for the review.

2a6ef344
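The `missing_ok` convention named in the list above can be illustrated with a small model. This is only a sketch in Python, not the PostgreSQL C code: the in-memory `_TABLESPACES` dictionary and the `LookupError` are assumptions standing in for the catalog lookup and the backend's error reporting.

```python
INVALID_OID = 0  # stand-in for PostgreSQL's InvalidOid

# Hypothetical in-memory "catalog"; the real function consults the system catalogs.
_TABLESPACES = {"pg_default": 1663, "pg_global": 1664}


def get_tablespace_oid(name: str, missing_ok: bool) -> int:
    """Return the OID for name; if absent, return INVALID_OID or raise per missing_ok."""
    oid = _TABLESPACES.get(name, INVALID_OID)
    if oid == INVALID_OID and not missing_ok:
        # Callers that cannot proceed without the object get an error.
        raise LookupError(f'tablespace "{name}" does not exist')
    # Callers that passed missing_ok=True must check for INVALID_OID themselves.
    return oid
```

With one flag, a single function serves both callers that want a hard failure and callers that want to handle a missing object on their own.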

28 Jul, 2010 2 commits

Fix oversight in new EvalPlanQual logic: the second loop over the ExecRowMark · 77c75076

Tom Lane authored 14 years ago

list in ExecLockRows() forgot to allow for the possibility that some of the
rowmarks are for child tables that aren't relevant to the current row.
Per report from Kenichiro Tanaka.

77c75076

Fix potential failure when hashing the output of a subplan that produces · 133924e1

Tom Lane authored 14 years ago

a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.

Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere. The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.

133924e1

25 Jul, 2010 1 commit
- CREATE TABLE IF NOT EXISTS. · a3b012b5
  Robert Haas authored 14 years ago
  Reviewed by Bernd Helmle.
  a3b012b5
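The effect of the `IF NOT EXISTS` clause can be tried without a PostgreSQL server. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in, since SQLite accepts the same clause; the `accounts` table is a made-up example, and details such as the notice PostgreSQL prints when it skips the creation are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE IF NOT EXISTS accounts (id INTEGER PRIMARY KEY, name TEXT)"

conn.execute(ddl)
conn.execute("INSERT INTO accounts (name) VALUES ('alice')")

# Running the same statement again is a no-op; the existing row survives.
conn.execute(ddl)
row_count = conn.execute("SELECT count(*) FROM accounts").fetchone()[0]

# Without the clause, re-creating an existing table is an error.
try:
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY)")
    plain_create_failed = False
except sqlite3.OperationalError:
    plain_create_failed = True
```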
22 Jul, 2010 1 commit

Centralize DML permissions-checking logic. · b8c6c71d

Robert Haas authored 14 years ago

Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
call ExecCheckRTPerms with a manufactured RangeTblEntry.
This is intended to make it feasible for an enhanced security provider
to actually make use of ExecutorCheckPerms_hook, but also has the
advantage that RI_Initial_Check can allow use of the fast-path when
column-level but not table-level permissions are present.

KaiGai Kohei. Reviewed (in an earlier version) by Stephen Frost, and by me.
Some further changes to the comments by me.

b8c6c71d

16 Jul, 2010 1 commit

Remove a sanity check in the exclusion-constraint code that prevented users · e11cfa87

Tom Lane authored 14 years ago

from defining non-self-conflicting constraints.

Jeff Davis

Note: I (tgl) objected to removing this check in 9.0 on the grounds that it
was an important sanity check in new, poorly tested code.  However, it should
be all right to remove it for 9.1, since we'll get field testing from the
9.0 branch.

e11cfa87

12 Jul, 2010 1 commit

Make NestLoop plan nodes pass outer-relation variables into their inner · 53e75768

Tom Lane authored 14 years ago

relation using the general PARAM_EXEC executor parameter mechanism, rather
than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
The previous method was hard to understand and could never be extended to
handle parameters coming from multiple join levels. This patch doesn't
change the set of possible plans nor have any significant performance effect,
but it's necessary infrastructure for future generalization of the concept
of an inner indexscan plan.

ExecReScan's second parameter is now unused, so it's removed.

53e75768

09 Jul, 2010 1 commit

Add a hook in ExecCheckRTPerms(). · f4122a8d

Robert Haas authored 14 years ago

This hook allows a loadable module to gain control when table permissions
are checked.  It is expected to be used by an eventual SE-PostgreSQL
implementation, but there are other possible applications as well.  A
sample contrib module can be found in the archives at:

http://archives.postgresql.org/pgsql-hackers/2010-05/msg01095.php

Robert Haas and Stephen Frost

f4122a8d

06 Jul, 2010 1 commit
- pgindent run for 9.0, second run · 239d769e
  Bruce Momjian authored 14 years ago
  
  239d769e
29 May, 2010 1 commit
- Add C comment that we will have to remove an exclusion constraint check · 71902205
  Bruce Momjian authored 14 years ago
  if we ever implement '<>' index opclasses.

  Jeff Davis
  71902205
28 May, 2010 1 commit

Rejigger mergejoin logic so that a tuple with a null in the first merge column · f39d57b8

Tom Lane authored 14 years ago

is treated like end-of-input, if nulls sort last in that column and we are not
doing outer-join filling for that input. In such a case, the tuple cannot
join to anything from the other input (because we assume mergejoinable
operators are strict), and neither can any tuple following it in the sort
order. If we're not interested in doing outer-join filling we can just
pretend the tuple and its successors aren't there at all. This can save a
great deal of time in situations where there are many nulls in the join
column, as in a recent example from Scott Marlowe. Also, since the planner
tends to not count nulls in its mergejoin scan selectivity estimates, this
is an important fix to make the runtime behavior more like the estimate.

I regard this as an omission in the patch I wrote years ago to teach mergejoin
that tuples containing nulls aren't joinable, so I'm back-patching it. But
only to 8.3 --- in older versions, we didn't have a solid notion of whether
nulls sort high or low, so attempting to apply this optimization could break
things.

f39d57b8
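The optimization described in this commit can be sketched as a toy inner merge join. This is a simplified Python model, not the executor's mergejoin code: inputs are assumed to be lists of `(key, payload)` pairs sorted ascending with NULLs (`None`) last, and no outer-join filling is done.

```python
def merge_join(outer, inner):
    """Inner merge join of two lists of (key, payload), sorted by key, NULLs last."""
    results = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        outer_key, inner_key = outer[i][0], inner[j][0]
        # A NULL key cannot join to anything (the operator is assumed strict),
        # and with NULLs sorting last neither can any later tuple: stop here.
        if outer_key is None or inner_key is None:
            break
        if outer_key < inner_key:
            i += 1
        elif outer_key > inner_key:
            j += 1
        else:
            # Find the run of equal keys on the inner side ...
            j_end = j
            while j_end < len(inner) and inner[j_end][0] == outer_key:
                j_end += 1
            # ... and pair it with every equal-keyed outer tuple.
            while i < len(outer) and outer[i][0] == outer_key:
                for k in range(j, j_end):
                    results.append((outer_key, outer[i][1], inner[k][1]))
                i += 1
            j = j_end
    return results
```

For example, joining `[(1, 'a'), (2, 'b'), (2, 'c'), (None, 'x'), (None, 'y')]` with `[(2, 'p'), (3, 'q'), (None, 'z')]` stops as soon as the outer side reaches its first NULL key, so the trailing NULL rows are never compared.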

28 Apr, 2010 1 commit

Introduce wal_level GUC to explicitly control if information needed for · 9b8a7332

Heikki Linnakangas authored 14 years ago

archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.

Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_level setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment. They used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.

Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.

9b8a7332

21 Mar, 2010 1 commit
- Message tuning · c248d171
  Peter Eisentraut authored 15 years ago
  
  c248d171
19 Mar, 2010 1 commit

Modify error context callback functions to not assume that they can fetch · a836abe9

Tom Lane authored 15 years ago

catalog entries via SearchSysCache and related operations. Although, at the
time that these callbacks are called by elog.c, we have not officially aborted
the current transaction, it still seems rather risky to initiate any new
catalog fetches. In all these cases the needed information is readily
available in the caller and so it's just a matter of a bit of extra notation
to pass it to the callback.

Per crash report from Dennis Koegel. I've concluded that the real fix for
his problem is to clear the error context stack at entry to proc_exit, but
it still seems like a good idea to make the callbacks a bit less fragile
for other cases.

Backpatch to 8.4. We could go further back, but the patch doesn't apply
cleanly. In the absence of proof that this fixes something and isn't just
paranoia, I'm not going to expend the effort.

a836abe9

26 Feb, 2010 1 commit
- pgindent run for 9.0 · 65e806cb
  Bruce Momjian authored 15 years ago
  
  65e806cb
20 Feb, 2010 1 commit

Clean up handling of XactReadOnly and RecoveryInProgress checks. · 05d8a561

Tom Lane authored 15 years ago

Add some checks that seem logically necessary, in particular let's make
real sure that HS slave sessions cannot create temp tables. (If they did
they would think that temp tables belonging to the master's session with
the same BackendId were theirs. We *must* not allow myTempNamespace to
become set in a slave session.)

Change setval() and nextval() so that they are only allowed on temp sequences
in a read-only transaction. This seems consistent with what we allow for
table modifications in read-only transactions. Since an HS slave can't have a
temp sequence, this also provides a nicer cure for the setval PANIC reported
by Erik Rijkers.

Make the error messages more uniform, and have them mention the specific
command being complained of. This seems worth the trifling amount of extra
code, since people are likely to see such messages a lot more than before.

05d8a561

18 Feb, 2010 1 commit

Fix ExecEvalArrayRef to pass down the old value of the array element or slice · 11d5ba97

Tom Lane authored 15 years ago

being assigned to, in case the expression to be assigned is a FieldStore that
would need to modify that value. The need for this was foreseen some time
ago, but not implemented then because we did not have arrays of composites.
Now we do, but the point evidently got overlooked in that patch. Net result
is that updating a field of an array element doesn't work right, as
illustrated if you try the new regression test on an unpatched backend.
Noted while experimenting with EXPLAIN VERBOSE, which has also got some issues
in this area.

Backpatch to 8.3, where arrays of composites were introduced.

11d5ba97

14 Feb, 2010 1 commit

Wrap calls to SearchSysCache and related functions using macros. · e26c539e

Robert Haas authored 15 years ago

The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4).  This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.

Design and review by Tom Lane.

e26c539e

12 Feb, 2010 1 commit

Extend the set of frame options supported for window functions. · ec4be2ee

Tom Lane authored 15 years ago

This patch allows the frame to start from CURRENT ROW (in either RANGE or
ROWS mode), and it also adds support for ROWS n PRECEDING and ROWS n FOLLOWING
start and end points.  (RANGE value PRECEDING/FOLLOWING isn't there yet ---
the grammar works, but that's all.)

Hitoshi Harada, reviewed by Pavel Stehule

ec4be2ee
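The ROWS-mode frame bounds mentioned in this commit can be modelled in a few lines. This is a Python sketch, not the WindowAgg implementation; it assumes a single partition that is already ordered, and the convention that negative offsets mean PRECEDING, zero means CURRENT ROW, and positive offsets mean FOLLOWING is specific to this example.

```python
def window_frames(values, start_offset, end_offset):
    """Return the ROWS-mode frame for each row of one ordered partition.

    Offsets are relative to the current row: -n is "n PRECEDING",
    0 is "CURRENT ROW", and +n is "n FOLLOWING".
    """
    n = len(values)
    frames = []
    for i in range(n):
        # Clamp both ends to the partition; an inverted range yields an empty frame.
        lo = min(n, max(0, i + start_offset))
        hi = max(0, min(n, i + end_offset + 1))
        frames.append(values[lo:hi])
    return frames
```

For instance, `window_frames([10, 20, 30, 40], 0, 1)` models `ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING` and returns `[[10, 20], [20, 30], [30, 40], [40]]`.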

09 Feb, 2010 1 commit

Fix up rickety handling of relation-truncation interlocks. · cbe9d6be

Tom Lane authored 15 years ago

Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr
relation entries, so that they will get reset to InvalidBlockNumber whenever
an smgr-level flush happens. Because we now send smgr invalidation messages
immediately (not at end of transaction) when a relation truncation occurs,
this ensures that other backends will reset their values before they next
access the relation. We no longer need the unreliable assumption that a
VACUUM that's doing a truncation will hold its AccessExclusive lock until
commit --- in fact, we can intentionally release that lock as soon as we've
completed the truncation. This patch therefore reverts (most of) Alvaro's
patch of 2009-11-10, as well as my marginal hacking on it yesterday. We can
also get rid of assorted no-longer-needed relcache flushes, which are far more
expensive than an smgr flush because they kill a lot more state.

In passing this patch fixes smgr_redo's failure to perform visibility-map
truncation, and cleans up some rather dubious assumptions in freespace.c and
visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of
date.

cbe9d6be

08 Feb, 2010 2 commits

Create an official API function for C functions to use to check if they are · d5768dce

Tom Lane authored 15 years ago

being called as aggregates, and to get the aggregate transition state memory
context if needed. Use it instead of poking directly into AggState and
WindowAggState in places that shouldn't know so much.

We should have done this in 8.4, probably, but better late than never.

Revised version of a patch by Hitoshi Harada.

d5768dce

Remove old-style VACUUM FULL (which was known for a little while as · 0a469c87

Tom Lane authored 15 years ago

VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.

Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).

We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long
as we want to support in-place update from pre-9.0 databases.

0a469c87

07 Feb, 2010 1 commit

Create a "relation mapping" infrastructure to support changing the relfilenodes · b9b8831a

Tom Lane authored 15 years ago

of shared or nailed system catalogs.  This has two key benefits:

* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.

* We no longer have to use an unsafe reindex-in-place approach for reindexing
  shared catalogs.

CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.

Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.

This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch.  As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.

b9b8831a

03 Feb, 2010 1 commit

Move the responsibility of writing an "unlogged WAL operation" record from · 9de778b2

Heikki Linnakangas authored 15 years ago

heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported. Patch by Fujii Masao.

9de778b2

01 Feb, 2010 1 commit

Augment EXPLAIN output with more details on Hash nodes. · 42a8ab0a

Robert Haas authored 15 years ago

We show the number of buckets, the number of batches (and also the original
number if it has changed), and the peak space used by the hash table. Minor
executor changes to track peak space used.

42a8ab0a

31 Jan, 2010 1 commit

Fix memory leak created by deferrable-index-constraints patches. · 034fffbf

Tom Lane authored 15 years ago

We need to free the OID list returned by ExecInsertIndexTuples to avoid
a query-lifespan memory leak.  When many rows require rechecking, this
can be a significant leak --- it's even more than the space used for the
queued trigger events.

Dean Rasheed

034fffbf

28 Jan, 2010 1 commit
- Type table feature · e7b3349a
  Peter Eisentraut authored 15 years ago
  This adds the CREATE TABLE name OF type command, per SQL standard.
  e7b3349a
15 Jan, 2010 1 commit

Introduce Streaming Replication. · 40f908bd

Heikki Linnakangas authored 15 years ago

This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me

40f908bd

11 Jan, 2010 1 commit

Improve ExecEvalVar's handling of whole-row variables in cases where the · 292176a1

Tom Lane authored 15 years ago

rowtype contains dropped columns. Sometimes the input tuple will be formed
from a select targetlist in which dropped columns are filled with a NULL
of an arbitrary type (the planner typically uses INT4, since it can't tell
what type the dropped column really was). So we need to relax the rowtype
compatibility check to not insist on physical compatibility if the actual
column value is NULL.

In principle we might need to do this for functions returning composite
types, too (see tupledesc_match()). In practice there doesn't seem to be
a bug there, probably because the function will be using the same cached
rowtype descriptor as the caller. Fixing that code path would require
significant rearrangement, so I left it alone for now.

Per complaint from Filip Rembialkowski.

292176a1

09 Jan, 2010 1 commit

Make ExecEvalFieldSelect throw a more intelligible error if it's asked to · 85113bcf

Tom Lane authored 15 years ago

extract a system column, and remove a couple of lines that are useless
in light of the fact that we aren't ever going to support this case. There
isn't much point in trying to make this work because a tuple Datum does
not carry many of the system columns. Per experimentation with a case
reported by Dean Rasheed; we'll have to fix his problem somewhere else.

85113bcf

08 Jan, 2010 1 commit

Fix oversight in EvalPlanQualFetch: after failing to lock a tuple because · 217dc525

Tom Lane authored 15 years ago

someone else has just updated it, we have to set priorXmax to that tuple's
xmax (i.e., the XID of the other xact that updated it) before looping back to
examine the next tuple. Obviously, the next tuple in the update chain should
have that XID as its xmin, not the same xmin as the preceding tuple that we
had been trying to lock. The mismatch would cause the EvalPlanQual logic to
decide that the tuple chain ended in a deletion, when actually there was a
live tuple that should have been found.

I inserted this error when recently adding logic to EvalPlanQual to make it
lock tuples before returning them (as opposed to the old method in which the
lock would occur much later, causing a great deal of work to be wasted if we
only then discover someone else updated it). Sigh. Per today's report from
Takahiro Itagaki of inconsistent results during pgbench runs.

217dc525
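The invariant behind this fix can be shown with a toy update chain. This is a heavily simplified Python model, not EvalPlanQualFetch itself: the `TupleVersion` class, its `next` link, and the rule that a version with no `xmax` is the live one are assumptions made for illustration, and locking and visibility checks are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    xmin: int                               # XID that created this version
    xmax: Optional[int]                     # XID that updated it, or None if live
    next: Optional["TupleVersion"] = None   # successor in the update chain


def fetch_latest(start: TupleVersion, prior_xmax: int) -> Optional[TupleVersion]:
    """Walk the update chain and return the live version, or None if it is broken."""
    tup = start
    while tup is not None:
        # Each version must have been created by the xact that updated its predecessor.
        if tup.xmin != prior_xmax:
            return None  # mismatch: the chain is treated as ending in a deletion
        if tup.xmax is None:
            return tup   # live version found
        # Someone else updated this version too. Remember that updater's XID
        # before moving on; skipping this step is the oversight described above.
        prior_xmax = tup.xmax
        tup = tup.next
    return None
```

With a chain whose versions carry `(xmin, xmax)` of `(101, 102)` and then `(102, None)`, a call with `prior_xmax=101` reaches the live second version only because `prior_xmax` is advanced to 102 after the first one.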

06 Jan, 2010 1 commit

Preserve relfilenodes: · f98fbc78

Bruce Momjian authored 15 years ago

Add support to pg_dump --binary-upgrade to preserve all relfilenodes,
for use by pg_migrator.

f98fbc78

05 Jan, 2010 1 commit

Add support for doing FULL JOIN ON FALSE. While this is really a rather · 90f4c2d9

Tom Lane authored 15 years ago

peculiar variant of UNION ALL, and so wouldn't likely get written directly
as-is, it's possible for it to arise as a result of simplification of
less-obviously-silly queries. In particular, now that we can do flattening
of subqueries that have constant outputs and are underneath an outer join,
it's possible for the case to result from simplification of queries of the
type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality
regression for this type of query.

90f4c2d9
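The remark that FULL JOIN ON FALSE is a peculiar variant of UNION ALL can be made concrete with a small model. This is a Python sketch over tuples, not the executor's join code; the explicit column widths are a convenience of this example so that empty inputs still produce well-formed rows.

```python
def full_join_on_false(left, right, left_width, right_width):
    """Model FULL JOIN ... ON FALSE: no row ever matches, so every row is null-extended."""
    # Every left row appears once, padded with NULLs for the right-hand columns.
    out = [row + (None,) * right_width for row in left]
    # Every right row appears once, padded with NULLs for the left-hand columns.
    out += [(None,) * left_width + row for row in right]
    return out
```

For example, `full_join_on_false([(1, 'a')], [(9,)], 2, 1)` returns `[(1, 'a', None), (None, None, 9)]`: the two inputs are simply concatenated, each padded on the other side.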

04 Jan, 2010 1 commit

When estimating the selectivity of an inequality "column > constant" or · 40608e7f

Tom Lane authored 15 years ago

"column < constant", and the comparison value is in the first or last
histogram bin or outside the histogram entirely, try to fetch the actual
column min or max value using an index scan (if there is an index on the
column). If successful, replace the lower or upper histogram bound with
that value before carrying on with the estimate. This limits the
estimation error caused by moving min/max values when the comparison
value is close to the min or max. Per a complaint from Josh Berkus.

It is tempting to consider using this mechanism for mergejoinscansel as well,
but that would inject index fetches into main-line join estimation not just
endpoint cases. I'm refraining from that until we can get a better handle
on the costs of doing this type of lookup.

40608e7f
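The endpoint correction described in this commit can be sketched with an equi-depth histogram. This is an illustrative Python model, not the planner's selectivity code: the linear interpolation within a bin, the `actual_min` and `actual_max` parameters (standing in for the values an index probe would return), and the function name are all assumptions of this example.

```python
from bisect import bisect_right


def lt_selectivity(bounds, const, actual_min=None, actual_max=None):
    """Estimate the fraction of rows with column < const.

    bounds are equi-depth histogram boundaries (each bin holds the same share
    of rows). actual_min / actual_max model the values fetched from an index.
    """
    bounds = list(bounds)
    nbins = len(bounds) - 1
    # Only endpoint cases use the probed values: constant in the first or last bin.
    if actual_min is not None and const < bounds[1]:
        bounds[0] = actual_min
    if actual_max is not None and const >= bounds[-2]:
        bounds[-1] = actual_max
    if const <= bounds[0]:
        return 0.0
    if const >= bounds[-1]:
        return 1.0
    # Locate the bin with bounds[i] <= const < bounds[i + 1], then interpolate.
    i = bisect_right(bounds, const) - 1
    frac = (const - bounds[i]) / (bounds[i + 1] - bounds[i])
    return (i + frac) / nbins
```

With stale bounds `[0, 10, 20, 30, 40]` and a constant of 38, the plain estimate is 0.95; if the column's maximum has since grown to 100, passing `actual_max=100` stretches the last bin and lowers the estimate to roughly 0.78.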

02 Jan, 2010 2 commits

check_exclusion_constraint didn't actually work correctly for index · 2b59274c

Tom Lane authored 15 years ago

expressions: FormIndexDatum requires the estate's scantuple to already point
at the tuple the values are supposedly being extracted from.  Adjust test
case so that this type of confusion will be exposed.
Per report from hubert depesz lubaczewski.

2b59274c

Update copyright for the year 2010. · 02398008
Bruce Momjian authored 15 years ago

02398008