Commit bb446b68 authored by Tom Lane

Support synchronization of snapshots through an export/import procedure.

A transaction can export a snapshot with pg_export_snapshot(), and then
others can import it with SET TRANSACTION SNAPSHOT.  The data does not
leave the server so there are no security issues.  A snapshot can only
be imported while the exporting transaction is still running, and there
are some other restrictions.

I'm not totally convinced that we've covered all the bases for SSI (true
serializable) mode, but it works fine for lesser isolation modes.

Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified
by Tom Lane
parent b436c72f
...@@ -13802,6 +13802,14 @@ SELECT typlen FROM pg_type WHERE oid = pg_typeof(33); ...@@ -13802,6 +13802,14 @@ SELECT typlen FROM pg_type WHERE oid = pg_typeof(33);
<sect1 id="functions-admin"> <sect1 id="functions-admin">
<title>System Administration Functions</title> <title>System Administration Functions</title>
<para>
The functions described in this section are used to control and
monitor a <productname>PostgreSQL</> installation.
</para>
<sect2 id="functions-admin-set">
<title>Configuration Settings Functions</title>
<para> <para>
<xref linkend="functions-admin-set-table"> shows the functions <xref linkend="functions-admin-set-table"> shows the functions
available to query and alter run-time configuration parameters. available to query and alter run-time configuration parameters.
...@@ -13889,6 +13897,11 @@ SELECT set_config('log_statement_stats', 'off', false); ...@@ -13889,6 +13897,11 @@ SELECT set_config('log_statement_stats', 'off', false);
</programlisting> </programlisting>
</para> </para>
</sect2>
<sect2 id="functions-admin-signal">
<title>Server Signalling Functions</title>
<indexterm> <indexterm>
<primary>pg_cancel_backend</primary> <primary>pg_cancel_backend</primary>
</indexterm> </indexterm>
...@@ -13985,6 +13998,11 @@ SELECT set_config('log_statement_stats', 'off', false); ...@@ -13985,6 +13998,11 @@ SELECT set_config('log_statement_stats', 'off', false);
subprocess. subprocess.
</para> </para>
</sect2>
<sect2 id="functions-admin-backup">
<title>Backup Control Functions</title>
<indexterm> <indexterm>
<primary>backup</primary> <primary>backup</primary>
</indexterm> </indexterm>
...@@ -14181,6 +14199,11 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); ...@@ -14181,6 +14199,11 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<xref linkend="continuous-archiving">. <xref linkend="continuous-archiving">.
</para> </para>
</sect2>
<sect2 id="functions-recovery-control">
<title>Recovery Control Functions</title>
<indexterm> <indexterm>
<primary>pg_is_in_recovery</primary> <primary>pg_is_in_recovery</primary>
</indexterm> </indexterm>
...@@ -14198,7 +14221,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); ...@@ -14198,7 +14221,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
The functions shown in <xref The functions shown in <xref
linkend="functions-recovery-info-table"> provide information linkend="functions-recovery-info-table"> provide information
about the current status of the standby. about the current status of the standby.
These functions may be executed during both recovery and in normal running. These functions may be executed both during recovery and in normal running.
</para> </para>
<table id="functions-recovery-info-table"> <table id="functions-recovery-info-table">
...@@ -14333,6 +14356,87 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); ...@@ -14333,6 +14356,87 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
the pause, the rate of WAL generation and available disk space. the pause, the rate of WAL generation and available disk space.
</para> </para>
</sect2>
<sect2 id="functions-snapshot-synchronization">
<title>Snapshot Synchronization Functions</title>
<indexterm>
<primary>pg_export_snapshot</primary>
</indexterm>
<para>
<productname>PostgreSQL</> allows database sessions to synchronize their
snapshots. A <firstterm>snapshot</> determines which data is visible to the
transaction that is using the snapshot. Synchronized snapshots are
necessary when two or more sessions need to see identical content in the
database. If two sessions just start their transactions independently,
there is always a possibility that some third transaction commits
between the executions of the two <command>START TRANSACTION</> commands,
so that one session sees the effects of that transaction and the other
does not.
</para>
<para>
To solve this problem, <productname>PostgreSQL</> allows a transaction to
<firstterm>export</> the snapshot it is using. As long as the exporting
transaction remains open, other transactions can <firstterm>import</> its
snapshot, and thereby be guaranteed that they see exactly the same view
of the database that the first transaction sees. But note that any
database changes made by any one of these transactions remain invisible
to the other transactions, as is usual for changes made by uncommitted
transactions. So the transactions are synchronized with respect to
pre-existing data, but act normally for changes they make themselves.
</para>
<para>
Snapshots are exported with the <function>pg_export_snapshot</> function,
shown in <xref linkend="functions-snapshot-synchronization-table">, and
imported with the <xref linkend="sql-set-transaction"> command.
</para>
<table id="functions-snapshot-synchronization-table">
<title>Snapshot Synchronization Functions</title>
<tgroup cols="3">
<thead>
<row><entry>Name</entry> <entry>Return Type</entry> <entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<literal><function>pg_export_snapshot()</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>Save the current snapshot and return its identifier</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
The function <function>pg_export_snapshot</> saves the current snapshot
and returns a <type>text</> string identifying the snapshot. This string
must be passed (outside the database) to clients that want to import the
snapshot. The snapshot is available for import only until the end of the
transaction that exported it. A transaction can export more than one
snapshot, if needed. Note that doing so is only useful in <literal>READ
COMMITTED</> transactions, since in <literal>REPEATABLE READ</> and
higher isolation levels, transactions use the same snapshot throughout
their lifetime. Once a transaction has exported any snapshots, it cannot
be prepared with <xref linkend="sql-prepare-transaction">.
</para>
<para>
See <xref linkend="sql-set-transaction"> for details of how to use an
exported snapshot.
</para>
</sect2>
<sect2 id="functions-admin-dbobject">
<title>Database Object Management Functions</title>
<para> <para>
The functions shown in <xref linkend="functions-admin-dbsize"> calculate The functions shown in <xref linkend="functions-admin-dbsize"> calculate
the disk space usage of database objects. the disk space usage of database objects.
...@@ -14591,9 +14695,14 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); ...@@ -14591,9 +14695,14 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
the relation. the relation.
</para> </para>
</sect2>
<sect2 id="functions-admin-genfile">
<title>Generic File Access Functions</title>
<para> <para>
The functions shown in <xref The functions shown in <xref
linkend="functions-admin-genfile"> provide native access to linkend="functions-admin-genfile-table"> provide native access to
files on the machine hosting the server. Only files within the files on the machine hosting the server. Only files within the
database cluster directory and the <varname>log_directory</> can be database cluster directory and the <varname>log_directory</> can be
accessed. Use a relative path for files in the cluster directory, accessed. Use a relative path for files in the cluster directory,
...@@ -14601,7 +14710,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); ...@@ -14601,7 +14710,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
for log files. Use of these functions is restricted to superusers. for log files. Use of these functions is restricted to superusers.
</para> </para>
<table id="functions-admin-genfile"> <table id="functions-admin-genfile-table">
<title>Generic File Access Functions</title> <title>Generic File Access Functions</title>
<tgroup cols="3"> <tgroup cols="3">
<thead> <thead>
...@@ -14694,13 +14803,18 @@ SELECT (pg_stat_file('filename')).modification; ...@@ -14694,13 +14803,18 @@ SELECT (pg_stat_file('filename')).modification;
</programlisting> </programlisting>
</para> </para>
</sect2>
<sect2 id="functions-advisory-locks">
<title>Advisory Lock Functions</title>
<para> <para>
The functions shown in <xref linkend="functions-advisory-locks"> manage The functions shown in <xref linkend="functions-advisory-locks-table">
advisory locks. For details about proper use of these functions, see manage advisory locks. For details about proper use of these functions,
<xref linkend="advisory-locks">. see <xref linkend="advisory-locks">.
</para> </para>
<table id="functions-advisory-locks"> <table id="functions-advisory-locks-table">
<title>Advisory Lock Functions</title> <title>Advisory Lock Functions</title>
<tgroup cols="3"> <tgroup cols="3">
<thead> <thead>
...@@ -14972,6 +15086,8 @@ SELECT (pg_stat_file('filename')).modification; ...@@ -14972,6 +15086,8 @@ SELECT (pg_stat_file('filename')).modification;
at session end, even if the client disconnects ungracefully.) at session end, even if the client disconnects ungracefully.)
</para> </para>
</sect2>
</sect1> </sect1>
<sect1 id="functions-trigger"> <sect1 id="functions-trigger">
......
...@@ -33,6 +33,7 @@ ...@@ -33,6 +33,7 @@
<refsynopsisdiv> <refsynopsisdiv>
<synopsis> <synopsis>
SET TRANSACTION <replaceable class="parameter">transaction_mode</replaceable> [, ...] SET TRANSACTION <replaceable class="parameter">transaction_mode</replaceable> [, ...]
SET TRANSACTION SNAPSHOT <replaceable class="parameter">snapshot_id</replaceable>
SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transaction_mode</replaceable> [, ...] SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transaction_mode</replaceable> [, ...]
<phrase>where <replaceable class="parameter">transaction_mode</replaceable> is one of:</phrase> <phrase>where <replaceable class="parameter">transaction_mode</replaceable> is one of:</phrase>
...@@ -60,6 +61,8 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -60,6 +61,8 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
The available transaction characteristics are the transaction The available transaction characteristics are the transaction
isolation level, the transaction access mode (read/write or isolation level, the transaction access mode (read/write or
read-only), and the deferrable mode. read-only), and the deferrable mode.
In addition, a snapshot can be selected, though only for the current
transaction, not as a session default.
</para> </para>
<para> <para>
...@@ -98,7 +101,7 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -98,7 +101,7 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
serializable transactions would create a situation which could not serializable transactions would create a situation which could not
have occurred for any serial (one-at-a-time) execution of those have occurred for any serial (one-at-a-time) execution of those
transactions, one of them will be rolled back with a transactions, one of them will be rolled back with a
<literal>serialization_failure</literal> <literal>SQLSTATE</literal>. <literal>serialization_failure</literal> error.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -139,13 +142,41 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -139,13 +142,41 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
<para> <para>
The <literal>DEFERRABLE</literal> transaction property has no effect The <literal>DEFERRABLE</literal> transaction property has no effect
unless the transaction is also <literal>SERIALIZABLE</literal> and unless the transaction is also <literal>SERIALIZABLE</literal> and
<literal>READ ONLY</literal>. When all of these properties are set on a <literal>READ ONLY</literal>. When all three of these properties are
selected for a
transaction, the transaction may block when first acquiring its snapshot, transaction, the transaction may block when first acquiring its snapshot,
after which it is able to run without the normal overhead of a after which it is able to run without the normal overhead of a
<literal>SERIALIZABLE</literal> transaction and without any risk of <literal>SERIALIZABLE</literal> transaction and without any risk of
contributing to or being canceled by a serialization failure. This mode contributing to or being canceled by a serialization failure. This mode
is well suited for long-running reports or backups. is well suited for long-running reports or backups.
</para> </para>
<para>
The <literal>SET TRANSACTION SNAPSHOT</literal> command allows a new
transaction to run with the same <firstterm>snapshot</> as an existing
transaction. The pre-existing transaction must have exported its snapshot
with the <literal>pg_export_snapshot</literal> function (see <xref
linkend="functions-snapshot-synchronization">). That function returns a
snapshot identifier, which must be given to <literal>SET TRANSACTION
SNAPSHOT</literal> to specify which snapshot is to be imported. The
identifier must be written as a string literal in this command, for example
<literal>'000003A1-1'</>.
<literal>SET TRANSACTION SNAPSHOT</literal> can only be executed at the
start of a transaction, before the first query or
data-modification statement (<command>SELECT</command>,
<command>INSERT</command>, <command>DELETE</command>,
<command>UPDATE</command>, <command>FETCH</command>, or
<command>COPY</command>) of the transaction. Furthermore, the transaction
must already be set to <literal>SERIALIZABLE</literal> or
<literal>REPEATABLE READ</literal> isolation level (otherwise, the snapshot
would be discarded immediately, since <literal>READ COMMITTED</> mode takes
a new snapshot for each command). If the importing transaction uses
<literal>SERIALIZABLE</literal> isolation level, then the transaction that
exported the snapshot must also use that isolation level. Also, a
non-read-only serializable transaction cannot import a snapshot from a
read-only transaction.
</para>
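Illustrative note, not part of the commit: the import restrictions described in the paragraph above, together with the READ ONLY DEFERRABLE rejection added to predicate.c in this same commit, can be summarized as a small decision function. This is a minimal sketch under stated assumptions: the types `IsoLevel` and `XactModes`, the function `import_refusal_reason`, and the reason strings are hypothetical names invented here, and the order of the checks is not claimed to match the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; the server tracks this state differently. */
typedef enum
{
    ISO_READ_COMMITTED,
    ISO_REPEATABLE_READ,
    ISO_SERIALIZABLE
} IsoLevel;

typedef struct
{
    IsoLevel iso;        /* isolation level of the transaction */
    bool     read_only;  /* READ ONLY access mode */
    bool     deferrable; /* DEFERRABLE property */
} XactModes;

/*
 * Return NULL if the importer may adopt the exporter's snapshot, otherwise
 * a short reason.  The reason wording is illustrative only.
 */
static const char *
import_refusal_reason(const XactModes *importer, const XactModes *exporter)
{
    assert(importer != NULL && exporter != NULL);

    /* READ COMMITTED would discard the imported snapshot immediately. */
    if (importer->iso == ISO_READ_COMMITTED)
        return "importer must be REPEATABLE READ or SERIALIZABLE";

    if (importer->iso == ISO_SERIALIZABLE)
    {
        /* No way to wait for a safe snapshot when one is handed to us. */
        if (importer->read_only && importer->deferrable)
            return "importer must not be READ ONLY DEFERRABLE";

        /* A serializable importer needs a serializable source. */
        if (exporter->iso != ISO_SERIALIZABLE)
            return "serializable importer needs a serializable exporter";

        /* A read-write importer cannot use a read-only source. */
        if (!importer->read_only && exporter->read_only)
            return "read-write serializable importer cannot use a read-only exporter";
    }

    return NULL;
}
```

Under this model, a REPEATABLE READ importer is accepted regardless of the exporter's isolation level, which matches the statement that exporting is possible (and most useful more than once) in READ COMMITTED transactions.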
</refsect1> </refsect1>
<refsect1> <refsect1>
...@@ -163,6 +194,8 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -163,6 +194,8 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
by instead specifying the desired <replaceable by instead specifying the desired <replaceable
class="parameter">transaction_modes</replaceable> in class="parameter">transaction_modes</replaceable> in
<command>BEGIN</command> or <command>START TRANSACTION</command>. <command>BEGIN</command> or <command>START TRANSACTION</command>.
But that option is not available for <command>SET TRANSACTION
SNAPSHOT</command>.
</para> </para>
<para> <para>
...@@ -178,11 +211,45 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -178,11 +211,45 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
</para> </para>
</refsect1> </refsect1>
<refsect1>
<title>Examples</title>
<para>
To begin a new transaction with the same snapshot as an already
existing transaction, first export the snapshot from the existing
transaction. That will return the snapshot identifier, for example:
<programlisting>
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();
pg_export_snapshot
--------------------
000003A1-1
(1 row)
</programlisting>
Then give the snapshot identifier in a <command>SET TRANSACTION
SNAPSHOT</command> command at the beginning of the newly opened
transaction:
<programlisting>
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '000003A1-1';
</programlisting>
</para>
</refsect1>
<refsect1 id="R1-SQL-SET-TRANSACTION-3"> <refsect1 id="R1-SQL-SET-TRANSACTION-3">
<title>Compatibility</title> <title>Compatibility</title>
<para> <para>
Both commands are defined in the <acronym>SQL</acronym> standard. These commands are defined in the <acronym>SQL</acronym> standard,
except for the <literal>DEFERRABLE</literal> transaction mode
and the <command>SET TRANSACTION SNAPSHOT</> form, which are
<productname>PostgreSQL</productname> extensions.
</para>
<para>
<literal>SERIALIZABLE</literal> is the default transaction <literal>SERIALIZABLE</literal> is the default transaction
isolation level in the standard. In isolation level in the standard. In
<productname>PostgreSQL</productname> the default is ordinarily <productname>PostgreSQL</productname> the default is ordinarily
...@@ -197,12 +264,6 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa ...@@ -197,12 +264,6 @@ SET SESSION CHARACTERISTICS AS TRANSACTION <replaceable class="parameter">transa
not implemented in the <productname>PostgreSQL</productname> server. not implemented in the <productname>PostgreSQL</productname> server.
</para> </para>
<para>
The <literal>DEFERRABLE</literal>
<replaceable class="parameter">transaction_mode</replaceable>
is a <productname>PostgreSQL</productname> language extension.
</para>
<para> <para>
The SQL standard requires commas between successive <replaceable The SQL standard requires commas between successive <replaceable
class="parameter">transaction_modes</replaceable>, but for historical class="parameter">transaction_modes</replaceable>, but for historical
......
...@@ -87,6 +87,11 @@ Item ...@@ -87,6 +87,11 @@ Item
<entry>Subdirectory containing information about committed serializable transactions</entry> <entry>Subdirectory containing information about committed serializable transactions</entry>
</row> </row>
<row>
<entry><filename>pg_snapshots</></entry>
<entry>Subdirectory containing exported snapshots</entry>
</row>
<row> <row>
<entry><filename>pg_stat_tmp</></entry> <entry><filename>pg_stat_tmp</></entry>
<entry>Subdirectory containing temporary files for the statistics <entry>Subdirectory containing temporary files for the statistics
......
...@@ -2067,6 +2067,16 @@ PrepareTransaction(void) ...@@ -2067,6 +2067,16 @@ PrepareTransaction(void)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot PREPARE a transaction that has operated on temporary tables"))); errmsg("cannot PREPARE a transaction that has operated on temporary tables")));
/*
* Likewise, don't allow PREPARE after pg_export_snapshot. This could be
* supported if we added cleanup logic to twophase.c, but for now it
* doesn't seem worth the trouble.
*/
if (XactHasExportedSnapshots())
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot PREPARE a transaction that has exported snapshots")));
/* Prevent cancel/die interrupt while cleaning up */ /* Prevent cancel/die interrupt while cleaning up */
HOLD_INTERRUPTS(); HOLD_INTERRUPTS();
......
...@@ -58,6 +58,7 @@ ...@@ -58,6 +58,7 @@
#include "utils/guc.h" #include "utils/guc.h"
#include "utils/ps_status.h" #include "utils/ps_status.h"
#include "utils/relmapper.h" #include "utils/relmapper.h"
#include "utils/snapmgr.h"
#include "utils/timestamp.h" #include "utils/timestamp.h"
#include "pg_trace.h" #include "pg_trace.h"
...@@ -6381,6 +6382,12 @@ StartupXLOG(void) ...@@ -6381,6 +6382,12 @@ StartupXLOG(void)
*/ */
ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP); ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP);
/*
* Likewise, delete any saved transaction snapshot files that got
* left behind by crashed backends.
*/
DeleteAllExportedSnapshotFiles();
/* /*
* Initialize for Hot Standby, if enabled. We won't let backends in * Initialize for Hot Standby, if enabled. We won't let backends in
* yet, not until we've reached the min recovery point specified in * yet, not until we've reached the min recovery point specified in
......
...@@ -553,8 +553,8 @@ static void processCASbits(int cas_bits, int location, const char *constrType, ...@@ -553,8 +553,8 @@ static void processCASbits(int cas_bits, int location, const char *constrType,
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SHARE SERIALIZABLE SERVER SESSION SESSION_USER SET SETOF SHARE
SHOW SIMILAR SIMPLE SMALLINT SOME STABLE STANDALONE_P START STATEMENT SHOW SIMILAR SIMPLE SMALLINT SNAPSHOT SOME STABLE STANDALONE_P START
STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING STATEMENT STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING
SYMMETRIC SYSID SYSTEM_P SYMMETRIC SYSID SYSTEM_P
TABLE TABLES TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIME TIMESTAMP TABLE TABLES TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIME TIMESTAMP
...@@ -1352,6 +1352,15 @@ set_rest: /* Generic SET syntaxes: */ ...@@ -1352,6 +1352,15 @@ set_rest: /* Generic SET syntaxes: */
n->args = list_make1(makeStringConst($3 == XMLOPTION_DOCUMENT ? "DOCUMENT" : "CONTENT", @3)); n->args = list_make1(makeStringConst($3 == XMLOPTION_DOCUMENT ? "DOCUMENT" : "CONTENT", @3));
$$ = n; $$ = n;
} }
/* Special syntaxes invented by PostgreSQL: */
| TRANSACTION SNAPSHOT Sconst
{
VariableSetStmt *n = makeNode(VariableSetStmt);
n->kind = VAR_SET_MULTI;
n->name = "TRANSACTION SNAPSHOT";
n->args = list_make1(makeStringConst($3, @3));
$$ = n;
}
; ;
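Illustrative note, not part of the commit: the grammar action above tags the statement as VAR_SET_MULTI with the name "TRANSACTION SNAPSHOT", and the executor later selects behavior by comparing that name string. The sketch below models only that name-based dispatch. The enum `MultiSetKind` and the function `classify_multi_set` are hypothetical, and the exact string "SESSION CHARACTERISTICS" is an assumption based on the executor comment in this commit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of the special multi-variable SET forms. */
typedef enum
{
    SET_TXN_MODES,      /* SET TRANSACTION transaction_mode [, ...] */
    SET_SESSION_MODES,  /* SET SESSION CHARACTERISTICS AS TRANSACTION ... */
    SET_TXN_SNAPSHOT,   /* SET TRANSACTION SNAPSHOT 'snapshot_id' */
    SET_UNKNOWN         /* anything else would be reported as an error */
} MultiSetKind;

/* Map the name stored in the parse node to the action to perform. */
static MultiSetKind
classify_multi_set(const char *name)
{
    assert(name != NULL);

    if (strcmp(name, "TRANSACTION") == 0)
        return SET_TXN_MODES;
    if (strcmp(name, "SESSION CHARACTERISTICS") == 0)
        return SET_SESSION_MODES;
    if (strcmp(name, "TRANSACTION SNAPSHOT") == 0)
        return SET_TXN_SNAPSHOT;
    return SET_UNKNOWN;
}
```

Because the comparison is on the whole string, "TRANSACTION SNAPSHOT" does not collide with the plain "TRANSACTION" case even though it shares a prefix.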
var_name: ColId { $$ = $1; } var_name: ColId { $$ = $1; }
......
...@@ -1122,6 +1122,28 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum) ...@@ -1122,6 +1122,28 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
return result; return result;
} }
/*
* GetMaxSnapshotXidCount -- get max size for snapshot XID array
*
* We have to export this for use by snapmgr.c.
*/
int
GetMaxSnapshotXidCount(void)
{
return procArray->maxProcs;
}
/*
* GetMaxSnapshotSubxidCount -- get max size for snapshot sub-XID array
*
* We have to export this for use by snapmgr.c.
*/
int
GetMaxSnapshotSubxidCount(void)
{
return TOTAL_MAX_CACHED_SUBXIDS;
}
/* /*
* GetSnapshotData -- returns information about running transactions. * GetSnapshotData -- returns information about running transactions.
* *
...@@ -1187,14 +1209,14 @@ GetSnapshotData(Snapshot snapshot) ...@@ -1187,14 +1209,14 @@ GetSnapshotData(Snapshot snapshot)
* we are in recovery, see later comments. * we are in recovery, see later comments.
*/ */
snapshot->xip = (TransactionId *) snapshot->xip = (TransactionId *)
malloc(arrayP->maxProcs * sizeof(TransactionId)); malloc(GetMaxSnapshotXidCount() * sizeof(TransactionId));
if (snapshot->xip == NULL) if (snapshot->xip == NULL)
ereport(ERROR, ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY), (errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of memory"))); errmsg("out of memory")));
Assert(snapshot->subxip == NULL); Assert(snapshot->subxip == NULL);
snapshot->subxip = (TransactionId *) snapshot->subxip = (TransactionId *)
malloc(TOTAL_MAX_CACHED_SUBXIDS * sizeof(TransactionId)); malloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));
if (snapshot->subxip == NULL) if (snapshot->subxip == NULL)
ereport(ERROR, ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY), (errcode(ERRCODE_OUT_OF_MEMORY),
...@@ -1376,6 +1398,77 @@ GetSnapshotData(Snapshot snapshot) ...@@ -1376,6 +1398,77 @@ GetSnapshotData(Snapshot snapshot)
return snapshot; return snapshot;
} }
/*
* ProcArrayInstallImportedXmin -- install imported xmin into MyProc->xmin
*
* This is called when installing a snapshot imported from another
* transaction. To ensure that OldestXmin doesn't go backwards, we must
* check that the source transaction is still running, and we'd better do
* that atomically with installing the new xmin.
*
* Returns TRUE if successful, FALSE if source xact is no longer running.
*/
bool
ProcArrayInstallImportedXmin(TransactionId xmin, TransactionId sourcexid)
{
bool result = false;
ProcArrayStruct *arrayP = procArray;
int index;
Assert(TransactionIdIsNormal(xmin));
if (!TransactionIdIsNormal(sourcexid))
return false;
/* Get lock so source xact can't end while we're doing this */
LWLockAcquire(ProcArrayLock, LW_SHARED);
for (index = 0; index < arrayP->numProcs; index++)
{
volatile PGPROC *proc = arrayP->procs[index];
TransactionId xid;
/* Ignore procs running LAZY VACUUM */
if (proc->vacuumFlags & PROC_IN_VACUUM)
continue;
xid = proc->xid; /* fetch just once */
if (xid != sourcexid)
continue;
/*
* We check the transaction's database ID for paranoia's sake: if
* it's in another DB then its xmin does not cover us. Caller should
* have detected this already, so we just treat any funny cases as
* "transaction not found".
*/
if (proc->databaseId != MyDatabaseId)
continue;
/*
* Likewise, let's just make real sure its xmin does cover us.
*/
xid = proc->xmin; /* fetch just once */
if (!TransactionIdIsNormal(xid) ||
!TransactionIdPrecedesOrEquals(xid, xmin))
continue;
/*
* We're good. Install the new xmin. As in GetSnapshotData, set
* TransactionXmin too. (Note that because snapmgr.c called
* GetSnapshotData first, we'll be overwriting a valid xmin here,
* so we don't check that.)
*/
MyProc->xmin = TransactionXmin = xmin;
result = true;
break;
}
LWLockRelease(ProcArrayLock);
return result;
}
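Illustrative note, not part of the commit: the checks performed by ProcArrayInstallImportedXmin can be tried outside the server with a simplified model. The sketch below is an assumption-laden stand-in: `Xid`, `ProcModel`, and `install_imported_xmin` are hypothetical names, the ProcArrayLock and the lazy-vacuum filter are omitted, and the circular comparison helper is written here rather than taken from the server headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* stand-in for TransactionId */
#define FIRST_NORMAL_XID 3u     /* assumption: lower values are reserved */

/* Hypothetical, much reduced view of one backend's entry. */
typedef struct
{
    Xid      xid;   /* running transaction ID, 0 if none */
    Xid      xmin;  /* oldest XID this backend's snapshot still needs */
    uint32_t db;    /* database the backend is connected to */
} ProcModel;

static bool
xid_is_normal(Xid x)
{
    return x >= FIRST_NORMAL_XID;
}

/* Circular comparison of two normal XIDs, tolerating wraparound. */
static bool
xid_precedes_or_equals(Xid a, Xid b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Install xmin into *my_xmin only if the source transaction is still
 * present, lives in our database, and its own xmin covers the new one.
 * Returns true on success, false if no suitable source was found.
 */
static bool
install_imported_xmin(const ProcModel *procs, size_t nprocs,
                      uint32_t my_db, Xid xmin, Xid sourcexid,
                      Xid *my_xmin)
{
    assert(xid_is_normal(xmin));

    if (!xid_is_normal(sourcexid))
        return false;

    for (size_t i = 0; i < nprocs; i++)
    {
        const ProcModel *p = &procs[i];

        if (p->xid != sourcexid)
            continue;           /* not the exporting transaction */
        if (p->db != my_db)
            continue;           /* its xmin does not cover our database */
        if (!xid_is_normal(p->xmin) ||
            !xid_precedes_or_equals(p->xmin, xmin))
            continue;           /* source xmin does not cover the new xmin */

        *my_xmin = xmin;
        return true;
    }
    return false;
}
```

In the server the scan and the assignment happen while holding ProcArrayLock, so the source transaction cannot end in between; the model has no concurrency and therefore only demonstrates the acceptance conditions.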
/* /*
* GetRunningTransactionData -- returns information about running transactions. * GetRunningTransactionData -- returns information about running transactions.
* *
......
...@@ -147,6 +147,8 @@ ...@@ -147,6 +147,8 @@
* *
* predicate lock maintenance * predicate lock maintenance
* GetSerializableTransactionSnapshot(Snapshot snapshot) * GetSerializableTransactionSnapshot(Snapshot snapshot)
* SetSerializableTransactionSnapshot(Snapshot snapshot,
* TransactionId sourcexid)
* RegisterPredicateLockingXid(void) * RegisterPredicateLockingXid(void)
* PredicateLockRelation(Relation relation, Snapshot snapshot) * PredicateLockRelation(Relation relation, Snapshot snapshot)
* PredicateLockPage(Relation relation, BlockNumber blkno, * PredicateLockPage(Relation relation, BlockNumber blkno,
...@@ -417,7 +419,8 @@ static void OldSerXidSetActiveSerXmin(TransactionId xid); ...@@ -417,7 +419,8 @@ static void OldSerXidSetActiveSerXmin(TransactionId xid);
static uint32 predicatelock_hash(const void *key, Size keysize); static uint32 predicatelock_hash(const void *key, Size keysize);
static void SummarizeOldestCommittedSxact(void); static void SummarizeOldestCommittedSxact(void);
static Snapshot GetSafeSnapshot(Snapshot snapshot); static Snapshot GetSafeSnapshot(Snapshot snapshot);
static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot); static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot,
TransactionId sourcexid);
static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag); static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag);
static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag, static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag,
PREDICATELOCKTARGETTAG *parent); PREDICATELOCKTARGETTAG *parent);
...@@ -1505,7 +1508,8 @@ GetSafeSnapshot(Snapshot origSnapshot) ...@@ -1505,7 +1508,8 @@ GetSafeSnapshot(Snapshot origSnapshot)
* our caller passed to us. The pointer returned is actually the same * our caller passed to us. The pointer returned is actually the same
* one passed to it, but we avoid assuming that here. * one passed to it, but we avoid assuming that here.
*/ */
snapshot = GetSerializableTransactionSnapshotInt(origSnapshot); snapshot = GetSerializableTransactionSnapshotInt(origSnapshot,
InvalidTransactionId);
if (MySerializableXact == InvalidSerializableXact) if (MySerializableXact == InvalidSerializableXact)
return snapshot; /* no concurrent r/w xacts; it's safe */ return snapshot; /* no concurrent r/w xacts; it's safe */
...@@ -1574,11 +1578,52 @@ GetSerializableTransactionSnapshot(Snapshot snapshot) ...@@ -1574,11 +1578,52 @@ GetSerializableTransactionSnapshot(Snapshot snapshot)
if (XactReadOnly && XactDeferrable) if (XactReadOnly && XactDeferrable)
return GetSafeSnapshot(snapshot); return GetSafeSnapshot(snapshot);
return GetSerializableTransactionSnapshotInt(snapshot); return GetSerializableTransactionSnapshotInt(snapshot,
InvalidTransactionId);
} }
/*
* Import a snapshot to be used for the current transaction.
*
* This is nearly the same as GetSerializableTransactionSnapshot, except that
* we don't take a new snapshot, but rather use the data we're handed.
*
* The caller must have verified that the snapshot came from a serializable
* transaction; and if we're read-write, the source transaction must not be
* read-only.
*/
void
SetSerializableTransactionSnapshot(Snapshot snapshot,
TransactionId sourcexid)
{
Assert(IsolationIsSerializable());
/*
* We do not allow SERIALIZABLE READ ONLY DEFERRABLE transactions to
* import snapshots, since there's no way to wait for a safe snapshot
* when we're using the snap we're told to. (XXX instead of throwing
* an error, we could just ignore the XactDeferrable flag?)
*/
if (XactReadOnly && XactDeferrable)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("a snapshot-importing transaction must not be READ ONLY DEFERRABLE")));
(void) GetSerializableTransactionSnapshotInt(snapshot, sourcexid);
}
/*
* Guts of GetSerializableTransactionSnapshot
*
* If sourcexid is valid, this is actually an import operation and we should
* skip calling GetSnapshotData, because the snapshot contents are already
* loaded up. HOWEVER: to avoid race conditions, we must check that the
* source xact is still running after we acquire SerializableXactHashLock.
* We do that by calling ProcArrayInstallImportedXmin.
*/
static Snapshot static Snapshot
GetSerializableTransactionSnapshotInt(Snapshot snapshot) GetSerializableTransactionSnapshotInt(Snapshot snapshot,
TransactionId sourcexid)
{ {
PGPROC *proc; PGPROC *proc;
VirtualTransactionId vxid; VirtualTransactionId vxid;
...@@ -1598,6 +1643,14 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot) ...@@ -1598,6 +1643,14 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot)
/* /*
* First we get the sxact structure, which may involve looping and access * First we get the sxact structure, which may involve looping and access
* to the "finished" list to free a structure for use. * to the "finished" list to free a structure for use.
*
* We must hold SerializableXactHashLock when taking/checking the snapshot
* to avoid race conditions, for much the same reasons that
* GetSnapshotData takes the ProcArrayLock. Since we might have to release
* SerializableXactHashLock to call SummarizeOldestCommittedSxact, this
* means we have to create the sxact first, which is a bit annoying (in
* particular, an elog(ERROR) in procarray.c would cause us to leak the
* sxact). Consider refactoring to avoid this.
*/ */
#ifdef TEST_OLDSERXID #ifdef TEST_OLDSERXID
SummarizeOldestCommittedSxact(); SummarizeOldestCommittedSxact();
@@ -1615,8 +1668,19 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot)
         }
     } while (!sxact);
-    /* Get the snapshot */
-    snapshot = GetSnapshotData(snapshot);
+    /* Get the snapshot, or check that it's safe to use */
+    if (!TransactionIdIsValid(sourcexid))
+        snapshot = GetSnapshotData(snapshot);
+    else if (!ProcArrayInstallImportedXmin(snapshot->xmin, sourcexid))
+    {
+        ReleasePredXact(sxact);
+        LWLockRelease(SerializableXactHashLock);
+        ereport(ERROR,
+                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                 errmsg("could not import the requested snapshot"),
+                 errdetail("The source transaction %u is not running anymore.",
+                           sourcexid)));
+    }
     /*
      * If there are no serializable transactions which are not read-only, we
......
@@ -72,6 +72,7 @@
 #include "utils/plancache.h"
 #include "utils/portal.h"
 #include "utils/ps_status.h"
+#include "utils/snapmgr.h"
 #include "utils/tzparser.h"
 #include "utils/xml.h"
@@ -6093,8 +6094,11 @@ ExecSetVariableStmt(VariableSetStmt *stmt)
         case VAR_SET_MULTI:
             /*
-             * Special case for special SQL syntax that effectively sets more
-             * than one variable per statement.
+             * Special-case SQL syntaxes. The TRANSACTION and SESSION
+             * CHARACTERISTICS cases effectively set more than one variable
+             * per statement. TRANSACTION SNAPSHOT only takes one argument,
+             * but we put it here anyway since it's a special case and not
+             * related to any GUC variable.
              */
             if (strcmp(stmt->name, "TRANSACTION") == 0)
             {
@@ -6140,6 +6144,18 @@ ExecSetVariableStmt(VariableSetStmt *stmt)
                              item->defname);
                 }
             }
+            else if (strcmp(stmt->name, "TRANSACTION SNAPSHOT") == 0)
+            {
+                A_Const *con = (A_Const *) linitial(stmt->args);
+                if (stmt->is_local)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                             errmsg("SET LOCAL TRANSACTION SNAPSHOT is not implemented")));
+                Assert(IsA(con, A_Const));
+                Assert(nodeTag(&con->val) == T_String);
+                ImportSnapshot(strVal(&con->val));
+            }
             else
                 elog(ERROR, "unexpected SET MULTI element: %s",
                      stmt->name);
......
This diff is collapsed.
@@ -2555,6 +2555,7 @@ main(int argc, char *argv[])
         "pg_clog",
         "pg_notify",
         "pg_serial",
+        "pg_snapshots",
         "pg_subtrans",
         "pg_twophase",
         "pg_multixact/members",
......
@@ -53,6 +53,6 @@
  */
 /* yyyymmddN */
-#define CATALOG_VERSION_NO 201110161
+#define CATALOG_VERSION_NO 201110221
 #endif
@@ -2870,6 +2870,9 @@ DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
+DATA(insert OID = 3809 ( pg_export_snapshot PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
+DESCR("export a snapshot");
 DATA(insert OID = 3810 ( pg_is_in_recovery PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_in_recovery _null_ _null_ _null_ ));
 DESCR("true if server is in recovery");
......
@@ -337,6 +337,7 @@ PG_KEYWORD("show", SHOW, UNRESERVED_KEYWORD)
 PG_KEYWORD("similar", SIMILAR, TYPE_FUNC_NAME_KEYWORD)
 PG_KEYWORD("simple", SIMPLE, UNRESERVED_KEYWORD)
 PG_KEYWORD("smallint", SMALLINT, COL_NAME_KEYWORD)
+PG_KEYWORD("snapshot", SNAPSHOT, UNRESERVED_KEYWORD)
 PG_KEYWORD("some", SOME, RESERVED_KEYWORD)
 PG_KEYWORD("stable", STABLE, UNRESERVED_KEYWORD)
 PG_KEYWORD("standalone", STANDALONE_P, UNRESERVED_KEYWORD)
......
@@ -43,6 +43,8 @@ extern bool PageIsPredicateLocked(Relation relation, BlockNumber blkno);
 /* predicate lock maintenance */
 extern Snapshot GetSerializableTransactionSnapshot(Snapshot snapshot);
+extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
+                                               TransactionId sourcexid);
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
......
@@ -37,10 +37,16 @@ extern void ExpireTreeKnownAssignedTransactionIds(TransactionId xid,
 extern void ExpireAllKnownAssignedTransactionIds(void);
 extern void ExpireOldKnownAssignedTransactionIds(TransactionId xid);
-extern RunningTransactions GetRunningTransactionData(void);
+extern int GetMaxSnapshotXidCount(void);
+extern int GetMaxSnapshotSubxidCount(void);
 extern Snapshot GetSnapshotData(Snapshot snapshot);
+extern bool ProcArrayInstallImportedXmin(TransactionId xmin,
+                                         TransactionId sourcexid);
+extern RunningTransactions GetRunningTransactionData(void);
 extern bool TransactionIdIsInProgress(TransactionId xid);
 extern bool TransactionIdIsActive(TransactionId xid);
 extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
......
@@ -42,4 +42,9 @@ extern void AtSubCommit_Snapshot(int level);
 extern void AtSubAbort_Snapshot(int level);
 extern void AtEOXact_Snapshot(bool isCommit);
+extern Datum pg_export_snapshot(PG_FUNCTION_ARGS);
+extern void ImportSnapshot(const char *idstr);
+extern bool XactHasExportedSnapshots(void);
+extern void DeleteAllExportedSnapshotFiles(void);
 #endif /* SNAPMGR_H */