Commit 40f908bd authored by Heikki Linnakangas

Introduce Streaming Replication.

This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me
parent 4cbe4739
<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.125 2009/12/12 21:35:21 mha Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.126 2010/01/15 09:18:56 heikki Exp $ -->
<chapter id="client-authentication">
<title>Client Authentication</title>
......@@ -181,6 +181,8 @@ hostnossl <replaceable>database</replaceable> <replaceable>user</replaceable>
the requested user must be a member of the role with the same
name as the requested database. (<literal>samegroup</> is an
obsolete but still accepted spelling of <literal>samerole</>.)
The value <literal>replication</> specifies that the record
matches if streaming replication is requested.
Otherwise, this is the name of
a specific <productname>PostgreSQL</productname> database.
Multiple database names can be supplied by separating them with
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.243 2010/01/06 02:41:37 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.244 2010/01/15 09:18:58 heikki Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
......@@ -1746,6 +1746,51 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
</variablelist>
</sect2>
<sect2 id="runtime-config-replication">
<title>Streaming Replication</title>
<para>
These settings control the behavior of the built-in
<firstterm>streaming replication</> feature.
</para>
<variablelist>
<varlistentry id="guc-max-wal-senders" xreflabel="max_wal_senders">
<term><varname>max_wal_senders</varname> (<type>integer</type>)</term>
<indexterm>
<primary><varname>max_wal_senders</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies the maximum number of concurrent connections from standby
servers (i.e., the maximum number of simultaneously running WAL sender
processes). The default is zero. This parameter can only be set at
server start.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-wal-sender-delay" xreflabel="wal_sender_delay">
<term><varname>wal_sender_delay</varname> (<type>integer</type>)</term>
<indexterm>
<primary><varname>wal_sender_delay</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies the delay between activity rounds for the WAL sender.
In each round the WAL sender sends any WAL accumulated since the last
round to the standby server. It then sleeps for
<varname>wal_sender_delay</> milliseconds, and repeats. The default
value is 200 milliseconds (<literal>200ms</>).
Note that on many systems, the effective resolution of sleep delays is
10 milliseconds; setting <varname>wal_sender_delay</> to a value that
is not a multiple of 10 might have the same results as setting it to
the next higher multiple of 10. This parameter can only be set in the
<filename>postgresql.conf</> file or on the server command line.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="runtime-config-standby">
<title>Standby Servers</title>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.495 2009/12/19 17:49:50 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.496 2010/01/15 09:18:58 heikki Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
......@@ -12984,7 +12984,8 @@ SELECT set_config('log_statement_stats', 'off', false);
<para>
The functions shown in <xref
linkend="functions-admin-backup-table"> assist in making on-line backups.
Use of the first three functions is restricted to superusers.
Use of the first three functions is restricted to superusers. The first
five functions cannot be executed during recovery.
</para>
<table id="functions-admin-backup-table">
......@@ -13135,11 +13136,17 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<indexterm>
<primary>pg_is_in_recovery</primary>
</indexterm>
<indexterm>
<primary>pg_last_xlog_receive_location</primary>
</indexterm>
<indexterm>
<primary>pg_last_xlog_replay_location</primary>
</indexterm>
<para>
The functions shown in <xref
linkend="functions-recovery-info-table"> provide information
about the current status of Hot Standby.
about the current status of the standby.
These functions may be executed during both recovery and in normal running.
</para>
......@@ -13160,6 +13167,33 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<entry>True if recovery is still in progress.
</entry>
</row>
<row>
<entry>
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
<entry>Get last transaction log location received and synced to disk during
streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
<entry>
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
<entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
</tgroup>
</table>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.35 2009/04/27 16:27:35 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.36 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="high-availability">
<title>High Availability, Load Balancing, and Replication</title>
......@@ -146,11 +146,16 @@ protocol to make nodes agree on a serializable transactional order.
made the new master database server. This is asynchronous and
can only be done for the entire database server.
</para>
<para>
A PITR warm standby server can be kept more up-to-date using the
streaming replication feature built into <productname>PostgreSQL</> 8.5
onwards.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Master-Slave Replication</term>
<term>Trigger-Based Master-Slave Replication</term>
<listitem>
<para>
......@@ -278,7 +283,7 @@ protocol to make nodes agree on a serializable transactional order.
<entry>Shared Disk Failover</entry>
<entry>File System Replication</entry>
<entry>Warm Standby Using PITR</entry>
<entry>Master-Slave Replication</entry>
<entry>Trigger-Based Master-Slave Replication</entry>
<entry>Statement-Based Replication Middleware</entry>
<entry>Asynchronous Multimaster Replication</entry>
<entry>Synchronous Multimaster Replication</entry>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.72 2009/08/07 20:54:31 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.73 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="performance-tips">
<title>Performance Tips</title>
......@@ -836,8 +836,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration does not apply when
<xref linkend="guc-archive-mode"> is on, as all commands
must write WAL in that case.
<xref linkend="guc-archive-mode"> is on or streaming replication
is allowed (i.e., <xref linkend="guc-max-wal-senders"> is at least
one), as all commands must write WAL in that case.
</para>
</sect2>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.76 2009/12/02 04:54:10 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.77 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="protocol">
<title>Frontend/Backend Protocol</title>
......@@ -4140,6 +4140,66 @@ not line breaks.
</sect1>
<sect1 id="protocol-replication">
<title>Streaming Replication Protocol</title>
<para>
To initiate streaming replication, the frontend sends the "replication"
parameter in the startup message. This tells the backend to go into
walsender mode, where a small set of replication commands can be issued
instead of SQL statements. Only the simple query protocol can be used in
walsender mode.
The commands accepted in walsender mode are:
<variablelist>
<varlistentry>
<term>IDENTIFY_SYSTEM</term>
<listitem>
<para>
Requests the server to identify itself. Server replies with a result
set of a single row, and two fields:
systemid: The unique system identifier identifying the cluster. This
can be used to check that the base backup used to initialize the
slave came from the same cluster.
timeline: Current TimelineID. Also used to check that the slave is
consistent with the master.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>START_REPLICATION XXX/XXX</term>
<listitem>
<para>
Instructs backend to start streaming WAL, starting at point XXX/XXX.
Server can reply with an error, e.g., if the requested piece of WAL has
already been recycled. On success, server responds with a
CopyOutResponse message, and backend starts to stream WAL as CopyData
messages.
</para>
<para>
The payload in each CopyData message consists of an XLogRecPtr,
indicating the starting point of the WAL in the message, immediately
followed by the WAL data itself.
</para>
<para>
A single WAL record is never split across two CopyData messages. When
a WAL record crosses a WAL page boundary, however, and is therefore
already split using continuation records, it can be split at the page
boundary. In other words, the first main WAL record and its
continuation records can be split across different CopyData messages.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</sect1>
<sect1 id="protocol-changes">
<title>Summary of Changes since Protocol 2.0</title>
......
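The START_REPLICATION command above takes its starting point in the XXX/XXX form, the same two-part hexadecimal notation the documentation uses for InvalidXLogRecPtr (0/0). A minimal sketch of how a client might parse such a location and build the command text follows. The WalLocation struct and both helper names are hypothetical and are not part of this patch; the sketch also assumes that unsigned int is at least 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the server's XLogRecPtr: two 32-bit halves. */
typedef struct WalLocation
{
    uint32_t xlogid;    /* high half, written before the slash */
    uint32_t xrecoff;   /* low half, written after the slash */
} WalLocation;

/*
 * Parse "hi/lo", both halves hexadecimal.
 * Returns 1 on success, 0 on malformed input.
 */
static int
parse_wal_location(const char *text, WalLocation *loc)
{
    unsigned int hi, lo;
    char         trailing;

    assert(text != NULL && loc != NULL);

    /* The %c catches trailing garbage: a clean match converts exactly 2 items. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return 0;

    loc->xlogid = hi;
    loc->xrecoff = lo;
    return 1;
}

/* Write "START_REPLICATION hi/lo" into buf; returns the snprintf result. */
static int
format_start_replication(char *buf, size_t buflen, WalLocation loc)
{
    return snprintf(buf, buflen, "START_REPLICATION %X/%X",
                    (unsigned int) loc.xlogid, (unsigned int) loc.xrecoff);
}
```

A caller would typically parse the location it last flushed to disk, then send the formatted string as a simple query on a connection opened with the "replication" startup parameter.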
......@@ -4,7 +4,7 @@
#
# Copyright (c) 1994, Regents of the University of California
#
# $PostgreSQL: pgsql/src/Makefile,v 1.47 2009/08/26 22:24:42 petere Exp $
# $PostgreSQL: pgsql/src/Makefile,v 1.48 2010/01/15 09:18:59 heikki Exp $
#
#-------------------------------------------------------------------------
......@@ -18,6 +18,7 @@ all install installdirs uninstall distprep:
$(MAKE) -C timezone $@
$(MAKE) -C backend $@
$(MAKE) -C backend/utils/mb/conversion_procs $@
$(MAKE) -C backend/replication/walreceiver $@
$(MAKE) -C backend/snowball $@
$(MAKE) -C include $@
$(MAKE) -C interfaces $@
......
......@@ -5,7 +5,7 @@
# Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
# Portions Copyright (c) 1994, Regents of the University of California
#
# $PostgreSQL: pgsql/src/backend/Makefile,v 1.139 2010/01/05 01:20:35 tgl Exp $
# $PostgreSQL: pgsql/src/backend/Makefile,v 1.140 2010/01/15 09:18:59 heikki Exp $
#
#-------------------------------------------------------------------------
......@@ -15,7 +15,7 @@ top_builddir = ../..
include $(top_builddir)/src/Makefile.global
SUBDIRS = access bootstrap catalog parser commands executor foreign lib libpq \
main nodes optimizer port postmaster regex rewrite \
main nodes optimizer port postmaster regex replication rewrite \
storage tcop tsearch utils $(top_builddir)/src/timezone
include $(srcdir)/common.mk
......
......@@ -59,7 +59,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtsort.c,v 1.121 2010/01/02 16:57:35 momjian Exp $
* $PostgreSQL: pgsql/src/backend/access/nbtree/nbtsort.c,v 1.122 2010/01/15 09:19:00 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -210,10 +210,10 @@ _bt_leafbuild(BTSpool *btspool, BTSpool *btspool2)
wstate.index = btspool->index;
/*
* We need to log index creation in WAL iff WAL archiving is enabled AND
* it's not a temp index.
* We need to log index creation in WAL iff WAL archiving/streaming is
* enabled AND it's not a temp index.
*/
wstate.btws_use_wal = XLogArchivingActive() && !wstate.index->rd_istemp;
wstate.btws_use_wal = XLogIsNeeded() && !wstate.index->rd_istemp;
/* reserve the metapage */
wstate.btws_pages_alloced = BTREE_METAPAGE + 1;
......
......@@ -2,13 +2,14 @@
# PostgreSQL recovery config file
# -------------------------------
#
# Edit this file to provide the parameters that PostgreSQL
# needs to perform an archive recovery of a database.
# Edit this file to provide the parameters that PostgreSQL needs to
# perform an archive recovery of a database, or to act as a log-streaming
# replication standby.
#
# If "recovery.conf" is present in the PostgreSQL data directory, it is
# read on postmaster startup. After successful recovery, it is renamed
# to "recovery.done" to ensure that we do not accidentally re-enter
# archive recovery mode.
# archive recovery or standby mode.
#
# This file consists of lines of the form:
#
......@@ -23,7 +24,7 @@
# are example values.
#
#---------------------------------------------------------------------------
# REQUIRED PARAMETERS
# ARCHIVE RECOVERY PARAMETERS
#---------------------------------------------------------------------------
#
# restore_command
......@@ -33,6 +34,9 @@
# which is replaced by the name of the desired log file, and %p,
# which is replaced by the absolute path to copy the log file to.
#
# This parameter is *required* for an archive recovery, but optional
# for replication.
#
# It is important that the command return nonzero exit status on failure.
# The command *will* be asked for log files that are not present in the
# archive; it must return nonzero when so asked.
......@@ -43,10 +47,6 @@
#restore_command = 'cp /mnt/server/archivedir/%f %p'
#
#
#---------------------------------------------------------------------------
# OPTIONAL PARAMETERS
#---------------------------------------------------------------------------
#
# recovery_end_command
#
# specifies an optional shell command to execute at completion of recovery.
......@@ -79,6 +79,28 @@
#
#
#---------------------------------------------------------------------------
# LOG-STREAMING REPLICATION PARAMETERS
#---------------------------------------------------------------------------
#
# When standby_mode is enabled, the PostgreSQL server will work as
# a standby. It tries to connect to the primary according to the
# connection settings primary_conninfo, and receives XLOG records
# continuously.
#
#standby_mode = 'false' # 'true' or 'false'
#
#primary_conninfo = 'host=localhost port=5432'
#
#
# By default, a standby server keeps streaming XLOG records from the
# primary indefinitely. If you want to stop streaming and finish recovery,
# opening up the system in read/write mode, specify path to a trigger file.
# Server will poll the trigger file path periodically and stop streaming
# when it's found.
#
#trigger_file = ''
#
#---------------------------------------------------------------------------
# HOT STANDBY PARAMETERS
#---------------------------------------------------------------------------
#
......
......@@ -8,7 +8,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/bootstrap/bootstrap.c,v 1.255 2010/01/02 16:57:36 momjian Exp $
* $PostgreSQL: pgsql/src/backend/bootstrap/bootstrap.c,v 1.256 2010/01/15 09:19:00 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -32,6 +32,7 @@
#include "nodes/makefuncs.h"
#include "postmaster/bgwriter.h"
#include "postmaster/walwriter.h"
#include "replication/walreceiver.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
#include "storage/proc.h"
......@@ -173,7 +174,7 @@ static IndexList *ILHead = NULL;
* AuxiliaryProcessMain
*
* The main entry point for auxiliary processes, such as the bgwriter,
* walwriter, bootstrapper and the shared memory checker code.
* walwriter, walreceiver, bootstrapper and the shared memory checker code.
*
* This code is here just because of historical reasons.
*/
......@@ -314,6 +315,9 @@ AuxiliaryProcessMain(int argc, char *argv[])
case WalWriterProcess:
statmsg = "wal writer process";
break;
case WalReceiverProcess:
statmsg = "wal receiver process";
break;
default:
statmsg = "??? process";
break;
......@@ -419,6 +423,24 @@ AuxiliaryProcessMain(int argc, char *argv[])
WalWriterMain();
proc_exit(1); /* should never return */
case WalReceiverProcess:
/* don't set signals, walreceiver has its own agenda */
{
PGFunction WalReceiverMain;
/*
* Walreceiver is not linked directly into the server
* binary because we would then need to link the server
* with libpq. It's compiled as a dynamically loaded module
* to avoid that.
*/
WalReceiverMain = load_external_function("walreceiver",
"WalReceiverMain",
true, NULL);
WalReceiverMain(NULL);
}
proc_exit(1); /* should never return */
default:
elog(PANIC, "unrecognized process type: %d", auxType);
proc_exit(1);
......
......@@ -11,7 +11,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/commands/cluster.c,v 1.192 2010/01/06 11:25:39 itagaki Exp $
* $PostgreSQL: pgsql/src/backend/commands/cluster.c,v 1.193 2010/01/15 09:19:01 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -816,10 +816,10 @@ copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex,
isnull = (bool *) palloc(natts * sizeof(bool));
/*
* We need to log the copied data in WAL iff WAL archiving is enabled AND
* it's not a temp rel.
* We need to log the copied data in WAL iff WAL archiving/streaming
* is enabled AND it's not a temp rel.
*/
use_wal = XLogArchivingActive() && !NewHeap->rd_istemp;
use_wal = XLogIsNeeded() && !NewHeap->rd_istemp;
/* use_wal off requires rd_targblock be initially invalid */
Assert(NewHeap->rd_targblock == InvalidBlockNumber);
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.320 2010/01/02 16:57:37 momjian Exp $
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.321 2010/01/15 09:19:01 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -1725,7 +1725,7 @@ CopyFrom(CopyState cstate)
/*----------
* Check to see if we can avoid writing WAL
*
* If archive logging is not enabled *and* either
* If archive logging/streaming is not enabled *and* either
* - table was created in same transaction as this COPY
* - data is being written to relfilenode created in this transaction
* then we can skip writing WAL. It's safe because if the transaction
......@@ -1753,7 +1753,7 @@ CopyFrom(CopyState cstate)
cstate->rel->rd_newRelfilenodeSubid != InvalidSubTransactionId)
{
hi_options |= HEAP_INSERT_SKIP_FSM;
if (!XLogArchivingActive())
if (!XLogIsNeeded())
hi_options |= HEAP_INSERT_SKIP_WAL;
}
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/commands/tablecmds.c,v 1.314 2010/01/06 03:04:00 momjian Exp $
* $PostgreSQL: pgsql/src/backend/commands/tablecmds.c,v 1.315 2010/01/15 09:19:01 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -7044,10 +7044,10 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
Page page = (Page) buf;
/*
* We need to log the copied data in WAL iff WAL archiving is enabled AND
* it's not a temp rel.
* We need to log the copied data in WAL iff WAL archiving/streaming is
* enabled AND it's not a temp rel.
*/
use_wal = XLogArchivingActive() && !istemp;
use_wal = XLogIsNeeded() && !istemp;
nblocks = smgrnblocks(src, forkNum);
......
......@@ -26,7 +26,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.341 2010/01/08 02:44:00 tgl Exp $
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.342 2010/01/15 09:19:02 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -2213,11 +2213,11 @@ OpenIntoRel(QueryDesc *queryDesc)
myState->rel = intoRelationDesc;
/*
* We can skip WAL-logging the insertions, unless PITR is in use. We can
* skip the FSM in any case.
* We can skip WAL-logging the insertions, unless PITR or streaming
* replication is in use. We can skip the FSM in any case.
*/
myState->hi_options = HEAP_INSERT_SKIP_FSM |
(XLogArchivingActive() ? 0 : HEAP_INSERT_SKIP_WAL);
(XLogIsNeeded() ? 0 : HEAP_INSERT_SKIP_WAL);
myState->bistate = GetBulkInsertState();
/* Not using WAL requires rd_targblock be initially invalid */
......
......@@ -11,7 +11,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/libpq/be-secure.c,v 1.95 2010/01/02 16:57:45 momjian Exp $
* $PostgreSQL: pgsql/src/backend/libpq/be-secure.c,v 1.96 2010/01/15 09:19:02 heikki Exp $
*
* Since the server static private key ($DataDir/server.key)
* will normally be stored unencrypted so that the database
......@@ -255,6 +255,8 @@ rloop:
break;
case SSL_ERROR_WANT_READ:
case SSL_ERROR_WANT_WRITE:
if (port->noblock)
return 0;
#ifdef WIN32
pgwin32_waitforsinglesocket(SSL_get_fd(port->ssl),
(err == SSL_ERROR_WANT_READ) ?
......
......@@ -10,7 +10,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/libpq/hba.c,v 1.194 2010/01/02 16:57:45 momjian Exp $
* $PostgreSQL: pgsql/src/backend/libpq/hba.c,v 1.195 2010/01/15 09:19:02 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -28,6 +28,7 @@
#include "libpq/ip.h"
#include "libpq/libpq.h"
#include "regex/regex.h"
#include "replication/walsender.h"
#include "storage/fd.h"
#include "utils/acl.h"
#include "utils/guc.h"
......@@ -190,7 +191,8 @@ next_token(FILE *fp, char *buf, int bufsz)
(strcmp(start_buf, "all") == 0 ||
strcmp(start_buf, "sameuser") == 0 ||
strcmp(start_buf, "samegroup") == 0 ||
strcmp(start_buf, "samerole") == 0))
strcmp(start_buf, "samerole") == 0 ||
strcmp(start_buf, "replication") == 0))
{
/* append newline to a magical keyword */
*buf++ = '\n';
......@@ -514,6 +516,9 @@ check_db(const char *dbname, const char *role, Oid roleid, char *param_str)
if (is_member(roleid, dbname))
return true;
}
else if (strcmp(tok, "replication\n") == 0 &&
am_walsender)
return true;
else if (strcmp(tok, dbname) == 0)
return true;
}
......
......@@ -20,8 +20,8 @@
# "host" is either a plain or SSL-encrypted TCP/IP socket, "hostssl" is an
# SSL-encrypted TCP/IP socket, and "hostnossl" is a plain TCP/IP socket.
#
# DATABASE can be "all", "sameuser", "samerole", a database name, or
# a comma-separated list thereof.
# DATABASE can be "all", "sameuser", "samerole", "replication",
# a database name, or a comma-separated list thereof.
#
# USER can be "all", a user name, a group name prefixed with "+", or
# a comma-separated list thereof. In both the DATABASE and USER fields
......@@ -47,9 +47,9 @@
# for a list of which options are available for which authentication methods.
#
# Database and user names containing spaces, commas, quotes and other special
# characters must be quoted. Quoting one of the keywords "all", "sameuser" or
# "samerole" makes the name lose its special character, and just match a
# database or username with that name.
# characters must be quoted. Quoting one of the keywords "all", "sameuser",
# "samerole" or "replication" makes the name lose its special character,
# and just match a database or username with that name.
#
# This file is read on server startup and when the postmaster receives
# a SIGHUP signal. If you edit the file on a running system, you have
......
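The hba.c hunks and the sample-file comment above rely on one convention: next_token() appends a newline to an unquoted magic keyword, so a quoted "replication" stays an ordinary database name. A minimal sketch of that matching rule, reduced to the "replication" keyword alone, might look like this. The function name and the is_walsender parameter are hypothetical stand-ins for check_db() and the am_walsender global, and the other keywords are deliberately left out.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical, simplified matcher for one token of the DATABASE field.
 * A token ending in '\n' is a magic keyword; anything else is a plain name.
 */
static bool
db_token_matches(const char *tok, const char *dbname, bool is_walsender)
{
    /* The keyword form matches only connections that asked for replication. */
    if (strcmp(tok, "replication\n") == 0)
        return is_walsender;

    /* A quoted (newline-free) token is compared as an ordinary database name. */
    return strcmp(tok, dbname) == 0;
}
```

With this rule, an unquoted replication entry never grants access to a database that happens to be named replication, while a quoted one does exactly that.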
......@@ -30,7 +30,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/backend/libpq/pqcomm.c,v 1.201 2010/01/10 14:16:07 mha Exp $
* $PostgreSQL: pgsql/src/backend/libpq/pqcomm.c,v 1.202 2010/01/15 09:19:02 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -55,6 +55,7 @@
* pq_peekbyte - peek at next byte from connection
* pq_putbytes - send bytes to connection (not flushed until pq_flush)
* pq_flush - flush pending output
* pq_getbyte_if_available - get a byte if available without blocking
*
* message-level I/O (and old-style-COPY-OUT cruft):
* pq_putmessage - send a normal message (suppressed in COPY OUT mode)
......@@ -815,6 +816,56 @@ pq_peekbyte(void)
return (unsigned char) PqRecvBuffer[PqRecvPointer];
}
/* --------------------------------
* pq_getbyte_if_available - get a single byte from connection,
* if available
*
* The received byte is stored in *c. Returns 1 if a byte was read, 0 if
* no data was available, or EOF.
* --------------------------------
*/
int
pq_getbyte_if_available(unsigned char *c)
{
int r;
if (PqRecvPointer < PqRecvLength)
{
*c = PqRecvBuffer[PqRecvPointer++];
return 1;
}
/* Temporarily put the socket into non-blocking mode */
if (!pg_set_noblock(MyProcPort->sock))
ereport(ERROR,
(errmsg("couldn't put socket to non-blocking mode: %m")));
MyProcPort->noblock = true;
PG_TRY();
{
r = secure_read(MyProcPort, c, 1);
}
PG_CATCH();
{
/*
* The rest of the backend code assumes the socket is in blocking
* mode, so treat failure as FATAL.
*/
if (!pg_set_block(MyProcPort->sock))
ereport(FATAL,
(errmsg("couldn't put socket to blocking mode: %m")));
MyProcPort->noblock = false;
PG_RE_THROW();
}
PG_END_TRY();
if (!pg_set_block(MyProcPort->sock))
ereport(FATAL,
(errmsg("couldn't put socket to blocking mode: %m")));
MyProcPort->noblock = false;
return r;
}
/* --------------------------------
* pq_getbytes - get a known number of bytes from connection
*
......
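The pq_getbyte_if_available() hunk above switches the socket to non-blocking mode for a single read and then restores blocking mode. The same pattern can be sketched outside the backend with plain POSIX calls. This is an illustration under stated assumptions, not the patch's code: read_byte_if_available is a hypothetical name, it uses fcntl() and read() in place of pg_set_noblock() and secure_read(), and it reports a failed restore as EOF where the backend raises FATAL.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to read one byte from fd without blocking.
 * Returns 1 if a byte was stored in *c, 0 if no data was available,
 * or EOF on end-of-stream or error.
 */
static int
read_byte_if_available(int fd, unsigned char *c)
{
    int     flags = fcntl(fd, F_GETFL);
    ssize_t n;
    int     saved_errno;

    /* Temporarily put the descriptor into non-blocking mode. */
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return EOF;

    do
    {
        n = read(fd, c, 1);
    } while (n < 0 && errno == EINTR);
    saved_errno = errno;

    /* Always restore the original mode; callers expect blocking I/O. */
    if (fcntl(fd, F_SETFL, flags) < 0)
        return EOF;

    if (n == 1)
        return 1;
    if (n < 0 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
        return 0;
    return EOF;     /* n == 0 (peer closed) or a hard error */
}
```

Saving errno before the second fcntl() call matters, because restoring the flags could otherwise overwrite the error code that distinguishes "no data yet" from a real failure.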
......@@ -38,7 +38,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/postmaster/bgwriter.c,v 1.65 2010/01/02 16:57:50 momjian Exp $
* $PostgreSQL: pgsql/src/backend/postmaster/bgwriter.c,v 1.66 2010/01/15 09:19:02 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -353,6 +353,12 @@ BackgroundWriterMain(void)
*/
PG_SETMASK(&UnBlockSig);
/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();
/*
* Loop forever
*/
......
#-------------------------------------------------------------------------
#
# Makefile--
# Makefile for src/backend/replication
#
# IDENTIFICATION
# $PostgreSQL: pgsql/src/backend/replication/Makefile,v 1.1 2010/01/15 09:19:03 heikki Exp $
#
#-------------------------------------------------------------------------
subdir = src/backend/replication
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = walsender.o walreceiverfuncs.o
include $(top_srcdir)/src/backend/common.mk
$PostgreSQL: pgsql/src/backend/replication/README,v 1.1 2010/01/15 09:19:03 heikki Exp $
Walreceiver IPC
---------------
When the WAL replay in startup process has reached the end of archived WAL,
recoverable using restore_command, it starts up the walreceiver process
to fetch more WAL (if streaming replication is configured).
Walreceiver is a postmaster subprocess, so the startup process can't fork it
directly. Instead, it sends a signal to postmaster, asking postmaster to launch
it. Before that, however, startup process fills in WalRcvData->conninfo,
and initializes the starting point in WalRcvData->receivedUpTo.
As walreceiver receives WAL from the master server, and writes and flushes
it to disk (in pg_xlog), it updates WalRcvData->receivedUpTo. Startup process
polls that to know how far it can proceed with WAL replay.
Walsender IPC
-------------
At shutdown, postmaster handles walsender processes differently from regular
backends. It waits for regular backends to die before writing the
shutdown checkpoint and terminating pgarch and other auxiliary processes, but
that's not desirable for walsenders, because we want the standby servers to
receive all the WAL, including the shutdown checkpoint, before the master
is shut down. Therefore postmaster treats walsenders like the pgarch process,
and instructs them to terminate at PM_SHUTDOWN_2 phase, after all regular
backends have died and bgwriter has written the shutdown checkpoint.
When postmaster accepts a connection, it immediately forks a new process
to handle the handshake and authentication, and the process initializes to
become a backend. Postmaster doesn't know if the process becomes a regular
backend or a walsender process at that time - that's indicated in the
connection handshake - so we need some extra signaling to let postmaster
identify walsender processes.
When walsender process starts up, it marks itself as a walsender process in
the PMSignal array. That way postmaster can tell it apart from regular
backends.
Note that no big harm is done if postmaster thinks that a walsender is a
regular backend; it will just terminate the walsender earlier in the shutdown
phase. A walsender will look like a regular backend until it's done with the
initialization and has marked itself in PMSignal array, and at process
termination, after unmarking the PMSignal slot.
Each walsender allocates an entry from the WalSndCtl array, and advertises
there how far it has streamed WAL already. This is used at checkpoints, to
avoid recycling WAL that hasn't been streamed to a slave yet. However,
that doesn't stop such WAL from being recycled when the connection is not
established.
Walsender - walreceiver protocol
--------------------------------
See manual.
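The README above says each walsender advertises in the WalSndCtl array how far it has streamed, and that checkpoints consult this to avoid recycling WAL a connected slave still needs. A minimal sketch of that lookup follows. The SenderSlot layout and the function name are hypothetical, the WAL position is flattened to a single 64-bit integer for brevity, and the locking that real shared memory access would need is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-walsender slot. */
typedef struct SenderSlot
{
    bool     in_use;        /* is a walsender attached to this slot? */
    uint64_t sent_up_to;    /* WAL position streamed so far */
} SenderSlot;

/*
 * Find the oldest position that any connected sender has reached.
 * Returns false if no slot is in use, in which case nothing holds WAL back.
 */
static bool
oldest_streamed_position(const SenderSlot *slots, size_t nslots,
                         uint64_t *oldest)
{
    bool found = false;

    for (size_t i = 0; i < nslots; i++)
    {
        if (!slots[i].in_use)
            continue;       /* disconnected senders do not protect any WAL */
        if (!found || slots[i].sent_up_to < *oldest)
        {
            *oldest = slots[i].sent_up_to;
            found = true;
        }
    }
    return found;
}
```

The false return mirrors the README's caveat: when no connection is established, there is no slot to consult and older WAL can be recycled.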
#-------------------------------------------------------------------------
#
# Makefile--
# Makefile for src/backend/replication/walreceiver
#
# IDENTIFICATION
# $PostgreSQL: pgsql/src/backend/replication/walreceiver/Makefile,v 1.1 2010/01/15 09:19:03 heikki Exp $
#
#-------------------------------------------------------------------------
subdir = src/backend/replication/walreceiver
top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
OBJS = walreceiver.o
SHLIB_LINK = $(libpq)
NAME = walreceiver
all: submake-libpq all-shared-lib
include $(top_srcdir)/src/Makefile.shlib
install: all installdirs install-lib
installdirs: installdirs-lib
uninstall: uninstall-lib
clean distclean maintainer-clean: clean-lib
rm -f $(OBJS)
/*-------------------------------------------------------------------------
*
* walreceiverfuncs.c
*
* This file contains functions used by the startup process to communicate
* with the walreceiver process. Functions implementing walreceiver itself
* are in src/backend/replication/walreceiver subdirectory.
*
* Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/replication/walreceiverfuncs.c,v 1.1 2010/01/15 09:19:03 heikki Exp $
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <signal.h>
#include "access/xlog_internal.h"
#include "replication/walreceiver.h"
#include "storage/fd.h"
#include "storage/pmsignal.h"
#include "storage/shmem.h"
#include "utils/guc.h"
WalRcvData *WalRcv = NULL;
static bool CheckForStandbyTrigger(void);
static void ShutdownWalRcv(void);
/* Report shared memory space needed by WalRcvShmemInit */
Size
WalRcvShmemSize(void)
{
Size size = 0;
size = add_size(size, sizeof(WalRcvData));
return size;
}
/* Allocate and initialize walreceiver-related shared memory */
void
WalRcvShmemInit(void)
{
bool found;
WalRcv = (WalRcvData *)
ShmemInitStruct("Wal Receiver Ctl", WalRcvShmemSize(), &found);
if (WalRcv == NULL)
ereport(FATAL,
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("not enough shared memory for walreceiver")));
if (found)
return; /* already initialized */
/* Initialize the data structures */
MemSet(WalRcv, 0, WalRcvShmemSize());
WalRcv->walRcvState = WALRCV_NOT_STARTED;
SpinLockInit(&WalRcv->mutex);
}
/* Is walreceiver in progress (or starting up)? */
bool
WalRcvInProgress(void)
{
/* use volatile pointer to prevent code rearrangement */
volatile WalRcvData *walrcv = WalRcv;
WalRcvState state;
SpinLockAcquire(&walrcv->mutex);
state = walrcv->walRcvState;
SpinLockRelease(&walrcv->mutex);
if (state == WALRCV_RUNNING || state == WALRCV_STOPPING)
return true;
else
return false;
}
/*
* Wait for the XLOG record at given position to become available.
*
* 'recptr' indicates the byte position which caller wants to read the
* XLOG record up to. The byte position actually written and flushed
* by walreceiver is returned. It can be higher than the requested
* location, and the caller can safely read up to that point without
* calling WaitNextXLogAvailable() again.
*
* If WAL streaming is ended (because a trigger file is found), *finished
* is set to true and function returns immediately. The returned position
* can be lower than requested in that case.
*
* Called by the startup process during streaming recovery.
*/
XLogRecPtr
WaitNextXLogAvailable(XLogRecPtr recptr, bool *finished)
{
static XLogRecPtr receivedUpto = {0, 0};
*finished = false;
/* Quick exit if already known available */
if (XLByteLT(recptr, receivedUpto))
return receivedUpto;
for (;;)
{
/* use volatile pointer to prevent code rearrangement */
volatile WalRcvData *walrcv = WalRcv;
/* Update local status */
SpinLockAcquire(&walrcv->mutex);
receivedUpto = walrcv->receivedUpto;
SpinLockRelease(&walrcv->mutex);
/* If available already, leave here */
if (XLByteLT(recptr, receivedUpto))
return receivedUpto;
/* Check to see if the trigger file exists */
if (CheckForStandbyTrigger())
{
*finished = true;
return receivedUpto;
}
pg_usleep(100000L); /* 100ms */
/*
* This possibly-long loop needs to handle interrupts of startup
* process.
*/
HandleStartupProcInterrupts();
/*
* Emergency bailout if postmaster has died. This is to avoid the
* necessity for manual cleanup of all postmaster children.
*/
if (!PostmasterIsAlive(true))
exit(1);
}
}
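The control flow of WaitNextXLogAvailable() can be sketched in isolation as below. This is a simplified model, not the function itself: `Lsn`, `SharedState`, `NapHook`, `receiver_step` and `wait_next_available` are hypothetical names, the position is flattened to a single 64-bit integer, the cache is passed in by the caller instead of being a function-static variable, and the spinlock, interrupt handling and postmaster check are omitted. Note the strict comparison: because receivedUpto-1 is the last byte received, a request for position `want` is satisfied only once `want < receivedUpto`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat byte position standing in for XLogRecPtr. */
typedef uint64_t Lsn;

/* Hypothetical stand-in for the shared fields the loop reads. */
typedef struct
{
    Lsn  receivedUpto;   /* first byte NOT yet received */
    bool triggerSeen;    /* stands in for CheckForStandbyTrigger() */
} SharedState;

/* Called once per iteration where the real loop sleeps for 100ms. */
typedef void (*NapHook)(SharedState *);

/* Demo hook: pretend the receiver flushed another 100 bytes while we slept. */
static void receiver_step(SharedState *s)
{
    s->receivedUpto += 100;
}

static Lsn
wait_next_available(SharedState *shared, Lsn *cache, Lsn want,
                    NapHook nap, bool *finished)
{
    *finished = false;

    /* Quick exit: an earlier call already saw enough WAL. */
    if (want < *cache)
        return *cache;

    for (;;)
    {
        *cache = shared->receivedUpto;  /* the real code takes a spinlock here */
        if (want < *cache)
            return *cache;
        if (shared->triggerSeen)
        {
            *finished = true;           /* may return less than requested */
            return *cache;
        }
        nap(shared);
    }
}
```

Returning the cached position rather than the requested one lets the caller read several records before it has to poll shared memory again.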
/*
* Stop walreceiver and wait for it to die.
*/
static void
ShutdownWalRcv(void)
{
/* use volatile pointer to prevent code rearrangement */
volatile WalRcvData *walrcv = WalRcv;
pid_t walrcvpid;
/*
* Request walreceiver to stop. Walreceiver will switch to WALRCV_STOPPED
* mode once it's finished, and will also ask postmaster not to restart
* it.
*/
SpinLockAcquire(&walrcv->mutex);
Assert(walrcv->walRcvState == WALRCV_RUNNING);
walrcv->walRcvState = WALRCV_STOPPING;
walrcvpid = walrcv->pid;
SpinLockRelease(&walrcv->mutex);
/*
* Pid can be 0, if no walreceiver process is active right now.
* Postmaster should restart it, and when it does, it will see the
* STOPPING state.
*/
if (walrcvpid != 0)
kill(walrcvpid, SIGTERM);
/*
* Wait for walreceiver to acknowledge its death by setting state to
* WALRCV_STOPPED.
*/
while (WalRcvInProgress())
{
/*
* This possibly-long loop needs to handle interrupts of startup
* process.
*/
HandleStartupProcInterrupts();
pg_usleep(100000); /* 100ms */
}
}
/*
* Check to see if the trigger file exists. If it does, shut down
* walreceiver, wait for it to exit, and remove the trigger file.
*/
static bool
CheckForStandbyTrigger(void)
{
struct stat stat_buf;
if (TriggerFile == NULL)
return false;
if (stat(TriggerFile, &stat_buf) == 0)
{
ereport(LOG,
(errmsg("trigger file found: %s", TriggerFile)));
ShutdownWalRcv();
unlink(TriggerFile);
return true;
}
return false;
}
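The stat-then-unlink pattern used by CheckForStandbyTrigger() can be tried outside the server with a small POSIX sketch. The function name `consume_trigger` is hypothetical, and the walreceiver shutdown step is left out. Unlike the code above, this sketch reports a failed unlink(); whether the server should do the same is a design question this example does not settle.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return true exactly once per trigger file: the first caller that sees
 * the file removes it, so later calls return false again.
 * A NULL path means no trigger file is configured.
 */
static bool consume_trigger(const char *path)
{
    struct stat st;

    if (path == NULL)
        return false;
    if (stat(path, &st) != 0)
        return false;           /* no trigger file present */

    /* (the real function stops walreceiver at this point) */

    if (unlink(path) != 0)
        perror("unlink");       /* report, but still treat as triggered */
    return true;
}
```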
/*
* Request postmaster to start walreceiver.
*
* recptr indicates the position where streaming should begin, and conninfo
* is a libpq connection string to use.
*/
void
RequestXLogStreaming(XLogRecPtr recptr, const char *conninfo)
{
/* use volatile pointer to prevent code rearrangement */
volatile WalRcvData *walrcv = WalRcv;
Assert(walrcv->walRcvState == WALRCV_NOT_STARTED);
/* locking is just pro forma here; walreceiver isn't started yet */
SpinLockAcquire(&walrcv->mutex);
walrcv->receivedUpto = recptr;
if (conninfo != NULL)
strlcpy((char *) walrcv->conninfo, conninfo, MAXCONNINFO);
else
walrcv->conninfo[0] = '\0';
walrcv->walRcvState = WALRCV_RUNNING;
SpinLockRelease(&walrcv->mutex);
SendPostmasterSignal(PMSIGNAL_START_WALRECEIVER);
}
/*
* Returns the byte position up to which walreceiver has written and flushed WAL.
*/
XLogRecPtr
GetWalRcvWriteRecPtr(void)
{
/* use volatile pointer to prevent code rearrangement */
volatile WalRcvData *walrcv = WalRcv;
XLogRecPtr recptr;
SpinLockAcquire(&walrcv->mutex);
recptr = walrcv->receivedUpto;
SpinLockRelease(&walrcv->mutex);
return recptr;
}
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/ipc/ipci.c,v 1.102 2010/01/02 16:57:51 momjian Exp $
* $PostgreSQL: pgsql/src/backend/storage/ipc/ipci.c,v 1.103 2010/01/15 09:19:03 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -25,6 +25,8 @@
#include "postmaster/autovacuum.h"
#include "postmaster/bgwriter.h"
#include "postmaster/postmaster.h"
#include "replication/walreceiver.h"
#include "replication/walsender.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
#include "storage/pg_shmem.h"
......@@ -116,6 +118,8 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
size = add_size(size, ProcSignalShmemSize());
size = add_size(size, BgWriterShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, WalSndShmemSize());
size = add_size(size, WalRcvShmemSize());
size = add_size(size, BTreeShmemSize());
size = add_size(size, SyncScanShmemSize());
#ifdef EXEC_BACKEND
......@@ -213,6 +217,8 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
ProcSignalShmemInit();
BgWriterShmemInit();
AutoVacuumShmemInit();
WalSndShmemInit();
WalRcvShmemInit();
/*
* Set up other modules that need some shared memory space
......
......@@ -8,7 +8,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/ipc/pmsignal.c,v 1.29 2010/01/02 16:57:51 momjian Exp $
* $PostgreSQL: pgsql/src/backend/storage/ipc/pmsignal.c,v 1.30 2010/01/15 09:19:03 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -45,11 +45,15 @@
* process is actively using shared memory. The slots are assigned to
* child processes at random, and postmaster.c is responsible for tracking
* which one goes with which PID.
*
* The fourth state, WALSENDER, is just like ACTIVE, but carries the extra
* information that the child is a WAL sender.
*/
#define PM_CHILD_UNUSED 0 /* these values must fit in sig_atomic_t */
#define PM_CHILD_ASSIGNED 1
#define PM_CHILD_ACTIVE 2
#define PM_CHILD_WALSENDER 3
/* "typedef struct PMSignalData PMSignalData" appears in pmsignal.h */
struct PMSignalData
......@@ -192,6 +196,22 @@ ReleasePostmasterChildSlot(int slot)
return result;
}
/*
* IsPostmasterChildWalSender - check if given slot is in use by a
* walsender process.
*/
bool
IsPostmasterChildWalSender(int slot)
{
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
if (PMSignalState->PMChildFlags[slot] == PM_CHILD_WALSENDER)
return true;
else
return false;
}
/*
* MarkPostmasterChildActive - mark a postmaster child as about to begin
* actively using shared memory. This is called in the child process.
......@@ -207,6 +227,22 @@ MarkPostmasterChildActive(void)
PMSignalState->PMChildFlags[slot] = PM_CHILD_ACTIVE;
}
/*
* MarkPostmasterChildWalSender - like MarkPostmasterChildActive(), but
* marks the postmaster child as a WAL sender instead of a regular backend.
* This is called in the child process.
*/
void
MarkPostmasterChildWalSender(void)
{
int slot = MyPMChildSlot;
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED);
PMSignalState->PMChildFlags[slot] = PM_CHILD_WALSENDER;
}
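The slot bookkeeping added here amounts to a small state machine. The sketch below models it with a plain array; `child_flags`, `NUM_SLOTS` and the function names are hypothetical, and the real flags live in shared memory as sig_atomic_t values. Slots are numbered from 1 in the callers, as in the functions above.

```c
#include <assert.h>
#include <stdbool.h>

/* Same four states as PM_CHILD_UNUSED .. PM_CHILD_WALSENDER. */
enum { CHILD_UNUSED, CHILD_ASSIGNED, CHILD_ACTIVE, CHILD_WALSENDER };

#define NUM_SLOTS 4
static int child_flags[NUM_SLOTS];      /* zero-initialized: all unused */

/* Convert a 1-based slot number to its flag, checking the range. */
static int *flag_for(int slot)
{
    assert(slot > 0 && slot <= NUM_SLOTS);
    return &child_flags[slot - 1];
}

static void assign_slot(int slot)
{
    int *f = flag_for(slot);

    assert(*f == CHILD_UNUSED);
    *f = CHILD_ASSIGNED;
}

/* A child may become a walsender only from the ASSIGNED state. */
static void mark_walsender(int slot)
{
    int *f = flag_for(slot);

    assert(*f == CHILD_ASSIGNED);
    *f = CHILD_WALSENDER;
}

/* Both ACTIVE and WALSENDER children drop back to ASSIGNED when done. */
static void mark_inactive(int slot)
{
    int *f = flag_for(slot);

    assert(*f == CHILD_ACTIVE || *f == CHILD_WALSENDER);
    *f = CHILD_ASSIGNED;
}

static bool is_walsender(int slot)
{
    return *flag_for(slot) == CHILD_WALSENDER;
}
```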
/*
* MarkPostmasterChildInactive - mark a postmaster child as done using
* shared memory. This is called in the child process.
......@@ -218,7 +254,8 @@ MarkPostmasterChildInactive(void)
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ACTIVE);
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ACTIVE ||
PMSignalState->PMChildFlags[slot] == PM_CHILD_WALSENDER);
PMSignalState->PMChildFlags[slot] = PM_CHILD_ASSIGNED;
}
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.211 2010/01/02 16:57:52 momjian Exp $
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.212 2010/01/15 09:19:03 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -39,6 +39,7 @@
#include "access/xact.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
#include "replication/walsender.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
#include "storage/pmsignal.h"
......@@ -290,7 +291,12 @@ InitProcess(void)
* this; it probably should.)
*/
if (IsUnderPostmaster && !IsAutoVacuumLauncherProcess())
{
if (am_walsender)
MarkPostmasterChildWalSender();
else
MarkPostmasterChildActive();
}
/*
* Initialize all fields of MyProc, except for the semaphore which was
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/tcop/postgres.c,v 1.581 2010/01/07 16:29:58 tgl Exp $
* $PostgreSQL: pgsql/src/backend/tcop/postgres.c,v 1.582 2010/01/15 09:19:04 heikki Exp $
*
* NOTES
* this is the "main" module of the postgres backend and
......@@ -56,6 +56,7 @@
#include "parser/parser.h"
#include "postmaster/autovacuum.h"
#include "postmaster/postmaster.h"
#include "replication/walsender.h"
#include "rewrite/rewriteHandler.h"
#include "storage/bufmgr.h"
#include "storage/ipc.h"
......@@ -3331,6 +3332,10 @@ PostgresMain(int argc, char *argv[], const char *username)
* an issue for signals that are locally generated, such as SIGALRM and
* SIGPIPE.)
*/
if (am_walsender)
WalSndSignals();
else
{
pqsignal(SIGHUP, SigHupHandler); /* set flag to read config file */
pqsignal(SIGINT, StatementCancelHandler); /* cancel current query */
pqsignal(SIGTERM, die); /* cancel current query and exit */
......@@ -3361,6 +3366,7 @@ PostgresMain(int argc, char *argv[], const char *username)
* Reset some signals that are accepted by postmaster but not by backend
*/
pqsignal(SIGCHLD, SIG_DFL); /* system() requires this on some platforms */
}
pqinitmask();
......@@ -3456,6 +3462,10 @@ PostgresMain(int argc, char *argv[], const char *username)
if (IsUnderPostmaster && Log_disconnections)
on_proc_exit(log_disconnections, 0);
/* If this is a WAL sender process, we're done with initialization. */
if (am_walsender)
proc_exit(WalSenderMain());
/*
* process any libraries that should be preloaded at backend start (this
* likewise can't be done until GUC settings are complete)
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/init/postinit.c,v 1.200 2010/01/02 16:57:56 momjian Exp $
* $PostgreSQL: pgsql/src/backend/utils/init/postinit.c,v 1.201 2010/01/15 09:19:04 heikki Exp $
*
*
*-------------------------------------------------------------------------
......@@ -36,6 +36,7 @@
#include "pgstat.h"
#include "postmaster/autovacuum.h"
#include "postmaster/postmaster.h"
#include "replication/walsender.h"
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "storage/ipc.h"
......@@ -446,6 +447,7 @@ BaseInit(void)
* In bootstrap mode no parameters are used. The autovacuum launcher process
* doesn't use any parameters either, because it only goes far enough to be
* able to read pg_database; it doesn't connect to any particular database.
* In walsender mode only username is used.
*
* As of PostgreSQL 8.2, we expect InitProcess() was already called, so we
* already have a PGPROC struct ... but it's not completely filled in yet.
......@@ -557,10 +559,10 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* Set up the global variables holding database id and default tablespace.
* But note we won't actually try to touch the database just yet.
*
* We take a shortcut in the bootstrap case, otherwise we have to look up
* the db's entry in pg_database.
* We take a shortcut in the bootstrap and walsender case, otherwise we
* have to look up the db's entry in pg_database.
*/
if (bootstrap)
if (bootstrap || am_walsender)
{
MyDatabaseId = TemplateDbOid;
MyDatabaseTableSpace = DEFAULTTABLESPACE_OID;
......@@ -623,7 +625,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* AccessShareLock for such sessions and thereby not conflict against
* CREATE DATABASE.
*/
if (!bootstrap)
if (!bootstrap && !am_walsender)
LockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
RowExclusiveLock);
......@@ -632,7 +634,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* If there was a concurrent DROP DATABASE, this ensures we will die
* cleanly without creating a mess.
*/
if (!bootstrap)
if (!bootstrap && !am_walsender)
{
HeapTuple tuple;
......@@ -652,7 +654,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
*/
fullpath = GetDatabasePath(MyDatabaseId, MyDatabaseTableSpace);
if (!bootstrap)
if (!bootstrap && !am_walsender)
{
if (access(fullpath, F_OK) == -1)
{
......@@ -727,7 +729,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* database-access infrastructure is up. (Also, it wants to know if the
* user is a superuser, so the above stuff has to happen first.)
*/
if (!bootstrap)
if (!bootstrap && !am_walsender)
CheckMyDatabase(dbname, am_superuser);
/*
......@@ -824,6 +826,10 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/* initialize client encoding */
InitializeClientEncoding();
/* reset the database for walsender */
if (am_walsender)
MyProc->databaseId = MyDatabaseId = InvalidOid;
/* report this backend in the PgBackendStatus array */
if (!bootstrap)
pgstat_bestart();
......
......@@ -10,7 +10,7 @@
* Written by Peter Eisentraut <peter_e@gmx.net>.
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.532 2010/01/07 04:53:35 tgl Exp $
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.533 2010/01/15 09:19:04 heikki Exp $
*
*--------------------------------------------------------------------
*/
......@@ -55,6 +55,7 @@
#include "postmaster/postmaster.h"
#include "postmaster/syslogger.h"
#include "postmaster/walwriter.h"
#include "replication/walsender.h"
#include "storage/bufmgr.h"
#include "storage/fd.h"
#include "tcop/tcopprot.h"
......@@ -494,6 +495,8 @@ const char *const config_group_names[] =
gettext_noop("Write-Ahead Log / Settings"),
/* WAL_CHECKPOINTS */
gettext_noop("Write-Ahead Log / Checkpoints"),
/* WAL_REPLICATION */
gettext_noop("Write-Ahead Log / Replication"),
/* QUERY_TUNING */
gettext_noop("Query Tuning"),
/* QUERY_TUNING_METHOD */
......@@ -1697,6 +1700,26 @@ static struct config_int ConfigureNamesInt[] =
200, 1, 10000, NULL, NULL
},
{
/* see max_connections */
{"max_wal_senders", PGC_POSTMASTER, WAL_REPLICATION,
gettext_noop("Sets the maximum number of simultaneously running WAL sender processes."),
NULL
},
&MaxWalSenders,
0, 0, INT_MAX / 4, NULL, NULL
},
{
{"wal_sender_delay", PGC_SIGHUP, WAL_REPLICATION,
gettext_noop("WAL sender sleep time between WAL replications."),
NULL,
GUC_UNIT_MS
},
&WalSndDelay,
200, 1, 10000, NULL, NULL
},
{
{"commit_delay", PGC_USERSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
......
......@@ -185,6 +185,12 @@
#max_standby_delay = 30 # max acceptable standby lag (s) to help queries
# complete without conflict; -1 disables
# - Replication -
#max_wal_senders = 0 # max number of walsender processes
#wal_sender_delay = 200ms # 1-10000 milliseconds
#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------
......
......@@ -4,7 +4,7 @@
#
# 'make install' installs whole contents of src/include.
#
# $PostgreSQL: pgsql/src/include/Makefile,v 1.30 2010/01/05 01:06:56 tgl Exp $
# $PostgreSQL: pgsql/src/include/Makefile,v 1.31 2010/01/15 09:19:05 heikki Exp $
#
#-------------------------------------------------------------------------
......@@ -18,8 +18,8 @@ all: pg_config.h pg_config_os.h
# Subdirectories containing headers for server-side dev
SUBDIRS = access bootstrap catalog commands executor foreign lib libpq mb \
nodes optimizer parser postmaster regex rewrite storage tcop \
snowball snowball/libstemmer tsearch tsearch/dicts utils \
nodes optimizer parser postmaster regex replication rewrite storage \
tcop snowball snowball/libstemmer tsearch tsearch/dicts utils \
port port/win32 port/win32_msvc port/win32_msvc/sys \
port/win32/arpa port/win32/netinet port/win32/sys \
portability
......
......@@ -6,7 +6,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.95 2010/01/02 16:58:00 momjian Exp $
* $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.96 2010/01/15 09:19:06 heikki Exp $
*/
#ifndef XLOG_H
#define XLOG_H
......@@ -188,6 +188,18 @@ extern int MaxStandbyDelay;
#define XLogArchiveCommandSet() (XLogArchiveCommand[0] != '\0')
#define XLogStandbyInfoActive() (XLogRequestRecoveryConnections && XLogArchiveMode)
/*
* This is in walsender.c, but declared here so that we don't need to include
* walsender.h in all files that check XLogIsNeeded()
*/
extern int MaxWalSenders;
/*
* Is WAL-logging necessary? We need to log an XLOG record iff either
* WAL archiving is enabled or XLOG streaming is allowed.
*/
#define XLogIsNeeded() (XLogArchivingActive() || (MaxWalSenders > 0))
#ifdef WAL_DEBUG
extern bool XLOG_DEBUG;
#endif
......@@ -228,12 +240,19 @@ typedef struct CheckpointStatsData
extern CheckpointStatsData CheckpointStats;
/* Read from recovery.conf, in startup process */
extern char *TriggerFile;
extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata);
extern void XLogFlush(XLogRecPtr RecPtr);
extern void XLogBackgroundFlush(void);
extern void XLogAsyncCommitFlush(void);
extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
extern int XLogFileInit(uint32 log, uint32 seg,
bool *use_existent, bool use_lock);
extern int XLogFileOpen(uint32 log, uint32 seg);
extern void XLogSetAsyncCommitLSN(XLogRecPtr record);
......@@ -242,11 +261,14 @@ extern void RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup);
extern void xlog_redo(XLogRecPtr lsn, XLogRecord *record);
extern void xlog_desc(StringInfo buf, uint8 xl_info, char *rec);
extern void issue_xlog_fsync(int fd, uint32 log, uint32 seg);
extern bool RecoveryInProgress(void);
extern bool XLogInsertAllowed(void);
extern TimestampTz GetLatestXLogTime(void);
extern void UpdateControlFile(void);
extern uint64 GetSystemIdentifier(void);
extern Size XLOGShmemSize(void);
extern void XLOGShmemInit(void);
extern void BootStrapXLOG(void);
......@@ -258,8 +280,11 @@ extern bool CreateRestartPoint(int flags);
extern void XLogPutNextOid(Oid nextOid);
extern XLogRecPtr GetRedoRecPtr(void);
extern XLogRecPtr GetInsertRecPtr(void);
extern XLogRecPtr GetWriteRecPtr(void);
extern void GetNextXidAndEpoch(TransactionId *xid, uint32 *epoch);
extern TimeLineID GetRecoveryTargetTLI(void);
extern void HandleStartupProcInterrupts(void);
extern void StartupProcessMain(void);
#endif /* XLOG_H */
......@@ -11,7 +11,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/access/xlog_internal.h,v 1.27 2010/01/02 16:58:01 momjian Exp $
* $PostgreSQL: pgsql/src/include/access/xlog_internal.h,v 1.28 2010/01/15 09:19:06 heikki Exp $
*/
#ifndef XLOG_INTERNAL_H
#define XLOG_INTERNAL_H
......@@ -151,6 +151,19 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader;
} \
} while (0)
/* Align a record pointer to next page */
#define NextLogPage(recptr) \
do { \
if (recptr.xrecoff % XLOG_BLCKSZ != 0) \
recptr.xrecoff += \
(XLOG_BLCKSZ - recptr.xrecoff % XLOG_BLCKSZ); \
if (recptr.xrecoff >= XLogFileSize) \
{ \
(recptr.xlogid)++; \
recptr.xrecoff = 0; \
} \
} while (0)
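A worked example of the NextLogPage arithmetic, written as a function so that it can be tested on its own. The constants are assumptions for a default build (8 kB WAL pages, and a logical log file of 255 segments of 16 MB each); `RecPtr`, `PAGE_SIZE`, `FILE_SIZE` and `next_log_page` are hypothetical names used only in this sketch.

```c
#include <stdint.h>

#define PAGE_SIZE 8192u         /* assumed XLOG_BLCKSZ */
#define FILE_SIZE 0xFF000000u   /* assumed XLogFileSize: 255 * 16 MB */

/* Hypothetical stand-in for XLogRecPtr. */
typedef struct { uint32_t xlogid; uint32_t xrecoff; } RecPtr;

/* Round p up to the next page boundary, wrapping into the next log file. */
static RecPtr next_log_page(RecPtr p)
{
    if (p.xrecoff % PAGE_SIZE != 0)
        p.xrecoff += PAGE_SIZE - p.xrecoff % PAGE_SIZE;
    if (p.xrecoff >= FILE_SIZE)
    {
        p.xlogid++;
        p.xrecoff = 0;
    }
    return p;
}
```

With these values, offset 8193 rounds up to 16384, an offset already on a boundary is left alone, and the last byte of a log file rounds up into offset 0 of the next one.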
/*
* Compute ID and segment from an XLogRecPtr.
*
......@@ -253,6 +266,8 @@ extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
extern Datum pg_current_xlog_insert_location(PG_FUNCTION_ARGS);
extern Datum pg_last_xlog_receive_location(PG_FUNCTION_ARGS);
extern Datum pg_last_xlog_replay_location(PG_FUNCTION_ARGS);
extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/access/xlogdefs.h,v 1.24 2010/01/02 16:58:01 momjian Exp $
* $PostgreSQL: pgsql/src/include/access/xlogdefs.h,v 1.25 2010/01/15 09:19:06 heikki Exp $
*/
#ifndef XLOG_DEFS_H
#define XLOG_DEFS_H
......@@ -56,6 +56,22 @@ typedef struct XLogRecPtr
((a).xlogid == (b).xlogid && (a).xrecoff == (b).xrecoff)
/*
* Macro for advancing a record pointer by the specified number of bytes.
*/
#define XLByteAdvance(recptr, nbytes) \
do { \
if (recptr.xrecoff + nbytes >= XLogFileSize) \
{ \
recptr.xlogid += 1; \
recptr.xrecoff \
= recptr.xrecoff + nbytes - XLogFileSize; \
} \
else \
recptr.xrecoff += nbytes; \
} while (0)
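The same advance can be written as a function, which evaluates its arguments only once. The sketch below also widens the sum to 64 bits as a precaution; whether the 32-bit sum in the macro can actually wrap depends on the argument types and sizes that callers pass, which this diff does not show. `RecPtr`, `FILE_SIZE` and `advance` are hypothetical names, and FILE_SIZE is an assumed value for a default build. The sketch assumes nbytes is smaller than one log file.

```c
#include <stdint.h>

#define FILE_SIZE 0xFF000000u   /* assumed XLogFileSize: 255 * 16 MB */

/* Hypothetical stand-in for XLogRecPtr. */
typedef struct { uint32_t xlogid; uint32_t xrecoff; } RecPtr;

/* Advance p by nbytes, carrying into xlogid at the end of a log file. */
static RecPtr advance(RecPtr p, uint32_t nbytes)
{
    uint64_t off = (uint64_t) p.xrecoff + nbytes;   /* 64-bit sum cannot wrap */

    if (off >= FILE_SIZE)
    {
        p.xlogid += 1;
        off -= FILE_SIZE;
    }
    p.xrecoff = (uint32_t) off;
    return p;
}
```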
/*
* TimeLineID (TLI) - identifies different database histories to prevent
* confusion after restoring a prior state of a database installation.
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/bootstrap/bootstrap.h,v 1.54 2010/01/02 16:58:01 momjian Exp $
* $PostgreSQL: pgsql/src/include/bootstrap/bootstrap.h,v 1.55 2010/01/15 09:19:06 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -23,6 +23,7 @@ typedef enum
StartupProcess,
BgWriterProcess,
WalWriterProcess,
WalReceiverProcess,
NUM_AUXPROCTYPES /* Must be last! */
} AuxProcType;
......
......@@ -37,7 +37,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.572 2010/01/14 16:31:09 teodor Exp $
* $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.573 2010/01/15 09:19:07 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
#define CATALOG_VERSION_NO 201001141
#define CATALOG_VERSION_NO 201001151
#endif
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/catalog/pg_proc.h,v 1.561 2010/01/14 16:31:09 teodor Exp $
* $PostgreSQL: pgsql/src/include/catalog/pg_proc.h,v 1.562 2010/01/15 09:19:07 heikki Exp $
*
* NOTES
* The script catalog/genbki.pl reads this file and generates .bki
......@@ -3290,6 +3290,11 @@ DESCR("xlog filename, given an xlog location");
DATA(insert OID = 3810 ( pg_is_in_recovery PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_in_recovery _null_ _null_ _null_ ));
DESCR("true if server is in recovery");
DATA(insert OID = 3820 ( pg_last_xlog_receive_location PGNSP PGUID 12 1 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_last_xlog_receive_location _null_ _null_ _null_ ));
DESCR("current xlog flush location");
DATA(insert OID = 3821 ( pg_last_xlog_replay_location PGNSP PGUID 12 1 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_last_xlog_replay_location _null_ _null_ _null_ ));
DESCR("last xlog replay location");
DATA(insert OID = 2621 ( pg_reload_conf PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ ));
DESCR("reload configuration files");
DATA(insert OID = 2622 ( pg_rotate_logfile PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ));
......
......@@ -11,7 +11,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/libpq/libpq-be.h,v 1.73 2010/01/10 14:16:08 mha Exp $
* $PostgreSQL: pgsql/src/include/libpq/libpq-be.h,v 1.74 2010/01/15 09:19:08 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -104,6 +104,7 @@ typedef struct
typedef struct Port
{
pgsocket sock; /* File descriptor */
bool noblock; /* is the socket in non-blocking mode? */
ProtocolVersion proto; /* FE/BE protocol version */
SockAddr laddr; /* local addr (postmaster) */
SockAddr raddr; /* remote addr (client) */
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/libpq/libpq.h,v 1.73 2010/01/10 14:16:08 mha Exp $
* $PostgreSQL: pgsql/src/include/libpq/libpq.h,v 1.74 2010/01/15 09:19:08 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -57,6 +57,7 @@ extern int pq_getstring(StringInfo s);
extern int pq_getmessage(StringInfo s, int maxlen);
extern int pq_getbyte(void);
extern int pq_peekbyte(void);
extern int pq_getbyte_if_available(unsigned char *c);
extern int pq_putbytes(const char *s, size_t len);
extern int pq_flush(void);
extern int pq_putmessage(char msgtype, const char *s, size_t len);
......
/*-------------------------------------------------------------------------
*
* walreceiver.h
* Exports from replication/walreceiverfuncs.c.
*
* Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group
*
* $PostgreSQL: pgsql/src/include/replication/walreceiver.h,v 1.1 2010/01/15 09:19:09 heikki Exp $
*
*-------------------------------------------------------------------------
*/
#ifndef _WALRECEIVER_H
#define _WALRECEIVER_H
#include "storage/spin.h"
/*
* MAXCONNINFO: maximum size of a connection string.
*
* XXX: Should this move to pg_config_manual.h?
*/
#define MAXCONNINFO 1024
/*
* Values for WalRcv->walRcvState.
*/
typedef enum
{
WALRCV_NOT_STARTED,
WALRCV_RUNNING, /* walreceiver has been started */
WALRCV_STOPPING, /* requested to stop, but still running */
WALRCV_STOPPED /* stopped and mustn't start up again */
} WalRcvState;
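The transitions that walreceiverfuncs.c performs or describes can be summarized in a small checker. It covers only what that file shows (NOT_STARTED to RUNNING in RequestXLogStreaming(), RUNNING to STOPPING in ShutdownWalRcv(), and STOPPING to STOPPED as described in its comments); walreceiver itself may make other transitions that are not visible here. `RcvState`, `rcv_in_progress` and `rcv_transition_ok` are hypothetical names.

```c
#include <stdbool.h>

/* Mirrors the four WalRcvState values. */
typedef enum
{
    RCV_NOT_STARTED,
    RCV_RUNNING,
    RCV_STOPPING,
    RCV_STOPPED
} RcvState;

/* Same test as WalRcvInProgress(): running, or asked to stop but not gone. */
static bool rcv_in_progress(RcvState s)
{
    return s == RCV_RUNNING || s == RCV_STOPPING;
}

/* Only the transitions visible in walreceiverfuncs.c are accepted. */
static bool rcv_transition_ok(RcvState from, RcvState to)
{
    switch (from)
    {
        case RCV_NOT_STARTED:
            return to == RCV_RUNNING;
        case RCV_RUNNING:
            return to == RCV_STOPPING;
        case RCV_STOPPING:
            return to == RCV_STOPPED;
        case RCV_STOPPED:
            return false;       /* "mustn't start up again" */
    }
    return false;
}
```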
/* Shared memory area for management of walreceiver process */
typedef struct
{
/*
* Connection string used by walreceiver to connect to the primary.
*/
char conninfo[MAXCONNINFO];
/*
* PID of currently active walreceiver process, and the current state.
*/
pid_t pid;
WalRcvState walRcvState;
/*
* receivedUpto-1 is the last byte position that has been already
* received. When startup process starts the walreceiver, it sets this
* to the point where it wants the streaming to begin. After that,
* walreceiver updates this whenever it flushes the received WAL.
*/
XLogRecPtr receivedUpto;
slock_t mutex; /* locks shared variables shown above */
} WalRcvData;
extern WalRcvData *WalRcv;
extern Size WalRcvShmemSize(void);
extern void WalRcvShmemInit(void);
extern bool WalRcvInProgress(void);
extern XLogRecPtr WaitNextXLogAvailable(XLogRecPtr recptr, bool *finished);
extern void RequestXLogStreaming(XLogRecPtr recptr, const char *conninfo);
extern XLogRecPtr GetWalRcvWriteRecPtr(void);
#endif /* _WALRECEIVER_H */
/*-------------------------------------------------------------------------
*
* walsender.h
* Exports from replication/walsender.c.
*
* Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group
*
* $PostgreSQL: pgsql/src/include/replication/walsender.h,v 1.1 2010/01/15 09:19:09 heikki Exp $
*
*-------------------------------------------------------------------------
*/
#ifndef _WALSENDER_H
#define _WALSENDER_H
#include "access/xlog.h"
#include "storage/spin.h"
/*
* Each walsender has a WalSnd struct in shared memory.
*/
typedef struct WalSnd
{
pid_t pid; /* this walsender's process id, or 0 */
XLogRecPtr sentPtr; /* WAL has been sent up to this point */
slock_t mutex; /* locks shared variables shown above */
} WalSnd;
/* There is one WalSndCtl struct for the whole database cluster */
typedef struct
{
WalSnd walsnds[1]; /* VARIABLE LENGTH ARRAY */
} WalSndCtlData;
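Because walsnds is declared with one element but used as a variable-length array, the shared memory request has to be sized by hand. The body of WalSndShmemSize() is not shown in this diff, so the following is only a sketch of the usual idiom: `Snd`, `SndCtl` and `snd_ctl_size` are hypothetical names, the field types are simplified, and no overflow check is done on the multiplication.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for WalSnd. */
typedef struct
{
    int           pid;
    unsigned long sentPtr;
    int           mutex;
} Snd;

/* Same shape as WalSndCtlData: a header ending in a one-element array. */
typedef struct
{
    Snd walsnds[1];     /* really max_senders entries */
} SndCtl;

/* Bytes needed for a control struct holding max_senders entries. */
static size_t snd_ctl_size(size_t max_senders)
{
    return offsetof(SndCtl, walsnds) + max_senders * sizeof(Snd);
}
```

Using offsetof rather than sizeof(SndCtl) avoids counting the placeholder element twice.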
extern WalSndCtlData *WalSndCtl;
/* global state */
extern bool am_walsender;
/* user-settable parameters */
extern int WalSndDelay;
extern int WalSenderMain(void);
extern void WalSndSignals(void);
extern Size WalSndShmemSize(void);
extern void WalSndShmemInit(void);
extern XLogRecPtr GetOldestWALSendPointer(void);
#endif /* _WALSENDER_H */
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/storage/pmsignal.h,v 1.27 2010/01/02 16:58:08 momjian Exp $
* $PostgreSQL: pgsql/src/include/storage/pmsignal.h,v 1.28 2010/01/15 09:19:09 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -29,6 +29,8 @@ typedef enum
PMSIGNAL_ROTATE_LOGFILE, /* send SIGUSR1 to syslogger to rotate logfile */
PMSIGNAL_START_AUTOVAC_LAUNCHER, /* start an autovacuum launcher */
PMSIGNAL_START_AUTOVAC_WORKER, /* start an autovacuum worker */
PMSIGNAL_START_WALRECEIVER, /* start a walreceiver */
PMSIGNAL_SHUTDOWN_WALRECEIVER, /* shut down a walreceiver */
NUM_PMSIGNALS /* Must be last value of enum! */
} PMSignalReason;
......@@ -45,7 +47,9 @@ extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern int AssignPostmasterChildSlot(void);
extern bool ReleasePostmasterChildSlot(int slot);
extern bool IsPostmasterChildWalSender(int slot);
extern void MarkPostmasterChildActive(void);
extern void MarkPostmasterChildWalSender(void);
extern void MarkPostmasterChildInactive(void);
extern bool PostmasterIsAlive(bool amDirectChild);
......