Commit 93f4d7f8 authored by Tom Lane

Support Linux's oom_score_adj API as well as the older oom_adj API.

The simplest way to handle this is just to copy-and-paste the relevant
code block in fork_process.c, so that's what I did. (It's possible that
something more complicated would be useful to packagers who want to work
with either the old or the new API; but at this point the number of such
people is rapidly approaching zero, so let's just get the minimal thing
done.)  Update relevant documentation as well.
parent b9212e37
@@ -42,10 +42,14 @@ PGLOG="$PGDATA/serverlog"
 # It's often a good idea to protect the postmaster from being killed by the
 # OOM killer (which will tend to preferentially kill the postmaster because
-# of the way it accounts for shared memory). Setting the OOM_ADJ value to
-# -17 will disable OOM kill altogether. If you enable this, you probably want
-# to compile PostgreSQL with "-DLINUX_OOM_ADJ=0", so that individual backends
-# can still be killed by the OOM killer.
+# of the way it accounts for shared memory). Setting the OOM_SCORE_ADJ value
+# to -1000 will disable OOM kill altogether. If you enable this, you probably
+# want to compile PostgreSQL with "-DLINUX_OOM_SCORE_ADJ=0", so that
+# individual backends can still be killed by the OOM killer.
+#OOM_SCORE_ADJ=-1000
+# Older Linux kernels may not have /proc/self/oom_score_adj, but instead
+# /proc/self/oom_adj, which works similarly except the disable value is -17.
+# For such a system, enable this and compile with "-DLINUX_OOM_ADJ=0".
 #OOM_ADJ=-17
 ## STOP EDITING HERE
@@ -78,6 +82,7 @@ test -x $DAEMON ||
 case $1 in
   start)
     echo -n "Starting PostgreSQL: "
+    test x"$OOM_SCORE_ADJ" != x && echo "$OOM_SCORE_ADJ" > /proc/self/oom_score_adj
     test x"$OOM_ADJ" != x && echo "$OOM_ADJ" > /proc/self/oom_adj
     su - $PGUSER -c "$DAEMON -D '$PGDATA' &" >>$PGLOG 2>&1
     echo "ok"
@@ -90,6 +95,7 @@ case $1 in
   restart)
     echo -n "Restarting PostgreSQL: "
     su - $PGUSER -c "$PGCTL stop -D '$PGDATA' -s -m fast -w"
+    test x"$OOM_SCORE_ADJ" != x && echo "$OOM_SCORE_ADJ" > /proc/self/oom_score_adj
     test x"$OOM_ADJ" != x && echo "$OOM_ADJ" > /proc/self/oom_adj
     su - $PGUSER -c "$DAEMON -D '$PGDATA' &" >>$PGLOG 2>&1
     echo "ok"
......
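The script comments above give only the two "disable" endpoints: -17 for oom_adj and -1000 for oom_score_adj. For a packager who needs to translate some other value between the two scales, here is a minimal sketch of the linear mapping that the kernel is believed to apply when a value is written through the legacy oom_adj file. The function name and constant names are illustrative assumptions and are not part of this commit.

```c
/* Illustrative constants, assumed to match the kernel's documented limits. */
#define OOM_ADJ_DISABLE     (-17)   /* oom_adj value that disables OOM kill */
#define OOM_ADJ_MAX         15      /* largest legal oom_adj value */
#define OOM_SCORE_ADJ_LIMIT 1000    /* oom_score_adj ranges over -1000..1000 */

/*
 * Hypothetical helper: translate a legacy oom_adj value (-17..15) into the
 * oom_score_adj scale (-1000..1000).  The maximum is special-cased so that
 * 15 maps to exactly 1000; every other value is scaled linearly, with
 * integer division truncating toward zero.
 */
static int
oom_adj_to_score_adj(int oom_adj)
{
    if (oom_adj == OOM_ADJ_MAX)
        return OOM_SCORE_ADJ_LIMIT;
    return (oom_adj * OOM_SCORE_ADJ_LIMIT) / -OOM_ADJ_DISABLE;
}
```

Under these assumptions, the old disable value -17 maps to -1000 and 0 maps to 0, which matches the pairing of defaults used in the script.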
@@ -1268,7 +1268,7 @@ default:\
 In Linux 2.4 and later, the default virtual memory behavior is not
 optimal for <productname>PostgreSQL</productname>. Because of the
 way that the kernel implements memory overcommit, the kernel might
-terminate the <productname>PostgreSQL</productname> server (the
+terminate the <productname>PostgreSQL</productname> postmaster (the
 master server process) if the memory demands of
 another process cause the system to run out of virtual memory.
 </para>
@@ -1317,22 +1317,31 @@ sysctl -w vm.overcommit_memory=2
 <para>
 Another approach, which can be used with or without altering
 <varname>vm.overcommit_memory</>, is to set the process-specific
-<varname>oom_adj</> value for the postmaster process to <literal>-17</>,
-thereby guaranteeing it will not be targeted by the OOM killer. The
-simplest way to do this is to execute
+<varname>oom_score_adj</> value for the postmaster process to
+<literal>-1000</>, thereby guaranteeing it will not be targeted by the OOM
+killer. The simplest way to do this is to execute
 <programlisting>
-echo -17 > /proc/self/oom_adj
+echo -1000 > /proc/self/oom_score_adj
 </programlisting>
 in the postmaster's startup script just before invoking the postmaster.
 Note that this action must be done as root, or it will have no effect;
 so a root-owned startup script is the easiest place to do it. If you
 do this, you may also wish to build <productname>PostgreSQL</>
-with <literal>-DLINUX_OOM_ADJ=0</> added to <varname>CPPFLAGS</>.
+with <literal>-DLINUX_OOM_SCORE_ADJ=0</> added to <varname>CPPFLAGS</>.
 That will cause postmaster child processes to run with the normal
-<varname>oom_adj</> value of zero, so that the OOM killer can still
+<varname>oom_score_adj</> value of zero, so that the OOM killer can still
 target them at need.
 </para>
+<para>
+Older Linux kernels do not offer <filename>/proc/self/oom_score_adj</>,
+but may have a previous version of the same functionality called
+<filename>/proc/self/oom_adj</>. This works the same except the disable
+value is <literal>-17</> not <literal>-1000</>. The corresponding
+build flag for <productname>PostgreSQL</> is
+<literal>-DLINUX_OOM_ADJ=0</>.
+</para>
 <note>
 <para>
 Some vendors' Linux 2.4 kernels are reported to have early versions
......
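Because the documentation warns that the write has no effect unless done as root, it can be useful to read the value back and confirm what a process is actually running with. The following is a minimal sketch of such a check, not something this commit adds. Both function names are hypothetical, and the paths are passed in as parameters so the helpers can be pointed at /proc/self/oom_score_adj and /proc/self/oom_adj, or at ordinary files for testing.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer from a /proc-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int
read_oom_value(const char *path, int *value)
{
    FILE *fp = fopen(path, "r");
    int   ok;

    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%d", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: try the newer file first, then the older one.
 * Sets *is_legacy to 1 when the value came from the older file (and so is
 * on the -17..15 scale), or to 0 when it came from the newer file.
 * Returns 0 on success, -1 if neither file could be read.
 */
static int
read_current_oom_setting(const char *new_path, const char *old_path,
                         int *value, int *is_legacy)
{
    if (read_oom_value(new_path, value) == 0)
    {
        *is_legacy = 0;
        return 0;
    }
    if (read_oom_value(old_path, value) == 0)
    {
        *is_legacy = 1;
        return 0;
    }
    return -1;
}
```

A caller would typically pass "/proc/self/oom_score_adj" and "/proc/self/oom_adj" as the two paths and log the result at startup.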
@@ -68,12 +68,40 @@ fork_process(void)
  * process sizes *including shared memory*. (This is unbelievably
  * stupid, but the kernel hackers seem uninterested in improving it.)
  * Therefore it's often a good idea to protect the postmaster by
- * setting its oom_adj value negative (which has to be done in a
+ * setting its oom_score_adj value negative (which has to be done in a
  * root-owned startup script). If you just do that much, all child
  * processes will also be protected against OOM kill, which might not
- * be desirable. You can then choose to build with LINUX_OOM_ADJ
- * #defined to 0, or some other value that you want child processes to
- * adopt here.
+ * be desirable. You can then choose to build with
+ * LINUX_OOM_SCORE_ADJ #defined to 0, or to some other value that you
+ * want child processes to adopt here.
+ */
+#ifdef LINUX_OOM_SCORE_ADJ
+{
+    /*
+     * Use open() not stdio, to ensure we control the open flags. Some
+     * Linux security environments reject anything but O_WRONLY.
+     */
+    int fd = open("/proc/self/oom_score_adj", O_WRONLY, 0);
+    /* We ignore all errors */
+    if (fd >= 0)
+    {
+        char buf[16];
+        int rc;
+        snprintf(buf, sizeof(buf), "%d\n", LINUX_OOM_SCORE_ADJ);
+        rc = write(fd, buf, strlen(buf));
+        (void) rc;
+        close(fd);
+    }
+}
+#endif /* LINUX_OOM_SCORE_ADJ */
+/*
+ * Older Linux kernels have oom_adj not oom_score_adj. This works
+ * similarly except with a different scale of adjustment values.
+ * If it's necessary to build Postgres to work with either API,
+ * you can define both LINUX_OOM_SCORE_ADJ and LINUX_OOM_ADJ.
  */
 #ifdef LINUX_OOM_ADJ
 {
......
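The commit message notes that something more elaborate than copy-and-paste might suit packagers who must support both APIs. As a rough sketch of what that could look like, and not what this commit does, the open/snprintf/write/close sequence can be factored into one helper that is tried against each file in turn. The function names and the return-code convention are assumptions made for illustration; the committed code deliberately ignores all errors.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write an integer plus newline to an existing file,
 * using open() with O_WRONLY only, as the commit's inline blocks do.
 * Unlike the commit, which ignores all errors, this returns 0 on success
 * and -1 on failure so that a caller can fall back.
 */
static int
write_oom_value(const char *path, int value)
{
    char    buf[16];
    size_t  len;
    ssize_t rc;
    int     fd;

    fd = open(path, O_WRONLY, 0);
    if (fd < 0)
        return -1;
    snprintf(buf, sizeof(buf), "%d\n", value);
    len = strlen(buf);
    rc = write(fd, buf, len);
    close(fd);
    return (rc == (ssize_t) len) ? 0 : -1;
}

/*
 * Hypothetical helper: try the newer oom_score_adj file first; if it cannot
 * be written, fall back to the older oom_adj file with its own value.
 */
static int
set_oom_adjustment(const char *score_adj_path, int score_adj,
                   const char *adj_path, int adj)
{
    if (write_oom_value(score_adj_path, score_adj) == 0)
        return 0;
    return write_oom_value(adj_path, adj);
}
```

A child process would call set_oom_adjustment("/proc/self/oom_score_adj", 0, "/proc/self/oom_adj", 0) right after fork. One difference from the committed code is worth noting: with both LINUX_OOM_SCORE_ADJ and LINUX_OOM_ADJ defined, the commit attempts both writes unconditionally, whereas this sketch stops after the first one that succeeds.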