Commit 7f508f1c authored by Heikki Linnakangas's avatar Heikki Linnakangas

Add 'directory' format to pg_dump. The new directory format is compatible

with the 'tar' format, in that untarring a tar format archive produces a
valid directory format archive.

Joachim Wieland and Heikki Linnakangas
parent f3692079
......@@ -76,11 +76,7 @@ PostgreSQL documentation
database are to be restored. The most flexible output file format is
the <quote>custom</quote> format (<option>-Fc</option>). It allows
for selection and reordering of all archived items, and is compressed
by default. The <application>tar</application> format
(<option>-Ft</option>) is not compressed and has restrictions on
reordering data when loading, but it is otherwise quite flexible;
moreover, it can be manipulated with standard Unix tools such as
<command>tar</command>.
by default.
</para>
<para>
......@@ -194,8 +190,12 @@ PostgreSQL documentation
<term><option>--file=<replaceable class="parameter">file</replaceable></option></term>
<listitem>
<para>
Send output to the specified file. If this is omitted, the
standard output is used.
Send output to the specified file. This parameter can be omitted for
file based output formats, in which case the standard output is used.
It must be given for the directory output format however, where it
specifies the target directory instead of a file. In this case the
directory is created by <command>pg_dump</command> and must not exist
before.
</para>
</listitem>
</varlistentry>
......@@ -226,9 +226,28 @@ PostgreSQL documentation
<para>
Output a custom-format archive suitable for input into
<application>pg_restore</application>.
This is the most flexible output format in that it allows manual
selection and reordering of archived items during restore.
This format is also compressed by default.
Together with the directory output format, this is the most flexible
output format in that it allows manual selection and reordering of
archived items during restore. This format is also compressed by
default.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>d</></term>
<term><literal>directory</></term>
<listitem>
<para>
Output a directory-format archive suitable for input into
<application>pg_restore</application>. This will create a directory
with one file for each table and blob being dumped, plus a
so-called Table of Contents file describing the dumped objects in a
machine-readable format that <application>pg_restore</application>
can read. A directory format archive can be manipulated with
standard Unix tools; for example, files in an uncompressed archive
can be compressed with the <application>gzip</application> tool.
This format is compressed by default.
</para>
</listitem>
</varlistentry>
......@@ -239,13 +258,12 @@ PostgreSQL documentation
<listitem>
<para>
Output a <command>tar</command>-format archive suitable for input
into <application>pg_restore</application>.
This output format allows manual selection and reordering of
archived items during restore, but there is a restriction: the
relative order of table data items cannot be changed during
restore. Also, <command>tar</command> format does not support
compression and has a limit of 8 GB on the size of individual
tables.
into <application>pg_restore</application>. The tar-format is
compatible with the directory-format; extracting a tar-format
archive produces a valid directory-format archive.
However, the tar-format does not support compression and has a
limit of 8 GB on the size of individual tables. Also, the relative
order of table data items cannot be changed during restore.
</para>
</listitem>
</varlistentry>
......@@ -946,6 +964,14 @@ CREATE DATABASE foo WITH TEMPLATE template0;
</screen>
</para>
<para>
To dump a database into a directory-format archive:
<screen>
<prompt>$</prompt> <userinput>pg_dump -Fd mydb -f dumpdir</userinput>
</screen>
</para>
<para>
To reload an archive file into a (freshly created) database named
<literal>newdb</>:
......
......@@ -79,7 +79,8 @@
<term><replaceable class="parameter">filename</replaceable></term>
<listitem>
<para>
Specifies the location of the archive file to be restored.
Specifies the location of the archive file (or directory, for a
directory-format archive) to be restored.
If not specified, the standard input is used.
</para>
</listitem>
......@@ -166,6 +167,16 @@
one of the following:
<variablelist>
<varlistentry>
<term><literal>d</></term>
<term><literal>directory</></term>
<listitem>
<para>
The archive is a <command>directory</command> archive.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>t</></term>
<term><literal>tar</></term>
......
......@@ -20,7 +20,7 @@ override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
OBJS= pg_backup_archiver.o pg_backup_db.o pg_backup_custom.o \
pg_backup_files.o pg_backup_null.o pg_backup_tar.o \
dumputils.o compress_io.o $(WIN32RES)
pg_backup_directory.o dumputils.o compress_io.o $(WIN32RES)
KEYWRDOBJS = keywords.o kwlookup.o
......
......@@ -7,6 +7,17 @@
* Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* This file includes two APIs for dealing with compressed data. The first
* provides more flexibility, using callbacks to read/write data from the
* underlying stream. The second API is a wrapper around fopen/gzopen and
* friends, providing an interface similar to those, but abstracts away
* the possible compression. Both APIs use libz for the compression, but
* the second API uses gzip headers, so the resulting files can be easily
* manipulated with the gzip utility.
*
* Compressor API
* --------------
*
* The interface for writing to an archive consists of three functions:
* AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
* AllocateCompressor, then write all the data by calling WriteDataToArchive
......@@ -23,6 +34,17 @@
*
* The interface is the same for compressed and uncompressed streams.
*
* Compressed stream API
* ----------------------
*
* The compressed stream API is a wrapper around the C standard fopen() and
* libz's gzopen() APIs. It allows you to use the same functions for
* compressed and uncompressed streams. cfopen_read() first tries to open
* the file with given name, and if it fails, it tries to open the same
* file with the .gz suffix. cfopen_write() opens a file for writing, an
* extra argument specifies if the file should be compressed, and adds the
* .gz suffix to the filename if so. This allows you to easily handle both
* compressed and uncompressed files.
*
* IDENTIFICATION
* src/bin/pg_dump/compress_io.c
......@@ -32,6 +54,10 @@
#include "compress_io.h"
/*----------------------
* Compressor API
*----------------------
*/
/* typedef appears in compress_io.h */
struct CompressorState
......@@ -418,3 +444,234 @@ WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
}
/*----------------------
* Compressed stream API
*----------------------
*/
/*
* cfp represents an open stream, wrapping the underlying FILE or gzFile
* pointer. This is opaque to the callers.
*/
struct cfp
{
FILE *uncompressedfp;
#ifdef HAVE_LIBZ
gzFile compressedfp;
#endif
};
#ifdef HAVE_LIBZ
static int hasSuffix(const char *filename, const char *suffix);
#endif
/*
* Open a file for reading. 'path' is the file to open, and 'mode' should
* be either "r" or "rb".
*
* If the file at 'path' does not exist, we append the ".gz" suffix (if 'path'
* doesn't already have it) and try again. So if you pass "foo" as 'path',
* this will open either "foo" or "foo.gz".
*/
cfp *
cfopen_read(const char *path, const char *mode)
{
cfp *fp;
#ifdef HAVE_LIBZ
if (hasSuffix(path, ".gz"))
fp = cfopen(path, mode, 1);
else
#endif
{
fp = cfopen(path, mode, 0);
#ifdef HAVE_LIBZ
if (fp == NULL)
{
int fnamelen = strlen(path) + 4;
char *fname = malloc(fnamelen);
if (fname == NULL)
die_horribly(NULL, modulename, "Out of memory\n");
snprintf(fname, fnamelen, "%s%s", path, ".gz");
fp = cfopen(fname, mode, 1);
free(fname);
}
#endif
}
return fp;
}
/*
* Open a file for writing. 'path' indicates the path name, and 'mode' must
* be a filemode as accepted by fopen() and gzopen() that indicates writing
* ("w", "wb", "a", or "ab").
*
* If 'compression' is non-zero, a gzip compressed stream is opened, and
* and 'compression' indicates the compression level used. The ".gz" suffix
* is automatically added to 'path' in that case.
*/
cfp *
cfopen_write(const char *path, const char *mode, int compression)
{
cfp *fp;
if (compression == 0)
fp = cfopen(path, mode, 0);
else
{
#ifdef HAVE_LIBZ
int fnamelen = strlen(path) + 4;
char *fname = malloc(fnamelen);
if (fname == NULL)
die_horribly(NULL, modulename, "Out of memory\n");
snprintf(fname, fnamelen, "%s%s", path, ".gz");
fp = cfopen(fname, mode, 1);
free(fname);
#else
die_horribly(NULL, modulename, "not built with zlib support\n");
#endif
}
return fp;
}
/*
* Opens file 'path' in 'mode'. If 'compression' is non-zero, the file
* is opened with libz gzopen(), otherwise with plain fopen()
*/
cfp *
cfopen(const char *path, const char *mode, int compression)
{
cfp *fp = malloc(sizeof(cfp));
if (fp == NULL)
die_horribly(NULL, modulename, "Out of memory\n");
if (compression != 0)
{
#ifdef HAVE_LIBZ
fp->compressedfp = gzopen(path, mode);
fp->uncompressedfp = NULL;
if (fp->compressedfp == NULL)
{
free(fp);
fp = NULL;
}
#else
die_horribly(NULL, modulename, "not built with zlib support\n");
#endif
}
else
{
#ifdef HAVE_LIBZ
fp->compressedfp = NULL;
#endif
fp->uncompressedfp = fopen(path, mode);
if (fp->uncompressedfp == NULL)
{
free(fp);
fp = NULL;
}
}
return fp;
}
int
cfread(void *ptr, int size, cfp *fp)
{
#ifdef HAVE_LIBZ
if (fp->compressedfp)
return gzread(fp->compressedfp, ptr, size);
else
#endif
return fread(ptr, 1, size, fp->uncompressedfp);
}
int
cfwrite(const void *ptr, int size, cfp *fp)
{
#ifdef HAVE_LIBZ
if (fp->compressedfp)
return gzwrite(fp->compressedfp, ptr, size);
else
#endif
return fwrite(ptr, 1, size, fp->uncompressedfp);
}
int
cfgetc(cfp *fp)
{
#ifdef HAVE_LIBZ
if (fp->compressedfp)
return gzgetc(fp->compressedfp);
else
#endif
return fgetc(fp->uncompressedfp);
}
char *
cfgets(cfp *fp, char *buf, int len)
{
#ifdef HAVE_LIBZ
if (fp->compressedfp)
return gzgets(fp->compressedfp, buf, len);
else
#endif
return fgets(buf, len, fp->uncompressedfp);
}
int
cfclose(cfp *fp)
{
int result;
if (fp == NULL)
{
errno = EBADF;
return EOF;
}
#ifdef HAVE_LIBZ
if (fp->compressedfp)
{
result = gzclose(fp->compressedfp);
fp->compressedfp = NULL;
}
else
#endif
{
result = fclose(fp->uncompressedfp);
fp->uncompressedfp = NULL;
}
free(fp);
return result;
}
int
cfeof(cfp *fp)
{
#ifdef HAVE_LIBZ
if (fp->compressedfp)
return gzeof(fp->compressedfp);
else
#endif
return feof(fp->uncompressedfp);
}
#ifdef HAVE_LIBZ
static int
hasSuffix(const char *filename, const char *suffix)
{
int filenamelen = strlen(filename);
int suffixlen = strlen(suffix);
if (filenamelen < suffixlen)
return 0;
return memcmp(&filename[filenamelen - suffixlen],
suffix,
suffixlen) == 0;
}
#endif
......@@ -54,4 +54,17 @@ extern size_t WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
const void *data, size_t dLen);
extern void EndCompressor(ArchiveHandle *AH, CompressorState *cs);
typedef struct cfp cfp;
extern cfp *cfopen(const char *path, const char *mode, int compression);
extern cfp *cfopen_read(const char *path, const char *mode);
extern cfp *cfopen_write(const char *path, const char *mode, int compression);
extern int cfread(void *ptr, int size, cfp *fp);
extern int cfwrite(const void *ptr, int size, cfp *fp);
extern int cfgetc(cfp *fp);
extern char *cfgets(cfp *fp, char *buf, int len);
extern int cfclose(cfp *fp);
extern int cfeof(cfp *fp);
#endif
......@@ -50,7 +50,8 @@ typedef enum _archiveFormat
archCustom = 1,
archFiles = 2,
archTar = 3,
archNull = 4
archNull = 4,
archDirectory = 5
} ArchiveFormat;
typedef enum _archiveMode
......
......@@ -25,6 +25,7 @@
#include <ctype.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
......@@ -1751,11 +1752,46 @@ _discoverArchiveFormat(ArchiveHandle *AH)
if (AH->fSpec)
{
struct stat st;
wantClose = 1;
fh = fopen(AH->fSpec, PG_BINARY_R);
if (!fh)
die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
AH->fSpec, strerror(errno));
/*
* Check if the specified archive is a directory. If so, check if
* there's a "toc.dat" (or "toc.dat.gz") file in it.
*/
if (stat(AH->fSpec, &st) == 0 && S_ISDIR(st.st_mode))
{
char buf[MAXPGPATH];
if (snprintf(buf, MAXPGPATH, "%s/toc.dat", AH->fSpec) >= MAXPGPATH)
die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
AH->fSpec);
if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
{
AH->format = archDirectory;
return AH->format;
}
#ifdef HAVE_LIBZ
if (snprintf(buf, MAXPGPATH, "%s/toc.dat.gz", AH->fSpec) >= MAXPGPATH)
die_horribly(AH, modulename, "directory name too long: \"%s\"\n",
AH->fSpec);
if (stat(buf, &st) == 0 && S_ISREG(st.st_mode))
{
AH->format = archDirectory;
return AH->format;
}
#endif
die_horribly(AH, modulename, "directory \"%s\" does not appear to be a valid archive (\"toc.dat\" does not exist)\n",
AH->fSpec);
}
else
{
fh = fopen(AH->fSpec, PG_BINARY_R);
if (!fh)
die_horribly(AH, modulename, "could not open input file \"%s\": %s\n",
AH->fSpec, strerror(errno));
}
}
else
{
......@@ -1973,6 +2009,10 @@ _allocAH(const char *FileSpec, const ArchiveFormat fmt,
InitArchiveFmt_Null(AH);
break;
case archDirectory:
InitArchiveFmt_Directory(AH);
break;
case archTar:
InitArchiveFmt_Tar(AH);
break;
......
......@@ -370,6 +370,7 @@ extern void EndRestoreBlobs(ArchiveHandle *AH);
extern void InitArchiveFmt_Custom(ArchiveHandle *AH);
extern void InitArchiveFmt_Files(ArchiveHandle *AH);
extern void InitArchiveFmt_Null(ArchiveHandle *AH);
extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
extern bool isValidTarHeader(char *header);
......
This diff is collapsed.
......@@ -4,6 +4,10 @@
*
* This file is copied from the 'files' format file, but dumps data into
* one temp file then sends it to the output TAR archive.
*
* NOTE: If you untar the created 'tar' file, the resulting files are
* compatible with the 'directory' format. Please keep the two formats in
* sync.
*
* See the headers to pg_backup_files & pg_restore for more details.
*
......@@ -167,7 +171,7 @@ InitArchiveFmt_Tar(ArchiveHandle *AH)
die_horribly(AH, modulename, "out of memory\n");
/*
* Now open the TOC file
* Now open the tar file, and load the TOC if we're in read mode.
*/
if (AH->mode == archModeWrite)
{
......
......@@ -138,6 +138,7 @@ static int no_unlogged_table_data = 0;
static void help(const char *progname);
static ArchiveFormat parseArchiveFormat(const char *format, ArchiveMode *mode);
static void expand_schema_name_patterns(SimpleStringList *patterns,
SimpleOidList *oids);
static void expand_table_name_patterns(SimpleStringList *patterns,
......@@ -267,6 +268,8 @@ main(int argc, char **argv)
int my_version;
int optindex;
RestoreOptions *ropt;
ArchiveFormat archiveFormat = archUnknown;
ArchiveMode archiveMode;
static int disable_triggers = 0;
static int outputNoTablespaces = 0;
......@@ -539,36 +542,31 @@ main(int argc, char **argv)
exit(1);
}
/* open the output file */
if (pg_strcasecmp(format, "a") == 0 || pg_strcasecmp(format, "append") == 0)
{
/* This is used by pg_dumpall, and is not documented */
plainText = 1;
g_fout = CreateArchive(filename, archNull, 0, archModeAppend);
}
else if (pg_strcasecmp(format, "c") == 0 || pg_strcasecmp(format, "custom") == 0)
g_fout = CreateArchive(filename, archCustom, compressLevel, archModeWrite);
else if (pg_strcasecmp(format, "f") == 0 || pg_strcasecmp(format, "file") == 0)
{
/*
* Dump files into the current directory; for demonstration only, not
* documented.
*/
g_fout = CreateArchive(filename, archFiles, compressLevel, archModeWrite);
}
else if (pg_strcasecmp(format, "p") == 0 || pg_strcasecmp(format, "plain") == 0)
{
archiveFormat = parseArchiveFormat(format, &archiveMode);
/* archiveFormat specific setup */
if (archiveFormat == archNull)
plainText = 1;
g_fout = CreateArchive(filename, archNull, 0, archModeWrite);
}
else if (pg_strcasecmp(format, "t") == 0 || pg_strcasecmp(format, "tar") == 0)
g_fout = CreateArchive(filename, archTar, compressLevel, archModeWrite);
else
/*
* Ignore compression level for plain format. XXX: This is a bit
* inconsistent, tar-format throws an error instead.
*/
if (archiveFormat == archNull)
compressLevel = 0;
/* Custom and directory formats are compressed by default */
if (compressLevel == -1)
{
write_msg(NULL, "invalid output format \"%s\" specified\n", format);
exit(1);
if (archiveFormat == archCustom || archiveFormat == archDirectory)
compressLevel = Z_DEFAULT_COMPRESSION;
else
compressLevel = 0;
}
/* open the output file */
g_fout = CreateArchive(filename, archiveFormat, compressLevel, archiveMode);
if (g_fout == NULL)
{
write_msg(NULL, "could not open output file \"%s\" for writing\n", filename);
......@@ -835,8 +833,8 @@ help(const char *progname)
printf(_(" %s [OPTION]... [DBNAME]\n"), progname);
printf(_("\nGeneral options:\n"));
printf(_(" -f, --file=FILENAME output file name\n"));
printf(_(" -F, --format=c|t|p output file format (custom, tar, plain text)\n"));
printf(_(" -f, --file=OUTPUT output file or directory name\n"));
printf(_(" -F, --format=c|d|t|p output file format (custom, directory, tar, plain text)\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
printf(_(" --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock\n"));
......@@ -894,6 +892,49 @@ exit_nicely(void)
exit(1);
}
static ArchiveFormat
parseArchiveFormat(const char *format, ArchiveMode *mode)
{
ArchiveFormat archiveFormat;
*mode = archModeWrite;
if (pg_strcasecmp(format, "a") == 0 || pg_strcasecmp(format, "append") == 0)
{
/* This is used by pg_dumpall, and is not documented */
archiveFormat = archNull;
*mode = archModeAppend;
}
else if (pg_strcasecmp(format, "c") == 0)
archiveFormat = archCustom;
else if (pg_strcasecmp(format, "custom") == 0)
archiveFormat = archCustom;
else if (pg_strcasecmp(format, "d") == 0)
archiveFormat = archDirectory;
else if (pg_strcasecmp(format, "directory") == 0)
archiveFormat = archDirectory;
else if (pg_strcasecmp(format, "f") == 0 || pg_strcasecmp(format, "file") == 0)
/*
* Dump files into the current directory; for demonstration only, not
* documented.
*/
archiveFormat = archFiles;
else if (pg_strcasecmp(format, "p") == 0)
archiveFormat = archNull;
else if (pg_strcasecmp(format, "plain") == 0)
archiveFormat = archNull;
else if (pg_strcasecmp(format, "t") == 0)
archiveFormat = archTar;
else if (pg_strcasecmp(format, "tar") == 0)
archiveFormat = archTar;
else
{
write_msg(NULL, "invalid output format \"%s\" specified\n", format);
exit(1);
}
return archiveFormat;
}
/*
* Find the OIDs of all schemas matching the given list of patterns,
* and append them to the given OID list.
......
......@@ -352,6 +352,11 @@ main(int argc, char **argv)
opts->format = archCustom;
break;
case 'd':
case 'D':
opts->format = archDirectory;
break;
case 'f':
case 'F':
opts->format = archFiles;
......@@ -363,7 +368,7 @@ main(int argc, char **argv)
break;
default:
write_msg(NULL, "unrecognized archive format \"%s\"; please specify \"c\" or \"t\"\n",
write_msg(NULL, "unrecognized archive format \"%s\"; please specify \"c\", \"d\" or \"t\"\n",
opts->formatName);
exit(1);
}
......@@ -418,7 +423,7 @@ usage(const char *progname)
printf(_("\nGeneral options:\n"));
printf(_(" -d, --dbname=NAME connect to database name\n"));
printf(_(" -f, --file=FILENAME output file name\n"));
printf(_(" -F, --format=c|t backup file format (should be automatic)\n"));
printf(_(" -F, --format=c|d|t backup file format (should be automatic)\n"));
printf(_(" -l, --list print summarized TOC of the archive\n"));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" --help show this help, then exit\n"));
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment