Commit 67a87535 authored by Tom Lane's avatar Tom Lane

In pgbench logging, avoid assuming that instr_times match Unix timestamps.

For aggregated logging, pg_bench supposed that printing the integer part of
INSTR_TIME_GET_DOUBLE() would produce a Unix timestamp.  That was already
broken on Windows, and it's about to get broken on most other platforms as
well.  As in commit 74baa1e3, we can remove the entanglement at the price
of one extra syscall per transaction; though here it seems more convenient
to use time(NULL) instead of gettimeofday(), since we only need
integral-second precision.

I took the time to do some wordsmithing on the documentation about
pgbench's logging features, too.

Discussion: https://postgr.es/m/8837.1483216839@sss.pgh.pa.us
parent 74baa1e3
...@@ -355,7 +355,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> ...@@ -355,7 +355,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
<term><option>--log</option></term> <term><option>--log</option></term>
<listitem> <listitem>
<para> <para>
Write the time taken by each transaction to a log file. Write information about each transaction to a log file.
See below for details. See below for details.
</para> </para>
</listitem> </listitem>
...@@ -572,13 +572,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> ...@@ -572,13 +572,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
<term><option>--aggregate-interval=<replaceable>seconds</></option></term> <term><option>--aggregate-interval=<replaceable>seconds</></option></term>
<listitem> <listitem>
<para> <para>
Length of aggregation interval (in seconds). May be used only together Length of aggregation interval (in seconds). May be used only
with <application>-l</application> - with this option, the log contains with <option>-l</option> option. With this option, the log contains
per-interval summary (number of transactions, min/max latency and two per-interval summary data, as described below.
additional fields useful for variance estimation).
</para>
<para>
This option is not currently supported on Windows.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -618,8 +614,8 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</> ...@@ -618,8 +614,8 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
<term><option>--log-prefix=<replaceable>prefix</></option></term> <term><option>--log-prefix=<replaceable>prefix</></option></term>
<listitem> <listitem>
<para> <para>
Set the filename prefix for the transaction log file created by Set the filename prefix for the transaction log files created by
<option>--log</>. The default is <replaceable>pgbench_log</>. <option>--log</>. The default is <literal>pgbench_log</>.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -1128,38 +1124,44 @@ END; ...@@ -1128,38 +1124,44 @@ END;
<title>Per-Transaction Logging</title> <title>Per-Transaction Logging</title>
<para> <para>
With the <option>-l</> option but without the <option>--aggregate-interval</option>, With the <option>-l</> option (but without
<application>pgbench</> writes the time taken by each transaction the <option>--aggregate-interval</option> option),
<application>pgbench</> writes information about each transaction
to a log file. The log file will be named to a log file. The log file will be named
<filename><replaceable>prefix</>.<replaceable>nnn</></filename>, <filename><replaceable>prefix</>.<replaceable>nnn</></filename>,
where <replaceable>prefix</> defaults to <literal>pgbench_log</>, and where <replaceable>prefix</> defaults to <literal>pgbench_log</>, and
<replaceable>nnn</> is the PID of the <replaceable>nnn</> is the PID of the
<application>pgbench</application> process. If the <option>-j</> option is 2 or higher, <application>pgbench</application> process.
creating multiple worker threads, each will have its own log file. The first worker will The prefix can be changed by using the <option>--log-prefix</> option.
If the <option>-j</> option is 2 or higher, so that there are multiple
worker threads, each will have its own log file. The first worker will
use the same name for its log file as in the standard single worker case. use the same name for its log file as in the standard single worker case.
The additional log files for the other workers will be named The additional log files for the other workers will be named
<filename><replaceable>pgbench_log</>.<replaceable>nnn</>.<replaceable>mmm</></filename>, <filename><replaceable>prefix</>.<replaceable>nnn</>.<replaceable>mmm</></filename>,
where <replaceable>mmm</> is a sequential number for each worker starting where <replaceable>mmm</> is a sequential number for each worker starting
with 1. The prefix can be changed by using the <option>--log-prefix</> with 1.
option.
</para> </para>
<para> <para>
The format of the log is: The format of the log is:
<synopsis> <synopsis>
<replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>script_no</> <replaceable>time_epoch</> <replaceable>time_us</> <optional><replaceable>schedule_lag</replaceable></optional> <replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>script_no</> <replaceable>time_epoch</> <replaceable>time_us</> <optional> <replaceable>schedule_lag</replaceable> </optional>
</synopsis> </synopsis>
where <replaceable>time</> is the total elapsed transaction time in microseconds, where
<replaceable>client_id</> indicates which client session ran the transaction,
<replaceable>transaction_no</> counts how many transactions have been
run by that session,
<replaceable>time</> is the total elapsed transaction time in microseconds,
<replaceable>script_no</> identifies which script file was used (useful when <replaceable>script_no</> identifies which script file was used (useful when
multiple scripts were specified with <option>-f</> or <option>-b</>), multiple scripts were specified with <option>-f</> or <option>-b</>),
and <replaceable>time_epoch</>/<replaceable>time_us</> are a and <replaceable>time_epoch</>/<replaceable>time_us</> are a
Unix epoch format time stamp and an offset Unix-epoch time stamp and an offset
in microseconds (suitable for creating an ISO 8601 in microseconds (suitable for creating an ISO 8601
time stamp with fractional seconds) showing when time stamp with fractional seconds) showing when
the transaction completed. the transaction completed.
Field <replaceable>schedule_lag</> is the difference between the The <replaceable>schedule_lag</> field is the difference between the
transaction's scheduled start time, and the time it actually started, in transaction's scheduled start time, and the time it actually started, in
microseconds. It is only present when the <option>--rate</> option is used. microseconds. It is only present when the <option>--rate</> option is used.
When both <option>--rate</> and <option>--latency-limit</> are used, When both <option>--rate</> and <option>--latency-limit</> are used,
...@@ -1168,7 +1170,7 @@ END; ...@@ -1168,7 +1170,7 @@ END;
</para> </para>
<para> <para>
Here is a snippet of the log file generated: Here is a snippet of a log file generated in a single-client run:
<screen> <screen>
0 199 2241 0 1175850568 995598 0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079 0 200 2465 0 1175850568 998079
...@@ -1176,7 +1178,8 @@ END; ...@@ -1176,7 +1178,8 @@ END;
0 202 2038 0 1175850569 2663 0 202 2038 0 1175850569 2663
</screen> </screen>
Another example with --rate=100 and --latency-limit=5 (note the additional Another example with <literal>--rate=100</>
and <literal>--latency-limit=5</> (note the additional
<replaceable>schedule_lag</> column): <replaceable>schedule_lag</> column):
<screen> <screen>
0 81 4621 0 1412881037 912698 3005 0 81 4621 0 1412881037 912698 3005
...@@ -1203,32 +1206,41 @@ END; ...@@ -1203,32 +1206,41 @@ END;
<title>Aggregated Logging</title> <title>Aggregated Logging</title>
<para> <para>
With the <option>--aggregate-interval</option> option, the logs use a bit different format: With the <option>--aggregate-interval</option> option, a different
format is used for the log files:
<synopsis> <synopsis>
<replaceable>interval_start</> <replaceable>num_of_transactions</> <replaceable>latency_sum</> <replaceable>latency_2_sum</> <replaceable>min_latency</> <replaceable>max_latency</> <optional><replaceable>lag_sum</> <replaceable>lag_2_sum</> <replaceable>min_lag</> <replaceable>max_lag</> <optional><replaceable>skipped_transactions</></optional></optional> <replaceable>interval_start</> <replaceable>num_transactions</> <replaceable>sum_latency</> <replaceable>sum_latency_2</> <replaceable>min_latency</> <replaceable>max_latency</> <optional> <replaceable>sum_lag</> <replaceable>sum_lag_2</> <replaceable>min_lag</> <replaceable>max_lag</> <optional> <replaceable>skipped</> </optional> </optional>
</synopsis> </synopsis>
where <replaceable>interval_start</> is the start of the interval (Unix epoch where
format time stamp), <replaceable>num_of_transactions</> is the number of transactions <replaceable>interval_start</> is the start of the interval (as a Unix
within the interval, <replaceable>latency_sum</replaceable> is a sum of latencies epoch time stamp),
(so you can compute average latency easily). The following two fields are useful <replaceable>num_transactions</> is the number of transactions
for variance estimation - <replaceable>latency_sum</> is a sum of latencies and within the interval,
<replaceable>latency_2_sum</> is a sum of 2nd powers of latencies. The next two <replaceable>sum_latency</replaceable> is the sum of the transaction
fields are <replaceable>min_latency</> - a minimum latency within the interval, and latencies within the interval,
<replaceable>max_latency</> - maximum latency within the interval. A transaction is <replaceable>sum_latency_2</replaceable> is the sum of squares of the
counted into the interval when it was committed. The fields in the end, transaction latencies within the interval,
<replaceable>lag_sum</>, <replaceable>lag_2_sum</>, <replaceable>min_lag</>, <replaceable>min_latency</> is the minimum latency within the interval,
and
<replaceable>max_latency</> is the maximum latency within the interval.
The next fields,
<replaceable>sum_lag</>, <replaceable>sum_lag_2</>, <replaceable>min_lag</>,
and <replaceable>max_lag</>, are only present if the <option>--rate</> and <replaceable>max_lag</>, are only present if the <option>--rate</>
option is used. The very last one, <replaceable>skipped_transactions</>, option is used.
is only present if the option <option>--latency-limit</> is present, too. They provide statistics about the time each transaction had to wait for the
They are calculated from the time each transaction had to wait for the
previous one to finish, i.e. the difference between each transaction's previous one to finish, i.e. the difference between each transaction's
scheduled start time and the time it actually started. scheduled start time and the time it actually started.
The very last field, <replaceable>skipped</>,
is only present if the <option>--latency-limit</> option is used, too.
It counts the number of transactions skipped because they would have
started too late.
Each transaction is counted in the interval when it was committed.
</para> </para>
<para> <para>
Here is example output: Here is some example output:
<screen> <screen>
1345828501 5601 1542744 483552416 61 2573 1345828501 5601 1542744 483552416 61 2573
1345828503 7884 1979812 565806736 60 1479 1345828503 7884 1979812 565806736 60 1479
...@@ -1238,9 +1250,9 @@ END; ...@@ -1238,9 +1250,9 @@ END;
</screen></para> </screen></para>
<para> <para>
Notice that while the plain (unaggregated) log file contains a reference Notice that while the plain (unaggregated) log file shows which script
to the custom script files, the aggregated log does not. Therefore if was used for each transaction, the aggregated log does not. Therefore if
you need per script data, you need to aggregate the data on your own. you need per-script data, you need to aggregate the data on your own.
</para> </para>
</refsect2> </refsect2>
......
...@@ -42,6 +42,7 @@ ...@@ -42,6 +42,7 @@
#include <limits.h> #include <limits.h>
#include <math.h> #include <math.h>
#include <signal.h> #include <signal.h>
#include <time.h>
#include <sys/time.h> #include <sys/time.h>
#ifdef HAVE_SYS_SELECT_H #ifdef HAVE_SYS_SELECT_H
#include <sys/select.h> #include <sys/select.h>
...@@ -227,7 +228,7 @@ typedef struct SimpleStats ...@@ -227,7 +228,7 @@ typedef struct SimpleStats
*/ */
typedef struct StatsData typedef struct StatsData
{ {
long start_time; /* interval start time, for aggregates */ time_t start_time; /* interval start time, for aggregates */
int64 cnt; /* number of transactions */ int64 cnt; /* number of transactions */
int64 skipped; /* number of transactions skipped under --rate int64 skipped; /* number of transactions skipped under --rate
* and --latency-limit */ * and --latency-limit */
...@@ -449,7 +450,7 @@ static const BuiltinScript builtin_script[] = ...@@ -449,7 +450,7 @@ static const BuiltinScript builtin_script[] =
static void setIntValue(PgBenchValue *pv, int64 ival); static void setIntValue(PgBenchValue *pv, int64 ival);
static void setDoubleValue(PgBenchValue *pv, double dval); static void setDoubleValue(PgBenchValue *pv, double dval);
static bool evaluateExpr(TState *, CState *, PgBenchExpr *, PgBenchValue *); static bool evaluateExpr(TState *, CState *, PgBenchExpr *, PgBenchValue *);
static void doLog(TState *thread, CState *st, instr_time *now, static void doLog(TState *thread, CState *st,
StatsData *agg, bool skipped, double latency, double lag); StatsData *agg, bool skipped, double latency, double lag);
static void processXactStats(TState *thread, CState *st, instr_time *now, static void processXactStats(TState *thread, CState *st, instr_time *now,
bool skipped, StatsData *agg); bool skipped, StatsData *agg);
...@@ -780,7 +781,7 @@ mergeSimpleStats(SimpleStats *acc, SimpleStats *ss) ...@@ -780,7 +781,7 @@ mergeSimpleStats(SimpleStats *acc, SimpleStats *ss)
* the given value. * the given value.
*/ */
static void static void
initStats(StatsData *sd, double start_time) initStats(StatsData *sd, time_t start_time)
{ {
sd->start_time = start_time; sd->start_time = start_time;
sd->cnt = 0; sd->cnt = 0;
...@@ -2425,10 +2426,15 @@ doCustom(TState *thread, CState *st, StatsData *agg) ...@@ -2425,10 +2426,15 @@ doCustom(TState *thread, CState *st, StatsData *agg)
} }
/* /*
* print log entry after completing one transaction. * Print log entry after completing one transaction.
*
* We print Unix-epoch timestamps in the log, so that entries can be
* correlated against other logs. On some platforms this could be obtained
* from the instr_time reading the caller has, but rather than get entangled
* with that, we just eat the cost of an extra syscall in all cases.
*/ */
static void static void
doLog(TState *thread, CState *st, instr_time *now, doLog(TState *thread, CState *st,
StatsData *agg, bool skipped, double latency, double lag) StatsData *agg, bool skipped, double latency, double lag)
{ {
FILE *logfile = thread->logfile; FILE *logfile = thread->logfile;
...@@ -2447,15 +2453,17 @@ doLog(TState *thread, CState *st, instr_time *now, ...@@ -2447,15 +2453,17 @@ doLog(TState *thread, CState *st, instr_time *now,
if (agg_interval > 0) if (agg_interval > 0)
{ {
/* /*
* Loop until we reach the interval of the current transaction, and * Loop until we reach the interval of the current moment, and print
* print all the empty intervals in between (this may happen with very * any empty intervals in between (this may happen with very low tps,
* low tps, e.g. --rate=0.1). * e.g. --rate=0.1).
*/ */
while (agg->start_time + agg_interval < INSTR_TIME_GET_DOUBLE(*now)) time_t now = time(NULL);
while (agg->start_time + agg_interval <= now)
{ {
/* print aggregated report to logfile */ /* print aggregated report to logfile */
fprintf(logfile, "%ld " INT64_FORMAT " %.0f %.0f %.0f %.0f", fprintf(logfile, "%ld " INT64_FORMAT " %.0f %.0f %.0f %.0f",
agg->start_time, (long) agg->start_time,
agg->cnt, agg->cnt,
agg->latency.sum, agg->latency.sum,
agg->latency.sum2, agg->latency.sum2,
...@@ -2485,12 +2493,6 @@ doLog(TState *thread, CState *st, instr_time *now, ...@@ -2485,12 +2493,6 @@ doLog(TState *thread, CState *st, instr_time *now,
/* no, print raw transactions */ /* no, print raw transactions */
struct timeval tv; struct timeval tv;
/*
* We print the current system timestamp in the log, so that entries
* can be correlated against other logs. On some platforms this is
* available in *now, but rather than get entangled with that, we just
* eat the cost of an extra syscall in all cases.
*/
gettimeofday(&tv, NULL); gettimeofday(&tv, NULL);
if (skipped) if (skipped)
fprintf(logfile, "%d " INT64_FORMAT " skipped %d %ld %ld", fprintf(logfile, "%d " INT64_FORMAT " skipped %d %ld %ld",
...@@ -2518,7 +2520,7 @@ processXactStats(TState *thread, CState *st, instr_time *now, ...@@ -2518,7 +2520,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
double latency = 0.0, double latency = 0.0,
lag = 0.0; lag = 0.0;
if ((!skipped || agg_interval) && INSTR_TIME_IS_ZERO(*now)) if ((!skipped) && INSTR_TIME_IS_ZERO(*now))
INSTR_TIME_SET_CURRENT(*now); INSTR_TIME_SET_CURRENT(*now);
if (!skipped) if (!skipped)
...@@ -2540,7 +2542,7 @@ processXactStats(TState *thread, CState *st, instr_time *now, ...@@ -2540,7 +2542,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
thread->stats.cnt++; thread->stats.cnt++;
if (use_log) if (use_log)
doLog(thread, st, now, agg, skipped, latency, lag); doLog(thread, st, agg, skipped, latency, lag);
/* XXX could use a mutex here, but we choose not to */ /* XXX could use a mutex here, but we choose not to */
if (per_script_stats) if (per_script_stats)
...@@ -3202,7 +3204,7 @@ ParseScript(const char *script, const char *desc, int weight) ...@@ -3202,7 +3204,7 @@ ParseScript(const char *script, const char *desc, int weight)
ps.desc = desc; ps.desc = desc;
ps.weight = weight; ps.weight = weight;
ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num); ps.commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
initStats(&ps.stats, 0.0); initStats(&ps.stats, 0);
/* Prepare to parse script */ /* Prepare to parse script */
sstate = psql_scan_create(&pgbench_callbacks); sstate = psql_scan_create(&pgbench_callbacks);
...@@ -3972,10 +3974,6 @@ main(int argc, char **argv) ...@@ -3972,10 +3974,6 @@ main(int argc, char **argv)
} }
break; break;
case 5: case 5:
#ifdef WIN32
fprintf(stderr, "--aggregate-interval is not currently supported on Windows\n");
exit(1);
#else
benchmarking_option_set = true; benchmarking_option_set = true;
agg_interval = atoi(optarg); agg_interval = atoi(optarg);
if (agg_interval <= 0) if (agg_interval <= 0)
...@@ -3984,7 +3982,6 @@ main(int argc, char **argv) ...@@ -3984,7 +3982,6 @@ main(int argc, char **argv)
optarg); optarg);
exit(1); exit(1);
} }
#endif
break; break;
case 6: case 6:
progress_timestamp = true; progress_timestamp = true;
...@@ -4267,7 +4264,7 @@ main(int argc, char **argv) ...@@ -4267,7 +4264,7 @@ main(int argc, char **argv)
thread->random_state[2] = random(); thread->random_state[2] = random();
thread->logfile = NULL; /* filled in later */ thread->logfile = NULL; /* filled in later */
thread->latency_late = 0; thread->latency_late = 0;
initStats(&thread->stats, 0.0); initStats(&thread->stats, 0);
nclients_dealt += thread->nstate; nclients_dealt += thread->nstate;
} }
...@@ -4321,7 +4318,7 @@ main(int argc, char **argv) ...@@ -4321,7 +4318,7 @@ main(int argc, char **argv)
#endif /* ENABLE_THREAD_SAFETY */ #endif /* ENABLE_THREAD_SAFETY */
/* wait for threads and accumulate results */ /* wait for threads and accumulate results */
initStats(&stats, 0.0); initStats(&stats, 0);
INSTR_TIME_SET_ZERO(conn_total_time); INSTR_TIME_SET_ZERO(conn_total_time);
for (i = 0; i < nthreads; i++) for (i = 0; i < nthreads; i++)
{ {
...@@ -4394,6 +4391,9 @@ threadRun(void *arg) ...@@ -4394,6 +4391,9 @@ threadRun(void *arg)
INSTR_TIME_SET_ZERO(thread->conn_time); INSTR_TIME_SET_ZERO(thread->conn_time);
initStats(&aggs, time(NULL));
last = aggs;
/* open log file if requested */ /* open log file if requested */
if (use_log) if (use_log)
{ {
...@@ -4429,9 +4429,6 @@ threadRun(void *arg) ...@@ -4429,9 +4429,6 @@ threadRun(void *arg)
INSTR_TIME_SET_CURRENT(thread->conn_time); INSTR_TIME_SET_CURRENT(thread->conn_time);
INSTR_TIME_SUBTRACT(thread->conn_time, thread->start_time); INSTR_TIME_SUBTRACT(thread->conn_time, thread->start_time);
initStats(&aggs, INSTR_TIME_GET_DOUBLE(thread->start_time));
last = aggs;
/* explicitly initialize the state machines */ /* explicitly initialize the state machines */
for (i = 0; i < nstate; i++) for (i = 0; i < nstate; i++)
{ {
...@@ -4635,7 +4632,7 @@ threadRun(void *arg) ...@@ -4635,7 +4632,7 @@ threadRun(void *arg)
* (If a read from a 64-bit integer is not atomic, you might * (If a read from a 64-bit integer is not atomic, you might
* get a "torn" read and completely bogus latencies though!) * get a "torn" read and completely bogus latencies though!)
*/ */
initStats(&cur, 0.0); initStats(&cur, 0);
for (i = 0; i < nthreads; i++) for (i = 0; i < nthreads; i++)
{ {
mergeSimpleStats(&cur.latency, &thread[i].stats.latency); mergeSimpleStats(&cur.latency, &thread[i].stats.latency);
...@@ -4695,12 +4692,13 @@ done: ...@@ -4695,12 +4692,13 @@ done:
INSTR_TIME_ACCUM_DIFF(thread->conn_time, end, start); INSTR_TIME_ACCUM_DIFF(thread->conn_time, end, start);
if (thread->logfile) if (thread->logfile)
{ {
if (agg_interval) if (agg_interval > 0)
{ {
/* log aggregated but not yet reported transactions */ /* log aggregated but not yet reported transactions */
doLog(thread, state, &end, &aggs, false, 0, 0); doLog(thread, state, &aggs, false, 0, 0);
} }
fclose(thread->logfile); fclose(thread->logfile);
thread->logfile = NULL;
} }
return NULL; return NULL;
} }
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment