Abuhujair Javed / Postgres FD Implementation
Commit d9bb75dd, authored May 10, 2012 by Peter Eisentraut
Whitespace cleanup
parent 1908a679
Showing 1 changed file with 92 additions and 98 deletions
doc/src/sgml/pgtesttiming.sgml
@@ -29,7 +29,7 @@
 <para>
 <application>pg_test_timing</> is a tool to measure the timing overhead
 on your system and confirm that the system time never moves backwards.
 Systems that are slow to collect timing data can give less accurate
 <command>EXPLAIN ANALYZE</command> results.
 </para>
 </refsect1>
@@ -68,11 +68,10 @@
 <title>Interpreting results</title>
 <para>
-Good results will show most (>90%) individual timing calls take less
-than one microsecond. Average per loop overhead will be even lower,
-below 100 nanoseconds. This example from an Intel i7-860 system using
-a TSC clock source shows excellent performance:
+Good results will show most (>90%) individual timing calls take less than
+one microsecond. Average per loop overhead will be even lower, below 100
+nanoseconds. This example from an Intel i7-860 system using a TSC clock
+source shows excellent performance:
+</para>
 <screen>
 Testing timing overhead for 3 seconds.
@@ -85,12 +84,13 @@ Histogram of timing durations:
 2: 2999652 3.59518%
 1: 80435604 96.40465%
 </screen>
-</para>
 <para>
 Note that different units are used for the per loop time than the
-histogram. The loop can have resolution within a few nanoseconds
-(nsec), while the individual timing calls can only resolve down to
-one microsecond (usec).
+histogram. The loop can have resolution within a few nanoseconds (nsec),
+while the individual timing calls can only resolve down to one microsecond
+(usec).
 </para>
 </refsect2>
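The measurement described in this hunk can be approximated with a short loop. The following is a minimal sketch, not pg_test_timing's actual implementation: it assumes Python's `time.perf_counter_ns` as the clock, and the function name and the power-of-two microsecond bucket layout are illustrative choices. Interpreter overhead dominates in Python, so the absolute numbers will be far higher than those of a C tool; only the structure of the measurement carries over.

```python
import time


def measure_timing_overhead(duration_s: float = 0.2):
    """Read the clock back to back for duration_s seconds.

    Returns (average ns per call, histogram, backwards count). The histogram
    maps an upper bound in microseconds (a power of two) to the number of
    clock-to-clock deltas that fell below that bound.
    """
    histogram = {}
    backwards = 0  # deltas below zero; perf_counter_ns is documented as monotonic
    calls = 0
    start = prev = time.perf_counter_ns()
    deadline = start + int(duration_s * 1e9)
    while prev < deadline:
        now = time.perf_counter_ns()
        delta_ns = now - prev
        if delta_ns < 0:
            backwards += 1
        else:
            usec = delta_ns // 1000
            bucket = 1
            while bucket <= usec:  # smallest power of two strictly above usec
                bucket *= 2
            histogram[bucket] = histogram.get(bucket, 0) + 1
        prev = now
        calls += 1
    avg_ns = (prev - start) / calls
    return avg_ns, histogram, backwards


avg_ns, histogram, backwards = measure_timing_overhead()
print(f"Per call including loop overhead: {avg_ns:.2f} ns")
for bound in sorted(histogram, reverse=True):
    print(f"< {bound} usec: {histogram[bound]}")
print(f"clock moved backwards {backwards} times")
```

The average is computed in nanoseconds from the total elapsed time, while each individual delta is bucketed in whole microseconds, which mirrors the difference in units the paragraph above points out.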
@@ -98,11 +98,10 @@ Histogram of timing durations:
 <title>Measuring executor timing overhead</title>
 <para>
 When the query executor is running a statement using
-<command>EXPLAIN ANALYZE</command>, individual operations are
-timed as well as showing a summary. The overhead of your system
-can be checked by counting rows with the psql program:
+<command>EXPLAIN ANALYZE</command>, individual operations are timed as well
+as showing a summary. The overhead of your system can be checked by
+counting rows with the psql program:
+</para>
 <screen>
 CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
@@ -110,16 +109,16 @@ CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
 SELECT COUNT(*) FROM t;
 EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
 </screen>
-</para>
 <para>
 The i7-860 system measured runs the count query in 9.8 ms while
-the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms,
-each processing just over 100,000 rows. That 6.8 ms difference
-means the timing overhead per row is 68 ns, about twice what
-pg_test_timing estimated it would be. Even that relatively
-small amount of overhead is making the fully timed count statement
-take almost 70% longer. On more substantial queries, the
-timing overhead would be less problematic.
+the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
+processing just over 100,000 rows. That 6.8 ms difference means the timing
+overhead per row is 68 ns, about twice what pg_test_timing estimated it
+would be. Even that relatively small amount of overhead is making the fully
+timed count statement take almost 70% longer. On more substantial queries,
+the timing overhead would be less problematic.
 </para>
 </refsect2>
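The per-row figure quoted in this hunk follows from simple arithmetic, reproduced here as a small sketch. The function name is hypothetical, and the row count is taken as exactly 100,000, as the documentation's own calculation does.

```python
def timing_overhead_ns_per_row(plain_ms: float, analyze_ms: float, rows: int) -> float:
    """Extra time per row, in nanoseconds, added by running under EXPLAIN ANALYZE."""
    extra_ms = analyze_ms - plain_ms
    return extra_ms * 1_000_000 / rows  # 1 ms = 1,000,000 ns


# Figures from the documentation text: 9.8 ms plain, 16.6 ms timed, 100,000 rows.
tsc_ns = timing_overhead_ns_per_row(9.8, 16.6, 100_000)
slowdown = 16.6 / 9.8 - 1
print(f"{tsc_ns:.0f} ns per row, {slowdown:.0%} longer")  # 68 ns per row, 69% longer
```

Applying the same function to the 115.9 ms acpi_pm run reported later in this file, `timing_overhead_ns_per_row(9.8, 115.9, 100_000)`, gives about 1061 ns per row, which matches the figure stated there.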
@@ -127,14 +126,13 @@ EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
 <refsect2>
 <title>Changing time sources</title>
 <para>
-On some newer Linux systems, it's possible to change the clock
-source used to collect timing data at any time. A second example
-shows the slowdown possible from switching to the slower acpi_pm
-time source, on the same system used for the fast results above:
+On some newer Linux systems, it's possible to change the clock source used
+to collect timing data at any time. A second example shows the slowdown
+possible from switching to the slower acpi_pm time source, on the same
+system used for the fast results above:
+</para>
 <screen>
 # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
 tsc hpet acpi_pm
 # echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
 # pg_test_timing
@@ -147,45 +145,43 @@ Histogram of timing durations:
 2: 2990371 72.05956%
 1: 1155682 27.84870%
 </screen>
-</para>
 <para>
-In this configuration, the sample <command>EXPLAIN ANALYZE</command>
-above takes 115.9 ms. That's 1061 nsec of timing overhead, again
-a small multiple of what's measured directly by this utility.
-That much timing overhead means the actual query itself is only
-taking a tiny fraction of the accounted for time, most of it
-is being consumed in overhead instead. In this configuration,
-any <command>EXPLAIN ANALYZE</command> totals involving many
-timed operations would be inflated significantly by timing overhead.
+In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
+takes 115.9 ms. That's 1061 nsec of timing overhead, again a small multiple
+of what's measured directly by this utility. That much timing overhead
+means the actual query itself is only taking a tiny fraction of the
+accounted for time, most of it is being consumed in overhead instead. In
+this configuration, any <command>EXPLAIN ANALYZE</command> totals involving
+many timed operations would be inflated significantly by timing overhead.
 </para>
 <para>
-FreeBSD also allows changing the time source on the fly, and
-it logs information about the timer selected during boot:
+FreeBSD also allows changing the time source on the fly, and it logs
+information about the timer selected during boot:
+</para>
 <screen>
 dmesg | grep "Timecounter"
 sysctl kern.timecounter.hardware=TSC
 </screen>
-</para>
 <para>
-Other systems may only allow setting the time source on boot.
-On older Linux systems the "clock" kernel setting is the only way
-to make this sort of change. And even on some more recent ones,
-the only option you'll see for a clock source is "jiffies". Jiffies
-are the older Linux software clock implementation, which can have
-good resolution when it's backed by fast enough timing hardware,
-as in this example:
+Other systems may only allow setting the time source on boot. On older
+Linux systems the "clock" kernel setting is the only way to make this sort
+of change. And even on some more recent ones, the only option you'll see
+for a clock source is "jiffies". Jiffies are the older Linux software clock
+implementation, which can have good resolution when it's backed by fast
+enough timing hardware, as in this example:
+</para>
 <screen>
 $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
 jiffies
 $ dmesg | grep time.c
 time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
 time.c: Detected 2400.153 MHz processor.
-$ ./pg_test_timing
+$ pg_test_timing
 Testing timing overhead for 3 seconds.
 Per timing duration including loop overhead: 97.75 ns
 Histogram of timing durations:
@@ -197,76 +193,74 @@ Histogram of timing durations:
 2: 2993204 9.75277%
 1: 27694571 90.23734%
 </screen>
-</para>
 </refsect2>
 <refsect2>
 <title>Clock hardware and timing accuracy</title>
 <para>
-Collecting accurate timing information is normally done on computers
-using hardware clocks with various levels of accuracy. With some
-hardware the operating systems can pass the system clock time almost
-directly to programs. A system clock can also be derived from a chip
-that simply provides timing interrupts, periodic ticks at some known
-time interval. In either case, operating system kernels provide
-a clock source that hides these details. But the accuracy of that
-clock source and how quickly it can return results varies based
-on the underlying hardware.
+Collecting accurate timing information is normally done on computers using
+hardware clocks with various levels of accuracy. With some hardware the
+operating systems can pass the system clock time almost directly to
+programs. A system clock can also be derived from a chip that simply
+provides timing interrupts, periodic ticks at some known time interval. In
+either case, operating system kernels provide a clock source that hides
+these details. But the accuracy of that clock source and how quickly it can
+return results varies based on the underlying hardware.
 </para>
 <para>
-Inaccurate time keeping can result in system instability. Test
-any change to the clock source very carefully. Operating system
-defaults are sometimes made to favor reliability over best
-accuracy. And if you are using a virtual machine, look into the
-recommended time sources compatible with it. Virtual hardware
-faces additional difficulties when emulating timers, and there
-are often per operating system settings suggested by vendors.
+Inaccurate time keeping can result in system instability. Test any change
+to the clock source very carefully. Operating system defaults are sometimes
+made to favor reliability over best accuracy. And if you are using a virtual
+machine, look into the recommended time sources compatible with it. Virtual
+hardware faces additional difficulties when emulating timers, and there are
+often per operating system settings suggested by vendors.
 </para>
 <para>
-The Time Stamp Counter (TSC) clock source is the most accurate one
-available on current generation CPUs. It's the preferred way to track
-the system time when it's supported by the operating system and the
-TSC clock is reliable. There are several ways that TSC can fail
-to provide an accurate timing source, making it unreliable. Older
-systems can have a TSC clock that varies based on the CPU
-temperature, making it unusable for timing. Trying to use TSC on some
-older multi-core CPUs can give a reported time that's inconsistent
-among multiple cores. This can result in the time going backwards, a
-problem this program checks for. And even the newest systems can
-fail to provide accurate TSC timing with very aggressive power saving
+The Time Stamp Counter (TSC) clock source is the most accurate one available
+on current generation CPUs. It's the preferred way to track the system time
+when it's supported by the operating system and the TSC clock is
+reliable. There are several ways that TSC can fail to provide an accurate
+timing source, making it unreliable. Older systems can have a TSC clock that
+varies based on the CPU temperature, making it unusable for timing. Trying
+to use TSC on some older multi-core CPUs can give a reported time that's
+inconsistent among multiple cores. This can result in the time going
+backwards, a problem this program checks for. And even the newest systems
+can fail to provide accurate TSC timing with very aggressive power saving
 configurations.
 </para>
 <para>
-Newer operating systems may check for the known TSC problems and
-switch to a slower, more stable clock source when they are seen.
-If your system supports TSC time but doesn't default to that, it
-may be disabled for a good reason. And some operating systems may
-not detect all the possible problems correctly, or will allow using
-TSC even in situations where it's known to be inaccurate.
+Newer operating systems may check for the known TSC problems and switch to a
+slower, more stable clock source when they are seen. If your system
+supports TSC time but doesn't default to that, it may be disabled for a good
+reason. And some operating systems may not detect all the possible problems
+correctly, or will allow using TSC even in situations where it's known to be
+inaccurate.
 </para>
 <para>
-The High Precision Event Timer (HPET) is the preferred timer on
-systems where it's available and TSC is not accurate. The timer
-chip itself is programmable to allow up to 100 nanosecond resolution,
-but you may not see that much accuracy in your system clock.
+The High Precision Event Timer (HPET) is the preferred timer on systems
+where it's available and TSC is not accurate. The timer chip itself is
+programmable to allow up to 100 nanosecond resolution, but you may not see
+that much accuracy in your system clock.
 </para>
 <para>
-Advanced Configuration and Power Interface (ACPI) provides a
-Power Management (PM) Timer, which Linux refers to as the acpi_pm.
-The clock derived from acpi_pm will at best provide 300 nanosecond
-resolution.
+Advanced Configuration and Power Interface (ACPI) provides a Power
+Management (PM) Timer, which Linux refers to as the acpi_pm. The clock
+derived from acpi_pm will at best provide 300 nanosecond resolution.
 </para>
 <para>
-Timers used on older PC hardware including the 8254 Programmable
-Interval Timer (PIT), the real-time clock (RTC), the Advanced
-Programmable Interrupt Controller (APIC) timer, and the Cyclone
-timer. These timers aim for millisecond resolution.
+Timers used on older PC hardware including the 8254 Programmable Interval
+Timer (PIT), the real-time clock (RTC), the Advanced Programmable Interrupt
+Controller (APIC) timer, and the Cyclone timer. These timers aim for
+millisecond resolution.
 </para>
 </refsect2>
 </refsect1>
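The documentation text notes that a system may support TSC without defaulting to it. A minimal sketch of checking this on Linux follows, reading the same sysfs files shown in the screen examples above. The function name is hypothetical, and the directory is a parameter so the function returns `None` on systems where those files do not exist.

```python
from pathlib import Path

# Directory used in the documentation's own examples.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(sysfs_dir: Path = SYSFS_DIR):
    """Return (current, available) clock sources, or None if the files are absent."""
    current_file = sysfs_dir / "current_clocksource"
    available_file = sysfs_dir / "available_clocksource"
    if not (current_file.is_file() and available_file.is_file()):
        return None
    current = current_file.read_text().strip()
    available = available_file.read_text().split()
    return current, available


info = read_clocksources()
if info is None:
    print("clock source files not found (not Linux, or sysfs unavailable)")
else:
    current, available = info
    print(f"current: {current}; available: {', '.join(available)}")
    if "tsc" in available and current != "tsc":
        print("TSC is listed but not selected; it may be disabled for a good reason")
```

The sketch only reads the files. Writing to `current_clocksource`, as in the root-shell example earlier, changes system timekeeping and is left out deliberately.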