Commit 76e386d5, authored May 24, 2003 by Bruce Momjian

    Add cost estimate discussion to TODO.detail.

Parent: 07d89f6f

Showing 1 changed file with 402 additions and 1 deletion:
doc/TODO.detail/optimizer (+402, -1)
@@ -1059,7 +1059,7 @@ From owner-pgsql-hackers@hub.org Thu Jan 20 18:45:32 2000
 Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00672
 	for <pgman@candle.pha.pa.us>; Thu, 20 Jan 2000 19:45:30 -0500 (EST)
-Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.19 $) with ESMTP id TAA01989 for <pgman@candle.pha.pa.us>; Thu, 20 Jan 2000 19:39:15 -0500 (EST)
+Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.20 $) with ESMTP id TAA01989 for <pgman@candle.pha.pa.us>; Thu, 20 Jan 2000 19:39:15 -0500 (EST)
 Received: from localhost (majordom@localhost)
 	by hub.org (8.9.3/8.9.3) with SMTP id TAA00957;
 	Thu, 20 Jan 2000 19:35:19 -0500 (EST)
@@ -2003,3 +2003,404 @@ your stats be out-of-date or otherwise misleading.
 			regards, tom lane
From pgsql-hackers-owner+M29943@postgresql.org Thu Oct 3 18:18:27 2002
Return-path: <pgsql-hackers-owner+M29943@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g93MIOU23771
for <pgman@candle.pha.pa.us>; Thu, 3 Oct 2002 18:18:25 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id B9F51476570; Thu, 3 Oct 2002 18:18:21 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id E083B4761B0; Thu, 3 Oct 2002 18:18:19 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 13ADC476063
for <pgsql-hackers@postgresql.org>; Thu, 3 Oct 2002 18:18:17 -0400 (EDT)
Received: from acorn.he.net (acorn.he.net [64.71.137.130])
by postgresql.org (Postfix) with ESMTP id 3AEC8475FFF
for <pgsql-hackers@postgresql.org>; Thu, 3 Oct 2002 18:18:16 -0400 (EDT)
Received: from CurtisVaio ([63.164.0.47] (may be forged)) by acorn.he.net (8.8.6/8.8.2) with SMTP id PAA19215; Thu, 3 Oct 2002 15:18:14 -0700
From: "Curtis Faith" <curtis@galtair.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
cc: "Pgsql-Hackers" <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Advice: Where could I be of help?
Date: Thu, 3 Oct 2002 18:17:55 -0400
Message-ID: <DMEEJMCDOJAKPPFACMPMGEBNCEAA.curtis@galtair.com>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <13379.1033675158@sss.pgh.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Status: OR
tom lane wrote:
> But more globally, I think that our worst problems these days have to do
> with planner misestimations leading to bad plans. The planner is
> usually *capable* of generating a good plan, but all too often it picks
> the wrong one. We need work on improving the cost modeling equations
> to be closer to reality. If that's at all close to your sphere of
> interest then I think it should be #1 priority --- it's both localized,
> which I think is important for a first project, and potentially a
> considerable win.
This seems like a very interesting problem. One of the ways that I thought
would be interesting and would solve the problem of trying to figure out the
right numbers is to have certain guesses for the actual values based on
statistics gathered during vacuum and general running and then have the
planner run the "best" plan.
Then during execution if the planner turned out to be VERY wrong about
certain assumptions the execution system could update the stats that led to
those wrong assumptions. That way the system would seek the correct values
automatically. We could also gather the stats that the system produces for
certain actual databases and then use those to make smarter initial guesses.
I've found that I can never predict costs. I always end up testing
empirically and find myself surprised at the results. We should be able
to make the executor smart enough to keep count of actual costs (or a
statistical approximation) without introducing any significant overhead.
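Something along these lines, purely as illustration (every name and
constant below is invented; this is not existing PostgreSQL code): after
a plan node finishes, compare the planner's row estimate with the count
the executor actually saw, and fold large errors into a correction factor
that later planning could consult.

/*
 * Sketch only: fold the executor's observed row count back into a
 * correction factor the planner could consult next time.  Every name
 * and constant here is invented for illustration.
 */
#include <math.h>

typedef struct EstimateFeedback
{
    double estimated_rows;      /* what the planner predicted */
    double actual_rows;         /* what the executor counted */
    double correction;          /* running multiplier, starts at 1.0 */
} EstimateFeedback;

/* Blend the observed error into the stored correction factor. */
static void
update_feedback(EstimateFeedback *fb, double estimated, double actual)
{
    double error;

    if (estimated < 1.0)
        estimated = 1.0;        /* guard against division by zero */
    error = actual / estimated; /* > 1.0 means we underestimated */

    fb->estimated_rows = estimated;
    fb->actual_rows = actual;

    /* Exponential moving average so one odd query cannot swing it. */
    fb->correction = 0.9 * fb->correction + 0.1 * error;
}

/* Only react when the planner was VERY wrong, as suggested above. */
static int
estimate_was_very_wrong(const EstimateFeedback *fb)
{
    double ratio = fb->actual_rows / fmax(fb->estimated_rows, 1.0);

    return ratio > 10.0 || ratio < 0.1;
}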
tom lane also wrote:
> There is no "cache flushing". We have a shared buffer cache management
> algorithm that's straight LRU across all buffers. There's been some
> interest in smarter cache-replacement code; I believe Neil Conway is
> messing around with an LRU-2 implementation right now. If you've got
> better ideas we're all ears.
Hmmm, this is the area that I think could lead to huge performance gains.

Consider a simple system with a table tbl_master that gets read by each
process many times but with very infrequent inserts and that contains
about 3,000 rows. The single but heavily used index for this table is
contained in a btree with a depth of three, with 20 8K pages in the first
two levels of the btree.

Another table tbl_detail with 10 indices gets very frequent inserts.
There are over 300,000 rows. Some queries result in index scans over the
approximately 5,000 8K pages in the index.

There is a 40M shared cache for this system.
Everytime a query which requires the index scan runs it will blow out the
entire cache since the scan will load more blocks than the cache holds.
Only blocks that are accessed while the scan is going will survive. LRU
is bad, bad, bad!
LRU-2 might be better but it seems like it still won't give enough
priority to the most frequently used blocks. I don't see how it would do
better for the above case.
I once implemented a modified cache algorithm that was based on the clock
algorithm for VM page caches. VM paging is similar to databases in that
there is definite locality of reference and certain pages are MUCH more
likely to be requested.
The basic idea was to have a flag in each block that represented the
access time in clock intervals. Imagine a clock hand sweeping across a
clock; every access is like a tiny movement in the clock hand. Blocks
that are not accessed during a sweep are candidates for removal.
My modification was to use access counts to increase the durability of
the more accessed blocks. Each time a block is accessed its flag is
shifted left (up to a maximum number of shifts - ShiftN) and 1 is added
to it. Every so many cache accesses (and synchronously when the cache is
full) a pass is made over each block, right shifting the flags (a clock
sweep). This can also be done one block at a time each access so the
clock is directly linked to the cache access rate. Any blocks with 0 are
placed into a doubly linked list of candidates for removal. New cache
blocks are allocated from the list of candidates. Accesses of blocks in
the candidate list just remove them from the list.
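Roughly like this, as a sketch (all names, sizes, and list-handling
details are invented for illustration; this is not taken from PostgreSQL
or any other real implementation):

/*
 * Sketch of the shift-register clock cache described above.
 */
#include <stdint.h>

#define NBLOCKS 1024
#define SHIFT_N 8                       /* maximum number of left shifts */

typedef struct CacheBlock
{
    uint32_t access_flag;               /* the shift-register "clock" flag */
    int      prev, next;                /* candidate-list links, -1 = not listed */
} CacheBlock;

static CacheBlock cache[NBLOCKS];
static int cand_head = -1;              /* doubly linked list of removal candidates */

static void
cache_init(void)
{
    for (int i = 0; i < NBLOCKS; i++)
    {
        cache[i].access_flag = 0;
        cache[i].prev = cache[i].next = -1;
    }
    cand_head = -1;
}

static void
unlink_candidate(int i)
{
    if (cache[i].prev != -1)
        cache[cache[i].prev].next = cache[i].next;
    else
        cand_head = cache[i].next;
    if (cache[i].next != -1)
        cache[cache[i].next].prev = cache[i].prev;
    cache[i].prev = cache[i].next = -1;
}

/* Every access: shift the flag left (bounded by SHIFT_N bits), then add 1. */
static void
note_access(int i)
{
    if (cache[i].access_flag < (1u << (SHIFT_N - 1)))
        cache[i].access_flag <<= 1;
    cache[i].access_flag |= 1;

    /* Touching a block on the candidate list rescues it. */
    if (cache[i].prev != -1 || cache[i].next != -1 || cand_head == i)
        unlink_candidate(i);
}

/* The clock sweep: right-shift every flag; blocks that reach 0 become candidates. */
static void
clock_sweep(void)
{
    for (int i = 0; i < NBLOCKS; i++)
    {
        cache[i].access_flag >>= 1;
        if (cache[i].access_flag == 0 &&
            cache[i].prev == -1 && cache[i].next == -1 && cand_head != i)
        {
            cache[i].next = cand_head;
            if (cand_head != -1)
                cache[cand_head].prev = i;
            cand_head = i;
        }
    }
}

/* New buffers are allocated from the head of the candidate list. */
static int
allocate_block(void)
{
    int i = cand_head;

    if (i == -1)
    {
        clock_sweep();                  /* sweep synchronously when full */
        i = cand_head;
    }
    if (i != -1)
        unlink_candidate(i);
    return i;                           /* -1 means every block is still "hot" */
}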
An index root node page would likely be accessed frequently enough so
that all its bits would be set, so it would take ShiftN clock sweeps to
age out. This algorithm increased the cache hit ratio from 40% to about
90% for the cases I tested when compared to a simple LRU mechanism.
The paging ratio is greatly dependent on the ratio of the actual database
size to the cache size.
The bottom line is that it is very important to keep blocks that are
frequently accessed in the cache. The top levels of large btrees are
accessed many hundreds (actually a power of the number of keys in each
page) of times more frequently than the leaf pages.
LRU can be the worst possible algorithm for something like an index or
table scan of large tables since it flushes a large number of potentially
frequently accessed blocks in favor of ones that are very unlikely to be
retrieved again.
tom lane also wrote:
> This is an interesting area. Keep in mind though that Postgres is a
> portable DB that tries to be agnostic about what kernel and filesystem
> it's sitting on top of --- and in any case it does not run as root, so
> has very limited ability to affect what the kernel/filesystem do.
> I'm not sure how much can be done without losing those portability
> advantages.
The kinds of things I was thinking about should be very portable. I found
that simply writing the cache in order of the file system offset results
in very greatly improved performance since it lets the head seek in
smaller increments and much more smoothly, especially with modern disks.

Most of the time the file system will lay files out as large sequential
byte runs on the physical disks, in order. It might be in a few chunks
but those chunks will be sequential and fairly large.
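The write-ordering part is just a sort before the flush, something like
this sketch (the DirtyBuffer bookkeeping and the flush_buffer() primitive
are invented for illustration; this is not PostgreSQL code):

/*
 * Sort the dirty buffers by file and block offset so the writes reach
 * the disk head in ascending order.
 */
#include <stdlib.h>

typedef struct DirtyBuffer
{
    unsigned int file_id;       /* which underlying file */
    unsigned int block_no;      /* block offset within that file */
    char        *data;          /* the 8K page image */
} DirtyBuffer;

static int
cmp_file_offset(const void *a, const void *b)
{
    const DirtyBuffer *x = a;
    const DirtyBuffer *y = b;

    if (x->file_id != y->file_id)
        return (x->file_id < y->file_id) ? -1 : 1;
    if (x->block_no != y->block_no)
        return (x->block_no < y->block_no) ? -1 : 1;
    return 0;
}

/* Hypothetical low-level write of one buffer; assumed to exist. */
extern void flush_buffer(const DirtyBuffer *buf);

static void
flush_in_offset_order(DirtyBuffer *bufs, size_t n)
{
    /* Sort by (file, offset) so the head sweeps in one direction. */
    qsort(bufs, n, sizeof(DirtyBuffer), cmp_file_offset);

    for (size_t i = 0; i < n; i++)
        flush_buffer(&bufs[i]);
}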
tom lane also wrote:
> Well, not really all that isolated. The bottom-level index code doesn't
> know whether you're doing INSERT or UPDATE, and would have no easy
> access to the original tuple if it did know. The original theory about
> this was that the planner could detect the situation where the index(es)
> don't overlap the set of columns being changed by the UPDATE, which
> would be nice since there'd be zero runtime overhead. Unfortunately
> that breaks down if any BEFORE UPDATE triggers are fired that modify the
> tuple being stored. So all in all it turns out to be a tad messy to fit
> this in :-(. I am unconvinced that the impact would be huge anyway,
> especially as of 7.3 which has a shortcut path for dead index entries.
Well, this probably is not the right place to start then.

- Curtis
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M29945@postgresql.org Thu Oct 3 18:47:34 2002
Return-path: <pgsql-hackers-owner+M29945@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g93MlWU26068
for <pgman@candle.pha.pa.us>; Thu, 3 Oct 2002 18:47:32 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id F2AAE476306; Thu, 3 Oct 2002 18:47:27 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id E7B5247604F; Thu, 3 Oct 2002 18:47:24 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 9ADCC4761A1
for <pgsql-hackers@postgresql.org>; Thu, 3 Oct 2002 18:47:18 -0400 (EDT)
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
by postgresql.org (Postfix) with ESMTP id DDB0B476187
for <pgsql-hackers@postgresql.org>; Thu, 3 Oct 2002 18:47:17 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.12.5/8.12.5) with ESMTP id g93MlIhR015091;
Thu, 3 Oct 2002 18:47:18 -0400 (EDT)
To: "Curtis Faith" <curtis@galtair.com>
cc: "Pgsql-Hackers" <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Advice: Where could I be of help?
In-Reply-To: <DMEEJMCDOJAKPPFACMPMGEBNCEAA.curtis@galtair.com>
References: <DMEEJMCDOJAKPPFACMPMGEBNCEAA.curtis@galtair.com>
Comments: In-reply-to "Curtis Faith" <curtis@galtair.com>
message dated "Thu, 03 Oct 2002 18:17:55 -0400"
Date: Thu, 03 Oct 2002 18:47:17 -0400
Message-ID: <15090.1033685237@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Status: OR
"Curtis Faith" <curtis@galtair.com> writes:
> Then during execution if the planner turned out to be VERY wrong about
> certain assumptions the execution system could update the stats that led to
> those wrong assumptions. That way the system would seek the correct values
> automatically.
That has been suggested before, but I'm unsure how to make it work.
There are a lot of parameters involved in any planning decision and it's
not obvious which ones to tweak, or in which direction, if the plan
turns out to be bad. But if you can come up with some ideas, go to
it!
> Everytime a query which requires the index scan runs it will blow out the
> entire cache since the scan will load more blocks than the cache
> holds.
Right, that's the scenario that kills simple LRU ...
> LRU-2 might be better but it seems like it still won't give enough priority
> to the most frequently used blocks.
Blocks touched more than once per query (like the upper-level index
blocks) will survive under LRU-2. Blocks touched once per query won't.
Seems to me that it should be a win.
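For reference, the LRU-2 bookkeeping being discussed amounts to something
like this sketch (names are invented; tie-breaking among once-touched
buffers is ignored):

/*
 * Remember the last two access times per buffer and evict the buffer
 * whose second-most-recent access is oldest.  Buffers touched only once
 * lose to buffers touched repeatedly, like upper-level index pages.
 */
#include <limits.h>

#define NBUF 16

typedef struct Lru2Entry
{
    long last;                  /* most recent access time, 0 = never */
    long prev;                  /* access time before that, 0 = touched at most once */
} Lru2Entry;

static Lru2Entry lru2[NBUF];
static long access_clock = 0;

static void
lru2_touch(int buf)
{
    lru2[buf].prev = lru2[buf].last;
    lru2[buf].last = ++access_clock;
}

/* Victim = buffer with the oldest second-most-recent access. */
static int
lru2_victim(void)
{
    int  victim = 0;
    long oldest = LONG_MAX;

    for (int i = 0; i < NBUF; i++)
    {
        if (lru2[i].prev < oldest)      /* 0 (touched at most once) sorts first */
        {
            oldest = lru2[i].prev;
            victim = i;
        }
    }
    return victim;
}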
> My modification was to use access counts to increase the durability of the
> more accessed blocks.
You could do it that way too, but I'm unsure whether the extra
complexity will buy anything. Ultimately, I think an LRU-anything
algorithm is equivalent to a clock sweep for those pages that only get
touched once per some-long-interval: the single-touch guys get recycled
in order of last use, which seems just like a clock sweep around the
cache. The guys with some amount of preference get excluded from the
once-around sweep. To determine whether LRU-2 is better or worse than
some other preference algorithm requires a finer grain of analysis than
this. I'm not a fan of "more complex must be better", so I'd want to see
why it's better before buying into it ...
> The kinds of things I was thinking about should be very portable. I found
> that simply writing the cache in order of the file system offset results in
> very greatly improved performance since it lets the head seek in smaller
> increments and much more smoothly, especially with modern disks.
Shouldn't the OS be responsible for scheduling those writes
appropriately? Ye good olde elevator algorithm ought to handle this;
and it's at least one layer closer to the actual disk layout than we are,
thus more likely to issue the writes in a good order. It's worth
experimenting with, perhaps, but I'm pretty dubious about it.
BTW, one other thing that Vadim kept saying we should do is alter the
cache management strategy to retain dirty blocks in memory (ie, give some
amount of preference to as-yet-unwritten dirty pages compared to clean
pages). There is no reliability cost here since the WAL will let us
reconstruct any dirty pages if we crash before they get written; and the
periodic checkpoints will ensure that we eventually write a dirty block
and thus it will become available for recycling.

This seems like a promising line of thought that's orthogonal to the basic
LRU-vs-whatever issue. Nobody's got round to looking at it yet though.
I've got no idea how much preference should be given to a dirty block
--- not infinite, probably, but some.
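One possible shape for that "some, but not infinite" preference, sketched
as a clock-style sweep in which a cold dirty buffer earns a bounded
number of extra passes (all names and the bonus value are invented for
illustration; a real system would write the dirty page before reuse):

/*
 * Sketch of dirty-page preference during victim selection.
 */
#include <stdbool.h>

#define NBUF 1024

typedef struct SweepBuf
{
    int  usage;                 /* recency counter decremented by the sweep */
    bool dirty;                 /* page not yet written back (WAL protects it) */
    bool boosted;               /* already received its dirty-page bonus */
} SweepBuf;

static SweepBuf buf[NBUF];
static int sweep_hand = 0;
static const int dirty_bonus = 2;   /* tunable amount of extra preference */

/* Return the index of the next buffer to recycle. */
static int
pick_victim(void)
{
    for (;;)
    {
        SweepBuf *b = &buf[sweep_hand];
        int       cur = sweep_hand;

        sweep_hand = (sweep_hand + 1) % NBUF;

        if (b->usage > 0)
        {
            b->usage--;                 /* ordinary clock decrement */
        }
        else if (b->dirty && !b->boosted)
        {
            b->usage = dirty_bonus;     /* cold but dirty: keep it a while longer */
            b->boosted = true;          /* cleared again once the page is written */
        }
        else
        {
            /* Cold and clean, or dirty but already given its extra passes. */
            b->boosted = false;
            return cur;
        }
    }
}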
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
From pgsql-hackers-owner+M29974@postgresql.org Fri Oct 4 01:28:54 2002
Return-path: <pgsql-hackers-owner+M29974@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g945SpU13476
for <pgman@candle.pha.pa.us>; Fri, 4 Oct 2002 01:28:52 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id 63999476BB2; Fri, 4 Oct 2002 01:26:56 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id BB7CA476B85; Fri, 4 Oct 2002 01:26:54 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 5FD7E476759
for <pgsql-hackers@postgresql.org>; Fri, 4 Oct 2002 01:26:52 -0400 (EDT)
Received: from mclean.mail.mindspring.net (mclean.mail.mindspring.net [207.69.200.57])
by postgresql.org (Postfix) with ESMTP id 1F4A14766D8
for <pgsql-hackers@postgresql.org>; Fri, 4 Oct 2002 01:26:51 -0400 (EDT)
Received: from 1cust163.tnt1.st-thomas.vi.da.uu.net ([200.58.4.163] helo=CurtisVaio)
by mclean.mail.mindspring.net with smtp (Exim 3.33 #1)
id 17xKzB-0000yK-00; Fri, 04 Oct 2002 01:26:49 -0400
From: "Curtis Faith" <curtis@galtair.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
cc: "Pgsql-Hackers" <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Advice: Where could I be of help?
Date: Fri, 4 Oct 2002 01:26:36 -0400
Message-ID: <DMEEJMCDOJAKPPFACMPMIECECEAA.curtis@galtair.com>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
In-Reply-To: <15090.1033685237@sss.pgh.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Status: OR
I wrote:
> > My modification was to use access counts to increase the
> > durability of the
> > more accessed blocks.
tom lane replies:
> You could do it that way too, but I'm unsure whether the extra
> complexity will buy anything. Ultimately, I think an LRU-anything
> algorithm is equivalent to a clock sweep for those pages that only get
> touched once per some-long-interval: the single-touch guys get recycled
> in order of last use, which seems just like a clock sweep around the
> cache. The guys with some amount of preference get excluded from the
> once-around sweep. To determine whether LRU-2 is better or worse than
> some other preference algorithm requires a finer grain of analysis than
> this. I'm not a fan of "more complex must be better", so I'd want to see
> why it's better before buying into it ...
I'm definitely not a fan of "more complex must be better" either. In
fact, it's surprising how often the real performance problems are easy to
fix and simple, while many person-years are spent solving the issue
everyone "knows" must be causing the performance problems, only to find
little gain.

The key here is empirical testing. If the cache hit ratio for LRU-2 is
much better then there may be no need here. OTOH, it took less than 30
lines or so of code to do what I described, so I don't consider it too,
too "more complex" :=}  We should run a test which includes running
indexes (or is indices the PostgreSQL convention?) that are three or more
times the size of the cache to see how well LRU-2 works. Is there any
cache performance reporting built into pgsql?
tom lane wrote:
> Shouldn't the OS be responsible for scheduling those writes
> appropriately? Ye good olde elevator algorithm ought to handle this;
> and it's at least one layer closer to the actual disk layout than we
> are, thus more likely to issue the writes in a good order. It's worth
> experimenting with, perhaps, but I'm pretty dubious about it.
I wasn't proposing anything other than changing the order of the writes,
not actually ensuring that they get written that way at the level you
describe above. This will help a lot on brain-dead file systems that
can't do this ordering and probably also in cases where the number of
blocks in the cache is very large.
On a related note, while looking at the code, it seems to me that we
are writing out the buffer cache synchronously, so there won't be any
possibility of the file system reordering anyway. This appears to be a
huge performance problem. I've read claims in the archives that the
buffers are written asynchronously, but my read of the code says
otherwise. Can someone point out my error?

I only see calls that ultimately call FileWrite or write(2) which will
block without a O_NOBLOCK open. I thought one of the main reasons for
having a WAL is so that you can write out the buffers asynchronously.
What am I missing?
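For illustration only, asynchronous buffer writes at the OS level could
look like this POSIX AIO sketch using aio_write() from <aio.h>; it shows
the mechanism being asked about, not how PostgreSQL actually issues its
writes:

/*
 * Queue an 8K page write without blocking, then poll for completion.
 */
#include <aio.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define BLOCK_SIZE 8192

/* Queue one page for writing; returns immediately. */
static int
start_async_write(struct aiocb *cb, int fd, const char *page, off_t offset)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes = fd;
    cb->aio_buf = (volatile void *) page;
    cb->aio_nbytes = BLOCK_SIZE;
    cb->aio_offset = offset;

    if (aio_write(cb) != 0)
    {
        perror("aio_write");
        return -1;
    }
    return 0;
}

/* Later, check whether the write finished (non-blocking poll). */
static int
async_write_done(struct aiocb *cb)
{
    int err = aio_error(cb);

    if (err == EINPROGRESS)
        return 0;                       /* still being written */
    if (err != 0)
    {
        errno = err;
        perror("async write failed");
        return -1;
    }
    return (aio_return(cb) == BLOCK_SIZE) ? 1 : -1;
}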
I wrote:
> > Then during execution if the planner turned out to be VERY wrong about
> > certain assumptions the execution system could update the stats that
> > led to those wrong assumptions. That way the system would seek the
> > correct values automatically.
tom lane replied:
> That has been suggested before, but I'm unsure how to make it work.
> There are a lot of parameters involved in any planning decision and it's
> not obvious which ones to tweak, or in which direction, if the plan
> turns out to be bad. But if you can come up with some ideas, go to
> it!
I'll have to look at the current planner before I can suggest
anything concrete.
- Curtis
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org