Commit 2b721d3d authored by Bruce Momjian

Remove TODO.detail files that contained useless or very old information.

Update TODO accordingly.
parent 5de02e28
From fjoe@iclub.nsu.ru Tue Jan 23 03:38:45 2001
Received: from mx.nsu.ru (root@mx.nsu.ru [193.124.215.71])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA14458
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 03:38:24 -0500 (EST)
Received: from iclub.nsu.ru (root@iclub.nsu.ru [193.124.222.66])
by mx.nsu.ru (8.9.1/8.9.0) with ESMTP id OAA29153;
Tue, 23 Jan 2001 14:31:27 +0600 (NOVT)
Received: from localhost (fjoe@localhost)
by iclub.nsu.ru (8.11.1/8.11.1) with ESMTP id f0N8VOr15273;
Tue, 23 Jan 2001 14:31:25 +0600 (NS)
(envelope-from fjoe@iclub.nsu.ru)
Date: Tue, 23 Jan 2001 14:31:24 +0600 (NS)
From: Max Khon <fjoe@iclub.nsu.ru>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Bug in FOREIGN KEY
In-Reply-To: <200101230416.XAA04293@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0101231429310.12474-100000@iclub.nsu.ru>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO
hi, there!
On Mon, 22 Jan 2001, Bruce Momjian wrote:
>
> > This problem with foreign keys has been reported to me, and I have confirmed
> > the bug exists in current sources. The DELETE should succeed:
> >
> > ---------------------------------------------------------------------------
> >
> > CREATE TABLE primarytest2 (
> > col1 INTEGER,
> > col2 INTEGER,
> > PRIMARY KEY(col1, col2)
> > );
> >
> > CREATE TABLE foreigntest2 (col3 INTEGER,
> > col4 INTEGER,
> > FOREIGN KEY (col3, col4) REFERENCES primarytest2
> > );
> > test=> BEGIN;
> > BEGIN
> > test=> INSERT INTO primarytest2 VALUES (5,5);
> > INSERT 27618 1
> > test=> DELETE FROM primarytest2 WHERE col1 = 5 AND col2 = 5;
> > ERROR: triggered data change violation on relation "primarytest2"
I have another (slightly different) example:
--- cut here ---
test=> CREATE TABLE pr(obj_id int PRIMARY KEY);
NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'pr_pkey' for
table 'pr'
CREATE
test=> CREATE TABLE fr(obj_id int REFERENCES pr ON DELETE CASCADE);
NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
test=> BEGIN;
BEGIN
test=> INSERT INTO pr (obj_id) VALUES (1);
INSERT 200539 1
test=> INSERT INTO fr (obj_id) SELECT obj_id FROM pr;
INSERT 200540 1
test=> DELETE FROM fr;
ERROR: triggered data change violation on relation "fr"
test=>
--- cut here ---
we are running postgresql 7.1 beta3
/fjoe
From sszabo@megazone23.bigpanda.com Tue Jan 23 13:41:55 2001
Received: from megazone23.bigpanda.com (rfx-64-6-210-138.users.reflexcom.com [64.6.210.138])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA19924
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 13:41:54 -0500 (EST)
Received: from localhost (sszabo@localhost)
by megazone23.bigpanda.com (8.11.1/8.11.1) with ESMTP id f0NIfLa41018;
Tue, 23 Jan 2001 10:41:21 -0800 (PST)
Date: Tue, 23 Jan 2001 10:41:21 -0800 (PST)
From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Bug in FOREIGN KEY
In-Reply-To: <200101230417.XAA04332@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0101231031290.40955-100000@megazone23.bigpanda.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO
> > Think I misinterpreted the SQL3 specs WRT this detail. The
> > checks must be made per statement, not at the transaction
> > level. I'll try to fix it, but we need to define what will
> > happen with referential actions in the case of conflicting
> > actions on the same key - there are some possible conflicts:
> >
> > 1. DEFERRED ON DELETE NO ACTION or RESTRICT
> >
> > Do the referencing rows now refer to the new PK row with
> > the same key, or is this still a constraint
> > violation? I would say it's not, because the constraint
> > condition is satisfied at the end of the transaction. How
> > do other databases behave?
> >
> > 2. DEFERRED ON DELETE CASCADE, SET NULL or SET DEFAULT
> >
> > Again I'd say that the action should be suppressed
> > because a matching PK row is present at transaction end -
> > it's not the same old row, but the constraint itself is
> > still satisfied.
I'm not actually sure about cascade, set null and set default. The
way they are written seems to imply to me that the action is based on
the state of the database before/after the command in question, as
opposed to the deferred state of the database. The text talks about
updating the state of partially matching rows immediately after the
delete/update of the row, which wouldn't really make sense when
deferred. Does anyone know what other systems do with a case like this,
all in one transaction:
create table a (a int primary key);
create table b (b int references a match full on update cascade
on delete cascade deferrable initially deferred);
insert into a values (1);
insert into a values (2);
insert into b values (1);
delete from a where a=1;
select * from b;
commit;
From pgsql-hackers-owner+M3901@postgresql.org Fri Jan 26 17:00:24 2001
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10576
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 17:00:24 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLtVq53019;
Fri, 26 Jan 2001 16:55:31 -0500 (EST)
(envelope-from pgsql-hackers-owner+M3901@postgresql.org)
Received: from smtp1b.mail.yahoo.com (smtp3.mail.yahoo.com [128.11.68.135])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLqmq52691
for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 16:52:48 -0500 (EST)
(envelope-from janwieck@yahoo.com)
Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 22:49:57 -0000
X-Apparently-From: <janwieck@yahoo.com>
Received: (from janwieck@localhost)
by jupiter.greatbridge.com (8.9.3/8.9.3) id RAA04701;
Fri, 26 Jan 2001 17:02:32 -0500
From: Jan Wieck <janwieck@Yahoo.com>
Message-Id: <200101262202.RAA04701@jupiter.greatbridge.com>
Subject: Re: [HACKERS] Bug in FOREIGN KEY
In-Reply-To: <200101262110.QAA06902@candle.pha.pa.us> from Bruce Momjian at "Jan
26, 2001 04:10:22 pm"
To: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Fri, 26 Jan 2001 17:02:32 -0500 (EST)
CC: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: RO
Bruce Momjian wrote:
> Here is another bug:
>
> test=> begin;
> BEGIN
> test=> INSERT INTO primarytest2 VALUES (5,5);
> INSERT 18757 1
> test=> UPDATE primarytest2 SET col2=1 WHERE col1 = 5 AND col2 = 5;
> ERROR: deferredTriggerGetPreviousEvent: event for tuple (0,10) not
> found
Schema?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
From pgsql-hackers-owner+M3864@postgresql.org Fri Jan 26 10:07:36 2001
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA17732
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 10:07:35 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QF3lq12782;
Fri, 26 Jan 2001 10:03:47 -0500 (EST)
(envelope-from pgsql-hackers-owner+M3864@postgresql.org)
Received: from mailout00.sul.t-online.com (mailout00.sul.t-online.com [194.25.134.16])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0QF0Yq12614
for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 10:00:34 -0500 (EST)
(envelope-from peter_e@gmx.net)
Received: from fwd01.sul.t-online.com
by mailout00.sul.t-online.com with smtp
id 14MALp-0006Im-00; Fri, 26 Jan 2001 15:59:45 +0100
Received: from peter.localdomain (520083510237-0001@[212.185.245.73]) by fmrl01.sul.t-online.com
with esmtp id 14MALQ-1Z0gkaC; Fri, 26 Jan 2001 15:59:20 +0100
Date: Fri, 26 Jan 2001 16:07:27 +0100 (CET)
From: Peter Eisentraut <peter_e@gmx.net>
To: Hiroshi Inoue <Inoue@tpf.co.jp>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Open 7.1 items
In-Reply-To: <3A70FA87.933B3D51@tpf.co.jp>
Message-ID: <Pine.LNX.4.30.0101261604030.769-100000@peter.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Sender: 520083510237-0001@t-dialin.net
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: RO
Hiroshi Inoue writes:
> What does this item mean ?
> Is it the following ?
>
> begin;
> insert into pk (id) values (1);
> update(delete from) pk where id=1;
> ERROR: triggered data change violation on relation "pk"
>
> If so, isn't it a simple bug ?
Depends on the definition of "bug". It's not spec compliant and it's not
documented and it's annoying. But it's been like this for a year and the
issue is well known and can normally be avoided. It looks like a
documentation to-do to me.
--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
From pgsql-hackers-owner+M3876@postgresql.org Fri Jan 26 13:07:10 2001
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA26086
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 13:07:09 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI4Vq30248;
Fri, 26 Jan 2001 13:04:31 -0500 (EST)
(envelope-from pgsql-hackers-owner+M3876@postgresql.org)
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI3Aq30098
for <pgsql-hackers@postgreSQL.org>; Fri, 26 Jan 2001 13:03:11 -0500 (EST)
(envelope-from vmikheev@SECTORBASE.COM)
Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
id <D49FAF71>; Fri, 26 Jan 2001 09:41:23 -0800
Message-ID: <8F4C99C66D04D4118F580090272A7A234D32C1@sectorbase1.sectorbase.com>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Jan Wieck'" <janwieck@Yahoo.com>,
PostgreSQL HACKERS
<pgsql-hackers@postgresql.org>,
Bruce Momjian <root@candle.pha.pa.us>
Subject: RE: [HACKERS] Open 7.1 items
Date: Fri, 26 Jan 2001 10:02:59 -0800
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
charset="iso-8859-1"
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: RO
> > FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
>
> A well known issue, and I've asked multiple times how exactly
> we want to define the behaviour for deferred constraints. Do
> foreign keys refer just to a key value and are happy with
> its existence, or do they refer to a particular row?
I think the first. The latter is closer to the OODBMS world, not the [O]RDBMS one.
> Consider you have a deferred "ON DELETE CASCADE" constraint
> and do a DELETE, INSERT of a PK. Do the FK rows need to be
> deleted or not?
Good example. I think the FK rows should not be deleted. If someone
really wants to delete the "old" FK rows, he can do
DELETE PK;
SET CONSTRAINTS ... IMMEDIATE; -- FK rows are deleted here
INSERT PK;
> Consider you have a deferred "ON DELETE RESTRICT" and "ON
> UPDATE CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
> to PK1, the FK2 rows need to follow, but does PK2 inherit all
> FK1 rows now so it's the master of both groups?
Yes. Again, one can use SET CONSTRAINTS to achieve the desired result.
It seems that SET CONSTRAINTS was designed for this purpose, i.e.
for better flexibility.
Still, it would be better to look at how other databases handle all
these cases -:)
Vadim
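The "key value" reading Vadim favours can be illustrated with a toy model. This is only a sketch under stated assumptions, not how PostgreSQL's RI triggers work: the class `DeferredCascadeTxn` and its method names are hypothetical, and both tables are reduced to plain Python collections of key values. It shows a deferred ON DELETE CASCADE that is judged by key value when the check finally runs, and the effect of forcing the check between the DELETE and the INSERT (the role SET CONSTRAINTS ... IMMEDIATE plays in the example above).

```python
class DeferredCascadeTxn:
    """Toy model of a deferred ON DELETE CASCADE checked by key value."""

    def __init__(self, pk_keys, fk_refs):
        self.pk = set(pk_keys)          # key values present in the PK table
        self.fk = list(fk_refs)         # key values stored in FK rows
        self.pending_deletes = []       # PK deletes whose action is deferred

    def delete_pk(self, key):
        self.pk.discard(key)
        self.pending_deletes.append(key)

    def insert_pk(self, key):
        self.pk.add(key)

    def check_constraints_now(self):
        """Run the queued cascade actions (hypothetical stand-in for
        SET CONSTRAINTS ... IMMEDIATE)."""
        for key in self.pending_deletes:
            # Key-value semantics: cascade only if no PK row with this
            # key exists at the moment the check runs.
            if key not in self.pk:
                self.fk = [ref for ref in self.fk if ref != key]
        self.pending_deletes.clear()

    def commit(self):
        self.check_constraints_now()
        return sorted(self.pk), sorted(self.fk)


if __name__ == "__main__":
    # DELETE then INSERT of the same key: FK rows survive.
    txn = DeferredCascadeTxn([1], [1, 1])
    txn.delete_pk(1)
    txn.insert_pk(1)
    print(txn.commit())

    # Forcing the check in between deletes the "old" FK rows.
    txn = DeferredCascadeTxn([1], [1, 1])
    txn.delete_pk(1)
    txn.check_constraints_now()
    txn.insert_pk(1)
    print(txn.commit())
```

Under the alternative "particular row" reading, the first case would also lose its FK rows, which is exactly the ambiguity being debated here.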
From janwieck@yahoo.com Fri Jan 26 12:20:27 2001
Received: from smtp6.mail.yahoo.com (smtp6.mail.yahoo.com [128.11.69.103])
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22158
for <root@candle.pha.pa.us>; Fri, 26 Jan 2001 12:20:27 -0500 (EST)
Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 17:20:26 -0000
X-Apparently-From: <janwieck@yahoo.com>
Received: (from janwieck@localhost)
by jupiter.greatbridge.com (8.9.3/8.9.3) id MAA03196;
Fri, 26 Jan 2001 12:30:05 -0500
From: Jan Wieck <janwieck@yahoo.com>
Message-Id: <200101261730.MAA03196@jupiter.greatbridge.com>
Subject: Re: [HACKERS] Open 7.1 items
To: PostgreSQL HACKERS <pgsql-hackers@postgreSQL.org>,
Bruce Momjian <root@candle.pha.pa.us>
Date: Fri, 26 Jan 2001 12:30:05 -0500 (EST)
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Status: RO
Bruce Momjian wrote:
> Here are my open 7.1 items. Thanks for shrinking the list so far.
>
> ---------------------------------------------------------------------------
>
> FreeBSD locale bug
> Reorder INSERT firing in rules
I don't recall why this is wanted. AFAIK there's no reason
NOT to do so, except that we are currently far too
close to a release candidate.
> Philip Warner UPDATE crash
> JDBC LargeObject short read return value missing
> SELECT cash_out(1) crashes all backends
> LAZY VACUUM
> FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
A well known issue, and I've asked multiple times how exactly
we want to define the behaviour for deferred constraints. Do
foreign keys refer just to a key value and are happy with
its existence, or do they refer to a particular row?
Consider you have a deferred "ON DELETE CASCADE" constraint
and do a DELETE, INSERT of a PK. Do the FK rows need to be
deleted or not?
Consider you have a deferred "ON DELETE RESTRICT" and "ON
UPDATE CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
to PK1, the FK2 rows need to follow, but does PK2 inherit all
FK1 rows now so it's the master of both groups?
These are only two possible combinations. There are many to
think of. As said, I've asked before, but no one has voted yet.
Move the item to 7.2 anyway, because changing this behaviour
would require massive changes in the trigger queue *and* the
generic RI triggers, which cannot be tested enough any more.
Jan
> Usernames limited in length
> Does pg_dump preserve COMMENTs?
> Failure of nested cursors in JDBC
> JDBC setMaxRows() is global variable affecting other objects
> Does JDBC Makefile need current dir?
> Fix for pg_dump of bad system tables
> Steve Howe failure query with rules
> ODBC/JDBC not disconnecting properly?
> Magnus Hagander ODBC issues?
> Merge MySQL/PgSQL translation scripts
> Fix ipcclean on Linux
> Merge global and template BKI files?
>
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman@candle.pha.pa.us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
From pgsql-general-owner+M590@postgresql.org Tue Nov 14 16:30:40 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA22313
for <pgman@candle.pha.pa.us>; Tue, 14 Nov 2000 17:30:39 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAEMSJs66979;
Tue, 14 Nov 2000 17:28:21 -0500 (EST)
(envelope-from pgsql-general-owner+M590@postgresql.org)
Received: from megazone23.bigpanda.com (138.210.6.64.reflexcom.com [64.6.210.138])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAEMREs66800
for <pgsql-general@postgresql.org>; Tue, 14 Nov 2000 17:27:14 -0500 (EST)
(envelope-from sszabo@megazone23.bigpanda.com)
Received: from localhost (sszabo@localhost)
by megazone23.bigpanda.com (8.11.1/8.11.0) with ESMTP id eAEMPpH69059;
Tue, 14 Nov 2000 14:25:51 -0800 (PST)
Date: Tue, 14 Nov 2000 14:25:51 -0800 (PST)
From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
To: "Beth K. Gatewood" <bethg@mbt.washington.edu>
cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] a request for some experienced input.....
In-Reply-To: <3A11ACA1.E5D847DD@mbt.washington.edu>
Message-ID: <Pine.BSF.4.21.0011141403380.68986-100000@megazone23.bigpanda.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-general-owner@postgresql.org
Status: OR
On Tue, 14 Nov 2000, Beth K. Gatewood wrote:
> >
>
> Stephan-
>
> Thank you so much for taking the effort to answer these questions. Your
> help is truly appreciated....
>
> I just have a few points for clarification.
>
> >
> > MATCH PARTIAL is a specific match type which describes which rows are
> > considered matching rows for purposes of meeting or failing the
> > constraint. (In match partial, a fktable (NULL, 2) would match a pk
> > table (1,2) as well as a pk table (2,2). It's different from match
> > full in which case (NULL,2) would be invalid or match unspecified
> > in which case it would match due to the existence of the NULL in any
> > case). There are some bizarre implementation details involved with
> > it and it's different from the others in ways that make it difficult.
> > It's in my list of things to do, but I haven't come up with an acceptable
> > mechanism in my head yet.
>
> Does this mean that I currently cannot have foreign keys with null values?
Not exactly...
Match full = In FK row, all columns must be NULL or the value of each
column must not be null and there is a row in the PK table where
each referencing column equals the corresponding referenced
column.
Unspecified = In FK row, at least one column must be NULL or each
referencing column shall be equal to the corresponding referenced
column in some row of the referenced table
Match partial is similar to match full, except that we ignore the null
columns for the purposes of the "each referencing column equals" check.
For example:
PK Table Key values: (1,2), (1,3), (3,3)
Attempted FK Table Key values: (1,2), (1,NULL), (5,NULL), (NULL, NULL)
(hopefully I get this right)...
In match full, only the 1st and 4th fk values are valid.
In match partial, the 1st, 2nd, and 4th fk values are valid.
In match unspecified, all the fk values are valid.
The other note is that generally speaking, all three are basically the
same for the single column key. If you're only doing references on one
column, the match type is mostly meaningless.
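The three match types described above can be written down as a small predicate. This is a sketch of the rules as stated in this message, not PostgreSQL code; the function name `fk_row_is_valid` and the string labels for the match types are made up for illustration, and NULL is modelled as Python's `None`.

```python
def fk_row_is_valid(fk_row, pk_rows, match):
    """Return True if fk_row satisfies the constraint against pk_rows.

    fk_row  -- tuple of referencing column values (None means NULL)
    pk_rows -- collection of tuples holding the referenced key values
    match   -- "full", "partial" or "unspecified" (labels are assumptions)
    """
    fk_row = tuple(fk_row)
    nulls = [col is None for col in fk_row]

    if match == "full":
        # All columns NULL, or no column NULL and an exact match exists.
        if all(nulls):
            return True
        if any(nulls):
            return False
        return fk_row in pk_rows

    if match == "partial":
        # Like full, but NULL columns are ignored in the comparison.
        if all(nulls):
            return True
        return any(
            all(f is None or f == p for f, p in zip(fk_row, pk_row))
            for pk_row in pk_rows
        )

    if match == "unspecified":
        # Any NULL column makes the row valid; otherwise exact match.
        if any(nulls):
            return True
        return fk_row in pk_rows

    raise ValueError("unknown match type: %r" % (match,))


if __name__ == "__main__":
    pk = [(1, 2), (1, 3), (3, 3)]
    fk = [(1, 2), (1, None), (5, None), (None, None)]
    for match in ("full", "partial", "unspecified"):
        print(match, [fk_row_is_valid(row, pk, match) for row in fk])
```

Run on the example key values, the predicate reproduces the three results listed above. For single-column keys the three branches agree, which is the point of the closing remark.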
> > PENDANT adds that for each row of the referenced table the values of
> > the specified column(s) are the same as the values of the specified
> > column(s) in some row of the referencing tables.
>
> I am not sure I know what you mean here.....Are you saying that the value for
> the FK column must match the value for the PK column?
I haven't really looked at PENDANT, the above was just a small rewrite of
some descriptive text in the sql99 draft I have. There's a whole bunch
of rules in the actual text of the referential constraint definition.
The base stuff seems to be: (Rf is the referencing columns, T is the
referenced table)
3) If PENDANT is specified, then:
   a) For a given row in the referencing table, let pendant
      reference designate an instance in which all Rf are
      non-null.
   b) Let number of pendant paths be the number of pendant
      references to the same referenced row in a referenced table
      from all referencing rows in all base tables.
   c) For every row in T, the number of pendant paths is equal to
      or greater than 1.
So, I'd read it as every row in T must have at least one referencing row
in some base table.
There are some details about updates and that you can't mix PENDANT and
MATCH PARTIAL or SET DEFAULT actions.
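The reading of rule 3 given above (every row in T needs at least one fully non-null referencing row) can be sketched as a counting check. This follows the interpretation in this message only and is not taken from any implementation; the function name `pendant_violations` is hypothetical, and a single referencing table stands in for "all base tables".

```python
def pendant_violations(referenced_keys, referencing_rows):
    """Return the referenced keys that have no pendant reference.

    referenced_keys  -- list of key tuples in the referenced table T
    referencing_rows -- list of tuples of referencing column values (Rf),
                        with None standing for NULL
    """
    # Number of pendant paths per referenced row (rule 3b).
    pendant_paths = {key: 0 for key in referenced_keys}
    for row in referencing_rows:
        # A pendant reference requires all Rf to be non-null (rule 3a).
        if all(col is not None for col in row) and row in pendant_paths:
            pendant_paths[row] += 1
    # Rule 3c: every row in T needs at least one pendant path.
    return [key for key, count in pendant_paths.items() if count < 1]


if __name__ == "__main__":
    keys = [(1, 2), (1, 3), (3, 3)]
    refs = [(1, 2), (1, None), (3, 3), (3, 3)]
    print(pendant_violations(keys, refs))
```

In the usage example the row `(1, None)` does not count for any key because one of its columns is NULL, so `(1, 3)` is reported as lacking a referencing row.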
> > The main issues in 7.0 are that older versions (might be fixed in
> > 7.0.3) would fail very badly if you used alter table to rename tables that
> > were referenced in a fk constraint and that you need to give update
> > permission to the referenced table. For the former, 7.1 will (and 7.0.3
> > may) give an elog(ERROR) to you rather than crashing the backend and the
> > latter should be fixed for 7.1 (although you still need to have write
> > perms to the referencing table for referential actions to work properly)
>
> Are the steps to this outlined somewhere then?
The permissions stuff is just a matter of using GRANT and REVOKE to set
the permissions that a user has to a table.
From pgsql-hackers-owner+M908@postgresql.org Sun Nov 19 14:27:43 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA10885
for <pgman@candle.pha.pa.us>; Sun, 19 Nov 2000 14:27:42 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAJJSMs83653;
Sun, 19 Nov 2000 14:28:22 -0500 (EST)
(envelope-from pgsql-hackers-owner+M908@postgresql.org)
Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46] (may be forged))
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAJJQns83565
for <pgsql-hackers@postgreSQL.org>; Sun, 19 Nov 2000 14:26:49 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id OAA06790;
Sun, 19 Nov 2000 14:23:06 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200011191923.OAA06790@candle.pha.pa.us>
Subject: Re: [HACKERS] WAL fsync scheduling
In-Reply-To: <002101c0525e$2d964480$b97a30d0@sectorbase.com> "from Vadim Mikheev
at Nov 19, 2000 11:23:19 am"
To: Vadim Mikheev <vmikheev@sectorbase.com>
Date: Sun, 19 Nov 2000 14:23:06 -0500 (EST)
CC: Tom Samplonius <tom@sdf.com>, Alfred@candle.pha.pa.us,
Perlstein <bright@wintelcom.net>, Larry@candle.pha.pa.us,
Rosenman <ler@lerctr.org>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
[ Charset ISO-8859-1 unsupported, converting... ]
> > There are two parts to transaction commit. The first is writing all
> > dirty buffers or log changes to the kernel, and second is fsync of the
> ^^^^^^^^^^^^
> Backend doesn't write any dirty buffer to the kernel at commit time.
Yes, I suspected that.
>
> > log file.
>
> The first part is writing commit record into WAL buffers in shmem.
> This is what XLogInsert does. After that XLogFlush is called to ensure
> that entire commit record is on disk. XLogFlush does *both* write() and
> fsync() (single slock is used for both writing and fsyncing) if it needs to
> do it at all.
Yes, I realize there are new steps in WAL.
>
> > I suggest having a per-backend shared memory byte that has the following
> > values:
> >
> > START_LOG_WRITE
> > WAIT_ON_FSYNC
> > NOT_IN_COMMIT
> > backend_number_doing_fsync
> >
> > I suggest that when each backend starts a commit, it sets its byte to
> > START_LOG_WRITE.
> ^^^^^^^^^^^^^^^^^^^^^^^
> Isn't START_COMMIT more meaningful?
Yes.
>
> > When it gets ready to fsync, it checks all backends.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^
> What do you mean by this? The moment just after XLogInsert?
Just before it calls fsync().
>
> > If all are NOT_IN_COMMIT, it does fsync and continues.
>
> 1st edition:
> > If one or more are in START_LOG_WRITE, it waits until no one is in
> > START_LOG_WRITE. It then checks all WAIT_ON_FSYNC, and if it is the
> > lowest backend in WAIT_ON_FSYNC, marks all others with its backend
> > number, and does fsync. It then clears all backends with its number to
> > NOT_IN_COMMIT. Other backend will see they are not the lowest
> > WAIT_ON_FSYNC and will wait for their byte to be set to NOT_IN_COMMIT
> > so they can then continue, knowing their data was synced.
>
> 2nd edition:
> > I have another idea. If a backend gets to the point that it needs
> > fsync, and there is another backend in START_LOG_WRITE, it can go to an
> > interruptible sleep, knowing another backend will perform the fsync and
> > wake it up. Therefore, there is no busy-wait or timed sleep.
> >
> > Of course, a backend must set its status to WAIT_ON_FSYNC to avoid a
> > race condition.
>
> The 2nd edition is much better. But I'm not sure we really need
> these per-backend bytes in shmem. Why not just have some counters?
> We can use a semaphore to wake-up all waiters at once.
Yes, that is much better and clearer. My idea was just to say, "if no
one is entering commit phase, do the commit. If someone else is coming,
sleep and wait for them to do the fsync and wake me up with a signal."
>
> > This allows a single backend not to sleep, and allows multiple backends
> > to bunch up only when they are all about to commit.
> >
> > The reason backend numbers are written is so other backends entering the
> > commit code will not interfere with the backends performing fsync.
>
> A woken-up backend can check what's been written/fsynced by calling XLogFlush.
Seems that may not be needed anymore with a counter. The only issue is
that other backends may enter commit while fsync() is happening. The
process that did the fsync must be sure to wake up only the backends
that were waiting for it, and not other backends that may also be
doing fsync as a group while the first fsync was happening. I leave
those details to people more experienced. :-)
I am just glad people liked my idea.
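The counter-based variant discussed above can be sketched with threads standing in for backends. This is a toy model under stated assumptions, not the WAL code: the class `GroupCommitLog`, its fields, and the helper `commit_concurrently` are hypothetical names, `time.sleep()` stands in for the write()+fsync() pair, and a condition variable replaces the semaphore or signal. Each committer records its position in the log, and whoever finds no sync in progress syncs everything inserted so far and wakes the sleepers, who then recheck whether their own record is covered.

```python
import threading
import time


class GroupCommitLog:
    def __init__(self, fsync_delay=0.01):
        self._cond = threading.Condition()
        self._flushing = False     # is some committer inside "fsync"?
        self._fsync_delay = fsync_delay
        self.inserted = 0          # commit records placed in the log buffer
        self.flushed = 0           # commit records known to be on disk
        self.fsync_calls = 0

    def commit(self):
        with self._cond:
            self.inserted += 1
            my_position = self.inserted
        self._flush(my_position)
        return my_position

    def _flush(self, position):
        while True:
            with self._cond:
                if self.flushed >= position:
                    return                 # someone else synced our record
                if self._flushing:
                    self._cond.wait()      # sleep until the syncer wakes us
                    continue               # then recheck from the top
                self._flushing = True
                target = self.inserted     # sync everything inserted so far
            time.sleep(self._fsync_delay)  # stand-in for write() + fsync()
            with self._cond:
                self.fsync_calls += 1
                self.flushed = target
                self._flushing = False
                self._cond.notify_all()    # wake all waiters at once


def commit_concurrently(num_backends, fsync_delay=0.01):
    log = GroupCommitLog(fsync_delay)
    threads = [threading.Thread(target=log.commit) for _ in range(num_backends)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log


if __name__ == "__main__":
    log = commit_concurrently(8)
    print("records flushed:", log.flushed, "fsync calls:", log.fsync_calls)
```

A lone committer never waits, and committers that arrive while a sync is running are handled by the next sync rather than being woken early. The number of sync calls depends on thread timing, so the sketch makes no claim about a specific count.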
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
Mon, 11 May 1998 11:14:43 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:14:43 -0400
Message-ID: <24913.894899683@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO
Brett McCormick <brett@work.chicken.org> writes:
> same way that the current network socket is passed -- through an execv
> argument. hopefully, however, the non-execv()ing fork will be in 6.4.
Um, you missed the point, Brett. David was hoping to transfer a client
connection from the postmaster to an *already existing* backend process.
Fork, with or without exec, solves the problem for a backend that's
started after the postmaster has accepted the client socket.
This does lead to a different line of thought, however. Pre-started
backends would have access to the "master" connection socket on which
the postmaster listens for client connections, right? Suppose that we
fire the postmaster as postmaster, and demote it to being simply a
manufacturer of new backend processes as old ones get used up. Have
one of the idle backend processes be the one doing the accept() on the
master socket. Once it has a client connection, it performs the
authentication handshake and then starts serving the client (or just
quits if authentication fails). Meanwhile the next idle backend process
has executed accept() on the master socket and is waiting for the next
client; and shortly the postmaster/factory/whateverwecallitnow notices
that it needs to start another backend to add to the idle-backend pool.
This'd probably need some interlocking among the backends. I have no
idea whether it'd be safe to have all the idle backends trying to
do accept() on the master socket simultaneously, but it sounds risky.
Better to use a mutex so that only one gets to do it while the others
sleep.
regards, tom lane
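The idle-backend pool with a mutex around accept() can be sketched as follows. This is an illustration only, with several assumptions: threads stand in for pre-started backend processes, a `threading.Lock` stands in for whatever cross-process mutex would really be used, the function name `run_pool` is hypothetical, and the authentication handshake is reduced to reading one message and replying. The part where the postmaster starts replacement backends is left out.

```python
import socket
import threading


def run_pool(num_workers=3, num_clients=5):
    # The "master" socket that all idle backends share.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick one
    listener.listen(16)
    listener.settimeout(0.1)              # lets workers notice shutdown
    port = listener.getsockname()[1]

    accept_mutex = threading.Lock()
    stop = threading.Event()

    def backend(worker_id):
        while not stop.is_set():
            # Only the holder of the mutex may sit in accept().
            with accept_mutex:
                try:
                    conn, _addr = listener.accept()
                except socket.timeout:
                    continue
            # Mutex released: the next idle backend can accept while
            # this one serves its client.
            with conn:
                conn.settimeout(2.0)
                conn.recv(64)
                conn.sendall(("served by backend %d" % worker_id).encode())

    workers = [threading.Thread(target=backend, args=(i,))
               for i in range(num_workers)]
    for worker in workers:
        worker.start()

    replies = []
    for _ in range(num_clients):
        with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
            client.sendall(b"hello")
            replies.append(client.recv(64).decode())

    stop.set()
    for worker in workers:
        worker.join()
    listener.close()
    return replies


if __name__ == "__main__":
    for line in run_pool():
        print(line)
```

Which backend serves which client depends on scheduling, so the output order of worker numbers is not fixed.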
From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
Received: from hub.org (hub.org [209.47.148.200])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
Mon, 11 May 1998 11:26:44 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:26:44 -0400
Message-ID: <25004.894900404@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO
Meanwhile, *I* missed the point about Brett's second comment :-(
Brett McCormick <brett@work.chicken.org> writes:
> There will have to be some sort of arg parsing in any case,
> considering that you can pass configurable arguments to the backend..
If we do the sort of change David and I were just discussing, then the
pre-spawned backend would become responsible for parsing and dealing
with the PGOPTIONS portion of the client's connection request message.
That's just part of shifting the authentication handshake code from
postmaster to backend, so it shouldn't be too hard.
BUT: the whole point is to be able to initialize the backend before it
is connected to a client. How much of the expensive backend startup
work depends on having the client connection options available?
Any work that needs to know the options will have to wait until after
the client connects. If that means most of the startup work can't
happen in advance anyway, then we're out of luck; a pre-started backend
won't save enough time to be worth the effort. (Unless we are willing
to eliminate or redefine the troublesome options...)
regards, tom lane
......@@ -1319,3 +1319,105 @@ DDI: +64(4)916-7201 MOB: +64(21)635-694 OFFICE: +64(4)499-2267
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
Mon, 11 May 1998 11:14:43 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:14:43 -0400
Message-ID: <24913.894899683@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO
Brett McCormick <brett@work.chicken.org> writes:
> same way that the current network socket is passed -- through an execv
> argument. hopefully, however, the non-execv()ing fork will be in 6.4.
Um, you missed the point, Brett. David was hoping to transfer a client
connection from the postmaster to an *already existing* backend process.
Fork, with or without exec, solves the problem for a backend that's
started after the postmaster has accepted the client socket.
This does lead to a different line of thought, however. Pre-started
backends would have access to the "master" connection socket on which
the postmaster listens for client connections, right? Suppose that we
fire the postmaster as postmaster, and demote it to being simply a
manufacturer of new backend processes as old ones get used up. Have
one of the idle backend processes be the one doing the accept() on the
master socket. Once it has a client connection, it performs the
authentication handshake and then starts serving the client (or just
quits if authentication fails). Meanwhile the next idle backend process
has executed accept() on the master socket and is waiting for the next
client; and shortly the postmaster/factory/whateverwecallitnow notices
that it needs to start another backend to add to the idle-backend pool.
This'd probably need some interlocking among the backends. I have no
idea whether it'd be safe to have all the idle backends trying to
do accept() on the master socket simultaneously, but it sounds risky.
Better to use a mutex so that only one gets to do it while the others
sleep.
regards, tom lane
From owner-pgsql-hackers@hub.org Wed Nov 18 14:40:49 1998
Received: from hub.org (majordom@hub.org [209.47.148.200])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA29743
for <maillist@candle.pha.pa.us>; Wed, 18 Nov 1998 14:40:36 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.1/8.9.1) with SMTP id OAA03716;
Wed, 18 Nov 1998 14:37:04 -0500 (EST)
(envelope-from owner-pgsql-hackers@hub.org)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 18 Nov 1998 14:34:39 +0000 (EST)
Received: (from majordom@localhost)
by hub.org (8.9.1/8.9.1) id OAA03395
for pgsql-hackers-outgoing; Wed, 18 Nov 1998 14:34:37 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
by hub.org (8.9.1/8.9.1) with SMTP id OAA03381
for <pgsql-hackers@hub.org>; Wed, 18 Nov 1998 14:34:31 -0500 (EST)
(envelope-from wieck@sapserv.debis.de)
Received: by orion.SAPserv.Hamburg.dsh.de
for pgsql-hackers@hub.org
id m0zgDnj-000EBTC; Wed, 18 Nov 98 21:02 MET
Message-Id: <m0zgDnj-000EBTC@orion.SAPserv.Hamburg.dsh.de>
From: jwieck@debis.com (Jan Wieck)
Subject: Re: [HACKERS] PREPARE
To: meskes@usa.net (Michael Meskes)
Date: Wed, 18 Nov 1998 21:02:06 +0100 (MET)
Cc: pgsql-hackers@hub.org
Reply-To: jwieck@debis.com (Jan Wieck)
In-Reply-To: <19981118084843.B869@usa.net> from "Michael Meskes" at Nov 18, 98 08:48:43 am
X-Mailer: ELM [version 2.4 PL25]
Content-Type: text
Sender: owner-pgsql-hackers@postgreSQL.org
Precedence: bulk
Status: RO
Michael Meskes wrote:
>
> On Wed, Nov 18, 1998 at 03:23:30AM +0000, Thomas G. Lockhart wrote:
> > > I didn't get this one completly. What input do you mean?
> >
> > Just the original string/query to be prepared...
>
> I see. But wouldn't it be more useful to preprocess the query and store the
> resulting nodes instead? We don't want to parse the statement everytime a
> variable binding comes in.
Right. A real improvement would only be to have the prepared
execution plan in the backend and just giving the parameter
values.
I can think of the following construct:
PREPARE optimizable-statement;
That one will run parser/rewrite/planner, create a new memory
context with a unique identifier and save the querytrees
and plans in it. Parameter values are identified by the
usual $n notation. The command returns the identifier.
EXECUTE QUERY identifier [value [, ...]];
then gets back the prepared plan and querytree by the id,
creates an executor context with the given values in the
parameter array and calls ExecutorRun() for them.
The PREPARE needs to analyze the resulting parsetrees to get
the datatypes (and maybe atttypmods) of the parameters, so
EXECUTE QUERY can convert the values into Datums using the
types' input functions. And the EXECUTE has to be handled
specially in tcop (it's something between a regular query and
a utility statement). But it's not too hard to implement.
Finally a
FORGET QUERY identifier;
(don't remember how the others named it) will remove the
prepared plan etc. simply by destroying the memory context
and dropping the identifier from the id->mcontext+prepareinfo
mapping.
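
The identifier-to-plan mapping described above might look roughly like the following C sketch. It is only an illustration under stated assumptions: the names PreparedEntry, prepare_query(), lookup_query() and forget_query() are hypothetical, a plain string stands in for the saved querytree and plan, and malloc()/free() stand in for creating and destroying a per-statement memory context.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_PREPARED 64

/* Stand-in for "memory context + prepare info"; all names are hypothetical. */
typedef struct PreparedEntry
{
    int   id;           /* identifier returned by PREPARE; 0 = free slot */
    char *plan;         /* placeholder for the saved querytree and plan */
    int   nparams;      /* number of $n parameters found at PREPARE time */
} PreparedEntry;

static PreparedEntry prepared[MAX_PREPARED];
static int           next_id = 1;

/* PREPARE: save the "plan" in its own allocation and return its identifier. */
int
prepare_query(const char *plan_text, int nparams)
{
    int i;

    for (i = 0; i < MAX_PREPARED; i++)
    {
        if (prepared[i].id == 0)
        {
            size_t len = strlen(plan_text) + 1;
            char  *copy = malloc(len);

            if (copy == NULL)
                return 0;
            memcpy(copy, plan_text, len);
            prepared[i].plan = copy;
            prepared[i].nparams = nparams;
            prepared[i].id = next_id++;
            return prepared[i].id;
        }
    }
    return 0;                   /* mapping is full */
}

/* EXECUTE QUERY: fetch by id; NULL if unknown or the value count is wrong. */
const char *
lookup_query(int id, int nvalues)
{
    int i;

    if (id <= 0)
        return NULL;
    for (i = 0; i < MAX_PREPARED; i++)
        if (prepared[i].id == id)
            return (prepared[i].nparams == nvalues) ? prepared[i].plan : NULL;
    return NULL;
}

/* FORGET QUERY: release the storage and drop the id from the mapping. */
int
forget_query(int id)
{
    int i;

    if (id <= 0)
        return 0;
    for (i = 0; i < MAX_PREPARED; i++)
    {
        if (prepared[i].id == id)
        {
            free(prepared[i].plan);
            prepared[i].plan = NULL;
            prepared[i].id = 0;
            return 1;
        }
    }
    return 0;
}
```

Keeping everything that belongs to one prepared statement behind a single handle is what makes FORGET QUERY cheap: dropping the entry releases it all at once.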
This all restricts the usage of PREPARE to optimizable
statements. Is it required to be able to prepare utility
statements (like CREATE TABLE or so) too?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #
From pgsql-hackers-owner+M67@postgresql.org Tue Oct 31 19:18:16 2000
Received: from mail.postgresql.org ([216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA08916
for <pgman@candle.pha.pa.us>; Tue, 31 Oct 2000 19:18:15 -0500 (EST)
Received: from mail.postgresql.org ([216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eA10IOl60635;
Tue, 31 Oct 2000 19:18:24 -0500 (EST)
(envelope-from pgsql-hackers-owner+M67@postgresql.org)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eA10H8l60400
for <pgsql-hackers@postgresql.org>; Tue, 31 Oct 2000 19:17:08 -0500 (EST)
(envelope-from zakkr@zf.jcu.cz)
Received: from localhost (zakkr@localhost)
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian 8.9.3-21) with SMTP id BAA32036;
Wed, 1 Nov 2000 01:16:42 +0100
Date: Wed, 1 Nov 2000 01:16:42 +0100 (CET)
From: Karel Zak <zakkr@zf.jcu.cz>
To: Alfred Perlstein <bright@wintelcom.net>
cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Query cache import?
In-Reply-To: <20001031151144.F22110@fw.wintelcom.net>
Message-ID: <Pine.LNX.3.96.1001101005110.31713B-100000@ara.zf.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Tue, 31 Oct 2000, Alfred Perlstein wrote:
> I never saw much traffic regarding Karel's work on making stored
> proceedures:
>
> http://people.freebsd.org/~alfred/karel-pgsql.txt
>
> What happened with this? It looked pretty interesting. :(
It's probably a little about me :-) ... well,
My query cache is in a usable state and it is efficient for all the
things that motivated me to work on it.
some basic features:
- share parsed plans between backends in shared memory
- store plans to private backend hash table
- use parameters for stored queries
- better design for SPI
- memory usage for saved plans
- save plans "by key"
The current query cache code depends on the 7.1 memory management. After
the official 7.1 release I will prepare a patch with query cache+SPI (if
nobody hits me over the head, please ..)
What happens next does not depend on me, *it's up to the code developers*.
For example, Jan has an interesting idea about caching all plans that a
backend processes. But that is the far future, and IMHO we must go by small
steps to Oracle's funeral :-)
If I need the query cache in my own work (typical for some web+pgsql) or
if there is some public interest, I will continue with this; if not, I will
freeze it. (There is more interesting work, like http://mape.jcu.cz ...
sorry for the advertising :-)
Karel
From pgsql-hackers-owner+M312@postgresql.org Mon Nov 6 03:27:32 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28404
for <pgman@candle.pha.pa.us>; Mon, 6 Nov 2000 03:27:32 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eA68Pos51966;
Mon, 6 Nov 2000 03:25:50 -0500 (EST)
(envelope-from pgsql-hackers-owner+M312@postgresql.org)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eA68Fes50414
for <pgsql-hackers@postgresql.org>; Mon, 6 Nov 2000 03:15:40 -0500 (EST)
(envelope-from zakkr@zf.jcu.cz)
Received: from localhost (zakkr@localhost)
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian 8.9.3-21) with SMTP id JAA20862;
Mon, 6 Nov 2000 09:15:04 +0100
Date: Mon, 6 Nov 2000 09:15:04 +0100 (CET)
From: Karel Zak <zakkr@zf.jcu.cz>
To: Christof Petig <christof.petig@wtal.de>
cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
The Hermit Hacker <scrappy@hub.org>, pgsql-hackers@postgresql.org
Subject: Re: AW: [HACKERS] Re: [GENERAL] Query caching
In-Reply-To: <3A02DDFF.E8CBFCF3@wtal.de>
Message-ID: <Pine.LNX.3.96.1001106090801.20612C-100000@ara.zf.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Fri, 3 Nov 2000, Christof Petig wrote:
> Karel Zak wrote:
>
> > On Thu, 2 Nov 2000, Zeugswetter Andreas SB wrote:
> >
> > >
> > > > Well I can re-write and resubmit this patch. Add it as a
> > > > compile time option
> > > > is not bad idea. Second possibility is distribute it as patch
> > > > in the contrib
> > > > tree. And if it until not good tested not dirty with this main tree...
> > > >
> > > > Ok, I next week prepare it...
> > >
> > > One thing that worries me though is, that it extends the sql language,
> > > and there has been no discussion about the chosen syntax.
> > >
> > > Imho the standard embedded SQL syntax (prepare ...) could be a
> > > starting point.
> >
> > Yes, you are right... my PREPARE/EXECUTE is not too much ready to SQL92,
> > I some old letter I speculate about "SAVE/EXECUTE PLAN" instead
> > PREPARE/EXECUTE. But don't forget, it will *experimental* patch... we can
> > change it in future ..etc.
> >
> > Karel
>
> [Sorry, I didn't look into your patch, yet.]
Please, read my old query cache and PREPARE/EXECUTE description...
> What about parameters? Normally you can prepare a statement and execute it
PG does have parameters (see SPI), but for now they are used inside the
backend only, and no statement exists that allows using this feature in be<->fe.
> using different parameters. AFAIK postgres' frontend-backend protocol is not
> designed to take parameters for statements (e.g. like result presents
> results). A very long road to go.
> By the way, I'm somewhat interested in getting this feature in. Perhaps it
> should be part of a protocol redesign (e.g. binary parameters/results).
> Handling endianness is one aspect, floats are harder (but float->ascii->float
> sometimes fails as well).
PREPARE <name> AS <query>
[ USING type, ... typeN ]
[ NOSHARE | SHARE | GLOBAL ]
EXECUTE <name>
[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]
[ USING val, ... valN ]
[ NOSHARE | SHARE | GLOBAL ]
DEALLOCATE PREPARE
[ <name> [ NOSHARE | SHARE | GLOBAL ]]
[ ALL | ALL INTERNAL ]
An example:
PREPARE chris_query AS SELECT * FROM pg_class WHERE relname = $1 USING text;
EXECUTE chris_query USING 'pg_shadow';
Or do you mean something else?
Karel
From pgsql-hackers-owner+M444@postgresql.org Thu Nov 9 03:32:10 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA09953
for <pgman@candle.pha.pa.us>; Thu, 9 Nov 2000 03:32:09 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eA98RSs11426;
Thu, 9 Nov 2000 03:27:28 -0500 (EST)
(envelope-from pgsql-hackers-owner+M444@postgresql.org)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eA98OPs11045;
Thu, 9 Nov 2000 03:24:25 -0500 (EST)
(envelope-from zakkr@zf.jcu.cz)
Received: from localhost (zakkr@localhost)
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian 8.9.3-21) with SMTP id JAA08951;
Thu, 9 Nov 2000 09:23:41 +0100
Date: Thu, 9 Nov 2000 09:23:41 +0100 (CET)
From: Karel Zak <zakkr@zf.jcu.cz>
To: Christof Petig <christof.petig@wtal.de>
cc: PostgreSQL Hackers <pgsql-hackers@postgresql.org>,
Michael Meskes <meskes@postgresql.org>,
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
The Hermit Hacker <scrappy@hub.org>
Subject: Re: AW: [HACKERS] Re: [GENERAL] Query caching
In-Reply-To: <3A096BCE.F9887955@wtal.de>
Message-ID: <Pine.LNX.3.96.1001109090739.8052B-100000@ara.zf.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Wed, 8 Nov 2000, Christof Petig wrote:
> Karel Zak wrote:
>
> > > What about parameters? Normally you can prepare a statement and execute it
> >
> > We have in PG parameters, see SPI, but now it's used inside backend only
> > and not exist statement that allows to use this feature in be<->fe.
>
> Sad. Since ecpg would certainly benefit from this.
>
> > > using different parameters. AFAIK postgres' frontend-backend protocol is not
> > > designed to take parameters for statements (e.g. like result presents
> > > results). A very long road to go.
> > > By the way, I'm somewhat interested in getting this feature in. Perhaps it
> > > should be part of a protocol redesign (e.g. binary parameters/results).
> > > Handling endianness is one aspect, floats are harder (but float->ascii->float
> > > sometimes fails as well).
> >
> > PREPARE <name> AS <query>
> > [ USING type, ... typeN ]
> > [ NOSHARE | SHARE | GLOBAL ]
> >
> > EXECUTE <name>
> > [ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]
> > [ USING val, ... valN ]
> > [ NOSHARE | SHARE | GLOBAL ]
> >
> > DEALLOCATE PREPARE
> > [ <name> [ NOSHARE | SHARE | GLOBAL ]]
> > [ ALL | ALL INTERNAL ]
> >
> > An example:
> >
> > PREPARE chris_query AS SELECT * FROM pg_class WHERE relname = $1 USING text;
>
> I would prefer '?' as a parameter name, since this is in the embedded sql standard
> (do you have a copy of the 94 draft? I can mail mine to you?)
This does not depend on the query cache. '$n' is the PostgreSQL query
parameter keyword and is defined in the standard parser. The PREPARE statement
does not parse the query; that is a job for the standard parser.
> Also the standard says a whole lot about guessing the parameter's type.
>
> Also I vote for ?::type or type(?) or sql's cast(...) (don't know it's syntax)
> instead of abusing the using keyword.
The PostgreSQL executor expects the types of the parameters as a separate
input (an array). I am not sure how expensive/feasible it is to derive them
from the query.
> > EXECUTE chris_query USING 'pg_shadow';
>
> Great idea of yours to implement this! Since I was thinking about implementing a
> more decent schema for ecpg but had no mind to touch the backend and be-fe
> protocol (yet).
> It would be desirable to do an 'execute immediate using', since using input
> parameters would take a lot of code away from ecpg.
By the way, PREPARE/EXECUTE is only the face. More interesting in this period
is the query-cache kernel. SQL92 really is a little unlike my PREPARE/EXECUTE.
Karel
From pgsql-hackers-owner+M9563@postgresql.org Thu May 31 16:31:59 2001
Return-path: <pgsql-hackers-owner+M9563@postgresql.org>
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f4VKVxc26942
for <pgman@candle.pha.pa.us>; Thu, 31 May 2001 16:31:59 -0400 (EDT)
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
by postgresql.org (8.11.3/8.11.1) with SMTP id f4VKVIE38645;
Thu, 31 May 2001 16:31:18 -0400 (EDT)
(envelope-from pgsql-hackers-owner+M9563@postgresql.org)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by postgresql.org (8.11.3/8.11.1) with ESMTP id f4VKNVE35356
for <pgsql-hackers@postgresql.org>; Thu, 31 May 2001 16:23:31 -0400 (EDT)
(envelope-from zakkr@zf.jcu.cz)
Received: (from zakkr@localhost)
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian 8.9.3-21) id WAA19957;
Thu, 31 May 2001 22:23:26 +0200
Date: Thu, 31 May 2001 22:23:26 +0200
From: Karel Zak <zakkr@zf.jcu.cz>
To: Roberto Abalde <roberto.abalde@galego21.org>
cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Cache for query plans
Message-ID: <20010531222326.B16862@ara.zf.jcu.cz>
References: <000701c0e932$d17646c0$c6023dc8@ultra>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0.1i
In-Reply-To: <000701c0e932$d17646c0$c6023dc8@ultra>; from roberto.abalde@galego21.org on Wed, May 30, 2001 at 03:00:53PM -0300
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: ORr
On Wed, May 30, 2001 at 03:00:53PM -0300, Roberto Abalde wrote:
> Hi,
>
> I need to implement a cache for query plans as part of my BSc thesis. Does
> anybody know what happened to Karel Zak's patch?
>
Hi,
my patch is on my ftp and nobody is working on it, but I think it is a good
beginning for further work. I am not sure about putting this experimental
(but usable) patch into the official sources. For example, Jan has a more
complex idea about a query plan cache ... but first we must solve some
sub-problems, like memory management in shared memory that is transparent
to standard routines like copying a query plan ... and Tom isn't sure about
a query cache in shared memory...etc. Too many questions, but few answers :-)
Karel
>
> PS: Sorry for my english :(
Have you ever read any of my mail? :-)
Karel
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From pgsql-hackers-owner+M21218@postgresql.org Fri Apr 12 04:52:19 2002
Return-path: <pgsql-hackers-owner+M21218@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3C8qIS25666
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 04:52:18 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id AE2FA4769F1; Fri, 12 Apr 2002 03:54:34 -0400 (EDT)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by postgresql.org (Postfix) with ESMTP id A05A94769DC
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 03:51:27 -0400 (EDT)
Received: from ara.zf.jcu.cz (LOCALHOST [127.0.0.1])
by ara.zf.jcu.cz (8.12.1/8.12.1/Debian -5) with ESMTP id g3C7pHBK012031;
Fri, 12 Apr 2002 09:51:17 +0200
Received: (from zakkr@localhost)
by ara.zf.jcu.cz (8.12.1/8.12.1/Debian -5) id g3C7pGum012030;
Fri, 12 Apr 2002 09:51:16 +0200
Date: Fri, 12 Apr 2002 09:51:16 +0200
From: Karel Zak <zakkr@zf.jcu.cz>
To: pgsql-hackers@postgresql.org
cc: Hiroshi Inoue <Inoue@tpf.co.jp>
Subject: Re: [HACKERS] 7.3 schedule
Message-ID: <20020412095116.B6370@zf.jcu.cz>
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au> <3CB52C54.4020507@freaky-namuh.com> <20020411115434.201ff92f.nconway@klamath.dyndns.org> <3CB61DAB.5010601@freaky-namuh.com> <24184.1018581907@sss.pgh.pa.us> <3CB65B49.93F2F790@tpf.co.jp> <20020412004134.5d35a2dd.nconway@klamath.dyndns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20020412004134.5d35a2dd.nconway@klamath.dyndns.org>; from nconway@klamath.dyndns.org on Fri, Apr 12, 2002 at 12:41:34AM -0400
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Fri, Apr 12, 2002 at 12:41:34AM -0400, Neil Conway wrote:
> On Fri, 12 Apr 2002 12:58:01 +0900
> "Hiroshi Inoue" <Inoue@tpf.co.jp> wrote:
> >
> > Just a confirmation.
> > Someone is working on PREPARE/EXECUTE ?
> > What about Karel's work ?
Right question :-)
> I am. My work is based on Karel's stuff -- at the moment I'm still
> basically working on getting Karel's patch to play nicely with
> current sources; once that's done I'll be addressing whatever
> issues are stopping the code from getting into CVS.
My patch (qcache) for PostgreSQL 7.0 is available at
ftp://ftp2.zf.jcu.cz/users/zakkr/pg/.
I am very much looking forward to Neil's work on this.
Notes:
* It is an experimental patch, but usable. All the features mentioned
below work.
* PREPARE/EXECUTE is not only about SQL statements; I think a good idea is
to create something common and robust for query-plan caching,
because there is, for example, SPI too. The RI triggers are based
on SPI_saveplan().
* My patch supports an EXECUTE INTO feature:
PREPARE foo AS SELECT * FROM pg_class WHERE relname ~~ $1 USING text;
EXECUTE foo USING 'pg%'; <-- standard select
EXECUTE foo INTO TEMP newtab USING 'pg%'; <-- select into
* The patch allows storing query plans in shared memory, and it is
possible to EXECUTE them in several backends (over the same DB); the plans
are persistent across connections. For this feature I created a special
memory context subsystem (like the current aset.c, but it works with
IPC shared memory).
This is maybe too complex a solution, and (maybe) it is sufficient to cache
queries in one backend only. I know there is skepticism about this shared
memory solution (Tom?).
Karel
My experimental patch README (excuse my English):
Implementation
~~~~~~~~~~~~~~
The qCache allows saving a queryTree and a queryPlan. Two spaces are
available for data caching.
LOCAL - data are cached in backend non-shared memory and aren't
available in other backends.
SHARE - data are cached in backend shared memory and are
visible in all backends.
Because the size of the shared memory pool is limited and is set during
postmaster start up, the qCache must remove all old plans if the pool is
full. You can mark each entry as "REMOVEABLE" or "NOTREMOVEABLE".
A removeable entry is removed if the pool is full.
A not-removeable entry must be removed via qCache_Remove() or
the other routines. The qCache does not remove this entry itself.
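
A minimal sketch of how a victim could be chosen when the pool is full is shown below. It is not taken from the patch: the struct CacheEntry, its fields and find_eviction_victim() are hypothetical, and evicting the oldest entry first is an assumption suggested only by the routine name qCache_RemoveOldest_ShareRemoveAble().

```c
#include <stddef.h>

/* Hypothetical bookkeeping for one cached plan. */
typedef struct CacheEntry
{
    int           in_use;       /* slot currently holds a plan */
    int           removeable;   /* 1 = REMOVEABLE, 0 = NOTREMOVEABLE */
    unsigned long stored_at;    /* insertion counter; smaller means older */
} CacheEntry;

/*
 * Return the index of the oldest REMOVEABLE entry, or -1 if every
 * entry in use is NOTREMOVEABLE and nothing may be evicted.
 */
int
find_eviction_victim(const CacheEntry *entries, size_t n)
{
    int    victim = -1;
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (!entries[i].in_use || !entries[i].removeable)
            continue;           /* free slot, or pinned by its owner */
        if (victim < 0 || entries[i].stored_at < entries[victim].stored_at)
            victim = (int) i;
    }
    return victim;
}
```

When the function returns -1 the cache cannot make room by itself, which matches the rule that NOTREMOVEABLE entries must be removed explicitly by their owner.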
All records in the qCache are cached (in the hash table) under some key.
The qCache knows two kinds of key --- "KEY_STRING" and "KEY_BINARY".
The qCache API does not allow access to shared memory; all cached plans that
the API returns are copied to CurrentMemoryContext. All (qCache_ ) routines
lock shmem themselves (the exception is qCache_RemoveOldest_ShareRemoveAble()).
- a spin lock is used for locking.
Memory management
~~~~~~~~~~~~~~~~~
The qCache uses its own memory context for the qCache's shared pool,
independent of the standard aset/mcxt, but with a compatible API --- it allows
using standard palloc() (this is very useful for basic plan-tree operations,
for example copyObject()). The qCache memory management is very similar to the
current aset.c code. It uses chunked blocks too, but the block is smaller - 1024b.
The number of blocks can be set in the postmaster 'argv' via option
'-Z'.
A separate MemoryContext is used for storing each plan. It
is a good idea (Hiroshi's ?), because creating a new context is simple and
inexpensive and allows easily destroying (freeing) a cached plan. This method
is used in my SPI overhaul instead of TopMemoryContext feeding.
Postmaster
~~~~~~~~~~
The query cache memory is initialized during postmaster startup. The size of
the query cache pool is set via the '-Z <number-of-blocks>' switch --- the
default is 100 blocks where 1 block = 1024b, which is sufficient for 20-30
cached plans. One query needs roughly 3-10 blocks; for example a query like
PREPARE sel AS SELECT * FROM pg_class;
needs 10Kb, because table pg_class has very many columns.
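
The sizing figures above can be checked with a little arithmetic. The helper below is a hypothetical illustration, not part of the patch. With the default pool of 100 blocks, a plan size of 10 blocks gives 10 plans and a plan size of 3 blocks gives 33, so the quoted figure of 20-30 plans corresponds to roughly 3-5 blocks per plan.

```c
/*
 * How many cached plans fit in a pool of pool_blocks blocks if every
 * plan occupies blocks_per_plan blocks (integer division, so a
 * partially filled remainder is not counted).
 */
int
plans_that_fit(int pool_blocks, int blocks_per_plan)
{
    if (pool_blocks <= 0 || blocks_per_plan <= 0)
        return 0;
    return pool_blocks / blocks_per_plan;
}
```

For example, plans_that_fit(100, 10) covers the worst case of pg_class-sized plans, and plans_that_fit(100, 3) the best case of small plans.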
Note: for development I added an SQL function: "SELECT qcache_state();",
this routine shows the usage of the qCache.
SPI
~~~
I slightly rewrote the SPI save plan method and removed TopMemoryContext
"feeding".
Standard SPI:
SPI_saveplan() - save each plan to separate standard memory context.
SPI_freeplan() - free plan.
By key SPI:
It is the SPI interface for the query cache and allows saving plans to the
SHARED or LOCAL cache 'by' an arbitrary key (string or binary). Routines:
SPI_saveplan_bykey() - save plan to query cache
SPI_freeplan_bykey() - remove plan from query cache
SPI_fetchplan_bykey() - fetch plan saved in query cache
SPI_execp_bykey() - execute (via SPI) plan saved in query
cache
- now, users can write functions that save plans to shared memory,
and the plans are visible in all backends and are persistent across
connections.
Example:
~~~~~~~
/* ----------
 * Save/exec query from shared cache via string key
 *
 * values, nulls, tcount, querystr, valnum and valtypes are assumed
 * to be set up by the caller.
 * ----------
 */
int keySize = 0;
int flag = SPI_BYKEY_SHARE | SPI_BYKEY_STRING;
char *key = "my unique key";
int res;

res = SPI_execp_bykey(values, nulls, tcount, key, flag, keySize);
if (res == SPI_ERROR_PLANNOTFOUND)
{
    /* --- no plan in cache - must create it --- */
    void *plan;

    plan = SPI_prepare(querystr, valnum, valtypes);
    SPI_saveplan_bykey(plan, key, keySize, flag);
    /* run the freshly prepared plan with the stock SPI call */
    res = SPI_execp(plan, values, nulls, tcount);
}
elog(NOTICE, "Processed: %d", SPI_processed);
PREPARE/EXECUTE
~~~~~~~~~~~~~~~
* Syntax:
PREPARE <name> AS <query>
[ USING type, ... typeN ]
[ NOSHARE | SHARE | GLOBAL ]
EXECUTE <name>
[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]
[ USING val, ... valN ]
[ NOSHARE | SHARE | GLOBAL ]
DEALLOCATE PREPARE
[ <name> [ NOSHARE | SHARE | GLOBAL ]]
[ ALL | ALL INTERNAL ]
I know that it is a little out of SQL92... (use CREATE/DROP PLAN instead of
this?) --- what do the SQL standard gurus think?
* Where:
NOSHARE --- cached in the local backend query cache - not accessible
from the other backends and not persistent across
connections.
SHARE --- cached in the shared query cache and accessible from
all backends which work over the same database.
GLOBAL --- cached in the shared query cache and accessible from
all backends and all databases.
- default is 'SHARE'
Deallocate:
ALL --- deallocate all of the user's plans
ALL INTERNAL --- deallocate all internal plans, like plans
cached via SPI. It is needed if the user
alters/drops a table ...etc.
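
The three sharing levels can be read as a visibility rule. The sketch below is only an illustration of that rule under stated assumptions: the enum PlanScope, the struct CachedPlanInfo and plan_is_visible() are hypothetical names, and backends and databases are identified by plain integers.

```c
/* Hypothetical encoding of the NOSHARE / SHARE / GLOBAL options. */
typedef enum PlanScope
{
    PLAN_NOSHARE,               /* local to the backend that prepared it */
    PLAN_SHARE,                 /* all backends over the same database */
    PLAN_GLOBAL                 /* all backends, all databases */
} PlanScope;

typedef struct CachedPlanInfo
{
    PlanScope    scope;
    int          owner_backend; /* backend that ran PREPARE */
    unsigned int database;      /* database the plan was prepared in */
} CachedPlanInfo;

/* Return 1 if the given backend, connected to database, may EXECUTE the plan. */
int
plan_is_visible(const CachedPlanInfo *plan, int backend, unsigned int database)
{
    switch (plan->scope)
    {
        case PLAN_NOSHARE:
            return plan->owner_backend == backend;
        case PLAN_SHARE:
            return plan->database == database;
        case PLAN_GLOBAL:
            return 1;
    }
    return 0;
}
```

A SHARE plan prepared by backend 1 in database 10 is thus usable by backend 2 in database 10, but not by any backend connected to database 11.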
* Parameters:
The "USING" part in the prepare statement sets the datatypes of the
parameters in the query. For example:
PREPARE sel AS SELECT * FROM pg_class WHERE relname ~~ $1 USING text;
EXECUTE sel USING 'pg%';
* Limitation:
- prepare/execute allows the full statement of SELECT/INSERT/DELETE/
UPDATE.
- it is possible to use union, subselects, limit, offset, select-into
Performance:
~~~~~~~~~~~
* the SPI
- For my tests I slightly changed the RI triggers to use the SPI by_key API
and save plans to the shared qCache instead of the internal RI hash table.
The RI triggers use queries that are very simple (for parsing), so the
benefit of the qCache is not visible. It is better if backends start up very
often and RI always checks the same tables. In this situation the speed goes
up --- 10-12%. (This snapshot does not include this RI change.)
But it all depends on how complicated the query in the trigger is for
the parser.
* PREPARE/EXECUTE
- For tests I use a query that does not use any table (the executor is
bored), but is difficult for the parser. An example:
SELECT 'a text ' || (10*10+(100^2))::text || ' next text ' || cast
(date_part('year', timestamp 'now') AS text );
- (10000 * this query):
standard select: 54 sec
via prepare/execute: 4 sec (93% better)
IMHO it is not bad.
- For a standard query like:
SELECT u.usename, r.relname FROM pg_class r, pg_user u WHERE
r.relowner = u.usesysid;
it is 10-20% faster with PREPARE/EXECUTE.
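
The "93% better" figure for the parser-heavy query follows from the two timings quoted above (54 sec and 4 sec). The helper below is a hypothetical illustration of that calculation, not part of the patch.

```c
/*
 * Percentage of run time saved when a job that took before_sec
 * seconds now takes after_sec seconds.
 */
double
percent_time_saved(double before_sec, double after_sec)
{
    if (before_sec <= 0.0)
        return 0.0;
    return 100.0 * (before_sec - after_sec) / before_sec;
}
```

With 54 and 4 as inputs the result is a little under 93, which rounds to the quoted figure. Put differently, the prepared form ran the same 10000 executions about 13.5 times faster.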
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M21228@postgresql.org Fri Apr 12 10:15:34 2002
Return-path: <pgsql-hackers-owner+M21228@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CEFXS29835
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 10:15:33 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 7BFE1475A55; Fri, 12 Apr 2002 10:15:27 -0400 (EDT)
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
by postgresql.org (Postfix) with ESMTP id 5659B474E71
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 10:14:31 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g3CEEQF27238;
Fri, 12 Apr 2002 10:14:26 -0400 (EDT)
To: Karel Zak <zakkr@zf.jcu.cz>
cc: pgsql-hackers@postgresql.org, Neil Conway <nconway@klamath.dyndns.org>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <20020412095116.B6370@zf.jcu.cz>
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au> <3CB52C54.4020507@freaky-namuh.com> <20020411115434.201ff92f.nconway@klamath.dyndns.org> <3CB61DAB.5010601@freaky-namuh.com> <24184.1018581907@sss.pgh.pa.us> <3CB65B49.93F2F790@tpf.co.jp> <20020412004134.5d35a2dd.nconway@klamath.dyndns.org> <20020412095116.B6370@zf.jcu.cz>
Comments: In-reply-to Karel Zak <zakkr@zf.jcu.cz>
message dated "Fri, 12 Apr 2002 09:51:16 +0200"
Date: Fri, 12 Apr 2002 10:14:26 -0400
Message-ID: <27235.1018620866@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: ORr
Karel Zak <zakkr@zf.jcu.cz> writes:
> * The patch allows storing query plans in shared memory, and it is
> possible to EXECUTE them from multiple backends (over the same DB); plans
> are persistent across connections. For this feature I created a special
> memory context subsystem (like the current aset.c, but it works with
> IPC shared memory).
> This is maybe too complex a solution, and (maybe) it is sufficient to cache
> queries in one backend only. I know there is skepticism about this shared
> memory solution (Tom?).
Yes, that is the part that was my sticking point last time around.
(1) Because shared memory cannot be extended on-the-fly, I think it is
a very bad idea to put data structures in there without some well
thought out way of predicting/limiting their size. (2) How the heck do
you get rid of obsoleted cached plans, if the things stick around in
shared memory even after you start a new backend? (3) A shared cache
requires locking; contention among multiple backends to access that
shared resource could negate whatever performance benefit you might hope
to realize from it.
A per-backend cache kept in local memory avoids all of these problems,
and I have seen no numbers to make me think that a shared plan cache
would achieve significantly more performance benefit than a local one.
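As a rough illustration of the per-backend alternative, here is a minimal Python sketch of a plan cache that lives in one backend's private memory. It is a toy model, not PostgreSQL code: `BackendPlanCache`, `toy_planner`, and the `max_plans` limit are all assumptions made for the example. Because only one process touches the dictionary, no locking is needed, the size limit is easy to enforce, and stale plans vanish when the backend exits.

```python
class BackendPlanCache:
    """Toy plan cache private to one backend (hypothetical, for illustration)."""

    def __init__(self, planner, max_plans=100):
        self._planner = planner      # stand-in for parse/rewrite/plan
        self._max_plans = max_plans  # assumed explicit size limit
        self._plans = {}             # statement name -> plan
        self.plans_built = 0

    def prepare(self, name, sql):
        if name not in self._plans and len(self._plans) >= self._max_plans:
            raise MemoryError("plan cache full; deallocate a plan first")
        self._plans[name] = self._planner(sql)
        self.plans_built += 1

    def execute(self, name, *params):
        # Raises KeyError if this backend never prepared the statement.
        return self._plans[name](*params)


def toy_planner(sql):
    # Returns a callable that plays the role of an executable plan.
    return lambda *params: f"run[{sql}] with {params}"


backend = BackendPlanCache(toy_planner)
backend.prepare("sel", "SELECT * FROM pg_class WHERE relname ~~ $1")
for pattern in ("pg%", "sql%", "my%"):
    backend.execute("sel", pattern)
print(backend.plans_built)  # 1: planned once, executed three times
```

A second `BackendPlanCache` instance starts empty, which is exactly the cost of this design: every new connection has to prepare its statements again.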
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
From pgsql-hackers-owner+M21233@postgresql.org Fri Apr 12 12:26:32 2002
Return-path: <pgsql-hackers-owner+M21233@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CGQVS11018
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 12:26:31 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 38DBB475B20; Fri, 12 Apr 2002 12:22:08 -0400 (EDT)
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
by postgresql.org (Postfix) with ESMTP id 0DA70475B9E
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 12:21:15 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.11.6/8.10.1) id g3CGL4310492;
Fri, 12 Apr 2002 12:21:04 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200204121621.g3CGL4310492@candle.pha.pa.us>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <27235.1018620866@sss.pgh.pa.us>
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 12 Apr 2002 12:21:04 -0400 (EDT)
cc: Karel Zak <zakkr@zf.jcu.cz>, pgsql-hackers@postgresql.org,
Neil Conway <nconway@klamath.dyndns.org>
X-Mailer: ELM [version 2.4ME+ PL97 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
Tom Lane wrote:
> Karel Zak <zakkr@zf.jcu.cz> writes:
> > * The patch allows storing query plans in shared memory, and it is
> > possible to EXECUTE them from multiple backends (over the same DB); plans
> > are persistent across connections. For this feature I created a special
> > memory context subsystem (like the current aset.c, but it works with
> > IPC shared memory).
> > This is maybe too complex a solution, and (maybe) it is sufficient to cache
> > queries in one backend only. I know there is skepticism about this shared
> > memory solution (Tom?).
>
> Yes, that is the part that was my sticking point last time around.
> (1) Because shared memory cannot be extended on-the-fly, I think it is
> a very bad idea to put data structures in there without some well
> thought out way of predicting/limiting their size. (2) How the heck do
> you get rid of obsoleted cached plans, if the things stick around in
> shared memory even after you start a new backend? (3) A shared cache
> requires locking; contention among multiple backends to access that
> shared resource could negate whatever performance benefit you might hope
> to realize from it.
>
> A per-backend cache kept in local memory avoids all of these problems,
> and I have seen no numbers to make me think that a shared plan cache
> would achieve significantly more performance benefit than a local one.
Certainly a shared cache would be good for apps that connect to issue a
single query frequently. In such cases, there would be no local cache
to use.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From pgsql-hackers-owner+M21234@postgresql.org Fri Apr 12 12:44:12 2002
Return-path: <pgsql-hackers-owner+M21234@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CGiBS12385
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 12:44:12 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id AEAA7475C6C; Fri, 12 Apr 2002 12:43:17 -0400 (EDT)
Received: from barry.xythos.com (h-64-105-36-191.SNVACAID.covad.net [64.105.36.191])
by postgresql.org (Postfix) with ESMTP id CE58C47598E
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 12:42:48 -0400 (EDT)
Received: from xythos.com (localhost.localdomain [127.0.0.1])
by barry.xythos.com (8.11.6/8.11.6) with ESMTP id g3CGgaI02920;
Fri, 12 Apr 2002 09:42:36 -0700
Message-ID: <3CB70E7C.3090801@xythos.com>
Date: Fri, 12 Apr 2002 09:42:36 -0700
From: Barry Lind <barry@xythos.com>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020310
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: Karel Zak <zakkr@zf.jcu.cz>, pgsql-hackers@postgresql.org,
Neil Conway <nconway@klamath.dyndns.org>
Subject: Re: [HACKERS] 7.3 schedule
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au> <3CB52C54.4020507@freaky-namuh.com> <20020411115434.201ff92f.nconway@klamath.dyndns.org> <3CB61DAB.5010601@freaky-namuh.com> <24184.1018581907@sss.pgh.pa.us> <3CB65B49.93F2F790@tpf.co.jp> <20020412004134.5d35a2dd.nconway@klamath.dyndns.org> <20020412095116.B6370@zf.jcu.cz> <27235.1018620866@sss.pgh.pa.us>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: ORr
Tom Lane wrote:
> Yes, that is the part that was my sticking point last time around.
> (1) Because shared memory cannot be extended on-the-fly, I think it is
> a very bad idea to put data structures in there without some well
> thought out way of predicting/limiting their size. (2) How the heck do
> you get rid of obsoleted cached plans, if the things stick around in
> shared memory even after you start a new backend? (3) A shared cache
> requires locking; contention among multiple backends to access that
> shared resource could negate whatever performance benefit you might hope
> to realize from it.
>
> A per-backend cache kept in local memory avoids all of these problems,
> and I have seen no numbers to make me think that a shared plan cache
> would achieve significantly more performance benefit than a local one.
>
Oracle's implementation is a shared cache for all plans. This was
introduced in Oracle 6 or 7 (I don't remember which anymore). The net
effect was that in general there was a significant performance
improvement with the shared cache. However poorly written apps can now
bring the Oracle database to its knees because of the locking issues
associated with the shared cache. For example, if the most frequently
run SQL statements are coded poorly (i.e. they don't use bind variables,
e.g. 'select bar from foo where foobar = $1' vs. 'select bar from foo
where foobar = ' || somevalue (where somevalue is likely to be
different on every call)), the shared cache doesn't help and its overhead
becomes significant.
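The effect of bind variables on a cache keyed by statement text can be sketched in a few lines of Python. This is a toy model under stated assumptions: the cache is a plain dictionary keyed by the exact SQL string, and `run_workload` is a hypothetical helper, not how Oracle or PostgreSQL actually implement it.

```python
def run_workload(statements):
    """Feed statements through a text-keyed cache; return (hits, misses, size)."""
    cache = {}
    hits = misses = 0
    for sql in statements:
        if sql in cache:
            hits += 1
        else:
            cache[sql] = f"plan for: {sql}"  # stand-in for a real plan
            misses += 1
    return hits, misses, len(cache)


values = range(1000)
# Bind variable: the statement text is identical on every call.
with_bind = ["select bar from foo where foobar = $1" for _ in values]
# Literal pasted into the text: every call produces a new statement.
with_literals = ["select bar from foo where foobar = " + str(v) for v in values]

print(run_workload(with_bind))      # (999, 1, 1)
print(run_workload(with_literals))  # (0, 1000, 1000)
```

With literals the cache never hits and grows by one entry per call, so every statement pays for the lookup and the insertion on top of normal planning.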
thanks,
--Barry
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From pgsql-hackers-owner+M21237@postgresql.org Fri Apr 12 12:50:28 2002
Return-path: <pgsql-hackers-owner+M21237@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CGoRS13005
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 12:50:28 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 32A28475BA1; Fri, 12 Apr 2002 12:50:15 -0400 (EDT)
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
by postgresql.org (Postfix) with ESMTP id 07F1E475892
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 12:49:43 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.11.6/8.10.1) id g3CGnbw12950;
Fri, 12 Apr 2002 12:49:37 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200204121649.g3CGnbw12950@candle.pha.pa.us>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <3CB70E7C.3090801@xythos.com>
To: Barry Lind <barry@xythos.com>
Date: Fri, 12 Apr 2002 12:49:37 -0400 (EDT)
cc: Tom Lane <tgl@sss.pgh.pa.us>, Karel Zak <zakkr@zf.jcu.cz>,
pgsql-hackers@postgresql.org, Neil Conway <nconway@klamath.dyndns.org>
X-Mailer: ELM [version 2.4ME+ PL97 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
Barry Lind wrote:
> Oracle's implementation is a shared cache for all plans. This was
> introduced in Oracle 6 or 7 (I don't remember which anymore). The net
> effect was that in general there was a significant performance
> improvement with the shared cache. However poorly written apps can now
> bring the Oracle database to its knees because of the locking issues
> associated with the shared cache. For example, if the most frequently
> run SQL statements are coded poorly (i.e. they don't use bind variables,
> e.g. 'select bar from foo where foobar = $1' vs. 'select bar from foo
> where foobar = ' || somevalue (where somevalue is likely to be
> different on every call)), the shared cache doesn't help and its overhead
> becomes significant.
This is very interesting. We have always been concerned that shared
cache invalidation could cause more of a performance problem than the
shared cache gives benefit, and it sounds like you are saying exactly
that.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From pgsql-hackers-owner+M21238@postgresql.org Fri Apr 12 12:51:55 2002
Return-path: <pgsql-hackers-owner+M21238@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CGptS13119
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 12:51:55 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id C599D475BC6; Fri, 12 Apr 2002 12:51:47 -0400 (EDT)
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
by postgresql.org (Postfix) with ESMTP id C9F94475892
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 12:51:26 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g3CGpQF27967;
Fri, 12 Apr 2002 12:51:27 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Karel Zak <zakkr@zf.jcu.cz>, pgsql-hackers@postgresql.org,
Neil Conway <nconway@klamath.dyndns.org>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <200204121621.g3CGL4310492@candle.pha.pa.us>
References: <200204121621.g3CGL4310492@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Fri, 12 Apr 2002 12:21:04 -0400"
Date: Fri, 12 Apr 2002 12:51:26 -0400
Message-ID: <27964.1018630286@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Certainly a shared cache would be good for apps that connect to issue a
> single query frequently. In such cases, there would be no local cache
> to use.
We have enough other problems with the single-query-per-connection
scenario that I see no reason to believe that a shared plan cache will
help materially. The correct answer for those folks will *always* be
to find a way to reuse the connection.
regards, tom lane
From pgsql-hackers-owner+M21241@postgresql.org Fri Apr 12 16:25:46 2002
Return-path: <pgsql-hackers-owner+M21241@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CKPkS03078
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 16:25:46 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 9C3BD475CC6; Fri, 12 Apr 2002 16:25:42 -0400 (EDT)
Received: from klamath.dyndns.org (CPE002078144ae0.cpe.net.cable.rogers.com [24.102.202.35])
by postgresql.org (Postfix) with ESMTP id B06D8475909
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 16:24:52 -0400 (EDT)
Received: from jiro (jiro [192.168.40.7])
by klamath.dyndns.org (Postfix) with SMTP
id C05557013; Fri, 12 Apr 2002 16:24:53 -0400 (EDT)
Date: Fri, 12 Apr 2002 16:24:48 -0400
From: Neil Conway <nconway@klamath.dyndns.org>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
cc: tgl@sss.pgh.pa.us, zakkr@zf.jcu.cz, pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule
Message-ID: <20020412162448.4d46d747.nconway@klamath.dyndns.org>
In-Reply-To: <200204121621.g3CGL4310492@candle.pha.pa.us>
References: <27235.1018620866@sss.pgh.pa.us>
<200204121621.g3CGL4310492@candle.pha.pa.us>
X-Mailer: Sylpheed version 0.7.4 (GTK+ 1.2.10; i386-debian-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: ORr
On Fri, 12 Apr 2002 12:21:04 -0400 (EDT)
"Bruce Momjian" <pgman@candle.pha.pa.us> wrote:
> Tom Lane wrote:
> > A per-backend cache kept in local memory avoids all of these problems,
> > and I have seen no numbers to make me think that a shared plan cache
> > would achieve significantly more performance benefit than a local one.
>
> Certainly a shared cache would be good for apps that connect to issue a
> single query frequently. In such cases, there would be no local cache
> to use.
One problem with this kind of scenario is: what to do if the plan no
longer exists for some reason? (e.g. the code that was supposed to be
PREPARE-ing your statements failed to execute properly, or the cached
plan has been evicted from shared memory, or the database was restarted,
etc.) -- EXECUTE in and of itself won't have enough information to do
anything useful. We could perhaps provide a means for an application
to test for the existence of a cached plan (in which case the
application developer will need to add logic to their application
to re-prepare the query if necessary, which could get complicated).
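The re-prepare logic described above might look roughly like the following on the application side. Everything here is hypothetical: `ToyServer`, `PlanMissing`, and `execute_with_reprepare` are stand-ins invented for the sketch, and a real driver would report a missing plan in its own way.

```python
class PlanMissing(Exception):
    """Raised by the toy server when EXECUTE names an unknown plan."""


class ToyServer:
    """Stand-in for a server whose cached plans can disappear."""

    def __init__(self):
        self.plans = {}

    def prepare(self, name, sql):
        self.plans[name] = sql

    def execute(self, name, params):
        if name not in self.plans:
            raise PlanMissing(name)
        return (self.plans[name], params)

    def restart(self):
        self.plans.clear()  # a restart or eviction loses the cached plans


# The application must keep the statement text around to be able to re-prepare.
STATEMENTS = {"sel": "SELECT * FROM pg_class WHERE relname ~~ $1"}


def execute_with_reprepare(server, name, params):
    try:
        return server.execute(name, params)
    except PlanMissing:
        server.prepare(name, STATEMENTS[name])
        return server.execute(name, params)


server = ToyServer()
first = execute_with_reprepare(server, "sel", ("pg%",))   # prepares on demand
server.restart()
second = execute_with_reprepare(server, "sel", ("pg%",))  # prepares again
```

Note that the application ends up carrying the full SQL text anyway, which is the complication the paragraph points out.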
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
From pgsql-hackers-owner+M21242@postgresql.org Fri Apr 12 17:27:24 2002
Return-path: <pgsql-hackers-owner+M21242@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CLRNS14410
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 17:27:23 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id E05A1475D30; Fri, 12 Apr 2002 17:26:40 -0400 (EDT)
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
by postgresql.org (Postfix) with ESMTP id 36BBB475858
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 17:25:44 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.11.6/8.10.1) id g3CLPVa14231;
Fri, 12 Apr 2002 17:25:31 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200204122125.g3CLPVa14231@candle.pha.pa.us>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <20020412162448.4d46d747.nconway@klamath.dyndns.org>
To: Neil Conway <nconway@klamath.dyndns.org>
Date: Fri, 12 Apr 2002 17:25:31 -0400 (EDT)
cc: tgl@sss.pgh.pa.us, zakkr@zf.jcu.cz, pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL97 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
Neil Conway wrote:
> On Fri, 12 Apr 2002 12:21:04 -0400 (EDT)
> "Bruce Momjian" <pgman@candle.pha.pa.us> wrote:
> > Tom Lane wrote:
> > > A per-backend cache kept in local memory avoids all of these problems,
> > > and I have seen no numbers to make me think that a shared plan cache
> > > would achieve significantly more performance benefit than a local one.
> >
> > Certainly a shared cache would be good for apps that connect to issue a
> > single query frequently. In such cases, there would be no local cache
> > to use.
>
> One problem with this kind of scenario is: what to do if the plan no
> longer exists for some reason? (e.g. the code that was supposed to be
> PREPARE-ing your statements failed to execute properly, or the cached
> plan has been evicted from shared memory, or the database was restarted,
> etc.) -- EXECUTE in and of itself won't have enough information to do
> anything useful. We could perhaps provide a means for an application
> to test for the existence of a cached plan (in which case the
> application developer will need to add logic to their application
> to re-prepare the query if necessary, which could get complicated).
Oh, are you thinking that one backend would do the PREPARE and another
one the EXECUTE? I can't see that working at all. I thought there
would be some way to quickly test if the submitted query was in the cache,
but maybe that is too much of a performance penalty to be worth it.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From tgl@sss.pgh.pa.us Fri Apr 12 17:36:17 2002
Return-path: <tgl@sss.pgh.pa.us>
Received: from sss.pgh.pa.us (root@[192.204.191.242])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CLaGS16061
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 17:36:17 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g3CLaGF10813;
Fri, 12 Apr 2002 17:36:16 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Neil Conway <nconway@klamath.dyndns.org>, zakkr@zf.jcu.cz,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <200204122125.g3CLPVa14231@candle.pha.pa.us>
References: <200204122125.g3CLPVa14231@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Fri, 12 Apr 2002 17:25:31 -0400"
Date: Fri, 12 Apr 2002 17:36:16 -0400
Message-ID: <10810.1018647376@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ORr
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Oh, are you thinking that one backend would do the PREPARE and another
> one the EXECUTE? I can't see that working at all.
Uh, why exactly were you advocating a shared cache then? Wouldn't that
be exactly the *point* of a shared cache?
regards, tom lane
From pgsql-hackers-owner+M21245@postgresql.org Fri Apr 12 17:39:13 2002
Return-path: <pgsql-hackers-owner+M21245@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CLdCS16515
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 17:39:12 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id A904B475E15; Fri, 12 Apr 2002 17:39:09 -0400 (EDT)
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
by postgresql.org (Postfix) with ESMTP id B1A3F4758DE
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 17:38:25 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.11.6/8.10.1) id g3CLcFX16347;
Fri, 12 Apr 2002 17:38:15 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200204122138.g3CLcFX16347@candle.pha.pa.us>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <10810.1018647376@sss.pgh.pa.us>
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 12 Apr 2002 17:38:15 -0400 (EDT)
cc: Neil Conway <nconway@klamath.dyndns.org>, zakkr@zf.jcu.cz,
pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL97 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, are you thinking that one backend would do the PREPARE and another
> > one the EXECUTE? I can't see that working at all.
>
> Uh, why exactly were you advocating a shared cache then? Wouldn't that
> be exactly the *point* of a shared cache?
I thought it would somehow compare the SQL query string to the cached
plans and if it matched, it would use that plan rather than make a new
one. Any DDL statement would flush the cache.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From pgsql-hackers-owner+M21246@postgresql.org Fri Apr 12 17:56:58 2002
Return-path: <pgsql-hackers-owner+M21246@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3CLuvS19021
for <pgman@candle.pha.pa.us>; Fri, 12 Apr 2002 17:56:58 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 1B4D6475E2C; Fri, 12 Apr 2002 17:56:55 -0400 (EDT)
Received: from voyager.corporate.connx.com (unknown [209.20.248.131])
by postgresql.org (Postfix) with ESMTP id 059F1475858
for <pgsql-hackers@postgresql.org>; Fri, 12 Apr 2002 17:56:13 -0400 (EDT)
X-MimeOLE: Produced By Microsoft Exchange V6.0.4712.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Subject: Re: [HACKERS] 7.3 schedule
Date: Fri, 12 Apr 2002 14:59:15 -0700
Message-ID: <D90A5A6C612A39408103E6ECDD77B82906F42C@voyager.corporate.connx.com>
Thread-Topic: [HACKERS] 7.3 schedule
Thread-Index: AcHia2aODSpgXEd4Tluz/N0jN5fJOQAAC//w
From: "Dann Corbit" <DCorbit@connx.com>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>, "Tom Lane" <tgl@sss.pgh.pa.us>
cc: "Neil Conway" <nconway@klamath.dyndns.org>, <zakkr@zf.jcu.cz>,
<pgsql-hackers@postgresql.org>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id g3CLuvS19021
Status: OR
-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Friday, April 12, 2002 2:38 PM
To: Tom Lane
Cc: Neil Conway; zakkr@zf.jcu.cz; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Oh, are you thinking that one backend would do the PREPARE and another
> > one the EXECUTE? I can't see that working at all.
>
> Uh, why exactly were you advocating a shared cache then? Wouldn't that
> be exactly the *point* of a shared cache?
I thought it would somehow compare the SQL query string to the cached
plans and if it matched, it would use that plan rather than make a new
one. Any DDL statement would flush the cache.
>>-------------------------------------------------------------------
Many applications will have similar queries coming from lots of
different end-users. Imagine an order-entry program where people are
ordering parts. Many of the queries might look like this:
SELECT part_number FROM parts WHERE part_id = 12324 AND part_cost < 12.95
In order to cache this query, we first parse it to replace the data
fields with parameter markers.
Then it looks like this:
SELECT part_number FROM parts WHERE part_id = ? AND part_cost < ?
{in the case of a 'LIKE' query or some other query where you can use
key information, you might have a symbolic replacement like this:
WHERE field LIKE '{D}%' to indicate that the key can be used}
Then, we make sure that the case is consistent by either capitalizing
the whole query or changing it all into lower case:
select part_number from parts where part_id = ? and part_cost < ?
Then, we run a checksum on the parameterized string.
The checksum might be used as a hash table key, where we keep some
additional information like how stale the entry is, and a pointer to
the actual parameterized SQL (if two different queries collide on the
same hash key, the stored SQL text lets us detect that, since running
the wrong cached query would simply be incorrect).
Now, if there are a huge number of users of the same application, it
makes sense that the probability of reusing queries goes up with
the number of users of the same application. Therefore, I would
advocate that the cache be kept in shared memory.
Consider a single application with 100 different queries. Now, add
one user, ten users, 100 users, ... 10,000 users and you can see
that the benefit would be greater and greater as we add users.
<<-------------------------------------------------------------------
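The steps described in the message above (replace literals with markers, fold case, checksum, then look up in a hash table that also keeps the parameterized text) can be sketched in Python. This is a simplified illustration, not a real SQL parser: the regular expressions, the choice of `zlib.crc32` as the checksum, and the names `normalize` and `QueryCache` are all assumptions, and the staleness bookkeeping is left out.

```python
import re
import zlib

# Naive literal matchers; a real implementation would use the SQL parser.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def normalize(sql):
    """Replace literals with '?', collapse whitespace, and fold to lower case."""
    text = _STRING.sub("?", sql)
    text = _NUMBER.sub("?", text)
    return " ".join(text.split()).lower()


class QueryCache:
    def __init__(self):
        self._buckets = {}  # checksum -> list of (parameterized SQL, plan)

    def get_or_plan(self, sql, planner):
        """Return (plan, was_cached) for the given statement."""
        key_text = normalize(sql)
        checksum = zlib.crc32(key_text.encode("utf-8"))
        bucket = self._buckets.setdefault(checksum, [])
        for stored_text, plan in bucket:
            if stored_text == key_text:  # guards against checksum collisions
                return plan, True
        plan = planner(key_text)
        bucket.append((key_text, plan))
        return plan, False


cache = QueryCache()
toy_planner = lambda text: f"plan<{text}>"  # stand-in for the real planner
q1 = "SELECT part_number FROM parts WHERE part_id = 12324 AND part_cost < 12.95"
q2 = "select part_number from parts where part_id = 777 and part_cost < 3.50"
print(normalize(q1))
print(cache.get_or_plan(q1, toy_planner)[1])  # False: first time, planned
print(cache.get_or_plan(q2, toy_planner)[1])  # True: same parameterized text
```

Comparing the stored text on every lookup is what makes a checksum collision harmless: a colliding query simply lands in the same bucket as a separate entry.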
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
From pgsql-hackers-owner+M21270@postgresql.org Sat Apr 13 02:30:47 2002
Return-path: <pgsql-hackers-owner+M21270@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3D6UkS07169
for <pgman@candle.pha.pa.us>; Sat, 13 Apr 2002 02:30:46 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 23FEC475D1E; Sat, 13 Apr 2002 02:30:38 -0400 (EDT)
Received: from mail.iinet.net.au (symphony-01.iinet.net.au [203.59.3.33])
by postgresql.org (Postfix) with SMTP id A08A4475C6C
for <pgsql-hackers@postgresql.org>; Sat, 13 Apr 2002 02:29:37 -0400 (EDT)
Received: (qmail 11594 invoked by uid 666); 13 Apr 2002 06:29:36 -0000
Received: from unknown (HELO SOL) (203.59.103.193)
by mail.iinet.net.au with SMTP; 13 Apr 2002 06:29:36 -0000
Message-ID: <002301c1e2b3$804bd000$0200a8c0@SOL>
From: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
To: "Barry Lind" <barry@xythos.com>, "Tom Lane" <tgl@sss.pgh.pa.us>
cc: "Karel Zak" <zakkr@zf.jcu.cz>, <pgsql-hackers@postgresql.org>,
"Neil Conway" <nconway@klamath.dyndns.org>
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au> <3CB52C54.4020507@freaky-namuh.com> <20020411115434.201ff92f.nconway@klamath.dyndns.org> <3CB61DAB.5010601@freaky-namuh.com> <24184.1018581907@sss.pgh.pa.us> <3CB65B49.93F2F790@tpf.co.jp> <20020412004134.5d35a2dd.nconway@klamath.dyndns.org> <20020412095116.B6370@zf.jcu.cz> <27235.1018620866@sss.pgh.pa.us> <3CB70E7C.3090801@xythos.com>
Subject: Re: [HACKERS] 7.3 schedule
Date: Sat, 13 Apr 2002 14:21:50 +0800
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
> > thought out way of predicting/limiting their size. (2) How the heck do
> > you get rid of obsoleted cached plans, if the things stick around in
> > shared memory even after you start a new backend? (3) A shared cache
> > requires locking; contention among multiple backends to access that
> > shared resource could negate whatever performance benefit you might hope
> > to realize from it.
I don't understand all these locking problems? Surely the only lock a
transaction would need on a stored query is one that prevents the cache
invalidation mechanism from deleting it out from under it? Surely this
means that there would be tonnes of readers on the cache - none of them
blocking each other, and the odd invalidation event that needs a complete
lock?
Also, as for invalidation, there probably could be just two reasons to
invalidate a query in the cache. (1) The cache is running out of space and
you use LRU or something to remove old queries, or (2) someone runs ANALYZE,
in which case all cached queries should just be flushed? If they specify an
actual table to analyze, then just drop all queries on the table.
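The two invalidation rules suggested here (evict by LRU when space runs out, flush on ANALYZE, optionally per table) could be modelled like this. It is a single-process Python sketch with no locking at all; `LruPlanCache` and its methods are hypothetical names, and tracking the tables each query depends on is assumed to be supplied by the caller.

```python
from collections import OrderedDict


class LruPlanCache:
    """Toy plan cache with LRU eviction and per-table invalidation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # sql -> (plan, frozenset of table names)

    def put(self, sql, plan, tables):
        self._entries[sql] = (plan, frozenset(tables))
        self._entries.move_to_end(sql)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used

    def get(self, sql):
        entry = self._entries.get(sql)
        if entry is None:
            return None
        self._entries.move_to_end(sql)  # mark as recently used
        return entry[0]

    def analyze(self, table=None):
        """ANALYZE with no table flushes everything; with a table, only its queries."""
        if table is None:
            self._entries.clear()
            return
        stale = [sql for sql, (_, tables) in self._entries.items() if table in tables]
        for sql in stale:
            del self._entries[sql]

    def __len__(self):
        return len(self._entries)


cache = LruPlanCache(capacity=2)
cache.put("q1", "plan1", ["parts"])
cache.put("q2", "plan2", ["orders"])
cache.get("q1")                                # q1 is now the most recent
cache.put("q3", "plan3", ["parts", "orders"])  # evicts q2
cache.analyze("orders")                        # drops q3, keeps q1
print(len(cache))  # 1
```

In a shared-memory version each of these methods would have to run under a lock, and `get` itself modifies the LRU order, so even pure readers would not be entirely free of contention.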
Could this cache mechanism be used to make views fast as well? You could
cache the queries that back views on first use, and then they can follow the
above rules for flushing...
Chris
From pgsql-hackers-owner+M21276@postgresql.org Sat Apr 13 11:48:51 2002
Return-path: <pgsql-hackers-owner+M21276@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3DFmoS27879
for <pgman@candle.pha.pa.us>; Sat, 13 Apr 2002 11:48:51 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 9EB81475C5C; Sat, 13 Apr 2002 11:46:52 -0400 (EDT)
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
by postgresql.org (Postfix) with ESMTP id 0FE0B474E78
for <pgsql-hackers@postgresql.org>; Sat, 13 Apr 2002 11:46:09 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g3DFk2F15743;
Sat, 13 Apr 2002 11:46:02 -0400 (EDT)
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
cc: "Barry Lind" <barry@xythos.com>, "Karel Zak" <zakkr@zf.jcu.cz>,
pgsql-hackers@postgresql.org, "Neil Conway" <nconway@klamath.dyndns.org>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <002301c1e2b3$804bd000$0200a8c0@SOL>
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au> <3CB52C54.4020507@freaky-namuh.com> <20020411115434.201ff92f.nconway@klamath.dyndns.org> <3CB61DAB.5010601@freaky-namuh.com> <24184.1018581907@sss.pgh.pa.us> <3CB65B49.93F2F790@tpf.co.jp> <20020412004134.5d35a2dd.nconway@klamath.dyndns.org> <20020412095116.B6370@zf.jcu.cz> <27235.1018620866@sss.pgh.pa.us> <3CB70E7C.3090801@xythos.com> <002301c1e2b3$804bd000$0200a8c0@SOL>
Comments: In-reply-to "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
message dated "Sat, 13 Apr 2002 14:21:50 +0800"
Date: Sat, 13 Apr 2002 11:46:01 -0400
Message-ID: <15740.1018712761@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> thought out way of predicting/limiting their size. (2) How the heck do
> you get rid of obsoleted cached plans, if the things stick around in
> shared memory even after you start a new backend? (3) A shared cache
> requires locking; contention among multiple backends to access that
> shared resource could negate whatever performance benefit you might hope
> to realize from it.
> I don't understand all these locking problems?
Searching the cache and inserting/deleting entries in the cache probably
have to be mutually exclusive; concurrent insertions probably won't work
either (at least not without a remarkably intelligent data structure).
Unless the cache hit rate is remarkably high, there are going to be lots
of insertions --- and, at steady state, an equal rate of deletions ---
leading to lots of contention.
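The steady-state claim can be illustrated with a small simulation. This is only a sketch, assuming an LRU-bounded cache and a made-up query stream; it counts the operations that would need an exclusive lock, not actual lock contention.

```python
from collections import OrderedDict


def simulate(capacity, queries):
    """Count hits, insertions and deletions on a bounded LRU cache."""
    cache = OrderedDict()
    hits = insertions = deletions = 0
    for q in queries:
        if q in cache:
            cache.move_to_end(q)
            hits += 1
            continue
        cache[q] = True
        insertions += 1
        if len(cache) > capacity:
            cache.popitem(last=False)  # cache is full: every insert evicts
            deletions += 1
    return hits, insertions, deletions


# Hypothetical workload: 1000 lookups cycling over 200 distinct queries,
# with room for only 100 cached plans.
stream = [f"query {i % 200}" for i in range(1000)]
hits, insertions, deletions = simulate(100, stream)
```

With this workload every lookup misses, so once the cache has filled, each insertion is paired with exactly one deletion.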
This could possibly be avoided if the cache is not used for all query
plans but only for explicitly PREPAREd plans, so that only explicit
EXECUTEs would need to search it. But that approach also makes a
sizable dent in the usefulness of the cache to begin with.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From pgsql-hackers-owner+M21280@postgresql.org Sat Apr 13 14:36:34 2002
Return-path: <pgsql-hackers-owner+M21280@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3DIaYS10293
for <pgman@candle.pha.pa.us>; Sat, 13 Apr 2002 14:36:34 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id AA151475BB1; Sat, 13 Apr 2002 14:36:17 -0400 (EDT)
Received: from klamath.dyndns.org (CPE002078144ae0.cpe.net.cable.rogers.com [24.102.202.35])
by postgresql.org (Postfix) with ESMTP id 42993475BCB
for <pgsql-hackers@postgresql.org>; Sat, 13 Apr 2002 14:35:42 -0400 (EDT)
Received: from jiro (jiro [192.168.40.7])
by klamath.dyndns.org (Postfix) with SMTP
id 82B84700C; Sat, 13 Apr 2002 14:35:42 -0400 (EDT)
Date: Sat, 13 Apr 2002 14:35:39 -0400
From: Neil Conway <nconway@klamath.dyndns.org>
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
cc: barry@xythos.com, tgl@sss.pgh.pa.us, zakkr@zf.jcu.cz,
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] 7.3 schedule
Message-ID: <20020413143539.7818bf7d.nconway@klamath.dyndns.org>
In-Reply-To: <002301c1e2b3$804bd000$0200a8c0@SOL>
References: <GNELIHDDFBOCMGBFGEFOGEBHCCAA.chriskl@familyhealth.com.au>
<3CB52C54.4020507@freaky-namuh.com>
<20020411115434.201ff92f.nconway@klamath.dyndns.org>
<3CB61DAB.5010601@freaky-namuh.com>
<24184.1018581907@sss.pgh.pa.us>
<3CB65B49.93F2F790@tpf.co.jp>
<20020412004134.5d35a2dd.nconway@klamath.dyndns.org>
<20020412095116.B6370@zf.jcu.cz>
<27235.1018620866@sss.pgh.pa.us>
<3CB70E7C.3090801@xythos.com>
<002301c1e2b3$804bd000$0200a8c0@SOL>
X-Mailer: Sylpheed version 0.7.4 (GTK+ 1.2.10; i386-debian-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Sat, 13 Apr 2002 14:21:50 +0800
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> wrote:
> Could this cache mechanism be used to make views fast as well?
The current PREPARE/EXECUTE code will speed up queries that use
rules of any kind, including views: the query plan is cached after
it has been rewritten as necessary, so (AFAIK) this should mean
that rules will be evaluated once when the query is PREPAREd, and
then cached for subsequent EXECUTE commands.
Cheers,
Neil
--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
From pgsql-hackers-owner+M21309@postgresql.org Sun Apr 14 15:22:44 2002
Return-path: <pgsql-hackers-owner+M21309@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3EJMiS24239
for <pgman@candle.pha.pa.us>; Sun, 14 Apr 2002 15:22:44 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 44BAC475E05; Sun, 14 Apr 2002 15:22:42 -0400 (EDT)
Received: from ara.zf.jcu.cz (ara.zf.jcu.cz [160.217.161.4])
by postgresql.org (Postfix) with ESMTP id 3CD03475925
for <pgsql-hackers@postgresql.org>; Sun, 14 Apr 2002 15:21:58 -0400 (EDT)
Received: from ara.zf.jcu.cz (LOCALHOST [127.0.0.1])
by ara.zf.jcu.cz (8.12.1/8.12.1/Debian -5) with ESMTP id g3EJLiBK012612;
Sun, 14 Apr 2002 21:21:44 +0200
Received: (from zakkr@localhost)
by ara.zf.jcu.cz (8.12.1/8.12.1/Debian -5) id g3EJLi3k012611;
Sun, 14 Apr 2002 21:21:44 +0200
Date: Sun, 14 Apr 2002 21:21:44 +0200
From: Karel Zak <zakkr@zf.jcu.cz>
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: Bruce Momjian <pgman@candle.pha.pa.us>, pgsql-hackers@postgresql.org,
Neil Conway <nconway@klamath.dyndns.org>
Subject: Re: [HACKERS] 7.3 schedule
Message-ID: <20020414212144.A12196@zf.jcu.cz>
References: <200204121621.g3CGL4310492@candle.pha.pa.us> <27964.1018630286@sss.pgh.pa.us>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <27964.1018630286@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Fri, Apr 12, 2002 at 12:51:26PM -0400
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On Fri, Apr 12, 2002 at 12:51:26PM -0400, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Certainly a shared cache would be good for apps that connect to issue a
> > single query frequently. In such cases, there would be no local cache
> > to use.
>
> We have enough other problems with the single-query-per-connection
> scenario that I see no reason to believe that a shared plan cache will
> help materially. The correct answer for those folks will *always* be
> to find a way to reuse the connection.
My query cache was written for 7.0. If some future release uses
pre-forked backends, where a backend stays alive after a client
disconnects and waits for a new client, the shared cache is (maybe :-)
not needed. The current backend fork model is the killer of all possible
caching.
We have more caches. I hope persistent backends will help all of them,
and I'm sure that speed will improve with persistent backends and
persistent caches without shared memory usage. There I can agree with
Tom :-)
Karel
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
From pgsql-hackers-owner+M21321@postgresql.org Sun Apr 14 20:40:08 2002
Return-path: <pgsql-hackers-owner+M21321@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3F0e7S29723
for <pgman@candle.pha.pa.us>; Sun, 14 Apr 2002 20:40:07 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 3B5FB475DC5; Sun, 14 Apr 2002 20:40:03 -0400 (EDT)
Received: from localhost.localdomain (bgp01077650bgs.wanarb01.mi.comcast.net [68.40.135.112])
by postgresql.org (Postfix) with ESMTP id 7B1D3474E71
for <pgsql-hackers@postgresql.org>; Sun, 14 Apr 2002 20:39:18 -0400 (EDT)
Received: from localhost (camber@localhost)
by localhost.localdomain (8.11.6/8.11.6) with ESMTP id g3F0cmD10631;
Sun, 14 Apr 2002 20:38:48 -0400
X-Authentication-Warning: localhost.localdomain: camber owned process doing -bs
Date: Sun, 14 Apr 2002 20:38:48 -0400 (EDT)
From: Brian Bruns <camber@ais.org>
X-X-Sender: <camber@localhost.localdomain>
To: Hannu Krosing <hannu@tm.ee>
cc: <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] 7.3 schedule
In-Reply-To: <1018704763.1784.1.camel@taru.tm.ee>
Message-ID: <Pine.LNX.4.33.0204142027180.9523-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
On 13 Apr 2002, Hannu Krosing wrote:
> On Fri, 2002-04-12 at 03:04, Brian Bruns wrote:
> > On 11 Apr 2002, Hannu Krosing wrote:
> >
> > > IIRC someone started work on modularising the network-related parts with
> > > a goal of supporting DRDA (DB2 protocol) and others in future.
> >
> > That was me, although I've been bogged down lately, and haven't been able
> > to get back to it.
>
> Has any of your modularisation work got into CVS yet ?
No, Bruce didn't like the way I did certain things, and had some qualms
about the value of supporting multiple wire protocols IIRC. Plus the
patch was not really ready for primetime yet.
I'm hoping to get back to it soon and sync it with the latest CVS, and
clean up the odds and ends.
> > DRDA, btw, is not just a DB2 protocol but an Open Group
> > spec that hopefully will someday be *the* standard on-the-wire database
> > protocol. DRDA handles prepare/execute and is completely binary in
> > representation, among other advantages.
>
> What about extensibility - is there some predefined way of adding new
> types ?
Not really, there is some ongoing standards activity adding some new
features. The list of supported types is pretty impressive, anything in
particular you are looking for?
> Also, does it handle NOTIFY ?
I don't know the answer to this. The spec is pretty huge, so it may, but
I haven't seen it.
Even if it is supported as a secondary protocol, I believe there is a lot
of value in having a single database protocol standard (why else would I
be doing it!). I'm also looking into what it will take to do the same for
MySQL and Firebird. Hopefully they will be receptive to the idea as well.
> ----------------
> Hannu
Cheers,
Brian
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
From pgsql-hackers-owner+M1833@hub.org Sat May 13 22:49:26 2000
Received: from news.tht.net (news.hub.org [216.126.91.242])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07394
for <pgman@candle.pha.pa.us>; Sat, 13 May 2000 22:49:24 -0400 (EDT)
Received: from hub.org (majordom@hub.org [216.126.84.1])
by news.tht.net (8.9.3/8.9.3) with ESMTP id WAB99859;
Sat, 13 May 2000 22:44:15 -0400 (EDT)
(envelope-from pgsql-hackers-owner+M1833@hub.org)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by hub.org (8.9.3/8.9.3) with ESMTP id WAA51058
for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:41:16 -0400 (EDT)
(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA18343
for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:40:38 -0400 (EDT)
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Proposal for fixing numeric type-resolution issues
Date: Sat, 13 May 2000 22:40:38 -0400
Message-ID: <18340.958272038@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ORr
We've got a collection of problems that are related to the parser's
inability to make good type-resolution choices for numeric constants.
In some cases you get a hard error; for example "NumericVar + 4.4"
yields
ERROR: Unable to identify an operator '+' for types 'numeric' and 'float8'
You will have to retype this query using an explicit cast
because "4.4" is initially typed as float8 and the system can't figure
out whether to use numeric or float8 addition. A more subtle problem
is that a query like "... WHERE Int2Var < 42" is unable to make use of
an index on the int2 column: 42 is resolved as int4, so the operator
is int24lt, which works but is not in the opclass of an int2 index.
Here is a proposal for fixing these problems. I think we could get this
done for 7.1 if people like it.
The basic problem is that there's not enough smarts in the type resolver
about the interrelationships of the numeric datatypes. All it has is
a concept of a most-preferred type within the category of numeric types.
(We are abusing the most-preferred-type mechanism, BTW, because both
FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
category! This is in fact why the resolver can't make a choice for
"numeric+float8".) We need more intelligence than that.
I propose that we set up a strictly-ordered hierarchy of numeric
datatypes, running from least preferred to most preferred:
int2, int4, int8, numeric, float4, float8.
Rather than simply considering coercions to the most-preferred type,
the type resolver should use the following rules:
1. No value will be down-converted (eg int4 to int2) except by an
explicit conversion.
2. If there is not an exact matching operator, numeric values will be
up-converted to the highest numeric datatype present among the operator
or function's arguments. For example, given "int2 + int8" we'd up-
convert the int2 to int8 and apply int8 addition.
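A minimal sketch of the two rules, assuming the operator catalog can be modelled as a set of type tuples. The function name and the `PLUS` catalog are hypothetical stand-ins for a pg_operator lookup, not the parser's actual code.

```python
# Hierarchy from the proposal, least preferred first.
HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}


def resolve_operand_types(arg_types, exact_operators):
    """Return the operand types an operator call should be resolved to."""
    args = tuple(arg_types)
    if args in exact_operators:
        return args  # an exact match wins; nothing is converted
    highest = max(args, key=RANK.__getitem__)
    # Rule 2: up-convert every operand to the highest type present.
    # Rule 1 holds automatically, since no operand ever moves down the list.
    return tuple(highest for _ in args)


# Assumption: only same-type "+" operators exist in the catalog.
PLUS = {(t, t) for t in HIERARCHY}
print(resolve_operand_types(("int2", "int8"), PLUS))
print(resolve_operand_types(("numeric", "float8"), PLUS))
```

The first call resolves to int8 addition and the second to float8 addition, matching the "int2 + int8" example above.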
The final piece of the puzzle is that the type initially assigned to
an undecorated numeric constant should be NUMERIC if it contains a
decimal point or exponent, and otherwise the smallest of int2, int4,
int8, NUMERIC that will represent it. This is a considerable change
from the current lexer behavior, where you get either int4 or float8.
For example, given "NumericVar + 4.4", the constant 4.4 will initially
be assigned type NUMERIC, we will resolve the operator as numeric plus,
and everything's fine. Given "Float8Var + 4.4", the constant is still
initially numeric, but will be up-converted to float8 so that float8
addition can be used. The end result is the same as in traditional
Postgres: you get float8 addition. Given "Int2Var < 42", the constant
is initially typed as int2, since it fits, and we end up selecting
int2lt, thereby allowing use of an int2 index. (On the other hand,
given "Int2Var < 100000", we'd end up using int4lt, which is correct
to avoid overflow.)
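The proposed initial typing of constants could be sketched as follows. The function name is hypothetical, and the sketch assumes the constant arrives as a plain string of digits, optionally with a decimal point or exponent.

```python
# Each integer type accepts values in [-limit, limit).
INT_LIMITS = [("int2", 2**15), ("int4", 2**31), ("int8", 2**63)]


def initial_literal_type(text):
    """Initial type for an undecorated numeric constant, per the proposal."""
    if any(ch in text for ch in ".eE"):
        return "numeric"  # decimal point or exponent present
    value = int(text)
    for name, limit in INT_LIMITS:
        if -limit <= value < limit:
            return name  # smallest integer type that can represent it
    return "numeric"  # too large even for int8


for literal in ["42", "100000", "4.4", "1.2E34"]:
    print(literal, initial_literal_type(literal))
```

Here "42" starts out as int2 and "100000" as int4, while "4.4" and "1.2E34" both start out as numeric.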
A couple of crucial subtleties here:
1. We are assuming that the parser or optimizer will constant-fold
any conversion functions that are introduced. Thus, in the
"Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
time execution begins, so there's no performance loss.
2. We cannot lose precision by initially representing a constant as
numeric and later converting it to float. Nor can we exceed NUMERIC's
range (the default 1000-digit limit is more than the range of IEEE
float8 data). It would not work as well to start out by representing
a constant as float and then converting it to numeric.
Presently, the pg_proc and pg_operator tables contain a pretty fair
collection of cross-datatype numeric operators, such as int24lt,
float48pl, etc. We could perhaps leave these in, but I believe that
it is better to remove them. For example, if int42lt is left in place,
then it would capture cases like "Int4Var < 42", whereas we need that
to be translated to int4lt so that an int4 index can be used. Removing
these operators will eliminate some code bloat and system-catalog bloat
to boot.
As far as I can tell, this proposal is almost compatible with the rules
given in SQL92: in particular, SQL92 specifies that an operator having
both "approximate numeric" (float) and "exact numeric" (int or numeric)
inputs should deliver an approximate-numeric result. I propose
deviating from SQL92 in a single respect: SQL92 specifies that a
constant containing an exponent (eg 1.2E34) is approximate numeric,
which implies that the result of an operator using it is approximate
even if the other operand is exact. I believe it's better to treat
such a constant as exact (ie, type NUMERIC) and only convert it to
float if the other operand is float. Without doing that, an assignment
like
UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
will not work as desired because the constant will be prematurely
coerced to float, causing precision loss.
Comments?
regards, tom lane
From tgl@sss.pgh.pa.us Sun May 14 17:30:56 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05808
for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:30:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.4 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA20914;
Sun, 14 May 2000 17:29:30 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] type conversion discussion
In-reply-to: <200005141950.PAA04636@candle.pha.pa.us>
References: <200005141950.PAA04636@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Sun, 14 May 2000 15:50:20 -0400"
Date: Sun, 14 May 2000 17:29:30 -0400
Message-ID: <20911.958339770@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: OR
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> At some point, it seems we need to get all the PostgreSQL minds together
> to discuss type conversion issues. These problems continue to come up
> from release to release. We are getting better, but it seems a full
> discussion could help solidify our strategy.
OK, here are a few things that bug me about the current type-resolution
code:
1. Poor choice of type to attribute to numeric literals. (A possible
solution is sketched in my earlier message, but do we need similar
mechanisms for other type categories?)
2. Tensions between treating string literals as "unknown" type and
as "text" type, per this thread so far.
3. IS_BINARY_COMPATIBLE seems like a bogus concept. Do we really want a
fully symmetrical ring of types in each group? I'd prefer to see a
one-way equivalence, which allows eg. OID to be silently converted
to INT4, but *not* vice versa (except perhaps by specific user cast).
This'd be more like a traditional "is-a" or inheritance relationship
between datatypes, which has well-understood semantics.
4. I'm also concerned that the behavior of IS_BINARY_COMPATIBLE isn't
very predictable because it will happily go either way. For example,
if I do
select * from pg_class where oid = 1234;
it's unclear whether I will get an oideq or an int4eq operator ---
and that's a rather critical point since only one of them can exploit
an index on the oid column. Currently, there is some klugery in the
planner that works around this by overriding the parser's choice of
operator to substitute one that is compatible with an available index.
That's a pretty ugly solution ... I'm not sure I know a better one,
but as long as we're discussing type resolution issues ...
5. Lack of extensibility. There's way too much knowledge hard-wired
into the parser about type categories, preferred types, binary
compatibility, etc. All of it falls down when faced with
user-defined datatypes. If we do something like I suggested with
a hardwired hierarchy of numeric datatypes, it'll get even worse.
All this stuff ought to be driven off fields in pg_type rather than
be hardwired into the code, so that the same concepts can be extended
to user-defined types.
I don't have worked-out proposals for any of these but the first,
but they've all been bothering me for a while.
regards, tom lane
From tgl@sss.pgh.pa.us Sun May 14 21:02:31 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07700
for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 21:02:28 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA21261;
Sun, 14 May 2000 21:03:17 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] type conversion discussion
In-reply-to: <20911.958339770@sss.pgh.pa.us>
References: <200005141950.PAA04636@candle.pha.pa.us> <20911.958339770@sss.pgh.pa.us>
Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
message dated "Sun, 14 May 2000 17:29:30 -0400"
Date: Sun, 14 May 2000 21:03:17 -0400
Message-ID: <21258.958352597@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: OR
Here are the results of some further thoughts about type-conversion
issues. This is not a complete proposal yet, but a sketch of an
approach that might solve several of the gripes in my previous proposal.
While thinking about this, I realized that my numeric-types proposal
of yesterday would break at least a few cases that work nicely now.
For example, I frequently do things like
select * from pg_class where oid = 1234;
whilst poking around in system tables and querytree dumps. If that
constant is initially resolved as int2, as I suggested yesterday,
then we have "oid = int2" for which there is no operator. To succeed
we must decide to promote the constant to int4 --- but with no int4
visible among the operands of the "=", it will not work to just "promote
numerics to the highest type seen in the operands" as I suggested
yesterday. So there has to be some more interaction in there.
Anyway, I was complaining about the looseness of the concept of
binary-compatible types and the fact that the parser's type conversion
knowledge is mostly hardwired. These might be resolved by generalizing
the numeric type hierarchy idea into a "type promotion lattice", which
would work like this:
* Add a "typpromote" column to pg_type, which contains either zero or
the OID of another type that the parser is allowed to promote this
type to when searching for usable functions/operators. For example,
my numeric-types hierarchy of yesterday would be expressed by making
int2 promote to int4, int4 to int8, int8 to numeric, numeric to
float4, and float4 to float8. The promotion idea also replaces the
current concept of binary-compatible types: for example, OID would
link to int4 and varchar would link to text (but not vice versa!).
* Also add a "typpromotebin" boolean column to pg_type, which contains
't' if the type conversion indicated by typpromote is "free", ie,
no conversion function need be executed before regarding a value as
belonging to the promoted type. This distinguishes binary-compatible
from non-binary-compatible cases. If "typpromotebin" is 'f' and the
parser decides it needs to apply the conversion, then it has to look
up the appropriate conversion function in pg_proc. (More about this
below.)
Now, if the parser fails to find an exact match for a given function
or operator name and the exact set of input data types, it proceeds by
chasing up the promotion chains for the input data types and trying to
locate a set of types for which there is a matching function/operator.
If there are multiple possibilities, we choose the one which is the
"least promoted" by some yet-to-be-determined metric. (This metric
would probably favor "free" conversions over non-free ones, but other
than that I'm not quite sure how it should work. The metric would
replace a whole bunch of ad-hoc heuristics that are currently applied
in the type resolver, so even if it seems rather ad-hoc it'd still be
cleaner than what we have ;-).)
In a situation like the "oid = int2" example above, this mechanism would
presumably settle on "int4 = int4" as being the least-promoted
equivalent operator. (It could not find "oid = oid" since there is
no promotion path from int2 to oid.) That looks bad since it isn't
compatible with an oidops index --- but I have a solution for that!
I don't think we need the oid opclass at all; why shouldn't indexes
on oid be expressed as int4 indexes to begin with? In general, if
two types are considered binary-equivalent under the old scheme, then
the one that is considered the subtype probably shouldn't have separate
index operators under this new scheme. Instead it should just rely on
the index operators of the promoted type.
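A rough sketch of the lookup described above. The "least promoted" metric is left undetermined in the text, so total promotion steps is used here purely as a placeholder assumption; the dictionary and function names are hypothetical.

```python
from itertools import product

# typpromote links as described above; None marks the top of a chain.
TYPPROMOTE = {
    "int2": "int4", "int4": "int8", "int8": "numeric",
    "numeric": "float4", "float4": "float8", "float8": None,
    "oid": "int4",
}


def promotion_chain(type_name):
    """Return [(type, steps)] for a type and everything it can promote to."""
    chain = []
    steps = 0
    while type_name is not None:
        chain.append((type_name, steps))
        type_name = TYPPROMOTE[type_name]
        steps += 1
    return chain


def resolve(arg_types, operators):
    """Pick the operator signature reachable with the fewest promotions.

    Placeholder metric: sum of promotion steps over all arguments.
    Ties keep the first candidate found.
    """
    best = None
    for combo in product(*(promotion_chain(t) for t in arg_types)):
        signature = tuple(name for name, _ in combo)
        cost = sum(steps for _, steps in combo)
        if signature in operators and (best is None or cost < best[0]):
            best = (cost, signature)
    return None if best is None else best[1]


# Assumed catalog: "=" exists for same-type pairs only.
EQ = {(t, t) for t in TYPPROMOTE}
print(resolve(("oid", "int2"), EQ))
```

Under these assumptions "oid = int2" settles on "int4 = int4", since int2 has no promotion path to oid.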
The point of the proposed typpromotebin field is to save a pg_proc
lookup when trying to determine whether a particular promotion is "free"
or not. We could save even more lookups if we didn't store the boolean
but instead the actual OID of the conversion function, or zero if the
promotion is "free". The trouble with that is that it creates a
circularity problem when trying to define a new user type --- you can't
define the conversion function if its input type doesn't exist yet.
In any case, we want the parser to do a function lookup if we've
advanced more than one step in the promotion hierarchy: if we've decided
to promote int4 to float8 (which will be a four-step chain through int8,
numeric, float4) we sure want the thing to use a direct int4tofloat8
conversion function if available, not a chain of four conversion
functions. So on balance I think we want to look in pg_proc once we've
decided which conversion to perform. The only reason for having
typpromotebin is that the promotion metric will want to know which
conversions are free, and we don't want to have to do a lookup in
pg_proc for each alternative we consider, only the ones that are finally
selected to be used.
I can think of at least one special case that still isn't cleanly
handled under this scheme, and that is bpchar vs. varchar comparison.
Currently, we have
regression=# select 'a'::bpchar = 'a '::bpchar;
?column?
----------
t
(1 row)
This is correct since trailing blanks are insignificant in bpchar land,
so the two values should be considered equal. If we try
regression=# select 'a'::bpchar = 'a '::varchar;
ERROR: Unable to identify an operator '=' for types 'bpchar' and 'varchar'
You will have to retype this query using an explicit cast
which is pretty bogus but at least it saves the system from making some
random choice about whether bpchar or varchar comparison rules apply.
On the other hand,
regression=# select 'a'::bpchar = 'a '::text;
?column?
----------
f
(1 row)
Here the bpchar value has been promoted to text and then text comparison
(where trailing blanks *are* significant) is applied. I'm not sure that
we can really justify doing this in this case when we reject the bpchar
vs varchar case, but maybe someone wants to argue that that's correct.
The natural setup in my type-promotion scheme would be that both bpchar
and varchar link to 'text' as their promoted type. If we do nothing
special then text-style comparison would be used in a bpchar vs varchar
comparison, which is arguably wrong.
One way to deal with this without introducing kluges into the type
resolver is to provide a full set of bpchar vs text and text vs bpchar
operators, and make sure that the promotion metric is such that these
will be used in place of text vs text operators if they apply (which
should hold, I think, for any reasonable metric). This is probably
the only way to get the "right" behavior in any case --- I think that
the "right" behavior for such comparisons is to strip trailing blanks
from the bpchar side but not the text/varchar side. (I haven't checked
to see if SQL92 agrees, though.)
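The comparison semantics suggested above could be sketched like this. The function names are hypothetical, Python strings stand in for the SQL values, and, as noted, this behavior has not been checked against SQL92.

```python
def bpchar_eq_bpchar(left, right):
    """bpchar vs bpchar: trailing blanks are insignificant on both sides."""
    return left.rstrip(" ") == right.rstrip(" ")


def bpchar_eq_text(bpchar_value, text_value):
    """bpchar vs text: strip trailing blanks from the bpchar side only."""
    return bpchar_value.rstrip(" ") == text_value


print(bpchar_eq_bpchar("a", "a "))
print(bpchar_eq_text("a ", "a"))
print(bpchar_eq_text("a", "a "))
```

The first two comparisons succeed; the third fails because the trailing blank on the text side is significant.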
Another issue is how to fit resolution of "unknown" literals into this
scheme. We could probably continue to handle them more or less as we
do now, but they might complicate the promotion metric.
I am not clear yet on whether we'd still need the concept of "type
categories" as they presently exist in the resolver. It's possible
that we wouldn't, which would be a nice simplification. (If we do
still need them, we should have a column in pg_type that defines the
category of a type, instead of hard-wiring category assignments.)
regards, tom lane
From e99re41@DoCS.UU.SE Mon May 15 07:39:03 2000
Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA10251
for <pgman@candle.pha.pa.us>; Mon, 15 May 2000 07:39:01 -0400 (EDT)
Received: from Zebra.DoCS.UU.SE (e99re41@Zebra.DoCS.UU.SE [130.238.9.158])
by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id NAA10849;
Mon, 15 May 2000 13:39:45 +0200 (MET DST)
Received: from localhost (e99re41@localhost) by Zebra.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id NAA26523; Mon, 15 May 2000 13:39:44 +0200
X-Authentication-Warning: Zebra.DoCS.UU.SE: e99re41 owned process doing -bs
Date: Mon, 15 May 2000 13:39:44 +0200 (MET DST)
From: Peter Eisentraut <e99re41@DoCS.UU.SE>
Reply-To: Peter Eisentraut <peter_e@gmx.net>
To: Tom Lane <tgl@sss.pgh.pa.us>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] type conversion discussion
In-Reply-To: <20911.958339770@sss.pgh.pa.us>
Message-ID: <Pine.GSO.4.02A.10005151309020.26399-100000@Zebra.DoCS.UU.SE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id HAA10251
Status: OR
On Sun, 14 May 2000, Tom Lane wrote:
> 1. Poor choice of type to attribute to numeric literals. (A possible
> solution is sketched in my earlier message, but do we need similar
> mechanisms for other type categories?)
I think your plan looks good for the numerical land. (I'll ponder the oid
issues in a second.) For other type categories, perhaps not. Should a line
be promoted to a polygon so you can check if it contains a point? Or a
polygon to a box? Higher dimensions? :-)
> 2. Tensions between treating string literals as "unknown" type and
> as "text" type, per this thread so far.
Yes, while we're at it, let's look at this in detail. I claim that
something of the form 'xxx' should always be text (or char or whatever),
period. Let's consider the cases where this could potentially clash with
the current behaviour:
a) The target type is unambiguously clear, e.g., UPDATE ... SET. Then you
cast text to the target type. The effect is identical.
b) The target type is completely unspecified, e.g. CREATE TABLE AS SELECT
'xxx'; This will currently create an "unknown" column. It should arguably
create a "text" column.
Function argument resolution:
c) There is only one function and it has a "text" argument. No-brainer.
d) There is only one function and it has an argument other than text. Try
to cast text to that type. (This is what's done in general, isn't it?)
e) The function is overloaded for many types, amongst which is text. Then
call the text version. I believe this would currently fail, which I'd
consider a deficiency.
f) The function is overloaded for many types, none of which is text. In
that case you have to cast anyway, so you don't lose anything.
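Cases (c) through (f) can be condensed into a small decision procedure. The following is only a sketch of the rule proposed above, not PostgreSQL's resolver; the function name `resolve_string_literal_call` and the use of plain strings for type names are assumptions made for illustration.

```python
def resolve_string_literal_call(candidate_arg_types):
    """Pick the overload for f('xxx') when 'xxx' is always typed as text.

    candidate_arg_types: argument type names of the single-argument
    overloads of f (a hypothetical representation, not a catalog lookup).
    Returns (chosen_type, needs_cast_from_text).
    """
    if not candidate_arg_types:
        raise LookupError("no such function")
    if "text" in candidate_arg_types:
        # Cases (c) and (e): a text version exists, so call it directly.
        return ("text", False)
    if len(candidate_arg_types) == 1:
        # Case (d): only one function; cast the text literal to its type.
        return (candidate_arg_types[0], True)
    # Case (f): overloaded and no text version; an explicit cast is required.
    raise TypeError("ambiguous call: add an explicit cast")


if __name__ == "__main__":
    print(resolve_string_literal_call(["date"]))
    print(resolve_string_literal_call(["int4", "text", "date"]))
```

Under this sketch only case (f) forces the user to write a cast, which is the trade-off the message argues is acceptable.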
One thing to also keep in mind regarding required casting for (b) and (f)
is that SQL never allowed literals of "fancy" types (e.g., DATE) to have
undecorated 'yyyy-mm-dd' constants; you always have to say DATE
'yyyy-mm-dd'. What Postgres allows is a convenience where DATE would be
obvious or implied. In the end it's a win-win situation: you tell the
system what you want, and your code is clearer.
> 3. IS_BINARY_COMPATIBLE seems like a bogus concept.
At least it's bogus when used for types which are not actually binary
compatible, e.g. int4 and oid. The result of the current implementation is
that you can perfectly happily insert and retrieve negative numbers from
oid fields.
I'm not so sure about the value of this particular equivalency anyway.
AFAICS the only functions that make sense for oids are comparisons (incl.
min, max), adding integers to them, subtracting one oid from another.
Silent mangling with int4 means that you can multiply them, square them,
add floating point numbers to them (doesn't really work in practice
though), all things that have no business with oids.
I'd say define the operators that are useful for oids explicitly for oids
and require casts for all others, so the users know what they're doing.
The fact that an oid is also a number should be an implementation detail.
In my mind oids are like pointers in C. Indiscriminate mangling of
pointers and integers in C has long been dismissed as questionable coding.
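To make the "explicit operators only" idea concrete, here is a minimal sketch of a value type that supports just the operations listed above (comparison, adding an integer, subtracting one oid from another). The class `Oid` is hypothetical and is not how PostgreSQL defines the type; it only illustrates that anything else would need an explicit cast.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Oid:
    """Hypothetical oid wrapper exposing only the operations named above."""
    value: int

    def __post_init__(self):
        # Assumption for this sketch: oids are unsigned, so reject negatives.
        if self.value < 0:
            raise ValueError("oids are unsigned")

    def __add__(self, offset):
        # oid + integer is allowed; oid + oid or oid + float is not.
        if isinstance(offset, bool) or not isinstance(offset, int):
            return NotImplemented
        return Oid(self.value + offset)

    def __sub__(self, other):
        # Subtracting one oid from another yields a plain integer distance.
        if not isinstance(other, Oid):
            return NotImplemented
        return self.value - other.value


if __name__ == "__main__":
    print(Oid(5) + 3, Oid(8) - Oid(5), max(Oid(2), Oid(9)))
```

Multiplying or squaring an `Oid`, or adding a float to it, raises `TypeError` here, so a caller who really wants arithmetic has to go through `.value` deliberately.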
Of course I'd be very willing to consider counterexamples to these
theories ...
--
Peter Eisentraut Sernanders väg 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden
From tgl@sss.pgh.pa.us Tue Jun 13 04:58:20 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24281
for <pgman@candle.pha.pa.us>; Tue, 13 Jun 2000 03:58:18 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA02571;
Tue, 13 Jun 2000 03:58:43 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Proposal for fixing numeric type-resolution issues
In-reply-to: <200006130741.DAA23502@candle.pha.pa.us>
References: <200006130741.DAA23502@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Tue, 13 Jun 2000 03:41:56 -0400"
Date: Tue, 13 Jun 2000 03:58:43 -0400
Message-ID: <2568.960883123@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: OR
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Again, anything to add to the TODO here?
IIRC, there was some unhappiness with the proposal you quote, so I'm
not sure we've quite agreed what to do... but clearly something must
be done.
regards, tom lane
>> We've got a collection of problems that are related to the parser's
>> inability to make good type-resolution choices for numeric constants.
>> In some cases you get a hard error; for example "NumericVar + 4.4"
>> yields
>> ERROR: Unable to identify an operator '+' for types 'numeric' and 'float8'
>> You will have to retype this query using an explicit cast
>> because "4.4" is initially typed as float8 and the system can't figure
>> out whether to use numeric or float8 addition. A more subtle problem
>> is that a query like "... WHERE Int2Var < 42" is unable to make use of
>> an index on the int2 column: 42 is resolved as int4, so the operator
>> is int24lt, which works but is not in the opclass of an int2 index.
>>
>> Here is a proposal for fixing these problems. I think we could get this
>> done for 7.1 if people like it.
>>
>> The basic problem is that there's not enough smarts in the type resolver
>> about the interrelationships of the numeric datatypes. All it has is
>> a concept of a most-preferred type within the category of numeric types.
>> (We are abusing the most-preferred-type mechanism, BTW, because both
>> FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
>> category! This is in fact why the resolver can't make a choice for
>> "numeric+float8".) We need more intelligence than that.
>>
>> I propose that we set up a strictly-ordered hierarchy of numeric
>> datatypes, running from least preferred to most preferred:
>> int2, int4, int8, numeric, float4, float8.
>> Rather than simply considering coercions to the most-preferred type,
>> the type resolver should use the following rules:
>>
>> 1. No value will be down-converted (eg int4 to int2) except by an
>> explicit conversion.
>>
>> 2. If there is not an exact matching operator, numeric values will be
>> up-converted to the highest numeric datatype present among the operator
>> or function's arguments. For example, given "int2 + int8" we'd up-
>> convert the int2 to int8 and apply int8 addition.
>>
>> The final piece of the puzzle is that the type initially assigned to
>> an undecorated numeric constant should be NUMERIC if it contains a
>> decimal point or exponent, and otherwise the smallest of int2, int4,
>> int8, NUMERIC that will represent it. This is a considerable change
>> from the current lexer behavior, where you get either int4 or float8.
>>
>> For example, given "NumericVar + 4.4", the constant 4.4 will initially
>> be assigned type NUMERIC, we will resolve the operator as numeric plus,
>> and everything's fine. Given "Float8Var + 4.4", the constant is still
>> initially numeric, but will be up-converted to float8 so that float8
>> addition can be used. The end result is the same as in traditional
>> Postgres: you get float8 addition. Given "Int2Var < 42", the constant
>> is initially typed as int2, since it fits, and we end up selecting
>> int2lt, thereby allowing use of an int2 index. (On the other hand,
>> given "Int2Var < 100000", we'd end up using int4lt, which is correct
>> to avoid overflow.)
>>
>> A couple of crucial subtleties here:
>>
>> 1. We are assuming that the parser or optimizer will constant-fold
>> any conversion functions that are introduced. Thus, in the
>> "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
>> time execution begins, so there's no performance loss.
>>
>> 2. We cannot lose precision by initially representing a constant as
>> numeric and later converting it to float. Nor can we exceed NUMERIC's
>> range (the default 1000-digit limit is more than the range of IEEE
>> float8 data). It would not work as well to start out by representing
>> a constant as float and then converting it to numeric.
>>
>> Presently, the pg_proc and pg_operator tables contain a pretty fair
>> collection of cross-datatype numeric operators, such as int24lt,
>> float48pl, etc. We could perhaps leave these in, but I believe that
>> it is better to remove them. For example, if int42lt is left in place,
>> then it would capture cases like "Int4Var < 42", whereas we need that
>> to be translated to int4lt so that an int4 index can be used. Removing
>> these operators will eliminate some code bloat and system-catalog bloat
>> to boot.
>>
>> As far as I can tell, this proposal is almost compatible with the rules
>> given in SQL92: in particular, SQL92 specifies that an operator having
>> both "approximate numeric" (float) and "exact numeric" (int or numeric)
>> inputs should deliver an approximate-numeric result. I propose
>> deviating from SQL92 in a single respect: SQL92 specifies that a
>> constant containing an exponent (eg 1.2E34) is approximate numeric,
>> which implies that the result of an operator using it is approximate
>> even if the other operand is exact. I believe it's better to treat
>> such a constant as exact (ie, type NUMERIC) and only convert it to
>> float if the other operand is float. Without doing that, an assignment
>> like
>> UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
>> will not work as desired because the constant will be prematurely
>> coerced to float, causing precision loss.
>>
>> Comments?
>>
>> regards, tom lane
>>
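The quoted proposal can be sketched in a few lines. This is an illustrative sketch only, not parser code: the function names `initial_literal_type` and `resolve_operands` and the use of plain strings for type names are assumptions; only the hierarchy order and the two rules are taken from the message.

```python
# Least preferred to most preferred, as listed in the proposal.
HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]

INT_RANGES = [
    ("int2", -2**15, 2**15 - 1),
    ("int4", -2**31, 2**31 - 1),
    ("int8", -2**63, 2**63 - 1),
]


def initial_literal_type(literal):
    """Type initially assigned to an undecorated numeric constant."""
    if any(ch in literal for ch in ".eE"):
        return "numeric"  # decimal point or exponent present
    value = int(literal)
    for name, low, high in INT_RANGES:
        if low <= value <= high:
            return name  # smallest integer type that can represent it
    return "numeric"  # too large even for int8


def resolve_operands(left_type, right_type):
    """Rule 2: up-convert to the highest-ranked type among the operands."""
    return max(left_type, right_type, key=HIERARCHY.index)


if __name__ == "__main__":
    print(resolve_operands("numeric", initial_literal_type("4.4")))
    print(resolve_operands("float8", initial_literal_type("4.4")))
    print(resolve_operands("int2", initial_literal_type("42")))
    print(resolve_operands("int2", initial_literal_type("100000")))
```

Under this sketch, "Int2Var < 42" resolves to int2 and "Int2Var < 100000" resolves to int4, matching the examples in the proposal. Rule 1 (no implicit down-conversion) holds because `resolve_operands` never returns a type ranked below either input.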
> --
> Bruce Momjian | http://www.op.net/~candle
> pgman@candle.pha.pa.us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From tgl@sss.pgh.pa.us Mon Jun 12 14:09:45 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01993
for <pgman@candle.pha.pa.us>; Mon, 12 Jun 2000 13:09:43 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA01515;
Mon, 12 Jun 2000 13:10:01 -0400 (EDT)
To: Peter Eisentraut <peter_e@gmx.net>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Adding time to DATE type
In-reply-to: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
References: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
message dated "Sun, 11 Jun 2000 13:41:24 +0200"
Date: Mon, 12 Jun 2000 13:10:00 -0400
Message-ID: <1512.960829800@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ORr
Peter Eisentraut <peter_e@gmx.net> writes:
> Bruce Momjian writes:
>> Can someone give me a TODO summary for this issue?
> * make 'text' constants default to text type (not unknown)
> (I think not everyone's completely convinced on this issue, but I don't
> recall anyone being firmly opposed to it.)
It would be a mistake to eliminate the distinction between unknown and
text. See for example my just-posted response to John Cochran on
pgsql-general about why 'BOULEVARD'::text behaves differently from
'BOULEVARD'::char. If string literals are immediately assigned type
text then we will have serious problems with char(n) fields.
I think it's fine to assign string literals a type of 'unknown'
initially. What we need to do is add a phase of type resolution that
considers treating them as text, but only after the existing logic fails
to deduce a type.
(BTW it might be better to treat string literals as defaulting to char(n)
instead of text, allowing the normal promotion rules to replace char(n)
with text if necessary. Not sure if that would make things more or less
confusing for operations that intermix fixed- and variable-width char
types.)
regards, tom lane
From pgsql-hackers-owner+M1936@postgresql.org Sun Dec 10 13:17:54 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA20676
for <pgman@candle.pha.pa.us>; Sun, 10 Dec 2000 13:17:54 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eBAIGvZ40566;
Sun, 10 Dec 2000 13:16:57 -0500 (EST)
(envelope-from pgsql-hackers-owner+M1936@postgresql.org)
Received: from sss.pgh.pa.us (sss.pgh.pa.us [209.114.132.154])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eBAI8HZ39820
for <pgsql-hackers@postgreSQL.org>; Sun, 10 Dec 2000 13:08:17 -0500 (EST)
(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.1/8.11.1) with ESMTP id eBAI82o28682;
Sun, 10 Dec 2000 13:08:02 -0500 (EST)
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
cc: pgsql-hackers@postgresql.org
Subject: [HACKERS] Unknown-type resolution rules, redux
Date: Sun, 10 Dec 2000 13:08:02 -0500
Message-ID: <28679.976471682@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR
parse_coerce.c contains the following conversation --- I believe the
first XXX comment is from me and the second from you:
    /*
     * Still too many candidates? Try assigning types for the unknown
     * columns.
     *
     * We do this by examining each unknown argument position to see if all
     * the candidates agree on the type category of that slot. If so, and
     * if some candidates accept the preferred type in that category,
     * eliminate the candidates with other input types. If we are down to
     * one candidate at the end, we win.
     *
     * XXX It's kinda bogus to do this left-to-right, isn't it? If we
     * eliminate some candidates because they are non-preferred at the
     * first slot, we won't notice that they didn't have the same type
     * category for a later slot.
     * XXX Hmm. How else would you do this? These candidates are here because
     * they all have the same number of matches on arguments with explicit
     * types, so from here on left-to-right resolution is as good as any.
     * Need a counterexample to see otherwise...
     */
The comment is out of date anyway because it fails to mention the new
rule about preferring STRING category. But to answer your request for
a counterexample: consider
SELECT foo('bar', 'baz')
First, suppose the available candidates are
foo(float8, int4)
foo(float8, point)
In this case, we examine the first argument position, see that all the
candidates agree on NUMERIC category, so we consider resolving the first
unknown input to float8. That eliminates neither candidate so we move
on to the second argument position. Here there is a conflict of
categories so we can't eliminate anything, and we decide the call is
ambiguous. That's correct (or at least Operating As Designed ;-)).
But now suppose we have
foo(float8, int4)
foo(float4, point)
Here, at the first position we will still see that all candidates agree
on NUMERIC category, and then we will eliminate candidate 2 because it
isn't the preferred type in that category. Now when we come to the
second argument position, there's only one candidate left so there's
no category conflict. Result: this call is considered non-ambiguous.
This means there is a left-to-right bias in the algorithm. For example,
the exact same call *would* be considered ambiguous if the candidates'
argument orders were reversed:
foo(int4, float8)
foo(point, float4)
I do not like that. You could maybe argue that earlier arguments are
more important than later ones for functions, but it's harder to make
that case for binary operators --- and in any case this behavior is
extremely difficult to explain in prose.
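The order dependence described above can be reproduced with a small model of the single-pass rule. This is a sketch under stated assumptions, not the code in parse_coerce.c: the `TYPE_INFO` table is a hypothetical, simplified stand-in for the catalog, the STRING-preference rule is left out, and a category conflict at any slot is assumed to make the call ambiguous immediately, which is consistent with the outcomes described in the message.

```python
# Hypothetical, simplified type table: type name -> (category, is_preferred).
TYPE_INFO = {
    "float8": ("NUMERIC", True),
    "float4": ("NUMERIC", False),
    "int4": ("NUMERIC", False),
    "point": ("GEOMETRIC", False),
}


def resolve_left_to_right(candidates):
    """Single-pass elimination as described in the quoted comment.

    candidates: list of tuples of argument type names; every argument of
    the call is assumed to be an unknown-type literal. Returns the winning
    candidate, or None if the call is considered ambiguous.
    """
    remaining = list(candidates)
    for slot in range(len(candidates[0])):
        categories = {TYPE_INFO[c[slot]][0] for c in remaining}
        if len(categories) > 1:
            return None  # category conflict at this slot: give up
        preferred = [c for c in remaining if TYPE_INFO[c[slot]][1]]
        if preferred:
            remaining = preferred  # drop candidates of non-preferred type
    return remaining[0] if len(remaining) == 1 else None


if __name__ == "__main__":
    print(resolve_left_to_right([("float8", "int4"), ("float4", "point")]))
    print(resolve_left_to_right([("int4", "float8"), ("point", "float4")]))
```

The first call picks foo(float8, int4) because the second candidate is eliminated before its conflicting second slot is examined; the second call, with the argument orders reversed, is reported as ambiguous.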
To fix this, I think we need to split the loop into two passes.
The first pass does *not* remove any candidates. What it does is to
look separately at each UNKNOWN-argument position and attempt to deduce
a probable category for it, using the following rules:
* If any candidate has an input type of STRING category, use STRING
category; else if all candidates agree on the category, use that
category; else fail because no resolution can be made.
* The first pass must also remember whether any candidates are of a
preferred type within the selected category.
The probable categories and exists-preferred-type booleans are saved in
local arrays. (Note this has to be done this way because
IsPreferredType currently allows more than one type to be considered
preferred in a category ... so the first pass cannot try to determine a
unique type, only a category.)
If we find a category for every UNKNOWN arg, then we enter a second loop
in which we discard candidates. In this pass we discard a candidate if
(a) it is of the wrong category, or (b) it is of the right category but
is not of preferred type in that category, *and* we found candidate(s)
of preferred type at this slot.
If we end with exactly one candidate then we win.
It is clear in this algorithm that there is no order dependency: the
conditions for keeping or discarding a candidate are fixed before we
start the second pass, and do not vary depending on which other
candidates were discarded before it.
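A minimal sketch of the two-pass scheme just described, for readers who want to trace it. It is not the eventual parse_coerce.c code: `TYPE_INFO` is a hypothetical, simplified stand-in for the type catalog, and every argument of the call is assumed to be an unknown-type literal.

```python
# Hypothetical, simplified type table: type name -> (category, is_preferred).
TYPE_INFO = {
    "float8": ("NUMERIC", True),
    "float4": ("NUMERIC", False),
    "int4": ("NUMERIC", False),
    "point": ("GEOMETRIC", False),
    "text": ("STRING", True),
    "varchar": ("STRING", False),
}


def resolve_two_pass(candidates):
    """Return the single surviving candidate, or None if unresolved."""
    nslots = len(candidates[0])
    slot_category = []
    slot_has_preferred = []
    # Pass 1: deduce a probable category per slot; remove nothing.
    for slot in range(nslots):
        categories = {TYPE_INFO[c[slot]][0] for c in candidates}
        if "STRING" in categories:
            chosen = "STRING"
        elif len(categories) == 1:
            chosen = next(iter(categories))
        else:
            return None  # no resolution can be made
        slot_category.append(chosen)
        slot_has_preferred.append(
            any(TYPE_INFO[c[slot]] == (chosen, True) for c in candidates))
    # Pass 2: discard candidates using only the facts fixed in pass 1.
    survivors = []
    for cand in candidates:
        keep = True
        for slot in range(nslots):
            category, preferred = TYPE_INFO[cand[slot]]
            if category != slot_category[slot]:
                keep = False  # (a) wrong category
            elif slot_has_preferred[slot] and not preferred:
                keep = False  # (b) non-preferred while a preferred one exists
        if keep:
            survivors.append(cand)
    return survivors[0] if len(survivors) == 1 else None


if __name__ == "__main__":
    print(resolve_two_pass([("float8", "int4"), ("float4", "point")]))
    print(resolve_two_pass([("int4", "float8"), ("point", "float4")]))
```

Both calls now return None: the foo(float8, int4) / foo(float4, point) pair is ambiguous regardless of argument order, because the category conflict is detected in the first pass before anything is discarded.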
Comments?
regards, tom lane
From pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 15:47:47 2001
Return-path: <pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org>
Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBTKlkT05111
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 15:47:46 -0500 (EST)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTKhZN74322
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 14:43:35 -0600 (CST)
(envelope-from pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org)
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTKaem38452
for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 15:36:40 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.11.6/8.10.1) id fBTKaTg04256;
Sat, 29 Dec 2001 15:36:29 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-ID: <200112292036.fBTKaTg04256@candle.pha.pa.us>
Subject: Re: [GENERAL] Casting Varchar to Numeric
In-Reply-To: <20011206150158.O28880-100000@megazone23.bigpanda.com>
To: Stephan Szabo <sszabo@megazone23.bigpanda.com>
Date: Sat, 29 Dec 2001 15:36:29 -0500 (EST)
cc: Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL96 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-general-owner@postgresql.org
Status: OR
> On Mon, 3 Dec 2001, Andy Marden wrote:
>
> > Martijn,
> >
> > It does work (believe it or not). I've now tried the method you mention
> > below - that also works and is much nicer. I can't believe that PostgreSQL
> > can't work this out. Surely implementing an algorithm that understands that
> > if you can go from a ->b and b->c then you can certainly go from a->c. If
>
> It's more complicated than that (and postgres does some of this but not
> all), for example the cast text->float8->numeric potentially loses
> precision and should probably not be an automatic cast for that reason.
>
> > this is viewed as too complex a task for the internals - at least a diagram
> > or some way of understanding how you should go from a->c would be immensely
> > helpful wouldn't it! Daunting for anyone picking up the database and trying
> > to do something simple(!)
>
> There may be a need for documentation on this. Would you like to write
> some ;)
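The precision concern about a text->float8->numeric chain is easy to demonstrate outside the database. The sketch below uses Python's `float` (an IEEE double on common platforms) as a stand-in for float8 and `decimal.Decimal` as a stand-in for numeric; both stand-ins and the helper names are assumptions made for illustration.

```python
from decimal import Decimal


def text_to_numeric_direct(text):
    """Convert text straight to an exact decimal value."""
    return Decimal(text)


def text_to_numeric_via_float8(text):
    """Convert text -> float8 -> numeric, as a chained cast would."""
    # The intermediate double keeps only about 15 to 17 significant digits.
    return Decimal(repr(float(text)))


if __name__ == "__main__":
    literal = "1.234567890123456789012345E34"
    print(text_to_numeric_direct(literal))
    print(text_to_numeric_via_float8(literal))
```

Short values such as '323' survive the detour unchanged, which is why the loss is easy to miss in casual testing.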
OK, I ran some tests:
test=> create table test (x text);
CREATE
test=> insert into test values ('323');
INSERT 5122745 1
test=> select cast (x as numeric) from test;
ERROR: Cannot cast type 'text' to 'numeric'
I can see problems with automatically casting numeric to text because
you have to guess the desired format, but going from text to numeric
seems quite easy to do. Is there a reason we don't do it?
I can cast to integer and float8 fine:
test=> select cast ( x as integer) from test;
?column?
----------
323
(1 row)
test=> select cast ( x as float8) from test;
?column?
----------
323
(1 row)
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 19:10:38 2001
Return-path: <pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org>
Received: from west.navpoint.com (west.navpoint.com [207.106.42.13])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBU0AbT23972
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 19:10:37 -0500 (EST)
Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
by west.navpoint.com (8.11.6/8.10.1) with ESMTP id fBTNVj008959
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 18:31:45 -0500 (EST)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTNQrN78655
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 17:26:53 -0600 (CST)
(envelope-from pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org)
Received: from sss.pgh.pa.us ([192.204.191.242])
by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTN8Fm47978
for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 18:08:15 -0500 (EST)
(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBTN7vg20245;
Sat, 29 Dec 2001 18:07:57 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Stephan Szabo <sszabo@megazone23.bigpanda.com>,
Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
Subject: Re: [GENERAL] Casting Varchar to Numeric
In-Reply-To: <200112292036.fBTKaTg04256@candle.pha.pa.us>
References: <200112292036.fBTKaTg04256@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Sat, 29 Dec 2001 15:36:29 -0500"
Date: Sat, 29 Dec 2001 18:07:57 -0500
Message-ID: <20242.1009667277@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-general-owner@postgresql.org
Status: OR
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I can see problems with automatically casting numeric to text because
> you have to guess the desired format, but going from text to numeric
> seems quite easy to do. Is there a reason we don't do it?
I do not think it's a good idea to have implicit casts between text and
everything under the sun, because that essentially destroys the type
checking system. What we need (see previous discussion) is a flag in
pg_proc that says whether a type conversion function may be invoked
implicitly or not. I've got no problem with offering text(numeric) and
numeric(text) functions that are invoked by explicit function calls or
casts --- I just don't want the system trying to use them to make
sense of a bogus query.
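The idea of a per-function "may be invoked implicitly" flag can be sketched as follows. This is only an illustration of the concept: the `CASTS` table, its entries, and the `coerce` helper are hypothetical and do not reflect the actual pg_proc layout or which conversions PostgreSQL marks as implicit.

```python
from decimal import Decimal

# Hypothetical cast catalog: (source, target) -> (function, may_be_implicit).
CASTS = {
    ("int4", "float8"): (float, True),
    ("text", "numeric"): (Decimal, False),
    ("numeric", "text"): (str, False),
}


def coerce(value, source, target, explicit):
    """Apply a registered conversion, honoring the implicit-use flag."""
    if source == target:
        return value
    entry = CASTS.get((source, target))
    if entry is None:
        raise TypeError(f"cannot cast {source} to {target}")
    function, may_be_implicit = entry
    if not explicit and not may_be_implicit:
        # The conversion exists, but the resolver may not reach for it.
        raise TypeError(f"{source} -> {target} requires an explicit cast")
    return function(value)


if __name__ == "__main__":
    print(coerce("323", "text", "numeric", explicit=True))
    print(coerce(3, "int4", "float8", explicit=False))
```

With such a flag, text(numeric) and numeric(text) remain available to explicit casts while the type resolver is prevented from using them to rescue an otherwise invalid query.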
> I can cast to integer and float8 fine:
I don't believe that those should be available as implicit casts either.
They are, at the moment:
regression=# select 33 || 44.0;
?column?
----------
3344
(1 row)
Ugh.
regards, tom lane
From Inoue@tpf.co.jp Tue Jan 18 19:08:30 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA10148
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 20:08:27 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id KAA02790; Wed, 19 Jan 2000 10:08:02 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 10:13:40 +0900
Message-ID: <000201bf621a$6b9baf20$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001181821.NAA02988@candle.pha.pa.us>
Status: ROr
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > Hi all,
> >
> > I'm trying to implement REINDEX command.
> >
> > REINDEX operation itself is available everywhere and
> > I've thought about applying it to VACUUM.
>
> That is a good idea. Vacuuming of indexes can be very slow.
>
> > .
> > My plan is as follows.
> >
> > Add a new option to force index recreation in vacuum
> > and if index recreation is specified.
>
> Couldn't we auto-recreate indexes based on the number of tuples moved by
> vacuum,
Yes, we could probably do it. But I'm not sure about the availability of
the new vacuum.
The new vacuum would give us two big advantages:
1) It is much faster than the current one if vacuum removes/moves many tuples.
2) It shrinks index files.
But in case of abort/crash:
1) Index scan could not be chosen for the table.
2) Unique constraints of the table would be lost.
I don't know how people weigh this disadvantage.
>
> > Now I'm inclined to use relhasindex of pg_class to
> > validate/invalidate indexes of a table at once.
>
> There are a few calls to CatalogIndexInsert() that know the
> system table they
> are using and know it has indexes, so it does not check that field. You
> could add cases for that.
>
I think there aren't so many places to check.
I would examine it if my idea is OK.
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
From owner-pgsql-hackers@hub.org Tue Jan 18 19:57:21 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA11764
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 20:57:19 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id UAA50653;
Tue, 18 Jan 2000 20:52:38 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 20:52:30 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id UAA50513
for pgsql-hackers-outgoing; Tue, 18 Jan 2000 20:51:32 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id UAA50462
for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 20:51:06 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id UAA11421;
Tue, 18 Jan 2000 20:50:50 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190150.UAA11421@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000201bf621a$6b9baf20$2801007e@tpf.co.jp> from Hiroshi Inoue at
"Jan 19, 2000 10:13:40 am"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 20:50:50 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: ROr
> > > Add a new option to force index recreation in vacuum
> > > and if index recreation is specified.
> >
> > Couldn't we auto-recreate indexes based on the number of tuples moved by
> > vacuum,
>
> Yes, we could probably do it. But I'm not sure about the availability of
> the new vacuum.
>
> The new vacuum would give us two big advantages:
> 1) It is much faster than the current one if vacuum removes/moves many tuples.
> 2) It shrinks index files.
>
> But in case of abort/crash:
> 1) Index scan could not be chosen for the table.
> 2) Unique constraints of the table would be lost.
>
> I don't know how people weigh this disadvantage.
That's why I was recommending rename(). The actual window of
vulnerability goes from perhaps hours to fractions of a second.
In fact, if I understand this right, you could make the vulnerability
zero by just performing the rename as one operation.
In fact, for REINDEX cases where you don't have a lock on the entire
table as you do in vacuum, you could reindex the table with a simple
read-lock on the base table and index, and move the new index into place
with the users seeing no change. Only people traversing the index
during the change would have a problem. You just need an exclusive
access on the index for the duration of the rename() so no one is
traversing the index during the rename().
Destroying the index and recreating it opens a large time span during which
there is no index, and you have to jury-rig something so people don't try to
use the index. With rename() you just put the new index in place with
one operation. Just don't let people traverse the index during the
change. The pointers to the heap tuples are the same in both indexes.
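The build-then-rename pattern can be sketched at the file level. This is an illustration of the general technique only, not of how the backend manages index files: `rebuild_index_file` and its `build` callback are hypothetical names, and the locking needed to keep readers off the index during the swap is not modeled.

```python
import os
import tempfile


def rebuild_index_file(index_path, build):
    """Write a fresh index next to the old one, then swap it in by rename.

    build: callable taking a writable binary file object (hypothetical hook
    that writes the new index contents).
    """
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, new_path = tempfile.mkstemp(dir=directory, suffix=".new")
    try:
        with os.fdopen(fd, "wb") as new_file:
            build(new_file)
            new_file.flush()
            os.fsync(new_file.fileno())
        # The old file stays usable until this single rename replaces it.
        os.replace(new_path, index_path)
    except BaseException:
        # On any failure, leave the old index untouched and clean up.
        if os.path.exists(new_path):
            os.unlink(new_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "idx")
        with open(path, "wb") as handle:
            handle.write(b"old index")
        rebuild_index_file(path, lambda f: f.write(b"new index"))
        with open(path, "rb") as handle:
            print(handle.read())
```

Because the new file is created in the same directory as the old one, the rename stays within one filesystem; a crash before the rename leaves the old index in place, which is the narrow window argued for above.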
In fact, with WAL, we will allow multiple physical files for the same
table by appending the table oid to the file name. In this case, the
old index could be deleted by rename, and people would continue to use
the old index until they closed their open file pointers. Not sure how
this works in practice because new tuples would not be inserted into the
old copy of the index.
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From pgman Tue Jan 18 20:04:11 2000
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id VAA11990;
Tue, 18 Jan 2000 21:04:11 -0500 (EST)
From: Bruce Momjian <pgman>
Message-Id: <200001190204.VAA11990@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <200001190150.UAA11421@candle.pha.pa.us> from Bruce Momjian at "Jan
18, 2000 08:50:50 pm"
To: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Tue, 18 Jan 2000 21:04:11 -0500 (EST)
CC: Hiroshi Inoue <Inoue@tpf.co.jp>,
pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Status: RO
> > I don't know how people estimate this disadvantage.
>
> That's why I was recommending rename(). The actual window of
> vulnerability goes from perhaps hours to fractions of a second.
>
> In fact, if I understand this right, you could make the vulnerability
> zero by just performing the rename as one operation.
>
> In fact, for REINDEX cases where you don't have a lock on the entire
> table as you do in vacuum, you could reindex the table with a simple
> read-lock on the base table and index, and move the new index into place
> with the users seeing no change. Only people traversing the index
> during the change would have a problem. You just need an exclusive
> access on the index for the duration of the rename() so no one is
> traversing the index during the rename().
>
> Destroying the index and recreating opens a large time span that there
> is no index, and you have to jury-rig something so people don't try to
> use the index. With rename() you just put the new index in place with
> one operation. Just don't let people traverse the index during the
> change. The pointers to the heap tuples are the same in both indexes.
>
> In fact, with WAL, we will allow multiple physical files for the same
> table by appending the table oid to the file name. In this case, the
> old index could be deleted by rename, and people would continue to use
> the old index until they deleted the open file pointers. Not sure how
> this works in practice because new tuples would not be inserted into the
> old copy of the index.
Maybe I am all wrong here. Maybe most of the advantages of rename() are
meaningless when reindex is used during vacuum, which is the most
important use of reindex.
Let's look at index use during vacuum. Right now, how does vacuum
handle indexes when it moves a tuple? Does it do each index update as
it moves a tuple? Is that why it is so slow?
If we don't do that and vacuum fails, what state is the table left in?
If we don't update the index for every tuple, the index is invalid after a
vacuum failure. rename() is not going to help us here. It keeps the
old index around, but the index is invalid anyway, right?
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From Inoue@tpf.co.jp Tue Jan 18 20:18:48 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA12437
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 21:18:46 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id LAA02845; Wed, 19 Jan 2000 11:18:18 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 11:23:55 +0900
Message-ID: <000801bf6224$3bfdd9a0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001190204.VAA11990@candle.pha.pa.us>
Status: ROr
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > > I don't know how people estimate this disadvantage.
> >
> > That's why I was recommending rename(). The actual window of
> > vulnerability goes from perhaps hours to fractions of a second.
> >
> > In fact, if I understand this right, you could make the vulnerability
> > zero by just performing the rename as one operation.
> >
> > In fact, for REINDEX cases where you don't have a lock on the entire
> > table as you do in vacuum, you could reindex the table with a simple
> > read-lock on the base table and index, and move the new index into place
> > with the users seeing no change. Only people traversing the index
> > during the change would have a problem. You just need an exclusive
> > access on the index for the duration of the rename() so no one is
> > traversing the index during the rename().
> >
> > Destroying the index and recreating opens a large time span that there
> > is no index, and you have to jury-rig something so people don't try to
> > use the index. With rename() you just put the new index in place with
> > one operation. Just don't let people traverse the index during the
> > change. The pointers to the heap tuples are the same in both indexes.
> >
> > In fact, with WAL, we will allow multiple physical files for the same
> > table by appending the table oid to the file name. In this case, the
> > old index could be deleted by rename, and people would continue to use
> > the old index until they deleted the open file pointers. Not sure how
> > this works in practice because new tuples would not be inserted into the
> > old copy of the index.
>
> Maybe I am all wrong here. Maybe most of the advantage of rename() are
> meaningless with reindex using during vacuum, which is the most
> important use of reindex.
>
> Let's look at index using during vacuum. Right now, how does vacuum
> handle indexes when it moves a tuple? Does it do each index update as
> it moves a tuple? Is that why it is so slow?
>
Yes, I believe so. It's necessary to keep consistency between the heap
table and indexes even in case of abort/crash.
As far as I can see, it has been a big cost for vacuum.
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
From owner-pgsql-hackers@hub.org Tue Jan 18 20:53:49 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA13285
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 21:53:47 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id VAA65183;
Tue, 18 Jan 2000 21:47:47 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 21:47:33 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id VAA65091
for pgsql-hackers-outgoing; Tue, 18 Jan 2000 21:46:33 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id VAA65034
for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 21:46:12 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id VAA13040;
Tue, 18 Jan 2000 21:45:27 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190245.VAA13040@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000801bf6224$3bfdd9a0$2801007e@tpf.co.jp> from Hiroshi Inoue at
"Jan 19, 2000 11:23:55 am"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 21:45:27 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
> > > In fact, for REINDEX cases where you don't have a lock on the entire
> > > table as you do in vacuum, you could reindex the table with a simple
> > > read-lock on the base table and index, and move the new index into place
> > > with the users seeing no change. Only people traversing the index
> > > during the change would have a problem. You just need an exclusive
> > > access on the index for the duration of the rename() so no one is
> > > traversing the index during the rename().
> > >
> > > Destroying the index and recreating opens a large time span that there
> > > is no index, and you have to jury-rig something so people don't try to
> > > use the index. With rename() you just put the new index in place with
> > > one operation. Just don't let people traverse the index during the
> > > change. The pointers to the heap tuples are the same in both indexes.
> > >
> > > In fact, with WAL, we will allow multiple physical files for the same
> > > table by appending the table oid to the file name. In this case, the
> > > old index could be deleted by rename, and people would continue to use
> > > the old index until they deleted the open file pointers. Not sure how
> > > this works in practice because new tuples would not be inserted into the
> > > old copy of the index.
> >
> > Maybe I am all wrong here. Maybe most of the advantage of rename() are
> > meaningless with reindex using during vacuum, which is the most
> > important use of reindex.
> >
> > Let's look at index using during vacuum. Right now, how does vacuum
> > handle indexes when it moves a tuple? Does it do each index update as
> > it moves a tuple? Is that why it is so slow?
> >
>
> Yes,I believe so. It's necessary to keep consistency between heap
> table and indexes even in case of abort/crash.
> As far as I see,it has been a big charge for vacuum.
OK, how about making a copy of the heap table before starting vacuum,
moving all the tuples in that copy, creating new indexes, and then moving
the new heap and indexes over the old version? We already have an exclusive
lock on the table. That would be 100% reliable, with the disadvantage
of using 2x the disk space. Seems like a big win.
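The copy-then-swap proposal above can be pictured with a toy model. This is only a sketch under stated assumptions: the heap is a Python list, the index is a dictionary on a unique key, and copying_vacuum is a hypothetical name. None of it reflects how the backend stores tuples.

```python
def copying_vacuum(heap):
    """Build a compacted heap and a matching index from the live tuples.

    heap is a list of (key, is_live) pairs; a tuple's position in the list
    stands in for its physical location. The input list is never modified,
    so an interruption before the caller swaps in the result leaves the old
    heap and index exactly as they were.
    """
    new_heap = [(key, True) for key, is_live in heap if is_live]
    # Toy single-column unique index: key -> position in the new heap.
    new_index = {key: position for position, (key, _) in enumerate(new_heap)}
    return new_heap, new_index


old_heap = [(5, True), (7, False), (9, True)]   # the tuple with key 7 has expired
new_heap, new_index = copying_vacuum(old_heap)
print(new_heap)
print(new_index)
```

Because the index is built from the finished copy, heap and index always match, at the price of holding both copies at once, which is the 2x disk space mentioned above.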
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From owner-pgsql-hackers@hub.org Tue Jan 18 21:15:24 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA14115
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 22:15:23 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id WAA72950;
Tue, 18 Jan 2000 22:10:40 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 22:10:32 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id WAA72644
for pgsql-hackers-outgoing; Tue, 18 Jan 2000 22:09:36 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id WAA72504
for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 22:08:40 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id WAA13965;
Tue, 18 Jan 2000 22:08:25 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190308.WAA13965@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000f01bf622a$bf423940$2801007e@tpf.co.jp> from Hiroshi Inoue at
"Jan 19, 2000 12:10:32 pm"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 22:08:25 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=UNKNOWN-8BIT
Content-Transfer-Encoding: 8bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.
That's interesting.
>
> In addition,rename(),unlink(),mv aren't preferable for transaction
> control as far as I see. We couldn't avoid inconsistency using
> those OS functions.
I disagree. Vacuum can't be rolled back anyway in the sense that you can
bring back expired tuples, though I have no idea why you would want to.
You have an exclusive lock on the table. Putting new heap/indexes in
place that match and have no expired tuples seems like it can not fail
in any situation.
Of course, the buffers of the old table have to be marked as invalid,
but with an exclusive lock, that is not a problem. I am sure we do that
anyway in vacuum.
> We have to wait the change of relation file naming if copying
> vacuum is needed.
> Under the spec we need not rename(),mv etc.
Sorry, I don't agree, yet...
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From Inoue@tpf.co.jp Tue Jan 18 21:05:23 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA13858
for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 22:05:21 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id MAA02870; Wed, 19 Jan 2000 12:04:55 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 12:10:32 +0900
Message-ID: <000f01bf622a$bf423940$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001190245.VAA13040@candle.pha.pa.us>
Status: ROr
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > >
> > > Maybe I am all wrong here. Maybe most of the advantage of
> rename() are
> > > meaningless with reindex using during vacuum, which is the most
> > > important use of reindex.
> > >
> > > Let's look at index using during vacuum. Right now, how does vacuum
> > > handle indexes when it moves a tuple? Does it do each index update as
> > > it moves a tuple? Is that why it is so slow?
> > >
> >
> > Yes,I believe so. It's necessary to keep consistency between heap
> > table and indexes even in case of abort/crash.
> > As far as I see,it has been a big charge for vacuum.
>
> OK, how about making a copy of the heap table before starting vacuum,
> moving all the tuples in that copy, create new index, and then move the
> new heap and indexes over the old version. We already have an exclusive
> lock on the table. That would be 100% reliable, with the disadvantage
> of using 2x the disk space. Seems like a big win.
>
I heard from someone that the old vacuum used to work like that.
Probably 2x disk space for big tables was a big disadvantage.
In addition, rename(), unlink(), and mv aren't preferable for transaction
control as far as I can see. We couldn't avoid inconsistency using
those OS functions.
We have to wait for the change in relation file naming if a copying
vacuum is needed.
Under that spec we would not need rename(), mv, etc.
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
From dms@wplus.net Wed Jan 19 15:30:40 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA25919
for <pgman@candle.pha.pa.us>; Wed, 19 Jan 2000 16:30:38 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id AAA64218;
Thu, 20 Jan 2000 00:26:37 +0300 (MSK)
Message-ID: <38862C9D.C2151E4E@wplus.net>
Date: Thu, 20 Jan 2000 00:29:01 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Hiroshi Inoue <Inoue@tpf.co.jp>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
pgsql-hackers <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Index recreation in vacuum
References: <000f01bf622a$bf423940$2801007e@tpf.co.jp>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr
Hiroshi Inoue wrote:
> > > Yes,I believe so. It's necessary to keep consistency between heap
> > > table and indexes even in case of abort/crash.
> > > As far as I see,it has been a big charge for vacuum.
> >
> > OK, how about making a copy of the heap table before starting vacuum,
> > moving all the tuples in that copy, create new index, and then move the
> > new heap and indexes over the old version. We already have an exclusive
> > lock on the table. That would be 100% reliable, with the disadvantage
> > of using 2x the disk space. Seems like a big win.
> >
>
> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.
Yes, it is critical.
How about a sequence like this:
* Drop indices (keeping the index descriptions somewhere)
* Vacuum the table
* Recreate indices
If something crashes, the user is notified
to re-run vacuum or recreate the indices by hand
when the system restarts.
I use a script like the one described above for vacuuming
- it really increases vacuum performance for large tables.
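The three-step sequence above could be scripted along these lines. This is a hedged sketch, not the script mentioned above (which is not shown in the thread): the function name, the table name and the index name are hypothetical, and the saved index descriptions are assumed to be complete CREATE INDEX statements.

```python
def vacuum_with_index_rebuild(table, index_definitions):
    """Return the SQL statements for drop / vacuum / recreate, in order.

    index_definitions maps each index name to the CREATE INDEX statement
    that was saved before dropping it, so the index can be rebuilt by hand
    if the run is interrupted.
    """
    statements = [f"DROP INDEX {name};" for name in index_definitions]
    statements.append(f"VACUUM {table};")
    statements.extend(index_definitions.values())
    return statements


# Hypothetical table and index, for illustration only.
saved = {"test_x_idx": "CREATE INDEX test_x_idx ON test (x);"}
for statement in vacuum_with_index_rebuild("test", saved):
    print(statement)
```

Keeping the saved definitions outside the database is what makes the manual recovery step described above possible.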
--
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains
From dms@wplus.net Wed Jan 19 15:42:49 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA26645
for <pgman@candle.pha.pa.us>; Wed, 19 Jan 2000 16:42:47 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id AAA65264;
Thu, 20 Jan 2000 00:39:02 +0300 (MSK)
Message-ID: <38862F86.20328BD3@wplus.net>
Date: Thu, 20 Jan 2000 00:41:26 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: Hiroshi Inoue <Inoue@tpf.co.jp>,
pgsql-hackers <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Index recreation in vacuum
References: <200001192132.QAA26048@candle.pha.pa.us>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr
Bruce Momjian wrote:
>
> We need two things:
>
> auto-create index on startup
IMHO, it has to be controlled by the user, because creating a large index
can take a number of hours. Sometimes it's better to live without
indices at all, and then build them by hand after the workday ends.
--
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains
From owner-pgsql-hackers@hub.org Thu Jan 20 23:51:34 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA13891
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 00:51:31 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id AAA91784;
Fri, 21 Jan 2000 00:47:07 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 00:45:38 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id AAA91495
for pgsql-hackers-outgoing; Fri, 21 Jan 2000 00:44:40 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id AAA91378
for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 00:44:04 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id AAA13592;
Fri, 21 Jan 2000 00:43:49 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001210543.AAA13592@candle.pha.pa.us>
Subject: [HACKERS] vacuum timings
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 21 Jan 2000 00:43:49 -0500 (EST)
CC: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
400MB and index is 160MB.
With index on the single int4 column, I got:
78 seconds for a vacuum
121 seconds for vacuum after deleting a single row
662 seconds for vacuum after deleting the entire table
With no index, I got:
43 seconds for a vacuum
43 seconds for vacuum after deleting a single row
43 seconds for vacuum after deleting the entire table
I find this quite interesting.
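To make the comparison explicit, the extra time attributable to the index can be read off the two sets of figures above. The sketch below only subtracts the posted numbers; the dictionary keys are labels chosen here for illustration.

```python
# Vacuum timings in seconds, copied from the message above.
with_index = {"no changes": 78, "one row deleted": 121, "all rows deleted": 662}
without_index = {"no changes": 43, "one row deleted": 43, "all rows deleted": 43}

# Extra seconds spent when the index is present, case by case.
index_overhead = {case: with_index[case] - without_index[case] for case in with_index}

for case, seconds in index_overhead.items():
    print(f"{case}: {seconds} s extra with the index")
```

The heap-only cost stays flat at 43 seconds, so all of the growth comes from index handling.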
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From owner-pgsql-hackers@hub.org Fri Jan 21 00:34:56 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA15559
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 01:34:55 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id BAA06108;
Fri, 21 Jan 2000 01:32:23 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 01:30:38 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id BAA03704
for pgsql-hackers-outgoing; Fri, 21 Jan 2000 01:27:53 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sunpine.krs.ru (SunPine.krs.ru [195.161.16.37])
by hub.org (8.9.3/8.9.3) with ESMTP id BAA01710
for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 01:26:44 -0500 (EST)
(envelope-from vadim@krs.ru)
Received: from krs.ru (dune.krs.ru [195.161.16.38])
by sunpine.krs.ru (8.8.8/8.8.8) with ESMTP id NAA01685;
Fri, 21 Jan 2000 13:26:33 +0700 (KRS)
Message-ID: <3887FC19.80305217@krs.ru>
Date: Fri, 21 Jan 2000 13:26:33 +0700
From: Vadim Mikheev <vadim@krs.ru>
Organization: OJSC Rostelecom (Krasnoyarsk)
X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-RELEASE i386)
X-Accept-Language: ru, en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: Tom Lane <tgl@sss.pgh.pa.us>,
PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] vacuum timings
References: <200001210543.AAA13592@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
Bruce Momjian wrote:
>
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> 400MB and index is 160MB.
>
> With index on the single int4 column, I got:
> 78 seconds for a vacuum
> 121 seconds for vacuum after deleting a single row
> 662 seconds for vacuum after deleting the entire table
>
> With no index, I got:
> 43 seconds for a vacuum
> 43 seconds for vacuum after deleting a single row
> 43 seconds for vacuum after deleting the entire table
With or without -F?
Vadim
************
From Inoue@tpf.co.jp Fri Jan 21 00:40:35 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA15684
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 01:40:33 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id PAA04316; Fri, 21 Jan 2000 15:40:35 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
"Tom Lane" <tgl@sss.pgh.pa.us>
Subject: RE: [HACKERS] vacuum timings
Date: Fri, 21 Jan 2000 15:46:15 +0900
Message-ID: <000201bf63db$36cdae20$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <200001210543.AAA13592@candle.pha.pa.us>
Status: RO
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
>
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> 400MB and index is 160MB.
>
> With index on the single int4 column, I got:
> 78 seconds for a vacuum
vc_vaconeind() is called once
> 121 seconds for vacuum after deleting a single row
vc_vaconeind() is called twice
Hmmm, vc_vaconeind() takes a pretty long time even if it does little.
> 662 seconds for vacuum after deleting the entire table
>
How about the case where half of the rows are deleted?
It would take longer.
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
From owner-pgsql-hackers@hub.org Fri Jan 21 12:00:49 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13329
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 13:00:47 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id MAA96106;
Fri, 21 Jan 2000 12:55:34 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 12:53:53 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id MAA95775
for pgsql-hackers-outgoing; Fri, 21 Jan 2000 12:52:54 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (root@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id MAA95720
for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 12:52:39 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id MAA12106;
Fri, 21 Jan 2000 12:51:53 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001211751.MAA12106@candle.pha.pa.us>
Subject: [HACKERS] Re: vacuum timings
In-Reply-To: <3641.948433911@sss.pgh.pa.us> from Tom Lane at "Jan 21, 2000 00:51:51
am"
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 21 Jan 2000 12:51:53 -0500 (EST)
CC: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> > 400MB and index is 160MB.
>
> > With index on the single int4 column, I got:
> > 78 seconds for a vacuum
> > 121 seconds for vacuum after deleting a single row
> > 662 seconds for vacuum after deleting the entire table
>
> > With no index, I got:
> > 43 seconds for a vacuum
> > 43 seconds for vacuum after deleting a single row
> > 43 seconds for vacuum after deleting the entire table
>
> > I find this quite interesting.
>
> How long does it take to create the index on your setup --- ie,
> if vacuum did a drop/create index, would it be competitive?
OK, new timings with -F enabled:
index   no index
519     same      load
247     "         first vacuum
40      "         other vacuums

1222    X         index creation
90      X         first vacuum
80      X         other vacuums

<1      90        delete one row
121     38        vacuum after delete 1 row

346     344       delete all rows
440     44        first vacuum
20      <1        other vacuums (index is still same size)
Conclusions:
o indexes never get smaller
o drop/recreate index is slower than vacuum of indexes
What other conclusions can be made?
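The second conclusion can be checked against the figures above for the one-row case. This is rough arithmetic only, under two labeled assumptions: the units are seconds, as in the earlier posting, and DROP INDEX itself costs nothing measurable (it was not timed).

```python
# Figures copied from the table above; units assumed to be seconds.
index_creation = 1222          # building the index on the loaded table
vacuum_with_index = 121        # vacuum after deleting one row, index present
vacuum_without_index = 38      # vacuum after deleting one row, no index

# Assumption: DROP INDEX is treated as free because it was not measured.
drop_recreate_total = vacuum_without_index + index_creation
ratio = drop_recreate_total / vacuum_with_index

print(drop_recreate_total, round(ratio, 1))
```

On these numbers, drop/recreate is roughly ten times slower than vacuuming with the index in place, but only for a table where almost nothing was deleted.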
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From scrappy@hub.org Fri Jan 21 12:45:38 2000
Received: from thelab.hub.org (nat200.60.mpoweredpc.net [142.177.200.60])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA14380
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 13:45:29 -0500 (EST)
Received: from localhost (scrappy@localhost)
by thelab.hub.org (8.9.3/8.9.1) with ESMTP id OAA68289;
Fri, 21 Jan 2000 14:45:35 -0400 (AST)
(envelope-from scrappy@hub.org)
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
Date: Fri, 21 Jan 2000 14:45:34 -0400 (AST)
From: The Hermit Hacker <scrappy@hub.org>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Tom Lane <tgl@sss.pgh.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <200001211751.MAA12106@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0001211443480.23487-100000@thelab.hub.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO
On Fri, 21 Jan 2000, Bruce Momjian wrote:
> OK, new timings with -F enabled:
>
> index   no index
> 519     same      load
> 247     "         first vacuum
> 40      "         other vacuums
>
> 1222    X         index creation
> 90      X         first vacuum
> 80      X         other vacuums
>
> <1      90        delete one row
> 121     38        vacuum after delete 1 row
>
> 346     344       delete all rows
> 440     44        first vacuum
> 20      <1        other vacuums (index is still same size)
>
> Conclusions:
>
> o indexes never get smaller
this one, I thought, was a known? if I remember right, Vadim changed it
so that space was reused, but index never shrunk in size ... no?
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
From tgl@sss.pgh.pa.us Fri Jan 21 13:06:35 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA14618
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 14:06:33 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id OAA16501;
Fri, 21 Jan 2000 14:06:31 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: vacuum timings
In-reply-to: <200001211751.MAA12106@candle.pha.pa.us>
References: <200001211751.MAA12106@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Fri, 21 Jan 2000 12:51:53 -0500"
Date: Fri, 21 Jan 2000 14:06:31 -0500
Message-ID: <16498.948481591@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Conclusions:
> o indexes never get smaller
Which we knew...
> o drop/recreate index is slower than vacuum of indexes
Quite a few people have reported finding the opposite in practice.
You should probably try vacuuming after deleting or updating some
fraction of the rows, rather than just the all or none cases.
regards, tom lane
From dms@wplus.net Fri Jan 21 13:51:27 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA15623
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 14:51:24 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id WAA89451;
Fri, 21 Jan 2000 22:46:19 +0300 (MSK)
Message-ID: <3888B822.28F79A1F@wplus.net>
Date: Fri, 21 Jan 2000 22:48:50 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
References: <200001211751.MAA12106@candle.pha.pa.us> <16498.948481591@sss.pgh.pa.us>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr
Tom Lane wrote:
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Conclusions:
> > o indexes never get smaller
>
> Which we knew...
>
> > o drop/recreate index is slower than vacuum of indexes
>
> Quite a few people have reported finding the opposite in practice.
I'm one of them. On a 1.5 GB table with three indices it is about twice
as slow.
Probably because vacuuming indices breaks the system cache policy.
(FreeBSD 3.3)
--
Dmitry Samersoff, DM\S
dms@wplus.net http://devnull.wplus.net
* there will come soft rains
From owner-pgsql-hackers@hub.org Fri Jan 21 14:04:08 2000
Received: from hub.org (hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA16140
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 15:04:06 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id OAA34808;
Fri, 21 Jan 2000 14:59:30 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 14:57:48 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id OAA34320
for pgsql-hackers-outgoing; Fri, 21 Jan 2000 14:56:50 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.9.3/8.9.3) with ESMTP id OAA34255
for <pgsql-hackers@postgresql.org>; Fri, 21 Jan 2000 14:56:18 -0500 (EST)
(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id OAA15772;
Fri, 21 Jan 2000 14:54:22 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001211954.OAA15772@candle.pha.pa.us>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <3888B822.28F79A1F@wplus.net> from Dmitry Samersoff at "Jan 21,
2000 10:48:50 pm"
To: Dmitry Samersoff <dms@wplus.net>
Date: Fri, 21 Jan 2000 14:54:21 -0500 (EST)
CC: Tom Lane <tgl@sss.pgh.pa.us>,
PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO
[Charset koi8-r unsupported, filtering to ASCII...]
> Tom Lane wrote:
> >
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Conclusions:
> > > o indexes never get smaller
> >
> > Which we knew...
> >
> > > o drop/recreate index is slower than vacuum of indexes
> >
> > Quite a few people have reported finding the opposite in practice.
>
> I'm one of them. On a 1.5 GB table with three indices it is about twice
> as slow.
> Probably because vacuuming indices breaks the system cache policy.
> (FreeBSD 3.3)
OK, we are researching what things can be done to improve this. We are
toying with:
lock table for less duration, or read lock
creating another copy of heap/indexes, and rename() over old files
improving heap vacuum speed
improving index vacuum speed
moving analyze out of vacuum
--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
************
From scrappy@hub.org Fri Jan 21 14:12:16 2000
Received: from thelab.hub.org (nat200.60.mpoweredpc.net [142.177.200.60])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA16521
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 15:12:13 -0500 (EST)
Received: from localhost (scrappy@localhost)
by thelab.hub.org (8.9.3/8.9.1) with ESMTP id QAA69039;
Fri, 21 Jan 2000 16:12:25 -0400 (AST)
(envelope-from scrappy@hub.org)
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
Date: Fri, 21 Jan 2000 16:12:25 -0400 (AST)
From: The Hermit Hacker <scrappy@hub.org>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Dmitry Samersoff <dms@wplus.net>, Tom Lane <tgl@sss.pgh.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <200001211954.OAA15772@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO
On Fri, 21 Jan 2000, Bruce Momjian wrote:
> [Charset koi8-r unsupported, filtering to ASCII...]
> > Tom Lane wrote:
> > >
> > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > > Conclusions:
> > > > o indexes never get smaller
> > >
> > > Which we knew...
> > >
> > > > o drop/recreate index is slower than vacuum of indexes
> > >
> > > Quite a few people have reported finding the opposite in practice.
> >
> > I'm one of them. On a 1.5 GB table with three indices it is about twice
> > as slow.
> > Probably because vacuuming indices breaks the system cache policy.
> > (FreeBSD 3.3)
>
> OK, we are researching what things can be done to improve this. We are
> toying with:
>
> lock table for less duration, or read lock
if there is some way that we can work around the bug that I believe Tom
found with removing the lock altogether (ie. making use of MVCC), I think
that would be the best option ... if not possible, at least get things
down to a table lock vs the whole database?
a good example is the udmsearch that we are using on the site ... it uses
multiple tables to store the dictionary, each representing words of X size
... if I'm searching on a 4 letter word, and the whole database is locked
while it is working on the dictionary with 8 letter words, I'm sitting
there idle ... at least if we only locked the 8 letter table, everyone not
doing 8 letter searches can go on their merry way ...
Slightly longer vacuums, IMHO, are acceptable if, to the end users, it's
as transparent as possible ... locking per table would be slightly slower,
I think, because once a table is finished, the next table would need to
have an exclusive lock put on it before starting, so you'd have to
possibly wait for that...?
> creating another copy of heap/indexes, and rename() over old files
sounds to me like introducing a large potential for error here ...
> moving analyze out of vacuum
I think that should be done anyway ... if we ever get to the point that
we're able to re-use rows in tables, then that would eliminate the
immediate requirement for vacuum, but still retain a requirement for a
periodic analyze ... no?
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
From tgl@sss.pgh.pa.us Fri Jan 21 16:02:07 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA20290
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 17:02:06 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA09697;
Fri, 21 Jan 2000 17:02:06 -0500 (EST)
To: The Hermit Hacker <scrappy@hub.org>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-reply-to: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
References: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
Comments: In-reply-to The Hermit Hacker <scrappy@hub.org>
message dated "Fri, 21 Jan 2000 16:12:25 -0400"
Date: Fri, 21 Jan 2000 17:02:06 -0500
Message-ID: <9694.948492126@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO
The Hermit Hacker <scrappy@hub.org> writes:
>> lock table for less duration, or read lock
> if there is some way that we can work around the bug that I believe Tom
> found with removing the lock altogether (ie. making use of MVCC), I think
> that would be the best option ... if not possible, at least get things
> down to a table lock vs the whole database?
Huh? VACUUM only requires an exclusive lock on the table it is
currently vacuuming; there's no database-wide lock.
Even a single-table exclusive lock is bad, of course, if it's a large
table that's critical to a 24x7 application. Bruce was talking about
the possibility of having VACUUM get just a write lock on the table;
other backends could still read it, but not write it, during the vacuum
process. That'd be a considerable step forward for 24x7 applications,
I think.
It looks like that could be done if we rewrote the table as a new file
(instead of compacting-in-place), but there's a problem when it comes
time to rename the new files into place. At that point you'd need to
get an exclusive lock to ensure all the readers are out of the table too
--- and upgrading from a plain lock to an exclusive lock is a well-known
recipe for deadlocks. Not sure if this can be solved.
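The lock-upgrade hazard described above can be illustrated with a toy wait-for graph. This is only a sketch under stated assumptions: the session names, the `upgrade_waits_for` helper, and the graph model are hypothetical and are not taken from the PostgreSQL lock manager. It shows that when two sessions both hold a non-exclusive lock and both ask to upgrade, each waits for the other.

```python
def upgrade_waits_for(holders, requester):
    """A session upgrading to an exclusive lock must wait for every other holder."""
    return {h for h in holders if h != requester}


def has_deadlock(wait_for):
    """Return True if the wait-for graph (session -> set of sessions) has a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a session that is already on the current path
        visiting.add(node)
        found = any(visit(n) for n in wait_for.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in wait_for)


# Hypothetical sessions that both hold a non-exclusive lock on the same table.
holders = {"vacuum", "backend"}

# Only VACUUM upgrades: it waits for the backend, which waits for nobody.
one_upgrader = {"vacuum": upgrade_waits_for(holders, "vacuum"), "backend": set()}

# Both upgrade: each waits for the other to release its lock first.
two_upgraders = {s: upgrade_waits_for(holders, s) for s in holders}
```

With a single upgrader the graph has no cycle, so the upgrade eventually succeeds; with two upgraders `has_deadlock` reports a cycle, which is the situation the message warns about.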
regards, tom lane
From tgl@sss.pgh.pa.us Fri Jan 21 22:50:34 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA01657
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 23:50:28 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA19681;
Fri, 21 Jan 2000 23:50:13 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: vacuum timings
In-reply-to: <200001211751.MAA12106@candle.pha.pa.us>
References: <200001211751.MAA12106@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Fri, 21 Jan 2000 12:51:53 -0500"
Date: Fri, 21 Jan 2000 23:50:13 -0500
Message-ID: <19678.948516613@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Conclusions:
> o drop/recreate index is slower than vacuum of indexes
BTW, I did some profiling of CREATE INDEX this evening (quite
unintentionally actually; I was interested in COPY IN, but the pg_dump
script I used as driver happened to create some indexes too). I was
startled to discover that 60% of the runtime of CREATE INDEX is spent in
_bt_invokestrat (which is called from tuplesort.c's comparetup_index,
and exists only to figure out which specific comparison routine to call).
Of this, a whopping 4% was spent in the useful subroutine, int4gt. All
the rest went into lookup and validation checks that by rights should be
done once per index creation, not once per comparison.
In short: a fairly straightforward bit of optimization will eliminate
circa 50% of the CPU time consumed by CREATE INDEX. All we need is to
figure out where to cache the lookup results. The optimization would
improve insertions and lookups in indexes, as well, if we can cache
the lookup results in those scenarios.
This was for a table small enough that tuplesort.c could do the sort
entirely in memory, so I'm sure the gains would be smaller for a large
table that requires a disk-based sort. Still, it seems worth looking
into...
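The optimization proposed above (do the lookup once per index build rather than once per comparison) can be sketched as follows. This is an illustration only: the `Catalog` class and its `lookup_comparator` method are hypothetical stand-ins for the lookup and validation work attributed to _bt_invokestrat, not PostgreSQL code.

```python
from functools import cmp_to_key


class Catalog:
    """Hypothetical stand-in for the per-type comparison-routine lookup."""

    def __init__(self):
        self.lookups = 0  # counts how often the expensive lookup runs

    def lookup_comparator(self, type_name):
        self.lookups += 1
        if type_name != "int4":
            raise KeyError(type_name)
        # Three-way integer comparison: negative, zero, or positive.
        return lambda a, b: (a > b) - (a < b)


def sort_with_lookup_per_comparison(catalog, values, type_name):
    # Resolves the comparison routine again inside every comparison.
    def compare(a, b):
        return catalog.lookup_comparator(type_name)(a, b)

    return sorted(values, key=cmp_to_key(compare))


def sort_with_cached_lookup(catalog, values, type_name):
    # Resolves the comparison routine once, then reuses it for the whole sort.
    compare = catalog.lookup_comparator(type_name)
    return sorted(values, key=cmp_to_key(compare))
```

Both functions return the same ordering; the difference is that the cached variant performs one lookup per sort, while the other performs one per comparison.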
regards, tom lane
From owner-pgsql-hackers@hub.org Sat Jan 22 02:31:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA06743
for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 03:31:02 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.7 $) with ESMTP id DAA07529 for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 03:25:13 -0500 (EST)
Received: from localhost (majordom@localhost)
by hub.org (8.9.3/8.9.3) with SMTP id DAA31900;
Sat, 22 Jan 2000 03:19:53 -0500 (EST)
(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sat, 22 Jan 2000 03:17:56 -0500
Received: (from majordom@localhost)
by hub.org (8.9.3/8.9.3) id DAA31715
for pgsql-hackers-outgoing; Sat, 22 Jan 2000 03:16:58 -0500 (EST)
(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by hub.org (8.9.3/8.9.3) with ESMTP id DAA31647
for <pgsql-hackers@postgresql.org>; Sat, 22 Jan 2000 03:16:26 -0500 (EST)
(envelope-from Inoue@tpf.co.jp)
Received: from mcadnote1 (ppm114.noc.fukui.nsk.ne.jp [210.161.188.33])
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id RAA04754; Sat, 22 Jan 2000 17:14:43 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Subject: RE: [HACKERS] Re: vacuum timings
Date: Sat, 22 Jan 2000 17:15:37 +0900
Message-ID: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <16498.948481591@sss.pgh.pa.us>
Importance: Normal
Sender: owner-pgsql-hackers@postgresql.org
Status: RO
> -----Original Message-----
> From: owner-pgsql-hackers@postgresql.org
> [mailto:owner-pgsql-hackers@postgresql.org]On Behalf Of Tom Lane
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Conclusions:
> > o indexes never get smaller
>
> Which we knew...
>
> > o drop/recreate index is slower than vacuum of indexes
>
> Quite a few people have reported finding the opposite in practice.
> You should probably try vacuuming after deleting or updating some
> fraction of the rows, rather than just the all or none cases.
>
Vacuum after deleting all rows isn't a worst case.
There's no moving in that case and vacuum doesn't need to call
index_insert() corresponding to the moving of heap tuples.
Vacuum after deleting half of the rows may be one of the worst cases.
In this case, index_delete() is called as many times as in the 'delete all'
case and expensive index_insert() is called for moved_in tuples.
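A toy cost model makes the comparison concrete. It is a sketch under assumptions that go beyond the message: the table is a list of slots, vacuum compacts by moving live tuples from the tail into free slots at the front, and each move costs one index delete (old location) plus one index insert (new location). The function name `vacuum_index_calls` is hypothetical.

```python
def vacuum_index_calls(live):
    """live: list of booleans, True for a live tuple, False for a dead one.

    Returns (index_deletes, index_inserts) under the toy model above.
    """
    slots = list(live)
    deletes = slots.count(False)  # one index entry removed per dead tuple
    inserts = 0
    front, back = 0, len(slots) - 1
    while True:
        while front < back and slots[front]:
            front += 1  # skip live tuples at the front
        while front < back and not slots[back]:
            back -= 1  # skip dead tuples at the tail
        if front >= back:
            break
        # Move the live tuple at the tail into the free slot at the front.
        slots[front], slots[back] = True, False
        deletes += 1  # old index entry of the moved tuple
        inserts += 1  # new index entry at the tuple's new location
    return deletes, inserts


n = 1000
delete_all = vacuum_index_calls([False] * n)
delete_first_half = vacuum_index_calls([False] * (n // 2) + [True] * (n // 2))
```

In this model, deleting the first half of the rows produces as many index deletes as deleting everything, plus an index insert for every surviving tuple, while deleting all rows produces no inserts at all.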
Regards.
Hiroshi Inoue
Inoue@tpf.co.jp
************
From tgl@sss.pgh.pa.us Sat Jan 22 10:31:02 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA20882
for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 11:31:00 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.7 $) with ESMTP id LAA26612 for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 11:12:44 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA20569;
Sat, 22 Jan 2000 11:11:26 -0500 (EST)
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-reply-to: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
References: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
message dated "Sat, 22 Jan 2000 17:15:37 +0900"
Date: Sat, 22 Jan 2000 11:11:25 -0500
Message-ID: <20566.948557485@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Vacuum after deleting half of the rows may be one of the worst cases.
Or equivalently, vacuum after updating all the rows.
regards, tom lane
From tgl@sss.pgh.pa.us Thu Jan 20 23:51:49 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA13919
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 00:51:47 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA03644;
Fri, 21 Jan 2000 00:51:51 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: vacuum timings
In-reply-to: <200001210543.AAA13592@candle.pha.pa.us>
References: <200001210543.AAA13592@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Fri, 21 Jan 2000 00:43:49 -0500"
Date: Fri, 21 Jan 2000 00:51:51 -0500
Message-ID: <3641.948433911@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> 400MB and index is 160MB.
> With index on the single int4 column, I got:
> 78 seconds for a vacuum
> 121 seconds for vacuum after deleting a single row
> 662 seconds for vacuum after deleting the entire table
> With no index, I got:
> 43 seconds for a vacuum
> 43 seconds for vacuum after deleting a single row
> 43 seconds for vacuum after deleting the entire table
> I find this quite interesting.
How long does it take to create the index on your setup --- ie,
if vacuum did a drop/create index, would it be competitive?
regards, tom lane
From pgsql-hackers-owner+M5909@hub.org Thu Aug 17 20:15:33 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00644
for <pgman@candle.pha.pa.us>; Thu, 17 Aug 2000 20:15:32 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e7I0APm69660;
Thu, 17 Aug 2000 20:10:25 -0400 (EDT)
Received: from fw.wintelcom.net (bright@ns1.wintelcom.net [209.1.153.20])
by hub.org (8.10.1/8.10.1) with ESMTP id e7I01Jm68072
for <pgsql-hackers@postgresql.org>; Thu, 17 Aug 2000 20:01:19 -0400 (EDT)
Received: (from bright@localhost)
by fw.wintelcom.net (8.10.0/8.10.0) id e7I01IA20820
for pgsql-hackers@postgresql.org; Thu, 17 Aug 2000 17:01:18 -0700 (PDT)
Date: Thu, 17 Aug 2000 17:01:18 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] VACUUM optimization ideas.
Message-ID: <20000817170118.K4854@fw.wintelcom.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ROr
Here are two ideas I had for optimizing vacuum. I apologize in advance
if the ideas presented here are naive and don't take into account
the actual code that makes up postgresql.
================
#1
Reducing the time vacuum must hold an exclusive lock on a table:
The idea is that since rows are marked deleted it's ok for the
vacuum to fill them with data from the tail of the table as
long as no transaction is in progress that has started before
the row was deleted.
This may allow the vacuum process to copyback all the data without
a lock, when all the copying is done it then acquires an exclusive lock
and does this:
Acquire an exclusive lock.
Walk all the deleted data marking it as current.
Truncate the table.
Release the lock.
Since the data is still marked invalid (right?) even if valid data
is copied into the space it should be ignored as long as there's no
transaction occurring that started before the data was invalidated.
================
#2
Reducing the amount of scanning a vacuum must do:
It would make sense that if a value of the earliest deleted chunk
was kept in a table then vacuum would not have to scan the entire
table in order to work, it would only need to start at the 'earliest'
invalidated row.
The utility of this (at least for us) is that we have several tables
that will grow to hundreds of megabytes, however changes will only
happen at the tail end (recently added rows). If we could reduce the
amount of time spent in a vacuum state it would help us a lot.
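Idea #2 can be sketched as a table that remembers the position of its earliest deleted row, so a vacuum pass only scans from that point to the end. This is a minimal illustration, not PostgreSQL's storage layout: the `Table` class, its in-memory row list, and the `earliest_dead` field are all hypothetical.

```python
class Table:
    """Toy append-only table that tracks its earliest deleted row."""

    def __init__(self):
        self.rows = []  # each row is [value, is_dead]
        self.earliest_dead = None  # lowest position holding a deleted row

    def insert(self, value):
        self.rows.append([value, False])
        return len(self.rows) - 1

    def delete(self, pos):
        self.rows[pos][1] = True
        if self.earliest_dead is None or pos < self.earliest_dead:
            self.earliest_dead = pos

    def vacuum(self):
        """Remove dead rows; return how many rows had to be scanned."""
        if self.earliest_dead is None:
            return 0  # nothing was deleted since the last vacuum
        start = self.earliest_dead
        tail = self.rows[start:]
        self.rows[start:] = [row for row in tail if not row[1]]
        self.earliest_dead = None
        return len(tail)


table = Table()
for i in range(100):
    table.insert(i)
table.delete(95)
table.delete(98)
scanned = table.vacuum()
```

When deletions only touch recently added rows, as in the workload described above, the scan covers just the tail of the table (here 5 of 100 rows) rather than the whole thing.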
================
I'm wondering if these ideas make sense and may help at all.
thanks,
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
From pgsql-hackers-owner+M5912@hub.org Fri Aug 18 01:36:14 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA07787
for <pgman@candle.pha.pa.us>; Fri, 18 Aug 2000 01:36:12 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e7I5Q2m38759;
Fri, 18 Aug 2000 01:26:04 -0400 (EDT)
Received: from courier02.adinet.com.uy (courier02.adinet.com.uy [206.99.44.245])
by hub.org (8.10.1/8.10.1) with ESMTP id e7I5Bam35785
for <pgsql-hackers@postgresql.org>; Fri, 18 Aug 2000 01:11:37 -0400 (EDT)
Received: from adinet.com.uy (haroldo@r207-50-240-116.adinet.com.uy [207.50.240.116])
by courier02.adinet.com.uy (8.9.3/8.9.3) with ESMTP id CAA17259;
Fri, 18 Aug 2000 02:10:49 -0300 (GMT)
Message-ID: <399CC739.B9B13D18@adinet.com.uy>
Date: Fri, 18 Aug 2000 02:18:49 -0300
From: hstenger@adinet.com.uy
Reply-To: hstenger@ieee.org
Organization: PRISMA, Servicio y Desarrollo
X-Mailer: Mozilla 4.72 [en] (X11; I; Linux 2.2.14 i586)
X-Accept-Language: en
MIME-Version: 1.0
To: Alfred Perlstein <bright@wintelcom.net>, pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] VACUUM optimization ideas.
References: <20000817170118.K4854@fw.wintelcom.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ROr
Alfred Perlstein wrote:
> #1
>
> Reducing the time vacuum must hold an exclusive lock on a table:
>
> The idea is that since rows are marked deleted it's ok for the
> vacuum to fill them with data from the tail of the table as
> long as no transaction is in progress that has started before
> the row was deleted.
>
> This may allow the vacuum process to copyback all the data without
> a lock, when all the copying is done it then acquires an exclusive lock
> and does this:
>
> Acquire an exclusive lock.
> Walk all the deleted data marking it as current.
> Truncate the table.
> Release the lock.
>
> Since the data is still marked invalid (right?) even if valid data
> is copied into the space it should be ignored as long as there's no
> transaction occurring that started before the data was invalidated.
Yes, but nothing prevents newer transactions from modifying the _origin_ side of
the copied data _after_ it was copied, but before the Lock-Walk-Truncate-Unlock
cycle takes place, and so it seems unsafe. Maybe locking each record before
copying it up ...
Regards,
Haroldo.
--
----------------------+------------------------
Haroldo Stenger | hstenger@ieee.org
Montevideo, Uruguay. | hstenger@adinet.com.uy
----------------------+------------------------
Visit UYLUG Web Site: http://www.linux.org.uy
-----------------------------------------------
From pgsql-hackers-owner+M5917@hub.org Fri Aug 18 09:41:33 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA05170
for <pgman@candle.pha.pa.us>; Fri, 18 Aug 2000 09:41:33 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e7IDVjm75143;
Fri, 18 Aug 2000 09:31:46 -0400 (EDT)
Received: from andie.ip23.net (andie.ip23.net [212.83.32.23])
by hub.org (8.10.1/8.10.1) with ESMTP id e7IDPIm73296
for <pgsql-hackers@postgresql.org>; Fri, 18 Aug 2000 09:25:18 -0400 (EDT)
Received: from imap1.ip23.net (imap1.ip23.net [212.83.32.35])
by andie.ip23.net (8.9.3/8.9.3) with ESMTP id PAA58387;
Fri, 18 Aug 2000 15:25:12 +0200 (CEST)
Received: from ip23.net (spc.ip23.net [212.83.32.122])
by imap1.ip23.net (8.9.3/8.9.3) with ESMTP id PAA59177;
Fri, 18 Aug 2000 15:41:28 +0200 (CEST)
Message-ID: <399D3938.582FDB49@ip23.net>
Date: Fri, 18 Aug 2000 15:25:12 +0200
From: Sevo Stille <sevo@ip23.net>
Organization: IP23
X-Mailer: Mozilla 4.61 [en] (X11; I; Linux 2.2.10 i686)
X-Accept-Language: en, de
MIME-Version: 1.0
To: Alfred Perlstein <bright@wintelcom.net>
CC: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] VACUUM optimization ideas.
References: <20000817170118.K4854@fw.wintelcom.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO
Alfred Perlstein wrote:
> The idea is that since rows are marked deleted it's ok for the
> vacuum to fill them with data from the tail of the table as
> long as no transaction is in progress that has started before
> the row was deleted.
Well, isn't one of the advantages of vacuuming in the reordering it
does? With a "fill deleted chunks" logic, we'd have far less order in
the databases.
> This may allow the vacuum process to copyback all the data without
> a lock,
Nope. Another process might update the values in between move and mark,
if the record is not locked. We'd either have to write-lock the entire
table for that period, write lock every item as it is moved, or lock,
move and mark on a per-record base. The latter would be slow, but it
could be done in a permanent low priority background process, utilizing
empty CPU cycles. Besides, it probably could not only be done simply
filling from the tail, but also moving up the records in a sorted
fashion.
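The "lock, move and mark on a per-record base" option might look roughly like the sketch below. It is illustrative only: the `Record` class, the per-record `threading.Lock`, and the `compact` function are hypothetical, the demonstration runs single-threaded, and the final truncation step would need its own protection in a truly concurrent setting.

```python
import threading


class Record:
    def __init__(self, value, dead=False):
        self.value = value
        self.dead = dead
        self.lock = threading.Lock()  # hypothetical per-record lock


def compact(records):
    """Fill dead slots from the tail, locking one target/source pair at a time.

    Returns the number of records moved.
    """
    moved = 0
    front, back = 0, len(records) - 1
    while True:
        while front < back and not records[front].dead:
            front += 1  # find the next dead slot from the front
        while front < back and records[back].dead:
            back -= 1  # find the next live record from the tail
        if front >= back:
            break
        target, source = records[front], records[back]
        # Always lock the lower position first so lock order is consistent.
        with target.lock, source.lock:
            # Re-check under the locks in case either record changed meanwhile.
            if target.dead and not source.dead:
                target.value, target.dead = source.value, False
                source.dead = True  # mark the old copy as deleted
                moved += 1
    # Drop the dead records now gathered at the tail.
    while records and records[-1].dead:
        records.pop()
    return moved


records = [Record("a", dead=True), Record("b"), Record("c", dead=True),
           Record("d"), Record("e")]
moved = compact(records)
```

Each move holds locks on only two records, so other records stay available throughout; the cost is one lock round trip per moved record, which is why the message suggests running it as a low-priority background task.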
> #2
>
> Reducing the amount of scanning a vacuum must do:
>
> It would make sense that if a value of the earliest deleted chunk
> was kept in a table then vacuum would not have to scan the entire
> table in order to work, it would only need to start at the 'earliest'
> invalidated row.
Trivial to do. But of course #1 may imply that the physical ordering is
even less likely to be related to the logical ordering in a way where
this helps.
> The utility of this (at least for us) is that we have several tables
> that will grow to hundreds of megabytes, however changes will only
> happen at the tail end (recently added rows).
The tail is a relative position - except for the case where you add
temporary records to a constant default set, everything in the tail will
move, at least relatively, to the head after some time.
> If we could reduce the
> amount of time spent in a vacuum state it would help us a lot.
Rather: If we can reduce the time spent in a locked state while
vacuuming, it would help a lot. Being in a vacuum is not the issue -
even permanent vacuuming need not be an issue, if the locks it uses are
suitably short-time.
Sevo
--
sevo@ip23.net
From pgsql-hackers-owner+M5911@hub.org Thu Aug 17 21:11:20 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA01882
for <pgman@candle.pha.pa.us>; Thu, 17 Aug 2000 21:11:20 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e7I119m80626;
Thu, 17 Aug 2000 21:01:09 -0400 (EDT)
Received: from acheron.rime.com.au (root@albatr.lnk.telstra.net [139.130.54.222])
by hub.org (8.10.1/8.10.1) with ESMTP id e7I0wMm79870
for <pgsql-hackers@postgresql.org>; Thu, 17 Aug 2000 20:58:22 -0400 (EDT)
Received: from oberon (Oberon.rime.com.au [203.8.195.100])
by acheron.rime.com.au (8.9.3/8.9.3) with SMTP id KAA03215;
Fri, 18 Aug 2000 10:58:25 +1000
Message-Id: <3.0.5.32.20000818105835.0280ade0@mail.rhyme.com.au>
X-Sender: pjw@mail.rhyme.com.au
X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)
Date: Fri, 18 Aug 2000 10:58:35 +1000
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>,
Ben Adida <ben@openforce.net>
From: Philip Warner <pjw@rhyme.com.au>
Subject: Re: [HACKERS] Inserting a select statement result into another
table
Cc: Andrew Selle <aselle@upl.cs.wisc.edu>, pgsql-hackers@postgresql.org
In-Reply-To: <399C7689.2DDDAD1D@nimrod.itg.telecom.com.au>
References: <20000817130517.A10909@upl.cs.wisc.edu>
<399BF555.43FB70C8@openforce.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: O
At 09:34 18/08/00 +1000, Chris Bitmead wrote:
>
>He does ask a legitimate question though. If you are going to have a
>LIMIT feature (which of course is not pure SQL), there seems no reason
>you shouldn't be able to insert the result into a table.
This feature is supported by two commercial DBs: Dec/RDB and SQL/Server. I
have no idea if Oracle supports it, but it is such a *useful* feature that
I would be very surprised if it didn't.
>Ben Adida wrote:
>>
>> What is the purpose you're trying to accomplish with this order by? No matter what, all the
>> rows where done='f' will be inserted, and you will not be left with any indication of that
>> order once the rows are in the todolist table.
I don't know what his *purpose* was, but the query should only insert the
first two rows from the select (because of the limit).
>> Andrew Selle wrote:
>>
>> > Alright. My situation is this. I have a list of things that need to be done
>> > in a table called tasks. I have a list of users who will complete these tasks.
>> > I want these users to be able to come in and "claim" the top 2 most recent tasks
>> > that have been added. These tasks then get stored in a table called todolist
>> > which stores who claimed the task, the taskid, and when the task was claimed.
>> > For each time someone wants to claim some number of tasks, I want to do something
>> > like
>> >
>> > INSERT INTO todolist
>> > SELECT taskid,'1',now()
>> > FROM tasks
>> > WHERE done='f'
>> > ORDER BY submit DESC
>> > LIMIT 2;
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
From pgsql-hackers-owner+M29308@postgresql.org Mon Sep 23 09:47:54 2002
Return-path: <pgsql-hackers-owner+M29308@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8NDlqd00289
for <pgman@candle.pha.pa.us>; Mon, 23 Sep 2002 09:47:53 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id 7CA64476497; Mon, 23 Sep 2002 09:43:28 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id EDA70475BC3; Mon, 23 Sep 2002 09:43:20 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 85264476479
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 09:43:15 -0400 (EDT)
Received: from www.pspl.co.in (www.pspl.co.in [202.54.11.65])
by postgresql.org (Postfix) with ESMTP id C7899476477
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 09:43:12 -0400 (EDT)
Received: (from root@localhost)
by www.pspl.co.in (8.11.6/8.11.6) id g8NDiQ030526
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 19:14:26 +0530
Received: from daithan (daithan.intranet.pspl.co.in [192.168.7.161])
by www.pspl.co.in (8.11.6/8.11.0) with ESMTP id g8NDiQ330521;
Mon, 23 Sep 2002 19:14:26 +0530
From: "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>
To: pgsql-hackers@postgresql.org, pgsql-general@postgresql.org
Date: Mon, 23 Sep 2002 19:13:44 +0530
MIME-Version: 1.0
Subject: [HACKERS] Postgresql Automatic vacuum
Reply-To: shridhar_daithankar@persistent.co.in
Message-ID: <3D8F67E8.7500.4E0E180@localhost>
X-Mailer: Pegasus Mail for Windows (v4.02)
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Content-Description: Mail message body
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Status: OR
Hello All,
I have written a small daemon that can automatically vacuum PostgreSQL
databases, depending upon activity per table.
It sits on top of the postgres statistics collector. The postgres installation
should have per-row statistics collection enabled.
Features are,
* Vacuuming based on activity on the table
* Per table vacuum. So only heavily updated tables are vacuumed.
* multiple databases supported
* Performs 'vacuum analyze' only, so it will not block the database
The project location is
http://gborg.postgresql.org/project/pgavd/projdisplay.php
Let me know about bugs, improvements and comments.
I am sure real-world postgres installations have some sort of scripts doing
a similar thing. This is an attempt to provide a generic interface to periodic
vacuum.
Bye
Shridhar
--
The Abrams' Principle: The shortest distance between two points is off the
wall.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
From pgsql-hackers-owner+M29344@postgresql.org Tue Sep 24 02:42:36 2002
Return-path: <pgsql-hackers-owner+M29344@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8O6gYg19416
for <pgman@candle.pha.pa.us>; Tue, 24 Sep 2002 02:42:35 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id 128704762AF; Tue, 24 Sep 2002 02:42:36 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id DE80C4760F5; Tue, 24 Sep 2002 02:42:32 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 40A8A475DBC
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 02:42:25 -0400 (EDT)
Received: from relay.icomedias.com (relay.icomedias.com [62.99.232.66])
by postgresql.org (Postfix) with ESMTP id 7ECC8475DAD
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 02:42:23 -0400 (EDT)
Received: from loki ([10.192.17.128])
by relay.icomedias.com (8.12.5/8.12.5) with ESMTP id g8O6g8BX014226;
Tue, 24 Sep 2002 08:42:09 +0200
Content-Type: text/plain;
charset="iso-8859-1"
From: Mario Weilguni <mweilguni@sime.com>
To: shridhar_daithankar@persistent.co.in, matthew@zeut.net
Subject: Re: [HACKERS] Postgresql Automatic vacuum
Date: Tue, 24 Sep 2002 08:42:06 +0200
User-Agent: KMail/1.4.3
cc: pgsql-hackers@postgresql.org
References: <3D8F67E8.7500.4E0E180@localhost> <3D9050B2.9782.86E55C0@localhost>
In-Reply-To: <3D9050B2.9782.86E55C0@localhost>
MIME-Version: 1.0
Message-ID: <200209240842.06459.mweilguni@sime.com>
avpresult: 0, ok, ok
X-Scanned-By: MIMEDefang 2.16 (www . roaringpenguin . com / mimedefang)
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id g8O6gYg19416
Status: OR
On Tuesday, 24 September 2002 08:16, Shridhar Daithankar wrote:
>
> > I will play with it more and give you some more feedback.
>
> Awaiting that.
>
IMO there are still several problems with that approach, namely:
* every database will get "polluted" with the autovacuum table, which is undesired
* the biggest problem is the ~/.pgavrc file. I think it should work like other postgres utils do, e.g. supporting -U, -d, ....
* it's not possible to use without actively administering the config file. It should be able to work without
administrator assistance.
When this is a daemon, why not store the data in memory? Even with several thousands of tables the memory footprint would
still be small. And it should be possible to use for all databases without modifying a config file.
Two weeks ago I began writing a similar daemon, but had no time yet to finish it. I've tried to avoid using fixed numbers (namely "vacuum table
after 1000 updates") and tried to make my own heuristic based on the statistics data and the size of the table. The reason is, for a large table 1000 entries might be
a small percentage and vacuum is not necessary, while for small tables 10 updates might be sufficient.
Best regards,
Mario Weilguni
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
From pgsql-hackers-owner+M29345@postgresql.org Tue Sep 24 03:02:50 2002
Return-path: <pgsql-hackers-owner+M29345@postgresql.org>
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8O72lg21051
for <pgman@candle.pha.pa.us>; Tue, 24 Sep 2002 03:02:48 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP
id 9B3EA4762F6; Tue, 24 Sep 2002 03:02:48 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP
id 902EA476020; Tue, 24 Sep 2002 03:02:45 -0400 (EDT)
Received: from localhost (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with ESMTP id 98689475DAD
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 03:02:18 -0400 (EDT)
Received: from www.pspl.co.in (www.pspl.co.in [202.54.11.65])
by postgresql.org (Postfix) with ESMTP id 47B8647592C
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 03:02:16 -0400 (EDT)
Received: (from root@localhost)
by www.pspl.co.in (8.11.6/8.11.6) id g8O73QQ16318
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 12:33:26 +0530
Received: from daithan (daithan.intranet.pspl.co.in [192.168.7.161])
by www.pspl.co.in (8.11.6/8.11.0) with ESMTP id g8O73Q316313
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 12:33:26 +0530
From: "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>
To: pgsql-hackers@postgresql.org
Date: Tue, 24 Sep 2002 12:32:43 +0530
MIME-Version: 1.0
Subject: Re: [HACKERS] Postgresql Automatic vacuum
Reply-To: shridhar_daithankar@persistent.co.in
Message-ID: <3D905B6B.1635.898382A@localhost>
References: <3D9050B2.9782.86E55C0@localhost>
In-Reply-To: <200209240842.06459.mweilguni@sime.com>
X-Mailer: Pegasus Mail for Windows (v4.02)
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Content-Description: Mail message body
X-Virus-Scanned: by AMaViS new-20020517
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
X-Virus-Scanned: by AMaViS new-20020517
Status: OR
On 24 Sep 2002 at 8:42, Mario Weilguni wrote:
> On Tuesday, 24 September 2002 08:16, Shridhar Daithankar wrote:
> IMO there are still several problems with that approach, namely:
> * every database will get "polluted" with the autovacuum table, which is undesired
I agree. But that was the best alternative I could see. Explanation
follows. Besides, I didn't want to touch PG metadata.
> * the biggest problem is the ~/.pgavrc file. I think it should work like other postgres utils do, e.g. supporting -U, -d, ....
Shouldn't be a problem. The config stuff is working and I can add that. I would
rather term it a minor issue. On personal preference, I would just fire it
without any arguments. It's not a thing that you change daily. Configure it in
config file and done..
> * it's not possible to use without actively administering the config file. It should be able to work without
> administrator assistance.
Well. I would call that tuning. Each admin can tune it. Yes it's an effort but
certainly not an active administration.
> When this is a daemon, why not store the data in memory? Even with several thousands of tables the memory footprint would
> still be small. And it should be possible to use for all databases without modifying a config file.
Well. When postgresql has the ability to deal with an arbitrary number of rows, it
seemed redundant to me to duplicate all that functionality. Why write lists
and arrays again and again? Let postgresql do it.
> Two weeks ago I began writing a similar daemon, but had no time yet to finish it. I've tried to avoid using fixed numbers (namely "vacuum table
> after 1000 updates") and tried to make my own heuristic based on the statistics data and the size of the table. The reason is, for a large table 1000 entries might be
> a small percentage and vacuum is not necessary, while for small tables 10 updates might be sufficient.
Well, that fixed number is not really fixed but admin tunable, that too per
database. These are just defaults. Tune it to suit your needs.
The objective of the whole exercise is to get rid of periodic vacuum, as this app
shifts the threshold to activity rather than time.
Besides, a table should be vacuumed when it starts affecting performance. On an
installation, if changing 1K rows in a table of 1M rows affects performance, there
will be a similar performance hit for a 1K-row update on a 100K-row table,
because the overhead involved would be almost the same. (Not disk space: pgavd does not
target vacuum full, but tuple size should matter.)
At least me thinks so..
I plan to implement per table threshold in addition to per database thresholds.
But right now, it seems like overhead to me. Besides there is an item in TODO,
to shift unit of work from rows to blocks affected. I guess that takes care of
some of your points..
Bye
Shridhar
--
Jones' Second Law: The man who smiles when things go wrong has thought of
someone to blame it on.
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$ Revision: 1.18 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10) with ESMTP
id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
Sender: selkovjr@mcs.anl.gov
Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
Date: Sat, 25 Jul 1998 08:25:49 +0000
From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
Organization: MCS, Argonne Natl. Lab
X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
MIME-Version: 1.0
To: Bruce Momjian <maillist@candle.pha.pa.us>
Subject: position-aware scanners
References: <199807250524.BAA07296@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: RO
Bruce,
I attached here (through the web links) a couple of examples, totally
irrelevant to postgres but good enough to discuss token locations. I
might as well try to patch the backend parser, though I am not sure how soon.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1.
The first C parser I wrote,
http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
sophisticated, so token locations reported by yyerror() may be slightly
incorrect (+/- one position, depending on the existence and type of the
lookahead token). It is a filter used to typeset the units of measurement
with eqn. To use it, unpack the tar file and run make. The Makefile is
not too generic, but I built it on various systems including linux,
freebsd and sunos 4.3. The invocation can be something like this:
./check 0 parse "l**3/(mmoll*min)"
parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
`'(''
l**3/(mmoll*min)
^^^^^
Now to the guts. As far as I can imagine, the only way to consistently
keep track of each character read by the scanner (regardless of the
length of expressions it will match) is to redefine its YY_INPUT like
this:
#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
{ \
int c = (int) buffer[pos++]; \
result = (c == '\0') ? YY_NULL : (buf[0] = c, 1); \
}
Here, buffer is the pointer to the origin of the string being scanned
and pos is a global variable, similar in usage to a file pointer (you
can both read and manipulate it at will). The buffer and the pointer are
initialized by the function
void setString(char *s)
{
buffer = s;
pos = 0;
}
each time the new string is to be parsed. This (exportable) function is
part of the interface.
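The position-tracking input described above can be tried outside of lex with a small stand-alone sketch. This is an assumption-laden illustration, not the code from unit-troff.tgz: `readChar` and `currentPos` are hypothetical helpers standing in for the YY_INPUT body, and unlike the macro above, this version does not advance pos past the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>

static const char *buffer;  /* origin of the string being scanned */
static size_t pos;          /* index of the next character to hand out */

/* Reset the scanner input, as setString() does in the text above. */
void setString(const char *s)
{
    buffer = s;
    pos = 0;
}

/* Stand-in for the YY_INPUT body: store one character in buf[0] and
 * return 1, or return 0 (the role YY_NULL plays) at end of string. */
int readChar(char *buf)
{
    assert(buffer != NULL && buf != NULL);
    char c = buffer[pos];
    if (c == '\0')
        return 0;
    buf[0] = c;
    pos++;
    return 1;
}

/* Position an error routine could report: characters consumed so far. */
size_t currentPos(void)
{
    return pos;
}
```

Because every character passes through one function, pos always equals the number of characters the scanner has consumed, regardless of how long the matched expressions are.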
In this simplistic design, yyerror() is part of the scanner module and
it uses the pos variable to report the location of unexpected tokens.
The downside of such an arrangement is that, when an error occurs, you
can't easily tell whether the offending token is the current one or the
lookahead; yyerror() just reports the position of the last token read
(be it $ (end of buffer) or something else):
./check 0 convert "mol/foo"
parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
`'(''
mol/foo
^^^
(should be at the beginning of "foo")
./check 0 convert "mmol//l"
parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
`'(''
mmol//l
^
(should be at the second '/')
I believe this is why most simple parsers made with yacc would report
parse errors being "at or near" some token, which is fair enough if the
expression is not too complex.
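The marker line under the offending text is easy to produce once the token's start column and length are known. A minimal sketch in C follows; `caretLine` is a hypothetical helper, not part of either tar file, and it assumes a 0-based column.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write `column` spaces followed by `length` carets into out, followed by
 * a terminating NUL. Returns 0 on success, -1 if out is too small. */
int caretLine(char *out, size_t out_size, size_t column, size_t length)
{
    assert(out != NULL);
    if (column + length + 1 > out_size)
        return -1;
    memset(out, ' ', column);
    memset(out + column, '^', length);
    out[column + length] = '\0';
    return 0;
}
```

For "mol/foo", the correct marker for "foo" is caretLine(line, sizeof line, 4, 3), which places three carets under the last three characters.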
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2. The second version of the same scanner,
http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
problem by recording exact locations of the tokens in each instance of
the token semantic data structure. The global,
UNIT_YYSTYPE unit_yylval;
would be normally used to export the token semantics (including its
original or modified text and location data) to the parser.
Unfortunately, I cannot show you the parser part in c, because that's
about when I stopped writing parsers in c. Instead, I included a small
test program, test.c, that mimics the parser's expectations for the
scanner data pretty well. I am assuming here that you are not interested
in digging someone else's ugly guts for relatively small bit of
information; let me know if I am wrong and I will send you the complete
perl code (also generated with bison).
To run this example, unpack the tar file and run make. Then do
gcc test.c scanner.o
and run a.out
Note the line
yylval = unit_getyylval();
in test.c. You will not normally need it in a C parser. It is enough to
define yylval as an external variable and link it to yylval in yylex().
In the bison-generated parser, yylval gets pushed into a stack (pointed
to by yylsp) each time a new token is read. For each syntax rule, the
bison macros @1, @2, ... are just shortcuts to locations in the stack 1,
2, ... levels deep. In the following code fragment, @3 refers to the
location info for the third term in the rule (INTEGER):
(sorry about perl, but I think you can do the same things in c without
significant changes to your existing parser)
term: base {
          $$ = $1;
          $$->{'order'} = 1;
      }
    | base EXP INTEGER {
          $$ = $1;
          $$->{'order'} = @3->{'text'};
          $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
          if ( $$->{'order'} == 0 ) {
              yyerror("Error: expecting a non-zero integer exponent");
              YYERROR;
          }
      }
which translates to:
($yyn == 10) && do {
    $yyval = $yyvsa[-1];
    $yyval->{'order'} = 1;
    last SWITCH;
};
($yyn == 11) && do {
    $yyval = $yyvsa[-3];
    $yyval->{'order'} = $yylsa[-1]->{'text'};
    $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
    if ( $yyval->{'order'} == 0 ) {
        yyerror("Error: expecting a non-zero integer exponent");
        goto yyerrlab1 ;
    }
    last SWITCH;
};
In C, you will have somewhat more complicated pointer arithmetic to address
the stack, but the usage of objects will be the same. Note here that it
is convenient to keep all information about the token in its location
info (yylsa, yylsp, yylloc, @n), while everything relating to the value
of the expression, or to the parse tree, is better placed in the
semantic stack (yyvsa, yyvsp, yylval, $n). Also note that in some cases
you can do semantic checks inside rules and report useful messages
before or instead of invoking yyerror().
Finally, it is useful to make the following wrapper function around
external yylex() in order to maintain your own token stack. Unlike the
parser's internal stack which is only as deep as the rule being reduced,
this one can hold all tokens recognized during the current run, and that
can be extremely helpful for error reporting and any transformations you
may need. In this way, you can even scan (tokenize) the whole buffer
before handing it off to the parser (who knows, you may need a token
ahead of what is currently seen by the parser):
sub tokenize {
    undef @tokenTable;
    my ($tok, $text, $name, $unit, $first_line, $first_column,
        $last_line, $last_column);
    # this is where the c-coded yylex is called;
    # UnitLex is the perl extension encapsulating it
    while ( ($tok = &UnitLex::yylex()) > 0 ) {
        ( $text, $name, $unit, $first_line, $first_column, $last_line,
          $last_column ) = &UnitLex::getyylval;
        push(@tokenTable,
            Unit::yyltype->new (
                'token' => $tok,
                'text' => $text,
                'name' => $name,
                'unit' => $unit,
                'first_line' => $first_line,
                'first_column' => $first_column,
                'last_line' => $last_line,
                'last_column' => $last_column,
            )
        )
    }
}
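For a parser written in C, the same token table could be kept as a growable array. The following is only a sketch under stated assumptions: `TokenLoc`, `TokenTable`, `tokenTablePush` and `tokenTableFree` are hypothetical names, and the record is cut down to the token code and column range.

```c
#include <assert.h>
#include <stdlib.h>

/* One record per token seen; the fields mirror a subset of the perl hash. */
typedef struct {
    int token;         /* token code returned by the scanner */
    int first_column;  /* column of the token's first character */
    int last_column;   /* column of the token's last character */
} TokenLoc;

typedef struct {
    TokenLoc *items;
    size_t    count;
    size_t    capacity;
} TokenTable;

/* Append one token record, doubling the storage when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int tokenTablePush(TokenTable *t, TokenLoc loc)
{
    assert(t != NULL);
    if (t->count == t->capacity) {
        size_t cap = t->capacity ? t->capacity * 2 : 16;
        TokenLoc *p = realloc(t->items, cap * sizeof *p);
        if (p == NULL)
            return -1;
        t->items = p;
        t->capacity = cap;
    }
    t->items[t->count++] = loc;
    return 0;
}

/* Release the storage and return the table to its empty state. */
void tokenTableFree(TokenTable *t)
{
    assert(t != NULL);
    free(t->items);
    t->items = NULL;
    t->count = 0;
    t->capacity = 0;
}
```

A zero-initialized TokenTable is a valid empty table, so a wrapper around yylex() can push one record per token and later index the table by token number.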
It is now a lot easier to handle various state-related problems, such as
backtracking and error reporting. The yylex() function as seen by the
parser might be constructed somewhat like this:
sub yylex {
    # $tokenNo is a global; now instead of a "file pointer",
    # as in the first example, we have a "token pointer"
    $yylloc = $tokenTable[$tokenNo];
    undef $yylval;
    # disregard this; name this block "computing semantic values"
    if ( $yylloc->{'token'} == UNIT) {
        $yylval = Unit::Operand->new(
            'unit' => Unit::Dict::unit($yylloc->{'unit'}),
            'base' => Unit::Dict::base($yylloc->{'unit'}),
            'scale' => Unit::Dict::scale($yylloc->{'unit'}),
            'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
            'loc' => $yylloc,
        );
    }
    elsif ( ($yylloc->{'token'} == INTEGER ) ||
            ($yylloc->{'token'} == POSITIVE_NUMBER) ) {
        $yylval = Unit::Operand->new(
            'unit' => '1',
            'base' => '1',
            'scale' => 1,
            'scaleToBase' => 1,
            'loc' => $yylloc,
        );
    }
    $tokenNo++;
    # This is all the parser needs to know about this token.
    # But we already made sure we saved everything we need to know.
    return($yylloc->{'token'});
}
Now the most interesting part, the error reporting routine:
sub yyerror {
    my ($str) = @_;
    my ($message, $start, $end, $loc);
    # This is the same as to say,
    # "obtain the location info for the current token"
    $loc = $tokenTable[$tokenNo-1];
    # You may use this routine for your own purposes or let parser use it
    if( $str ne 'parse error' ) {
        $message = "$str instead of `" . $loc->{'name'} . "' <" .
            $loc->{'text'} . ">, at line " . $loc->{'first_line'} . ":\n\n";
    }
    else {
        $message = "unexpected token `" . $loc->{'name'} . "' <" .
            $loc->{'text'} . ">, at line " . $loc->{'first_line'} . ":\n\n";
    }
    # $parseBuffer is the original string that was used to set the parser buffer
    $message .= $parseBuffer . "\n";
    $message .= ( ' ' x ($loc->{'first_column'} + 1) ) .
        ( '^' x length($loc->{'text'}) ). "\n";
    if( $str ne 'parse error' ) {
        print STDERR "$str instead of `", $loc->{'name'}, "' {",
            $loc->{'text'}, "}, at line ", $loc->{'first_line'}, ":\n\n";
    }
    else {
        print STDERR "unexpected token `", $loc->{'name'}, "' {",
            $loc->{'text'}, "}, at line ", $loc->{'first_line'}, ":\n\n";
    }
    print STDERR "$parseBuffer\n";
    print STDERR ' ' x ($loc->{'first_column'} + 1), '^' x
        length($loc->{'text'}), "\n";
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Scanners used in these examples assume there is a single line of text on
the input (the first_line and last_line elements of yylloc are simply
ignored). If you want to be able to parse multi-line buffers, just add a
lex rule for '\n' that will increment the line count and reset the pos
variable to zero.
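The multi-line bookkeeping suggested above amounts to a few lines of C. This sketch is not taken from the example scanners: `ScanPos`, `scanPosInit`, `scanPosAdvance` and `scanPosAt` are hypothetical names, and it assumes 1-based lines with a 0-based column.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int line;    /* 1-based line number */
    int column;  /* 0-based offset within the current line */
} ScanPos;

/* Start at the first character of the first line. */
void scanPosInit(ScanPos *p)
{
    assert(p != NULL);
    p->line = 1;
    p->column = 0;
}

/* Account for one consumed character: a newline bumps the line count and
 * resets the column, which is what the '\n' lex rule would do. */
void scanPosAdvance(ScanPos *p, char c)
{
    assert(p != NULL);
    if (c == '\n') {
        p->line++;
        p->column = 0;
    } else {
        p->column++;
    }
}

/* Compute the position reached after consuming `offset` characters. */
ScanPos scanPosAt(const char *text, size_t offset)
{
    assert(text != NULL);
    ScanPos p;
    scanPosInit(&p);
    for (size_t i = 0; i < offset && text[i] != '\0'; i++)
        scanPosAdvance(&p, text[i]);
    return p;
}
```

In a real scanner only the advance step would live in the '\n' rule; scanPosAt is included here just to make the behaviour easy to check.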
Ugly as it may seem, I find this approach extremely liberating. If the
grammar becomes too complicated for a LALR(1) parser, I can cascade
multiple parsers. The token table can then be used to reassemble parts
of original expression for subordinate parsers, preserving the location
info all the way down, so that subordinate parsers can report their
problems consistently. You probably don't need this, as SQL is very well
thought out and has a parsable grammar. But it may be of some help for
error reporting.
--Gene
From pgsql-patches-owner+M1499@postgresql.org Sat Aug 4 13:11:53 2001
Return-path: <pgsql-patches-owner+M1499@postgresql.org>
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f74HBrh11339
for <pgman@candle.pha.pa.us>; Sat, 4 Aug 2001 13:11:53 -0400 (EDT)
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
by postgresql.org (8.11.3/8.11.4) with SMTP id f74H89655183;
Sat, 4 Aug 2001 13:08:09 -0400 (EDT)
(envelope-from pgsql-patches-owner+M1499@postgresql.org)
Received: from sss.pgh.pa.us ([192.204.191.242])
by postgresql.org (8.11.3/8.11.4) with ESMTP id f74Gxb653074
for <pgsql-patches@postgresql.org>; Sat, 4 Aug 2001 12:59:37 -0400 (EDT)
(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f74GtPC29183;
Sat, 4 Aug 2001 12:55:25 -0400 (EDT)
To: Dave Page <dpage@vale-housing.co.uk>
cc: "'Fernando Nasser'" <fnasser@cygnus.com>,
Bruce Momjian <pgman@candle.pha.pa.us>, Neil Padgett <npadgett@redhat.com>,
pgsql-patches@postgresql.org
Subject: Re: [PATCHES] Patch for Improved Syntax Error Reporting
In-Reply-To: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
References: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
Comments: In-reply-to Dave Page <dpage@vale-housing.co.uk>
message dated "Sat, 04 Aug 2001 12:37:23 +0100"
Date: Sat, 04 Aug 2001 12:55:24 -0400
Message-ID: <29180.996944124@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-patches-owner@postgresql.org
Status: OR
Dave Page <dpage@vale-housing.co.uk> writes:
> Oh, I quite agree. I'm not averse to updating my code, I just want to avoid
> users getting misleading messages until I come up with those updates.
Hmm ... if they were actively misleading then I'd share your concern.
I guess what you're thinking is that the error offset reported by the
backend won't correspond directly to what the user typed, and if the
user tries to use the offset to manually count off characters, he may
arrive at the wrong place? Good point. I'm not sure whether a message
like
ERROR: parser: parse error at or near 'frum';
POSITION: 42
would be likely to encourage people to try that. Thoughts? (I do think
this is a good argument for not embedding the position straight into the
main error message though...)
One possible compromise is to combine the straight character-offset
approach with a simplistic context display:
ERROR: parser: parse error at or near 'frum';
POSITION: 42 ... oid,relname FRUM ...
The idea is to define the "POSITION" field as an integer offset possibly
followed by whitespace and noise words. An updated client would grab
the offset, ignore the rest of the field, and do the right thing. A
not-updated client would display the entire message, and with any luck
the user would read it correctly.
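On the client side, "grab the offset, ignore the rest of the field" could look roughly like the sketch below. It is an illustration only: `parsePosition` is a hypothetical helper, not a libpq function, and it assumes the client has already isolated the text after "POSITION:".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Extract the leading integer offset from a POSITION field and ignore any
 * trailing whitespace and noise words. Returns the offset, or -1 if the
 * field does not begin with a non-negative integer. */
long parsePosition(const char *field)
{
    assert(field != NULL);
    char *end;
    errno = 0;
    long offset = strtol(field, &end, 10);  /* skips leading whitespace */
    if (end == field || errno == ERANGE || offset < 0)
        return -1;
    return offset;
}
```

An updated client would use the returned offset to mark the error in the text the user actually typed, while an old client simply prints the whole field.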
regards, tom lane