good, you'll have to ram them down people's throats." -- Howard Aiken
From owner-pgsql-hackers@hub.org Tue Oct 19 10:31:10 1999
> 2. private cache holds uncommitted system tuples.
> 3. relpages of shared cache are updated immediately by
> physical change and corresponding buffer pages are
> marked dirty.
> 4. on commit, the contents of uncommitted tuples except
> relpages, reltuples, ... are copied to corresponding tuples
> in shared cache and the combined contents are
> committed.
> If so, catalog cache invalidation would no longer be needed.
> But synchronization of step 4 may be difficult.
I think the main problem is that relpages and reltuples shouldn't
be kept in pg_class columns at all, because they need to have
very different update behavior from the other pg_class columns.
The rest of pg_class is update-on-commit, and we can lock down any one
row in the normal MVCC way (if transaction A has modified a row and
transaction B also wants to modify it, B waits for A to commit or abort,
so it can know which version of the row to start from). Furthermore,
there can legitimately be several different values of a row in use in
different places: the latest committed, an uncommitted modification, and
one or more old values that are still being used by active transactions
because they were current when those transactions started. (BTW, the
present relcache is pretty bad about maintaining pure MVCC transaction
semantics like this, but it seems clear to me that that's the direction
we want to go in.)
relpages cannot operate this way. To be useful for avoiding lseeks,
relpages *must* change exactly when the physical file changes. It
matters not at all whether the particular transaction that extended the
file ultimately commits or not. Moreover there can be only one correct
value (per relation) across the whole system, because there is only one
length of the relation file.
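
To make concrete what an always-accurate relpages would save, here is a
minimal sketch of the file-length probe it stands in for. This is an
illustration only, not PostgreSQL source: the function name file_nblocks
is hypothetical, and BLCKSZ is assumed to be the default 8192-byte block
size.

```c
#include <sys/types.h>
#include <unistd.h>

/* Assumption: the default PostgreSQL block size of 8192 bytes. */
#define BLCKSZ 8192

/*
 * Hypothetical helper: return the number of whole blocks in the open
 * file fd, or -1 if lseek fails.  The answer depends only on the
 * physical file, not on whether the transaction that extended it
 * ever commits.
 */
long
file_nblocks(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    if (len < 0)
        return -1;
    return (long) (len / BLCKSZ);
}
```

A cached relpages is only a valid substitute for this call if it is
updated at the same moment the file is extended or truncated.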
If we want to take reltuples seriously and try to maintain it
on-the-fly, then I think it needs yet a third behavior. Clearly
it cannot be updated using MVCC rules, or we lose all writer
concurrency (if A has added tuples to a rel, B would have to wait
for A to commit before it could update reltuples...). Furthermore
"updating" isn't a simple matter of storing what you think the new
value is; otherwise two transactions adding tuples in parallel would
leave the wrong answer after B commits and overwrites A's value.
I think it would work for each transaction to keep track of a net delta
in reltuples for each table it's changed (total tuples added less total
tuples deleted), and then atomically add that value to the table's
shared reltuples counter during commit. But that still leaves the
problem of how you use the counter during a transaction to get an
accurate answer to the question "If I scan this table now, how many tuples
will I see?" At the time the question is asked, the current shared
counter value might include the effects of transactions that have
committed since your transaction started, and therefore are not visible
under MVCC rules. I think getting the correct answer would involve
making an instantaneous copy of the current counter at the start of
your xact, and then adding your own private net-uncommitted-delta to
the saved shared counter value when asked the question. This doesn't
look real practical --- you'd have to save the reltuples counts of
*all* tables in the database at the start of each xact, on the off
chance that you might need them. Ugh. Perhaps someone has a better
idea. In any case, reltuples clearly needs different mechanisms than
the ordinary fields in pg_class do, because updating it will be a
performance bottleneck otherwise.
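
The delta-plus-snapshot scheme described above could be sketched roughly
as follows. This is a simplified illustration for a single table using
C11 atomics, not a proposed implementation: every type and function name
here is hypothetical, the shared counter would really have to live in
shared memory, and taking the snapshot at transaction start is exactly
the cost that makes the idea look impractical across all tables.

```c
#include <stdatomic.h>

/* Hypothetical shared counter for one table. */
typedef struct SharedRelCounter
{
    atomic_long reltuples;
} SharedRelCounter;

/* Hypothetical per-transaction bookkeeping for the same table. */
typedef struct XactRelCounter
{
    long snapshot;  /* shared value copied at transaction start */
    long delta;     /* tuples added minus tuples deleted by this xact */
} XactRelCounter;

/* Copy the shared counter when the transaction starts. */
void
xact_start(XactRelCounter *x, SharedRelCounter *s)
{
    x->snapshot = atomic_load(&s->reltuples);
    x->delta = 0;
}

void
xact_note_insert(XactRelCounter *x, long n)
{
    x->delta += n;
}

void
xact_note_delete(XactRelCounter *x, long n)
{
    x->delta -= n;
}

/* "If I scan this table now, how many tuples will I see?" */
long
xact_visible_tuples(const XactRelCounter *x)
{
    return x->snapshot + x->delta;
}

/*
 * On commit, add the net delta to the shared counter atomically, so
 * that parallel writers never overwrite each other's contribution.
 * On abort the delta is simply thrown away.
 */
void
xact_commit(XactRelCounter *x, SharedRelCounter *s)
{
    atomic_fetch_add(&s->reltuples, x->delta);
}
```

With two transactions A and B started against a counter of 100, A adding
5 tuples and B adding a net 5, B still answers 105 after A commits, and
the shared counter ends at 110 once both have committed.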
If we allow reltuples to be updated only by vacuum-like events, as
it is now, then I think keeping it in pg_class is still OK.
In short, it seems clear to me that relpages should be removed from
pg_class and kept somewhere else if we want to make it more reliable
than it is now, and the same for reltuples (but reltuples doesn't
behave the same as relpages, and probably ought to be handled
differently).
regards, tom lane
************
From owner-pgsql-hackers@hub.org Tue Oct 19 21:25:30 1999
<H4><A NAME="4.16.2">4.16.2</A>) How do I get back the generated SERIAL value after an insert?</H4><P>
Probably the simplest approach is to retrieve the next SERIAL value from the sequence object with the <I>nextval()</I> function <I>before</I> inserting and then insert it explicitly. Using the example table in <A HREF="#4.16.1">4.16.1</A>, that might look like this:
<PRE>
$newSerialID = nextval('person_id_seq');
...
...
@@ -1069,12 +1056,12 @@ Similarly, you could retrieve the just-assigned SERIAL value with the <I>currval
</PRE>
Finally, you could use the <A HREF="#4.17">oid</A> returned from the INSERT statement to look up the default value, though this is probably the least portable approach. In Perl, using DBI with Edmund Mergl's DBD::Pg module, the oid value is made available via $sth->{pg_oid_status} after $sth->execute().
<H4><A NAME="4.16.3">4.16.3</A>) Wouldn't use of currval() and nextval() lead to a race condition with other concurrent backend processes?</H4><P>
No. That has been handled by the backends.
<H4><A NAME="4.17">4.17</A>) What is an oid? What is a tid?</H4><P>
Oids are PostgreSQL's answer to unique row ids. Every row that is
created in PostgreSQL gets a unique oid. All oids generated during
...
...
@@ -1111,7 +1098,7 @@ values. Tids change after rows are modified or reloaded. They are used
by index entries to point to physical rows.<P>
<H4><A NAME="4.18">4.18</A>) What is the meaning of some of the terms
used in PostgreSQL?</H4><P>
Some of the source code and older documentation use terms that have more
...
...
@@ -1206,20 +1193,20 @@ We hope to fix this limitation in a future release.
<H2><CENTER>Extending PostgreSQL</CENTER></H2><P>
<H4><A NAME="5.1">5.1</A>) I wrote a user-defined function. When
I run it in <I>psql,</I> why does it dump core?</H4><P>
The problem could be a number of things. Try testing your user-defined
function in a stand-alone test program first.
<H4><A NAME="5.2">5.2</A>) What does the message:
<I>NOTICE:PortalHeapMemoryFree: 0x402251d0 not in alloc set!</I> mean?</H4><P>
You are <I>pfree'ing</I> something that was not <I>palloc'ed.</I>
Beware of mixing <I>malloc/free</I> and <I>palloc/pfree.</I>
<H4><A NAME="5.3">5.3</A>) How can I contribute some nifty new types and
functions for PostgreSQL?</H4><P>
...
...
@@ -1227,13 +1214,13 @@ Send your extensions to the pgsql-hackers mailing list, and they will
eventually end up in the <I>contrib/</I> subdirectory.<P>
<H4><A NAME="5.4">5.4</A>) How do I write a C function to return a
tuple?</H4><P>
This requires wizardry so extreme that the authors have never
tried it, though in principle it can be done.<P>
<H4><A NAME="5.5">5.5</A>) I have changed a source file. Why does the
recompile not see the change?</H4><P>
The Makefiles do not have the proper dependencies for include files. You