Commits · 7505da2f459776a397177f7d591991f5591c2812 · Abuhujair Javed / Postgres FD Implementation

15 May, 2019 2 commits

Reverse order of newitem nbtree candidate splits. · 7505da2f

Peter Geoghegan authored May 15, 2019

Commit fab25024, which taught nbtree to choose candidate split points
more carefully, had _bt_findsplitloc() record all possible split points
in an initial pass over a page that is about to be split.  The order
that candidate split points were processed and stored in was assumed to
match the offset number order of split points on an imaginary version of
the page that contains the same items as the original, but also fits
newitem (the item that provoked the split precisely because it didn't
fit).

However, the order of split points in the final array was not quite what
was expected: the split point that makes newitem the firstright item
came after the split point that makes newitem the lastleft item -- not
before.  As a result, _bt_findsplitloc() could get confused about the
leftmost and rightmost tuples among all possible split points recorded
for the page.  This seems to have no appreciable impact on the quality
of the final split point chosen by _bt_findsplitloc(), but it's still
wrong.

To fix, switch the order in which newitem candidate splits are recorded
in.  This also makes it possible to describe candidate split points in
terms of which pair of adjoining tuples enclose the split point within
_bt_findsplitloc(), making it clearer why it's generally safe for
_bt_split() to expect lastleft and firstright tuples.

7505da2f

docs: properly indent PG 12 release notes · a429164e
Bruce Momjian authored May 15, 2019

a429164e

14 May, 2019 20 commits

Handle table_complete_speculative's succeeded argument as documented. · aa4b8c61

Andres Freund authored May 14, 2019

For some reason both callsite and the implementation for heapam had
the meaning inverted (i.e. succeeded == true was passed in case of
conflict). That's confusing.

I (Andres) briefly pondered whether it'd be better to rename
table_complete_speculative's argument to 'bool specConflict' or such,
but decided not to. The 'complete' in the function name for me makes
`succeeded` sound a bit better.

Reported-By: Ashwin Agrawal, Melanie Plageman, Heikki Linnakangas
Discussion:
   https://postgr.es/m/CALfoeitk7-TACwYv3hCw45FNPjkA86RfXg4iQ5kAOPhR+F1Y4w@mail.gmail.com
   https://postgr.es/m/97673451-339f-b21e-a781-998d06b1067c@iki.fi

aa4b8c61

Add isolation test for INSERT ON CONFLICT speculative insertion failure. · 08e2edc0

Andres Freund authored May 14, 2019

This path previously was not reliably covered. There was some
heuristic coverage via insert-conflict-toast.spec, but that test is
not deterministic, and only tested for a somewhat specific bug.

Backpatch, as this is a complicated and otherwise untested code
path. Unfortunately 9.5 cannot handle two waiting sessions, and thus
cannot execute this test.

Triggered by a conversion with Melanie Plageman.

Author: Andres Freund
Discussion: https://postgr.es/m/CAAKRu_a7hbyrk=wveHYhr4LbcRnRCG=yPUVoQYB9YO1CdUBE9Q@mail.gmail.com
Backpatch: 9.5-

08e2edc0

Fix "make clean" to clean out junk files left behind after ssl tests. · 6d2fba31
Tom Lane authored May 14, 2019
```
We .gitignore'd this junk, but we didn't actually remove it.
```
6d2fba31

Move logging.h and logging.c from src/fe_utils/ to src/common/. · fc9a62af

Tom Lane authored May 14, 2019

The original placement of this module in src/fe_utils/ is ill-considered,
because several src/common/ modules have dependencies on it, meaning that
libpgcommon and libpgfeutils now have mutual dependencies.  That makes it
pointless to have distinct libraries at all.  The intended design is that
libpgcommon is lower-level than libpgfeutils, so only dependencies from
the latter to the former are acceptable.

We already have the precedent that fe_memutils and a couple of other
modules in src/common/ are frontend-only, so it's not stretching anything
out of whack to treat logging.c as a frontend-only module in src/common/.
To the extent that such modules help provide a common frontend/backend
environment for the rest of common/ to use, it's a reasonable design.
(logging.c does not yet provide an ereport() emulation, but one can
dream.)

Hence, move these files over, and revert basically all of the build-system
changes made by commit cc8d4151.  There are no places that need to grow
new dependencies on libpgcommon, further reinforcing the idea that this
is the right solution.

Discussion: https://postgr.es/m/a912ffff-f6e4-778a-c86a-cf5c47a12933@2ndquadrant.com

fc9a62af

docs: Indent listitem tags in PG 12 release notes · b71dad22
Bruce Momjian authored May 14, 2019

b71dad22

Remove pg_rewind's private logging.h/logging.c files. · 53ddefba

Tom Lane authored May 14, 2019

The existence of these files became rather confusing with the
introduction of a widely-known logging.h header in commit cc8d4151.
(Indeed, there's already some duplicative #includes here, perhaps
betraying such confusion.)  The only thing left in them, after that
commit, is a progress-reporting function that's neither general-purpose
nor tied in any way to other logging infrastructure.  Hence, let's just
move that function to pg_rewind.c, and get rid of the separate files.

Discussion: https://postgr.es/m/3971.1557787914@sss.pgh.pa.us

53ddefba

Fix SQL-style substring() to have spec-compliant greediness behavior. · 7c850320

Tom Lane authored May 14, 2019

SQL's regular-expression substring() function is defined to have a
pattern argument that's separated into three subpatterns by escape-
double-quote markers; the function result is the part of the input
matching the second subpattern. The standard makes it clear that
if there is ambiguity about how to match the input to the subpatterns,
the first and third subpatterns should be taken to match the smallest
possible amount of text (i.e., they're "non greedy", in the terms of
our regex code). We were not doing it that way: the first subpattern
would eat the largest possible amount of text, causing the function
result to be shorter than what the spec requires.

Fix that by attaching explicit greediness quantifiers to the
subpatterns. (This depends on the regex fix in commit 8a29ed05;
before that, this didn't reliably change the regex engine's behavior.)

Also, by adding parentheses around each subpattern, we ensure that
"|" (OR) in the subpatterns behave sanely. Previously, "|" in the
first or third subpatterns didn't work.

This patch also makes the function throw error if you write more than
two escape-double-quote markers, and do something sane if you write
just one, and document that behavior. Previously, an odd number of
markers led to a confusing complaint about unbalanced parentheses,
while extra pairs of markers were just ignored. (Note that the spec
requires exactly two markers, but we've historically allowed there
to be none, and this patch preserves the old behavior for that case.)

In passing, adjust some substring() test cases that didn't really
prove what they said they were testing for: they used patterns
that didn't match the data string, so that the output would be
NULL whether or not the function was really strict.

Although this is certainly a bug fix, changing the behavior in back
branches seems undesirable: applications could perhaps be depending on
the old behavior, since it's not obviously wrong unless you read the
spec very closely. Hence, no back-patch.

Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net

7c850320

In bootstrap mode, use default signal handling for SIGINT etc. · fb489e4b

Tom Lane authored May 14, 2019

Previously, the code pointed the standard process-termination signals
to postgres.c's die(). That would typically result in an attempt to
execute a transaction abort, which is not possible in bootstrap mode,
leading to PANIC. This choice seems to be a leftover from an old code
structure in which the same signal-assignment code was used for many
sorts of auxiliary processes, including interactive standalone
backends. It's not very sensible for bootstrap mode, which has no
interest in either interactivity or continuing after an error. We can
get better behavior with less effort by just letting normal process
termination happen, after which the parent initdb process will clean up.

This is basically cosmetic in any case, since initdb will react the
same way whether bootstrap dies on a signal or abort(). Given the
lack of previous complaints, I don't feel a need to back-patch,
even though the behavior is old.

Discussion: https://postgr.es/m/3850b11a.5121.16aaf827e4a.Coremail.thunder1@126.com

fb489e4b

Update SQL features/conformance information to SQL:2016 · 037165ca
Peter Eisentraut authored May 14, 2019

037165ca
Update information_schema for SQL:2016 · eb3a1376
Peter Eisentraut authored May 14, 2019
```
This is mainly a light renumbering to match the sections in the
standard.
```
eb3a1376

Update SQL keywords list to SQL:2016 · c29ba981

Peter Eisentraut authored May 14, 2019

Per previous convention (see
ace397e9), drop SQL:2008 and only keep
the latest two standards and SQL-92.

Note: SQL:2016-2 lists a large number of non-reserved keywords that
are really just information_schema column names related to new
features.  Those kinds of thing have not previously been listed as
keywords, and this was apparently done here by mistake, since these
keywords have been removed again in post-2016 working drafts.  So in
order to avoid bloating the keywords table unnecessarily, I have
omitted these erroneous keywords here.

c29ba981

docs: update partition item in PG 12 release notes · 356c8379

Bruce Momjian authored May 14, 2019

Reported-by: Amit Langote

Discussion: https://postgr.es/m/b7954643-41ef-a174-479d-1f8d4834f40a@lab.ntt.co.jp

356c8379

docs: fix duplicate wording in PG 12 release notes · 34d40bec

Bruce Momjian authored May 14, 2019

Reported-by: nickb@imap.cc

Discussion: https://postgr.es/m/6b3414e1-fcef-4ad9-b123-b3ab3702d3db@www.fastmail.com

34d40bec

Detect internal GiST page splits correctly during index build. · 22251686

Heikki Linnakangas authored May 14, 2019

As we descend the GiST tree during insertion, we modify any downlinks on
the way down to include the new tuple we're about to insert (if they don't
cover it already). Modifying an existing downlink might cause an internal
page to split, if the new downlink tuple is larger than the old one. If
that happens, we need to back up to the parent and re-choose a page to
insert to. We used to detect that situation, thanks to the NSN-LSN
interlock normally used to detect concurrent page splits, but that got
broken by commit 9155580f. With that commit, we now use a dummy constant
LSN value for every page during index build, so the LSN-NSN interlock no
longer works. I thought that was OK because there can't be any other
backends modifying the index during index build, but missed that the
insertion itself can modify the page we're inserting to. The consequence
was that we would sometimes insert the new tuple to an incorrect page, one
whose downlink doesn't cover the new tuple.

To fix, add a flag to the stack that keeps track of the state while
descending tree, to indicate that a page was split, and that we need to
retry the descend from the parent.

Thomas Munro first reported that the contrib/intarray regression test was
failing occasionally on the buildfarm after commit 9155580f. The failure
was intermittent, because the gistchoose() function is not deterministic,
and would only occasionally create the right circumstances for this bug to
cause the failure.

Patch by Anastasia Lubennikova, with some changes by me to make it work
correctly also when the internal page split also causes the "grandparent"
to be split.

Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJRzLo7tZExWfSbwM3XuK7aAK7FhdBV0FLkbUG%2BW0v0zg%40mail.gmail.com

22251686

Fix comment on when HOT update is possible. · e95d550b

Heikki Linnakangas authored May 14, 2019

The conditions listed in this comment have changed several times, and at
some point the thing that the "if so" referred to was negated.

The text was OK up to 9.6. It was differently wrong in v10, v11 and
master, so fix in all those versions.

e95d550b

Fix typo. · 7d9eca59
Etsuro Fujita authored May 14, 2019

7d9eca59

doc: Update OID item in PG 12 release notes · 0b62f0f2

Bruce Momjian authored May 13, 2019

Reported-by: Justin Pryzby

Discussion: https://postgr.es/m/20190513174759.GE23251@telsasoft.com

0b62f0f2

doc: improve wording of PG 12 releaase note partition item · f4125278

Bruce Momjian authored May 13, 2019

Reported-by: Amit Langote

Discussion: https://postgr.es/m/d5267ae5-bd4a-3e96-c21b-56bfa9fec7e8@lab.ntt.co.jp

f4125278

doc: properly attibute PG 12 pgbench release note item · 5d971565
Bruce Momjian authored May 13, 2019
```
Reported-by: Fabien COELHO

Discussion: https://postgr.es/m/alpine.DEB.2.21.1905130839140.13487@lancre
```
5d971565

Fix duplicated words in comments · 7e19929e

Michael Paquier authored May 14, 2019

Author: Stephen Amell
Discussion: https://postgr.es/m/539fa271-21b3-777e-a468-d96cffe9c768@gmail.com

7e19929e

13 May, 2019 12 commits

Standardize ItemIdData terminology. · ae7291ac

Peter Geoghegan authored May 13, 2019

The term "item pointer" should not be used to refer to ItemIdData
variables, since that is needlessly ambiguous. Only
ItemPointerData/ItemPointer variables should be called item pointers.

To fix, establish the convention that ItemIdData variables should always
be referred to either as "item identifiers" or "line pointers". The
term "item identifier" already predominates in docs and translatable
messages, and so should be the preferred alternative there.

Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com

ae7291ac

Doc: Refer to line pointers as item identifiers. · 08ca9d7f

Peter Geoghegan authored May 13, 2019

An upcoming HEAD-only patch will standardize the terminology around
ItemIdData variables/line pointers, ending the practice of referring to
them as "item pointers". Make the "Database Page Layout" docs
consistent with the new policy. The term "item identifier" is already
used in the same section, so stick with that.

Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com
Backpatch: All supported branches.

08ca9d7f

Fix logical replication's ideas about which type OIDs are built-in. · 32ebb351

Tom Lane authored May 13, 2019

Only hand-assigned type OIDs should be presumed to match across different
PG servers; those assigned during genbki.pl or during initdb are likely
to change due to addition or removal of unrelated objects.

This means that the cutoff should be FirstGenbkiObjectId (in HEAD)
or FirstBootstrapObjectId (before that), not FirstNormalObjectId.
Compare postgres_fdw's is_builtin() test.

It's likely that this error has no observable consequence in a
normally-functioning system, since ATM the only affected type OIDs are
system catalog rowtypes and information_schema types, which would not
typically be interesting for logical replication. But you could
probably break it if you tried hard, so back-patch.

Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us

32ebb351

Improve commentary about hack in is_publishable_class(). · e34ee993

Tom Lane authored May 13, 2019

The FirstNormalObjectId test here is a kluge that needs to go away,
but the only substitute we can think of is to add a column to pg_class,
which will take more work than can be handled right now.  Add some
commentary in the meanwhile.

Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us

e34ee993

Don't leave behind junk nbtree pages during split. · 9b42e713

Peter Geoghegan authored May 13, 2019

Commit 8fa30f90 reduced the elevel of a number of "can't happen"
_bt_split() errors from PANIC to ERROR.  At the same time, the new right
page buffer for the split could continue to be acquired well before the
critical section.  This was possible because it was relatively
straightforward to make sure that _bt_split() could not throw an error,
with a few specific exceptions.  The exceptional cases were safe because
they involved specific, well understood errors, making it possible to
consistently zero the right page before actually raising an error using
elog().  There was no danger of leaving around a junk page, provided
_bt_split() stuck to this coding rule.

Commit 8224de4f, which introduced INCLUDE indexes, added code to make
_bt_split() truncate away non-key attributes.  This happened at a point
that broke the rule around zeroing the right page in _bt_split().  If
truncation failed (perhaps due to palloc() failure), that would result
in an errant right page buffer with junk contents.  This could confuse
VACUUM when it attempted to delete the page, and should be avoided on
general principle.

To fix, reorganize _bt_split() so that truncation occurs before the new
right page buffer is even acquired.  A junk page/buffer will not be left
behind if _bt_nonkey_truncate()/_bt_truncate() raise an error.

Discussion: https://postgr.es/m/CAH2-WzkcWT_-NH7EeL=Az4efg0KCV+wArygW8zKB=+HoP=VWMw@mail.gmail.com
Backpatch: 11-, where INCLUDE indexes were introduced.

9b42e713

Improve comment for att_isnull. · 221b377f

Robert Haas authored May 13, 2019

The comment implies that a 1 in the null bitmap indicates a null value,
but actually a 0 in the null bitmap indicates a null value. Try to
be more clear.

Patch by me; proposed wording reviewed by Alvaro Herrera and Tom Lane.

Discussion: http://postgr.es/m/CA+TgmobHOP8r6cG+UnsDFMrS30-m=jRrCBhgw-nFkn0k9QnFsg@mail.gmail.com

221b377f

Fix misuse of an integer as a bool. · ddf927fb

Tom Lane authored May 13, 2019

pgtls_read_pending is declared to return bool, but what the underlying
SSL_pending function returns is a count of available bytes.

This is actually somewhat harmless if we're using C99 bools, but in
the back branches it's a live bug: if the available-bytes count happened
to be a multiple of 256, it would get converted to a zero char value.
On machines where char is signed, counts of 128 and up could misbehave
as well.  The net effect is that when using SSL, libpq might block
waiting for data even though some has already been received.

Broken by careless refactoring in commit 4e86f1b1, so back-patch
to 9.5 where that came in.

Per bug #15802 from David Binderman.

Discussion: https://postgr.es/m/15802-f0911a97f0346526@postgresql.org

ddf927fb

postgres_fdw: Fix typo in comment. · cc866941
Etsuro Fujita authored May 13, 2019

cc866941

doc: PG 12 release notes: normalize attribution names · f86b0c3c

Bruce Momjian authored May 12, 2019

Reported-by: David Rowley

Discussion: https://postgr.es/m/CAKJS1f-ktEhmQ2zJQ1L1niuJ9KB8WPA-bE-AhGiFsSO6QASB_w@mail.gmail.com

f86b0c3c

doc: adjust PG 12 release note sections · a6927996
Bruce Momjian authored May 12, 2019
```
Tighten section designations.
```
a6927996
docs: fix typo in mention of MSVC · fefb6a75
Bruce Momjian authored May 12, 2019

fefb6a75

Fix incorrect return value in JSON equality function for scalars · 1171dbde

Michael Paquier authored May 13, 2019

equalsJsonbScalarValue() uses a boolean as return type, however for one
code path -1 gets returned, which is confusing.  The origin of the
confusion is visibly that this code got copy-pasted from
compareJsonbScalarValue() since it has been introduced in d1d50bff.

No backpatch, as this is only cosmetic.

Author: Rikard Falkeborn
Discussion: https://postgr.es/m/CADRDgG7mJnek6HNW13f+LF6V=6gag9PM+P7H5dnyWZAv49aBGg@mail.gmail.com

1171dbde

12 May, 2019 3 commits

Fix misoptimization of "{1,1}" quantifiers in regular expressions. · 8a29ed05

Tom Lane authored May 12, 2019

A bounded quantifier with m = n = 1 might be thought a no-op.  But
according to our documentation (which traces back to Henry Spencer's
original man page) it still imposes greediness, or non-greediness in the
case of the non-greedy variant "{1,1}?", on whatever it's attached to.

This turns out not to work though, because parseqatom() optimizes away
the m = n = 1 case without regard for whether it's supposed to change
the greediness of the argument RE.

We can fix this by just not applying the optimization when the greediness
needs to change; the subsequent general cases handle it fine.

The three cases in which we can still apply the optimization are
(a) no quantifier, or quantifier does not impose a preference;
(b) atom has no greediness property, implying it cannot match a
variable amount of text anyway; or
(c) quantifier's greediness is same as atom's.
Note that in most cases where one of these applies, we'd have exited
earlier in the "not a messy case" fast path.  I think it's now only
possible to get to the optimization when the atom involves capturing
parentheses or a non-top-level backref.

Back-patch to all supported branches.  I'd ordinarily be hesitant to
put a subtle behavioral change into back branches, but in this case
it's very hard to see a reason why somebody would write "{1,1}?" unless
they're trying to get the documented change-of-greediness behavior.

Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net

8a29ed05

Fail pgwin32_message_to_UTF16() for SQL_ASCII messages. · d02768dd

Noah Misch authored May 12, 2019

The function had been interpreting SQL_ASCII messages as UTF8, throwing
an error when they were invalid UTF8. The new behavior is consistent
with pg_do_encoding_conversion(). This affects LOG_DESTINATION_STDERR
and LOG_DESTINATION_EVENTLOG, which will send untranslated bytes to
write() and ReportEventA(). On buildfarm member bowerbird, enabling
log_connections caused an error whenever the role name was not valid
UTF8. Back-patch to 9.4 (all supported versions).

Discussion: https://postgr.es/m/20190512015615.GD1124997@rfd.leadboat.com

d02768dd

Rearrange pgstat_bestart() to avoid failures within its critical section. · 85ccb689

Tom Lane authored May 11, 2019

We long ago decided to design the shared PgBackendStatus data structure to
minimize the cost of writing status updates, which means that writers just
have to increment the st_changecount field twice. That isn't hooked into
any sort of resource management mechanism, which means that if something
were to throw error between the two increments, the st_changecount field
would be left odd indefinitely. That would cause readers to lock up.
Now, since it's also a bad idea to leave the field odd for longer than
absolutely necessary (because readers will spin while we have it set),
the expectation was that we'd treat these segments like spinlock critical
sections, with only short, more or less straight-line, code in them.

That was fine as originally designed, but commit 9029f4b3 broke it
by inserting a significant amount of non-straight-line code into
pgstat_bestart(), code that is very capable of throwing errors, not to
mention taking a significant amount of time during which readers will spin.
We have a report from Neeraj Kumar of readers actually locking up, which
I suspect was due to an encoding conversion error in X509_NAME_to_cstring,
though conceivably it was just a garden-variety OOM failure.

Subsequent commits have loaded even more dubious code into pgstat_bestart's
critical section (and commit fc70a4b0 deserves some kind of booby prize
for managing to miss the critical section entirely, although the negative
consequences seem minimal given that the PgBackendStatus entry should be
seen by readers as inactive at that point).

The right way to fix this mess seems to be to compute all these values
into a local copy of the process' PgBackendStatus struct, and then just
copy the data back within the critical section proper. This plan can't
be implemented completely cleanly because of the struct's heavy reliance
on out-of-line strings, which we must initialize separately within the
critical section. But still, the critical section is far smaller and
safer than it was before.

In hopes of forestalling future errors of the same ilk, rename the
macros for st_changecount management to make it more apparent that
the writer-side macros create a critical section. And to prevent
the worst consequences if we nonetheless manage to mess it up anyway,
adjust those macros so that they really are a critical section, ie
they now bump CritSectionCount. That doesn't add much overhead, and
it guarantees that if we do somehow throw an error while the counter
is odd, it will lead to PANIC and a database restart to reset shared
memory.

Back-patch to 9.5 where the problem was introduced.

In HEAD, also fix an oversight in commit b0b39f72: it failed to teach
pgstat_read_current_status to copy st_gssstatus data from shared memory to
local memory. Hence, subsequent use of that data within the transaction
would potentially see changing data that it shouldn't see.

Discussion: https://postgr.es/m/CAPR3Wj5Z17=+eeyrn_ZDG3NQGYgMEOY6JV6Y-WRRhGgwc16U3Q@mail.gmail.com

85ccb689

11 May, 2019 3 commits
- docs: remove second mention of btree max length reduction · 4217d15d
  Bruce Momjian authored May 11, 2019
```
I already added that to the incompatibility section as a separate item.

Reported-by: Peter Geoghegan
```
  4217d15d
- doc: remove pg_config mention from PG 12 release notes · 31f11f96
  Bruce Momjian authored May 11, 2019
```
Reported-by: Tom Lane

Discussion: https://postgr.es/m/28209.1556556696@sss.pgh.pa.us
```
  31f11f96
- docs: PG 12 release notes, mention that REINDEX could now fail · d56fd635
  Bruce Momjian authored May 11, 2019
```
This is because of the new tid in the index entry.

Reported-by: Peter Geoghegan
```
  d56fd635