1. 17 Jan, 2020 1 commit
  2. 16 Jan, 2020 5 commits
    • Tom Lane's avatar
      Rationalize code placement between wchar.c, encnames.c, and mbutils.c. · 5afaa2e4
      Tom Lane authored
      Move all the backend-only code that'd crept into wchar.c and encnames.c
      into mbutils.c.
      
      To remove the last few #ifdef dependencies from wchar.c and encnames.c,
      also make the following changes:
      
      * Adjust get_encoding_name_for_icu to return NULL, not throw an error,
      for unsupported encodings.  Its sole caller can perfectly well throw an
      error instead.  (While at it, I also made this function and its sibling
      is_encoding_supported_by_icu proof against out-of-range encoding IDs.)
      
      * Remove the overlength-name error condition from pg_char_to_encoding.
      It's completely silly not to treat that just like any other
      the-name-is-not-in-the-table case.
      
      Also, get rid of pg_mic_mblen --- there's no obvious reason why
      conv.c shouldn't call pg_mule_mblen instead.
      
      Other than that, this is just code movement and comment-polishing with
      no functional changes.  Notably, I reordered declarations in pg_wchar.h
      to show which functions are frontend-accessible and which are not.
      
      Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
      5afaa2e4
    • Tom Lane's avatar
      Update header comments for wchar.c and encnames.c. · 3d4cb5d6
      Tom Lane authored
      Bring these into common style (including having proper copyright
      notices) and adjust their self-declaration of where they live.
      
      Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
      3d4cb5d6
    • Tom Lane's avatar
      Move wchar.c and encnames.c to src/common/. · e6afa891
      Tom Lane authored
      Formerly, various frontend directories symlinked these two sources
      and then built them locally.  That's an ancient, ugly hack, and
      we now have a much better way: put them into libpgcommon.
      So do that.  (The immediate motivation for this is the prospect
      of having to introduce still more symlinking if we don't.)
      
      This commit moves these two files absolutely verbatim, for ease of
      reviewing the git history.  There's some follow-on work to be done
      that will modify them a bit.
      
      Robert Haas, Tom Lane
      
      Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
      e6afa891
    • Robert Haas's avatar
      Fix problems with "read only query" checks, and refactor the code. · 2eb34ac3
      Robert Haas authored
      Previously, check_xact_readonly() was responsible for determining
      which types of queries could not be run in a read-only transaction,
      standard_ProcessUtility() was responsibility for prohibiting things
      which were allowed in read only transactions but not in recovery, and
      utility commands were basically prohibited in bulk in parallel mode by
      calls to CommandIsReadOnly() in functions.c and spi.c.  This situation
      was confusing and error-prone. Accordingly, move all the checks to a
      new function ClassifyUtilityCommandAsReadOnly(), which determines the
      degree to which a given statement is read only.
      
      In the old code, check_xact_readonly() inadvertently failed to handle
      several statement types that actually should have been prohibited,
      specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt,
      T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt.  As a
      result, thes statements were erroneously allowed in read only
      transactions, parallel queries, and standby operation. Generally, they
      would fail anyway due to some lower-level error check, but we
      shouldn't rely on that.  In the new code structure, future omissions
      of this type should cause ClassifyUtilityCommandAsReadOnly() to
      complain about an unrecognized node type.
      
      As a fringe benefit, this means we can allow certain types of utility
      commands in parallel mode, where it's safe to do so. This allows
      ALTER SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW.
      It might be possible to allow additional commands with more work
      and thought.
      
      Along the way, document the thinking process behind the current set
      of checks, as per discussion especially with Peter Eisentraut. There
      is some interest in revising some of these rules, but that seems
      like a job for another patch.
      
      Patch by me, reviewed by Tom Lane, Stephen Frost, and Peter
      Eisentraut.
      
      Discussion: http://postgr.es/m/CA+TgmoZ_rLqJt5sYkvh+JpQnfX0Y+B2R+qfi820xNih6x-FQOQ@mail.gmail.com
      2eb34ac3
    • Tom Lane's avatar
      Minor code beautification in regexp.c. · 0db7c670
      Tom Lane authored
      Remove duplicated code (apparently introduced by commit c8ea87e4).
      Also get rid of some PG_USED_FOR_ASSERTS_ONLY variables we don't
      really need to have.
      
      Li Japin, Tom Lane
      
      Discussion: https://postgr.es/m/PS1PR0601MB3770A5595B6E5E3FD6F35724B6360@PS1PR0601MB3770.apcprd06.prod.outlook.com
      0db7c670
  3. 15 Jan, 2020 5 commits
    • Tom Lane's avatar
      Restructure ALTER TABLE execution to fix assorted bugs. · 1281a5c9
      Tom Lane authored
      We've had numerous bug reports about how (1) IF NOT EXISTS clauses in
      ALTER TABLE don't behave as-expected, and (2) combining certain actions
      into one ALTER TABLE doesn't work, though executing the same actions as
      separate statements does.  This patch cleans up all of the cases so far
      reported from the field, though there are still some oddities associated
      with identity columns.
      
      The core problem behind all of these bugs is that we do parse analysis
      of ALTER TABLE subcommands too soon, before starting execution of the
      statement.  The root of the bugs in group (1) is that parse analysis
      schedules derived commands (such as a CREATE SEQUENCE for a serial
      column) before it's known whether the IF NOT EXISTS clause should cause
      a subcommand to be skipped.  The root of the bugs in group (2) is that
      earlier subcommands may change the catalog state that later subcommands
      need to be parsed against.
      
      Hence, postpone parse analysis of ALTER TABLE's subcommands, and do
      that one subcommand at a time, during "phase 2" of ALTER TABLE which
      is the phase that does catalog rewrites.  Thus the catalog effects
      of earlier subcommands are already visible when we analyze later ones.
      (The sole exception is that we do parse analysis for ALTER COLUMN TYPE
      subcommands during phase 1, so that their USING expressions can be
      parsed against the table's original state, which is what we need.
      Arguably, these bugs stem from falsely concluding that because ALTER
      COLUMN TYPE must do early parse analysis, every other command subtype
      can too.)
      
      This means that ALTER TABLE itself must deal with execution of any
      non-ALTER-TABLE derived statements that are generated by parse analysis.
      Add a suitable entry point to utility.c to accept those recursive
      calls, and create a struct to pass through the information needed by
      the recursive call, rather than making the argument lists of
      AlterTable() and friends even longer.
      
      Getting this to work correctly required a little bit of fiddling
      with the subcommand pass structure, in particular breaking up
      AT_PASS_ADD_CONSTR into multiple passes.  But otherwise it's mostly
      a pretty straightforward application of the above ideas.
      
      Fixing the residual issues for identity columns requires refactoring of
      where the dependency link from an identity column to its sequence gets
      set up.  So that seems like suitable material for a separate patch,
      especially since this one is pretty big already.
      
      Discussion: https://postgr.es/m/10365.1558909428@sss.pgh.pa.us
      1281a5c9
    • Alvaro Herrera's avatar
      Report progress of ANALYZE commands · a166d408
      Alvaro Herrera authored
      This uses the progress reporting infrastructure added by c16dc1ac,
      adding support for ANALYZE.
      Co-authored-by: default avatarÁlvaro Herrera <alvherre@alvh.no-ip.org>
      Co-authored-by: default avatarTatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>
      Reviewed-by: Julien Rouhaud, Robert Haas, Anthony Nowocien, Kyotaro Horiguchi,
      	Vignesh C, Amit Langote
      a166d408
    • Peter Eisentraut's avatar
      Remove libpq.rc, use win32ver.rc for libpq · 16a4a3d5
      Peter Eisentraut authored
      For historical reasons, libpq used a separate libpq.rc file for the
      Windows builds while all other components use a common file
      win32ver.rc.  With a bit of tweaking, the libpq build can also use the
      win32ver.rc file.  This removes a bit of duplicative code.
      Reviewed-by: default avatarKyotaro Horiguchi <horikyota.ntt@gmail.com>
      Reviewed-by: default avatarMichael Paquier <michael@paquier.xyz>
      Discussion: https://www.postgresql.org/message-id/flat/ad505e61-a923-e114-9f38-9867d161073f@2ndquadrant.com
      16a4a3d5
    • Michael Paquier's avatar
      Fix buggy logic in isTempNamespaceInUse() · ac5bdf62
      Michael Paquier authored
      The logic introduced in this routine as of 246a6c8f would report an
      incorrect result when a session calls it to check if the temporary
      namespace owned by the session is in use or not.  It is possible to
      optimize more the routine in this case to avoid a PGPROC lookup, but
      let's keep the logic simple.  As this routine is used only by autovacuum
      for now, there were no live bugs, still let's be correct for any future
      code involving it.
      
      Author: Michael Paquier
      Reviewed-by: Julien Rouhaud
      Discussion: https://postgr.es/m/20200113093703.GA41902@paquier.xyz
      Backpatch-through: 11
      ac5bdf62
    • Amit Kapila's avatar
      Introduce IndexAM fields for parallel vacuum. · 4d8a8d0c
      Amit Kapila authored
      Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions
      in IndexAmRoutine for parallel vacuum.  The amusemaintenanceworkmem tells
      whether a particular IndexAM uses maintenance_work_mem or not.  This will
      help in controlling the memory used by individual workers as otherwise,
      each worker can consume memory equal to maintenance_work_mem.  The
      amparallelvacuumoptions tell whether a particular IndexAM participates in
      a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of
      vacuum.
      
      Author: Masahiko Sawada and Amit Kapila
      Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas
      Discussion:
      https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
      https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
      4d8a8d0c
  4. 14 Jan, 2020 9 commits
  5. 13 Jan, 2020 8 commits
    • Tom Lane's avatar
      Reduce size of backend scanner's tables. · 7f380c59
      Tom Lane authored
      Previously, the core scanner's yy_transition[] array had 37045 elements.
      Since that number is larger than INT16_MAX, Flex generated the array to
      contain 32-bit integers.  By reimplementing some of the bulkier scanner
      rules, this patch reduces the array to 20495 elements.  The much smaller
      total length, combined with the consequent use of 16-bit integers for
      the array elements reduces the binary size by over 200kB.  This was
      accomplished in two ways:
      
      1. Consolidate handling of quote continuations into a new start condition,
      rather than duplicating that logic for five different string types.
      
      2. Treat Unicode strings and identifiers followed by a UESCAPE sequence
      as three separate tokens, rather than one.  The logic to de-escape
      Unicode strings is moved to the filter code in parser.c, which already
      had the ability to provide special processing for token sequences.
      While we could have implemented the conversion in the grammar, that
      approach was rejected for performance and maintainability reasons.
      
      Performance in microbenchmarks of raw parsing seems equal or slightly
      faster in most cases, and it's reasonable to expect that in real-world
      usage (with more competition for the CPU cache) there will be a larger
      win.  The exception is UESCAPE sequences; lexing those is about 10%
      slower, primarily because the scanner now has to be called three times
      rather than one.  This seems acceptable since that feature is very
      rarely used.
      
      The psql and epcg lexers are likewise modified, primarily because we
      want to keep them all in sync.  Since those lexers don't use the
      space-hogging -CF option, the space savings is much less, but it's
      still good for perhaps 10kB apiece.
      
      While at it, merge the ecpg lexer's handling of C-style comments used
      in SQL and in C.  Those have different rules regarding nested comments,
      but since we already have the ability to keep track of the previous
      start condition, we can use that to handle both cases within a single
      start condition.  This matches the core scanner more closely.
      
      John Naylor
      
      Discussion: https://postgr.es/m/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw@mail.gmail.com
      7f380c59
    • Peter Eisentraut's avatar
      Fix base backup with database OIDs larger than INT32_MAX · 259bbe17
      Peter Eisentraut authored
      The use of pg_atoi() for parsing a string into an Oid fails for values
      larger than INT32_MAX, since OIDs are unsigned.  Instead, use
      atooid().  While this has less error checking, the contents of the
      data directory are expected to be trustworthy, so we don't need to go
      out of our way to do full error checking.
      
      Discussion: https://www.postgresql.org/message-id/flat/dea47fc8-6c89-a2b1-07e3-754ff1ab094b%402ndquadrant.com
      259bbe17
    • Amit Kapila's avatar
      Fix typo. · 23d0dfa8
      Amit Kapila authored
      Reported-by: Antonin Houska
      Author: Antonin Houska
      Backpatch-through: 11, where it was introduced
      Discussion: https://postgr.es/m/2246.1578900133@antos
      23d0dfa8
    • Michael Paquier's avatar
      Fix comment in heapam.c · 7689d907
      Michael Paquier authored
      Improvement per suggestion from Tom Lane.
      
      Author: Daniel Gustafsson
      Discussion: https://postgr.es/m/FED18699-4270-4778-8DA8-10F119A5ECF3@yesql.se
      7689d907
    • Andrew Dunstan's avatar
      cebf9d6e
    • Amit Kapila's avatar
      Delete empty pages in each pass during GIST VACUUM. · 4e514c61
      Amit Kapila authored
      Earlier, we use to postpone deleting empty pages till the second stage of
      vacuum to amortize the cost of scanning internal pages.  However, that can
      sometimes (say vacuum is canceled or errored between first and second
      stage) delay the pages to be recycled.
      
      Another thing is that to facilitate deleting empty pages in the second
      stage, we need to share the information about internal and empty pages
      between different stages of vacuum.  It will be quite tricky to share this
      information via DSM which is required for the upcoming parallel vacuum
      patch.
      
      Also, it will bring the logic to reclaim deleted pages closer to nbtree
      where we delete empty pages in each pass.
      
      Overall, the advantages of deleting empty pages in each pass outweigh the
      advantages of postponing the same.
      
      Author: Dilip Kumar, with changes by Amit Kapila
      Reviewed-by: Sawada Masahiko and Amit Kapila
      Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
      4e514c61
    • Tomas Vondra's avatar
      Apply multiple multivariate MCV lists when possible · eae056c1
      Tomas Vondra authored
      Until now we've only used a single multivariate MCV list per relation,
      covering the largest number of clauses. So for example given a query
      
          SELECT * FROM t WHERE a = 1 AND b =1 AND c = 1 AND d = 1
      
      and extended statistics on (a,b) and (c,d), we'd only pick and use one
      of them. This commit improves this by repeatedly picking and applying
      the best statistics (matching the largest number of remaining clauses)
      until no additional statistics is applicable.
      
      This greedy algorithm is simple, but may not be optimal. A different
      choice of statistics may leave fewer clauses unestimated and/or give
      better estimates for some other reason.
      
      This can however happen only when there are overlapping statistics, and
      selecting one makes it impossible to use the other. E.g. with statistics
      on (a,b), (c,d), (b,c,d), we may pick either (a,b) and (c,d) or (b,c,d).
      But it's not clear which option is the best one.
      
      We however assume cases like this are rare, and the easiest solution is
      to define statistics covering the whole group of correlated columns. In
      the future we might support overlapping stats, using some of the clauses
      as conditions (in conditional probability sense).
      
      Author: Tomas Vondra
      Reviewed-by: Mark Dilger, Kyotaro Horiguchi
      Discussion: https://postgr.es/m/20191028152048.jc6pqv5hb7j77ocp@development
      eae056c1
    • Tomas Vondra's avatar
      Apply all available functional dependencies · aaa67618
      Tomas Vondra authored
      When considering functional dependencies during selectivity estimation,
      it's not necessary to bother with selecting the best extended statistic
      object and then use just dependencies from it. We can simply consider
      all applicable functional dependencies at once.
      
      This means we need to deserialie all (applicable) dependencies before
      applying them to the clauses. This is a bit more expensive than picking
      the best statistics and deserializing dependencies for it. To minimize
      the additional cost, we ignore statistics that are not applicable.
      
      Author: Tomas Vondra
      Reviewed-by: Mark Dilger
      Discussion: https://postgr.es/m/20191028152048.jc6pqv5hb7j77ocp@development
      aaa67618
  6. 12 Jan, 2020 2 commits
    • Tom Lane's avatar
      Fix edge-case crashes and misestimation in range containment selectivity. · 652686a3
      Tom Lane authored
      When estimating the selectivity of "range_var <@ range_constant" or
      "range_var @> range_constant", if the upper (or respectively lower)
      bound of the range_constant was above the last bin of the range_var's
      histogram, the code would access uninitialized memory and potentially
      crash (though it seems the probability of a crash is quite low).
      Handle the endpoint cases explicitly to fix that.
      
      While at it, be more paranoid about the possibility of getting NaN
      or other silly results from the range type's subdiff function.
      And improve some comments.
      
      Ordinarily we'd probably add a regression test case demonstrating
      the bug in unpatched code.  But it's too hard to get it to crash
      reliably because of the uninitialized-memory dependence, so skip that.
      
      Per bug #16122 from Adam Scott.  It's been broken from the beginning,
      apparently, so backpatch to all supported branches.
      
      Diagnosis by Michael Paquier, patch by Andrey Borodin and Tom Lane.
      
      Discussion: https://postgr.es/m/16122-eb35bc248c806c15@postgresql.org
      652686a3
    • Michael Paquier's avatar
      Remove incorrect assertion for INSERT in logical replication's publisher · 1088729e
      Michael Paquier authored
      On the publisher, it was assumed that an INSERT change cannot happen for
      a relation with no replica identity.  However this is true only for a
      change that needs references to old rows, aka UPDATE or DELETE, so
      trying to use logical replication with a relation that has no replica
      identity led to an assertion failure in the publisher when issuing an
      INSERT.  This commit removes the incorrect assertion, and adds more
      regression tests to provide coverage for relations without replica
      identity.
      
      Reported-by: Neha Sharma
      Author: Dilip Kumar, Michael Paquier
      Reviewed-by: Andres Freund
      Discussion: https://postgr.es/m/CANiYTQsL1Hb8_Km08qd32svrqNumXLJeoGo014O7VZymgOhZEA@mail.gmail.com
      Backpatch-through: 10
      1088729e
  7. 11 Jan, 2020 4 commits
  8. 10 Jan, 2020 5 commits
  9. 09 Jan, 2020 1 commit