• David Rowley's avatar
    Trigger autovacuum based on number of INSERTs · b07642db
    David Rowley authored
    Traditionally autovacuum has only ever invoked a worker based on the
    estimated number of dead tuples in a table and for anti-wraparound
    purposes. For the latter, with certain classes of tables such as
    insert-only tables, anti-wraparound vacuums could be the first vacuum that
    the table ever receives. This could often lead to autovacuum workers being
    busy for extended periods of time due to having to potentially freeze
    every page in the table. This could be particularly bad for very large
    tables. New clusters, or recently pg_restored clusters could suffer even
    more as many large tables may have the same relfrozenxid, which could
    result in large numbers of tables requiring an anti-wraparound vacuum all
    at once.
    
    Here we aim to reduce the work required by anti-wraparound and aggressive
    vacuums in general, by triggering autovacuum when the table has received
    enough INSERTs. This is controlled by adding two new GUCs and reloptions;
    autovacuum_vacuum_insert_threshold and
    autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
    existing scale factor and threshold controls, only base themselves off the
    number of inserts since the last vacuum, rather than the number of dead
    tuples. New controls were added rather than reusing the existing
    controls, to allow these new vacuums to be tuned independently and perhaps
    even completely disabled altogether, which can be done by setting
    autovacuum_vacuum_insert_threshold to -1.
    
    We make no attempt to skip index cleanup operations on these vacuums as
    they may trigger for an insert-mostly table which continually doesn't have
    enough dead tuples to trigger an autovacuum for the purpose of removing
    those dead tuples. If we were to skip cleaning the indexes in this case,
    then it is possible for the index(es) to become bloated over time.
    
    There are additional benefits to triggering autovacuums based on inserts,
    as tables which never contain enough dead tuples to trigger an autovacuum
    are now more likely to receive a vacuum, which can mark more of the table
    as "allvisible" and encourage the query planner to make use of Index Only
    Scans.
    
    Currently, we still obey vacuum_freeze_min_age when triggering these new
    autovacuums based on INSERTs. For large insert-only tables, it may be
    beneficial to lower the table's autovacuum_freeze_min_age so that tuples
    are eligible to be frozen sooner. Here we've opted not to zero that for
    these types of vacuums, since the table may just be insert-mostly and we
    may otherwise freeze tuples that are still destined to be updated or
    removed in the near future.
    
    There was some debate to what exactly the new scale factor and threshold
    should default to. For now, these are set to 0.2 and 1000, respectively.
    There may be some motivation to adjust these before the release.
    
    Author: Laurenz Albe, Darafei Praliaskouski
    Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
    Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
    b07642db
postgresql.conf.sample 26.8 KB