    Reduce memory usage of tsvector type analyze function. · da11977d
    Heikki Linnakangas authored
    compute_tsvector_stats() detoasted every tsvector value in the sample and
    kept all of them in memory, which can add up to a lot of memory. The
    original bug report described a case using over 10 gigabytes, with a
    statistics target of 10000 (the maximum).
    
    To fix, allocate a separate copy of just the lexemes that we keep around,
    and free the detoasted tsvector values as we go. This adds some palloc/pfree
    overhead when there are a lot of distinct lexemes in the sample, but that is
    better than running out of memory.
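
The copy-and-free pattern described above can be sketched in standalone C. This is an illustration only, not the code from ts_typanalyze.c: it assumes plain malloc/free in place of palloc/pfree, a space-separated string in place of a real tsvector, a linear lookup in place of the hash table used by the analyze function, and per-occurrence counting. The names TrackItem, TrackTable, track_lexeme, process_value, lexeme_count and free_table are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One tracked lexeme: a private copy of its text plus a count. */
typedef struct {
    char  *lexeme;  /* owned by the table, not by any sample value */
    size_t len;
    int    count;
} TrackItem;

typedef struct {
    TrackItem *items;
    size_t     n;
    size_t     cap;
} TrackTable;

/* Count one lexeme. Its bytes are copied only the first time it is seen,
 * so the buffer it came from can be freed afterwards.
 * Returns 0 on success, -1 on allocation failure. */
static int track_lexeme(TrackTable *t, const char *lex, size_t len)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->items[i].len == len &&
            memcmp(t->items[i].lexeme, lex, len) == 0) {
            t->items[i].count++;
            return 0;
        }
    }
    if (t->n == t->cap) {
        size_t ncap = t->cap ? t->cap * 2 : 8;
        TrackItem *grown = realloc(t->items, ncap * sizeof *grown);
        if (grown == NULL)
            return -1;
        t->items = grown;
        t->cap = ncap;
    }
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, lex, len);
    copy[len] = '\0';
    t->items[t->n++] = (TrackItem){ copy, len, 1 };
    return 0;
}

/* Process one sample value. The malloc'd duplicate stands in for a
 * detoasted tsvector; it is released before returning, so only the
 * per-lexeme copies in the table stay allocated. */
static int process_value(TrackTable *t, const char *stored)
{
    size_t n = strlen(stored);
    char *detoasted = malloc(n + 1);
    if (detoasted == NULL)
        return -1;
    memcpy(detoasted, stored, n + 1);

    int rc = 0;
    const char *p = detoasted;
    while (*p != '\0' && rc == 0) {
        while (*p == ' ')
            p++;
        const char *start = p;
        while (*p != '\0' && *p != ' ')
            p++;
        if (p > start)
            rc = track_lexeme(t, start, (size_t) (p - start));
    }
    free(detoasted);  /* freed as we go, not held until the end */
    return rc;
}

/* Look up the count recorded for a lexeme, or 0 if it was never seen. */
static int lexeme_count(const TrackTable *t, const char *lex)
{
    size_t len = strlen(lex);
    for (size_t i = 0; i < t->n; i++) {
        if (t->items[i].len == len &&
            memcmp(t->items[i].lexeme, lex, len) == 0)
            return t->items[i].count;
    }
    return 0;
}

/* Release every private lexeme copy and the table itself. */
static void free_table(TrackTable *t)
{
    for (size_t i = 0; i < t->n; i++)
        free(t->items[i].lexeme);
    free(t->items);
    t->items = NULL;
    t->n = 0;
    t->cap = 0;
}
```

In this sketch, peak memory is bounded by the distinct lexemes tracked plus one sample value at a time, rather than by the whole sample. The cost is one extra allocation and copy per distinct lexeme, which mirrors the overhead trade-off the commit message mentions.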
    
    Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to
    all supported versions.
    
    Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org
ts_typanalyze.c 17.4 KB