    Avoid unnecessary out-of-memory errors during encoding conversion. · 8e10405c
    Tom Lane authored
    Encoding conversion uses the very simplistic rule that the output
    can't be more than 4X longer than the input, and palloc's a buffer
    of that size.  This results in failure to convert any string longer
    than 1/4 GB, which is becoming an annoying limitation.
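
    The arithmetic behind that limit can be sketched as follows. This is
    an illustration only, not the code in mbutils.c: the SKETCH_ names and
    both helper functions are hypothetical, and the value 0x3fffffff for
    the 1GB allocation limit is an assumption about how MaxAllocSize is
    defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the names are illustrative stand-ins. */
#define SKETCH_MAX_ALLOC_SIZE        ((size_t) 0x3fffffff)  /* just under 1GB */
#define SKETCH_MAX_CONVERSION_GROWTH ((size_t) 4)

/* Worst-case output buffer size for len input bytes, plus a terminator. */
static size_t
sketch_worst_case_size(size_t len)
{
    /* Refuse inputs whose worst-case size would overflow size_t. */
    assert(len <= (SIZE_MAX - 1) / SKETCH_MAX_CONVERSION_GROWTH);
    return len * SKETCH_MAX_CONVERSION_GROWTH + 1;
}

/*
 * Returns 1 if an input of len bytes passes a check that requires the
 * worst-case buffer to fit within the allocation limit, else 0.
 */
static int
sketch_old_rule_accepts(size_t len)
{
    return len < SKETCH_MAX_ALLOC_SIZE / SKETCH_MAX_CONVERSION_GROWTH;
}
```

    Under these assumptions a 200MB input is accepted and a 256MB input
    is rejected, even if the converted 256MB string would in fact have
    fit comfortably within 1GB.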
    
    As a band-aid to improve matters, allow the allocated output buffer
    size to exceed 1GB.  We still insist that the final result fit into
    MaxAllocSize (1GB), though.  Perhaps it'd be safe to relax that
    restriction, but it'd require close analysis of all callers, which
    is daunting (not least because external modules might call these
    functions).  For the moment, this should allow a 2X to 4X improvement
    in the longest string we can convert, which is a useful gain in
    return for quite a simple patch.
    
    Also, once we have successfully converted a long string, repalloc
    the output down to the actual string length, returning the excess
    to the malloc pool.  This seems worth doing since we can usually
    expect to give back several MB if we take this path at all.
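
    The two changes described above can be sketched roughly as below,
    using malloc and realloc as stand-ins for the palloc family so that
    the example is self-contained. Everything here is hypothetical: the
    SKETCH_ constants, the sketch_converter callback type, the identity
    converter, and in particular the 1000000-byte cutoff for what counts
    as a "long" string are assumptions made for illustration, not the
    patch itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_RESULT_SIZE  ((size_t) 0x3fffffff)  /* assumed 1GB result limit */
#define SKETCH_GROWTH           ((size_t) 4)           /* assumed worst-case growth */
#define SKETCH_SHRINK_THRESHOLD ((size_t) 1000000)     /* assumed "long input" cutoff */

/* Hypothetical converter: must write a NUL-terminated result into dest. */
typedef void (*sketch_converter) (const char *src, size_t len, char *dest);

/* Trivial converter used for demonstration: copies the input unchanged. */
static void
sketch_copy_converter(const char *src, size_t len, char *dest)
{
    memcpy(dest, src, len);
    dest[len] = '\0';
}

/* Returns a malloc'd converted string, or NULL on failure. */
static char *
sketch_convert(const char *src, size_t len, sketch_converter conv)
{
    char   *result;

    assert(src != NULL && conv != NULL);

    /* Guard the multiplication below against size_t overflow. */
    if (len >= (SIZE_MAX - 1) / SKETCH_GROWTH)
        return NULL;

    /* The worst-case buffer is allowed to be larger than the result limit. */
    result = malloc(len * SKETCH_GROWTH + 1);
    if (result == NULL)
        return NULL;

    conv(src, len, result);

    /*
     * Only long inputs can exceed the result limit (the threshold times
     * the growth factor is far below it), so check and shrink only here.
     */
    if (len > SKETCH_SHRINK_THRESHOLD)
    {
        size_t  resultlen = strlen(result);
        char   *shrunk;

        if (resultlen >= SKETCH_MAX_RESULT_SIZE)
        {
            free(result);
            return NULL;
        }

        /* Give the unused tail of the buffer back to the allocator. */
        shrunk = realloc(result, resultlen + 1);
        if (shrunk != NULL)
            result = shrunk;
    }

    return result;
}
```

    Calling sketch_convert("abc", 3, sketch_copy_converter) returns a
    fresh copy of "abc" that the caller frees. For a 2,000,000-byte
    input the sketch first allocates about 8MB and then shrinks the
    buffer to about 2MB, which is the kind of saving the paragraph
    above has in mind.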
    
    This still leaves much to be desired, most notably that the assumption
    that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no
    guard code verifying that the output buffer isn't overrun.  Fixing
    that would require significant changes in the encoding conversion
    APIs, so it'll have to wait for some other day.
    
    The present patch seems safely back-patchable, so patch all supported
    branches.
    
    Alvaro Herrera and Tom Lane
    
    Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
    Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
mbutils.c 31 KB