    Teach UtfToLocal/LocalToUtf to support algorithmic encoding conversions. · 7730f48e
    Tom Lane authored
    Until now, these functions have only supported encoding conversions using
    lookup tables, which is fine as long as there aren't too many code points
    to convert.  However, GB18030 expects all 1.1 million Unicode code points
    to be convertible, which would require a ridiculously-sized lookup table.
    Fortunately, a large fraction of those conversions can be expressed through
    arithmetic, i.e., the conversions are one-to-one in certain defined ranges.
    To support that, provide a callback function that is used after consulting
    the lookup tables.  (This patch doesn't actually change anything about the
    GB18030 conversion behavior; it just provides infrastructure for fixing it.)
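
The table-then-callback idea described above can be sketched roughly as follows. This is only an illustration, not the patch's code: every `demo_` name, the callback signature, and the two table entries are hypothetical, and the real UtfToLocal/LocalToUtf argument lists are not reproduced here. The callback handles one range that GB18030 defines arithmetically, the supplementary planes U+10000..U+10FFFF, which map linearly onto four-byte codes starting at 0x90308130.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One explicit mapping: a Unicode code point and its local-encoding code. */
typedef struct
{
	uint32_t	utf;
	uint32_t	code;
} demo_map_entry;

/* Hypothetical callback type; returns 0 when the input is not convertible. */
typedef uint32_t (*demo_conv_func) (uint32_t utf);

/* Illustrative entries only, sorted by code point so bsearch() works. */
static const demo_map_entry demo_map[] = {
	{0x00A4, 0xA1E8},
	{0x00B0, 0xA1E3},
};

static int
demo_compare(const void *key, const void *entry)
{
	uint32_t	k = *(const uint32_t *) key;
	uint32_t	e = ((const demo_map_entry *) entry)->utf;

	return (k > e) - (k < e);
}

/* Arithmetic conversion for U+10000..U+10FFFF (assumed to be the only range handled). */
static uint32_t
demo_gb18030_supplementary(uint32_t utf)
{
	uint32_t	lin, b1, b2, b3, b4;

	if (utf < 0x10000 || utf > 0x10FFFF)
		return 0;
	/* 189000 is the linear index of the four-byte code 0x90308130 */
	lin = (utf - 0x10000) + 189000;
	b4 = 0x30 + lin % 10;
	lin /= 10;
	b3 = 0x81 + lin % 126;
	lin /= 126;
	b2 = 0x30 + lin % 10;
	lin /= 10;
	b1 = 0x81 + lin;
	return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
}

/* Consult the lookup table first; fall back to the callback on a miss. */
static uint32_t
demo_utf_to_local(uint32_t utf, const demo_map_entry *map, size_t size,
				  demo_conv_func conv)
{
	const demo_map_entry *hit = NULL;

	assert(map != NULL || size == 0);
	if (size > 0)
		hit = bsearch(&utf, map, size, sizeof(demo_map_entry), demo_compare);
	if (hit)
		return hit->code;
	return conv ? conv(utf) : 0;
}
```

With this sketch, calling `demo_utf_to_local(0x10000, demo_map, 2, demo_gb18030_supplementary)` misses the table and is answered by the callback, while a code point found in the table never reaches the callback at all. Passing NULL as the callback keeps the old table-only behavior.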
    
    Since this requires changing the APIs of UtfToLocal/LocalToUtf anyway,
    take the opportunity to rearrange their argument lists into what seems
    to me a saner order.  And beautify the call sites by using lengthof()
    instead of error-prone sizeof() arithmetic.
    
    In passing, also mark all the lookup tables used by these calls "const".
    This moves an impressive amount of stuff into the text segment, at least
    on my machine, and is safer anyhow.
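
To illustrate the lengthof() and "const" points, here is a small sketch. The macro body is believed to match the one in PostgreSQL's c.h, but treat that as an assumption; the table, its entries, and the `demo_` names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Element count of a true array (does not work on a pointer). */
#define lengthof(array) (sizeof (array) / sizeof ((array)[0]))

typedef struct
{
	uint32_t	utf;
	uint32_t	code;
} demo_pair;

/* "const" lets the toolchain place the table in read-only storage. */
static const demo_pair demo_table[] = {
	{0x00A4, 0xA1E8},
	{0x00B0, 0xA1E3},
	{0x00B1, 0xA1C0},
};

/* Manual style: the element type is repeated and can drift out of sync. */
static size_t
demo_count_manual(void)
{
	return sizeof(demo_table) / sizeof(demo_pair);
}

/* lengthof() style: only the array name appears at the call site. */
static size_t
demo_count_lengthof(void)
{
	return lengthof(demo_table);
}
```

Both functions return the same count here; the difference is that the second cannot go wrong if the table's element type is later changed.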
utf8_to_win1255.map 2.14 KB