    Add caching of ctype.h/wctype.h results in regc_locale.c. · e00f68e4
    Tom Lane authored
    While this doesn't save a huge amount of runtime, it still seems worth
    doing, especially since I realized that the data copying I did in my first
    draft was quite unnecessary.  In this version, once we have the results
    cached, getting them back for re-use is really very cheap.
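The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in the commit: the names `class_cache`, `get_class_members`, `probe_scans`, and `MAX_SIMPLE_CHR` are hypothetical, and the sketch assumes one cache entry per probe function (such as `iswalpha` or `iswdigit`). The first lookup scans the code range once; later lookups return a pointer to the stored result without copying it.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical upper bound for probing; the value comes from the commit text. */
#define MAX_SIMPLE_CHR 0x7FF

/* Signature shared by the <wctype.h> classification functions. */
typedef int (*class_probe) (wint_t c);

/* One cache entry per probe function (illustrative layout only). */
typedef struct class_cache
{
    struct class_cache *next;   /* singly linked list of entries */
    class_probe probe;          /* e.g. iswalpha, iswdigit */
    size_t      nchars;         /* number of matching characters found */
    wchar_t    *chars;          /* the matching characters, ascending */
} class_cache;

static class_cache *cache_head = NULL;

/* Counts full scans, purely so the caching effect can be observed. */
int         probe_scans = 0;

/*
 * Return the set of characters accepted by "probe", building it on first
 * use.  The caller gets a pointer into the cache, so no data is copied on
 * a cache hit.  Returns NULL if memory allocation fails.
 */
const class_cache *
get_class_members(class_probe probe)
{
    class_cache *entry;
    unsigned long c;

    assert(probe != NULL);

    /* Cache hit: hand back the existing entry. */
    for (entry = cache_head; entry != NULL; entry = entry->next)
    {
        if (entry->probe == probe)
            return entry;
    }

    /* Cache miss: scan the code range once and remember the result. */
    entry = malloc(sizeof(class_cache));
    if (entry == NULL)
        return NULL;
    entry->chars = malloc((MAX_SIMPLE_CHR + 1) * sizeof(wchar_t));
    if (entry->chars == NULL)
    {
        free(entry);
        return NULL;
    }

    entry->probe = probe;
    entry->nchars = 0;
    for (c = 0; c <= MAX_SIMPLE_CHR; c++)
    {
        if (probe((wint_t) c))
            entry->chars[entry->nchars++] = (wchar_t) c;
    }
    probe_scans++;

    entry->next = cache_head;
    cache_head = entry;
    return entry;
}
```

Calling `get_class_members(iswdigit)` twice should return the same pointer both times while `probe_scans` stays at 1, which is the "very cheap" re-use the message refers to. A real implementation would presumably also key the cache on the active locale or collation; that is omitted here.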
    
    Also, remove the hard-wired limitation to not consider wctype.h results for
    character codes above 255.  It turns out that we can't push the limit as
    far up as I'd originally hoped, because the regex colormap code is not
    efficient enough to cope very well with character classes containing many
    thousand letters, which a Unicode locale is entirely capable of producing.
    Still, we can push it up to U+7FF (which I chose as the limit of 2-byte
    UTF8 characters), which will at least make Eastern Europeans happy pending
    a better solution.  Thus, this commit resolves the specific complaint in
    bug #6457, but not the more general issue that letters of non-western
    alphabets are mostly not recognized as matching [[:alpha:]].
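To illustrate why U+7FF is described as the limit of 2-byte UTF-8 characters, the following sketch computes the encoded length of a code point under the standard UTF-8 rules. The function names `utf8_encoded_len` and `check_two_byte_limit` are hypothetical and are not taken from the PostgreSQL sources.

```c
#include <assert.h>

/*
 * Number of bytes UTF-8 needs for a Unicode code point, or 0 if the value
 * lies beyond U+10FFFF.  (Surrogates are not treated specially here.)
 */
int
utf8_encoded_len(unsigned long cp)
{
    if (cp <= 0x7F)
        return 1;               /* ASCII */
    if (cp <= 0x7FF)
        return 2;               /* the range the commit message refers to */
    if (cp <= 0xFFFF)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;
}

/* Illustrative check of the boundary discussed in the commit message. */
void
check_two_byte_limit(void)
{
    assert(utf8_encoded_len(0x7FF) == 2);   /* last 2-byte code point */
    assert(utf8_encoded_len(0x800) == 3);   /* first 3-byte code point */
}
```

The range up to U+7FF includes the Latin extensions, Greek, and Cyrillic blocks, which is consistent with the message's remark about Eastern European users; scripts encoded above U+7FF remain outside the limit.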
regc_pg_locale.c 27 KB