    Tighten checks for whitespace in functions that parse identifiers etc. · 9ae2661f
    Tom Lane authored
    This patch replaces isspace() calls with scanner_isspace() in functions
    that are likely to be presented with non-ASCII input.  isspace() has
    the small advantage that it will correctly recognize no-break space
    in single-byte encodings (such as LATIN1); but it cannot work successfully
    for any multibyte character, and depending on platform it might return
    false positive results for some fragments of multibyte characters.  That's
    disastrous for functions that are trying to discard whitespace between
    valid strings, as noted in bug #14662 from Justin Muise.  Even treating
    no-break space as whitespace is pretty questionable for the usages touched
    here, because the core scanner would think it is an identifier character.
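
The distinction described above can be sketched in a few lines of C. The helper names `ascii_only_isspace()` and `skip_ascii_whitespace()` are hypothetical and are not taken from the patch; the character list is an assumption about what the core scanner treats as whitespace (space, tab, newline, carriage return, form feed). The point is only that a fixed ASCII list never depends on the locale, so it cannot match a byte that belongs to a multibyte character.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a scanner-style whitespace test: returns true
 * only for a fixed list of ASCII whitespace bytes, regardless of locale.
 * Bytes with the high bit set (such as 0xA0, which is no-break space in
 * LATIN1 and a continuation byte in UTF-8) are never matched.
 */
static bool
ascii_only_isspace(char ch)
{
    return ch == ' ' ||
           ch == '\t' ||
           ch == '\n' ||
           ch == '\r' ||
           ch == '\f';
}

/* Advance past leading ASCII whitespace and return the new position. */
static const char *
skip_ascii_whitespace(const char *s)
{
    assert(s != NULL);
    while (ascii_only_isspace(*s))
        s++;
    return s;
}
```

By contrast, `isspace((unsigned char) 0xA0)` may return nonzero in some single-byte locales, which is the platform-dependent behavior the message warns about.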
    
    Affected functions are parse_ident(), parseNameAndArgTypes() (underlying
    regprocedurein() and siblings), SplitIdentifierString() (used for parsing
    GUCs and options that are qualified names or lists of names), and
    SplitDirectoriesString() (used for parsing GUCs that are lists of
    directories).
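
To make the failure mode concrete, here is a simplified list splitter in the spirit of the functions listed above. It is an illustrative sketch only: `split_list()` and `is_ascii_space()` are hypothetical names, and the real functions also handle quoting, case folding, and length limits that are omitted here. Because whitespace is trimmed with an ASCII-only test, a multibyte character inside an item is left intact instead of being partially discarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ASCII-only whitespace test (assumed scanner-style list). */
static bool
is_ascii_space(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r' || ch == '\f';
}

/*
 * Split "input" in place on "sep", trimming ASCII whitespace around each
 * item.  Pointers to the items are stored in items[].  Returns the number
 * of items, or -1 if an item is empty or more than max_items are found.
 * An input that is empty or all whitespace yields 0 items.
 */
static int
split_list(char *input, char sep, char **items, int max_items)
{
    int   n = 0;
    char *p = input;

    assert(input != NULL && items != NULL);

    while (is_ascii_space(*p))
        p++;
    if (*p == '\0')
        return 0;

    for (;;)
    {
        char *start = p;
        char *end;
        bool  at_end;

        /* Find the end of this item. */
        while (*p != '\0' && *p != sep)
            p++;

        /* Trim trailing whitespace; only ASCII bytes can match. */
        end = p;
        while (end > start && is_ascii_space(end[-1]))
            end--;

        if (end == start || n >= max_items)
            return -1;

        items[n++] = start;
        at_end = (*p == '\0');
        *end = '\0';            /* terminate the item in place */
        if (at_end)
            break;

        /* Step over the separator and any whitespace that follows it. */
        p++;
        while (is_ascii_space(*p))
            p++;
    }
    return n;
}
```

With this sketch, an input such as `"foo , b\xC2\xA0r,baz"` (where `\xC2\xA0` is the UTF-8 encoding of no-break space) splits into three items, and the second item keeps both bytes of the multibyte character.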
    
    All the functions adjusted here are parsing SQL identifiers and similar
    constructs, so it's reasonable to insist that their definition of
    whitespace match the core scanner.  So we can hope that this won't cause
    many backwards-compatibility problems.  I've left alone isspace() calls
    in places that aren't really expecting any non-ASCII input characters,
    such as float8in().
    
    Back-patch to all supported branches.
    
    Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
misc.c 24.4 KB