    Rethink behavior of pg_import_system_collations(). · 0b13b2a7
    Tom Lane authored
    Marco Atzeri reported that initdb would fail if "locale -a" reported
    the same locale name more than once.  All previous versions of Postgres
    implicitly de-duplicated the results of "locale -a", but the rewrite
    to move the collation import logic into C had lost that property.
    It had also lost the property that locale names matching built-in
    collation names were silently ignored.
    
    The simplest way to fix this is to make initdb run the function in
    if-not-exists mode, which means that there's no real use-case for
    non if-not-exists mode; we might as well just drop the boolean argument
    and simplify the function's definition to be "add any collations not
    already known".  This change also gets rid of some odd corner cases
    caused by the fact that aliases were added in if-not-exists mode even
    if the function argument said otherwise.
    
    While at it, adjust the behavior so that pg_import_system_collations()
    doesn't spew "collation foo already exists, skipping" messages during a
    re-run; that's completely unhelpful, especially since there are often
    hundreds of them.  And make it return a count of the number of collations
    it did add, which seems like it might be helpful.
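
The "add any collations not already known" behavior, together with the returned count, can be illustrated with a small stand-alone sketch. This is not the PostgreSQL implementation: `CollationSet`, `collation_known()` and `import_collations()` are hypothetical names, and a fixed-size in-memory array stands in for the collation catalog. It only shows the intended semantics: duplicates and already-known names are skipped silently, and the return value counts the names actually added.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COLLATIONS 64

/* Hypothetical in-memory stand-in for the collation catalog. */
typedef struct
{
    const char *names[MAX_COLLATIONS];
    size_t      count;
} CollationSet;

/* Return 1 if "name" is already present in the set, else 0. */
static int
collation_known(const CollationSet *set, const char *name)
{
    for (size_t i = 0; i < set->count; i++)
    {
        if (strcmp(set->names[i], name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Add every reported name that is not already known.  Known names and
 * repeated names are skipped without any message.  Returns the number
 * of names that were actually added.
 */
static int
import_collations(CollationSet *set, const char *const *reported,
                  size_t nreported)
{
    int added = 0;

    for (size_t i = 0; i < nreported; i++)
    {
        if (collation_known(set, reported[i]))
            continue;           /* already exists: skip silently */
        assert(set->count < MAX_COLLATIONS);
        set->names[set->count++] = reported[i];
        added++;
    }
    return added;
}
```

With a set that already holds "C" and "POSIX", importing the list "C", "en_US.utf8", "en_US.utf8", "de_DE.utf8" adds two entries, and a second run over the same list adds none.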
    
    Also, re-integrate the previous coding's property that it would make a
    deterministic selection of which alias to use if there were conflicting
    possibilities.  This would only come into play if "locale -a" reports
    multiple equivalent locale names, say "de_DE.utf8" and "de_DE.UTF-8",
    but that hardly seems out of the question.
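
One way to get a deterministic choice among equivalent names is to sort the candidates before creating anything, so the outcome no longer depends on the order in which "locale -a" lists them. The sketch below is an illustration under stated assumptions, not the committed code: `AliasCandidate`, `make_candidate()`, `cmp_candidates()` and `choose_aliases()` are hypothetical names, the alias is assumed to be the locale name with everything from the first '.' removed, and the tie-break (plain `strcmp` order of the full locale name) is an assumption, since the message does not say which name wins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 64

/* Hypothetical pairing of a derived alias with the reported locale name. */
typedef struct
{
    char alias[NAME_MAX_LEN];       /* name with the encoding suffix removed */
    char localename[NAME_MAX_LEN];  /* name as reported by "locale -a" */
} AliasCandidate;

/* Fill in a candidate; the alias is the name truncated at the first '.'. */
static void
make_candidate(AliasCandidate *c, const char *localename)
{
    char *dot;

    assert(strlen(localename) < NAME_MAX_LEN);
    strcpy(c->localename, localename);
    strcpy(c->alias, localename);
    dot = strchr(c->alias, '.');
    if (dot != NULL)
        *dot = '\0';
}

/* Order by alias first, then by full locale name (assumed tie-break). */
static int
cmp_candidates(const void *a, const void *b)
{
    const AliasCandidate *ca = a;
    const AliasCandidate *cb = b;
    int r = strcmp(ca->alias, cb->alias);

    return (r != 0) ? r : strcmp(ca->localename, cb->localename);
}

/*
 * Sort the candidates, then keep only the first one for each alias.
 * The array is compacted in place; returns the number of entries kept.
 */
static size_t
choose_aliases(AliasCandidate *cands, size_t n)
{
    size_t kept = 0;

    qsort(cands, n, sizeof(AliasCandidate), cmp_candidates);
    for (size_t i = 0; i < n; i++)
    {
        if (kept > 0 && strcmp(cands[kept - 1].alias, cands[i].alias) == 0)
            continue;           /* same alias as the entry already kept */
        if (kept != i)
            cands[kept] = cands[i];
        kept++;
    }
    return kept;
}
```

Under this tie-break, "de_DE.utf8" and "de_DE.UTF-8" both map to the alias "de_DE", and the same one of the two is kept whichever appears first in the input.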
    
    In passing, fix incorrect behavior in pg_import_system_collations()'s
    ICU code path: it neglected CommandCounterIncrement, which would result
    in failures if ICU returns duplicate names, and it would try to create
    comments even if a new collation hadn't been created.
    
    Also, reorder operations in initdb so that the 'ucs_basic' collation
    is created before calling pg_import_system_collations(), not after.
    This prevents a failure if "locale -a" were to report a locale named
    that.  There's no reason to think that that ever happens in the wild,
    but the old coding would have survived it, so let's be equally robust.
    
    Discussion: https://postgr.es/m/20c74bc3-d6ca-243d-1bbc-12f17fa4fe9a@gmail.com