    Fix ancient encoding error in hungarian.stop. · fd90b5d5
    Tom Lane authored
    When we grabbed this file off the Snowball project's website, we mistakenly
    supposed that it was in LATIN1 encoding, but evidently it was actually in
    LATIN2.  This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
    LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
    bug #10589 from Zoltán Sörös.  We'd have messed up u-double-acute too,
    but there aren't any of those in the file.  Other characters used in the
    file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
    the problem for so long.
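
The mix-up described above can be reproduced with a short Python sketch. The helper names `decode_both` and `repair` and the sample word are illustrative assumptions, not part of this commit; the sketch only relies on Python's standard `iso8859-1` (LATIN1) and `iso8859-2` (LATIN2) codecs.

```python
def decode_both(byte_value: int) -> tuple:
    """Decode one byte as LATIN1 and as LATIN2 and return both characters."""
    raw = bytes([byte_value])
    return raw.decode("iso8859-1"), raw.decode("iso8859-2")


def repair(misconverted: str) -> str:
    """Undo a wrong LATIN1 decode: recover the bytes, then decode as LATIN2.

    Only valid for text whose characters all fit in LATIN1, which is the
    case for text that was produced by a LATIN1 decode in the first place.
    """
    return misconverted.encode("iso8859-1").decode("iso8859-2")


if __name__ == "__main__":
    # 0xF5: o-tilde (U+00F5) in LATIN1, o-double-acute (U+0151) in LATIN2
    print(decode_both(0xF5))
    # 0xFB: u-circumflex (U+00FB) in LATIN1, u-double-acute (U+0171) in LATIN2
    print(decode_both(0xFB))
    # 0xE9: e-acute in both encodings, so it never exposed the problem
    print(decode_both(0xE9))
    # Hypothetical sample word, chosen only to show the round trip
    print(repair("\u00f5ket"))
```

Characters such as 0xE9 decode identically under both codecs, which is consistent with the observation that only the double-acute letters were affected.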
    
    The error is not only ours: the Snowball project also was confused about
    which encoding is required for Hungarian.  But dealing with that will
    require source-code changes that I'm not at all sure we'll wish to
    back-patch.  Fixing the stopword file seems reasonably safe to back-patch,
    however.
hungarian.stop 1.2 KB