    Improve support for agglutinative languages (queries with compound words). · 324300bc
    Teodor Sigaev authored
    regression=# select to_tsquery( '\'fotballklubber\'');
                       to_tsquery
    ------------------------------------------------
     'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
    (1 row)
    
    So the interface to dictionaries has changed: the lexize method of a dictionary
    should return a pointer to an array of TSLexeme structs instead of char**. The
    last element should have TSLexeme->lexeme == NULL.
    
    typedef struct {
            /* Number of the split variant of the word. For example, the
                    word 'fotballklubber' (Norwegian) can be split in two ways:
                    ( fotball, klubb ) and ( fot, ball, klubb ). So the dictionary
                    should return:
                    nvariant        lexeme
                    1               fotball
                    1               klubb
                    2               fot
                    2               ball
                    2               klubb
    
            */
            uint16  nvariant;
    
            /* currently unused */
            uint16  flags;
    
            /* C-string */
            char    *lexeme;
    } TSLexeme;
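
The following is a minimal, standalone sketch of how such an array might be produced and consumed. It is not the code from this commit: `demo_lexize` and `format_variants` are hypothetical helpers invented for illustration, `uint16_t` stands in for the `uint16` typedef, and the struct is repeated only so the example compiles on its own. It assumes that lexemes sharing an `nvariant` are joined with `&` and that different variants are joined with `|`, matching the to_tsquery output shown above.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the struct from the commit message (uint16 -> uint16_t). */
typedef struct {
    uint16_t nvariant;  /* which split variant this lexeme belongs to */
    uint16_t flags;     /* currently unused */
    char    *lexeme;    /* C-string; NULL marks the end of the array */
} TSLexeme;

/* Hypothetical helper: heap-allocated copy of a C-string. */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char  *out = malloc(len);
    if (out != NULL)
        memcpy(out, s, len);
    return out;
}

/*
 * Hypothetical stand-in for a dictionary's lexize method. It returns the
 * two splits of 'fotballklubber' as a TSLexeme array whose last element
 * has lexeme == NULL.
 */
TSLexeme *demo_lexize(void)
{
    static const struct { uint16_t nvariant; const char *lexeme; } rows[] = {
        {1, "fotball"}, {1, "klubb"},
        {2, "fot"}, {2, "ball"}, {2, "klubb"}
    };
    size_t    n = sizeof(rows) / sizeof(rows[0]);
    TSLexeme *res = calloc(n + 1, sizeof(TSLexeme));

    if (res == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        res[i].nvariant = rows[i].nvariant;
        res[i].lexeme = copy_string(rows[i].lexeme);
    }
    res[n].lexeme = NULL;   /* explicit terminator */
    return res;
}

/*
 * Walk the array up to the NULL terminator and write a query-like string
 * into buf: '&' between lexemes of the same variant, '|' between variants.
 * Stops early if buf is too small.
 */
void format_variants(const TSLexeme *lex, char *buf, size_t buflen)
{
    size_t used = 0;

    if (buflen == 0)
        return;
    buf[0] = '\0';
    for (const TSLexeme *p = lex; p->lexeme != NULL; p++) {
        const char *sep = "";
        int         n;

        if (p != lex)
            sep = (p->nvariant == (p - 1)->nvariant) ? " & " : " | ";
        n = snprintf(buf + used, buflen - used, "%s'%s'", sep, p->lexeme);
        if (n < 0 || (size_t) n >= buflen - used)
            break;          /* output was truncated; stop here */
        used += (size_t) n;
    }
}
```

Calling `format_variants(demo_lexize(), buf, sizeof buf)` with a sufficiently large buffer should yield the same expression as the query result above. A real dictionary would allocate its result with the server's memory routines rather than `malloc`; plain libc is used here only to keep the sketch self-contained.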
dict_snowball.c 2.55 KB