Commit 7fc614c6 authored by Tom Lane

Docs review for unaccent: fix grammar, markup, etc.

parent 1dab218a
<!-- $PostgreSQL: pgsql/doc/src/sgml/unaccent.sgml,v 1.6 2010/08/25 02:12:00 tgl Exp $ -->
<sect1 id="unaccent">
<title>unaccent</title>
@@ -6,24 +8,24 @@
</indexterm>
<para>
<filename>unaccent</> removes accents (diacritic signs) from a lexeme.
It's a filtering dictionary, that means its output is
always passed to the next dictionary (if any), contrary to the standard
behavior. Currently, it supports most important accents from European
languages.
<filename>unaccent</> is a text search dictionary that removes accents
(diacritic signs) from lexemes.
It's a filtering dictionary, which means its output is
always passed to the next dictionary (if any), unlike the normal
behavior of dictionaries. This allows accent-insensitive processing
for full text search.
</para>
<para>
Limitation: Current implementation of <filename>unaccent</>
dictionary cannot be used as a normalizing dictionary for
<filename>thesaurus</filename> dictionary.
The current implementation of <filename>unaccent</> cannot be used as a
normalizing dictionary for the <filename>thesaurus</filename> dictionary.
</para>
<sect2>
<title>Configuration</title>
<para>
A <literal>unaccent</> dictionary accepts the following options:
An <literal>unaccent</> dictionary accepts the following options:
</para>
<itemizedlist>
<listitem>
@@ -43,7 +45,9 @@
<itemizedlist>
<listitem>
<para>
Each line represents pair: character_with_accent character_without_accent
Each line represents a pair, consisting of a character with accent
followed by a character without accent. The first is translated into
the second. For example,
<programlisting>
&Agrave; A
&Aacute; A
@@ -58,8 +62,10 @@
</itemizedlist>
<para>
Look at <filename>unaccent.rules</>, which is installed in
<filename>$SHAREDIR/tsearch_data/</>, for an example.
A more complete example, which is directly useful for most European
languages, can be found in <filename>unaccent.rules</>, which is installed
in <filename>$SHAREDIR/tsearch_data/</> when the <filename>unaccent</>
module is installed.
</para>
</sect2>
@@ -67,23 +73,22 @@
<title>Usage</title>
<para>
Running the installation script creates a text search template
<literal>unaccent</> and a dictionary <literal>unaccent</>
Running the installation script <filename>unaccent.sql</> creates a text
search template <literal>unaccent</> and a dictionary <literal>unaccent</>
based on it, with default parameters. You can alter the
parameters, for example
<programlisting>
=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
</programlisting>
or create new dictionaries based on the template.
</para>
<para>
To test the dictionary, you can try
To test the dictionary, you can try:
<programlisting>
=# select ts_lexize('unaccent','Hôtel');
mydb=# select ts_lexize('unaccent','H&ocirc;tel');
ts_lexize
-----------
{Hotel}
@@ -92,41 +97,42 @@
</para>
<para>
Filtering dictionary are useful for correct work of
<function>ts_headline</function> function.
Here is an example showing how to insert the
<filename>unaccent</> dictionary into a text search configuration:
<programlisting>
=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
=# ALTER TEXT SEARCH CONFIGURATION fr
mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
mydb=# ALTER TEXT SEARCH CONFIGURATION fr
ALTER MAPPING FOR hword, hword_part, word
WITH unaccent, french_stem;
=# select to_tsvector('fr','Hôtels de la Mer');
mydb=# select to_tsvector('fr','H&ocirc;tels de la Mer');
to_tsvector
-------------------
'hotel':1 'mer':4
(1 row)
=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
mydb=# select to_tsvector('fr','H&ocirc;tel de la Mer') @@ to_tsquery('fr','Hotels');
?column?
----------
t
(1 row)
=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
mydb=# select ts_headline('fr','H&ocirc;tel de la Mer',to_tsquery('fr','Hotels'));
ts_headline
------------------------
&lt;b&gt;Hôtel&lt;/b&gt;de la Mer
&lt;b&gt;H&ocirc;tel&lt;/b&gt; de la Mer
(1 row)
</programlisting>
</para>
</sect2>
<sect2>
<title>Function</title>
<title>Functions</title>
<para>
<function>unaccent</> function removes accents (diacritic signs) from
argument string. Basically, it's a wrapper around
<filename>unaccent</> dictionary.
The <function>unaccent()</> function removes accents (diacritic signs) from
a given string. Basically, it's a wrapper around the
<filename>unaccent</> dictionary, but it can be used outside normal
text search contexts.
</para>
<indexterm>
@@ -134,14 +140,14 @@
</indexterm>
<synopsis>
unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>)
returns <type>text</type>
unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>) returns <type>text</type>
</synopsis>
<para>
For example:
<programlisting>
SELECT unaccent('unaccent', 'Hôtel');
SELECT unaccent('Hôtel');
SELECT unaccent('unaccent', 'H&ocirc;tel');
SELECT unaccent('H&ocirc;tel');
</programlisting>
</para>
</sect2>
......
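Note on the rules file format described in the Configuration hunk: each line is a pair, a character with an accent followed by the character it is translated into. The Python sketch below only illustrates that pair-per-line idea. It is not PostgreSQL's implementation, and the names `load_rules`, `strip_accents`, and `SAMPLE_RULES` are hypothetical, introduced here for illustration. It assumes single-character pairs separated by whitespace and silently skips anything else.

```python
def load_rules(lines):
    """Parse 'accented unaccented' pairs, one per line, into a dict.

    Assumption for this sketch: each valid line holds exactly two
    whitespace-separated tokens; other lines are ignored.
    """
    rules = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or malformed line: skipped in this sketch
        accented, plain = parts
        rules[accented] = plain
    return rules


def strip_accents(text, rules):
    """Replace each character that has a rule; leave the rest unchanged."""
    return "".join(rules.get(ch, ch) for ch in text)


# Three sample pairs written with escapes so the file stays ASCII:
# U+00C0 (A grave) -> A, U+00C1 (A acute) -> A, U+00F4 (o circumflex) -> o
SAMPLE_RULES = "\u00c0 A\n\u00c1 A\n\u00f4 o\n"

rules = load_rules(SAMPLE_RULES.splitlines())
print(strip_accents("H\u00f4tel", rules))  # prints "Hotel"
```

This mirrors the `ts_lexize('unaccent', ...)` example in the Usage hunk, where the accented input yields `{Hotel}`. The real dictionary reads its pairs from the file named by the `RULES` option.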