Commit 6581e930 authored by Tom Lane

Polish the documentation concerning phrase text search.

Fix grammar, improve examples, etc.

I did not attempt to document the current behavior concerning distance-zero
matches, because I think that's broken and needs to change, so I'm not
going to use up brain cells figuring out how to explain how it works now.
One way or the other, there's still more to write here.
parent f721e94b
...@@ -3923,11 +3923,18 @@ SELECT to_tsvector('english', 'The Fat Rats'); ...@@ -3923,11 +3923,18 @@ SELECT to_tsvector('english', 'The Fat Rats');
<para> <para>
A <type>tsquery</type> value stores lexemes that are to be A <type>tsquery</type> value stores lexemes that are to be
searched for, and combines them honoring the Boolean operators searched for, and can combine them using the Boolean operators
<literal>&amp;</literal> (AND), <literal>|</literal> (OR), <literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
<literal>!</> (NOT) and <literal>&lt;-&gt;</> (FOLLOWED BY) phrase search <literal>!</> (NOT), as well as the phrase search operator
operator. Parentheses can be used to enforce grouping <literal>&lt;-&gt;</> (FOLLOWED BY). There is also a variant
of the operators: <literal>&lt;<replaceable>N</>&gt;</literal> of the FOLLOWED BY
operator, where <replaceable>N</> is an integer constant that
specifies a maximum distance between the two lexemes being searched
for. <literal>&lt;-&gt;</> is equivalent to <literal>&lt;1&gt;</>.
</para>
<para>
Parentheses can be used to enforce grouping of the operators:
<programlisting> <programlisting>
SELECT 'fat &amp; rat'::tsquery; SELECT 'fat &amp; rat'::tsquery;
......
...@@ -9081,10 +9081,11 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9081,10 +9081,11 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<table id="textsearch-operators-table"> <table id="textsearch-operators-table">
<title>Text Search Operators</title> <title>Text Search Operators</title>
<tgroup cols="4"> <tgroup cols="5">
<thead> <thead>
<row> <row>
<entry>Operator</entry> <entry>Operator</entry>
<entry>Return Type</entry>
<entry>Description</entry> <entry>Description</entry>
<entry>Example</entry> <entry>Example</entry>
<entry>Result</entry> <entry>Result</entry>
...@@ -9093,54 +9094,63 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9093,54 +9094,63 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<tbody> <tbody>
<row> <row>
<entry> <literal>@@</literal> </entry> <entry> <literal>@@</literal> </entry>
<entry><type>boolean</></entry>
<entry><type>tsvector</> matches <type>tsquery</> ?</entry> <entry><type>tsvector</> matches <type>tsquery</> ?</entry>
<entry><literal>to_tsvector('fat cats ate rats') @@ to_tsquery('cat &amp; rat')</literal></entry> <entry><literal>to_tsvector('fat cats ate rats') @@ to_tsquery('cat &amp; rat')</literal></entry>
<entry><literal>t</literal></entry> <entry><literal>t</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>@@@</literal> </entry> <entry> <literal>@@@</literal> </entry>
<entry><type>boolean</></entry>
<entry>deprecated synonym for <literal>@@</></entry> <entry>deprecated synonym for <literal>@@</></entry>
<entry><literal>to_tsvector('fat cats ate rats') @@@ to_tsquery('cat &amp; rat')</literal></entry> <entry><literal>to_tsvector('fat cats ate rats') @@@ to_tsquery('cat &amp; rat')</literal></entry>
<entry><literal>t</literal></entry> <entry><literal>t</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>||</literal> </entry> <entry> <literal>||</literal> </entry>
<entry><type>tsvector</></entry>
<entry>concatenate <type>tsvector</>s</entry> <entry>concatenate <type>tsvector</>s</entry>
<entry><literal>'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector</literal></entry> <entry><literal>'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector</literal></entry>
<entry><literal>'a':1 'b':2,5 'c':3 'd':4</literal></entry> <entry><literal>'a':1 'b':2,5 'c':3 'd':4</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>&amp;&amp;</literal> </entry> <entry> <literal>&amp;&amp;</literal> </entry>
<entry><type>tsquery</></entry>
<entry>AND <type>tsquery</>s together</entry> <entry>AND <type>tsquery</>s together</entry>
<entry><literal>'fat | rat'::tsquery &amp;&amp; 'cat'::tsquery</literal></entry> <entry><literal>'fat | rat'::tsquery &amp;&amp; 'cat'::tsquery</literal></entry>
<entry><literal>( 'fat' | 'rat' ) &amp; 'cat'</literal></entry> <entry><literal>( 'fat' | 'rat' ) &amp; 'cat'</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>||</literal> </entry> <entry> <literal>||</literal> </entry>
<entry><type>tsquery</></entry>
<entry>OR <type>tsquery</>s together</entry> <entry>OR <type>tsquery</>s together</entry>
<entry><literal>'fat | rat'::tsquery || 'cat'::tsquery</literal></entry> <entry><literal>'fat | rat'::tsquery || 'cat'::tsquery</literal></entry>
<entry><literal>( 'fat' | 'rat' ) | 'cat'</literal></entry> <entry><literal>( 'fat' | 'rat' ) | 'cat'</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>!!</literal> </entry> <entry> <literal>!!</literal> </entry>
<entry><type>tsquery</></entry>
<entry>negate a <type>tsquery</></entry> <entry>negate a <type>tsquery</></entry>
<entry><literal>!! 'cat'::tsquery</literal></entry> <entry><literal>!! 'cat'::tsquery</literal></entry>
<entry><literal>!'cat'</literal></entry> <entry><literal>!'cat'</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>&lt;-&gt;</literal> </entry> <entry> <literal>&lt;-&gt;</literal> </entry>
<entry><type>tsquery</></entry>
<entry><type>tsquery</> followed by <type>tsquery</></entry> <entry><type>tsquery</> followed by <type>tsquery</></entry>
<entry><literal>to_tsquery('fat') &lt;-&gt; to_tsquery('rat')</literal></entry> <entry><literal>to_tsquery('fat') &lt;-&gt; to_tsquery('rat')</literal></entry>
<entry><literal>'fat' &lt;-&gt; 'rat'</literal></entry> <entry><literal>'fat' &lt;-&gt; 'rat'</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>@&gt;</literal> </entry> <entry> <literal>@&gt;</literal> </entry>
<entry><type>boolean</></entry>
<entry><type>tsquery</> contains another ?</entry> <entry><type>tsquery</> contains another ?</entry>
<entry><literal>'cat'::tsquery @&gt; 'cat &amp; rat'::tsquery</literal></entry> <entry><literal>'cat'::tsquery @&gt; 'cat &amp; rat'::tsquery</literal></entry>
<entry><literal>f</literal></entry> <entry><literal>f</literal></entry>
</row> </row>
<row> <row>
<entry> <literal>&lt;@</literal> </entry> <entry> <literal>&lt;@</literal> </entry>
<entry><type>boolean</></entry>
<entry><type>tsquery</> is contained in ?</entry> <entry><type>tsquery</> is contained in ?</entry>
<entry><literal>'cat'::tsquery &lt;@ 'cat &amp; rat'::tsquery</literal></entry> <entry><literal>'cat'::tsquery &lt;@ 'cat &amp; rat'::tsquery</literal></entry>
<entry><literal>t</literal></entry> <entry><literal>t</literal></entry>
...@@ -9245,7 +9255,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9245,7 +9255,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<literal><function>phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</> <type>regconfig</> , </optional> <replaceable class="PARAMETER">query</> <type>text</type>)</function></literal> <literal><function>phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</> <type>regconfig</> , </optional> <replaceable class="PARAMETER">query</> <type>text</type>)</function></literal>
</entry> </entry>
<entry><type>tsquery</type></entry> <entry><type>tsquery</type></entry>
<entry>produce <type>tsquery</> ignoring punctuation</entry> <entry>produce <type>tsquery</> that searches for a phrase,
ignoring punctuation</entry>
<entry><literal>phraseto_tsquery('english', 'The Fat Rats')</literal></entry> <entry><literal>phraseto_tsquery('english', 'The Fat Rats')</literal></entry>
<entry><literal>'fat' &lt;-&gt; 'rat'</literal></entry> <entry><literal>'fat' &lt;-&gt; 'rat'</literal></entry>
</row> </row>
...@@ -9400,7 +9411,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9400,7 +9411,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<literal><function>ts_rewrite(<replaceable class="PARAMETER">query</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">target</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">substitute</replaceable> <type>tsquery</>)</function></literal> <literal><function>ts_rewrite(<replaceable class="PARAMETER">query</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">target</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">substitute</replaceable> <type>tsquery</>)</function></literal>
</entry> </entry>
<entry><type>tsquery</type></entry> <entry><type>tsquery</type></entry>
<entry>replace target with substitute within query</entry> <entry>replace <replaceable>target</> with <replaceable>substitute</>
within query</entry>
<entry><literal>ts_rewrite('a &amp; b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)</literal></entry> <entry><literal>ts_rewrite('a &amp; b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)</literal></entry>
<entry><literal>'b' &amp; ( 'foo' | 'bar' )</literal></entry> <entry><literal>'b' &amp; ( 'foo' | 'bar' )</literal></entry>
</row> </row>
...@@ -9419,7 +9431,9 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9419,7 +9431,9 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<literal><function>tsquery_phrase(<replaceable class="PARAMETER">query1</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">query2</replaceable> <type>tsquery</>)</function></literal> <literal><function>tsquery_phrase(<replaceable class="PARAMETER">query1</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">query2</replaceable> <type>tsquery</>)</function></literal>
</entry> </entry>
<entry><type>tsquery</type></entry> <entry><type>tsquery</type></entry>
<entry>implementation of <literal>&lt;-&gt;</> (FOLLOWED BY) operator</entry> <entry>make query that searches for <replaceable>query1</> followed
by <replaceable>query2</> (same as <literal>&lt;-&gt;</>
operator)</entry>
<entry><literal>tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))</literal></entry> <entry><literal>tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))</literal></entry>
<entry><literal>'fat' &lt;-&gt; 'cat'</literal></entry> <entry><literal>'fat' &lt;-&gt; 'cat'</literal></entry>
</row> </row>
...@@ -9428,7 +9442,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple ...@@ -9428,7 +9442,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
<literal><function>tsquery_phrase(<replaceable class="PARAMETER">query1</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">query2</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">distance</replaceable> <type>integer</>)</function></literal> <literal><function>tsquery_phrase(<replaceable class="PARAMETER">query1</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">query2</replaceable> <type>tsquery</>, <replaceable class="PARAMETER">distance</replaceable> <type>integer</>)</function></literal>
</entry> </entry>
<entry><type>tsquery</type></entry> <entry><type>tsquery</type></entry>
<entry>phrase-concatenate with distance</entry> <entry>make query that searches for <replaceable>query1</> followed by
<replaceable>query2</> at maximum distance <replaceable>distance</></entry>
<entry><literal>tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)</literal></entry> <entry><literal>tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)</literal></entry>
<entry><literal>'fat' &lt;10&gt; 'cat'</literal></entry> <entry><literal>'fat' &lt;10&gt; 'cat'</literal></entry>
</row> </row>
......
...@@ -263,12 +263,12 @@ SELECT 'fat &amp; cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::t ...@@ -263,12 +263,12 @@ SELECT 'fat &amp; cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::t
As the above example suggests, a <type>tsquery</type> is not just raw As the above example suggests, a <type>tsquery</type> is not just raw
text, any more than a <type>tsvector</type> is. A <type>tsquery</type> text, any more than a <type>tsvector</type> is. A <type>tsquery</type>
contains search terms, which must be already-normalized lexemes, and contains search terms, which must be already-normalized lexemes, and
may combine multiple terms using AND, OR, NOT and FOLLOWED BY operators. may combine multiple terms using AND, OR, NOT, and FOLLOWED BY operators.
(For details see <xref linkend="datatype-textsearch">.) There are (For details see <xref linkend="datatype-tsquery">.) There are
functions <function>to_tsquery</>, <function>plainto_tsquery</> functions <function>to_tsquery</>, <function>plainto_tsquery</>,
and <function>phraseto_tsquery</> and <function>phraseto_tsquery</>
that are helpful in converting user-written text into a proper that are helpful in converting user-written text into a proper
<type>tsquery</type>, for example by normalizing words appearing in <type>tsquery</type>, primarily by normalizing words appearing in
the text. Similarly, <function>to_tsvector</> is used to parse and the text. Similarly, <function>to_tsvector</> is used to parse and
normalize a document string. So in practice a text search match would normalize a document string. So in practice a text search match would
look more like this: look more like this:
...@@ -294,35 +294,6 @@ SELECT 'fat cats ate fat rats'::tsvector @@ to_tsquery('fat &amp; rat'); ...@@ -294,35 +294,6 @@ SELECT 'fat cats ate fat rats'::tsvector @@ to_tsquery('fat &amp; rat');
already normalized, so <literal>rats</> does not match <literal>rat</>. already normalized, so <literal>rats</> does not match <literal>rat</>.
</para> </para>
<para>
Phrase search is made possible with the help of the <literal>&lt;-&gt;</>
(FOLLOWED BY) operator, which enforces lexeme order. This allows you
to discard strings not containing the desired phrase, for example:
<programlisting>
SELECT q @@ to_tsquery('fatal &lt;-&gt; error')
FROM unnest(array[to_tsvector('fatal error'),
to_tsvector('error is not fatal')]) AS q;
?column?
----------
t
f
</programlisting>
A more generic version of the FOLLOWED BY operator takes form of
<literal>&lt;N&gt;</>, where N stands for the greatest allowed distance
between the specified lexemes. The <literal>phraseto_tsquery</>
function makes use of this behavior in order to construct a
<literal>tsquery</> capable of matching the provided phrase:
<programlisting>
SELECT phraseto_tsquery('cat ate some rats');
phraseto_tsquery
-------------------------------
( 'cat' &lt;-&gt; 'ate' ) &lt;2&gt; 'rat'
</programlisting>
</para>
<para> <para>
The <literal>@@</literal> operator also The <literal>@@</literal> operator also
supports <type>text</type> input, allowing explicit conversion of a text supports <type>text</type> input, allowing explicit conversion of a text
...@@ -344,6 +315,57 @@ text @@ text ...@@ -344,6 +315,57 @@ text @@ text
The form <type>text</type> <literal>@@</literal> <type>text</type> The form <type>text</type> <literal>@@</literal> <type>text</type>
is equivalent to <literal>to_tsvector(x) @@ plainto_tsquery(y)</literal>. is equivalent to <literal>to_tsvector(x) @@ plainto_tsquery(y)</literal>.
</para> </para>
<para>
Within a <type>tsquery</>, the <literal>&amp;</literal> (AND) operator
specifies that both its arguments must appear in the document to have a
match. Similarly, the <literal>|</literal> (OR) operator specifies that
at least one of its arguments must appear, while the <literal>!</> (NOT)
operator specifies that its argument must <emphasis>not</> appear in
order to have a match. Parentheses can be used to control nesting of
these operators.
</para>
<para>
Searching for phrases is possible with the help of
the <literal>&lt;-&gt;</> (FOLLOWED BY) <type>tsquery</> operator, which
matches only if its arguments have matches that are adjacent and in the
given order. For example:
<programlisting>
SELECT to_tsvector('fatal error') @@ to_tsquery('fatal &lt;-&gt; error');
?column?
----------
t
SELECT to_tsvector('error is not fatal') @@ to_tsquery('fatal &lt;-&gt; error');
?column?
----------
f
</programlisting>
There is a more general version of the FOLLOWED BY operator having the
form <literal>&lt;<replaceable>N</>&gt;</literal>,
where <replaceable>N</> is an integer standing for the greatest distance
allowed between the matching lexemes. <literal>&lt;1&gt;</literal> is
the same as <literal>&lt;-&gt;</>, while <literal>&lt;2&gt;</literal>
allows one other lexeme to optionally appear between the matches, and so
on. The <literal>phraseto_tsquery</> function makes use of this
operator to construct a <literal>tsquery</> that can match a multi-word
phrase when some of the words are stop words. For example:
<programlisting>
SELECT phraseto_tsquery('cats ate rats');
phraseto_tsquery
-------------------------------
( 'cat' &lt;-&gt; 'ate' ) &lt;-&gt; 'rat'
SELECT phraseto_tsquery('the cats ate the rats');
phraseto_tsquery
-------------------------------
( 'cat' &lt;-&gt; 'ate' ) &lt;2&gt; 'rat'
</programlisting>
</para>
</sect2> </sect2>
<sect2 id="textsearch-intro-configurations"> <sect2 id="textsearch-intro-configurations">
...@@ -740,12 +762,12 @@ UPDATE tt SET ti = ...@@ -740,12 +762,12 @@ UPDATE tt SET ti =
<para> <para>
<productname>PostgreSQL</productname> provides the <productname>PostgreSQL</productname> provides the
functions <function>to_tsquery</function>, functions <function>to_tsquery</function>,
<function>plainto_tsquery</function> and <function>plainto_tsquery</function>, and
<function>phraseto_tsquery</function> <function>phraseto_tsquery</function>
for converting a query to the <type>tsquery</type> data type. for converting a query to the <type>tsquery</type> data type.
<function>to_tsquery</function> offers access to more features <function>to_tsquery</function> offers access to more features
than both <function>plainto_tsquery</function> and than either <function>plainto_tsquery</function> or
<function>phraseto_tsquery</function>, but is less forgiving <function>phraseto_tsquery</function>, but it is less forgiving
about its input. about its input.
</para> </para>
...@@ -760,15 +782,15 @@ to_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> <type> ...@@ -760,15 +782,15 @@ to_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> <type>
<para> <para>
<function>to_tsquery</function> creates a <type>tsquery</> value from <function>to_tsquery</function> creates a <type>tsquery</> value from
<replaceable>querytext</replaceable>, which must consist of single tokens <replaceable>querytext</replaceable>, which must consist of single tokens
separated by the Boolean operators <literal>&amp;</literal> (AND), separated by the <type>tsquery</> operators <literal>&amp;</literal> (AND),
<literal>|</literal> (OR), <literal>!</literal> (NOT), and also the <literal>|</literal> (OR), <literal>!</literal> (NOT), and
<literal>&lt;-&gt;</literal> (FOLLOWED BY) phrase search operator. These operators <literal>&lt;-&gt;</literal> (FOLLOWED BY), possibly grouped
can be grouped using parentheses. In other words, the input to using parentheses. In other words, the input to
<function>to_tsquery</function> must already follow the general rules for <function>to_tsquery</function> must already follow the general rules for
<type>tsquery</> input, as described in <xref <type>tsquery</> input, as described in <xref
linkend="datatype-textsearch">. The difference is that while basic linkend="datatype-tsquery">. The difference is that while basic
<type>tsquery</> input takes the tokens at face value, <type>tsquery</> input takes the tokens at face value,
<function>to_tsquery</function> normalizes each token to a lexeme using <function>to_tsquery</function> normalizes each token into a lexeme using
the specified or default configuration, and discards any tokens that are the specified or default configuration, and discards any tokens that are
stop words according to the configuration. For example: stop words according to the configuration. For example:
...@@ -818,7 +840,8 @@ SELECT to_tsquery('''supernovae stars'' &amp; !crab'); ...@@ -818,7 +840,8 @@ SELECT to_tsquery('''supernovae stars'' &amp; !crab');
</screen> </screen>
Without quotes, <function>to_tsquery</function> will generate a syntax Without quotes, <function>to_tsquery</function> will generate a syntax
error for tokens that are not separated by an AND or OR operator. error for tokens that are not separated by an AND, OR, or FOLLOWED BY
operator.
</para> </para>
<indexterm> <indexterm>
...@@ -830,11 +853,11 @@ plainto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> < ...@@ -830,11 +853,11 @@ plainto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> <
</synopsis> </synopsis>
<para> <para>
<function>plainto_tsquery</> transforms unformatted text <function>plainto_tsquery</> transforms the unformatted text
<replaceable>querytext</replaceable> to <type>tsquery</type>. <replaceable>querytext</replaceable> to a <type>tsquery</type> value.
The text is parsed and normalized much as for <function>to_tsvector</>, The text is parsed and normalized much as for <function>to_tsvector</>,
then the <literal>&amp;</literal> (AND) Boolean operator is inserted then the <literal>&amp;</literal> (AND) <type>tsquery</type> operator is
between surviving words. inserted between surviving words.
</para> </para>
<para> <para>
...@@ -847,8 +870,8 @@ SELECT plainto_tsquery('english', 'The Fat Rats'); ...@@ -847,8 +870,8 @@ SELECT plainto_tsquery('english', 'The Fat Rats');
'fat' &amp; 'rat' 'fat' &amp; 'rat'
</screen> </screen>
Note that <function>plainto_tsquery</> cannot Note that <function>plainto_tsquery</> will not
recognize Boolean and phrase search operators, weight labels, recognize <type>tsquery</type> operators, weight labels,
or prefix-match labels in its input: or prefix-match labels in its input:
<screen> <screen>
...@@ -871,11 +894,14 @@ phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> ...@@ -871,11 +894,14 @@ phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable>
<para> <para>
<function>phraseto_tsquery</> behaves much like <function>phraseto_tsquery</> behaves much like
<function>plainto_tsquery</>, with the exception <function>plainto_tsquery</>, except that it inserts
that it utilizes the <literal>&lt;-&gt;</literal> (FOLLOWED BY) phrase search the <literal>&lt;-&gt;</literal> (FOLLOWED BY) operator between
operator instead of the <literal>&amp;</literal> (AND) Boolean operator. surviving words instead of the <literal>&amp;</literal> (AND) operator.
This is particularly useful when searching for exact lexeme sequences, Also, stop words are not simply discarded, but are accounted for by
since the phrase search operator helps to maintain lexeme order. inserting <literal>&lt;<replaceable>N</>&gt;</literal> operators rather
than <literal>&lt;-&gt;</literal> operators. This function is useful
when searching for exact lexeme sequences, since the FOLLOWED BY
operators check lexeme order not just the presence of all the lexemes.
</para> </para>
<para> <para>
...@@ -888,9 +914,9 @@ SELECT phraseto_tsquery('english', 'The Fat Rats'); ...@@ -888,9 +914,9 @@ SELECT phraseto_tsquery('english', 'The Fat Rats');
'fat' &lt;-&gt; 'rat' 'fat' &lt;-&gt; 'rat'
</screen> </screen>
Just like the <function>plainto_tsquery</>, the Like <function>plainto_tsquery</>, the
<function>phraseto_tsquery</> function cannot <function>phraseto_tsquery</> function will not
recognize Boolean and phrase search operators, weight labels, recognize <type>tsquery</type> operators, weight labels,
or prefix-match labels in its input: or prefix-match labels in its input:
<screen> <screen>
...@@ -899,17 +925,6 @@ SELECT phraseto_tsquery('english', 'The Fat &amp; Rats:C'); ...@@ -899,17 +925,6 @@ SELECT phraseto_tsquery('english', 'The Fat &amp; Rats:C');
----------------------------- -----------------------------
( 'fat' &lt;-&gt; 'rat' ) &lt;-&gt; 'c' ( 'fat' &lt;-&gt; 'rat' ) &lt;-&gt; 'c'
</screen> </screen>
It is possible to specify the configuration to be used to parse the document,
for example, we could create a new one using the hunspell dictionary
(namely 'eng_hunspell') in order to match phrases with different word forms:
<screen>
SELECT phraseto_tsquery('eng_hunspell', 'developer of the building which collapsed');
phraseto_tsquery
--------------------------------------------------------------------------------------------
( 'developer' &lt;3&gt; 'building' ) &lt;2&gt; 'collapse' | ( 'developer' &lt;3&gt; 'build' ) &lt;2&gt; 'collapse'
</screen>
</para> </para>
</sect2> </sect2>
...@@ -1400,10 +1415,13 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank ...@@ -1400,10 +1415,13 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
<listitem> <listitem>
<para> <para>
Returns a vector which lists the same lexemes as the given vector, but Returns a vector that lists the same lexemes as the given vector, but
which lacks any position or weight information. While the returned lacks any position or weight information. The result is usually much
vector is much less useful than an unstripped vector for relevance smaller than an unstripped vector, but it is also less useful.
ranking, it will usually be much smaller. Relevance ranking does not work as well on stripped vectors as
unstripped ones. Also, when given stripped input,
the <literal>&lt;-&gt;</> (FOLLOWED BY) <type>tsquery</> operator
effectively degenerates to a simple <literal>&amp;</> (AND) test.
</para> </para>
</listitem> </listitem>
...@@ -1481,7 +1499,10 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank ...@@ -1481,7 +1499,10 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
<listitem> <listitem>
<para> <para>
Returns the phrase-concatenation of the two given queries. Returns a query that searches for a match to the first given query
immediately followed by a match to the second given query, using
the <literal>&lt;-&gt;</> (FOLLOWED BY)
<type>tsquery</> operator. For example:
<screen> <screen>
SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat'); SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat');
...@@ -1506,8 +1527,11 @@ SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat'); ...@@ -1506,8 +1527,11 @@ SELECT to_tsquery('fat') &lt;-&gt; to_tsquery('cat | rat');
<listitem> <listitem>
<para> <para>
Returns the distanced phrase-concatenation of the two given queries. Returns a query that searches for a match to the first given query
This function lies in the implementation of the <literal>&lt;-&gt;</> operator. followed by a match to the second given query at a distance of at
most <replaceable>distance</replaceable> lexemes, using
the <literal>&lt;<replaceable>N</>&gt;</literal>
<type>tsquery</> operator. For example:
<screen> <screen>
SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10); SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);
...@@ -3785,6 +3809,11 @@ Parser: "pg_catalog.default" ...@@ -3785,6 +3809,11 @@ Parser: "pg_catalog.default"
<para>Position values in <type>tsvector</> must be greater than 0 and <para>Position values in <type>tsvector</> must be greater than 0 and
no more than 16,383</para> no more than 16,383</para>
</listitem> </listitem>
<listitem>
<para>The match distance in a <literal>&lt;<replaceable>N</>&gt;</literal>
(FOLLOWED BY) <type>tsquery</> operator cannot be more than
16,384</para>
</listitem>
<listitem> <listitem>
<para>No more than 256 positions per lexeme</para> <para>No more than 256 positions per lexeme</para>
</listitem> </listitem>
......
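The `<N>` (FOLLOWED BY) wording introduced in the textsearch-intro hunk above can be illustrated with a toy model. This is only a sketch of the semantics as this revision of the documentation describes them (a match when the second lexeme appears between 1 and N positions after the first), not PostgreSQL's implementation. The helper names `positions` and `followed_by` are hypothetical, the tokens are assumed to be already-normalized lexemes, and distance-zero matches are deliberately left out because the commit message says that behavior is expected to change.

```python
def positions(tokens):
    """Map each token to its 1-based positions, like a very simplified tsvector."""
    index = {}
    for pos, tok in enumerate(tokens, start=1):
        index.setdefault(tok, []).append(pos)
    return index


def followed_by(index, left, right, max_distance=1):
    """Return True if `right` occurs 1..max_distance positions after `left`.

    max_distance=1 models <->, and max_distance=N models <N> as worded in
    this revision of the docs (an assumption, not the server's algorithm).
    """
    return any(
        0 < q - p <= max_distance
        for p in index.get(left, [])
        for q in index.get(right, [])
    )


if __name__ == "__main__":
    adjacent = positions(["fatal", "error"])
    reversed_order = positions(["error", "is", "not", "fatal"])
    with_gap = positions(["cat", "ate", "the", "rat"])

    print(followed_by(adjacent, "fatal", "error"))        # adjacent, in order
    print(followed_by(reversed_order, "fatal", "error"))  # wrong order
    print(followed_by(with_gap, "ate", "rat", 2))         # one lexeme in between
```

The position gap left by a skipped word is what makes `phraseto_tsquery` emit `<2>` rather than `<->` in the diff's stop-word example; in the sketch the gap is modeled simply by keeping the extra token in the list.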