Commit 7cd082f9 authored by Peter Eisentraut

Clarify that surrogate pairs are not encoded in UTF-8 directly

parent c5d94a34
-<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.154 2010/09/01 18:22:29 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.155 2010/09/07 18:54:09 petere Exp $ -->
 <chapter id="sql-syntax">
 <title>SQL Syntax</title>
@@ -236,12 +236,15 @@ U&amp;"d!0061t!+000061" UESCAPE '!'
 <para>
 The Unicode escape syntax works only when the server encoding is
-UTF8. When other server encodings are used, only code points in
-the ASCII range (up to <literal>\007F</literal>) can be specified.
-Both the 4-digit and the 6-digit form can be used to specify
-UTF-16 surrogate pairs to compose characters with code points
-larger than U+FFFF (although the availability of
-the 6-digit form technically makes this unnecessary).
+<literal>UTF8</>. When other server encodings are used, only code
+points in the ASCII range (up to <literal>\007F</literal>) can be
+specified. Both the 4-digit and the 6-digit form can be used to
+specify UTF-16 surrogate pairs to compose characters with code
+points larger than U+FFFF, although the availability of the
+6-digit form technically makes this unnecessary. (When surrogate
+pairs are used when the server encoding is <literal>UTF8</>, they
+are first combined into a single code point that is then encoded
+in UTF-8.)
 </para>
 <para>
@@ -431,13 +434,15 @@ SELECT 'foo' 'bar';
 <para>
 The Unicode escape syntax works fully only when the server
-encoding is UTF-8. When other server encodings are used, only
-code points in the ASCII range (up to <literal>\u007F</>) can be
-specified. Both the 4-digit and the 8-digit form can be used to
-specify UTF-16 surrogate pairs to compose characters with code
-points larger than U+FFFF (although the
-availability of the 8-digit form technically makes this
-unnecessary).
+encoding is <literal>UTF8</>. When other server encodings are
+used, only code points in the ASCII range (up
+to <literal>\u007F</>) can be specified. Both the 4-digit and
+the 8-digit form can be used to specify UTF-16 surrogate pairs to
+compose characters with code points larger than U+FFFF, although
+the availability of the 8-digit form technically makes this
+unnecessary. (When surrogate pairs are used when the server
+encoding is <literal>UTF8</>, they are first combined into a
+single code point that is then encoded in UTF-8.)
 </para>
 <caution>
@@ -517,13 +522,15 @@ U&amp;'d!0061t!+000061' UESCAPE '!'
 <para>
 The Unicode escape syntax works only when the server encoding is
-UTF8. When other server encodings are used, only code points in
-the ASCII range (up to <literal>\007F</literal>) can be
-specified.
-Both the 4-digit and the 6-digit form can be used to specify
-UTF-16 surrogate pairs to compose characters with code points
-larger than U+FFFF (although the availability
-of the 6-digit form technically makes this unnecessary).
+<literal>UTF8</>. When other server encodings are used, only
+code points in the ASCII range (up to <literal>\007F</literal>)
+can be specified. Both the 4-digit and the 6-digit form can be
+used to specify UTF-16 surrogate pairs to compose characters with
+code points larger than U+FFFF, although the availability of the
+6-digit form technically makes this unnecessary. (When surrogate
+pairs are used when the server encoding is <literal>UTF8</>, they
+are first combined into a single code point that is then encoded
+in UTF-8.)
 </para>
 <para>
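
Illustration of the behaviour the added parenthetical describes: a surrogate pair is first combined into one code point, and only that code point is encoded in UTF-8. This is a minimal Python sketch of the standard UTF-16 arithmetic, not PostgreSQL's implementation; the function name `combine_surrogates` and the sample pair D83D DE00 are assumptions chosen for the example.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("first value is not a high surrogate")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("second value is not a low surrogate")
    # Each surrogate carries 10 bits of the offset above U+FFFF.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Sample pair (an assumption for illustration): D83D DE00.
code_point = combine_surrogates(0xD83D, 0xDE00)

# The combined code point, not the two surrogates, is what gets
# encoded in UTF-8, giving one 4-byte sequence.
utf8_bytes = chr(code_point).encode("utf-8")

print(f"U+{code_point:04X}", utf8_bytes.hex(" "))
```

Under these assumptions the pair D83D DE00 and the single code point U+1F600 end up as the same four UTF-8 bytes, which is the sense in which the surrogates are never encoded in UTF-8 directly.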