Commit 34a0a81b authored by Tom Lane's avatar Tom Lane

Doc: rearrange the documentation of binary-string functions.

Rather than intermixing the discussion of text-string and binary-string
functions, make a clean break, moving all discussion of binary-string
operations into section 9.5.  This involves some duplication of
function descriptions between 9.4 and 9.5, but it seems cleaner on the
whole since the individual descriptions are clearer (and on the other
side of the coin, it gets rid of some duplicated descriptions, too).

Move the convert*/encode/decode functions to a separate table, because
they don't quite seem to fit under the heading of "binary string
functions".

Also provide full documentation of the textual formats supported by
encode() and decode() (which was the original goal of this patch
series, many moons ago).

Also move the table of built-in encoding conversions out of section 9.4,
where it no longer had any relevance whatsoever, and put it into section
23.3 about character sets.  I chose to put both that and table 23.2
(multibyte-translation-table) into a new <sect2> so as not to break up
the flow of discussion in 23.3.3.

Also do a bunch of minor copy-editing on the function descriptions
in 9.4 and 9.5.

Karl Pinc, reviewed by Fabien Coelho, further hacking by me

Discussion: https://postgr.es/m/20190304163347.7bca4897@slate.meme.com
parent 41aadeeb
......@@ -1495,313 +1495,12 @@ $ <userinput>psql -l</userinput>
<title>Automatic Character Set Conversion Between Server and Client</title>
<para>
<productname>PostgreSQL</productname> supports automatic
character set conversion between server and client for certain
character set combinations. The conversion information is stored in the
<literal>pg_conversion</literal> system catalog. <productname>PostgreSQL</productname>
comes with some predefined conversions, as shown in <xref
linkend="multibyte-translation-table"/>. You can create a new
conversion using the SQL command <command>CREATE CONVERSION</command>.
<productname>PostgreSQL</productname> supports automatic character
set conversion between server and client for many combinations of
character sets (<xref linkend="multibyte-conversions-supported"/>
shows which ones).
</para>
<table id="multibyte-translation-table">
<title>Client/Server Character Set Conversions</title>
<tgroup cols="2">
<thead>
<row>
<entry>Server Character Set</entry>
<entry>Available Client Character Sets</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>BIG5</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>EUC_CN</literal></entry>
<entry><emphasis>EUC_CN</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_JP</literal></entry>
<entry><emphasis>EUC_JP</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>SJIS</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><emphasis>EUC_JIS_2004</emphasis>,
<literal>SHIFT_JIS_2004</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_KR</literal></entry>
<entry><emphasis>EUC_KR</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_TW</literal></entry>
<entry><emphasis>EUC_TW</emphasis>,
<literal>BIG5</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>GB18030</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>GBK</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>ISO_8859_5</literal></entry>
<entry><emphasis>ISO_8859_5</emphasis>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_6</literal></entry>
<entry><emphasis>ISO_8859_6</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_7</literal></entry>
<entry><emphasis>ISO_8859_7</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_8</literal></entry>
<entry><emphasis>ISO_8859_8</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>JOHAB</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>KOI8R</literal></entry>
<entry><emphasis>KOI8R</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>KOI8U</literal></entry>
<entry><emphasis>KOI8U</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN1</literal></entry>
<entry><emphasis>LATIN1</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN2</literal></entry>
<entry><emphasis>LATIN2</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN1250</literal>
</entry>
</row>
<row>
<entry><literal>LATIN3</literal></entry>
<entry><emphasis>LATIN3</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN4</literal></entry>
<entry><emphasis>LATIN4</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN5</literal></entry>
<entry><emphasis>LATIN5</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN6</literal></entry>
<entry><emphasis>LATIN6</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN7</literal></entry>
<entry><emphasis>LATIN7</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN8</literal></entry>
<entry><emphasis>LATIN8</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN9</literal></entry>
<entry><emphasis>LATIN9</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN10</literal></entry>
<entry><emphasis>LATIN10</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><emphasis>MULE_INTERNAL</emphasis>,
<literal>BIG5</literal>,
<literal>EUC_CN</literal>,
<literal>EUC_JP</literal>,
<literal>EUC_KR</literal>,
<literal>EUC_TW</literal>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>LATIN1</literal> to <literal>LATIN4</literal>,
<literal>SJIS</literal>,
<literal>WIN866</literal>,
<literal>WIN1250</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>SJIS</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>SQL_ASCII</literal></entry>
<entry><emphasis>any (no conversion will be performed)</emphasis>
</entry>
</row>
<row>
<entry><literal>UHC</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>UTF8</literal></entry>
<entry><emphasis>all supported encodings</emphasis>
</entry>
</row>
<row>
<entry><literal>WIN866</literal></entry>
<entry><emphasis>WIN866</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>WIN874</literal></entry>
<entry><emphasis>WIN874</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1250</literal></entry>
<entry><emphasis>WIN1250</emphasis>,
<literal>LATIN2</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1251</literal></entry>
<entry><emphasis>WIN1251</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>
</entry>
</row>
<row>
<entry><literal>WIN1252</literal></entry>
<entry><emphasis>WIN1252</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1253</literal></entry>
<entry><emphasis>WIN1253</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1254</literal></entry>
<entry><emphasis>WIN1254</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1255</literal></entry>
<entry><emphasis>WIN1255</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1256</literal></entry>
<entry><emphasis>WIN1256</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1257</literal></entry>
<entry><emphasis>WIN1257</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1258</literal></entry>
<entry><emphasis>WIN1258</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
To enable automatic character set conversion, you have to
tell <productname>PostgreSQL</productname> the character set
......@@ -1905,6 +1604,970 @@ RESET client_encoding;
</para>
</sect2>
<sect2 id="multibyte-conversions-supported">
<title>Available Character Set Conversions</title>
<para>
<productname>PostgreSQL</productname> allows conversion between any
two character sets for which a conversion function is listed in the
<link linkend="catalog-pg-conversion"><structname>pg_conversion</structname></link>
system catalog. <productname>PostgreSQL</productname> comes with
some predefined conversions, as summarized in
<xref linkend="multibyte-translation-table"/> and shown in more
detail in <xref linkend="builtin-conversions-table"/>. You can
create a new conversion using the SQL command
<xref linkend="sql-createconversion"/>. (To be used for automatic
client/server conversions, a conversion must be marked
as <quote>default</quote> for its character set pair.)
</para>
<table id="multibyte-translation-table">
<title>Built-in Client/Server Character Set Conversions</title>
<tgroup cols="2">
<thead>
<row>
<entry>Server Character Set</entry>
<entry>Available Client Character Sets</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>BIG5</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>EUC_CN</literal></entry>
<entry><emphasis>EUC_CN</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_JP</literal></entry>
<entry><emphasis>EUC_JP</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>SJIS</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><emphasis>EUC_JIS_2004</emphasis>,
<literal>SHIFT_JIS_2004</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_KR</literal></entry>
<entry><emphasis>EUC_KR</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>EUC_TW</literal></entry>
<entry><emphasis>EUC_TW</emphasis>,
<literal>BIG5</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>GB18030</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>GBK</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>ISO_8859_5</literal></entry>
<entry><emphasis>ISO_8859_5</emphasis>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_6</literal></entry>
<entry><emphasis>ISO_8859_6</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_7</literal></entry>
<entry><emphasis>ISO_8859_7</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>ISO_8859_8</literal></entry>
<entry><emphasis>ISO_8859_8</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>JOHAB</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>KOI8R</literal></entry>
<entry><emphasis>KOI8R</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>KOI8U</literal></entry>
<entry><emphasis>KOI8U</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN1</literal></entry>
<entry><emphasis>LATIN1</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN2</literal></entry>
<entry><emphasis>LATIN2</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN1250</literal>
</entry>
</row>
<row>
<entry><literal>LATIN3</literal></entry>
<entry><emphasis>LATIN3</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN4</literal></entry>
<entry><emphasis>LATIN4</emphasis>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN5</literal></entry>
<entry><emphasis>LATIN5</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN6</literal></entry>
<entry><emphasis>LATIN6</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN7</literal></entry>
<entry><emphasis>LATIN7</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN8</literal></entry>
<entry><emphasis>LATIN8</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN9</literal></entry>
<entry><emphasis>LATIN9</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>LATIN10</literal></entry>
<entry><emphasis>LATIN10</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><emphasis>MULE_INTERNAL</emphasis>,
<literal>BIG5</literal>,
<literal>EUC_CN</literal>,
<literal>EUC_JP</literal>,
<literal>EUC_KR</literal>,
<literal>EUC_TW</literal>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>LATIN1</literal> to <literal>LATIN4</literal>,
<literal>SJIS</literal>,
<literal>WIN866</literal>,
<literal>WIN1250</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>SJIS</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>SQL_ASCII</literal></entry>
<entry><emphasis>any (no conversion will be performed)</emphasis>
</entry>
</row>
<row>
<entry><literal>UHC</literal></entry>
<entry><emphasis>not supported as a server encoding</emphasis>
</entry>
</row>
<row>
<entry><literal>UTF8</literal></entry>
<entry><emphasis>all supported encodings</emphasis>
</entry>
</row>
<row>
<entry><literal>WIN866</literal></entry>
<entry><emphasis>WIN866</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN1251</literal>
</entry>
</row>
<row>
<entry><literal>WIN874</literal></entry>
<entry><emphasis>WIN874</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1250</literal></entry>
<entry><emphasis>WIN1250</emphasis>,
<literal>LATIN2</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1251</literal></entry>
<entry><emphasis>WIN1251</emphasis>,
<literal>ISO_8859_5</literal>,
<literal>KOI8R</literal>,
<literal>MULE_INTERNAL</literal>,
<literal>UTF8</literal>,
<literal>WIN866</literal>
</entry>
</row>
<row>
<entry><literal>WIN1252</literal></entry>
<entry><emphasis>WIN1252</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1253</literal></entry>
<entry><emphasis>WIN1253</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1254</literal></entry>
<entry><emphasis>WIN1254</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1255</literal></entry>
<entry><emphasis>WIN1255</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1256</literal></entry>
<entry><emphasis>WIN1256</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1257</literal></entry>
<entry><emphasis>WIN1257</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
<row>
<entry><literal>WIN1258</literal></entry>
<entry><emphasis>WIN1258</emphasis>,
<literal>UTF8</literal>
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="builtin-conversions-table">
<title>All Built-in Character Set Conversions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Conversion Name
<footnote>
<para>
The conversion names follow a standard naming scheme: The
official name of the source encoding with all
non-alphanumeric characters replaced by underscores, followed
by <literal>_to_</literal>, followed by the similarly processed
destination encoding name. Therefore, these names sometimes
deviate from the customary encoding names shown in
<xref linkend="charset-table"/>.
</para>
</footnote>
</entry>
<entry>Source Encoding</entry>
<entry>Destination Encoding</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>big5_to_euc_tw</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<row>
<entry><literal>big5_to_mic</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>big5_to_utf8</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_cn_to_mic</literal></entry>
<entry><literal>EUC_CN</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>euc_cn_to_utf8</literal></entry>
<entry><literal>EUC_CN</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_jp_to_mic</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>euc_jp_to_sjis</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<row>
<entry><literal>euc_jp_to_utf8</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_kr_to_mic</literal></entry>
<entry><literal>EUC_KR</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>euc_kr_to_utf8</literal></entry>
<entry><literal>EUC_KR</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_tw_to_big5</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<row>
<entry><literal>euc_tw_to_mic</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>euc_tw_to_utf8</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>gb18030_to_utf8</literal></entry>
<entry><literal>GB18030</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>gbk_to_utf8</literal></entry>
<entry><literal>GBK</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_10_to_utf8</literal></entry>
<entry><literal>LATIN6</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_13_to_utf8</literal></entry>
<entry><literal>LATIN7</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_14_to_utf8</literal></entry>
<entry><literal>LATIN8</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_15_to_utf8</literal></entry>
<entry><literal>LATIN9</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_16_to_utf8</literal></entry>
<entry><literal>LATIN10</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_1_to_mic</literal></entry>
<entry><literal>LATIN1</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_1_to_utf8</literal></entry>
<entry><literal>LATIN1</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_mic</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_utf8</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_windows_1250</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>iso_8859_3_to_mic</literal></entry>
<entry><literal>LATIN3</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_3_to_utf8</literal></entry>
<entry><literal>LATIN3</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_4_to_mic</literal></entry>
<entry><literal>LATIN4</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_4_to_utf8</literal></entry>
<entry><literal>LATIN4</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_koi8_r</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_mic</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_utf8</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_windows_1251</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_windows_866</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>iso_8859_6_to_utf8</literal></entry>
<entry><literal>ISO_8859_6</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_7_to_utf8</literal></entry>
<entry><literal>ISO_8859_7</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_8_to_utf8</literal></entry>
<entry><literal>ISO_8859_8</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_9_to_utf8</literal></entry>
<entry><literal>LATIN5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>johab_to_utf8</literal></entry>
<entry><literal>JOHAB</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_iso_8859_5</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_mic</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_utf8</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_windows_1251</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_windows_866</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>koi8_u_to_utf8</literal></entry>
<entry><literal>KOI8U</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>mic_to_big5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_cn</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_CN</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_jp</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_kr</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_KR</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_tw</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_1</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN1</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_2</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_3</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN3</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_4</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN4</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>mic_to_koi8_r</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>mic_to_sjis</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_1250</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_1251</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_866</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>sjis_to_euc_jp</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>sjis_to_mic</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>sjis_to_utf8</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1258_to_utf8</literal></entry>
<entry><literal>WIN1258</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>uhc_to_utf8</literal></entry>
<entry><literal>UHC</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_big5</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_cn</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_CN</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_jp</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_kr</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_KR</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_tw</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<row>
<entry><literal>utf8_to_gb18030</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>GB18030</literal></entry>
</row>
<row>
<entry><literal>utf8_to_gbk</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>GBK</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_1</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN1</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_10</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN6</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_13</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN7</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_14</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_15</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN9</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_16</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN10</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_2</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_3</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN3</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_4</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN4</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_5</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_6</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_6</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_7</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_7</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_8</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_9</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_johab</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>JOHAB</literal></entry>
</row>
<row>
<entry><literal>utf8_to_koi8_r</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>utf8_to_koi8_u</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>KOI8U</literal></entry>
</row>
<row>
<entry><literal>utf8_to_sjis</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1258</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1258</literal></entry>
</row>
<row>
<entry><literal>utf8_to_uhc</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>UHC</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1250</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1251</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1252</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1252</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1253</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1253</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1254</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1254</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1255</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1255</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1256</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1256</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1257</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1257</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_866</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_874</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN874</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_iso_8859_2</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_mic</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_utf8</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_iso_8859_5</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_koi8_r</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_mic</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_utf8</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_windows_866</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>windows_1252_to_utf8</literal></entry>
<entry><literal>WIN1252</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1256_to_utf8</literal></entry>
<entry><literal>WIN1256</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_iso_8859_5</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_koi8_r</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_mic</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_utf8</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_windows_1251</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>WIN</literal></entry>
</row>
<row>
<entry><literal>windows_874_to_utf8</literal></entry>
<entry><literal>WIN874</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_jis_2004_to_utf8</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_jis_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>shift_jis_2004_to_utf8</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_shift_jis_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>euc_jis_2004_to_shift_jis_2004</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>shift_jis_2004_to_euc_jis_2004</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2>
<title>Further Reading</title>
......
......@@ -1455,8 +1455,7 @@
and <type>text</type>. Unless otherwise noted, all
of the functions listed below work on all of these types, but be
wary of potential effects of automatic space-padding when using the
<type>character</type> type. Some functions also exist
natively for the bit-string types.
<type>character</type> type.
</para>
<para>
......@@ -1628,7 +1627,8 @@
</entry>
<entry><type>text</type></entry>
<entry>
Extract substring
Extract substring (provide at least one of <literal>from</literal>
and <literal>for</literal>)
</entry>
<entry><literal>substring('Thomas' from 2 for 3)</literal></entry>
<entry><literal>hom</literal></entry>
......@@ -1638,9 +1638,9 @@
<entry><literal><function>substring(<parameter>string</parameter> from <replaceable>pattern</replaceable>)</function></literal></entry>
<entry><type>text</type></entry>
<entry>
Extract substring matching POSIX regular expression. See
Extract substring matching POSIX regular expression (see
<xref linkend="functions-matching"/> for more information on pattern
matching.
matching)
</entry>
<entry><literal>substring('Thomas' from '...$')</literal></entry>
<entry><literal>mas</literal></entry>
......@@ -1650,9 +1650,9 @@
<entry><literal><function>substring(<parameter>string</parameter> from <replaceable>pattern</replaceable> for <replaceable>escape</replaceable>)</function></literal></entry>
<entry><type>text</type></entry>
<entry>
Extract substring matching <acronym>SQL</acronym> regular expression.
See <xref linkend="functions-matching"/> for more information on
pattern matching.
Extract substring matching <acronym>SQL</acronym> regular expression
(see <xref linkend="functions-matching"/> for more information on
pattern matching)
</entry>
<entry><literal>substring('Thomas' from '%#"o_a#"_' for '#')</literal></entry>
<entry><literal>oma</literal></entry>
......@@ -1682,9 +1682,8 @@
<entry>
<literal><function>trim(<optional>leading | trailing
| both</optional> <optional>from</optional>
<parameter>string</parameter>
<optional>, <parameter>characters</parameter></optional>
)</function></literal>
<parameter>string</parameter> <optional>,
<parameter>characters</parameter></optional>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
......@@ -1822,101 +1821,6 @@
<row>
<entry>
<indexterm>
<primary>convert</primary>
</indexterm>
<literal><function>convert(<parameter>string</parameter> <type>bytea</type>,
<parameter>src_encoding</parameter> <type>name</type>,
<parameter>dest_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Convert string to <parameter>dest_encoding</parameter>. The
original encoding is specified by
<parameter>src_encoding</parameter>. The
<parameter>string</parameter> must be valid in this encoding.
Conversions can be defined by <command>CREATE CONVERSION</command>.
Also there are some predefined conversions. See <xref
linkend="conversion-names"/> for available conversions.
</entry>
<entry><literal>convert('text_in_utf8', 'UTF8', 'LATIN1')</literal></entry>
<entry><literal>text_in_utf8</literal> represented in Latin-1
encoding (ISO 8859-1)</entry>
</row>
<row>
<entry>
<indexterm>
<primary>convert_from</primary>
</indexterm>
<literal><function>convert_from(<parameter>string</parameter> <type>bytea</type>,
<parameter>src_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Convert string to the database encoding. The original encoding
is specified by <parameter>src_encoding</parameter>. The
<parameter>string</parameter> must be valid in this encoding.
</entry>
<entry><literal>convert_from('text_in_utf8', 'UTF8')</literal></entry>
<entry><literal>text_in_utf8</literal> represented in the current database encoding</entry>
</row>
<row>
<entry>
<indexterm>
<primary>convert_to</primary>
</indexterm>
<literal><function>convert_to(<parameter>string</parameter> <type>text</type>,
<parameter>dest_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Convert string to <parameter>dest_encoding</parameter>.
</entry>
<entry><literal>convert_to('some text', 'UTF8')</literal></entry>
<entry><literal>some text</literal> represented in the UTF8 encoding</entry>
</row>
<row>
<entry>
<indexterm>
<primary>decode</primary>
</indexterm>
<literal><function>decode(<parameter>string</parameter> <type>text</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Decode binary data from textual representation in <parameter>string</parameter>.
Options for <parameter>format</parameter> are same as in <function>encode</function>.
</entry>
<entry><literal>decode('MTIzAAE=', 'base64')</literal></entry>
<entry><literal>\x3132330001</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>encode</primary>
</indexterm>
<literal><function>encode(<parameter>data</parameter> <type>bytea</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Encode binary data into a textual representation. Supported
formats are: <literal>base64</literal>, <literal>hex</literal>, <literal>escape</literal>.
<literal>escape</literal> converts zero bytes and high-bit-set bytes to
octal sequences (<literal>\</literal><replaceable>nnn</replaceable>) and
doubles backslashes.
</entry>
<entry><literal>encode('123\000\001', 'base64')</literal></entry>
<entry><literal>MTIzAAE=</literal></entry>
</row>
<row>
<entry id="format">
<indexterm>
<primary>format</primary>
</indexterm>
......@@ -1955,13 +1859,14 @@
<indexterm>
<primary>left</primary>
</indexterm>
<literal><function>left(<parameter>str</parameter> <type>text</type>,
<literal><function>left(<parameter>string</parameter> <type>text</type>,
<parameter>n</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Return first <replaceable>n</replaceable> characters in the string. When <replaceable>n</replaceable>
is negative, return all but last |<replaceable>n</replaceable>| characters.
Return first <replaceable>n</replaceable> characters in the
string, or when <replaceable>n</replaceable> is negative, return
all but last |<replaceable>n</replaceable>| characters
</entry>
<entry><literal>left('abcde', 2)</literal></entry>
<entry><literal>ab</literal></entry>
......@@ -1982,19 +1887,6 @@
<entry><literal>4</literal></entry>
</row>
<row>
<entry><literal><function>length(<parameter>string</parameter> <type>bytea</type>,
<parameter>encoding</parameter> <type>name</type> )</function></literal></entry>
<entry><type>int</type></entry>
<entry>
Number of characters in <parameter>string</parameter> in the given
<parameter>encoding</parameter>. The <parameter>string</parameter>
must be valid in this encoding.
</entry>
<entry><literal>length('jose', 'UTF8')</literal></entry>
<entry><literal>4</literal></entry>
</row>
<row>
<entry>
<indexterm>
......@@ -2006,7 +1898,7 @@
</entry>
<entry><type>text</type></entry>
<entry>
Fill up the <parameter>string</parameter> to length
Extend the <parameter>string</parameter> to length
<parameter>length</parameter> by prepending the characters
<parameter>fill</parameter> (a space by default). If the
<parameter>string</parameter> is already longer than
......@@ -2044,8 +1936,8 @@
</entry>
<entry><type>text</type></entry>
<entry>
Calculates the MD5 hash of <parameter>string</parameter>,
returning the result in hexadecimal
MD5 <link linkend="functions-hash-note">hash</link>, with
the result written in hexadecimal
</entry>
<entry><literal>md5('abc')</literal></entry>
<entry><literal>900150983cd24fb0 d6963f7d28e17f72</literal></entry>
......@@ -2182,8 +2074,8 @@
<entry><type>text[]</type></entry>
<entry>
Return captured substring(s) resulting from the first match of a POSIX
regular expression to the <parameter>string</parameter>. See
<xref linkend="functions-posix-regexp"/> for more information.
regular expression to the <parameter>string</parameter> (see
<xref linkend="functions-posix-regexp"/> for more information)
</entry>
<entry><literal>regexp_match('foobarbequebaz', '(bar)(beque)')</literal></entry>
<entry><literal>{bar,beque}</literal></entry>
......@@ -2199,8 +2091,8 @@
<entry><type>setof text[]</type></entry>
<entry>
Return captured substring(s) resulting from matching a POSIX regular
expression to the <parameter>string</parameter>. See
<xref linkend="functions-posix-regexp"/> for more information.
expression to the <parameter>string</parameter> (see
<xref linkend="functions-posix-regexp"/> for more information)
</entry>
<entry><literal>regexp_matches('foobarbequebaz', 'ba.', 'g')</literal></entry>
<entry><literal>{bar}</literal><para><literal>{baz}</literal></para> (2 rows)</entry>
......@@ -2215,8 +2107,8 @@
</entry>
<entry><type>text</type></entry>
<entry>
Replace substring(s) matching a POSIX regular expression. See
<xref linkend="functions-posix-regexp"/> for more information.
Replace substring(s) matching a POSIX regular expression (see
<xref linkend="functions-posix-regexp"/> for more information)
</entry>
<entry><literal>regexp_replace('Thomas', '.[mN]a.', 'M')</literal></entry>
<entry><literal>ThM</literal></entry>
......@@ -2232,8 +2124,8 @@
<entry><type>text[]</type></entry>
<entry>
Split <parameter>string</parameter> using a POSIX regular expression as
the delimiter. See <xref linkend="functions-posix-regexp"/> for more
information.
the delimiter (see <xref linkend="functions-posix-regexp"/> for more
information)
</entry>
<entry><literal>regexp_split_to_array('hello world', '\s+')</literal></entry>
<entry><literal>{hello,world}</literal></entry>
......@@ -2249,8 +2141,8 @@
<entry><type>setof text</type></entry>
<entry>
Split <parameter>string</parameter> using a POSIX regular expression as
the delimiter. See <xref linkend="functions-posix-regexp"/> for more
information.
the delimiter (see <xref linkend="functions-posix-regexp"/> for more
information)
</entry>
<entry><literal>regexp_split_to_table('hello world', '\s+')</literal></entry>
<entry><literal>hello</literal><para><literal>world</literal></para> (2 rows)</entry>
......@@ -2296,7 +2188,7 @@
</entry>
<entry><type>text</type></entry>
<entry>
Return reversed string.
Reverse the order of the characters in <parameter>string</parameter>
</entry>
<entry><literal>reverse('abcde')</literal></entry>
<entry><literal>edcba</literal></entry>
......@@ -2307,13 +2199,14 @@
<indexterm>
<primary>right</primary>
</indexterm>
<literal><function>right(<parameter>str</parameter> <type>text</type>,
<literal><function>right(<parameter>string</parameter> <type>text</type>,
<parameter>n</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Return last <replaceable>n</replaceable> characters in the string. When <replaceable>n</replaceable>
is negative, return all but first |<replaceable>n</replaceable>| characters.
Return last <replaceable>n</replaceable> characters in the string,
or when <replaceable>n</replaceable> is negative, return all but
first |<replaceable>n</replaceable>| characters
</entry>
<entry><literal>right('abcde', 2)</literal></entry>
<entry><literal>de</literal></entry>
......@@ -2330,7 +2223,7 @@
</entry>
<entry><type>text</type></entry>
<entry>
Fill up the <parameter>string</parameter> to length
Extend the <parameter>string</parameter> to length
<parameter>length</parameter> by appending the characters
<parameter>fill</parameter> (a space by default). If the
<parameter>string</parameter> is already longer than
......@@ -2418,7 +2311,8 @@
</entry>
<entry><type>bool</type></entry>
<entry>
Returns true if <parameter>string</parameter> starts with <parameter>prefix</parameter>.
Return true if <parameter>string</parameter> starts
with <parameter>prefix</parameter>
</entry>
<entry><literal>starts_with('alphabet', 'alph')</literal></entry>
<entry><literal>t</literal></entry>
......@@ -2497,942 +2391,173 @@
</para>
<para>
See also the aggregate function <function>string_agg</function> in
<xref linkend="functions-aggregate"/>.
See also the aggregate function <function>string_agg</function> in
<xref linkend="functions-aggregate"/>, and the functions for
converting between strings and the <type>bytea</type> type in
<xref linkend="functions-binarystring-conversions"/>.
</para>
<table id="conversion-names">
<title>Built-in Conversions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Conversion Name
<footnote>
<para>
The conversion names follow a standard naming scheme: The
official name of the source encoding with all
non-alphanumeric characters replaced by underscores, followed
by <literal>_to_</literal>, followed by the similarly processed
destination encoding name. Therefore, the names might deviate
from the customary encoding names.
</para>
</footnote>
</entry>
<entry>Source Encoding</entry>
<entry>Destination Encoding</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>big5_to_euc_tw</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<sect2 id="functions-string-format">
<title><function>format</function></title>
<row>
<entry><literal>big5_to_mic</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<indexterm>
<primary>format</primary>
</indexterm>
<row>
<entry><literal>big5_to_utf8</literal></entry>
<entry><literal>BIG5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<para>
The function <function>format</function> produces output formatted according to
a format string, in a style similar to the C function
<function>sprintf</function>.
</para>
<row>
<entry><literal>euc_cn_to_mic</literal></entry>
<entry><literal>EUC_CN</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<para>
<synopsis>
<function>format</function>(<parameter>formatstr</parameter> <type>text</type> [, <parameter>formatarg</parameter> <type>"any"</type> [, ...] ])
</synopsis>
<replaceable>formatstr</replaceable> is a format string that specifies how the
result should be formatted. Text in the format string is copied
directly to the result, except where <firstterm>format specifiers</firstterm> are
used. Format specifiers act as placeholders in the string, defining how
subsequent function arguments should be formatted and inserted into the
result. Each <replaceable>formatarg</replaceable> argument is converted to text
according to the usual output rules for its data type, and then formatted
and inserted into the result string according to the format specifier(s).
</para>
<row>
<entry><literal>euc_cn_to_utf8</literal></entry>
<entry><literal>EUC_CN</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<para>
Format specifiers are introduced by a <literal>%</literal> character and have
the form
<synopsis>
%[<replaceable>position</replaceable>][<replaceable>flags</replaceable>][<replaceable>width</replaceable>]<replaceable>type</replaceable>
</synopsis>
where the component fields are:
<row>
<entry><literal>euc_jp_to_mic</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<variablelist>
<varlistentry>
<term><replaceable>position</replaceable> (optional)</term>
<listitem>
<para>
A string of the form <literal><replaceable>n</replaceable>$</literal> where
<replaceable>n</replaceable> is the index of the argument to print.
Index 1 means the first argument after
<replaceable>formatstr</replaceable>. If the <replaceable>position</replaceable> is
omitted, the default is to use the next argument in sequence.
</para>
</listitem>
</varlistentry>
<row>
<entry><literal>euc_jp_to_sjis</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<varlistentry>
<term><replaceable>flags</replaceable> (optional)</term>
<listitem>
<para>
Additional options controlling how the format specifier's output is
formatted. Currently the only supported flag is a minus sign
(<literal>-</literal>) which will cause the format specifier's output to be
left-justified. This has no effect unless the <replaceable>width</replaceable>
field is also specified.
</para>
</listitem>
</varlistentry>
<row>
<entry><literal>euc_jp_to_utf8</literal></entry>
<entry><literal>EUC_JP</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<varlistentry>
<term><replaceable>width</replaceable> (optional)</term>
<listitem>
<para>
Specifies the <emphasis>minimum</emphasis> number of characters to use to
display the format specifier's output. The output is padded on the
left or right (depending on the <literal>-</literal> flag) with spaces as
needed to fill the width. A too-small width does not cause
truncation of the output, but is simply ignored. The width may be
specified using any of the following: a positive integer; an
asterisk (<literal>*</literal>) to use the next function argument as the
width; or a string of the form <literal>*<replaceable>n</replaceable>$</literal> to
use the <replaceable>n</replaceable>th function argument as the width.
</para>
<row>
<entry><literal>euc_kr_to_mic</literal></entry>
<entry><literal>EUC_KR</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<para>
If the width comes from a function argument, that argument is
consumed before the argument that is used for the format specifier's
value. If the width argument is negative, the result is left
aligned (as if the <literal>-</literal> flag had been specified) within a
field of length <function>abs</function>(<replaceable>width</replaceable>).
</para>
</listitem>
</varlistentry>
<row>
<entry><literal>euc_kr_to_utf8</literal></entry>
<entry><literal>EUC_KR</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<varlistentry>
<term><replaceable>type</replaceable> (required)</term>
<listitem>
<para>
The type of format conversion to use to produce the format
specifier's output. The following types are supported:
<itemizedlist>
<listitem>
<para>
<literal>s</literal> formats the argument value as a simple
string. A null value is treated as an empty string.
</para>
</listitem>
<listitem>
<para>
<literal>I</literal> treats the argument value as an SQL
identifier, double-quoting it if necessary.
It is an error for the value to be null (equivalent to
<function>quote_ident</function>).
</para>
</listitem>
<listitem>
<para>
<literal>L</literal> quotes the argument value as an SQL literal.
A null value is displayed as the string <literal>NULL</literal>, without
quotes (equivalent to <function>quote_nullable</function>).
</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<row>
<entry><literal>euc_tw_to_big5</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<para>
In addition to the format specifiers described above, the special sequence
<literal>%%</literal> may be used to output a literal <literal>%</literal> character.
</para>
<row>
<entry><literal>euc_tw_to_mic</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<para>
Here are some examples of the basic format conversions:
<row>
<entry><literal>euc_tw_to_utf8</literal></entry>
<entry><literal>EUC_TW</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<screen>
SELECT format('Hello %s', 'World');
<lineannotation>Result: </lineannotation><computeroutput>Hello World</computeroutput>
<row>
<entry><literal>gb18030_to_utf8</literal></entry>
<entry><literal>GB18030</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
SELECT format('Testing %s, %s, %s, %%', 'one', 'two', 'three');
<lineannotation>Result: </lineannotation><computeroutput>Testing one, two, three, %</computeroutput>
<row>
<entry><literal>gbk_to_utf8</literal></entry>
<entry><literal>GBK</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
SELECT format('INSERT INTO %I VALUES(%L)', 'Foo bar', E'O\'Reilly');
<lineannotation>Result: </lineannotation><computeroutput>INSERT INTO "Foo bar" VALUES('O''Reilly')</computeroutput>
<row>
<entry><literal>iso_8859_10_to_utf8</literal></entry>
<entry><literal>LATIN6</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
SELECT format('INSERT INTO %I VALUES(%L)', 'locations', 'C:\Program Files');
<lineannotation>Result: </lineannotation><computeroutput>INSERT INTO locations VALUES('C:\Program Files')</computeroutput>
</screen>
</para>
<row>
<entry><literal>iso_8859_13_to_utf8</literal></entry>
<entry><literal>LATIN7</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<para>
Here are examples using <replaceable>width</replaceable> fields
and the <literal>-</literal> flag:
<row>
<entry><literal>iso_8859_14_to_utf8</literal></entry>
<entry><literal>LATIN8</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<screen>
SELECT format('|%10s|', 'foo');
<lineannotation>Result: </lineannotation><computeroutput>| foo|</computeroutput>
<row>
<entry><literal>iso_8859_15_to_utf8</literal></entry>
<entry><literal>LATIN9</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
SELECT format('|%-10s|', 'foo');
<lineannotation>Result: </lineannotation><computeroutput>|foo |</computeroutput>
<row>
<entry><literal>iso_8859_16_to_utf8</literal></entry>
<entry><literal>LATIN10</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
SELECT format('|%*s|', 10, 'foo');
<lineannotation>Result: </lineannotation><computeroutput>| foo|</computeroutput>
<row>
<entry><literal>iso_8859_1_to_mic</literal></entry>
<entry><literal>LATIN1</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_1_to_utf8</literal></entry>
<entry><literal>LATIN1</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_mic</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_utf8</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_2_to_windows_1250</literal></entry>
<entry><literal>LATIN2</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>iso_8859_3_to_mic</literal></entry>
<entry><literal>LATIN3</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_3_to_utf8</literal></entry>
<entry><literal>LATIN3</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_4_to_mic</literal></entry>
<entry><literal>LATIN4</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_4_to_utf8</literal></entry>
<entry><literal>LATIN4</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_koi8_r</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_mic</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_utf8</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_windows_1251</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>iso_8859_5_to_windows_866</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>iso_8859_6_to_utf8</literal></entry>
<entry><literal>ISO_8859_6</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_7_to_utf8</literal></entry>
<entry><literal>ISO_8859_7</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_8_to_utf8</literal></entry>
<entry><literal>ISO_8859_8</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>iso_8859_9_to_utf8</literal></entry>
<entry><literal>LATIN5</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>johab_to_utf8</literal></entry>
<entry><literal>JOHAB</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_iso_8859_5</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_mic</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_utf8</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_windows_1251</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>koi8_r_to_windows_866</literal></entry>
<entry><literal>KOI8R</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>koi8_u_to_utf8</literal></entry>
<entry><literal>KOI8U</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>mic_to_big5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_cn</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_CN</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_jp</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_kr</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_KR</literal></entry>
</row>
<row>
<entry><literal>mic_to_euc_tw</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_1</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN1</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_2</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_3</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN3</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_4</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>LATIN4</literal></entry>
</row>
<row>
<entry><literal>mic_to_iso_8859_5</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>mic_to_koi8_r</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>mic_to_sjis</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_1250</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_1251</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>mic_to_windows_866</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>sjis_to_euc_jp</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>sjis_to_mic</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>sjis_to_utf8</literal></entry>
<entry><literal>SJIS</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1258_to_utf8</literal></entry>
<entry><literal>WIN1258</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>uhc_to_utf8</literal></entry>
<entry><literal>UHC</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_big5</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>BIG5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_cn</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_CN</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_jp</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_JP</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_kr</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_KR</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_tw</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_TW</literal></entry>
</row>
<row>
<entry><literal>utf8_to_gb18030</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>GB18030</literal></entry>
</row>
<row>
<entry><literal>utf8_to_gbk</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>GBK</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_1</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN1</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_10</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN6</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_13</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN7</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_14</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_15</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN9</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_16</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN10</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_2</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_3</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN3</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_4</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN4</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_5</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_6</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_6</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_7</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_7</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_8</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>ISO_8859_8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_iso_8859_9</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>LATIN5</literal></entry>
</row>
<row>
<entry><literal>utf8_to_johab</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>JOHAB</literal></entry>
</row>
<row>
<entry><literal>utf8_to_koi8_r</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>utf8_to_koi8_u</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>KOI8U</literal></entry>
</row>
<row>
<entry><literal>utf8_to_sjis</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>SJIS</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1258</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1258</literal></entry>
</row>
<row>
<entry><literal>utf8_to_uhc</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>UHC</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1250</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1250</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1251</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1251</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1252</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1252</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1253</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1253</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1254</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1254</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1255</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1255</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1256</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1256</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_1257</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN1257</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_866</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>utf8_to_windows_874</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>WIN874</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_iso_8859_2</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>LATIN2</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_mic</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_1250_to_utf8</literal></entry>
<entry><literal>WIN1250</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_iso_8859_5</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_koi8_r</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_mic</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_utf8</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1251_to_windows_866</literal></entry>
<entry><literal>WIN1251</literal></entry>
<entry><literal>WIN866</literal></entry>
</row>
<row>
<entry><literal>windows_1252_to_utf8</literal></entry>
<entry><literal>WIN1252</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_1256_to_utf8</literal></entry>
<entry><literal>WIN1256</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_iso_8859_5</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>ISO_8859_5</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_koi8_r</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>KOI8R</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_mic</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>MULE_INTERNAL</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_utf8</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>windows_866_to_windows_1251</literal></entry>
<entry><literal>WIN866</literal></entry>
<entry><literal>WIN</literal></entry>
</row>
<row>
<entry><literal>windows_874_to_utf8</literal></entry>
<entry><literal>WIN874</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>euc_jis_2004_to_utf8</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_euc_jis_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>shift_jis_2004_to_utf8</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
</row>
<row>
<entry><literal>utf8_to_shift_jis_2004</literal></entry>
<entry><literal>UTF8</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>euc_jis_2004_to_shift_jis_2004</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
</row>
<row>
<entry><literal>shift_jis_2004_to_euc_jis_2004</literal></entry>
<entry><literal>SHIFT_JIS_2004</literal></entry>
<entry><literal>EUC_JIS_2004</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<sect2 id="functions-string-format">
<title><function>format</function></title>
<indexterm>
<primary>format</primary>
</indexterm>
<para>
The function <function>format</function> produces output formatted according to
a format string, in a style similar to the C function
<function>sprintf</function>.
</para>
<para>
<synopsis>
<function>format</function>(<parameter>formatstr</parameter> <type>text</type> [, <parameter>formatarg</parameter> <type>"any"</type> [, ...] ])
</synopsis>
<replaceable>formatstr</replaceable> is a format string that specifies how the
result should be formatted. Text in the format string is copied
directly to the result, except where <firstterm>format specifiers</firstterm> are
used. Format specifiers act as placeholders in the string, defining how
subsequent function arguments should be formatted and inserted into the
result. Each <replaceable>formatarg</replaceable> argument is converted to text
according to the usual output rules for its data type, and then formatted
and inserted into the result string according to the format specifier(s).
</para>
<para>
Format specifiers are introduced by a <literal>%</literal> character and have
the form
<synopsis>
%[<replaceable>position</replaceable>][<replaceable>flags</replaceable>][<replaceable>width</replaceable>]<replaceable>type</replaceable>
</synopsis>
where the component fields are:
<variablelist>
<varlistentry>
<term><replaceable>position</replaceable> (optional)</term>
<listitem>
<para>
A string of the form <literal><replaceable>n</replaceable>$</literal> where
<replaceable>n</replaceable> is the index of the argument to print.
Index 1 means the first argument after
<replaceable>formatstr</replaceable>. If the <replaceable>position</replaceable> is
omitted, the default is to use the next argument in sequence.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable>flags</replaceable> (optional)</term>
<listitem>
<para>
Additional options controlling how the format specifier's output is
formatted. Currently the only supported flag is a minus sign
(<literal>-</literal>) which will cause the format specifier's output to be
left-justified. This has no effect unless the <replaceable>width</replaceable>
field is also specified.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable>width</replaceable> (optional)</term>
<listitem>
<para>
Specifies the <emphasis>minimum</emphasis> number of characters to use to
display the format specifier's output. The output is padded on the
left or right (depending on the <literal>-</literal> flag) with spaces as
needed to fill the width. A too-small width does not cause
truncation of the output, but is simply ignored. The width may be
specified using any of the following: a positive integer; an
asterisk (<literal>*</literal>) to use the next function argument as the
width; or a string of the form <literal>*<replaceable>n</replaceable>$</literal> to
use the <replaceable>n</replaceable>th function argument as the width.
</para>
<para>
If the width comes from a function argument, that argument is
consumed before the argument that is used for the format specifier's
value. If the width argument is negative, the result is left
aligned (as if the <literal>-</literal> flag had been specified) within a
field of length <function>abs</function>(<replaceable>width</replaceable>).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable>type</replaceable> (required)</term>
<listitem>
<para>
The type of format conversion to use to produce the format
specifier's output. The following types are supported:
<itemizedlist>
<listitem>
<para>
<literal>s</literal> formats the argument value as a simple
string. A null value is treated as an empty string.
</para>
</listitem>
<listitem>
<para>
<literal>I</literal> treats the argument value as an SQL
identifier, double-quoting it if necessary.
It is an error for the value to be null (equivalent to
<function>quote_ident</function>).
</para>
</listitem>
<listitem>
<para>
<literal>L</literal> quotes the argument value as an SQL literal.
A null value is displayed as the string <literal>NULL</literal>, without
quotes (equivalent to <function>quote_nullable</function>).
</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
In addition to the format specifiers described above, the special sequence
<literal>%%</literal> may be used to output a literal <literal>%</literal> character.
</para>
<para>
Here are some examples of the basic format conversions:
<screen>
SELECT format('Hello %s', 'World');
<lineannotation>Result: </lineannotation><computeroutput>Hello World</computeroutput>
SELECT format('Testing %s, %s, %s, %%', 'one', 'two', 'three');
<lineannotation>Result: </lineannotation><computeroutput>Testing one, two, three, %</computeroutput>
SELECT format('INSERT INTO %I VALUES(%L)', 'Foo bar', E'O\'Reilly');
<lineannotation>Result: </lineannotation><computeroutput>INSERT INTO "Foo bar" VALUES('O''Reilly')</computeroutput>
SELECT format('INSERT INTO %I VALUES(%L)', 'locations', 'C:\Program Files');
<lineannotation>Result: </lineannotation><computeroutput>INSERT INTO locations VALUES('C:\Program Files')</computeroutput>
</screen>
</para>
<para>
Here are examples using <replaceable>width</replaceable> fields
and the <literal>-</literal> flag:
<screen>
SELECT format('|%10s|', 'foo');
<lineannotation>Result: </lineannotation><computeroutput>| foo|</computeroutput>
SELECT format('|%-10s|', 'foo');
<lineannotation>Result: </lineannotation><computeroutput>|foo |</computeroutput>
SELECT format('|%*s|', 10, 'foo');
<lineannotation>Result: </lineannotation><computeroutput>| foo|</computeroutput>
SELECT format('|%*s|', -10, 'foo');
<lineannotation>Result: </lineannotation><computeroutput>|foo |</computeroutput>
SELECT format('|%*s|', -10, 'foo');
<lineannotation>Result: </lineannotation><computeroutput>|foo |</computeroutput>
SELECT format('|%-*s|', 10, 'foo');
<lineannotation>Result: </lineannotation><computeroutput>|foo |</computeroutput>
......@@ -3494,7 +2619,9 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<para>
This section describes functions and operators for examining and
manipulating values of type <type>bytea</type>.
manipulating binary strings, that is values of type <type>bytea</type>.
Many of these are equivalent, in purpose and syntax, to the
text-string functions described in the previous section.
</para>
<para>
......@@ -3512,6 +2639,8 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
The sample results shown on this page assume that the server parameter
<link linkend="guc-bytea-output"><varname>bytea_output</varname></link> is set
to <literal>escape</literal> (the traditional PostgreSQL format).
The output will look different when using the default setting
(<literal>hex</literal>).
</para>
</note>
......@@ -3530,11 +2659,11 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<tbody>
<row>
<entry><literal><parameter>string</parameter> <literal>||</literal>
<parameter>string</parameter></literal></entry>
<entry><literal><parameter>bytes</parameter> <literal>||</literal>
<parameter>bytes</parameter></literal></entry>
<entry> <type>bytea</type> </entry>
<entry>
String concatenation
Binary string concatenation
<indexterm>
<primary>binary string</primary>
<secondary>concatenation</secondary>
......@@ -3544,12 +2673,25 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<entry><literal>\\Post'gres\000</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>bit_length</primary>
</indexterm>
<literal><function>bit_length(<parameter>bytes</parameter>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>Number of bits in binary string</entry>
<entry><literal>bit_length('jo\000se':bytea)</literal></entry>
<entry><literal>40</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>octet_length</primary>
</indexterm>
<literal><function>octet_length(<parameter>string</parameter>)</function></literal>
<literal><function>octet_length(<parameter>bytes</parameter>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>Number of bytes in binary string</entry>
......@@ -3562,7 +2704,7 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>overlay</primary>
</indexterm>
<literal><function>overlay(<parameter>string</parameter> placing <parameter>string</parameter> from <type>int</type> <optional>for <type>int</type></optional>)</function></literal>
<literal><function>overlay(<parameter>bytes</parameter> placing <parameter>bytes</parameter> from <type>int</type> <optional>for <type>int</type></optional>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
......@@ -3577,7 +2719,7 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>position</primary>
</indexterm>
<literal><function>position(<parameter>substring</parameter> in <parameter>string</parameter>)</function></literal>
<literal><function>position(<parameter>bytesubstring</parameter> in <parameter>bytes</parameter>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>Location of specified substring</entry>
......@@ -3590,11 +2732,12 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>substring</primary>
</indexterm>
<literal><function>substring(<parameter>string</parameter> <optional>from <type>int</type></optional> <optional>for <type>int</type></optional>)</function></literal>
<literal><function>substring(<parameter>bytes</parameter> <optional>from <type>int</type></optional> <optional>for <type>int</type></optional>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Extract substring
Extract substring (provide at least one of <literal>from</literal>
and <literal>for</literal>)
</entry>
<entry><literal>substring('Th\000omas'::bytea from 2 for 3)</literal></entry>
<entry><literal>h\000o</literal></entry>
......@@ -3606,18 +2749,32 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<primary>trim</primary>
</indexterm>
<literal><function>trim(<optional>both</optional>
<parameter>bytes</parameter> from
<parameter>string</parameter>)</function></literal>
<parameter>bytesremoved</parameter> from
<parameter>bytes</parameter>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Remove the longest string containing only bytes appearing in
<parameter>bytes</parameter> from the start
and end of <parameter>string</parameter>
<parameter>bytesremoved</parameter> from the start
and end of <parameter>bytes</parameter>
</entry>
<entry><literal>trim('\000\001'::bytea from '\000Tom\001'::bytea)</literal></entry>
<entry><literal>Tom</literal></entry>
</row>
<row>
<entry>
<literal><function>trim(<optional>both</optional> <optional>from</optional>
<parameter>bytes</parameter>,
<parameter>bytesremoved</parameter>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Non-standard syntax for <function>trim()</function>
</entry>
<entry><literal>trim(both from 'yxTomxx'::bytea, 'xyz'::bytea)</literal></entry>
<entry><literal>Tom</literal></entry>
</row>
</tbody>
</tgroup>
</table>
......@@ -3649,66 +2806,30 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>btrim</primary>
</indexterm>
<literal><function>btrim(<parameter>string</parameter>
<type>bytea</type>, <parameter>bytes</parameter> <type>bytea</type>)</function></literal>
<literal><function>btrim(<parameter>bytes</parameter>
<type>bytea</type>, <parameter>bytesremoved</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Remove the longest string containing only bytes appearing in
<parameter>bytes</parameter> from the start and end of
<parameter>string</parameter>
</entry>
<parameter>bytesremoved</parameter> from the start and end of
<parameter>bytes</parameter>
</entry>
<entry><literal>btrim('\000trim\001'::bytea, '\000\001'::bytea)</literal></entry>
<entry><literal>trim</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>decode</primary>
</indexterm>
<literal><function>decode(<parameter>string</parameter> <type>text</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Decode binary data from textual representation in <parameter>string</parameter>.
Options for <parameter>format</parameter> are same as in <function>encode</function>.
</entry>
<entry><literal>decode('123\000456', 'escape')</literal></entry>
<entry><literal>123\000456</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>encode</primary>
</indexterm>
<literal><function>encode(<parameter>data</parameter> <type>bytea</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Encode binary data into a textual representation. Supported
formats are: <literal>base64</literal>, <literal>hex</literal>, <literal>escape</literal>.
<literal>escape</literal> converts zero bytes and high-bit-set bytes to
octal sequences (<literal>\</literal><replaceable>nnn</replaceable>) and
doubles backslashes.
</entry>
<entry><literal>encode('123\000456'::bytea, 'escape')</literal></entry>
<entry><literal>123\000456</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>get_bit</primary>
</indexterm>
<literal><function>get_bit(<parameter>string</parameter>, <parameter>offset</parameter>)</function></literal>
<literal><function>get_bit(<parameter>bytes</parameter> <type>bytea</type>, <parameter>offset</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>
Extract bit from string
Extract <link linkend="functions-zerobased-note">n'th</link> bit
from binary string
</entry>
<entry><literal>get_bit('Th\000omas'::bytea, 45)</literal></entry>
<entry><literal>1</literal></entry>
......@@ -3719,67 +2840,82 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>get_byte</primary>
</indexterm>
<literal><function>get_byte(<parameter>string</parameter>, <parameter>offset</parameter>)</function></literal>
<literal><function>get_byte(<parameter>bytes</parameter> <type>bytea</type>, <parameter>offset</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>
Extract byte from string
Extract <link linkend="functions-zerobased-note">n'th</link> byte
from binary string
</entry>
<entry><literal>get_byte('Th\000omas'::bytea, 4)</literal></entry>
<entry><literal>109</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>length</primary>
</indexterm>
<literal><function>length(<parameter>string</parameter>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>
Length of binary string
<indexterm>
<primary>binary string</primary>
<secondary>length</secondary>
</indexterm>
<indexterm>
<primary>length</primary>
<secondary sortas="binary string">of a binary string</secondary>
<see>binary strings, length</see>
</indexterm>
</entry>
<entry><literal>length('jo\000se'::bytea)</literal></entry>
<entry><literal>5</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>length</primary>
</indexterm>
<literal><function>length(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>int</type></entry>
<entry>
Number of bytes in binary string
<indexterm>
<primary>binary string</primary>
<secondary>length</secondary>
</indexterm>
<indexterm>
<primary>length</primary>
<secondary sortas="binary string">of a binary string</secondary>
<see>binary strings, length</see>
</indexterm>
</entry>
<entry><literal>length('jo\000se'::bytea)</literal></entry>
<entry><literal>5</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>md5</primary>
</indexterm>
<literal><function>md5(<parameter>string</parameter>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Calculates the MD5 hash of <parameter>string</parameter>,
returning the result in hexadecimal
</entry>
<entry><literal>md5('Th\000omas'::bytea)</literal></entry>
<entry><literal>8ab2d3c9689aaf18&#x200B;b4958c334c82d8b1</literal></entry>
</row>
<row>
<entry><literal><function>length(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>encoding</parameter> <type>name</type>)</function></literal></entry>
<entry><type>int</type></entry>
<entry>
Number of characters in <parameter>bytes</parameter>, assuming
that it is text in the given <parameter>encoding</parameter>
</entry>
<entry><literal>length('jose'::bytea, 'UTF8')</literal></entry>
<entry><literal>4</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>md5</primary>
</indexterm>
<literal><function>md5(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
MD5 <link linkend="functions-hash-note">hash</link>, with
the result written in hexadecimal
</entry>
<entry><literal>md5('Th\000omas'::bytea)</literal></entry>
<entry><literal>8ab2d3c9689aaf18&#x200B;b4958c334c82d8b1</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>set_bit</primary>
</indexterm>
<literal><function>set_bit(<parameter>string</parameter>,
<parameter>offset</parameter>, <parameter>newvalue</parameter>)</function></literal>
<literal><function>set_bit(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>offset</parameter> <type>int</type>,
<parameter>newvalue</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Set bit in string
Set <link linkend="functions-zerobased-note">n'th</link> bit in
binary string
</entry>
<entry><literal>set_bit('Th\000omas'::bytea, 45, 0)</literal></entry>
<entry><literal>Th\000omAs</literal></entry>
......@@ -3790,12 +2926,14 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>set_byte</primary>
</indexterm>
<literal><function>set_byte(<parameter>string</parameter>,
<parameter>offset</parameter>, <parameter>newvalue</parameter>)</function></literal>
<literal><function>set_byte(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>offset</parameter> <type>int</type>,
<parameter>newvalue</parameter> <type>int</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Set byte in string
Set <link linkend="functions-zerobased-note">n'th</link> byte in
binary string
</entry>
<entry><literal>set_byte('Th\000omas'::bytea, 4, 64)</literal></entry>
<entry><literal>Th\000o@as</literal></entry>
......@@ -3806,13 +2944,13 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>sha224</primary>
</indexterm>
<literal><function>sha224(<type>bytea</type>)</function></literal>
<literal><function>sha224(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
SHA-224 hash
SHA-224 <link linkend="functions-hash-note">hash</link>
</entry>
<entry><literal>sha224('abc')</literal></entry>
<entry><literal>sha224('abc'::bytea)</literal></entry>
<entry><literal>\x23097d223405d8228642a477bda2&#x200B;55b32aadbce4bda0b3f7e36c9da7</literal></entry>
</row>
......@@ -3821,13 +2959,13 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>sha256</primary>
</indexterm>
<literal><function>sha256(<type>bytea</type>)</function></literal>
<literal><function>sha256(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
SHA-256 hash
SHA-256 <link linkend="functions-hash-note">hash</link>
</entry>
<entry><literal>sha256('abc')</literal></entry>
<entry><literal>sha256('abc'::bytea)</literal></entry>
<entry><literal>\xba7816bf8f01cfea414140de5dae2223&#x200B;b00361a396177a9cb410ff61f20015ad</literal></entry>
</row>
......@@ -3836,13 +2974,13 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>sha384</primary>
</indexterm>
<literal><function>sha384(<type>bytea</type>)</function></literal>
<literal><function>sha384(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
SHA-384 hash
SHA-384 <link linkend="functions-hash-note">hash</link>
</entry>
<entry><literal>sha384('abc')</literal></entry>
<entry><literal>sha384('abc'::bytea)</literal></entry>
<entry><literal>\xcb00753f45a35e8bb5a03d699ac65007&#x200B;272c32ab0eded1631a8b605a43ff5bed&#x200B;8086072ba1e7cc2358baeca134c825a7</literal></entry>
</row>
......@@ -3851,34 +2989,258 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
<indexterm>
<primary>sha512</primary>
</indexterm>
<literal><function>sha512(<type>bytea</type>)</function></literal>
<literal><function>sha512(<parameter>bytes</parameter> <type>bytea</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
SHA-512 hash
SHA-512 <link linkend="functions-hash-note">hash</link>
</entry>
<entry><literal>sha512('abc')</literal></entry>
<entry><literal>sha512('abc'::bytea)</literal></entry>
<entry><literal>\xddaf35a193617abacc417349ae204131&#x200B;12e6fa4e89a97ea20a9eeee64b55d39a&#x200B;2192992a274fc1a836ba3c23a3feebbd&#x200B;454d4423643ce80e2a9ac94fa54ca49f</literal></entry>
</row>
<row>
<entry>
<indexterm>
<primary>substr</primary>
</indexterm>
<literal><function>substr(<parameter>bytes</parameter> <type>bytea</type>, <parameter>from</parameter> <type>int</type> <optional>, <parameter>count</parameter> <type>int</type></optional>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Extract substring (same as
<literal>substring(<parameter>bytea</parameter> from <parameter>from</parameter> for <parameter>count</parameter>)</literal>)
</entry>
<entry><literal>substr('alphabet', 3, 2)</literal></entry>
<entry><literal>ph</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<function>get_byte</function> and <function>set_byte</function> number the first byte
of a binary string as byte 0.
<function>get_bit</function> and <function>set_bit</function> number bits from the
right within each byte; for example bit 0 is the least significant bit of
the first byte, and bit 15 is the most significant bit of the second byte.
<para id="functions-zerobased-note">
Functions <function>get_byte</function> and <function>set_byte</function>
number the first byte of a binary string as byte 0.
Functions <function>get_bit</function> and <function>set_bit</function>
number bits from the right within each byte; for example bit 0 is the least
significant bit of the first byte, and bit 15 is the most significant bit
of the second byte.
</para>
<para>
Note that for historic reasons, the function <function>md5</function>
<para id="functions-hash-note">
For historical reasons, the function <function>md5</function>
returns a hex-encoded value of type <type>text</type> whereas the SHA-2
functions return type <type>bytea</type>. Use the functions
<function>encode</function> and <function>decode</function> to convert
between the two, for example <literal>encode(sha256('abc'),
'hex')</literal> to get a hex-encoded text representation.
<link linkend="function-encode"><function>encode</function></link>
and <link linkend="function-decode"><function>decode</function></link> to
convert between the two. For example write <literal>encode(sha256('abc'),
'hex')</literal> to get a hex-encoded text representation,
or <literal>decode(md5('abc'), 'hex')</literal> to get
a <type>bytea</type> value.
</para>
<para>
<indexterm>
<primary>character string</primary>
<secondary>converting to binary string</secondary>
</indexterm>
<indexterm>
<primary>binary string</primary>
<secondary>converting to character string</secondary>
</indexterm>
Functions for converting strings between different character sets
(encodings), and for representing arbitrary binary data in textual
form, are shown in
<xref linkend="functions-binarystring-conversions"/>. For these
functions, an argument or result of type <type>text</type> is expressed
in the database's default encoding, while arguments or results of
type <type>bytea</type> are in an encoding named by another argument.
</para>
<table id="functions-binarystring-conversions">
<title>Text/Binary String Conversion Functions</title>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<indexterm>
<primary>convert</primary>
</indexterm>
<literal><function>convert(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>src_encoding</parameter> <type>name</type>,
<parameter>dest_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Convert binary string representing text in
encoding <parameter>src_encoding</parameter>
to a binary string in encoding <parameter>dest_encoding</parameter>
(see <xref linkend="multibyte-conversions-supported"/> for
available conversions)
</entry>
<entry><literal>convert('text_in_utf8', 'UTF8', 'LATIN1')</literal></entry>
<entry><literal>text_in_utf8</literal> represented in Latin-1 encoding</entry>
</row>
<row>
<entry>
<indexterm>
<primary>convert_from</primary>
</indexterm>
<literal><function>convert_from(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>src_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Convert binary string representing text in
encoding <parameter>src_encoding</parameter>
to <type>text</type> in the database encoding
(see <xref linkend="multibyte-conversions-supported"/> for
available conversions)
</entry>
<entry><literal>convert_from('text_in_utf8', 'UTF8')</literal></entry>
<entry><literal>text_in_utf8</literal> represented in the
database encoding</entry>
</row>
<row>
<entry>
<indexterm>
<primary>convert_to</primary>
</indexterm>
<literal><function>convert_to(<parameter>string</parameter> <type>text</type>,
<parameter>dest_encoding</parameter> <type>name</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Convert <type>text</type> (in the database encoding) to a binary
string encoded in encoding <parameter>dest_encoding</parameter>
(see <xref linkend="multibyte-conversions-supported"/> for
available conversions)
</entry>
<entry><literal>convert_to('some text', 'UTF8')</literal></entry>
<entry><literal>some text</literal> represented in UTF8 encoding</entry>
</row>
<row id="function-encode">
<entry>
<indexterm>
<primary>encode</primary>
</indexterm>
<literal><function>encode(<parameter>bytes</parameter> <type>bytea</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>text</type></entry>
<entry>
Encode binary data into a textual representation; supported
<parameter>format</parameter> values are:
<link linkend="encode-format-base64"><literal>base64</literal></link>,
<link linkend="encode-format-escape"><literal>escape</literal></link>,
<link linkend="encode-format-hex"><literal>hex</literal></link>
</entry>
<entry><literal>encode('123\000\001', 'base64')</literal></entry>
<entry><literal>MTIzAAE=</literal></entry>
</row>
<row id="function-decode">
<entry>
<indexterm>
<primary>decode</primary>
</indexterm>
<literal><function>decode(<parameter>string</parameter> <type>text</type>,
<parameter>format</parameter> <type>text</type>)</function></literal>
</entry>
<entry><type>bytea</type></entry>
<entry>
Decode binary data from a textual representation given
in <parameter>string</parameter>; allowed
<parameter>format</parameter> values are the same as
for <function>encode</function>
</entry>
<entry><literal>decode('MTIzAAE=', 'base64')</literal></entry>
<entry><literal>\x3132330001</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
The <function>encode</function> and <function>decode</function>
functions support the following textual formats:
<variablelist>
<varlistentry id="encode-format-base64">
<term>base64
<indexterm>
<primary>base64 format</primary>
</indexterm></term>
<listitem>
<para>
The <literal>base64</literal> format is that
of <ulink url="https://tools.ietf.org/html/rfc2045#section-6.8">RFC
2045 Section 6.8</ulink>. As per the RFC, encoded lines are
broken at 76 characters. However instead of the MIME CRLF
end-of-line marker, only a newline is used for end-of-line.
The <function>decode</function> function ignores carriage-return,
newline, space, and tab characters. Otherwise, an error is
raised when <function>decode</function> is supplied invalid
base64 data &mdash; including when trailing padding is incorrect.
</para>
</listitem>
</varlistentry>
<varlistentry id="encode-format-escape">
<term>escape
<indexterm>
<primary>escape format</primary>
</indexterm></term>
<listitem>
<para>
The <literal>escape</literal> format converts zero bytes and
bytes with the high bit set into octal escape sequences
(<literal>\</literal><replaceable>nnn</replaceable>), and it doubles
backslashes. Other byte values are represented literally.
The <function>decode</function> function will raise an error if a
backslash is not followed by either a second backslash or three
octal digits; it accepts other byte values unchanged.
</para>
</listitem>
</varlistentry>
<varlistentry id="encode-format-hex">
<term>hex
<indexterm>
<primary>hex format</primary>
</indexterm></term>
<listitem>
<para>
The <literal>hex</literal> format represents each 4 bits of
data as one hexadecimal digit, <literal>0</literal>
through <literal>f</literal>, writing the higher-order digit of
each byte first. The <function>encode</function> function outputs
the <literal>a</literal>-<literal>f</literal> hex digits in lower
case. Because the smallest unit of data is 8 bits, there are
always an even number of characters returned
by <function>encode</function>.
The <function>decode</function> function
accepts the <literal>a</literal>-<literal>f</literal> characters in
either upper or lower case. An error is raised
when <function>decode</function> is given invalid hex data
&mdash; including when given an odd number of characters.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
......@@ -3976,22 +3338,20 @@ SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
</table>
<para>
The following <acronym>SQL</acronym>-standard functions work on bit
strings as well as character strings:
Some of the binary-string functions shown in
<xref linkend="functions-binarystring-sql"/> and
<xref linkend="functions-binarystring-other"/> are also available
for bit strings, specifically:
<literal><function>length</function></literal>,
<literal><function>bit_length</function></literal>,
<literal><function>octet_length</function></literal>,
<literal><function>position</function></literal>,
<literal><function>substring</function></literal>,
<literal><function>overlay</function></literal>.
</para>
<para>
The following functions work on bit strings as well as binary
strings:
<literal><function>overlay</function></literal>,
<literal><function>get_bit</function></literal>,
<literal><function>set_bit</function></literal>.
When working with a bit string, these functions number the first
When working with a bit string, <function>get_bit</function>
and <function>set_bit</function> number the first
(leftmost) bit of the string as bit 0.
</para>
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment