Convert more charset/locale documentation to DocBook

Commit 0ba77c14 authored Sep 30, 2000 by Peter Eisentraut
7 changed files
--- a/doc/README.charsets
+++ b/doc/README.charsets
-  PostgreSQL Charsets README
-  Josef Balatka, <balatka@email.cz>
-  Draft v0.1, Tue Jul 20 15:49:07 CEST 1999
-  This document is a brief overview of the national charsets support
-  that PostgreSQL ver. 6.5 has implemented. Various compilation options
-  and setup tips are mentioned here to be helpful in the particular use.
-  ---------------------------------------------------------------------------
-  Table of Contents
-  1. Locale awareness
-  2. Single-byte charsets recoding
-  3. Multi-byte support/recoding
-  4. Credits
-  ---------------------------------------------------------------------------
-  1. Locale awareness
-     PostgreSQL server supports both locale aware and locale not aware
-     (default) operational modes. You can determine this mode during the
-     configuration stage of the installation with --enable-locale option.
-     If you don't use --enable-locale, the multi-language code will not be
-     compiled and PostgreSQL will behave as an ASCII compliant application.
-     This mode is useful for its speed but only provided that you don't
-     have to consider national specific chars.
-     With --enable-locale you will get a locale aware server using LC_*
-     environment variables to determine how to process national specifics.
-     In this case strcoll(3) and similar functions are used internally
-     so speed is somewhat lower.
-     Notice here that --enable-locale is sufficient when all your clients
-     use the same single-byte encoding as the database server does.
-     When your clients use an encoding different from the server's, then you
-     also have to use the --enable-recode or --with-mb=<encoding> options on
-     the server side or a particular client that does recoding itself (e.g.
-     there exists a PostgreSQL ODBC driver for Win32 with various Cyrillic
-     encoding capability). Option --with-mb=<encoding> is necessary for the
-     multi-byte charsets support.
-  2. Single-byte charsets recoding
-     You can set up this feature with --enable-recode option. This option
-     is described as 'enable Cyrillic recode support' which doesn't express
-     all its power. It can be used for *any* single-byte charset recoding.
-     This method uses charset.conf file located in the $PGDATA directory.
-     It's a typical configuration text file where spaces and newlines
-     separate items and records and # specifies comments. Three keywords
-     with the following syntax are recognized here:
-       BaseCharset	<server_charset>
-       RecodeTable	<from_charset>     <to_charset>    <file_name>
-       HostCharset	<host_spec>	   <host_charset>
-     BaseCharset defines encoding of the database server. All charset
-     names are only used for mapping inside the charset.conf so you can
-     freely use typing-friendly names.
-     RecodeTable records specify translation table between server and client.
-     The file name is relative to the $PGDATA directory. Table file format
-     is very simple. There are no keywords and characters are represented by
-     a pair of decimal or hexadecimal (0x prefixed) values on single lines:
-       <char_value>  <translated_char_value>
-     HostCharset records define IP address and charset. You can use a single
-     IP address, an IP mask range starting from the given address or an IP
-     interval (e.g. 127.0.0.1, 192.168.1.100/24, 192.168.1.20-192.168.1.40)
-     The charset.conf is always processed up to the end, so you can easily
-     specify exceptions from the previous rules. In the src/data you will
-     find charset.conf example and a few recoding tables.
-     As this solution is based on the client's IP address / charset mapping
-     there are obviously some restrictions as well. You can't use different
-     encoding on the same host at the same time. It's also inconvenient when
-     you boot your client hosts into more operating systems.
-     Nevertheless, when these restrictions are not limiting and you don't
-     need multi-byte chars then it's a simple and effective solution.
-  3. Multi-byte support/recoding
-     It's a new generation of charset encoding in PostgreSQL designed as a
-     more complex solution supporting both single-byte and multi-byte chars.
-     You can set up this feature with --with-mb=<encoding> option.
-     There is no IP mapping file and recoding is controlled through the new
-     SQL statements. Recoding tables are included in the code. Many national
-     charsets are already supported and further will follow.
-     See doc/README.mb, doc/README.mb.jp to get detailed instruction on how
-     to use the multibyte support. In the file doc/README.locale there is
-     a particular instruction on usage of the multibyte support with Cyrillic.
-  4. Credits
-     I'd like to thank the PostgreSQL development team and all contributors
-     for creating PostgreSQL. Thanks to Oleg Bartunov, Oleg Broytmann and
-     Tatsuo Ishii for opening the door into the multi-language world.
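Note on section 2 of the removed README: the charset.conf and table-file formats it describes can be sketched with a small parser. This is only an illustration in Python, not PostgreSQL's own reader. The function names, the charset names (koi, win), the file name koi2win.tab and the sample values are all hypothetical. It also assumes one record per line and that # starts a comment running to the end of the line.

```python
# Hypothetical sketch of the charset.conf / recode-table formats described
# in README.charsets. Not PostgreSQL code; names and sample data are made up.

# Number of arguments each keyword takes, per the README's syntax summary.
ARG_COUNTS = {"BaseCharset": 1, "RecodeTable": 3, "HostCharset": 2}


def strip_comment(line):
    """Drop everything from '#' to end of line (assumed comment syntax)."""
    return line.split("#", 1)[0].strip()


def parse_number(token):
    """Accept decimal or 0x-prefixed hexadecimal character values."""
    return int(token, 16) if token.lower().startswith("0x") else int(token, 10)


def parse_charset_conf(text):
    """Return records grouped by keyword, each list kept in file order."""
    conf = {keyword: [] for keyword in ARG_COUNTS}
    for raw in text.splitlines():
        fields = strip_comment(raw).split()
        if not fields:
            continue  # blank or comment-only line
        keyword, args = fields[0], tuple(fields[1:])
        if ARG_COUNTS.get(keyword) != len(args):
            raise ValueError(f"unrecognized record: {raw!r}")
        conf[keyword].append(args)
    return conf


def parse_recode_table(text):
    """Map each <char_value> to its <translated_char_value>."""
    table = {}
    for raw in text.splitlines():
        fields = strip_comment(raw).split()
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"expected two values: {raw!r}")
        source, target = (parse_number(field) for field in fields)
        table[source] = target
    return table


SAMPLE_CONF = """
# illustrative only
BaseCharset  koi
RecodeTable  koi  win  koi2win.tab
HostCharset  192.168.1.100/24  win
HostCharset  127.0.0.1         koi
"""

SAMPLE_TABLE = """
0xC1  0xE0   # hexadecimal pair
65    65     # decimal pair
"""

conf = parse_charset_conf(SAMPLE_CONF)
table = parse_recode_table(SAMPLE_TABLE)
```

Records are kept in file order because the README says the file is always processed to the end, so later records can act as exceptions to earlier ones.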
--- a/doc/README.locale
+++ b/doc/README.locale
-===========
-1999 Jul 21
-===========
-   Josef Balatka, <balatka@email.cz> asked us not to remove RECODE and sent me
-Czech ISO-8859-2 -> WIN-1250 translation table.
-   RECODE no longer contains just Cyrillic RECODE and will stay in
-PostgreSQL.
-   He also created some bits of documentation, mostly concerning RECODE -
-see README.Charsets.
-===========
-1999 Apr 14
-===========
-   Tatsuo Ishii <t-ishii@sra.co.jp> updated Multibyte support extending it
-to Cyrillic language. Now PostgreSQL supports KOI8-R, WIN-1251, ISO8859-5
-and CP866 (ALT) encodings.
-   Short instruction on using this feature follows. Longer discussion of
-Multibyte support is in README.mb.
-   WARNING! Now with Multibyte support Cyrillic RECODE declared obsolete
-and will be removed from Postgres. If you are using RECODE consider
-switching to Multibyte support.
-   Instructions on how to prepare Postgres for Cyrillic Multibyte support.
-   ----------------------------------------------------------------------
-   First, you need to backup all your databases. I recommend to backup the
-entire Postgres directory, including binaries and libraries - thus you can
-easily restore if something goes wrong.
-   Dump your data: pg_dumpall > dump.db
-   Stop postmaster.
-   Configure, compile and install Postgres. (I'll mostly talk about KOI8-R
-encoding, this is just to make examples a little more clear; you can use
-any supported encoding.)
-   cd src
-   ./configure --enable-locale --with-mb=KOI8
-   make
-   make install
-   Make sure you've backed up your databases. Doublecheck your backup. I
-really mean it - make regular backups and test your backups sometimes by
-fake restore.
-   Remove your data directory (better, rename or move it).
-   Run initdb saying your primary encoding: initdb -e KOI8. If you omit
-encoding, primary encoding from configure will be taken.
-   Start postmaster.
-   Create databases: createdb -e KOI8. Again, you can omit encoding -
-default encoding will be used. You are not forced to use the same encoding
-for all your databases - you can create different databases with different
-encodings.
-   Load your data from the dump you've created: psql < dump.db
-   That's all! Now you are ready to enjoy the full power of Multibyte
-support.
-   To use Multibyte support you do not need to do something special - just
-execute your queries. If client program does not set encoding, it will get
-the data in database encoding. But client may ask Postgres to do automatic
-server-to-client and client-to-server conversions. There are 2 (two) ways
-client program declares its encoding:
-   1) client explicitly executes the query SET CLIENT_ENCODING TO 'win';
-   2) client started with environment variable set. Examples -
-using sh syntax:
-   PGCLIENTENCODING='win'; export PGCLIENTENCODING
-using csh syntax:
-   setenv PGCLIENTENCODING 'win'
-   Setting PGCLIENTENCODING even if you use same client encoding as the
-database would omit an overhead of asking the database encoding while
-initiating the connection, so it is good idea to set it in any case.
-   Now you may run test suite and see Multibyte support in action. Go to
-.../src/test/locale and run
-   make clean all test-koi2win
-===========
-1998 Nov 20
-===========
-   I extended locale support, originally written by Oleg Bartunov
-<oleg@sai.msu.su>. Now ORDER BY (if PostgreSQL configured with
--enable-locale) uses strcoll() for all text fields: char(n), varchar(n),
-text.
-   I included test suite .../src/test/locale. I didn't include this in
-the regression test because not so much people require locale support. Read
-.../src/test/locale/README for details on the test suite.
-   Many thanks to Oleg Bartunov (oleg@sai.msu.su) and Thomas G. Lockhart
-(lockhart@alumni.caltech.edu) for hints, tips, help and discussion.
-Oleg.
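Note on the 1998 Nov 20 entry above: it says ORDER BY uses strcoll() when locale support is enabled. The effect of collating through strcoll can be sketched outside the server with Python's locale module. This is an illustration only, not how the backend is written. The helper name sort_with_locale is hypothetical, and only the "C" locale is used because it is the one locale assumed to be available everywhere.

```python
import functools
import locale


def sort_with_locale(words, locale_name):
    """Sort words via strcoll under locale_name, then restore the old setting."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # Raises locale.Error if the operating system lacks this locale.
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["b", "a", "B"]

# Under the "C" locale, strcoll compares by character code, so the
# uppercase letter sorts ahead of both lowercase letters.
c_order = sort_with_locale(words, "C")
print(c_order)
```

Passing a national locale name instead of "C" would apply that locale's collation rules, provided the operating system has the locale installed. That dependency is the same one the removed runtime.sgml section lists among its common mistakes.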
--- a/doc/src/sgml/admin.sgml
+++ b/doc/src/sgml/admin.sgml
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.26 2000/09/12 05:37:07 thomas Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.27 2000/09/30 16:58:20 petere Exp $
 Postgres Administrator's Guide.
 Derived from postgres.sgml.
@@ -98,9 +98,9 @@ Derived from postgres.sgml.
  &intro-ag;
  &installation;
  &installw;
-  &charset;
  &runtime;
  &client-auth;
+  &charset;
  &manage-ag;
  &user-manag;
  &backup;

--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.21 2000/09/29 20:21:34 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.22 2000/09/30 16:58:20 petere Exp $ -->
 <chapter id="installation">
 <title><![%flattext-install-include[<productname>PostgreSQL</> ]]>Installation Instructions</title>
@@ -447,8 +447,9 @@ su - postgres
       <term>--enable-recode</term>
       <listitem>
        <para>
-         Enables character set recode support. See
+         Enables single-byte character set recode support. See
-         <filename>doc/README.Charsets</> for details on this feature.
+         <![%flattext-install-include[the <citetitle>Administrator's Guide</citetitle>]]>
+         <![%flattext-install-ignore[<xref linkend="recode">]]> about this feature.
        </para>
       </listitem>
      </varlistentry>
@@ -459,7 +460,10 @@ su - postgres
        <para>
         Allows the use of multibyte character encodings. This is
         primarily for languages like Japanese, Korean, and Chinese.
-         Read <filename>doc/README.mb</> for details.
+         Read 
+         <![%flattext-install-include[the <citetitle>Administrator's Guide</citetitle>]]>
+         <![%flattext-install-ignore[<xref linkend="multibyte">]]>
+         for details.
        </para>
       </listitem>
      </varlistentry>

--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.41 2000/09/12 05:37:09 thomas Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.42 2000/09/30 16:58:20 petere Exp $
 -->
 <!doctype set PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
@@ -173,9 +173,9 @@ $Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.41 2000/09/12 05:37:09 th
 -->
  &installation;
  &installw;
-  &charset;
  &runtime;
  &client-auth;
+  &charset;
  &manage-ag;
  &user-manag;
  &backup;

--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.25 2000/09/29 20:21:34 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.26 2000/09/30 16:58:20 petere Exp $
 -->
 <Chapter Id="runtime">
@@ -1553,126 +1553,6 @@ set semsys:seminfo_semmsl=32
 </sect1>
- <sect1 id="locale">
-  <title>Locale Support</title>
-  <note>
-   <title>Acknowledgement</title>
-   <para>
-    Written by Oleg Bartunov. See <ulink
-    url="http://www.sai.msu.su/~megera/postgres/">Oleg's web
-    page</ulink> for additional information on locale and Russian
-    language support.
-   </para>
-  </note>
-  <para>
-   While doing a project for a company in Moscow, Russia, I
-   encountered the problem that <productname>Postgres</> had no
-   support of national alphabets. After looking for possible
-   workarounds I decided to develop support of locale myself. I'm not
-   a C programmer but already had some experience with locale
-   programming when I work with <productname>Perl</> (debugging) and
-   <productname>Glimpse</>. After several days of digging through the
-   <productname>Postgres</> source tree I made very minor corrections
-   to <filename>src/backend/utils/adt/varlena.c</> and
-   <filename>src/backend/main/main.c</> and got what I needed! I did
-   support only for <envar>LC_CTYPE</envar> and
-   <envar>LC_COLLATE</envar>, but later <envar>LC_MONETARY</envar> was
-   added by others. I got many messages from people about this patch
-   so I decided to send it to developers and (to my surprise) it was
-   incorporated into the <productname>Postgres</> distribution.
-  </para>
-  <para>
-   People often complain that locale doesn't work for them. There are
-   several common mistakes:
-   <itemizedlist>
-    <listitem>
-     <para>
-      Didn't properly configure <productname>Postgres</> before
-      compilation. You must run <filename>configure</> with the
-      <option>--enable-locale</> option to enable locale support.
-     </para>
-    </listitem>
-    <listitem>
-     <para>
-      Didn't setup environment correctly when starting postmaster. You
-      must define environment variables <envar>LC_CTYPE</envar> and
-      <envar>LC_COLLATE</envar> before running postmaster because
-      backend gets information about locale from environment. I use
-      following shell script:
-<programlisting>
-#!/bin/sh
-export LC_CTYPE=koi8-r
-export LC_COLLATE=koi8-r
-postmaster -B 1024 -S -D/usr/local/pgsql/data/ -o '-Fe'
-</programlisting>
-     </para>
-    </listitem>
-    <listitem>
-     <para>
-      Broken locale support in the operating system (for example,
-      locale support in libc under Linux several times has changed and
-      this caused a lot of problems). Perl has also support of locale
-      and if locale is broken <command>perl -v</> will complain
-      something like:
-<screen>
-<prompt>$</> <userinput>export LC_CTYPE='not_exist'</>
-<prompt>$</> <userinput>perl -v</>
-<computeroutput>
-perl: warning: Setting locale failed.
-perl: warning: Please check that your locale settings:
-LC_ALL = (unset),
-LC_CTYPE = "not_exist",
-LANG = (unset)
-are supported and installed on your system.
-perl: warning: Falling back to the standard locale ("C").
-</computeroutput>
-</screen>
-     </para>
-    </listitem>
-    <listitem>
-     <para>
-      Wrong location of locale files. Possible locations include:
-      <filename>/usr/lib/locale</filename> (Linux, Solaris),
-      <filename>/usr/share/locale</filename> (Linux),
-      <filename>/usr/lib/nls/loc</filename> (DUX 4.0).
-      Check <command>man locale</command> to find the correct
-      location. Under Linux I made a symbolic link between
-      <filename>/usr/lib/locale</filename> and
-      <filename>/usr/share/locale</filename> to be sure that the next
-      libc will not break my locale.
-     </para>
-    </listitem>
-   </itemizedlist>
-  </para>
-  <formalpara>
-   <title>What are the Benefits?</title> 
-   <para>
-    You can use ~* and order by operators for strings that contain
-    characters from national alphabets. Non-english users definitely
-    need that.
-   </para>
-  </formalpara>
-  <formalpara>
-   <title>What are the Drawbacks?</title>
-   <para>
-    There is one evident drawback of using locale - its speed! So, use
-    locale only if you really need it.
-   </para>
-  </formalpara>
- </sect1>
 <sect1 id="postmaster-shutdown">
  <title>Shutting down the server</title>