Abuhujair Javed
Postgres FD Implementation
Commit 1073123b authored Jan 19, 2001 by Tom Lane
Update docs to explain that 7.1 locks down LC_COLLATE and LC_CTYPE at
initdb time. A few copy-editing cleanups, too.
parent 671f798c
Showing 1 changed file with 52 additions and 45 deletions
doc/src/sgml/charset.sgml
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.5 2000/12/22 21:51:57 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.6 2001/01/19 04:47:50 tgl Exp $ -->
 <chapter id="charset">
 <title>Localization</>
...
@@ -54,7 +54,7 @@
 cultural preferences regarding alphabets, sorting, number
 formatting, etc. <productname>PostgreSQL</> uses the standard ISO
 C and POSIX-like locale facilities provided by the server operating
-system. For additional information refer the documentation of your
+system. For additional information refer to the documentation of your
 system.
 </para>
...
@@ -62,7 +62,7 @@
 <title>Overview</>
 <para>
-Locale support is not build into <productname>PostgreSQL</> by
+Locale support is not built into <productname>PostgreSQL</> by
 default; to enable it, supply the <option>--enable-locale</> option
 to the <filename>configure</> script:
 <informalexample>
...
@@ -95,7 +95,7 @@ export LANG=sv_SE
 <para>
 Occasionally it is useful to mix rules from several locales, e.g.,
-use U.S. rules but Spanish messages. To do that a set of
+use U.S. collation rules but Spanish messages. To do that a set of
 environment variables exist that override the default of
 <envar>LANG</> for a particular category:
...
@@ -141,14 +141,23 @@ export LANG=sv_SE
 </para>
 <para>
-Once you have chosen a set of localization rules this way you must
-keep them fixed for any particular database cluster. That means
-that the locales that were active when you ran <filename>initdb</>
-must be kept the same when you start the postmaster. Otherwise,
-the changed sort order can corrupt indexes or make your data
-disappear mysteriously. It is currently not possible to change the
-locales after database initialization or to use more than one set
-of locales for a given database cluster.
+Note that the locale behavior is determined by the environment
+variables seen by the server, not by the environment of any client.
+Therefore, be careful to set these variables before starting the
+postmaster.
+</para>
+<para>
+The <envar>LC_COLLATE</> and <envar>LC_CTYPE</> variables affect the
+sort order of indexes. Therefore, these values must be kept fixed
+for any particular database cluster, or indexes on text columns will
+become corrupt. <productname>Postgres</productname> enforces this
+by recording the values of <envar>LC_COLLATE</> and <envar>LC_CTYPE</>
+that are seen by <command>initdb</>. The server automatically adopts
+those two values when it is started; only the other <envar>LC_</>
+categories can be set from the environment at server startup.
+In short, only one collation order can be used in a database cluster,
+and it is chosen at <command>initdb</> time.
 </para>
 </sect2>
...
@@ -183,7 +192,10 @@ export LANG=sv_SE
 <para>
 The only severe drawback of using the locale support in
 <productname>PostgreSQL</> is its speed. So use locale only if you
-actually need it.
+actually need it. It should be noted in particular that selecting
+a non-C locale disables index optimizations for <literal>LIKE</> and
+<literal>~</> operators, which can make a huge difference in the
+speed of searches that use those operators.
 </para>
 </sect2>
...
@@ -261,7 +273,7 @@ perl: warning: Falling back to the standard locale ("C").
 <para>
 <acronym>MB</acronym> also fixes some problems concerning 8-bit single byte
-character sets including ISO8859. (I would not say all of problems
+character sets including ISO8859. (I would not say all problems
 have been fixed. I just confirmed that the regression test ran fine
 and a few French characters could be used with the patch. Please let
 me know if you find any problem while using 8-bit characters.)
...
@@ -271,7 +283,7 @@ perl: warning: Falling back to the standard locale ("C").
 <title>Enabling MB</title>
 <para>
-Run configure with a multibyte option:
+Run configure with the multibyte option:
 <programlisting>
 % ./configure --enable-multibyte[=<replaceable>encoding_system</replaceable>]
...
@@ -383,11 +395,11 @@ perl: warning: Falling back to the standard locale ("C").
 % initdb -E EUC_JP
 </programlisting>
-sets the default encoding to EUC_JP(Extended Unix Code for Japanese).
+sets the default encoding to EUC_JP (Extended Unix Code for Japanese).
 Note that you can use "--encoding" instead of "-E" if you prefer
 to type longer option strings.
 If no -E or --encoding option is given, the encoding
-specified at the compile time is used.
+specified at configure time is used.
 </para>
 <para>
...
@@ -397,8 +409,8 @@ perl: warning: Falling back to the standard locale ("C").
 % createdb -E EUC_KR korean
 </programlisting>
-will create a database named "korean" with EUC_KR encoding. The
-another way to accomplish this is to use a SQL command:
+will create a database named "korean" with EUC_KR encoding.
+Another way to accomplish this is to use a SQL command:
 <programlisting>
 CREATE DATABASE korean WITH ENCODING = 'EUC_KR';
...
@@ -527,20 +539,11 @@ char *pg_encoding_to_char(int <replaceable>encoding_id</replaceable>)
 </para>
 </listitem>
-<listitem>
-<para>
-Using <envar>PGCLIENTENCODING</envar>.
-If an environment variable <envar>PGCLIENTENCODING</envar> is defined in the
-frontend, an automatic encoding translation is done by the backend.
-</para>
-</listitem>
 <listitem>
 <para>
 Using <command>SET CLIENT_ENCODING TO</command>.
-Setting the frontend side encoding can be done a SQL command:
+Setting the frontend side encoding can be done by this SQL command:
 <programlisting>
 SET CLIENT_ENCODING TO 'encoding';
...
@@ -552,7 +555,7 @@ SET CLIENT_ENCODING TO 'encoding';
 SET NAMES 'encoding';
 </programlisting>
-To query the current the frontend encoding:
+To query the current frontend encoding:
 <programlisting>
 SHOW CLIENT_ENCODING;
...
@@ -565,6 +568,17 @@ RESET CLIENT_ENCODING;
 </programlisting>
 </para>
 </listitem>
+<listitem>
+<para>
+Using <envar>PGCLIENTENCODING</envar>.
+If environment variable <envar>PGCLIENTENCODING</envar> is defined
+in the client's environment, that client encoding is automatically
+selected when a backend connection is made. (This can subsequently
+be overridden using any of the other methods mentioned above.)
+</para>
+</listitem>
 </itemizedlist>
 </para>
 </sect2>
...
@@ -588,7 +602,7 @@ RESET CLIENT_ENCODING;
 <para>
 Suppose you choose EUC_JP for the backend, LATIN1 for the frontend,
 then some Japanese characters could not be translated into LATIN1. In
-this case, a letter cannot be represented in the LATIN1 character set,
+this case, a letter that cannot be represented in the LATIN1 character set
 would be transformed as:
 <programlisting>
...
@@ -601,7 +615,7 @@ RESET CLIENT_ENCODING;
 <title>References</title>
 <para>
-These are good sources to start learning various kind of encoding
+These are good sources to start learning about various kinds of encoding
 systems.
 <itemizedlist>
...
@@ -724,8 +738,7 @@ Mar 1, 1998 PL1 released
 <para>
 <!--
 [Here is a good documentation explaining how to use WIN1250 on
-Windows/ODBC from Pavel Behal. Please note that Installation step 1)
-is not necceary in 6.5.1 - Tatsuo]
+Windows/ODBC from Pavel Behal]
 Version: 0.91 for PgSQL 6.5
 Author: Pavel Behal
...
@@ -815,20 +828,14 @@ Sorry for my Eglish and C code, I'm not native :-)
 <title>WIN1250 on Windows/ODBC</title>
 <step>
 <para>
-Change the three relevant files in the source directories.
-</para>
-</step>
-<step>
-<para>
-Compile <productname>Postgres</productname> with local enabled
+Compile <productname>Postgres</productname> with locale enabled
 and the multibyte encoding set to <literal>LATIN2</literal>.
 </para>
 </step>
 <step>
 <para>
-Set up your instalation. Do not forget to create locale
+Set up your installation. Do not forget to create locale
 variables in your profile (environment). For example (this may
 not be correct for <emphasis>your</emphasis> environment):
...
@@ -936,8 +943,8 @@ HostCharset <replaceable>host_spec</> <replaceable>host_charset</>
 <para>
 The <filename>charset.conf</> file is always processed up to the
 end, so you can easily specify exceptions from the previous
-rules. In the src/data you will find charset.conf example and a few
-recoding tables.
+rules. In the <filename>src/data/</> directory you will find an
+example <filename>charset.conf</> and a few recoding tables.
 </para>
 <para>
...
@@ -945,7 +952,7 @@ HostCharset <replaceable>host_spec</> <replaceable>host_charset</>
 set mapping there are obviously some restrictions as well. You
 cannot use different encodings on the same host at the same
 time. It is also inconvenient when you boot your client hosts into
-more operating systems. Nevertheless, when these restrictions are
+multiple operating systems. Nevertheless, when these restrictions are
 not limiting and you do not need multi-byte characters than it is a
 simple and effective solution.
 </para>
...
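The rationale given in the -141 hunk (indexes on text columns become corrupt if the collation changes after they are built) can be illustrated with a small sketch. This is not PostgreSQL code. It assumes that a sorted Python list stands in for a B-tree index, that code-point comparison stands in for the C locale, and that case-insensitive comparison stands in for a natural-language locale. The names `c_order`, `folded_order`, `build_index`, and `index_lookup` are hypothetical and exist only for this illustration.

```python
import bisect

def c_order(s):
    # Stand-in for the C locale: compare strings by code point.
    return s

def folded_order(s):
    # Stand-in for a natural-language locale: ignore case when comparing.
    return s.lower()

def build_index(values, sort_key):
    # Sort once at "index build" time, using the ordering then in effect.
    return sorted(values, key=sort_key)

def index_lookup(index, value, sort_key):
    # Binary search that trusts the index to be ordered by sort_key.
    keys = [sort_key(v) for v in index]
    probe = sort_key(value)
    pos = bisect.bisect_left(keys, probe)
    return pos < len(keys) and keys[pos] == probe

words = ["apple", "Banana", "cherry", "Date"]
index = build_index(words, c_order)  # ['Banana', 'Date', 'apple', 'cherry']

# Same ordering at build time and lookup time: every entry is found.
print(all(index_lookup(index, w, c_order) for w in words))

# Ordering changed after the build: the search descends into the wrong half.
print(index_lookup(index, "apple", folded_order))
```

In the second lookup the entry is still present in the list, but the binary search no longer reaches it, which mirrors the "data disappears mysteriously" symptom described in the removed paragraph. Recording the collation at `initdb` time, as the new text describes, rules out this mismatch.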