Document method of removing invalid UTF8 escape sequences from dump
file. Backpatch to 8.1.X. Paul Lindner

Commit 394fedfd authored Dec 06, 2005 by Bruce Momjian

Showing with 15 additions and 1 deletion

doc/src/sgml/release.sgml +15 -1

--- a/doc/src/sgml/release.sgml
+++ b/doc/src/sgml/release.sgml
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/release.sgml,v 1.403 2005/12/06 18:45:18 momjian Exp $
+$PostgreSQL: pgsql/doc/src/sgml/release.sgml,v 1.404 2005/12/06 19:26:43 momjian Exp $
 Typical markup:
@@ -525,6 +525,20 @@ psql -t -f fixseq.sql db1 | psql -e db1
       <type>boolean</type> rather than an <type>integer</type> (Neil)
      </para>
     </listitem>
+     <listitem>
+      <para>
+       Some users are having problems loading <literal>UTF8</> data into
+       8.1.X.  This is because previous versions allowed invalid <literal>UTF8</>
+       sequences to be entered into the database, and this release
+       properly accepts only valid <literal>UTF8</> sequences.  One
+       way to correct a dumpfile is to use <command>iconv -c -f UTF8 -t UTF8</>.
+       This will remove invalid character sequences. <command>iconv</>
+       reads the entire input file into memory so it might be necessary to
+       <command>split</> the dump into multiple smaller files for processing.
+      </para>
+     </listitem>
    </itemizedlist>
   </sect2>
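
The release note added above recommends iconv and warns that it reads the whole input into memory. As a rough alternative sketch (not part of this commit, and not the method the note documents), the same cleanup can be done in fixed-size chunks with Python's incremental UTF-8 decoder, which avoids having to split the dump first. The function name `strip_invalid_utf8` and the `chunk_size` parameter are hypothetical, and Python's notion of valid UTF-8 may differ in edge cases from the checks in 8.1.X, so the result should still be verified by loading it.

```python
import codecs


def strip_invalid_utf8(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path, dropping bytes that are not valid UTF-8.

    Hypothetical helper for illustration. The file is processed in
    chunks, so memory use stays near chunk_size regardless of dump size.
    """
    # errors="ignore" silently discards invalid byte sequences. The
    # incremental decoder buffers a multi-byte sequence that is cut by a
    # chunk boundary, so valid characters are never dropped by accident.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(decoder.decode(chunk).encode("utf-8"))
        # Flush the decoder; a truncated sequence at end of file is dropped.
        dst.write(decoder.decode(b"", final=True).encode("utf-8"))
```

A call such as `strip_invalid_utf8("dump.sql", "dump.clean.sql")` (file names are placeholders) would write a cleaned copy and leave the original dump untouched, which makes it easy to diff the two files and see what was removed.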