author | Peter Eisentraut | 2010-09-07 18:54:09 +0000 |
---|---|---|
committer | Peter Eisentraut | 2010-09-07 18:54:09 +0000 |
commit | 7cd082f907814f0fe90918399cbb95fd83f161c9 | |
tree | 398a9217524391bc79971a25d209ead89e86bf1d /doc/src | |
parent | c5d94a34fbd732762106b4056823bde6969fdfd8 |
Clarify that surrogate pairs are not encoded in UTF-8 directly
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/syntax.sgml | 49 |
1 files changed, 28 insertions, 21 deletions
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index ca092b5ae6e..18582b9216c 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.154 2010/09/01 18:22:29 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.155 2010/09/07 18:54:09 petere Exp $ -->
 <chapter id="sql-syntax">
 <title>SQL Syntax</title>
 
@@ -236,12 +236,15 @@ U&"d!0061t!+000061" UESCAPE '!'
 
 <para>
 The Unicode escape syntax works only when the server encoding is
- UTF8. When other server encodings are used, only code points in
- the ASCII range (up to <literal>\007F</literal>) can be specified.
- Both the 4-digit and the 6-digit form can be used to specify
- UTF-16 surrogate pairs to compose characters with code points
- larger than U+FFFF (although the availability of
- the 6-digit form technically makes this unnecessary).
+ <literal>UTF8</>. When other server encodings are used, only code
+ points in the ASCII range (up to <literal>\007F</literal>) can be
+ specified. Both the 4-digit and the 6-digit form can be used to
+ specify UTF-16 surrogate pairs to compose characters with code
+ points larger than U+FFFF, although the availability of the
+ 6-digit form technically makes this unnecessary. (When surrogate
+ pairs are used when the server encoding is <literal>UTF8</>, they
+ are first combined into a single code point that is then encoded
+ in UTF-8.)
 </para>
 
 <para>
@@ -431,13 +434,15 @@ SELECT 'foo' 'bar';
 
 <para>
 The Unicode escape syntax works fully only when the server
- encoding is UTF-8. When other server encodings are used, only
- code points in the ASCII range (up to <literal>\u007F</>) can be
- specified. Both the 4-digit and the 8-digit form can be used to
- specify UTF-16 surrogate pairs to compose characters with code
- points larger than U+FFFF (although the
- availability of the 8-digit form technically makes this
- unnecessary).
+ encoding is <literal>UTF8</>. When other server encodings are
+ used, only code points in the ASCII range (up
+ to <literal>\u007F</>) can be specified. Both the 4-digit and
+ the 8-digit form can be used to specify UTF-16 surrogate pairs to
+ compose characters with code points larger than U+FFFF, although
+ the availability of the 8-digit form technically makes this
+ unnecessary. (When surrogate pairs are used when the server
+ encoding is <literal>UTF8</>, they are first combined into a
+ single code point that is then encoded in UTF-8.)
 </para>
 
 <caution>
@@ -517,13 +522,15 @@ U&'d!0061t!+000061' UESCAPE '!'
 
 <para>
 The Unicode escape syntax works only when the server encoding is
- UTF8. When other server encodings are used, only code points in
- the ASCII range (up to <literal>\007F</literal>) can be
- specified.
- Both the 4-digit and the 6-digit form can be used to specify
- UTF-16 surrogate pairs to compose characters with code points
- larger than U+FFFF (although the availability
- of the 6-digit form technically makes this unnecessary).
+ <literal>UTF8</>. When other server encodings are used, only
+ code points in the ASCII range (up to <literal>\007F</literal>)
+ can be specified. Both the 4-digit and the 6-digit form can be
+ used to specify UTF-16 surrogate pairs to compose characters with
+ code points larger than U+FFFF, although the availability of the
+ 6-digit form technically makes this unnecessary. (When surrogate
+ pairs are used when the server encoding is <literal>UTF8</>, they
+ are first combined into a single code point that is then encoded
+ in UTF-8.)
 </para>
 
 <para>
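
The behavior described in the added parenthetical (a surrogate pair is first combined into one code point, which is then encoded in UTF-8) can be illustrated with the standard UTF-16 arithmetic. The following Python sketch is not taken from the PostgreSQL source; the function name `combine_surrogates` and the sample pair D83D/DE00 are assumptions chosen for illustration only.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a single Unicode code point.

    Hypothetical helper for illustration; it follows the standard UTF-16
    definition, not any particular PostgreSQL function.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 payload bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Sample pair (an assumption for this example): D83D DE00.
code_point = combine_surrogates(0xD83D, 0xDE00)

# The combined code point is encoded once, as a single 4-byte UTF-8 sequence,
# rather than each surrogate being encoded separately.
utf8_bytes = chr(code_point).encode("utf-8")

print(f"U+{code_point:06X}", utf8_bytes.hex())
```

Under the wording in the patch, the 4-digit pair form and the single 6-digit form for the same character (for this sample, `\D83D\DE00` and `\+01F600` inside a `U&'...'` literal) would be expected to yield the same stored bytes when the server encoding is UTF8.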