Diffstat (limited to 'doc/html/pcre2unicode.html')
-rw-r--r-- | doc/html/pcre2unicode.html | 32 |
1 files changed, 18 insertions, 14 deletions
diff --git a/doc/html/pcre2unicode.html b/doc/html/pcre2unicode.html
index 7af55c3..6ca367f 100644
--- a/doc/html/pcre2unicode.html
+++ b/doc/html/pcre2unicode.html
@@ -67,16 +67,20 @@ In UTF modes, the dot metacharacter matches one UTF character instead of a
 single code unit.
 </P>
 <P>
-The escape sequence \C can be used to match a single code unit, in a UTF mode,
+The escape sequence \C can be used to match a single code unit in a UTF mode,
 but its use can lead to some strange effects because it breaks up multi-unit
 characters (see the description of \C in the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
-documentation). The use of \C is not supported by the alternative matching
-function <b>pcre2_dfa_match()</b> when in UTF mode. Its use provokes a
-match-time error. The JIT optimization also does not support \C in UTF mode.
-If JIT optimization is requested for a UTF pattern that contains \C, it will
-not succeed, and so the matching will be carried out by the normal interpretive
-function.
+documentation).
+</P>
+<P>
+The use of \C is not supported by the alternative matching function
+<b>pcre2_dfa_match()</b> when in UTF-8 or UTF-16 mode, that is, when a character
+may consist of more than one code unit. The use of \C in these modes provokes
+a match-time error. Also, the JIT optimization does not support \C in these
+modes. If JIT optimization is requested for a UTF-8 or UTF-16 pattern that
+contains \C, it will not succeed, and so when <b>pcre2_match()</b> is called,
+the matching will be carried out by the normal interpretive function.
 </P>
 <P>
 The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly test
@@ -244,9 +248,9 @@ Errors in UTF-16 strings
 <P>
 The following negative error codes are given for invalid UTF-16 strings:
 <pre>
-  PCRE_UTF16_ERR1  Missing low surrogate at end of string
-  PCRE_UTF16_ERR2  Invalid low surrogate follows high surrogate
-  PCRE_UTF16_ERR3  Isolated low surrogate
+  PCRE2_ERROR_UTF16_ERR1  Missing low surrogate at end of string
+  PCRE2_ERROR_UTF16_ERR2  Invalid low surrogate follows high surrogate
+  PCRE2_ERROR_UTF16_ERR3  Isolated low surrogate
 <a name="utf32strings"></a></PRE>
 </P>
@@ -256,8 +260,8 @@ Errors in UTF-32 strings
 <P>
 The following negative error codes are given for invalid UTF-32 strings:
 <pre>
-  PCRE_UTF32_ERR1  Surrogate character (range from 0xd800 to 0xdfff)
-  PCRE_UTF32_ERR2  Code point is greater than 0x10ffff
+  PCRE2_ERROR_UTF32_ERR1  Surrogate character (0xd800 to 0xdfff)
+  PCRE2_ERROR_UTF32_ERR2  Code point is greater than 0x10ffff
 </PRE>
 </P>
@@ -276,9 +280,9 @@ Cambridge, England.
 REVISION
 </b><br>
 <P>
-Last updated: 16 October 2015
+Last updated: 03 July 2016
 <br>
-Copyright © 1997-2015 University of Cambridge.
+Copyright © 1997-2016 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
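
For readers who want to see what the renamed UTF-16 and UTF-32 error codes refer to, the conditions in the two tables can be sketched in plain C. This is an illustration only, not PCRE2's validity-checking code: the enum names and the functions check_utf16() and check_utf32() are hypothetical, invented for this sketch, and each function reports only the first problem it meets rather than an offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. They mirror the conditions
   listed in the diff but are NOT the PCRE2_ERROR_UTF16_ERRn constants. */
enum utf16_check {
    UTF16_OK = 0,
    UTF16_MISSING_LOW_AT_END, /* high surrogate is the last code unit */
    UTF16_INVALID_LOW,        /* high surrogate not followed by a low one */
    UTF16_ISOLATED_LOW        /* low surrogate with no preceding high one */
};

/* Scan a UTF-16 string of len code units and return the first problem. */
static enum utf16_check check_utf16(const uint16_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint16_t u = s[i];
        if (u >= 0xD800 && u <= 0xDBFF) {            /* high surrogate */
            if (i + 1 == len)
                return UTF16_MISSING_LOW_AT_END;
            if (s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
                return UTF16_INVALID_LOW;
            i++;                                     /* skip the low half */
        } else if (u >= 0xDC00 && u <= 0xDFFF) {     /* low surrogate alone */
            return UTF16_ISOLATED_LOW;
        }
    }
    return UTF16_OK;
}

/* Hypothetical result codes for the two UTF-32 conditions. */
enum utf32_check {
    UTF32_OK = 0,
    UTF32_SURROGATE,  /* code point in 0xd800 to 0xdfff */
    UTF32_TOO_LARGE   /* code point greater than 0x10ffff */
};

/* Scan a UTF-32 string of len code units and return the first problem. */
static enum utf32_check check_utf32(const uint32_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 0xD800 && s[i] <= 0xDFFF)
            return UTF32_SURROGATE;
        if (s[i] > 0x10FFFF)
            return UTF32_TOO_LARGE;
    }
    return UTF32_OK;
}
```

Under these assumptions, a lone 0xD83D at the end of a UTF-16 buffer yields UTF16_MISSING_LOW_AT_END, and the UTF-32 value 0x110000 yields UTF32_TOO_LARGE.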