Diffstat (limited to 'doc/pcre2pattern.3')
-rw-r--r-- | doc/pcre2pattern.3 | 26 |
1 files changed, 15 insertions, 11 deletions
diff --git a/doc/pcre2pattern.3 b/doc/pcre2pattern.3
index 8d0e9df..70ac14a 100644
--- a/doc/pcre2pattern.3
+++ b/doc/pcre2pattern.3
@@ -1,4 +1,4 @@
-.TH PCRE2PATTERN 3 "13 November 2015" "PCRE2 10.21"
+.TH PCRE2PATTERN 3 "20 June 2016" "PCRE2 10.22"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -1256,16 +1256,20 @@ PCRE2 does not allow \eC to appear in lookbehind assertions
 .\" </a>
 (described below)
 .\"
-in a UTF mode, because this would make it impossible to calculate the length of
-the lookbehind. Neither the alternative matching function
-\fBpcre2_dfa_match()\fP not the JIT optimizer support \eC in a UTF mode. The
-former gives a match-time error; the latter fails to optimize and so the match
-is always run using the interpreter.
+in UTF-8 or UTF-16 modes, because this would make it impossible to calculate
+the length of the lookbehind. Neither the alternative matching function
+\fBpcre2_dfa_match()\fP nor the JIT optimizer support \eC in these UTF modes.
+The former gives a match-time error; the latter fails to optimize and so the
+match is always run using the interpreter.
+.P
+In the 32-bit library, however, \eC is always supported (when not explicitly
+locked out) because it always matches a single code unit, whether or not UTF-32
+is specified.
 .P
 In general, the \eC escape sequence is best avoided. However, one way of using
-it that avoids the problem of malformed UTF characters is to use a lookahead to
-check the length of the next character, as in this pattern, which could be used
-with a UTF-8 string (ignore white space and line breaks):
+it that avoids the problem of malformed UTF-8 or UTF-16 characters is to use a
+lookahead to check the length of the next character, as in this pattern, which
+could be used with a UTF-8 string (ignore white space and line breaks):
 .sp
 (?| (?=[\ex00-\ex7f])(\eC) |
 (?=[\ex80-\ex{7ff}])(\eC)(\eC) |
@@ -3425,6 +3429,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 13 November 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 20 June 2016
+Copyright (c) 1997-2016 University of Cambridge.
 .fi
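
Note on the second hunk: the reason \eC is restricted in UTF-8 and UTF-16 modes but not in the 32-bit library comes down to how many code units one character occupies. The following is a minimal sketch in Python, not PCRE2 code. The function names are hypothetical and chosen only for illustration. It assumes standard UTF-8 length ranges; the first two ranges match the two alternatives visible in the hunk, and the remaining two come from the UTF-8 encoding rules rather than from this diff.

```python
def utf8_code_units(ch: str) -> int:
    """Return how many UTF-8 code units (bytes) one character needs.

    The ranges mirror the lookahead classes in the patched pattern:
    [\\x00-\\x7f] -> 1 unit, [\\x80-\\x{7ff}] -> 2 units. The 3- and 4-unit
    ranges are the standard UTF-8 rules (an assumption beyond the hunk).
    """
    cp = ord(ch)
    if cp <= 0x7F:
        return 1
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3
    return 4


def code_unit_counts(ch: str) -> dict:
    """Count code units for one character in each encoding width."""
    return {
        "utf-8": len(ch.encode("utf-8")),          # 1 to 4 eight-bit units
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # 1 or 2 sixteen-bit units
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # always 1 unit
    }


if __name__ == "__main__":
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", utf8_code_units(ch), code_unit_counts(ch))
```

The UTF-32 count is 1 for every character, which is consistent with the added paragraph saying \eC "always matches a single code unit" in the 32-bit library. The UTF-8 and UTF-16 counts vary per character, which is why a lookbehind containing \eC has no fixed length in those modes.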