New upstream version 10.31

author: Matthew Vernon <matthew@debian.org> 2018-02-24 12:07:04 +0000
committer: Matthew Vernon <matthew@debian.org> 2018-02-24 12:07:04 +0000
commit: e98c3314cf9e05aa99f5e192862ec37f29b7dbb5
tree: b69bb3feb63a4fd79ad8a6e55865228f6fde04eb /doc
parent: 92b17f0eb8fddd7117c5344a1e1177daec21995a
103 files changed, 10542 insertions, 6523 deletions
diff --git a/doc/html/NON-AUTOTOOLS-BUILD.txt b/doc/html/NON-AUTOTOOLS-BUILD.txt
index ceb9245..0775794 100644
--- a/doc/html/NON-AUTOTOOLS-BUILD.txt
+++ b/doc/html/NON-AUTOTOOLS-BUILD.txt
@@ -1,10 +1,6 @@
 Building PCRE2 without using autotools
 --------------------------------------
 
-This document has been converted from the PCRE1 document. I have removed a
-number of sections about building in various environments, as they applied only
-to PCRE1 and are probably out of date.
-
 This document contains the following sections:
 
   General
@@ -49,7 +45,7 @@ can skip ahead to the CMake section.
      macro settings that it contains to whatever is appropriate for your
      environment. In particular, you can alter the definition of the NEWLINE
      macro to specify what character(s) you want to be interpreted as line
-     terminators.
+     terminators by default.
 
      When you compile any of the PCRE2 modules, you must specify
      -DHAVE_CONFIG_H to your compiler so that src/config.h is included in the
@@ -95,8 +91,10 @@ can skip ahead to the CMake section.
        pcre2_compile.c
        pcre2_config.c
        pcre2_context.c
+       pcre2_convert.c
        pcre2_dfa_match.c
        pcre2_error.c
+       pcre2_extuni.c
        pcre2_find_bracket.c
        pcre2_jit_compile.c
        pcre2_maketables.c
@@ -123,10 +121,14 @@ can skip ahead to the CMake section.
      Note that you must compile pcre2_jit_compile.c, even if you have not
      defined SUPPORT_JIT in src/config.h, because when JIT support is not
      configured, dummy functions are compiled. When JIT support IS configured,
-     pcre2_compile.c #includes other files from the sljit subdirectory, where
-     there should be 16 files, all of whose names begin with "sljit". It also
-     #includes src/pcre2_jit_match.c and src/pcre2_jit_misc.c, so you should
-     not compile these yourself.
+     pcre2_jit_compile.c #includes other files from the sljit subdirectory,
+     all of whose names begin with "sljit". It also #includes
+     src/pcre2_jit_match.c and src/pcre2_jit_misc.c, so you should not compile
+     these yourself.
+
+     Note also that the pcre2_fuzzsupport.c file contains special code that is
+     useful to those who want to run fuzzing tests on the PCRE2 library. Unless
+     you are doing that, you can ignore it.
 
  (5) Now link all the compiled code into an object library in whichever form
      your system keeps such libraries. This is the basic PCRE2 C 8-bit library.
@@ -174,26 +176,18 @@ can skip ahead to the CMake section.
 
 (11) If you want to use the pcre2grep command, compile and link
      src/pcre2grep.c; it uses only the basic 8-bit PCRE2 library (it does not
-     need the pcre2posix library).
+     need the pcre2posix library). If you have built the PCRE2 library with JIT
+     support by defining SUPPORT_JIT in src/config.h, you can also define
+     SUPPORT_PCRE2GREP_JIT, which causes pcre2grep to make use of JIT (unless
+     it is run with --no-jit). If you define SUPPORT_PCRE2GREP_JIT without
+     defining SUPPORT_JIT, pcre2grep does not try to make use of JIT.
 
 
 STACK SIZE IN WINDOWS ENVIRONMENTS
 
-The default processor stack size of 1Mb in some Windows environments is too
-small for matching patterns that need much recursion. In particular, test 2 may
-fail because of this. Normally, running out of stack causes a crash, but there
-have been cases where the test program has just died silently. See your linker
-documentation for how to increase stack size if you experience problems. If you
-are using CMake (see "BUILDING PCRE2 ON WINDOWS WITH CMAKE" below) and the gcc
-compiler, you can increase the stack size for pcre2test and pcre2grep by
-setting the CMAKE_EXE_LINKER_FLAGS variable to "-Wl,--stack,8388608" (for
-example). The Linux default of 8Mb is a reasonable choice for the stack, though
-even that can be too small for some pattern/subject combinations.
-
-PCRE2 has a compile configuration option to disable the use of stack for
-recursion so that heap is used instead. However, pattern matching is
-significantly slower when this is done. There is more about stack usage in the
-"pcre2stack" documentation.
+Prior to release 10.30 the default system stack size of 1Mb in some Windows
+environments caused issues with some tests. This should no longer be the case
+for 10.30 and later releases.
 
 
 LINKING PROGRAMS IN WINDOWS ENVIRONMENTS
@@ -375,18 +369,19 @@ BUILDING PCRE2 ON NATIVE Z/OS AND Z/VM
 z/OS and z/VM are operating systems for mainframe computers, produced by IBM.
 The character code used is EBCDIC, not ASCII or Unicode. In z/OS, UNIX APIs and
 applications can be supported through UNIX System Services, and in such an
-environment PCRE2 can be built in the same way as in other systems. However, in
-native z/OS (without UNIX System Services) and in z/VM, special ports are
-required. For details, please see this web site:
+environment it should be possible to build PCRE2 in the same way as in other
+systems, with the EBCDIC related configuration settings, but it is not known if
+anybody has tried this.
 
-  http://www.zaconsultants.net
+In native z/OS (without UNIX System Services) and in z/VM, special ports are
+required. For details, please see file 939 on this web site:
 
-The site currently has ports for PCRE1 releases, but PCRE2 should follow in due
-course.
+  http://www.cbttape.org
 
-You may also download PCRE1 from WWW.CBTTAPE.ORG, file 882. Everything, source
-and executable, is in EBCDIC and native z/OS file formats and this is the
-recommended download site.
+Everything in that location, source and executable, is in EBCDIC and native
+z/OS file formats. The port provides an API for LE languages such as COBOL and
+for the z/OS and z/VM versions of the Rexx language.
 
-=============================
-Last Updated: 16 July 2015
+===============================
+Last Updated: 13 September 2017
+===============================
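
The rule in item (11) of NON-AUTOTOOLS-BUILD above, about when pcre2grep makes use of JIT, can be modelled as a small truth function. This is only an illustrative sketch and not code from pcre2grep: the function name and its parameters are hypothetical, and it assumes that JIT is used only when both macros are defined and --no-jit is not given.

```python
def pcre2grep_uses_jit(support_jit: bool,
                       support_pcre2grep_jit: bool,
                       no_jit_option: bool = False) -> bool:
    """Model of the rule in item (11); all names here are hypothetical.

    support_jit           -- SUPPORT_JIT is defined in src/config.h
    support_pcre2grep_jit -- SUPPORT_PCRE2GREP_JIT is defined
    no_jit_option         -- pcre2grep was run with --no-jit
    """
    # Assumption: JIT is used only when the library has JIT support, pcre2grep
    # was built to use it, and the user did not disable it at run time.
    return support_jit and support_pcre2grep_jit and not no_jit_option


if __name__ == "__main__":
    for flags in [(True, True, False), (True, True, True), (False, True, False)]:
        print(flags, "->", pcre2grep_uses_jit(*flags))
```

The third case corresponds to defining SUPPORT_PCRE2GREP_JIT without SUPPORT_JIT, where the text says pcre2grep does not try to make use of JIT.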
diff --git a/doc/html/README.txt b/doc/html/README.txt
index 03d67f6..52859a9 100644
--- a/doc/html/README.txt
+++ b/doc/html/README.txt
@@ -15,8 +15,8 @@ subscribe or manage your subscription here:
 
    https://lists.exim.org/mailman/listinfo/pcre-dev
 
-Please read the NEWS file if you are upgrading from a previous release.
-The contents of this README file are:
+Please read the NEWS file if you are upgrading from a previous release. The
+contents of this README file are:
 
   The PCRE2 APIs
   Documentation for PCRE2
@@ -44,8 +44,8 @@ wrappers.
 
 The distribution does contain a set of C wrapper functions for the 8-bit
 library that are based on the POSIX regular expression API (see the pcre2posix
-man page). These can be found in a library called libpcre2posix. Note that this
-just provides a POSIX calling interface to PCRE2; the regular expressions
+man page). These can be found in a library called libpcre2-posix. Note that
+this just provides a POSIX calling interface to PCRE2; the regular expressions
 themselves still follow Perl syntax and semantics. The POSIX API is restricted,
 and does not give full access to all of PCRE2's facilities.
 
@@ -58,8 +58,8 @@ renamed or pointed at by a link.
 If you are using the POSIX interface to PCRE2 and there is already a POSIX
 regex library installed on your system, as well as worrying about the regex.h
 header file (as mentioned above), you must also take care when linking programs
-to ensure that they link with PCRE2's libpcre2posix library. Otherwise they may
-pick up the POSIX functions of the same name from the other library.
+to ensure that they link with PCRE2's libpcre2-posix library. Otherwise they
+may pick up the POSIX functions of the same name from the other library.
 
 One way of avoiding this confusion is to compile PCRE2 with the addition of
 -Dregcomp=PCRE2regcomp (and similarly for the other POSIX functions) to the
@@ -95,10 +95,9 @@ PCRE2 documentation is supplied in two other forms:
 Building PCRE2 on non-Unix-like systems
 ---------------------------------------
 
-For a non-Unix-like system, please read the comments in the file
-NON-AUTOTOOLS-BUILD, though if your system supports the use of "configure" and
-"make" you may be able to build PCRE2 using autotools in the same way as for
-many Unix-like systems.
+For a non-Unix-like system, please read the file NON-AUTOTOOLS-BUILD, though if
+your system supports the use of "configure" and "make" you may be able to build
+PCRE2 using autotools in the same way as for many Unix-like systems.
 
 PCRE2 can also be configured using CMake, which can be run in various ways
 (command line, GUI, etc). This creates Makefiles, solution files, etc. The file
@@ -172,21 +171,24 @@ library. They are also documented in the pcre2build man page.
   give large performance improvements on certain platforms, add --enable-jit to
   the "configure" command. This support is available only for certain hardware
   architectures. If you try to enable it on an unsupported architecture, there
-  will be a compile time error.
-
-. If you do not want to make use of the support for UTF-8 Unicode character
-  strings in the 8-bit library, UTF-16 Unicode character strings in the 16-bit
-  library, or UTF-32 Unicode character strings in the 32-bit library, you can
-  add --disable-unicode to the "configure" command. This reduces the size of
-  the libraries. It is not possible to configure one library with Unicode
-  support, and another without, in the same configuration.
+  will be a compile time error. If you are running under SELinux you may also
+  want to add --enable-jit-sealloc, which enables the use of an execmem
+  allocator in JIT that is compatible with SELinux. This has no effect if JIT
+  is not enabled.
+
+. If you do not want to make use of the default support for UTF-8 Unicode
+  character strings in the 8-bit library, UTF-16 Unicode character strings in
+  the 16-bit library, or UTF-32 Unicode character strings in the 32-bit
+  library, you can add --disable-unicode to the "configure" command. This
+  reduces the size of the libraries. It is not possible to configure one
+  library with Unicode support, and another without, in the same configuration.
+  It is also not possible to use --enable-ebcdic (see below) with Unicode
+  support, so if this option is set, you must also use --disable-unicode.
 
   When Unicode support is available, the use of a UTF encoding still has to be
   enabled by setting the PCRE2_UTF option at run time or starting a pattern
   with (*UTF). When PCRE2 is compiled with Unicode support, its input can only
-  either be ASCII or UTF-8/16/32, even when running on EBCDIC platforms. It is
-  not possible to use both --enable-unicode and --enable-ebcdic at the same
-  time.
+  either be ASCII or UTF-8/16/32, even when running on EBCDIC platforms.
 
   As well as supporting UTF strings, Unicode support includes support for the
   \P, \p, and \X sequences that recognize Unicode character properties.
@@ -196,20 +198,14 @@ library. They are also documented in the pcre2build man page.
   or starting a pattern with (*UCP).
 
 . You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
-  of the preceding, or any of the Unicode newline sequences, as indicating the
-  end of a line. Whatever you specify at build time is the default; the caller
-  of PCRE2 can change the selection at run time. The default newline indicator
-  is a single LF character (the Unix standard). You can specify the default
-  newline indicator by adding --enable-newline-is-cr, --enable-newline-is-lf,
-  --enable-newline-is-crlf, --enable-newline-is-anycrlf, or
-  --enable-newline-is-any to the "configure" command, respectively.
-
-  If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
-  the standard tests will fail, because the lines in the test files end with
-  LF. Even if the files are edited to change the line endings, there are likely
-  to be some failures. With --enable-newline-is-anycrlf or
-  --enable-newline-is-any, many tests should succeed, but there may be some
-  failures.
+  of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
+  character as indicating the end of a line. Whatever you specify at build time
+  is the default; the caller of PCRE2 can change the selection at run time. The
+  default newline indicator is a single LF character (the Unix standard). You
+  can specify the default newline indicator by adding --enable-newline-is-cr,
+  --enable-newline-is-lf, --enable-newline-is-crlf,
+  --enable-newline-is-anycrlf, --enable-newline-is-any, or
+  --enable-newline-is-nul to the "configure" command, respectively.
 
 . By default, the sequence \R in a pattern matches any Unicode line ending
   sequence. This is independent of the option specifying what PCRE2 considers
@@ -231,49 +227,44 @@ library. They are also documented in the pcre2build man page.
 
   --with-parens-nest-limit=500
 
-. PCRE2 has a counter that can be set to limit the amount of resources it uses
-  when matching a pattern. If the limit is exceeded during a match, the match
-  fails. The default is ten million. You can change the default by setting, for
-  example,
+. PCRE2 has a counter that can be set to limit the amount of computing resource
+  it uses when matching a pattern. If the limit is exceeded during a match, the
+  match fails. The default is ten million. You can change the default by
+  setting, for example,
 
   --with-match-limit=500000
 
   on the "configure" command. This is just the default; individual calls to
-  pcre2_match() can supply their own value. There is more discussion on the
-  pcre2api man page.
+  pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
+  discussion in the pcre2api man page (search for pcre2_set_match_limit).
+
+. There is a separate counter that limits the depth of nested backtracking
+  during a matching process, which indirectly limits the amount of heap memory
+  that is used. This also has a default of ten million, which is essentially
+  "unlimited". You can change the default by setting, for example,
+
+  --with-match-limit-depth=5000
 
-. There is a separate counter that limits the depth of recursive function calls
-  during a matching process. This also has a default of ten million, which is
-  essentially "unlimited". You can change the default by setting, for example,
+  There is more discussion in the pcre2api man page (search for
+  pcre2_set_depth_limit).
 
-  --with-match-limit-recursion=500000
+. You can also set an explicit limit on the amount of heap memory used by
+  the pcre2_match() interpreter:
 
-  Recursive function calls use up the runtime stack; running out of stack can
-  cause programs to crash in strange ways. There is a discussion about stack
-  sizes in the pcre2stack man page.
+  --with-heap-limit=500
+
+  The units are kilobytes. This limit does not apply when the JIT optimization
+  (which has its own memory control features) is used. There is more discussion
+  on the pcre2api man page (search for pcre2_set_heap_limit).
 
 . In the 8-bit library, the default maximum compiled pattern size is around
-  64K. You can increase this by adding --with-link-size=3 to the "configure"
-  command. PCRE2 then uses three bytes instead of two for offsets to different
-  parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
-  the same as --with-link-size=4, which (in both libraries) uses four-byte
-  offsets. Increasing the internal link size reduces performance in the 8-bit
-  and 16-bit libraries. In the 32-bit library, the link size setting is
-  ignored, as 4-byte offsets are always used.
-
-. You can build PCRE2 so that its internal match() function that is called from
-  pcre2_match() does not call itself recursively. Instead, it uses memory
-  blocks obtained from the heap to save data that would otherwise be saved on
-  the stack. To build PCRE2 like this, use
-
-  --disable-stack-for-recursion
-
-  on the "configure" command. PCRE2 runs more slowly in this mode, but it may
-  be necessary in environments with limited stack sizes. This applies only to
-  the normal execution of the pcre2_match() function; if JIT support is being
-  successfully used, it is not relevant. Equally, it does not apply to
-  pcre2_dfa_match(), which does not use deeply nested recursion. There is a
-  discussion about stack sizes in the pcre2stack man page.
+  64K bytes. You can increase this by adding --with-link-size=3 to the
+  "configure" command. PCRE2 then uses three bytes instead of two for offsets
+  to different parts of the compiled pattern. In the 16-bit library,
+  --with-link-size=3 is the same as --with-link-size=4, which (in both
+  libraries) uses four-byte offsets. Increasing the internal link size reduces
+  performance in the 8-bit and 16-bit libraries. In the 32-bit library, the
+  link size setting is ignored, as 4-byte offsets are always used.
 
 . For speed, PCRE2 uses four tables for manipulating and identifying characters
   whose code point values are less than 256. By default, it uses a set of
@@ -339,12 +330,23 @@ library. They are also documented in the pcre2build man page.
 
   Of course, the relevant libraries must be installed on your system.
 
-. The default size (in bytes) of the internal buffer used by pcre2grep can be
-  set by, for example:
+. The default starting size (in bytes) of the internal buffer used by pcre2grep
+  can be set by, for example:
 
   --with-pcre2grep-bufsize=51200
 
-  The value must be a plain integer. The default is 20480.
+  The value must be a plain integer. The default is 20480. The amount of memory
+  used by pcre2grep is actually three times this number, to allow for "before"
+  and "after" lines. If very long lines are encountered, the buffer is
+  automatically enlarged, up to a fixed maximum size.
+
+. The default maximum size of pcre2grep's internal buffer can be set by, for
+  example:
+
+  --with-pcre2grep-max-bufsize=2097152
+
+  The default is either 1048576 or the value of --with-pcre2grep-bufsize,
+  whichever is the larger.
 
 . It is possible to compile pcre2test so that it links with the libreadline
   or libedit libraries, by specifying, respectively,
@@ -369,6 +371,29 @@ library. They are also documented in the pcre2build man page.
   tgetflag, or tgoto, this is the problem, and linking with the ncurses library
   should fix it.
 
+. There is a special option called --enable-fuzz-support for use by people who
+  want to run fuzzing tests on PCRE2. At present this applies only to the 8-bit
+  library. If set, it causes an extra library called libpcre2-fuzzsupport.a to
+  be built, but not installed. This contains a single function called
+  LLVMFuzzerTestOneInput() whose arguments are a pointer to a string and the
+  length of the string. When called, this function tries to compile the string
+  as a pattern, and if that succeeds, to match it. This is done both with no
+  options and with some random option bits that are generated from the string.
+  Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
+  be created. This is normally run under valgrind or used when PCRE2 is
+  compiled with address sanitizing enabled. It calls the fuzzing function and
+  outputs information about what it is doing. The input strings are specified by
+  arguments: if an argument starts with "=" the rest of it is a literal input
+  string. Otherwise, it is assumed to be a file name, and the contents of the
+  file are the test string.
+
+. Releases before 10.30 could be compiled with --disable-stack-for-recursion,
+  which caused pcre2_match() to use individual blocks on the heap for
+  backtracking instead of recursive function calls (which use the stack). This
+  is now obsolete since pcre2_match() was refactored always to use the heap (in
+  a much more efficient way than before). This option is retained for backwards
+  compatibility, but has no effect other than to output a warning.
+
 The "configure" script builds the following files for the basic C library:
 
 . Makefile             the makefile that builds the library
@@ -543,7 +568,7 @@ script creates the .txt and HTML forms of the documentation from the man pages.
 
 
 Testing PCRE2
-------------
+-------------
 
 To test the basic PCRE2 library on a Unix-like system, run the RunTest script.
 There is another script called RunGrepTest that tests the pcre2grep command.
@@ -635,32 +660,43 @@ with the perltest.sh script, and test 5 checking PCRE2-specific things.
 Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
 non-UTF mode and UTF-mode with Unicode property support, respectively.
 
-Test 8 checks some internal offsets and code size features; it is run only when
-the default "link size" of 2 is set (in other cases the sizes change) and when
-Unicode support is enabled.
+Test 8 checks some internal offsets and code size features, but it is run only
+when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
+32-bit modes and for different link sizes, so there are different output files
+for each mode and link size.
 
 Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
 16-bit and 32-bit modes. These are tests that generate different output in
 8-bit mode. Each pair are for general cases and Unicode support, respectively.
+
 Test 13 checks the handling of non-UTF characters greater than 255 by
 pcre2_dfa_match() in 16-bit and 32-bit modes.
 
-Test 14 contains a number of tests that must not be run with JIT. They check,
+Test 14 contains some special UTF and UCP tests that give different output for
+different code unit widths.
+
+Test 15 contains a number of tests that must not be run with JIT. They check,
 among other non-JIT things, the match-limiting features of the interpretive
 matcher.
 
-Test 15 is run only when JIT support is not available. It checks that an
+Test 16 is run only when JIT support is not available. It checks that an
 attempt to use JIT has the expected behaviour.
 
-Test 16 is run only when JIT support is available. It checks JIT complete and
+Test 17 is run only when JIT support is available. It checks JIT complete and
 partial modes, match-limiting under JIT, and other JIT-specific features.
 
-Tests 17 and 18 are run only in 8-bit mode. They check the POSIX interface to
+Tests 18 and 19 are run only in 8-bit mode. They check the POSIX interface to
 the 8-bit library, without and with Unicode support, respectively.
 
-Test 19 checks the serialization functions by writing a set of compiled
+Test 20 checks the serialization functions by writing a set of compiled
 patterns to a file, and then reloading and checking them.
 
+Tests 21 and 22 test \C support when the use of \C is not locked out, without
+and with UTF support, respectively. Test 23 tests \C when it is locked out.
+
+Tests 24 and 25 test the experimental pattern conversion functions, without and
+with UTF support, respectively.
+
 
 Character tables
 ----------------
@@ -679,7 +715,7 @@ specified for ./configure, a different version of pcre2_chartables.c is built
 by the program dftables (compiled from dftables.c), which uses the ANSI C
 character handling functions such as isalnum(), isalpha(), isupper(),
 islower(), etc. to build the table sources. This means that the default C
-locale which is set for your system will control the contents of these default
+locale that is set for your system will control the contents of these default
 tables. You can change the default tables by editing pcre2_chartables.c and
 then re-building PCRE2. If you do this, you should take care to ensure that the
 file does not get automatically re-generated. The best way to do this is to
@@ -734,8 +770,10 @@ The distribution should contain the files listed below.
   src/pcre2_compile.c      )
   src/pcre2_config.c       )
   src/pcre2_context.c      )
+  src/pcre2_convert.c      )
   src/pcre2_dfa_match.c    )
   src/pcre2_error.c        )
+  src/pcre2_extuni.c       )
   src/pcre2_find_bracket.c )
   src/pcre2_jit_compile.c  )
   src/pcre2_jit_match.c    ) sources for the functions in the library,
@@ -757,6 +795,7 @@ The distribution should contain the files listed below.
   src/pcre2_xclass.c       )
 
   src/pcre2_printint.c     debugging function that is used by pcre2test,
+  src/pcre2_fuzzsupport.c  function for (optional) fuzzing support
 
   src/config.h.in          template for config.h, when built by "configure"
   src/pcre2.h.in           template for pcre2.h when built by "configure"
@@ -772,7 +811,6 @@ The distribution should contain the files listed below.
   src/pcre2demo.c          simple demonstration of coding calls to PCRE2
   src/pcre2grep.c          source of a grep utility that uses PCRE2
   src/pcre2test.c          comprehensive test program
-  src/pcre2_printint.c     part of pcre2test
   src/pcre2_jit_test.c     JIT test program
 
 (C) Auxiliary files:
@@ -814,7 +852,7 @@ The distribution should contain the files listed below.
   libpcre2-8.pc.in         template for libpcre2-8.pc for pkg-config
   libpcre2-16.pc.in        template for libpcre2-16.pc for pkg-config
   libpcre2-32.pc.in        template for libpcre2-32.pc for pkg-config
-  libpcre2posix.pc.in      template for libpcre2posix.pc for pkg-config
+  libpcre2-posix.pc.in     template for libpcre2-posix.pc for pkg-config
   ltmain.sh                file used to build a libtool script
   missing                  ) common stub for a few missing GNU programs while
                            )   installing, generated by automake
@@ -837,12 +875,12 @@ The distribution should contain the files listed below.
 
 (E) Auxiliary files for building PCRE2 "by hand"
 
-  pcre2.h.generic         ) a version of the public PCRE2 header file
+  src/pcre2.h.generic     ) a version of the public PCRE2 header file
                           )   for use in non-"configure" environments
-  config.h.generic        ) a version of config.h for use in non-"configure"
+  src/config.h.generic    ) a version of config.h for use in non-"configure"
                           )   environments
 
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 01 April 2016
+Last updated: 12 September 2017
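
The README hunks above describe how the pcre2grep buffer settings relate to one another: memory use is three times the starting buffer size, and the default maximum is the larger of 1048576 and the starting size. The arithmetic can be sketched as follows. This is an illustration only, not part of the PCRE2 build system; the function and constant names are hypothetical, and only the numeric values come from the README text.

```python
DEFAULT_BUFSIZE = 20480       # README default for --with-pcre2grep-bufsize
DEFAULT_MAX_FLOOR = 1048576   # README lower bound for the default maximum


def pcre2grep_buffer_defaults(bufsize=DEFAULT_BUFSIZE, max_bufsize=None):
    """Return the effective buffer settings (hypothetical helper).

    bufsize     -- value given to --with-pcre2grep-bufsize
    max_bufsize -- value given to --with-pcre2grep-max-bufsize, or None
    """
    if max_bufsize is None:
        # The default maximum is whichever is larger: 1048576 or bufsize.
        max_bufsize = max(DEFAULT_MAX_FLOOR, bufsize)
    return {
        "bufsize": bufsize,
        "max_bufsize": max_bufsize,
        # Three buffers are needed to allow for "before" and "after" lines.
        "initial_memory": 3 * bufsize,
    }


if __name__ == "__main__":
    print(pcre2grep_buffer_defaults())
    print(pcre2grep_buffer_defaults(bufsize=51200))
```

With the README's example of --with-pcre2grep-bufsize=51200, the starting memory use comes to 153600 bytes, while the default maximum buffer size stays at 1048576.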
diff --git a/doc/html/index.html b/doc/html/index.html
index 703c298..b9393d9 100644
--- a/doc/html/index.html
+++ b/doc/html/index.html
@@ -35,6 +35,9 @@ first.
 <tr><td><a href="pcre2compat.html">pcre2compat</a></td>
     <td>&nbsp;&nbsp;Compatibility with Perl</td></tr>
 
+<tr><td><a href="pcre2convert.html">pcre2convert</a></td>
+    <td>&nbsp;&nbsp;Experimental foreign pattern conversion functions</td></tr>
+
 <tr><td><a href="pcre2demo.html">pcre2demo</a></td>
     <td>&nbsp;&nbsp;A demonstration C program that uses the PCRE2 library</td></tr>
 
@@ -68,9 +71,6 @@ first.
 <tr><td><a href="pcre2serialize.html">pcre2serialize</a></td>
     <td>&nbsp;&nbsp;Serializing functions for saving precompiled patterns</td></tr>
 
-<tr><td><a href="pcre2stack.html">pcre2stack</a></td>
-    <td>&nbsp;&nbsp;Discussion of PCRE2's stack usage</td></tr>
-
 <tr><td><a href="pcre2syntax.html">pcre2syntax</a></td>
     <td>&nbsp;&nbsp;Syntax quick-reference summary</td></tr>
 
@@ -94,6 +94,9 @@ in the library.
 <tr><td><a href="pcre2_code_copy.html">pcre2_code_copy</a></td>
     <td>&nbsp;&nbsp;Copy a compiled pattern</td></tr>
 
+<tr><td><a href="pcre2_code_copy_with_tables.html">pcre2_code_copy_with_tables</a></td>
+    <td>&nbsp;&nbsp;Copy a compiled pattern and its character tables</td></tr>
+
 <tr><td><a href="pcre2_code_free.html">pcre2_code_free</a></td>
     <td>&nbsp;&nbsp;Free a compiled pattern</td></tr>
 
@@ -112,6 +115,18 @@ in the library.
 <tr><td><a href="pcre2_config.html">pcre2_config</a></td>
     <td>&nbsp;&nbsp;Show build-time configuration options</td></tr>
 
+<tr><td><a href="pcre2_convert_context_copy.html">pcre2_convert_context_copy</a></td>
+    <td>&nbsp;&nbsp;Copy a convert context</td></tr>
+
+<tr><td><a href="pcre2_convert_context_create.html">pcre2_convert_context_create</a></td>
+    <td>&nbsp;&nbsp;Create a convert context</td></tr>
+
+<tr><td><a href="pcre2_convert_context_free.html">pcre2_convert_context_free</a></td>
+    <td>&nbsp;&nbsp;Free a convert context</td></tr>
+
+<tr><td><a href="pcre2_converted_pattern_free.html">pcre2_converted_pattern_free</a></td>
+    <td>&nbsp;&nbsp;Free converted foreign pattern</td></tr>
+
 <tr><td><a href="pcre2_dfa_match.html">pcre2_dfa_match</a></td>
     <td>&nbsp;&nbsp;Match a compiled pattern to a subject string
     (DFA algorithm; <i>not</i> Perl compatible)</td></tr>
@@ -183,6 +198,9 @@ in the library.
 <tr><td><a href="pcre2_match_data_free.html">pcre2_match_data_free</a></td>
     <td>&nbsp;&nbsp;Free a match data block</td></tr>
 
+<tr><td><a href="pcre2_pattern_convert.html">pcre2_pattern_convert</a></td>
+    <td>&nbsp;&nbsp;Experimental foreign pattern converter</td></tr>
+
 <tr><td><a href="pcre2_pattern_info.html">pcre2_pattern_info</a></td>
     <td>&nbsp;&nbsp;Extract information about a pattern</td></tr>
 
@@ -207,9 +225,24 @@ in the library.
 <tr><td><a href="pcre2_set_character_tables.html">pcre2_set_character_tables</a></td>
     <td>&nbsp;&nbsp;Set character tables</td></tr>
 
+<tr><td><a href="pcre2_set_compile_extra_options.html">pcre2_set_compile_extra_options</a></td>
+    <td>&nbsp;&nbsp;Set compile time extra options</td></tr>
+
 <tr><td><a href="pcre2_set_compile_recursion_guard.html">pcre2_set_compile_recursion_guard</a></td>
     <td>&nbsp;&nbsp;Set up a compile recursion guard function</td></tr>
 
+<tr><td><a href="pcre2_set_depth_limit.html">pcre2_set_depth_limit</a></td>
+    <td>&nbsp;&nbsp;Set the match backtracking depth limit</td></tr>
+
+<tr><td><a href="pcre2_set_glob_escape.html">pcre2_set_glob_escape</a></td>
+    <td>&nbsp;&nbsp;Set glob escape character</td></tr>
+
+<tr><td><a href="pcre2_set_glob_separator.html">pcre2_set_glob_separator</a></td>
+    <td>&nbsp;&nbsp;Set glob separator character</td></tr>
+
+<tr><td><a href="pcre2_set_heap_limit.html">pcre2_set_heap_limit</a></td>
+    <td>&nbsp;&nbsp;Set the match backtracking heap limit</td></tr>
+
 <tr><td><a href="pcre2_set_match_limit.html">pcre2_set_match_limit</a></td>
     <td>&nbsp;&nbsp;Set the match limit</td></tr>
 
@@ -226,10 +259,10 @@ in the library.
     <td>&nbsp;&nbsp;Set the parentheses nesting limit</td></tr>
 
 <tr><td><a href="pcre2_set_recursion_limit.html">pcre2_set_recursion_limit</a></td>
-    <td>&nbsp;&nbsp;Set the match recursion limit</td></tr>
+    <td>&nbsp;&nbsp;Obsolete: use pcre2_set_depth_limit</td></tr>
 
 <tr><td><a href="pcre2_set_recursion_memory_management.html">pcre2_set_recursion_memory_management</a></td>
-    <td>&nbsp;&nbsp;Set match recursion memory management</td></tr>
+    <td>&nbsp;&nbsp;Obsolete function that (from 10.30 onwards) does nothing</td></tr>
 
 <tr><td><a href="pcre2_substitute.html">pcre2_substitute</a></td>
     <td>&nbsp;&nbsp;Match a compiled pattern to a subject string and do
diff --git a/doc/html/pcre2.html b/doc/html/pcre2.html
index 07ab8e9..b61c579 100644
--- a/doc/html/pcre2.html
+++ b/doc/html/pcre2.html
@@ -109,7 +109,7 @@ lose performance.
 One way of guarding against this possibility is to use the
 <b>pcre2_pattern_info()</b> function to check the compiled pattern's options for
 PCRE2_UTF. Alternatively, you can set the PCRE2_NEVER_UTF option when calling
-<b>pcre2_compile()</b>. This causes an compile time error if a pattern contains
+<b>pcre2_compile()</b>. This causes a compile time error if the pattern contains
 a UTF-setting sequence.
 </P>
 <P>
@@ -137,7 +137,8 @@ large search tree against a string that will never match. Nested unlimited
 repeats in a pattern are a common example. PCRE2 provides some protection
 against this: see the <b>pcre2_set_match_limit()</b> function in the
 <a href="pcre2api.html"><b>pcre2api</b></a>
-page.
+page. There is a similar function called <b>pcre2_set_depth_limit()</b> that can
+be used to restrict the amount of memory that is used.
 </P>
 <br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
 <P>
@@ -166,7 +167,6 @@ listing), and the short pages for individual functions, are concatenated in
   pcre2perform       discussion of performance issues
   pcre2posix         the POSIX-compatible C API for the 8-bit library
   pcre2sample        discussion of the pcre2demo program
-  pcre2stack         discussion of stack usage
   pcre2syntax        quick syntax reference
   pcre2test          description of the <b>pcre2test</b> command
   pcre2unicode       discussion of Unicode and UTF support
@@ -189,9 +189,9 @@ use my two initials, followed by the two digits 10, at the domain cam.ac.uk.
 </P>
 <br><a name="SEC5" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 16 October 2015
+Last updated: 01 April 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2_callout_enumerate.html b/doc/html/pcre2_callout_enumerate.html
index 6c2cdb8..505ea7b 100644
--- a/doc/html/pcre2_callout_enumerate.html
+++ b/doc/html/pcre2_callout_enumerate.html
@@ -36,20 +36,21 @@ for success and non-zero otherwise. The arguments are:
   <i>callout_data</i>   User data that is passed to the callback
 </pre>
 The <i>callback()</i> function is passed a pointer to a data block containing
-the following fields:
+the following fields (not necessarily in this order):
 <pre>
-  <i>version</i>                Block version number
-  <i>pattern_position</i>       Offset to next item in pattern
-  <i>next_item_length</i>       Length of next item in pattern
-  <i>callout_number</i>         Number for numbered callouts
-  <i>callout_string_offset</i>  Offset to string within pattern
-  <i>callout_string_length</i>  Length of callout string
-  <i>callout_string</i>         Points to callout string or is NULL
+  uint32_t   <i>version</i>                Block version number
+  uint32_t   <i>callout_number</i>         Number for numbered callouts
+  PCRE2_SIZE <i>pattern_position</i>       Offset to next item in pattern
+  PCRE2_SIZE <i>next_item_length</i>       Length of next item in pattern
+  PCRE2_SIZE <i>callout_string_offset</i>  Offset to string within pattern
+  PCRE2_SIZE <i>callout_string_length</i>  Length of callout string
+  PCRE2_SPTR <i>callout_string</i>         Points to callout string or is NULL
 </pre>
-The second argument is the callout data that was passed to
-<b>pcre2_callout_enumerate()</b>. The <b>callback()</b> function must return zero
-for success. Any other value causes the pattern scan to stop, with the value
-being passed back as the result of <b>pcre2_callout_enumerate()</b>.
+The second argument passed to the <b>callback()</b> function is the callout data
+that was passed to <b>pcre2_callout_enumerate()</b>. The <b>callback()</b>
+function must return zero for success. Any other value causes the pattern scan
+to stop, with the value being passed back as the result of
+<b>pcre2_callout_enumerate()</b>.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_code_copy.html b/doc/html/pcre2_code_copy.html
index 5b68282..667d7b7 100644
--- a/doc/html/pcre2_code_copy.html
+++ b/doc/html/pcre2_code_copy.html
@@ -28,8 +28,9 @@ DESCRIPTION
 This function makes a copy of the memory used for a compiled pattern, excluding
 any memory used by the JIT compiler. Without a subsequent call to
 <b>pcre2_jit_compile()</b>, the copy can be used only for non-JIT matching. The
-yield of the function is NULL if <i>code</i> is NULL or if sufficient memory
-cannot be obtained.
+pointer to the character tables is copied, not the tables themselves (see
+<b>pcre2_code_copy_with_tables()</b>). The yield of the function is NULL if
+<i>code</i> is NULL or if sufficient memory cannot be obtained.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_code_copy_with_tables.html b/doc/html/pcre2_code_copy_with_tables.html
new file mode 100644
index 0000000..67b2e1f
--- /dev/null
+++ b/doc/html/pcre2_code_copy_with_tables.html
@@ -0,0 +1,44 @@
+<html>
+<head>
+<title>pcre2_code_copy_with_tables specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_code_copy_with_tables man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *<i>code</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function makes a copy of the memory used for a compiled pattern, excluding
+any memory used by the JIT compiler. Without a subsequent call to
+<b>pcre2_jit_compile()</b>, the copy can be used only for non-JIT matching.
+Unlike <b>pcre2_code_copy()</b>, a separate copy of the character tables is also
+made, with the new code pointing to it. This memory will be automatically freed
+when <b>pcre2_code_free()</b> is called. The yield of the function is NULL if
+<i>code</i> is NULL or if sufficient memory cannot be obtained.
+</P>
+<P>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_code_free.html b/doc/html/pcre2_code_free.html
index 0477abe..5fce3c5 100644
--- a/doc/html/pcre2_code_free.html
+++ b/doc/html/pcre2_code_free.html
@@ -26,7 +26,9 @@ DESCRIPTION
 </b><br>
 <P>
 This function frees the memory used for a compiled pattern, including any
-memory used by the JIT compiler.
+memory used by the JIT compiler. If the compiled pattern was created by a call
+to <b>pcre2_code_copy_with_tables()</b>, the memory for the character tables is
+also freed.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_compile.html b/doc/html/pcre2_compile.html
index 544f4fe..0a9eafa 100644
--- a/doc/html/pcre2_compile.html
+++ b/doc/html/pcre2_compile.html
@@ -37,26 +37,34 @@ arguments are:
   <i>erroffset</i>     Where to put an error offset
   <i>ccontext</i>      Pointer to a compile context or NULL
 </pre>
-The length of the string and any error offset that is returned are in code
-units, not characters. A compile context is needed only if you want to change
+The length of the pattern and any error offset that is returned are in code
+units, not characters. A compile context is needed only if you want to provide
+custom memory allocation functions, or to provide an external function for
+system stack size checking, or to change one or more of these parameters:
 <pre>
-  What \R matches (Unicode newlines or CR, LF, CRLF only)
-  PCRE2's character tables
-  The newline character sequence
-  The compile time nested parentheses limit
+  What \R matches (Unicode newlines, or CR, LF, CRLF only);
+  PCRE2's character tables;
+  The newline character sequence;
+  The compile time nested parentheses limit;
+  The maximum pattern length (in code units) that is allowed;
+  The additional options bits (see pcre2_set_compile_extra_options()).
 </pre>
-or provide an external function for stack size checking. The option bits are:
+The option bits are:
 <pre>
   PCRE2_ANCHORED           Force pattern anchoring
+  PCRE2_ALLOW_EMPTY_CLASS  Allow empty classes
   PCRE2_ALT_BSUX           Alternative handling of \u, \U, and \x
   PCRE2_ALT_CIRCUMFLEX     Alternative handling of ^ in multiline mode
+  PCRE2_ALT_VERBNAMES      Process backslashes in verb names
   PCRE2_AUTO_CALLOUT       Compile automatic callouts
   PCRE2_CASELESS           Do caseless matching
   PCRE2_DOLLAR_ENDONLY     $ not to match newline at end
   PCRE2_DOTALL             . matches anything including NL
   PCRE2_DUPNAMES           Allow duplicate names for subpatterns
+  PCRE2_ENDANCHORED        Pattern can match only at end of subject
   PCRE2_EXTENDED           Ignore white space and # comments
   PCRE2_FIRSTLINE          Force matching to be before newline
+  PCRE2_LITERAL            Pattern characters are all literal
   PCRE2_MATCH_UNSET_BACKREF  Match unset back references
   PCRE2_MULTILINE          ^ and $ match newlines within data
   PCRE2_NEVER_BACKSLASH_C  Lock out the use of \C in patterns
@@ -71,19 +79,21 @@ or provide an external function for stack size checking. The option bits are:
                              (only relevant if PCRE2_UTF is set)
   PCRE2_UCP                Use Unicode properties for \d, \w, etc.
   PCRE2_UNGREEDY           Invert greediness of quantifiers
+  PCRE2_USE_OFFSET_LIMIT   Enable offset limit for unanchored matching
   PCRE2_UTF                Treat pattern and subjects as UTF strings
 </pre>
-PCRE2 must be built with Unicode support in order to use PCRE2_UTF, PCRE2_UCP
-and related options.
+PCRE2 must be built with Unicode support (the default) in order to use
+PCRE2_UTF, PCRE2_UCP and related options.
 </P>
 <P>
 The yield of the function is a pointer to a private data structure that
 contains the compiled pattern, or NULL if an error was detected.
 </P>
 <P>
-There is a complete description of the PCRE2 native API in the
+There is a complete description of the PCRE2 native API, with more detail on
+each option, in the
 <a href="pcre2api.html"><b>pcre2api</b></a>
-page and a description of the POSIX API in the
+page, and a description of the POSIX API in the
 <a href="pcre2posix.html"><b>pcre2posix</b></a>
 page.
 <p>
diff --git a/doc/html/pcre2_config.html b/doc/html/pcre2_config.html
index a51b0c7..f05bd06 100644
--- a/doc/html/pcre2_config.html
+++ b/doc/html/pcre2_config.html
@@ -45,24 +45,25 @@ point to a uint32_t integer variable. The available codes are:
   PCRE2_CONFIG_BSR             Indicates what \R matches by default:
                                  PCRE2_BSR_UNICODE
                                  PCRE2_BSR_ANYCRLF
-  PCRE2_CONFIG_JIT             Availability of just-in-time compiler
-                                support (1=yes 0=no)
-  PCRE2_CONFIG_JITTARGET       Information about the target archi-
-                                 tecture for the JIT compiler
+  PCRE2_CONFIG_COMPILED_WIDTHS Which of 8/16/32 support was compiled
+  PCRE2_CONFIG_DEPTHLIMIT      Default backtracking depth limit
+  PCRE2_CONFIG_HEAPLIMIT       Default heap memory limit
+  PCRE2_CONFIG_JIT             Availability of just-in-time compiler support (1=yes 0=no)
+  PCRE2_CONFIG_JITTARGET       Information (a string) about the target architecture for the JIT compiler
   PCRE2_CONFIG_LINKSIZE        Configured internal link size (2, 3, 4)
   PCRE2_CONFIG_MATCHLIMIT      Default internal resource limit
+  PCRE2_CONFIG_NEVER_BACKSLASH_C  Whether or not \C is disabled
   PCRE2_CONFIG_NEWLINE         Code for the default newline sequence:
                                  PCRE2_NEWLINE_CR
                                  PCRE2_NEWLINE_LF
                                  PCRE2_NEWLINE_CRLF
                                  PCRE2_NEWLINE_ANY
                                  PCRE2_NEWLINE_ANYCRLF
+                                 PCRE2_NEWLINE_NUL
   PCRE2_CONFIG_PARENSLIMIT     Default parentheses nesting limit
-  PCRE2_CONFIG_RECURSIONLIMIT  Internal recursion depth limit
-  PCRE2_CONFIG_STACKRECURSE    Recursion implementation (1=stack
-                                 0=heap)
-  PCRE2_CONFIG_UNICODE         Availability of Unicode support (1=yes
-                                 0=no)
+  PCRE2_CONFIG_RECURSIONLIMIT  Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
+  PCRE2_CONFIG_STACKRECURSE    Obsolete: always returns 0
+  PCRE2_CONFIG_UNICODE         Availability of Unicode support (1=yes 0=no)
   PCRE2_CONFIG_UNICODE_VERSION The Unicode version (a string)
   PCRE2_CONFIG_VERSION         The PCRE2 version (a string)
 </pre>
diff --git a/doc/html/pcre2_convert_context_copy.html b/doc/html/pcre2_convert_context_copy.html
new file mode 100644
index 0000000..3c44ac6
--- /dev/null
+++ b/doc/html/pcre2_convert_context_copy.html
@@ -0,0 +1,40 @@
+<html>
+<head>
+<title>pcre2_convert_context_copy specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_convert_context_copy man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
+<b>  pcre2_convert_context *<i>cvcontext</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It makes a new copy of a convert context, using the memory allocation function
+that was used for the original context. The result is NULL if the memory cannot
+be obtained.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_convert_context_create.html b/doc/html/pcre2_convert_context_create.html
new file mode 100644
index 0000000..2564780
--- /dev/null
+++ b/doc/html/pcre2_convert_context_create.html
@@ -0,0 +1,41 @@
+<html>
+<head>
+<title>pcre2_convert_context_create specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_convert_context_create man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>pcre2_convert_context *pcre2_convert_context_create(</b>
+<b>  pcre2_general_context *<i>gcontext</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It creates and initializes a new convert context. If its argument is
+NULL, <b>malloc()</b> is used to get the necessary memory; otherwise the memory
+allocation function within the general context is used. The result is NULL if
+the memory could not be obtained.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_convert_context_free.html b/doc/html/pcre2_convert_context_free.html
new file mode 100644
index 0000000..ab6db6c
--- /dev/null
+++ b/doc/html/pcre2_convert_context_free.html
@@ -0,0 +1,39 @@
+<html>
+<head>
+<title>pcre2_convert_context_free specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_convert_context_free man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It frees the memory occupied by a convert context, using the memory
+freeing function from the general context with which it was created, or
+<b>free()</b> if that was not set.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_converted_pattern_free.html b/doc/html/pcre2_converted_pattern_free.html
new file mode 100644
index 0000000..11adefd
--- /dev/null
+++ b/doc/html/pcre2_converted_pattern_free.html
@@ -0,0 +1,39 @@
+<html>
+<head>
+<title>pcre2_converted_pattern_free specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_converted_pattern_free man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It frees the memory occupied by a converted pattern that was obtained by
+calling <b>pcre2_pattern_convert()</b> with arguments that caused it to place
+the converted pattern into newly obtained heap memory.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_dfa_match.html b/doc/html/pcre2_dfa_match.html
index e137a14..36d7976 100644
--- a/doc/html/pcre2_dfa_match.html
+++ b/doc/html/pcre2_dfa_match.html
@@ -31,8 +31,9 @@ DESCRIPTION
 <P>
 This function matches a compiled regular expression against a given subject
 string, using an alternative matching algorithm that scans the subject string
-just once (<i>not</i> Perl-compatible). (The Perl-compatible matching function
-is <b>pcre2_match()</b>.) The arguments for this function are:
+just once (except when processing lookaround assertions). This function is
+<i>not</i> Perl-compatible (the Perl-compatible matching function is
+<b>pcre2_match()</b>). The arguments for this function are:
 <pre>
   <i>code</i>         Points to the compiled pattern
   <i>subject</i>      Points to the subject string
@@ -45,22 +46,20 @@ is <b>pcre2_match()</b>.) The arguments for this function are:
   <i>wscount</i>      Number of elements in the vector
 </pre>
 For <b>pcre2_dfa_match()</b>, a match context is needed only if you want to set
-up a callout function. The <i>length</i> and <i>startoffset</i> values are code
-units, not characters. The options are:
+up a callout function or specify the match and/or the recursion depth limits.
+The <i>length</i> and <i>startoffset</i> values are code units, not characters.
+The options are:
 <pre>
   PCRE2_ANCHORED          Match only at the first position
+  PCRE2_ENDANCHORED       Pattern can match only at end of subject
   PCRE2_NOTBOL            Subject is not the beginning of a line
   PCRE2_NOTEOL            Subject is not the end of a line
   PCRE2_NOTEMPTY          An empty string is not a valid match
-  PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject
-                           is not a valid match
-  PCRE2_NO_UTF_CHECK      Do not check the subject for UTF
-                           validity (only relevant if PCRE2_UTF
+  PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject is not a valid match
+  PCRE2_NO_UTF_CHECK      Do not check the subject for UTF validity (only relevant if PCRE2_UTF
                            was set at compile time)
-  PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial
-                            match if no full matches are found
-  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match
-                           even if there is a full match as well
+  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match even if there is a full match
+  PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial match if no full matches are found
   PCRE2_DFA_RESTART       Restart after a partial match
   PCRE2_DFA_SHORTEST      Return only the shortest match
 </pre>
diff --git a/doc/html/pcre2_get_error_message.html b/doc/html/pcre2_get_error_message.html
index 26c80fe..7005760 100644
--- a/doc/html/pcre2_get_error_message.html
+++ b/doc/html/pcre2_get_error_message.html
@@ -34,11 +34,11 @@ errors are negative numbers. The arguments are:
   <i>buffer</i>      where to put the message
   <i>bufflen</i>     the length of the buffer (code units)
 </pre>
-The function returns the length of the message, excluding the trailing zero, or
-the negative error code PCRE2_ERROR_NOMEMORY if the buffer is too small. In
-this case, the returned message is truncated (but still with a trailing zero).
-If <i>errorcode</i> does not contain a recognized error code number, the
-negative value PCRE2_ERROR_BADDATA is returned.
+The function returns the length of the message in code units, excluding the
+trailing zero, or the negative error code PCRE2_ERROR_NOMEMORY if the buffer is
+too small. In this case, the returned message is truncated (but still with a
+trailing zero). If <i>errorcode</i> does not contain a recognized error code
+number, the negative value PCRE2_ERROR_BADDATA is returned.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_get_mark.html b/doc/html/pcre2_get_mark.html
index f8e50e3..88e6326 100644
--- a/doc/html/pcre2_get_mark.html
+++ b/doc/html/pcre2_get_mark.html
@@ -26,11 +26,15 @@ DESCRIPTION
 </b><br>
 <P>
 After a call of <b>pcre2_match()</b> that was passed the match block that is
-this function's argument, this function returns a pointer to the last (*MARK)
-name that was encountered. The name is zero-terminated, and is within the
-compiled pattern. If no (*MARK) name is available, NULL is returned. A (*MARK)
-name may be available after a failed match or a partial match, as well as after
-a successful one.
+this function's argument, this function returns a pointer to the last (*MARK),
+(*PRUNE), or (*THEN) name that was encountered during the matching process. The
+name is zero-terminated, and is within the compiled pattern. The length of the
+name is in the preceding code unit. If no name is available, NULL is returned.
+</P>
+<P>
+After a successful match, the name that is returned is the last one on the
+matching path. After a failed match or a partial match, the last encountered
+name is returned.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_jit_stack_create.html b/doc/html/pcre2_jit_stack_create.html
index a668e34..7c89c31 100644
--- a/doc/html/pcre2_jit_stack_create.html
+++ b/doc/html/pcre2_jit_stack_create.html
@@ -32,10 +32,9 @@ maximum size to which it is allowed to grow. The final argument is a general
 context, for memory allocation functions, or NULL for standard memory
 allocation. The result can be passed to the JIT run-time code by calling
 <b>pcre2_jit_stack_assign()</b> to associate the stack with a compiled pattern,
-which can then be processed by <b>pcre2_match()</b>. If the "fast path" JIT
-matcher, <b>pcre2_jit_match()</b> is used, the stack can be passed directly as
-an argument. A maximum stack size of 512K to 1M should be more than enough for
-any pattern. For more details, see the
+which can then be processed by <b>pcre2_match()</b> or <b>pcre2_jit_match()</b>.
+A maximum stack size of 512K to 1M should be more than enough for any pattern.
+For more details, see the
 <a href="pcre2jit.html"><b>pcre2jit</b></a>
 page.
 </P>
diff --git a/doc/html/pcre2_maketables.html b/doc/html/pcre2_maketables.html
index 068e6d4..6d240e3 100644
--- a/doc/html/pcre2_maketables.html
+++ b/doc/html/pcre2_maketables.html
@@ -19,16 +19,16 @@ SYNOPSIS
 <b>#include &#60;pcre2.h&#62;</b>
 </P>
 <P>
-<b>const unsigned char *pcre2_maketables(pcre22_general_context *<i>gcontext</i>);</b>
+<b>const unsigned char *pcre2_maketables(pcre2_general_context *<i>gcontext</i>);</b>
 </P>
 <br><b>
 DESCRIPTION
 </b><br>
 <P>
-This function builds a set of character tables for character values less than
-256. These can be passed to <b>pcre2_compile()</b> in a compile context in order
-to override the internal, built-in tables (which were either defaulted or made
-by <b>pcre2_maketables()</b> when PCRE2 was compiled). See the
+This function builds a set of character tables for character code points that
+are less than 256. These can be passed to <b>pcre2_compile()</b> in a compile
+context in order to override the internal, built-in tables (which were either
+defaulted or made by <b>pcre2_maketables()</b> when PCRE2 was compiled). See the
 <a href="pcre2_set_character_tables.html"><b>pcre2_set_character_tables()</b></a>
 page. You might want to do this if you are using a non-standard locale.
 </P>
diff --git a/doc/html/pcre2_match.html b/doc/html/pcre2_match.html
index 0e389eb..ced70bb 100644
--- a/doc/html/pcre2_match.html
+++ b/doc/html/pcre2_match.html
@@ -30,7 +30,13 @@ DESCRIPTION
 <P>
 This function matches a compiled regular expression against a given subject
 string, using a matching algorithm that is similar to Perl's. It returns
-offsets to captured substrings. Its arguments are:
+offsets to what it has matched and to captured substrings via the
+<b>match_data</b> block, which can be processed by functions with names that
+start with <b>pcre2_get_ovector_...()</b> or <b>pcre2_substring_...()</b>. The
+return from <b>pcre2_match()</b> is one more than the highest numbered capturing
+pair that has been set (for example, 1 if there are no captures), zero if the
+vector of offsets is too small, or a negative error code for no match and other
+errors. The function arguments are:
 <pre>
   <i>code</i>         Points to the compiled pattern
   <i>subject</i>      Points to the subject string
@@ -43,26 +49,27 @@ offsets to captured substrings. Its arguments are:
 A match context is needed only if you want to:
 <pre>
   Set up a callout function
-  Change the limit for calling the internal function <i>match()</i>
-  Change the limit for calling <i>match()</i> recursively
-  Set custom memory management when the heap is used for recursion
+  Set a matching offset limit
+  Change the heap memory limit
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management specifically for the match
 </pre>
 The <i>length</i> and <i>startoffset</i> values are code
-units, not characters. The options are:
+units, not characters. The length may be given as PCRE2_ZERO_TERMINATED for a
+subject that is terminated by a binary zero code unit. The options are:
 <pre>
   PCRE2_ANCHORED          Match only at the first position
+  PCRE2_ENDANCHORED       Pattern can match only at end of subject
   PCRE2_NOTBOL            Subject string is not the beginning of a line
   PCRE2_NOTEOL            Subject string is not the end of a line
   PCRE2_NOTEMPTY          An empty string is not a valid match
-  PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject
-                           is not a valid match
-  PCRE2_NO_UTF_CHECK      Do not check the subject for UTF
-                           validity (only relevant if PCRE2_UTF
+  PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject is not a valid match
+  PCRE2_NO_JIT            Do not use JIT matching
+  PCRE2_NO_UTF_CHECK      Do not check the subject for UTF validity (only relevant if PCRE2_UTF
                            was set at compile time)
-  PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial
-                            match if no full matches are found
-  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match
-                           if that is found before a full match
+  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match even if there is a full match
+  PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial match if no full matches are found
 </pre>
 For details of partial matching, see the
 <a href="pcre2partial.html"><b>pcre2partial</b></a>
diff --git a/doc/html/pcre2_match_data_free.html b/doc/html/pcre2_match_data_free.html
index 70e107e..840067f 100644
--- a/doc/html/pcre2_match_data_free.html
+++ b/doc/html/pcre2_match_data_free.html
@@ -26,8 +26,8 @@ DESCRIPTION
 </b><br>
 <P>
 This function frees the memory occupied by a match data block, using the memory
-freeing function from the general context with which it was created, or
-<b>free()</b> if that was not set.
+freeing function from the general context or compiled pattern with which it was
+created, or <b>free()</b> if that was not set.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_pattern_convert.html b/doc/html/pcre2_pattern_convert.html
new file mode 100644
index 0000000..2fcd7cc
--- /dev/null
+++ b/doc/html/pcre2_pattern_convert.html
@@ -0,0 +1,70 @@
+<html>
+<head>
+<title>pcre2_pattern_convert specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_pattern_convert man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
+<b>  uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
+<b>  PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It converts a foreign pattern (for example, a glob) into a PCRE2 regular
+expression pattern. Its arguments are:
+<pre>
+  <i>pattern</i>     The foreign pattern
+  <i>length</i>      The length of the input pattern or PCRE2_ZERO_TERMINATED
+  <i>options</i>     Option bits
+  <i>buffer</i>      Pointer to pointer to output buffer, or NULL
+  <i>blength</i>     Pointer to output length field
+  <i>cvcontext</i>   Pointer to a convert context or NULL
+</pre>
+The length of the converted pattern (excluding the terminating zero) is
+returned via <i>blength</i>. If <i>buffer</i> is NULL, the function just returns
+the output length. If <i>buffer</i> points to a NULL pointer, heap memory is
+obtained for the converted pattern, using the allocator in the context if
+present (or else <b>malloc()</b>), and the field pointed to by <i>buffer</i> is
+updated. If <i>buffer</i> points to a non-NULL pointer, that pointer must address a
+buffer whose size is in the variable pointed to by <i>blength</i>. On return, that
+variable is updated to the length of the converted pattern.
+</P>
+<P>
+The option bits are:
+<pre>
+  PCRE2_CONVERT_UTF                     Input is UTF
+  PCRE2_CONVERT_NO_UTF_CHECK            Do not check UTF validity
+  PCRE2_CONVERT_POSIX_BASIC             Convert POSIX basic pattern
+  PCRE2_CONVERT_POSIX_EXTENDED          Convert POSIX extended pattern
+  PCRE2_CONVERT_GLOB                    ) Convert
+  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR  )   various types
+  PCRE2_CONVERT_GLOB_NO_STARSTAR        )     of glob
+</pre>
+The return value from <b>pcre2_pattern_convert()</b> is zero on success or a
+non-zero PCRE2 error code.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_pattern_info.html b/doc/html/pcre2_pattern_info.html
index b4cd6f5..1ebf90b 100644
--- a/doc/html/pcre2_pattern_info.html
+++ b/doc/html/pcre2_pattern_info.html
@@ -27,7 +27,7 @@ DESCRIPTION
 <P>
 This function returns information about a compiled pattern. Its arguments are:
 <pre>
-  <i>code</i>     Pointer to a compiled regular expression
+  <i>code</i>     Pointer to a compiled regular expression pattern
   <i>what</i>     What information is required
   <i>where</i>    Where to put the information
 </pre>
@@ -41,27 +41,28 @@ request are as follows:
                                PCRE2_BSR_UNICODE: Unicode line endings
                                PCRE2_BSR_ANYCRLF: CR, LF, or CRLF only
   PCRE2_INFO_CAPTURECOUNT    Number of capturing subpatterns
+  PCRE2_INFO_DEPTHLIMIT      Backtracking depth limit if set, otherwise PCRE2_ERROR_UNSET
+  PCRE2_INFO_EXTRAOPTIONS    Extra options that were passed in the
+                               compile context
   PCRE2_INFO_FIRSTBITMAP     Bitmap of first code units, or NULL
   PCRE2_INFO_FIRSTCODETYPE   Type of start-of-match information
                                0 nothing set
                                1 first code unit is set
                                2 start of string or after newline
   PCRE2_INFO_FIRSTCODEUNIT   First code unit when type is 1
+  PCRE2_INFO_FRAMESIZE       Size of backtracking frame
   PCRE2_INFO_HASBACKSLASHC   Return 1 if pattern contains \C
-  PCRE2_INFO_HASCRORLF       Return 1 if explicit CR or LF matches
-                               exist in the pattern
+  PCRE2_INFO_HASCRORLF       Return 1 if explicit CR or LF matches exist in the pattern
+  PCRE2_INFO_HEAPLIMIT       Heap memory limit if set, otherwise PCRE2_ERROR_UNSET
   PCRE2_INFO_JCHANGED        Return 1 if (?J) or (?-J) was used
   PCRE2_INFO_JITSIZE         Size of JIT compiled code, or 0
   PCRE2_INFO_LASTCODETYPE    Type of must-be-present information
                                0 nothing set
                                1 code unit is set
   PCRE2_INFO_LASTCODEUNIT    Last code unit when type is 1
-  PCRE2_INFO_MATCHEMPTY      1 if the pattern can match an
-                               empty string, 0 otherwise
-  PCRE2_INFO_MATCHLIMIT      Match limit if set,
-                               otherwise PCRE2_ERROR_UNSET
-  PCRE2_INFO_MAXLOOKBEHIND   Length (in characters) of the longest
-                               lookbehind assertion
+  PCRE2_INFO_MATCHEMPTY      1 if the pattern can match an empty string, 0 otherwise
+  PCRE2_INFO_MATCHLIMIT      Match limit if set, otherwise PCRE2_ERROR_UNSET
+  PCRE2_INFO_MAXLOOKBEHIND   Length (in characters) of the longest lookbehind assertion
   PCRE2_INFO_MINLENGTH       Lower bound length of matching strings
   PCRE2_INFO_NAMECOUNT       Number of named subpatterns
   PCRE2_INFO_NAMEENTRYSIZE   Size of name table entries
@@ -72,8 +73,8 @@ request are as follows:
                                PCRE2_NEWLINE_CRLF
                                PCRE2_NEWLINE_ANY
                                PCRE2_NEWLINE_ANYCRLF
-  PCRE2_INFO_RECURSIONLIMIT  Recursion limit if set,
-                               otherwise PCRE2_ERROR_UNSET
+                               PCRE2_NEWLINE_NUL
+  PCRE2_INFO_RECURSIONLIMIT  Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
   PCRE2_INFO_SIZE            Size of compiled pattern
 </pre>
 If <i>where</i> is NULL, the function returns the amount of memory needed for
diff --git a/doc/html/pcre2_set_callout.html b/doc/html/pcre2_set_callout.html
index 635e0c2..4e7aca6 100644
--- a/doc/html/pcre2_set_callout.html
+++ b/doc/html/pcre2_set_callout.html
@@ -29,7 +29,7 @@ DESCRIPTION
 <P>
 This function sets the callout fields in a match context (the first argument).
 The second argument specifies a callout function, and the third argument is an
-opaque data time that is passed to it. The result of this function is always
+opaque data item that is passed to it. The result of this function is always
 zero.
 </P>
 <P>
diff --git a/doc/html/pcre2_set_compile_extra_options.html b/doc/html/pcre2_set_compile_extra_options.html
new file mode 100644
index 0000000..7374931
--- /dev/null
+++ b/doc/html/pcre2_set_compile_extra_options.html
@@ -0,0 +1,45 @@
+<html>
+<head>
+<title>pcre2_set_compile_extra_options specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_compile_extra_options man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
+<b>  uint32_t <i>extra_options</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function sets additional option bits for <b>pcre2_compile()</b> that are
+housed in a compile context. It completely replaces all the bits. The extra
+options are:
+<pre>
+  PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES  Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
+  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL    Treat an invalid escape as a literal (the character after the backslash)
+  PCRE2_EXTRA_MATCH_LINE               Pattern matches whole lines
+  PCRE2_EXTRA_MATCH_WORD               Pattern matches "words"
+</pre>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
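Note on the "completely replaces all the bits" wording in pcre2_set_compile_extra_options.html: a caller who wants to add one option to an existing set has to pass the union itself, because a second call does not accumulate. The sketch below models only that behaviour. Everything in it (toy_compile_context, toy_set_extra_options, the TOY_EXTRA_* names and their bit values) is a hypothetical stand-in and not a PCRE2 type, function, or constant; real code would call pcre2_set_compile_extra_options() with the PCRE2_EXTRA_* macros from pcre2.h.

```c
#include <stdint.h>

/* Stand-in option bits for illustration only (values are assumptions). */
#define TOY_EXTRA_MATCH_LINE 0x1u
#define TOY_EXTRA_MATCH_WORD 0x2u

/* Stand-in for a compile context. A real pcre2_compile_context is opaque,
   so the caller cannot read the stored bits back like this. */
typedef struct {
  uint32_t extra_options;
} toy_compile_context;

/* Models the documented behaviour: the new value replaces every bit.
   Returning zero is this sketch's choice. */
int toy_set_extra_options(toy_compile_context *ctx, uint32_t options)
{
  ctx->extra_options = options;
  return 0;
}
```

Since a real context is opaque, the practical pattern is to keep the wanted set in a variable of your own, OR new bits into that variable, and pass the whole value on every call.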
diff --git a/doc/html/pcre2_set_depth_limit.html b/doc/html/pcre2_set_depth_limit.html
new file mode 100644
index 0000000..a1cf706
--- /dev/null
+++ b/doc/html/pcre2_set_depth_limit.html
@@ -0,0 +1,40 @@
+<html>
+<head>
+<title>pcre2_set_depth_limit specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_depth_limit man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function sets the backtracking depth limit field in a match context. The
+result is always zero.
+</P>
+<P>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_set_glob_escape.html b/doc/html/pcre2_set_glob_escape.html
new file mode 100644
index 0000000..2b55627
--- /dev/null
+++ b/doc/html/pcre2_set_glob_escape.html
@@ -0,0 +1,43 @@
+<html>
+<head>
+<title>pcre2_set_glob_escape specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_glob_escape man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>escape_char</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It sets the escape character that is used when converting globs. The second
+argument must either be zero (meaning there is no escape character) or a
+punctuation character whose code point is less than 256. The default is grave
+accent if running under Windows, otherwise backslash. The result of the
+function is zero for success or PCRE2_ERROR_BADDATA if the second argument is
+invalid.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_set_glob_separator.html b/doc/html/pcre2_set_glob_separator.html
new file mode 100644
index 0000000..538748d
--- /dev/null
+++ b/doc/html/pcre2_set_glob_separator.html
@@ -0,0 +1,42 @@
+<html>
+<head>
+<title>pcre2_set_glob_separator specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_glob_separator man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>separator_char</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function is part of an experimental set of pattern conversion functions.
+It sets the component separator character that is used when converting globs.
+The second argument must be one of the characters forward slash, backslash, or
+dot. The default is backslash when running under Windows, otherwise forward
+slash. The result of the function is zero for success or PCRE2_ERROR_BADDATA if
+the second argument is invalid.
+</P>
+<P>
+The pattern conversion functions are described in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
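Note on the argument rules in pcre2_set_glob_escape.html and pcre2_set_glob_separator.html: both setters return PCRE2_ERROR_BADDATA for an invalid second argument. The rules as documented can be sketched as two small predicates. The function names below are hypothetical helpers written for this note and are not part of PCRE2; the library performs its own checks internally, and this sketch assumes the default "C" locale for ispunct().

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper: escape character rule as documented.
   Valid values are zero (no escape character) or a punctuation
   character whose code point is less than 256. */
int glob_escape_is_valid(uint32_t c)
{
  if (c == 0) return 1;
  if (c > 255) return 0;
  /* c is in 1..255 here, so it is a valid argument for ispunct(). */
  return ispunct((int)c) != 0;
}

/* Hypothetical helper: separator rule as documented.
   Valid values are forward slash, backslash, or dot. */
int glob_separator_is_valid(uint32_t c)
{
  return c == '/' || c == '\\' || c == '.';
}
```

A caller could run such a check before calling the real setter to report a clearer message than the generic PCRE2_ERROR_BADDATA code.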
diff --git a/doc/html/pcre2_set_heap_limit.html b/doc/html/pcre2_set_heap_limit.html
new file mode 100644
index 0000000..3631ef6
--- /dev/null
+++ b/doc/html/pcre2_set_heap_limit.html
@@ -0,0 +1,40 @@
+<html>
+<head>
+<title>pcre2_set_heap_limit specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_heap_limit man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function sets the backtracking heap limit field in a match context. The
+result is always zero.
+</P>
+<P>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_set_max_pattern_length.html b/doc/html/pcre2_set_max_pattern_length.html
new file mode 100644
index 0000000..f6e422a
--- /dev/null
+++ b/doc/html/pcre2_set_max_pattern_length.html
@@ -0,0 +1,43 @@
+<html>
+<head>
+<title>pcre2_set_max_pattern_length specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2_set_max_pattern_length man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<br><b>
+SYNOPSIS
+</b><br>
+<P>
+<b>#include &#60;pcre2.h&#62;</b>
+</P>
+<P>
+<b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
+<b>  PCRE2_SIZE <i>value</i>);</b>
+</P>
+<br><b>
+DESCRIPTION
+</b><br>
+<P>
+This function sets, in a compile context, the maximum text length (in code
+units) of the pattern that can be compiled. The result is always zero. If a
+longer pattern is passed to <b>pcre2_compile()</b> there is an immediate error
+return. The default is effectively unlimited, being the largest value a
+PCRE2_SIZE variable can hold.
+</P>
+<P>
+There is a complete description of the PCRE2 native API in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+page and a description of the POSIX API in the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+page.
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2_set_newline.html b/doc/html/pcre2_set_newline.html
index ae6332a..ba81300 100644
--- a/doc/html/pcre2_set_newline.html
+++ b/doc/html/pcre2_set_newline.html
@@ -35,6 +35,7 @@ matching patterns. The second argument must be one of:
   PCRE2_NEWLINE_CRLF      CR followed by LF only
   PCRE2_NEWLINE_ANYCRLF   Any of the above
   PCRE2_NEWLINE_ANY       Any Unicode newline sequence
+  PCRE2_NEWLINE_NUL       The NUL character (binary zero)
 </pre>
 The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
 invalid.
diff --git a/doc/html/pcre2_set_recursion_limit.html b/doc/html/pcre2_set_recursion_limit.html
index 5adcc99..9ff68c2 100644
--- a/doc/html/pcre2_set_recursion_limit.html
+++ b/doc/html/pcre2_set_recursion_limit.html
@@ -26,8 +26,8 @@ SYNOPSIS
 DESCRIPTION
 </b><br>
 <P>
-This function sets the recursion limit field in a match context. The result is
-always zero.
+This function is obsolete and should not be used in new code. Use
+<b>pcre2_set_depth_limit()</b> instead.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_set_recursion_memory_management.html b/doc/html/pcre2_set_recursion_memory_management.html
index ec18947..1e057b9 100644
--- a/doc/html/pcre2_set_recursion_memory_management.html
+++ b/doc/html/pcre2_set_recursion_memory_management.html
@@ -28,13 +28,8 @@ SYNOPSIS
 DESCRIPTION
 </b><br>
 <P>
-This function sets the match context fields for custom memory management when
-PCRE2 is compiled to use the heap instead of the system stack for recursive
-function calls while matching. When PCRE2 is compiled to use the stack (the
-default) this function does nothing. The first argument is a match context, the
-second and third specify the memory allocation and freeing functions, and the
-final argument is an opaque value that is passed to them whenever they are
-called. The result of this function is always zero.
+From release 10.30 onwards, this function is obsolete and does nothing. The
+result is always zero.
 </P>
 <P>
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/html/pcre2_substitute.html b/doc/html/pcre2_substitute.html
index 2dfd094..2215ce9 100644
--- a/doc/html/pcre2_substitute.html
+++ b/doc/html/pcre2_substitute.html
@@ -47,26 +47,30 @@ Its arguments are:
   <i>outputbuffer</i>  Points to the output buffer
   <i>outlengthptr</i>  Points to the length of the output buffer
 </pre>
-A match context is needed only if you want to:
+A match data block is needed only if you want to inspect the data from the
+match that is returned in that block. A match context is needed only if you
+want to:
 <pre>
   Set up a callout function
-  Change the limit for calling the internal function <i>match()</i>
-  Change the limit for calling <i>match()</i> recursively
-  Set custom memory management when the heap is used for recursion
+  Set a matching offset limit
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management in the match context
 </pre>
 The <i>length</i>, <i>startoffset</i> and <i>rlength</i> values are code
 units, not characters, as is the contents of the variable pointed at by
 <i>outlengthptr</i>, which is updated to the actual length of the new string.
-The options are:
+The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
+zero-terminated strings. The options are:
 <pre>
   PCRE2_ANCHORED             Match only at the first position
+  PCRE2_ENDANCHORED          Pattern can match only at end of subject
   PCRE2_NOTBOL               Subject is not the beginning of a line
   PCRE2_NOTEOL               Subject is not the end of a line
   PCRE2_NOTEMPTY             An empty string is not a valid match
-  PCRE2_NOTEMPTY_ATSTART     An empty string at the start of the
-                              subject is not a valid match
-  PCRE2_NO_UTF_CHECK         Do not check the subject or replacement
-                              for UTF validity (only relevant if
+  PCRE2_NOTEMPTY_ATSTART     An empty string at the start of the subject is not a valid match
+  PCRE2_NO_JIT               Do not use JIT matching
+  PCRE2_NO_UTF_CHECK         Do not check the subject or replacement for UTF validity (only relevant if
                               PCRE2_UTF was set at compile time)
   PCRE2_SUBSTITUTE_EXTENDED  Do extended replacement processing
   PCRE2_SUBSTITUTE_GLOBAL    Replace all occurrences in the subject
diff --git a/doc/html/pcre2api.html b/doc/html/pcre2api.html
index fa9f342..ba3b2ca 100644
--- a/doc/html/pcre2api.html
+++ b/doc/html/pcre2api.html
@@ -23,44 +23,45 @@ please consult the man page, in case the conversion went wrong.
 <li><a name="TOC8" href="#SEC8">PCRE2 NATIVE API JIT FUNCTIONS</a>
 <li><a name="TOC9" href="#SEC9">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a>
 <li><a name="TOC10" href="#SEC10">PCRE2 NATIVE API AUXILIARY FUNCTIONS</a>
-<li><a name="TOC11" href="#SEC11">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a>
-<li><a name="TOC12" href="#SEC12">PCRE2 API OVERVIEW</a>
-<li><a name="TOC13" href="#SEC13">STRING LENGTHS AND OFFSETS</a>
-<li><a name="TOC14" href="#SEC14">NEWLINES</a>
-<li><a name="TOC15" href="#SEC15">MULTITHREADING</a>
-<li><a name="TOC16" href="#SEC16">PCRE2 CONTEXTS</a>
-<li><a name="TOC17" href="#SEC17">CHECKING BUILD-TIME OPTIONS</a>
-<li><a name="TOC18" href="#SEC18">COMPILING A PATTERN</a>
-<li><a name="TOC19" href="#SEC19">COMPILATION ERROR CODES</a>
-<li><a name="TOC20" href="#SEC20">JUST-IN-TIME (JIT) COMPILATION</a>
-<li><a name="TOC21" href="#SEC21">LOCALE SUPPORT</a>
-<li><a name="TOC22" href="#SEC22">INFORMATION ABOUT A COMPILED PATTERN</a>
-<li><a name="TOC23" href="#SEC23">INFORMATION ABOUT A PATTERN'S CALLOUTS</a>
-<li><a name="TOC24" href="#SEC24">SERIALIZATION AND PRECOMPILING</a>
-<li><a name="TOC25" href="#SEC25">THE MATCH DATA BLOCK</a>
-<li><a name="TOC26" href="#SEC26">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a>
-<li><a name="TOC27" href="#SEC27">NEWLINE HANDLING WHEN MATCHING</a>
-<li><a name="TOC28" href="#SEC28">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a>
-<li><a name="TOC29" href="#SEC29">OTHER INFORMATION ABOUT A MATCH</a>
-<li><a name="TOC30" href="#SEC30">ERROR RETURNS FROM <b>pcre2_match()</b></a>
-<li><a name="TOC31" href="#SEC31">OBTAINING A TEXTUAL ERROR MESSAGE</a>
-<li><a name="TOC32" href="#SEC32">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
-<li><a name="TOC33" href="#SEC33">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
-<li><a name="TOC34" href="#SEC34">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
-<li><a name="TOC35" href="#SEC35">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
-<li><a name="TOC36" href="#SEC36">DUPLICATE SUBPATTERN NAMES</a>
-<li><a name="TOC37" href="#SEC37">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
-<li><a name="TOC38" href="#SEC38">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
-<li><a name="TOC39" href="#SEC39">SEE ALSO</a>
-<li><a name="TOC40" href="#SEC40">AUTHOR</a>
-<li><a name="TOC41" href="#SEC41">REVISION</a>
+<li><a name="TOC11" href="#SEC11">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a>
+<li><a name="TOC12" href="#SEC12">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a>
+<li><a name="TOC13" href="#SEC13">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a>
+<li><a name="TOC14" href="#SEC14">PCRE2 API OVERVIEW</a>
+<li><a name="TOC15" href="#SEC15">STRING LENGTHS AND OFFSETS</a>
+<li><a name="TOC16" href="#SEC16">NEWLINES</a>
+<li><a name="TOC17" href="#SEC17">MULTITHREADING</a>
+<li><a name="TOC18" href="#SEC18">PCRE2 CONTEXTS</a>
+<li><a name="TOC19" href="#SEC19">CHECKING BUILD-TIME OPTIONS</a>
+<li><a name="TOC20" href="#SEC20">COMPILING A PATTERN</a>
+<li><a name="TOC21" href="#SEC21">JUST-IN-TIME (JIT) COMPILATION</a>
+<li><a name="TOC22" href="#SEC22">LOCALE SUPPORT</a>
+<li><a name="TOC23" href="#SEC23">INFORMATION ABOUT A COMPILED PATTERN</a>
+<li><a name="TOC24" href="#SEC24">INFORMATION ABOUT A PATTERN'S CALLOUTS</a>
+<li><a name="TOC25" href="#SEC25">SERIALIZATION AND PRECOMPILING</a>
+<li><a name="TOC26" href="#SEC26">THE MATCH DATA BLOCK</a>
+<li><a name="TOC27" href="#SEC27">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a>
+<li><a name="TOC28" href="#SEC28">NEWLINE HANDLING WHEN MATCHING</a>
+<li><a name="TOC29" href="#SEC29">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a>
+<li><a name="TOC30" href="#SEC30">OTHER INFORMATION ABOUT A MATCH</a>
+<li><a name="TOC31" href="#SEC31">ERROR RETURNS FROM <b>pcre2_match()</b></a>
+<li><a name="TOC32" href="#SEC32">OBTAINING A TEXTUAL ERROR MESSAGE</a>
+<li><a name="TOC33" href="#SEC33">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
+<li><a name="TOC34" href="#SEC34">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
+<li><a name="TOC35" href="#SEC35">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
+<li><a name="TOC36" href="#SEC36">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
+<li><a name="TOC37" href="#SEC37">DUPLICATE SUBPATTERN NAMES</a>
+<li><a name="TOC38" href="#SEC38">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
+<li><a name="TOC39" href="#SEC39">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
+<li><a name="TOC40" href="#SEC40">SEE ALSO</a>
+<li><a name="TOC41" href="#SEC41">AUTHOR</a>
+<li><a name="TOC42" href="#SEC42">REVISION</a>
 </ul>
 <P>
 <b>#include &#60;pcre2.h&#62;</b>
 <br>
 <br>
-PCRE2 is a new API for PCRE. This document contains a description of all its
-functions. See the
+PCRE2 is a new API for PCRE, starting at release 10.0. This document contains a
+description of all its native functions. See the
 <a href="pcre2.html"><b>pcre2</b></a>
 document for an overview of all the PCRE2 documentation.
 </P>
@@ -144,6 +145,10 @@ document for an overview of all the PCRE2 documentation.
 <b>  const unsigned char *<i>tables</i>);</b>
 <br>
 <br>
+<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
+<b>  uint32_t <i>extra_options</i>);</b>
+<br>
+<br>
 <b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  PCRE2_SIZE <i>value</i>);</b>
 <br>
@@ -177,22 +182,20 @@ document for an overview of all the PCRE2 documentation.
 <b>  void *<i>callout_data</i>);</b>
 <br>
 <br>
-<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
-<b>  uint32_t <i>value</i>);</b>
-<br>
-<br>
 <b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  PCRE2_SIZE <i>value</i>);</b>
 <br>
 <br>
-<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
 <br>
-<b>int pcre2_set_recursion_memory_management(</b>
-<b>  pcre2_match_context *<i>mcontext</i>,</b>
-<b>  void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
-<b>  void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
+<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
 </P>
 <br><a name="SEC6" href="#TOC1">PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS</a><br>
 <P>
@@ -294,6 +297,9 @@ document for an overview of all the PCRE2 documentation.
 <b>pcre2_code *pcre2_code_copy(const pcre2_code *<i>code</i>);</b>
 <br>
 <br>
+<b>pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *<i>code</i>);</b>
+<br>
+<br>
 <b>int pcre2_get_error_message(int <i>errorcode</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
 <b>  PCRE2_SIZE <i>bufflen</i>);</b>
 <br>
@@ -311,7 +317,60 @@ document for an overview of all the PCRE2 documentation.
 <br>
 <b>int pcre2_config(uint32_t <i>what</i>, void *<i>where</i>);</b>
 </P>
-<br><a name="SEC11" href="#TOC1">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a><br>
+<br><a name="SEC11" href="#TOC1">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a><br>
+<P>
+<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_recursion_memory_management(</b>
+<b>  pcre2_match_context *<i>mcontext</i>,</b>
+<b>  void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
+<b>  void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
+<br>
+<br>
+These functions became obsolete at release 10.30 and are retained only for
+backward compatibility. They should not be used in new code. The first is
+replaced by <b>pcre2_set_depth_limit()</b>; the second is no longer needed and
+has no effect (it always returns zero).
+</P>
+<br><a name="SEC12" href="#TOC1">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a><br>
+<P>
+<b>pcre2_convert_context *pcre2_convert_context_create(</b>
+<b>  pcre2_general_context *<i>gcontext</i>);</b>
+<br>
+<br>
+<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
+<b>  pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>escape_char</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>separator_char</i>);</b>
+<br>
+<br>
+<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
+<b>  uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
+<b>  PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
+<br>
+<br>
+These functions provide a way of converting non-PCRE2 patterns into
+patterns that can be processed by <b>pcre2_compile()</b>. This facility is
+experimental and may be changed in future releases. At present, "globs" and
+POSIX basic and extended patterns can be converted. Details are given in the
+<a href="pcre2convert.html"><b>pcre2convert</b></a>
+documentation.
+</P>
+<br><a name="SEC13" href="#TOC1">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a><br>
 <P>
 There are three PCRE2 libraries, supporting 8-bit, 16-bit, and 32-bit code
 units, respectively. However, there is just one header file, <b>pcre2.h</b>.
@@ -365,28 +424,28 @@ When using multiple libraries in an application, you must take care when
 processing any particular pattern to use only functions from a single library.
 For example, if you want to run a match using a pattern that was compiled with
 <b>pcre2_compile_16()</b>, you must do so with <b>pcre2_match_16()</b>, not
-<b>pcre2_match_8()</b>.
+<b>pcre2_match_8()</b> or <b>pcre2_match_32()</b>.
 </P>
 <P>
 In the function summaries above, and in the rest of this document and other
 PCRE2 documents, functions and data types are described using their generic
-names, without the 8, 16, or 32 suffix.
+names, without the _8, _16, or _32 suffix.
 </P>
-<br><a name="SEC12" href="#TOC1">PCRE2 API OVERVIEW</a><br>
+<br><a name="SEC14" href="#TOC1">PCRE2 API OVERVIEW</a><br>
 <P>
 PCRE2 has its own native API, which is described in this document. There are
 also some wrapper functions for the 8-bit library that correspond to the
 POSIX regular expression API, but they do not give access to all the
-functionality. They are described in the
+functionality of PCRE2. They are described in the
 <a href="pcre2posix.html"><b>pcre2posix</b></a>
 documentation. Both these APIs define a set of C function calls.
 </P>
 <P>
 The native API C data types, function prototypes, option values, and error
-codes are defined in the header file <b>pcre2.h</b>, which contains definitions
-of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers for the
-library. Applications can use these to include support for different releases
-of PCRE2.
+codes are defined in the header file <b>pcre2.h</b>, which also contains
+definitions of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers
+for the library. Applications can use these to include support for different
+releases of PCRE2.
 </P>
 <P>
 In a Windows environment, if you want to statically link an application program
@@ -394,7 +453,7 @@ against a non-dll PCRE2 library, you must define PCRE2_STATIC before including
 <b>pcre2.h</b>.
 </P>
 <P>
-The functions <b>pcre2_compile()</b>, and <b>pcre2_match()</b> are used for
+The functions <b>pcre2_compile()</b> and <b>pcre2_match()</b> are used for
 compiling and matching regular expressions in a Perl-compatible manner. A
 sample program that demonstrates the simplest way of using them is provided in
 the file called <i>pcre2demo.c</i> in the PCRE2 source distribution. A listing
@@ -405,10 +464,17 @@ documentation, and the
 documentation describes how to compile and run it.
 </P>
 <P>
-Just-in-time compiler support is an optional feature of PCRE2 that can be built
-in appropriate hardware environments. It greatly speeds up the matching
+The compiling and matching functions recognize various options that are passed
+as bits in an options argument. There are also some more complicated parameters
+such as custom memory management functions and resource limits that are passed
+in "contexts" (which are just memory blocks, described below). Simple
+applications do not need to make use of contexts.
+</P>
+<P>
+Just-in-time (JIT) compiler support is an optional feature of PCRE2 that can be
+built in appropriate hardware environments. It greatly speeds up the matching
 performance of many patterns. Programs can request that it be used if
-available, by calling <b>pcre2_jit_compile()</b> after a pattern has been
+available by calling <b>pcre2_jit_compile()</b> after a pattern has been
 successfully compiled by <b>pcre2_compile()</b>. This does nothing if JIT
 support is not available.
 </P>
@@ -420,8 +486,8 @@ More complicated programs might need to make use of the specialist functions
 <P>
 JIT matching is automatically used by <b>pcre2_match()</b> if it is available,
 unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
-matching, which gives improved performance. The JIT-specific functions are
-discussed in the
+matching, which gives improved performance at the expense of less sanity
+checking. The JIT-specific functions are discussed in the
 <a href="pcre2jit.html"><b>pcre2jit</b></a>
 documentation.
 </P>
@@ -430,7 +496,7 @@ A second matching function, <b>pcre2_dfa_match()</b>, which is not
 Perl-compatible, is also provided. This uses a different algorithm for the
 matching. The alternative algorithm finds all possible matches (at a given
 point in the subject), and scans the subject just once (unless there are
-lookbehind assertions). However, this algorithm does not return captured
+lookaround assertions). However, this algorithm does not return captured
 substrings. A description of the two matching algorithms and their advantages
 and disadvantages is given in the
 <a href="pcre2matching.html"><b>pcre2matching</b></a>
@@ -452,7 +518,7 @@ been matched by <b>pcre2_match()</b>. They are:
   <b>pcre2_substring_number_from_name()</b>
 </pre>
 <b>pcre2_substring_free()</b> and <b>pcre2_substring_list_free()</b> are also
-provided, to free the memory used for extracted strings.
+provided, to free memory used for extracted strings.
 </P>
 <P>
 The function <b>pcre2_substitute()</b> can be called to match a pattern and
@@ -473,7 +539,7 @@ Functions with names ending with <b>_free()</b> are used for freeing memory
 blocks of various sorts. In all cases, if one of these functions is called with
 a NULL argument, it does nothing.
 </P>
-<br><a name="SEC13" href="#TOC1">STRING LENGTHS AND OFFSETS</a><br>
+<br><a name="SEC15" href="#TOC1">STRING LENGTHS AND OFFSETS</a><br>
 <P>
 The PCRE2 API uses string lengths and offsets into strings of code units in
 several places. These values are always of type PCRE2_SIZE, which is an
@@ -483,7 +549,7 @@ as a special indicator for zero-terminated strings and unset offsets.
 Therefore, the longest string that can be handled is one less than this
 maximum.
 <a name="newlines"></a></P>
-<br><a name="SEC14" href="#TOC1">NEWLINES</a><br>
+<br><a name="SEC16" href="#TOC1">NEWLINES</a><br>
 <P>
 PCRE2 supports five different conventions for indicating line breaks in
 strings: a single CR (carriage return) character, a single LF (linefeed)
@@ -518,7 +584,7 @@ The choice of newline convention does not affect the interpretation of
 the \n or \r escape sequences, nor does it affect what \R matches; this has
 its own separate convention.
 </P>
-<br><a name="SEC15" href="#TOC1">MULTITHREADING</a><br>
+<br><a name="SEC17" href="#TOC1">MULTITHREADING</a><br>
 <P>
 In a multithreaded application it is important to keep thread-specific data
 separate from data that can be shared between threads. The PCRE2 library code
@@ -540,8 +606,8 @@ and does not change when the pattern is matched. Therefore, it is thread-safe,
 that is, the same compiled pattern can be used by more than one thread
 simultaneously. For example, an application can compile all its patterns at the
 start, before forking off multiple threads that use them. However, if the
-just-in-time optimization feature is being used, it needs separate memory stack
-areas for each thread. See the
+just-in-time (JIT) optimization feature is being used, it needs separate memory
+stack areas for each thread. See the
 <a href="pcre2jit.html"><b>pcre2jit</b></a>
 documentation for more details.
 </P>
@@ -567,8 +633,9 @@ If JIT is being used, but the JIT compilation is not being done immediately,
 (perhaps waiting to see if the pattern is used often enough) similar logic is
 required. JIT compilation updates a pointer within the compiled code block, so
 a thread must gain unique write access to the pointer before calling
-<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> can be used
-to obtain a private copy of the compiled code.
+<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> or
+<b>pcre2_code_copy_with_tables()</b> can be used to obtain a private copy of the
+compiled code before calling the JIT compiler.
 </P>
 <br><b>
 Context blocks
@@ -592,12 +659,12 @@ thread-specific copy.
 Match blocks
 </b><br>
 <P>
-The matching functions need a block of memory for working space and for storing
-the results of a match. This includes details of what was matched, as well as
-additional information such as the name of a (*MARK) setting. Each thread must
-provide its own copy of this memory.
+The matching functions need a block of memory for storing the results of a
+match. This includes details of what was matched, as well as additional
+information such as the name of a (*MARK) setting. Each thread must provide its
+own copy of this memory.
 </P>
-<br><a name="SEC16" href="#TOC1">PCRE2 CONTEXTS</a><br>
+<br><a name="SEC18" href="#TOC1">PCRE2 CONTEXTS</a><br>
 <P>
 Some PCRE2 functions have a lot of parameters, many of which are used only by
 specialist applications, for example, those that use custom memory management
@@ -622,6 +689,8 @@ library. The context is named `general' rather than specifically `memory'
 because in future other fields may be added. If you do not want to supply your
 own custom memory management functions, you do not need to bother with a
 general context. A general context is created by:
+<br>
+<br>
 <b>pcre2_general_context *pcre2_general_context_create(</b>
 <b>  void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
 <b>  void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
@@ -648,26 +717,31 @@ used. When the time comes to free the block, this function is called.
 </P>
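As an illustration of the custom memory management functions described above, here is a minimal sketch of a matching pair that counts allocations. The names alloc_stats, counting_malloc, and counting_free are hypothetical and not part of PCRE2; the sketch assumes that PCRE2_SIZE is size_t and uses only the standard C library, so it compiles without pcre2.h.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record; its address would be passed as memory_data. */
typedef struct {
    size_t live_blocks;      /* blocks allocated and not yet freed */
    size_t total_requested;  /* sum of all requested sizes, in bytes */
} alloc_stats;

/* Same shape as the private_malloc argument:
   void *(*)(PCRE2_SIZE, void *), assuming PCRE2_SIZE is size_t. */
void *counting_malloc(size_t size, void *memory_data)
{
    alloc_stats *stats = memory_data;
    void *block = malloc(size);
    if (block != NULL) {
        stats->live_blocks++;
        stats->total_requested += size;
    }
    return block;
}

/* Same shape as the private_free argument: void (*)(void *, void *). */
void counting_free(void *block, void *memory_data)
{
    alloc_stats *stats = memory_data;
    if (block == NULL) return;   /* nothing was allocated, nothing to count */
    stats->live_blocks--;
    free(block);
}
```

With pcre2.h included, such a pair could be registered by a call like pcre2_general_context_create(counting_malloc, counting_free, &stats), where stats is an alloc_stats variable owned by the application.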
 <P>
 A general context can be copied by calling:
+<br>
+<br>
 <b>pcre2_general_context *pcre2_general_context_copy(</b>
 <b>  pcre2_general_context *<i>gcontext</i>);</b>
 <br>
 <br>
 The memory used for a general context should be freed by calling:
+<br>
+<br>
 <b>void pcre2_general_context_free(pcre2_general_context *<i>gcontext</i>);</b>
 <a name="compilecontext"></a></P>
 <br><b>
 The compile context
 </b><br>
 <P>
-A compile context is required if you want to change the default values of any
-of the following compile-time parameters:
+A compile context is required if you want to provide an external function for
+stack checking during compilation or to change the default values of any of the
+following compile-time parameters:
 <pre>
   What \R matches (Unicode newlines or CR, LF, CRLF only)
   PCRE2's character tables
   The newline character sequence
   The compile time nested parentheses limit
   The maximum length of the pattern string
-  An external function for stack checking
+  The extra options bits (none set by default)
 </pre>
 A compile context is also required if you are using custom memory management.
 If none of these apply, just pass NULL as the context argument of
@@ -675,6 +749,8 @@ If none of these apply, just pass NULL as the context argument of
 </P>
 <P>
 A compile context is created, copied, and freed by the following functions:
+<br>
+<br>
 <b>pcre2_compile_context *pcre2_compile_context_create(</b>
 <b>  pcre2_general_context *<i>gcontext</i>);</b>
 <br>
@@ -689,6 +765,8 @@ A compile context is created, copied, and freed by the following functions:
 A compile context is created with default values for its parameters. These can
 be changed by calling the following functions, which return 0 on success, or
 PCRE2_ERROR_BADDATA if invalid data is detected.
+<br>
+<br>
 <b>int pcre2_set_bsr(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
@@ -698,6 +776,8 @@ or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any Unicode line
 ending sequence. The value is used by the JIT compiler and by the two
 interpreted matching functions, <i>pcre2_match()</i> and
 <i>pcre2_dfa_match()</i>.
+<br>
+<br>
 <b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  const unsigned char *<i>tables</i>);</b>
 <br>
@@ -705,15 +785,33 @@ interpreted matching functions, <i>pcre2_match()</i> and
 The value must be the result of a call to <i>pcre2_maketables()</i>, whose only
 argument is a general context. This function builds a set of character tables
 in the current locale.
+<br>
+<br>
+<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
+<b>  uint32_t <i>extra_options</i>);</b>
+<br>
+<br>
+As PCRE2 has developed, almost all the 32 option bits that are available in
+the <i>options</i> argument of <b>pcre2_compile()</b> have been used up. To avoid
+running out, the compile context contains a set of extra option bits which are
+used for some newer, assumed rarer, options. This function sets those bits. It
+always sets all the bits (either on or off): the new value replaces the
+existing setting as a whole rather than modifying individual bits. The
+available options are defined in the section entitled "Extra compile options"
+<a href="#extracompileoptions">below.</a>
+<br>
+<br>
 <b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  PCRE2_SIZE <i>value</i>);</b>
 <br>
 <br>
-This sets a maximum length, in code units, for the pattern string that is to be
-compiled. If the pattern is longer, an error is generated. This facility is
-provided so that applications that accept patterns from external sources can
-limit their size. The default is the largest number that a PCRE2_SIZE variable
-can hold, which is effectively unlimited.
+This sets a maximum length, in code units, for any pattern string that is
+compiled with this context. If the pattern is longer, an error is generated.
+This facility is provided so that applications that accept patterns from
+external sources can limit their size. The default is the largest number that a
+PCRE2_SIZE variable can hold, which is effectively unlimited.
+<br>
+<br>
 <b>int pcre2_set_newline(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
@@ -721,22 +819,34 @@ can hold, which is effectively unlimited.
 This specifies which characters or character sequences are to be recognized as
 newlines. The value must be one of PCRE2_NEWLINE_CR (carriage return only),
 PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the two-character
-sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above), or
-PCRE2_NEWLINE_ANY (any Unicode newline sequence).
+sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above),
+PCRE2_NEWLINE_ANY (any Unicode newline sequence), or PCRE2_NEWLINE_NUL (the
+NUL character, that is, a binary zero).
+</P>
+<P>
+A pattern can override the value set in the compile context by starting with a
+sequence such as (*CRLF). See the
+<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
+page for details.
 </P>
 <P>
-When a pattern is compiled with the PCRE2_EXTENDED option, the value of this
-parameter affects the recognition of white space and the end of internal
-comments starting with #. The value is saved with the compiled pattern for
-subsequent use by the JIT compiler and by the two interpreted matching
-functions, <i>pcre2_match()</i> and <i>pcre2_dfa_match()</i>.
+When a pattern is compiled with the PCRE2_EXTENDED or PCRE2_EXTENDED_MORE
+option, the newline convention affects the recognition of white space and the
+end of internal comments starting with #. The value is saved with the compiled
+pattern for subsequent use by the JIT compiler and by the two interpreted
+matching functions, <i>pcre2_match()</i> and <i>pcre2_dfa_match()</i>.
+<br>
+<br>
 <b>int pcre2_set_parens_nest_limit(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
 <br>
 This parameter adjusts the limit, set when PCRE2 is built (default 250), on the
 depth of parenthesis nesting in a pattern. This limit stops rogue patterns
-using up too much system stack when being compiled.
+using up too much system stack when being compiled. The limit applies to
+parentheses of all kinds, not just capturing parentheses.
+<br>
+<br>
 <b>int pcre2_set_compile_recursion_guard(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  int (*<i>guard_function</i>)(uint32_t, void *), void *<i>user_data</i>);</b>
 <br>
@@ -744,10 +854,10 @@ using up too much system stack when being compiled.
 There is at least one application that runs PCRE2 in threads with very limited
 system stack, where running out of stack is to be avoided at all costs. The
 parenthesis limit above cannot take account of how much stack is actually
-available. For a finer control, you can supply a function that is called
-whenever <b>pcre2_compile()</b> starts to compile a parenthesized part of a
-pattern. This function can check the actual stack size (or anything else that
-it wants to, of course).
+available during compilation. For finer control, you can supply a function
+that is called whenever <b>pcre2_compile()</b> starts to compile a parenthesized
+part of a pattern. This function can check the actual stack size (or anything
+else that it wants to, of course).
 </P>
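A guard function of the kind described above might look like the following sketch. It estimates stack use by comparing the address of a local variable with an address recorded earlier, which works on common platforms but is not guaranteed by the C standard. The names stack_guard_info and stack_guard are hypothetical; only the function signature comes from the prototype of pcre2_set_compile_recursion_guard().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread record, passed to PCRE2 as user_data. */
typedef struct {
    uintptr_t stack_base;    /* address of a local taken near thread start */
    size_t    stack_budget;  /* bytes of stack that compilation may consume */
} stack_guard_info;

/* Same shape as guard_function: int (*)(uint32_t, void *).
   Returns zero if all is well, or non-zero to force an error. */
int stack_guard(uint32_t depth, void *user_data)
{
    const stack_guard_info *info = user_data;
    char marker;                              /* lives in the current frame */
    uintptr_t here = (uintptr_t)&marker;

    /* The stack may grow up or down, so take the absolute distance. */
    size_t used = (here > info->stack_base) ? (size_t)(here - info->stack_base)
                                            : (size_t)(info->stack_base - here);

    (void)depth;   /* the nesting depth is available but unused in this sketch */
    return used > info->stack_budget;
}
```

With pcre2.h included, the function could be attached to a compile context by a call like pcre2_set_compile_recursion_guard(ccontext, stack_guard, &info).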
 <P>
 The first argument to the guard function gives the current depth of
@@ -759,20 +869,22 @@ zero if all is well, or non-zero to force an error.
 The match context
 </b><br>
 <P>
-A match context is required if you want to change the default values of any
-of the following match-time parameters:
+A match context is required if you want to:
 <pre>
-  A callout function
-  The offset limit for matching an unanchored pattern
-  The limit for calling <b>match()</b> (see below)
-  The limit for calling <b>match()</b> recursively
+  Set up a callout function
+  Set an offset limit for matching an unanchored pattern
+  Change the limit on the amount of heap used when matching
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management specifically for the match
 </pre>
-A match context is also required if you are using custom memory management.
 If none of these apply, just pass NULL as the context argument of
 <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or <b>pcre2_jit_match()</b>.
 </P>
 <P>
 A match context is created, copied, and freed by the following functions:
+<br>
+<br>
 <b>pcre2_match_context *pcre2_match_context_create(</b>
 <b>  pcre2_general_context *<i>gcontext</i>);</b>
 <br>
@@ -787,15 +899,19 @@ A match context is created, copied, and freed by the following functions:
 A match context is created with default values for its parameters. These can
 be changed by calling the following functions, which return 0 on success, or
 PCRE2_ERROR_BADDATA if invalid data is detected.
+<br>
+<br>
 <b>int pcre2_set_callout(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  int (*<i>callout_function</i>)(pcre2_callout_block *, void *),</b>
 <b>  void *<i>callout_data</i>);</b>
 <br>
 <br>
-This sets up a "callout" function, which PCRE2 will call at specified points
+This sets up a "callout" function for PCRE2 to call at specified points
 during a matching operation. Details are given in the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation.
+<br>
+<br>
 <b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  PCRE2_SIZE <i>value</i>);</b>
 <br>
@@ -804,41 +920,83 @@ The <i>offset_limit</i> parameter limits how far an unanchored search can
 advance in the subject string. The default value is PCRE2_UNSET. The
 <b>pcre2_match()</b> and <b>pcre2_dfa_match()</b> functions return
 PCRE2_ERROR_NOMATCH if a match with a starting point before or at the given
-offset is not found. For example, if the pattern /abc/ is matched against
-"123abc" with an offset limit less than 3, the result is PCRE2_ERROR_NO_MATCH.
-A match can never be found if the <i>startoffset</i> argument of
-<b>pcre2_match()</b> or <b>pcre2_dfa_match()</b> is greater than the offset
-limit.
+offset is not found. The <b>pcre2_substitute()</b> function makes no more
+substitutions.
+</P>
+<P>
+For example, if the pattern /abc/ is matched against "123abc" with an offset
+limit less than 3, the result is PCRE2_ERROR_NOMATCH. A match can never be
+found if the <i>startoffset</i> argument of <b>pcre2_match()</b>,
+<b>pcre2_dfa_match()</b>, or <b>pcre2_substitute()</b> is greater than the offset
+limit set in the match context.
 </P>
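The "before or at the given offset" rule can be modelled without PCRE2 at all. The sketch below searches for a literal string instead of a pattern and accepts a match only if it starts at or before the limit. The names find_with_offset_limit and NO_MATCH are hypothetical; this is a model of the rule as described here, not of how PCRE2 implements it.

```c
#include <stddef.h>
#include <string.h>

/* Stands in for PCRE2_ERROR_NOMATCH in this model. */
#define NO_MATCH ((size_t)-1)

/* Search for a literal needle from offset `start`, accepting only a match
   whose starting offset is less than or equal to offset_limit. */
size_t find_with_offset_limit(const char *subject, const char *needle,
                              size_t start, size_t offset_limit)
{
    size_t subject_len = strlen(subject);
    size_t needle_len = strlen(needle);

    if (needle_len > subject_len) return NO_MATCH;

    for (size_t pos = start; pos + needle_len <= subject_len; pos++) {
        if (pos > offset_limit) break;   /* past the limit: stop searching */
        if (memcmp(subject + pos, needle, needle_len) == 0) return pos;
    }
    return NO_MATCH;
}
```

In this model, searching "123abc" for "abc" with a limit of 2 finds nothing, because the only match starts at offset 3, whereas a limit of 3 allows it. A starting offset greater than the limit can never succeed.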
 <P>
-When using this facility, you must set PCRE2_USE_OFFSET_LIMIT when calling
-<b>pcre2_compile()</b> so that when JIT is in use, different code can be
+When using this facility, you must set the PCRE2_USE_OFFSET_LIMIT option when
+calling <b>pcre2_compile()</b> so that when JIT is in use, different code can be
 compiled. If a match is started with a non-default offset limit when
 PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
 </P>
 <P>
 The offset limit facility can be used to track progress when searching large
-subject strings. See also the PCRE2_FIRSTLINE option, which requires a match to
-start within the first line of the subject. If this is set with an offset
-limit, a match must occur in the first line and also within the offset limit.
-In other words, whichever limit comes first is used.
+subject strings or to limit the extent of global substitutions. See also the
+PCRE2_FIRSTLINE option, which requires a match to start before or at the first
+newline that follows the start of matching in the subject. If this is set with
+an offset limit, a match must occur in the first line and also within the
+offset limit. In other words, whichever limit comes first is used.
+<br>
+<br>
+<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
+<b>  uint32_t <i>value</i>);</b>
+<br>
+<br>
+The <i>heap_limit</i> parameter specifies, in units of kilobytes, the maximum
+amount of heap memory that <b>pcre2_match()</b> may use to hold backtracking
+information when running an interpretive match. This limit does not apply to
+matching with the JIT optimization, which has its own memory control
+arrangements (see the
+<a href="pcre2jit.html"><b>pcre2jit</b></a>
+documentation for more details), nor does it apply to <b>pcre2_dfa_match()</b>.
+If the limit is reached, the negative error code PCRE2_ERROR_HEAPLIMIT is
+returned. The default limit is set when PCRE2 is built; the default default is
+very large and is essentially "unlimited".
+</P>
+<P>
+A value for the heap limit may also be supplied by an item at the start of a
+pattern of the form
+<pre>
+  (*LIMIT_HEAP=ddd)
+</pre>
+where ddd is a decimal number. However, such a setting is ignored unless ddd is
+less than the limit set by the caller of <b>pcre2_match()</b> or, if no such
+limit is set, less than the default.
+</P>
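The rule for a limit given in the pattern amounts to "a pattern may lower the limit but never raise it". A minimal sketch of that rule follows; effective_limit is a hypothetical helper written for illustration and is not a PCRE2 function.

```c
#include <stdint.h>

/* caller_or_default: the limit set in the match context, or the build-time
   default if the caller set none.
   pattern_has_item:  non-zero if the pattern starts with a (*LIMIT_...=ddd) item.
   pattern_value:     the ddd value from that item. */
uint32_t effective_limit(uint32_t caller_or_default, int pattern_has_item,
                         uint32_t pattern_value)
{
    /* The pattern's value is honoured only when it is strictly smaller. */
    if (pattern_has_item && pattern_value < caller_or_default)
        return pattern_value;
    return caller_or_default;
}
```

The same wording is used for the (*LIMIT_MATCH=ddd) and (*LIMIT_DEPTH=ddd) items, so the same comparison applies to those limits as well.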
+<P>
+The <b>pcre2_match()</b> function starts out using a 20K vector on the system
+stack for recording backtracking points. The more nested backtracking points
+there are (that is, the deeper the search tree), the more memory is needed.
+Heap memory is used only if the initial vector is too small. If the heap limit
+is set to a value less than 21 (in particular, zero) no heap memory will be
+used. In this case, only patterns that do not have a lot of nested backtracking
+can be successfully processed.
+<br>
+<br>
 <b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
 <br>
 The <i>match_limit</i> parameter provides a means of preventing PCRE2 from using
-up too many resources when processing patterns that are not going to match, but
-which have a very large number of possibilities in their search trees. The
-classic example is a pattern that uses nested unlimited repeats.
+up too many computing resources when processing patterns that are not going to
+match, but which have a very large number of possibilities in their search
+trees. The classic example is a pattern that uses nested unlimited repeats.
 </P>
 <P>
-Internally, <b>pcre2_match()</b> uses a function called <b>match()</b>, which it
-calls repeatedly (sometimes recursively). The limit set by <i>match_limit</i> is
-imposed on the number of times this function is called during a match, which
-has the effect of limiting the amount of backtracking that can take place. For
+There is an internal counter in <b>pcre2_match()</b> that is incremented each
+time round its main matching loop. If this value reaches the match limit,
+<b>pcre2_match()</b> returns the negative value PCRE2_ERROR_MATCHLIMIT. This has
+the effect of limiting the amount of backtracking that can take place. For
 patterns that are not anchored, the count restarts from zero for each position
-in the subject string. This limit is not relevant to <b>pcre2_dfa_match()</b>,
-which ignores it.
+in the subject string. This limit also applies to <b>pcre2_dfa_match()</b>,
+though the counting is done in a different way.
 </P>
 <P>
 When <b>pcre2_match()</b> is called with a pattern that was successfully
@@ -850,72 +1008,53 @@ matching can continue.
 </P>
 <P>
 The default value for the limit can be set when PCRE2 is built; the default
-default is 10 million, which handles all but the most extreme cases. If the
-limit is exceeded, <b>pcre2_match()</b> returns PCRE2_ERROR_MATCHLIMIT. A value
+default is 10 million, which handles all but the most extreme cases. A value
 for the match limit may also be supplied by an item at the start of a pattern
 of the form
 <pre>
   (*LIMIT_MATCH=ddd)
 </pre>
 where ddd is a decimal number. However, such a setting is ignored unless ddd is
-less than the limit set by the caller of <b>pcre2_match()</b> or, if no such
-limit is set, less than the default.
-<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
+less than the limit set by the caller of <b>pcre2_match()</b> or
+<b>pcre2_dfa_match()</b> or, if no such limit is set, less than the default.
+<br>
+<br>
+<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
 <b>  uint32_t <i>value</i>);</b>
 <br>
 <br>
-The <i>recursion_limit</i> parameter is similar to <i>match_limit</i>, but
-instead of limiting the total number of times that <b>match()</b> is called, it
-limits the depth of recursion. The recursion depth is a smaller number than the
-total number of calls, because not all calls to <b>match()</b> are recursive.
-This limit is of use only if it is set smaller than <i>match_limit</i>.
+This parameter limits the depth of nested backtracking in <b>pcre2_match()</b>.
+Each time a nested backtracking point is passed, a new memory "frame" is used
+to remember the state of matching at that point. Thus, this parameter
+indirectly limits the amount of memory that is used in a match. However,
+because the size of each memory "frame" depends on the number of capturing
+parentheses, the actual memory limit varies from pattern to pattern. This limit
+was more useful in versions before 10.30, where function recursion was used for
+backtracking.
 </P>
 <P>
-Limiting the recursion depth limits the amount of system stack that can be
-used, or, when PCRE2 has been compiled to use memory on the heap instead of the
-stack, the amount of heap memory that can be used. This limit is not relevant,
-and is ignored, when matching is done using JIT compiled code or by the
-<b>pcre2_dfa_match()</b> function.
+The depth limit is not relevant, and is ignored, when matching is done using
+JIT compiled code. However, it is supported by <b>pcre2_dfa_match()</b>, which
+uses it to limit the depth of internal recursive function calls that implement
+atomic groups, lookaround assertions, and pattern recursions. This is,
+therefore, an indirect limit on the amount of system stack that is used. A
+recursive pattern such as /(.)(?1)/, when matched to a very long string using
+<b>pcre2_dfa_match()</b>, can use a great deal of stack.
 </P>
 <P>
-The default value for <i>recursion_limit</i> can be set when PCRE2 is built; the
-default default is the same value as the default for <i>match_limit</i>. If the
-limit is exceeded, <b>pcre2_match()</b> returns PCRE2_ERROR_RECURSIONLIMIT. A
-value for the recursion limit may also be supplied by an item at the start of a
-pattern of the form
+The default value for the depth limit can be set when PCRE2 is built; the
+default default is the same value as the default for the match limit. If the
+limit is exceeded, <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b> returns
+PCRE2_ERROR_DEPTHLIMIT. A value for the depth limit may also be supplied by an
+item at the start of a pattern of the form
 <pre>
-  (*LIMIT_RECURSION=ddd)
+  (*LIMIT_DEPTH=ddd)
 </pre>
 where ddd is a decimal number. However, such a setting is ignored unless ddd is
-less than the limit set by the caller of <b>pcre2_match()</b> or, if no such
-limit is set, less than the default.
-<b>int pcre2_set_recursion_memory_management(</b>
-<b>  pcre2_match_context *<i>mcontext</i>,</b>
-<b>  void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
-<b>  void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
-<br>
-<br>
-This function sets up two additional custom memory management functions for use
-by <b>pcre2_match()</b> when PCRE2 is compiled to use the heap for remembering
-backtracking data, instead of recursive function calls that use the system
-stack. There is a discussion about PCRE2's stack usage in the
-<a href="pcre2stack.html"><b>pcre2stack</b></a>
-documentation. See the
-<a href="pcre2build.html"><b>pcre2build</b></a>
-documentation for details of how to build PCRE2.
-</P>
-<P>
-Using the heap for recursion is a non-standard way of building PCRE2, for use
-in environments that have limited stacks. Because of the greater use of memory
-management, <b>pcre2_match()</b> runs more slowly. Functions that are different
-to the general custom memory functions are provided so that special-purpose
-external code can be used for this case, because the memory blocks are all the
-same size. The blocks are retained by <b>pcre2_match()</b> until it is about to
-exit so that they can be re-used when possible during the match. In the absence
-of these functions, the normal custom memory management functions are used, if
-supplied, otherwise the system functions.
+less than the limit set by the caller of <b>pcre2_match()</b> or
+<b>pcre2_dfa_match()</b> or, if no such limit is set, less than the default.
 </P>
-<br><a name="SEC17" href="#TOC1">CHECKING BUILD-TIME OPTIONS</a><br>
+<br><a name="SEC19" href="#TOC1">CHECKING BUILD-TIME OPTIONS</a><br>
 <P>
 <b>int pcre2_config(uint32_t <i>what</i>, void *<i>where</i>);</b>
 </P>
@@ -948,6 +1087,25 @@ PCRE2_BSR_UNICODE means that \R matches any Unicode line ending sequence; a
 value of PCRE2_BSR_ANYCRLF means that \R matches only CR, LF, or CRLF. The
 default can be overridden when a pattern is compiled.
 <pre>
+  PCRE2_CONFIG_COMPILED_WIDTHS
+</pre>
+The output is a uint32_t integer whose lower bits indicate which code unit
+widths were selected when PCRE2 was built. The 1-bit indicates 8-bit support,
+and the 2-bit and 4-bit indicate 16-bit and 32-bit support, respectively.
+<pre>
+  PCRE2_CONFIG_DEPTHLIMIT
+</pre>
+The output is a uint32_t integer that gives the default limit for the depth of
+nested backtracking in <b>pcre2_match()</b> or the depth of nested recursions
+and lookarounds in <b>pcre2_dfa_match()</b>. Further details are given with
+<b>pcre2_set_depth_limit()</b> above.
+<pre>
+  PCRE2_CONFIG_HEAPLIMIT
+</pre>
+The output is a uint32_t integer that gives, in kilobytes, the default limit
+for the amount of heap memory used by <b>pcre2_match()</b>. Further details are
+given with <b>pcre2_set_heap_limit()</b> above.
+<pre>
   PCRE2_CONFIG_JIT
 </pre>
 The output is a uint32_t integer that is set to one if support for just-in-time
@@ -982,9 +1140,9 @@ be compiled by those two libraries, but at the expense of slower matching.
 <pre>
   PCRE2_CONFIG_MATCHLIMIT
 </pre>
-The output is a uint32_t integer that gives the default limit for the number of
-internal matching function calls in a <b>pcre2_match()</b> execution. Further
-details are given with <b>pcre2_match()</b> below.
+The output is a uint32_t integer that gives the default match limit for
+<b>pcre2_match()</b>. Further details are given with
+<b>pcre2_set_match_limit()</b> above.
 <pre>
   PCRE2_CONFIG_NEWLINE
 </pre>
@@ -996,10 +1154,16 @@ sequence that is recognized as meaning "newline". The values are:
   PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
   PCRE2_NEWLINE_ANY      Any Unicode line ending
   PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 </pre>
 The default should normally correspond to the standard sequence for your
 operating system.
 <pre>
+  PCRE2_CONFIG_NEVER_BACKSLASH_C
+</pre>
+The output is a uint32_t integer that is set to one if the use of \C was
+permanently disabled when PCRE2 was built; otherwise it is set to zero.
+<pre>
   PCRE2_CONFIG_PARENSLIMIT
 </pre>
 The output is a uint32_t integer that gives the maximum depth of nesting
@@ -1009,19 +1173,10 @@ PCRE2 is built; the default is 250. This limit does not take into account the
 stack that may already be used by the calling application. For finer control
 over compilation stack usage, see <b>pcre2_set_compile_recursion_guard()</b>.
 <pre>
-  PCRE2_CONFIG_RECURSIONLIMIT
-</pre>
-The output is a uint32_t integer that gives the default limit for the depth of
-recursion when calling the internal matching function in a <b>pcre2_match()</b>
-execution. Further details are given with <b>pcre2_match()</b> below.
-<pre>
   PCRE2_CONFIG_STACKRECURSE
 </pre>
-The output is a uint32_t integer that is set to one if internal recursion when
-running <b>pcre2_match()</b> is implemented by recursive function calls that use
-the system stack to remember their state. This is the usual way that PCRE2 is
-compiled. The output is zero if PCRE2 was compiled to use blocks of data on the
-heap instead of recursive function calls.
+This parameter is obsolete and should not be used in new code. The output is a
+uint32_t integer that is always set to zero.
 <pre>
   PCRE2_CONFIG_UNICODE_VERSION
 </pre>
@@ -1040,14 +1195,14 @@ available; otherwise it is set to zero. Unicode support implies UTF support.
 <pre>
   PCRE2_CONFIG_VERSION
 </pre>
-The <i>where</i> argument should point to a buffer that is at least 12 code
+The <i>where</i> argument should point to a buffer that is at least 24 code
 units long. (The exact length required can be found by calling
 <b>pcre2_config()</b> with <b>where</b> set to NULL.) The buffer is filled with
 the PCRE2 version string, zero-terminated. The number of code units used is
 returned. This is the length of the string plus one unit for the terminating
 zero.
 <a name="compiling"></a></P>
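The bit layout described for PCRE2_CONFIG_COMPILED_WIDTHS can be decoded with a few mask tests. The helper below, width_is_supported, is hypothetical and takes the uint32_t value that pcre2_config() is documented to write; it needs only the standard headers.

```c
#include <stdint.h>

/* widths: the value reported for PCRE2_CONFIG_COMPILED_WIDTHS.
   code_unit_width: 8, 16, or 32.
   Returns 1 if that width was selected at build time, otherwise 0. */
int width_is_supported(uint32_t widths, unsigned code_unit_width)
{
    switch (code_unit_width) {
    case 8:  return (widths & 1u) != 0;   /* the 1-bit: 8-bit library  */
    case 16: return (widths & 2u) != 0;   /* the 2-bit: 16-bit library */
    case 32: return (widths & 4u) != 0;   /* the 4-bit: 32-bit library */
    default: return 0;                    /* no other widths exist     */
    }
}
```

With pcre2.h included, the input would typically come from a call such as pcre2_config(PCRE2_CONFIG_COMPILED_WIDTHS, &widths), where widths is a uint32_t variable.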
-<br><a name="SEC18" href="#TOC1">COMPILING A PATTERN</a><br>
+<br><a name="SEC20" href="#TOC1">COMPILING A PATTERN</a><br>
 <P>
 <b>pcre2_code *pcre2_compile(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
 <b>  uint32_t <i>options</i>, int *<i>errorcode</i>, PCRE2_SIZE *<i>erroroffset,</i></b>
@@ -1058,11 +1213,14 @@ zero.
 <br>
 <br>
 <b>pcre2_code *pcre2_code_copy(const pcre2_code *<i>code</i>);</b>
+<br>
+<br>
+<b>pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *<i>code</i>);</b>
 </P>
 <P>
 The <b>pcre2_compile()</b> function compiles a pattern into an internal form.
-The pattern is defined by a pointer to a string of code units and a length. If
-the pattern is zero-terminated, the length can be specified as
+The pattern is defined by a pointer to a string of code units and a length (in
+code units). If the pattern is zero-terminated, the length can be specified as
 PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that
 contains the compiled pattern and related data, or NULL if an error occurred.
 </P>
@@ -1079,9 +1237,22 @@ if the code has been processed by the JIT compiler (see
 <a href="#jitcompiling">below),</a>
 the JIT information cannot be copied (because it is position-dependent).
 The new copy can initially be used only for non-JIT matching, though it can be
-passed to <b>pcre2_jit_compile()</b> if required. The <b>pcre2_code_copy()</b>
-function provides a way for individual threads in a multithreaded application
-to acquire a private copy of shared compiled code.
+passed to <b>pcre2_jit_compile()</b> if required.
+</P>
+<P>
+The <b>pcre2_code_copy()</b> function provides a way for individual threads in a
+multithreaded application to acquire a private copy of shared compiled code.
+However, it does not make a copy of the character tables used by the compiled
+pattern; the new pattern code points to the same tables as the original code.
+(See
+<a href="#localesupport">"Locale Support"</a>
+below for details of these character tables.) In many applications the same
+tables are used throughout, so this behaviour is appropriate. Nevertheless,
+there are occasions when copies of both a compiled pattern and its tables are
+needed. The <b>pcre2_code_copy_with_tables()</b> function provides this facility.
+Copies of both the code and the tables are made, with the new code pointing to
+the new tables. The memory for the new tables is automatically freed when
+<b>pcre2_code_free()</b> is called for the new copy of the compiled code.
 </P>
 <P>
 NOTE: When one of the matching functions is called, pointers to the compiled
@@ -1105,8 +1276,8 @@ documentation).
 <P>
 For those options that can be different in different parts of the pattern, the
 contents of the <i>options</i> argument specifies their settings at the start of
-compilation. The PCRE2_ANCHORED and PCRE2_NO_UTF_CHECK options can be set at
-the time of matching as well as at compile time.
+compilation. The PCRE2_ANCHORED, PCRE2_ENDANCHORED, and PCRE2_NO_UTF_CHECK
+options can be set at the time of matching as well as at compile time.
 </P>
 <P>
 Other, less frequently required compile-time parameters (for example, the
@@ -1122,13 +1293,26 @@ error has occurred. The values are not defined when compilation is successful
 and <b>pcre2_compile()</b> returns a non-NULL value.
 </P>
 <P>
-The <b>pcre2_get_error_message()</b> function (see "Obtaining a textual error
+There are nearly 100 positive error codes that <b>pcre2_compile()</b> may return
+if it finds an error in the pattern. There are also some negative error codes
+that are used for invalid UTF strings. These are the same as given by
+<b>pcre2_match()</b> and <b>pcre2_dfa_match()</b>, and are described in the
+<a href="pcre2unicode.html"><b>pcre2unicode</b></a>
+page. There is no separate documentation for the positive error codes, because
+the textual error messages that are obtained by calling the
+<b>pcre2_get_error_message()</b> function (see "Obtaining a textual error
 message"
 <a href="#geterrormessage">below)</a>
-provides a textual message for each error code. Compilation errors have
-positive error codes; UTF formatting error codes are negative. For an invalid
-UTF-8 or UTF-16 string, the offset is that of the first code unit of the
-failing character.
+should be self-explanatory. Macro names starting with PCRE2_ERROR_ are defined
+for both positive and negative error codes in <b>pcre2.h</b>.
+</P>
+<P>
+The value returned in <i>erroroffset</i> is an indication of where in the
+pattern the error occurred. It is not necessarily the furthest point in the
+pattern that was read. For example, after the error "lookbehind assertion is
+not fixed length", the error offset points to the start of the failing
+assertion. For an invalid UTF-8 or UTF-16 string, the offset is that of the
+first code unit of the failing character.
 </P>
 <P>
 Some errors are not detected until the whole pattern has been scanned; in these
@@ -1209,13 +1393,15 @@ include a closing parenthesis in the name. However, if the PCRE2_ALT_VERBNAMES
 option is set, normal backslash processing is applied to verb names and only an
 unescaped closing parenthesis terminates the name. A closing parenthesis can be
 included in a name either as \) or between \Q and \E. If the PCRE2_EXTENDED
-option is set, unescaped whitespace in verb names is skipped and #-comments are
-recognized, exactly as in the rest of the pattern.
+or PCRE2_EXTENDED_MORE option is set, unescaped whitespace in verb names is
+skipped and #-comments are recognized in this mode, exactly as in the rest of
+the pattern.
 <pre>
   PCRE2_AUTO_CALLOUT
 </pre>
 If this bit is set, <b>pcre2_compile()</b> automatically inserts callout items,
-all with number 255, before each pattern item. For discussion of the callout
+all with number 255, before each pattern item, except immediately before or
+after an explicit callout in the pattern. For discussion of the callout
 facility, see the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation.
@@ -1224,7 +1410,13 @@ documentation.
 </pre>
 If this bit is set, letters in the pattern match both upper and lower case
 letters in the subject. It is equivalent to Perl's /i option, and it can be
-changed within a pattern by a (?i) option setting.
+changed within a pattern by a (?i) option setting. If PCRE2_UTF is set, Unicode
+properties are used for all characters with more than one other case, and for
+all characters whose code points are greater than U+007f. For lower valued
+characters with only one other case, a lookup table is used for speed. When
+PCRE2_UTF is not set, a lookup table is used for all code points less than 256,
+and higher code points (available only in 16-bit or 32-bit mode) are treated as
+not having another case.
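The non-UTF rule described above can be modelled in a few lines of plain C. This is only an illustrative sketch and not PCRE2's own code: the table name flip_case, both function names, and the use of the C library's toupper() and tolower() to fill the table are assumptions made for the example.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical 256-entry table: flip_case[c] holds the other case of c,
   or c itself when c has no other case. */
static unsigned char flip_case[256];

/* Fill the table from the current C locale (an assumption of this sketch). */
void build_flip_case(void)
{
    for (int c = 0; c < 256; c++) {
        if (islower(c))
            flip_case[c] = (unsigned char)toupper(c);
        else if (isupper(c))
            flip_case[c] = (unsigned char)tolower(c);
        else
            flip_case[c] = (unsigned char)c;
    }
}

/* Non-UTF rule: table lookup below 256; code points of 256 or more
   (16-bit and 32-bit modes only) are treated as having no other case. */
uint32_t other_case_non_utf(uint32_t c)
{
    return (c < 256) ? flip_case[c] : c;
}
```

With the table built in the default "C" locale, other_case_non_utf('a') gives 'A', while a value such as 0x3b1 is returned unchanged.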
 <pre>
   PCRE2_DOLLAR_ENDONLY
 </pre>
@@ -1254,6 +1446,30 @@ details of named subpatterns below; see also the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
 documentation.
 <pre>
+  PCRE2_ENDANCHORED
+</pre>
+If this bit is set, the end of any pattern match must be right at the end of
+the string being searched (the "subject string"). If the pattern match
+succeeds by reaching (*ACCEPT), but does not reach the end of the subject, the
+match fails at the current starting point. For unanchored patterns, a new match
+is then tried at the next starting point. However, if the match succeeds by
+reaching the end of the pattern, but not the end of the subject, backtracking
+occurs and an alternative match may be found. Consider these two patterns:
+<pre>
+  .(*ACCEPT)|..
+  .|..
+</pre>
+If matched against "abc" with PCRE2_ENDANCHORED set, the first matches "c"
+whereas the second matches "bc". The effect of PCRE2_ENDANCHORED can also be
+achieved by appropriate constructs in the pattern itself, which is the only way
+to do it in Perl.
+</P>
+<P>
+For DFA matching with <b>pcre2_dfa_match()</b>, PCRE2_ENDANCHORED applies only
+to the first (that is, the longest) matched string. Other parallel matches,
+which are necessarily substrings of the first one, must obviously end before
+the end of the subject.
+<pre>
   PCRE2_EXTENDED
 </pre>
 If this bit is set, most white space characters in the pattern are totally
@@ -1280,14 +1496,39 @@ sequence at the start of the pattern, as described in the section entitled
 in the <b>pcre2pattern</b> documentation. A default is defined when PCRE2 is
 built.
 <pre>
+  PCRE2_EXTENDED_MORE
+</pre>
+This option has the effect of PCRE2_EXTENDED, but, in addition, unescaped space
+and horizontal tab characters are ignored inside a character class.
+PCRE2_EXTENDED_MORE is equivalent to Perl 5.26's /xx option, and it can be
+changed within a pattern by a (?xx) option setting.
+<pre>
   PCRE2_FIRSTLINE
 </pre>
-If this option is set, an unanchored pattern is required to match before or at
-the first newline in the subject string, though the matched text may continue
-over the newline. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
-general limiting facility. If PCRE2_FIRSTLINE is set with an offset limit, a
-match must occur in the first line and also within the offset limit. In other
-words, whichever limit comes first is used.
+If this option is set, the start of an unanchored pattern match must be before
+or at the first newline in the subject string following the start of matching,
+though the matched text may continue over the newline. If <i>startoffset</i> is
+non-zero, the limiting newline is not necessarily the first newline in the
+subject. For example, if the subject string is "abc\nxyz" (where \n
+represents a single-character newline) a pattern match for "yz" succeeds with
+PCRE2_FIRSTLINE if <i>startoffset</i> is greater than 3. See also
+PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
+PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
+line and also within the offset limit. In other words, whichever limit comes
+first is used.
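The interaction between the first-line rule and a non-zero starting offset can be checked with a small stand-alone model. The sketch below does not call PCRE2; the function name firstline_allows is hypothetical, only "\n" is treated as a newline, and the offset limit is left out.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if a match that starts at match_start is permitted by the
   first-line rule, 0 otherwise. The limiting newline is the first '\n'
   found at or after startoffset; if there is none, there is no limit. */
int firstline_allows(const char *subject, size_t length,
                     size_t startoffset, size_t match_start)
{
    if (startoffset > length || match_start > length ||
        match_start < startoffset)
        return 0;

    const char *nl = memchr(subject + startoffset, '\n',
                            length - startoffset);
    if (nl == NULL)
        return 1;

    /* The match may start before or at the newline, but not after it. */
    return match_start <= (size_t)(nl - subject);
}
```

For the subject "abc\nxyz", a match for "yz" starts at offset 5. The model rejects it when matching starts at offset 0 or 3, and accepts it when matching starts at offset 4, which agrees with the example in the text.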
+<pre>
+  PCRE2_LITERAL
+</pre>
+If this option is set, all meta-characters in the pattern are disabled, and it
+is treated as a literal string. Matching literal strings with a regular
+expression engine is not the most efficient way of doing it. If you are doing a
+lot of literal matching and are worried about efficiency, you should consider
+using other approaches. The only other main options that are allowed with
+PCRE2_LITERAL are: PCRE2_ANCHORED, PCRE2_ENDANCHORED, PCRE2_AUTO_CALLOUT,
+PCRE2_CASELESS, PCRE2_FIRSTLINE, PCRE2_NO_START_OPTIMIZE, PCRE2_NO_UTF_CHECK,
+PCRE2_UTF, and PCRE2_USE_OFFSET_LIMIT. The extra options PCRE2_EXTRA_MATCH_LINE
+and PCRE2_EXTRA_MATCH_WORD are also supported. Any other options cause an
+error.
 <pre>
   PCRE2_MATCH_UNSET_BACKREF
 </pre>
@@ -1352,8 +1593,8 @@ PCRE2_NEVER_UTF causes an error.
 If this option is set, it disables the use of numbered capturing parentheses in
 the pattern. Any opening parenthesis that is not followed by ? behaves as if it
 were followed by ?: but named parentheses can still be used for capturing (and
-they acquire numbers in the usual way). There is no equivalent of this option
-in Perl. Note that, if this option is set, references to capturing groups (back
+they acquire numbers in the usual way). This is the same as Perl's /n option.
+Note that, when this option is set, references to capturing groups (back
 references or recursion/subroutine calls) may only refer to named groups,
 though the reference can be by name or by number.
 <pre>
@@ -1389,8 +1630,8 @@ compiler.
 <P>
 There are a number of optimizations that may occur at the start of a match, in
 order to speed up the process. For example, if it is known that an unanchored
-match must start with a specific character, the matching code searches the
-subject for that character, and fails immediately if it cannot find it, without
+match must start with a specific code unit value, the matching code searches
+the subject for that value, and fails immediately if it cannot find it, without
 actually running the main matching function. This means that a special item
 such as (*COMMIT) at the start of a pattern is not considered until after a
 suitable starting point for the match has been found. Also, when callouts or
@@ -1419,9 +1660,11 @@ current starting position, which in this case, it does. However, if the same
 match is run with PCRE2_NO_START_OPTIMIZE set, the initial scan along the
 subject string does not happen. The first match attempt is run starting from
 "D" and when this fails, (*COMMIT) prevents any further matches being tried, so
-the overall result is "no match". There are also other start-up optimizations.
-For example, a minimum length for the subject may be recorded. Consider the
-pattern
+the overall result is "no match".
+</P>
+<P>
+There are also other start-up optimizations. For example, a minimum length for
+the subject may be recorded. Consider the pattern
 <pre>
   (*MARK:A)(X|Y)
 </pre>
@@ -1442,17 +1685,30 @@ and
 <a href="pcre2unicode.html#utf32strings">UTF-32 strings</a>
 in the
 <a href="pcre2unicode.html"><b>pcre2unicode</b></a>
-document.
-If an invalid UTF sequence is found, <b>pcre2_compile()</b> returns a negative
-error code.
+document. If an invalid UTF sequence is found, <b>pcre2_compile()</b> returns a
+negative error code.
 </P>
 <P>
-If you know that your pattern is valid, and you want to skip this check for
-performance reasons, you can set the PCRE2_NO_UTF_CHECK option. When it is set,
-the effect of passing an invalid UTF string as a pattern is undefined. It may
-cause your program to crash or loop. Note that this option can also be passed
-to <b>pcre2_match()</b> and <b>pcre_dfa_match()</b>, to suppress validity
-checking of the subject string.
+If you know that your pattern is a valid UTF string, and you want to skip this
+check for performance reasons, you can set the PCRE2_NO_UTF_CHECK option. When
+it is set, the effect of passing an invalid UTF string as a pattern is
+undefined. It may cause your program to crash or loop.
+</P>
+<P>
+Note that this option can also be passed to <b>pcre2_match()</b> and
+<b>pcre2_dfa_match()</b>, to suppress UTF validity checking of the subject
+string.
+</P>
+<P>
+Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
+error that is given if an escape sequence for an invalid Unicode code point is
+encountered in the pattern. In particular, the so-called "surrogate" code
+points (0xd800 to 0xdfff) are invalid. If you want to allow escape sequences
+such as \x{d800} you can set the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra
+option, as described in the section entitled "Extra compile options"
+<a href="#extracompileoptions">below.</a>
+However, this is possible only in UTF-8 and UTF-32 modes, because these values
+are not representable in UTF-16.
 <pre>
   PCRE2_UCP
 </pre>
@@ -1465,7 +1721,7 @@ in the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
 page. If you set PCRE2_UCP, matching one of the items it affects takes much
 longer. The option is available only if PCRE2 has been compiled with Unicode
-support.
+support (which is the default).
 <pre>
   PCRE2_UNGREEDY
 </pre>
@@ -1490,25 +1746,80 @@ This option causes PCRE2 to regard both the pattern and the subject strings
 that are subsequently processed as strings of UTF characters instead of
 single-code-unit strings. It is available when PCRE2 is built to include
 Unicode support (which is the default). If Unicode support is not available,
-the use of this option provokes an error. Details of how this option changes
-the behaviour of PCRE2 are given in the
+the use of this option provokes an error. Details of how PCRE2_UTF changes the
+behaviour of PCRE2 are given in the
 <a href="pcre2unicode.html"><b>pcre2unicode</b></a>
 page.
-</P>
-<br><a name="SEC19" href="#TOC1">COMPILATION ERROR CODES</a><br>
+<a name="extracompileoptions"></a></P>
+<br><b>
+Extra compile options
+</b><br>
 <P>
-There are over 80 positive error codes that <b>pcre2_compile()</b> may return
-(via <i>errorcode</i>) if it finds an error in the pattern. There are also some
-negative error codes that are used for invalid UTF strings. These are the same
-as given by <b>pcre2_match()</b> and <b>pcre2_dfa_match()</b>, and are described
-in the
-<a href="pcre2unicode.html"><b>pcre2unicode</b></a>
-page. The <b>pcre2_get_error_message()</b> function (see "Obtaining a textual
-error message"
-<a href="#geterrormessage">below)</a>
-can be called to obtain a textual error message from any error code.
+Unlike the main compile-time options, the extra options are not saved with the
+compiled pattern. The option bits that can be set in a compile context by
+calling the <b>pcre2_set_compile_extra_options()</b> function are as follows:
+<pre>
+  PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
+</pre>
+This option applies when compiling a pattern in UTF-8 or UTF-32 mode. It is
+forbidden in UTF-16 mode, and ignored in non-UTF modes. Unicode "surrogate"
+code points in the range 0xd800 to 0xdfff are used in pairs in UTF-16 to encode
+code points with values in the range 0x10000 to 0x10ffff. The surrogates cannot
+therefore be represented in UTF-16. They can be represented in UTF-8 and
+UTF-32, but are defined as invalid code points, and cause errors if encountered
+in a UTF-8 or UTF-32 string that is being checked for validity by PCRE2.
+</P>
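The arithmetic behind surrogate pairs shows why the values 0xd800 to 0xdfff cannot stand for characters in UTF-16. The following is a minimal sketch of the standard UTF-16 encoding rule; it is not part of the PCRE2 API, and the function names is_surrogate and to_surrogate_pair are hypothetical.

```c
#include <stdint.h>

/* Return 1 if cp lies in the surrogate range 0xd800 to 0xdfff. */
int is_surrogate(uint32_t cp)
{
    return cp >= 0xd800 && cp <= 0xdfff;
}

/* Encode a code point in the range 0x10000 to 0x10ffff as a UTF-16
   surrogate pair. Returns 1 on success, or 0 if cp is outside that range. */
int to_surrogate_pair(uint32_t cp, uint16_t *high, uint16_t *low)
{
    if (cp < 0x10000 || cp > 0x10ffff)
        return 0;
    cp -= 0x10000;                              /* 20 significant bits  */
    *high = (uint16_t)(0xd800 + (cp >> 10));    /* top 10 bits          */
    *low  = (uint16_t)(0xdc00 + (cp & 0x3ff));  /* bottom 10 bits       */
    return 1;
}
```

For example, U+1F600 is encoded as the pair 0xd83d, 0xde00. Because every 16-bit value in the surrogate range is reserved for this purpose, none of them is available to represent a code point in its own right.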
+<P>
+These values also cause errors if encountered in escape sequences such as
+\x{d912} within a pattern. However, it seems that some applications, when
+using PCRE2 to check for unwanted characters in UTF-8 strings, explicitly test
+for the surrogates using escape sequences. The PCRE2_NO_UTF_CHECK option does
+not disable the error that occurs, because it applies only to the testing of
+input strings for UTF validity.
+</P>
+<P>
+If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set, surrogate code
+point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
+incorporated in the compiled pattern. However, they can only match subject
+characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
+<pre>
+  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
+</pre>
+This is a dangerous option. Use with care. By default, an unrecognized escape
+such as \j or a malformed one such as \x{2z} causes a compile-time error when
+detected by <b>pcre2_compile()</b>. Perl is somewhat inconsistent in handling
+such items: for example, \j is treated as a literal "j", and non-hexadecimal
+digits in \x{} are just ignored, though warnings are given in both cases if
+Perl's warning switch is enabled. However, a malformed octal number after \o{
+always causes an error in Perl.
+</P>
+<P>
+If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL extra option is passed to
+<b>pcre2_compile()</b>, all unrecognized or erroneous escape sequences are
+treated as single-character escapes. For example, \j is a literal "j" and
+\x{2z} is treated as the literal string "x{2z}". Setting this option means
+that typos in patterns may go undetected and have unexpected results. This is a
+dangerous option. Use with care.
+<pre>
+  PCRE2_EXTRA_MATCH_LINE
+</pre>
+This option is provided for use by the <b>-x</b> option of <b>pcre2grep</b>. It
+causes the pattern only to match complete lines. This is achieved by
+automatically inserting the code for "^(?:" at the start of the compiled
+pattern and ")$" at the end. Thus, when PCRE2_MULTILINE is set, the matched
+line may be in the middle of the subject string. This option can be used with
+PCRE2_LITERAL.
+<pre>
+  PCRE2_EXTRA_MATCH_WORD
+</pre>
+This option is provided for use by the <b>-w</b> option of <b>pcre2grep</b>. It
+causes the pattern only to match strings that have a word boundary at the start
+and the end. This is achieved by automatically inserting the code for "\b(?:"
+at the start of the compiled pattern and ")\b" at the end. The option may be
+used with PCRE2_LITERAL. However, it is ignored if PCRE2_EXTRA_MATCH_LINE is
+also set.
 <a name="jitcompiling"></a></P>
-<br><a name="SEC20" href="#TOC1">JUST-IN-TIME (JIT) COMPILATION</a><br>
+<br><a name="SEC21" href="#TOC1">JUST-IN-TIME (JIT) COMPILATION</a><br>
 <P>
 <b>int pcre2_jit_compile(pcre2_code *<i>code</i>, uint32_t <i>options</i>);</b>
 <br>
@@ -1544,18 +1855,18 @@ documentation.
 JIT compilation is a heavyweight optimization. It can take some time for
 patterns to be analyzed, and for one-off matches and simple patterns the
 benefit of faster execution might be offset by a much slower compilation time.
-Most, but not all patterns can be optimized by the JIT compiler.
+Most (but not all) patterns can be optimized by the JIT compiler.
 <a name="localesupport"></a></P>
-<br><a name="SEC21" href="#TOC1">LOCALE SUPPORT</a><br>
+<br><a name="SEC22" href="#TOC1">LOCALE SUPPORT</a><br>
 <P>
 PCRE2 handles caseless matching, and determines whether characters are letters,
 digits, or whatever, by reference to a set of tables, indexed by character code
 point. This applies only to characters whose code points are less than 256. By
 default, higher-valued code points never match escapes such as \w or \d.
-However, if PCRE2 is built with UTF support, all characters can be tested with
-\p and \P, or, alternatively, the PCRE2_UCP option can be set when a pattern
-is compiled; this causes \w and friends to use Unicode property support
-instead of the built-in tables.
+However, if PCRE2 is built with Unicode support, all characters can be tested
+with \p and \P, or, alternatively, the PCRE2_UCP option can be set when a
+pattern is compiled; this causes \w and friends to use Unicode property
+support instead of the built-in tables.
 </P>
 <P>
 The use of locales with Unicode is discouraged. If you are handling characters
@@ -1599,10 +1910,10 @@ available for as long as it is needed.
 The pointer that is passed (via the compile context) to <b>pcre2_compile()</b>
 is saved with the compiled pattern, and the same tables are used by
 <b>pcre2_match()</b> and <b>pcre2_dfa_match()</b>. Thus, for any single pattern,
-compilation, and matching all happen in the same locale, but different patterns
+compilation and matching both happen in the same locale, but different patterns
 can be processed in different locales.
 <a name="infoaboutpattern"></a></P>
-<br><a name="SEC22" href="#TOC1">INFORMATION ABOUT A COMPILED PATTERN</a><br>
+<br><a name="SEC23" href="#TOC1">INFORMATION ABOUT A COMPILED PATTERN</a><br>
 <P>
 <b>int pcre2_pattern_info(const pcre2_code *<i>code</i>, uint32_t <i>what</i>, void *<i>where</i>);</b>
 </P>
@@ -1615,7 +1926,7 @@ pattern. The second argument specifies which piece of information is required,
 and the third argument is a pointer to a variable to receive the data. If the
 third argument is NULL, the first argument is ignored, and the function returns
 the size in bytes of the variable that is required for the information
-requested. Otherwise, The yield of the function is zero for success, or one of
+requested. Otherwise, the yield of the function is zero for success, or one of
 the following negative numbers:
 <pre>
   PCRE2_ERROR_NULL           the argument <i>code</i> was NULL
@@ -1639,12 +1950,15 @@ are as follows:
 <pre>
   PCRE2_INFO_ALLOPTIONS
   PCRE2_INFO_ARGOPTIONS
+  PCRE2_INFO_EXTRAOPTIONS
 </pre>
-Return a copy of the pattern's options. The third argument should point to a
+Return copies of the pattern's options. The third argument should point to a
 <b>uint32_t</b> variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that
 were passed to <b>pcre2_compile()</b>, whereas PCRE2_INFO_ALLOPTIONS returns
 the compile options as modified by any top-level (*XXX) option settings such as
-(*UTF) at the start of the pattern itself.
+(*UTF) at the start of the pattern itself. PCRE2_INFO_EXTRAOPTIONS returns the
+extra options that were set in the compile context by calling the
+<b>pcre2_set_compile_extra_options()</b> function.
 </P>
 <P>
 For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
@@ -1668,8 +1982,8 @@ following are true:
   .* is not in an atomic group
   .* is not in a capturing group that is the subject of a back reference
   PCRE2_DOTALL is in force for .*
-  Neither (*PRUNE) nor (*SKIP) appears in the pattern.
-  PCRE2_NO_DOTSTAR_ANCHOR is not set.
+  Neither (*PRUNE) nor (*SKIP) appears in the pattern
+  PCRE2_NO_DOTSTAR_ANCHOR is not set
 </pre>
 For patterns that are auto-anchored, the PCRE2_ANCHORED bit is set in the
 options returned for PCRE2_INFO_ALLOPTIONS.
@@ -1697,6 +2011,15 @@ Return the highest capturing subpattern number in the pattern. In patterns
 where (?| is not used, this is also the total number of capturing subpatterns.
 The third argument should point to an <b>uint32_t</b> variable.
 <pre>
+  PCRE2_INFO_DEPTHLIMIT
+</pre>
+If the pattern set a backtracking depth limit by including an item of the form
+(*LIMIT_DEPTH=nnnn) at the start, the value is returned. The third argument
+should point to an unsigned 32-bit integer. If no such value has been set, the
+call to <b>pcre2_pattern_info()</b> returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
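The rule for combining a limit set in the pattern with the caller's limit amounts to taking the smaller of the two. A minimal sketch, with a hypothetical function name and parameters that are not part of the PCRE2 API:

```c
#include <stdint.h>

/* caller_limit is the value set or defaulted by the caller of the match
   function. pattern_has_limit is non-zero when the pattern starts with an
   item such as (*LIMIT_DEPTH=nnnn), whose value is pattern_limit. */
uint32_t effective_limit(uint32_t caller_limit, int pattern_has_limit,
                         uint32_t pattern_limit)
{
    if (pattern_has_limit && pattern_limit < caller_limit)
        return pattern_limit;   /* the pattern can only lower the limit */
    return caller_limit;
}
```

A pattern can therefore tighten the limit chosen by the application but can never raise it.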
+<pre>
   PCRE2_INFO_FIRSTBITMAP
 </pre>
 In the absence of a single first code unit for a non-anchored pattern,
@@ -1713,21 +2036,29 @@ returned. Otherwise NULL is returned. The third argument should point to an
 Return information about the first code unit of any matched string, for a
 non-anchored pattern. The third argument should point to an <b>uint32_t</b>
 variable. If there is a fixed first value, for example, the letter "c" from a
-pattern such as (cat|cow|coyote), 1 is returned, and the character value can be
-retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed first value, but
-it is known that a match can occur only at the start of the subject or
-following a newline in the subject, 2 is returned. Otherwise, and for anchored
-patterns, 0 is returned.
+pattern such as (cat|cow|coyote), 1 is returned, and the value can be retrieved
+using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed first value, but it is
+known that a match can occur only at the start of the subject or following a
+newline in the subject, 2 is returned. Otherwise, and for anchored patterns, 0
+is returned.
 <pre>
   PCRE2_INFO_FIRSTCODEUNIT
 </pre>
-Return the value of the first code unit of any matched string in the situation
+Return the value of the first code unit of any matched string for a pattern
 where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0. The third
 argument should point to an <b>uint32_t</b> variable. In the 8-bit library, the
 value is always less than 256. In the 16-bit library the value can be up to
 0xffff. In the 32-bit library in UTF-32 mode the value can be up to 0x10ffff,
 and up to 0xffffffff when not using UTF-32 mode.
 <pre>
+  PCRE2_INFO_FRAMESIZE
+</pre>
+Return the size (in bytes) of the data frames that are used to remember
+backtracking positions when the pattern is processed by <b>pcre2_match()</b>
+without the use of JIT. The third argument should point to a <b>size_t</b>
+variable. The frame size depends on the number of capturing parentheses in the
+pattern. Each additional capturing group adds two PCRE2_SIZE variables.
+<pre>
   PCRE2_INFO_HASBACKSLASHC
 </pre>
 Return 1 if the pattern contains any instances of \C, otherwise 0. The third
@@ -1737,7 +2068,17 @@ argument should point to an <b>uint32_t</b> variable.
 </pre>
 Return 1 if the pattern contains any explicit matches for CR or LF characters,
 otherwise 0. The third argument should point to an <b>uint32_t</b> variable. An
-explicit match is either a literal CR or LF character, or \r or \n.
+explicit match is either a literal CR or LF character, or \r or \n or one of
+the equivalent hexadecimal or octal escape sequences.
+<pre>
+  PCRE2_INFO_HEAPLIMIT
+</pre>
+If the pattern set a heap memory limit by including an item of the form
+(*LIMIT_HEAP=nnnn) at the start, the value is returned. The third argument
+should point to an unsigned 32-bit integer. If no such value has been set, the
+call to <b>pcre2_pattern_info()</b> returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
 <pre>
   PCRE2_INFO_JCHANGED
 </pre>
@@ -1764,10 +2105,10 @@ PCRE2_INFO_LASTCODEUNIT), but for /^a\dz\d/ the returned value is 0.
 <pre>
   PCRE2_INFO_LASTCODEUNIT
 </pre>
-Return the value of the rightmost literal data unit that must exist in any
-matched string, other than at its start, if such a value has been recorded. The
-third argument should point to an <b>uint32_t</b> variable. If there is no such
-value, 0 is returned.
+Return the value of the rightmost literal code unit that must exist in any
+matched string, other than at its start, for a pattern where
+PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third argument
+should point to a <b>uint32_t</b> variable.
 <pre>
   PCRE2_INFO_MATCHEMPTY
 </pre>
@@ -1782,7 +2123,9 @@ in such cases.
 If the pattern set a match limit by including an item of the form
 (*LIMIT_MATCH=nnnn) at the start, the value is returned. The third argument
 should point to an unsigned 32-bit integer. If no such value has been set, the
-call to <b>pcre2_pattern_info()</b> returns the error PCRE2_ERROR_UNSET.
+call to <b>pcre2_pattern_info()</b> returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
 <pre>
   PCRE2_INFO_MAXLOOKBEHIND
 </pre>
@@ -1794,7 +2137,8 @@ require a one-character lookbehind. \A also registers a one-character
 lookbehind, though it does not actually inspect the previous character. This is
 to ensure that at least one character from the old segment is retained when a
 new segment is processed. Otherwise, if there are no lookbehinds in the
-pattern, \A might match incorrectly at the start of a new segment.
+pattern, \A might match incorrectly at the start of a second or subsequent
+segment.
 <pre>
   PCRE2_INFO_MINLENGTH
 </pre>
@@ -1874,23 +2218,17 @@ different for each compiled pattern.
 <pre>
   PCRE2_INFO_NEWLINE
 </pre>
-The output is a <b>uint32_t</b> with one of the following values:
+The output is one of the following <b>uint32_t</b> values:
 <pre>
   PCRE2_NEWLINE_CR       Carriage return (CR)
   PCRE2_NEWLINE_LF       Linefeed (LF)
   PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
   PCRE2_NEWLINE_ANY      Any Unicode line ending
   PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 </pre>
-This specifies the default character sequence that will be recognized as
-meaning "newline" while matching.
-<pre>
-  PCRE2_INFO_RECURSIONLIMIT
-</pre>
-If the pattern set a recursion limit by including an item of the form
-(*LIMIT_RECURSION=nnnn) at the start, the value is returned. The third
-argument should point to an unsigned 32-bit integer. If no such value has been
-set, the call to <b>pcre2_pattern_info()</b> returns the error PCRE2_ERROR_UNSET.
+This identifies the character sequence that will be recognized as meaning
+"newline" while matching.
 <pre>
   PCRE2_INFO_SIZE
 </pre>
@@ -1903,7 +2241,7 @@ value returned by this option, because there are cases where the code that
 calculates the size has to over-estimate. Processing a pattern with the JIT
 compiler does not alter the value returned by this option.
 <a name="infoaboutcallouts"></a></P>
-<br><a name="SEC23" href="#TOC1">INFORMATION ABOUT A PATTERN'S CALLOUTS</a><br>
+<br><a name="SEC24" href="#TOC1">INFORMATION ABOUT A PATTERN'S CALLOUTS</a><br>
 <P>
 <b>int pcre2_callout_enumerate(const pcre2_code *<i>code</i>,</b>
 <b>  int (*<i>callback</i>)(pcre2_callout_enumerate_block *, void *),</b>
@@ -1922,7 +2260,7 @@ contents of the callout enumeration block are described in the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation, which also gives further details about callouts.
 </P>
-<br><a name="SEC24" href="#TOC1">SERIALIZATION AND PRECOMPILING</a><br>
+<br><a name="SEC25" href="#TOC1">SERIALIZATION AND PRECOMPILING</a><br>
 <P>
 It is possible to save compiled patterns on disc or elsewhere, and reload them
 later, subject to a number of restrictions. The functions whose names begin
@@ -1931,7 +2269,7 @@ the
 <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
 documentation.
 <a name="matchdatablock"></a></P>
-<br><a name="SEC25" href="#TOC1">THE MATCH DATA BLOCK</a><br>
+<br><a name="SEC26" href="#TOC1">THE MATCH DATA BLOCK</a><br>
 <P>
 <b>pcre2_match_data *pcre2_match_data_create(uint32_t <i>ovecsize</i>,</b>
 <b>  pcre2_general_context *<i>gcontext</i>);</b>
@@ -1948,7 +2286,7 @@ Information about a successful or unsuccessful match is placed in a match
 data block, which is an opaque structure that is accessed by function calls. In
 particular, the match data block contains a vector of offsets into the subject
 string that define the matched part of the subject and any substrings that were
-captured. This is know as the <i>ovector</i>.
+captured. This is known as the <i>ovector</i>.
 </P>
 <P>
 Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
@@ -1956,9 +2294,9 @@ Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
 the creation functions above. For <b>pcre2_match_data_create()</b>, the first
 argument is the number of pairs of offsets in the <i>ovector</i>. One pair of
 offsets is required to identify the string that matched the whole pattern, with
-another pair for each captured substring. For example, a value of 4 creates
-enough space to record the matched portion of the subject plus three captured
-substrings. A minimum of at least 1 pair is imposed by
+an additional pair for each captured substring. For example, a value of 4
+creates enough space to record the matched portion of the subject plus three
+captured substrings. A minimum of 1 pair is imposed by
 <b>pcre2_match_data_create()</b>, so it is always possible to return the overall
 matched string.
 </P>
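The sizing rule for the ovector described above can be sketched in C. This is a minimal illustration, not PCRE2 code: the helper names ovector_pairs_needed() and ovector_elements() are hypothetical, and the capture count is assumed to have been obtained elsewhere (for example from pcre2_pattern_info() with PCRE2_INFO_CAPTURECOUNT).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of PCRE2): number of offset pairs needed to
   record the overall match plus one pair per capturing group. */
uint32_t ovector_pairs_needed(uint32_t capture_count)
{
  return capture_count + 1u;
}

/* Hypothetical helper: each pair holds a start and an end offset, so the
   ovector has two elements per pair. A request for zero pairs is treated as
   one pair, mirroring the documented minimum. */
size_t ovector_elements(uint32_t pairs)
{
  if (pairs == 0) pairs = 1;
  return (size_t)pairs * 2u;
}
```

Under these assumptions, a pattern with three capturing groups needs four pairs, which is the value that would be passed as the ovecsize argument.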
@@ -2002,7 +2340,7 @@ match data block (for that match) have taken place.
 When a match data block itself is no longer needed, it should be freed by
 calling <b>pcre2_match_data_free()</b>.
 </P>
-<br><a name="SEC26" href="#TOC1">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a><br>
+<br><a name="SEC27" href="#TOC1">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a><br>
 <P>
 <b>int pcre2_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
 <b>  PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
@@ -2033,7 +2371,7 @@ Here is an example of a simple call to <b>pcre2_match()</b>:
     11,             /* the length of the subject string */
     0,              /* start at offset 0 in the subject */
     0,              /* default options */
-    match_data,     /* the match data block */
+    md,             /* the match data block */
     NULL);          /* a match context; NULL means use defaults */
 </pre>
 If the subject string is zero-terminated, the length can be given as
@@ -2096,25 +2434,27 @@ character is CR followed by LF, advance the starting offset by two characters
 instead of one.
 </P>
 <P>
-If a non-zero starting offset is passed when the pattern is anchored, one
+If a non-zero starting offset is passed when the pattern is anchored, a single
 attempt to match at the given offset is made. This can only succeed if the
-pattern does not require the match to be at the start of the subject.
+pattern does not require the match to be at the start of the subject. In other
+words, the anchoring must be the result of setting the PCRE2_ANCHORED option or
+the use of .* with PCRE2_DOTALL, not by starting the pattern with ^ or \A.
 <a name="matchoptions"></a></P>
 <br><b>
 Option bits for <b>pcre2_match()</b>
 </b><br>
 <P>
 The unused bits of the <i>options</i> argument for <b>pcre2_match()</b> must be
-zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
-PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
-PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
-described below.
+zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_ENDANCHORED,
+PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
+PCRE2_NO_JIT, PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT.
+Their action is described below.
 </P>
 <P>
-Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
-compiler. If it is set, JIT matching is disabled and the normal interpretive
-code in <b>pcre2_match()</b> is run. Apart from PCRE2_NO_JIT (obviously), the
-remaining options are supported for JIT matching.
+Setting PCRE2_ANCHORED or PCRE2_ENDANCHORED at match time is not supported by
+the just-in-time (JIT) compiler. If either is set, JIT matching is disabled and the
+interpretive code in <b>pcre2_match()</b> is run. Apart from PCRE2_NO_JIT
+(obviously), the remaining options are supported for JIT matching.
 <pre>
   PCRE2_ANCHORED
 </pre>
@@ -2124,6 +2464,12 @@ to be anchored by virtue of its contents, it cannot be made unachored at
 matching time. Note that setting the option at match time disables JIT
 matching.
 <pre>
+  PCRE2_ENDANCHORED
+</pre>
+If the PCRE2_ENDANCHORED option is set, any string that <b>pcre2_match()</b>
+matches must be right at the end of the subject string. Note that setting the
+option at match time disables JIT matching.
+<pre>
   PCRE2_NOTBOL
 </pre>
 This option specifies that the first character of the subject string is not the
@@ -2199,13 +2545,13 @@ page.
 If you know that your subject is valid, and you want to skip these checks for
 performance reasons, you can set the PCRE2_NO_UTF_CHECK option when calling
 <b>pcre2_match()</b>. You might want to do this for the second and subsequent
-calls to <b>pcre2_match()</b> if you are making repeated calls to find all the
-matches in a single subject string.
+calls to <b>pcre2_match()</b> if you are making repeated calls to find other
+matches in the same subject string.
 </P>
 <P>
-NOTE: When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid string
-as a subject, or an invalid value of <i>startoffset</i>, is undefined. Your
-program may crash or loop indefinitely.
+WARNING: When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid
+string as a subject, or an invalid value of <i>startoffset</i>, is undefined.
+Your program may crash or loop indefinitely.
 <pre>
   PCRE2_PARTIAL_HARD
   PCRE2_PARTIAL_SOFT
@@ -2232,7 +2578,7 @@ examples, in the
 <a href="pcre2partial.html"><b>pcre2partial</b></a>
 documentation.
 </P>
-<br><a name="SEC27" href="#TOC1">NEWLINE HANDLING WHEN MATCHING</a><br>
+<br><a name="SEC28" href="#TOC1">NEWLINE HANDLING WHEN MATCHING</a><br>
 <P>
 When PCRE2 is built, a default newline convention is set; this is usually the
 standard convention for the operating system. The default can be overridden in
@@ -2264,15 +2610,15 @@ reference, and so advances only by one character after the first failure.
 </P>
 <P>
 An explicit match for CR or LF is either a literal appearance of one of those
-characters in the pattern, or one of the \r or \n escape sequences. Implicit
-matches such as [^X] do not count, nor does \s, even though it includes CR and
-LF in the characters that it matches.
+characters in the pattern, or one of the \r or \n escape sequences or their
+octal or hexadecimal equivalents. Implicit matches such as [^X] do not count, nor
+does \s, even though it includes CR and LF in the characters that it matches.
 </P>
 <P>
 Notwithstanding the above, anomalous effects may still occur when CRLF is a
 valid newline sequence and explicit \r or \n escapes appear in the pattern.
 <a name="matchedstrings"></a></P>
-<br><a name="SEC28" href="#TOC1">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a><br>
+<br><a name="SEC29" href="#TOC1">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a><br>
 <P>
 <b>uint32_t pcre2_get_ovector_count(pcre2_match_data *<i>match_data</i>);</b>
 <br>
@@ -2322,12 +2668,12 @@ identify the part of the subject that was partially matched. See the
 documentation for details of partial matching.
 </P>
 <P>
-After a successful match, the first pair of offsets identifies the portion of
-the subject string that was matched by the entire pattern. The next pair is
-used for the first capturing subpattern, and so on. The value returned by
+After a fully successful match, the first pair of offsets identifies the
+portion of the subject string that was matched by the entire pattern. The next
+pair is used for the first captured substring, and so on. The value returned by
 <b>pcre2_match()</b> is one more than the highest numbered pair that has been
 set. For example, if two substrings have been captured, the returned value is
-3. If there are no capturing subpatterns, the return value from a successful
+3. If there are no captured substrings, the return value from a successful
 match is 1, indicating that just the first pair of offsets has been set.
 </P>
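The relationship between the return value and the offset pairs can be sketched as follows. This is an illustration under stated assumptions, not PCRE2 code: readable_pairs() and substring_length() are hypothetical helpers, SKETCH_UNSET is a local stand-in for PCRE2_UNSET, and offsets are assumed to be of type size_t with the start not exceeding the end.

```c
#include <stddef.h>

/* Local stand-in for PCRE2_UNSET, the value stored for a group that did not
   participate in the match. Assumption: offsets are size_t values. */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical helper: how many offset pairs may be read, given the return
   value rc of a match call and an ovector that holds ovector_pairs pairs. */
int readable_pairs(int rc, int ovector_pairs)
{
  if (rc < 0) return 0;               /* no match or error: not handled here */
  if (rc == 0) return ovector_pairs;  /* ovector too small: it was filled */
  return rc;                          /* one more than the highest pair set */
}

/* Hypothetical helper: length in code units of substring n, or SKETCH_UNSET
   when the group is unset. Assumes start <= end for set groups. */
size_t substring_length(const size_t *ovector, int n)
{
  size_t start = ovector[2 * n];
  size_t end = ovector[2 * n + 1];
  if (start == SKETCH_UNSET) return SKETCH_UNSET;
  return end - start;
}
```

Pair 0 always describes the whole match, so a return value of 1 means that only ovector[0] and ovector[1] carry information.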
 <P>
@@ -2345,11 +2691,7 @@ returned.
 If the ovector is too small to hold all the captured substring offsets, as much
 as possible is filled in, and the function returns a value of zero. If captured
 substrings are not of interest, <b>pcre2_match()</b> may be called with a match
-data block whose ovector is of minimum length (that is, one pair). However, if
-the pattern contains back references and the <i>ovector</i> is not big enough to
-remember the related substrings, PCRE2 has to get additional memory for use
-during matching. Thus it is usually advisable to set up a match data block
-containing an ovector of reasonable size.
+data block whose ovector is of minimum length (that is, one pair).
 </P>
 <P>
 It is possible for capturing subpattern number <i>n+1</i> to match some part of
@@ -2375,7 +2717,7 @@ parentheses, no more than <i>ovector[0]</i> to <i>ovector[2n+1]</i> are set by
 <b>pcre2_match()</b>. The other elements retain whatever values they previously
 had.
 <a name="matchotherdata"></a></P>
-<br><a name="SEC29" href="#TOC1">OTHER INFORMATION ABOUT A MATCH</a><br>
+<br><a name="SEC30" href="#TOC1">OTHER INFORMATION ABOUT A MATCH</a><br>
 <P>
 <b>PCRE2_SPTR pcre2_get_mark(pcre2_match_data *<i>match_data</i>);</b>
 <br>
@@ -2390,25 +2732,28 @@ undefined.
 </P>
 <P>
 After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
-to match (PCRE2_ERROR_NOMATCH), a (*MARK) name may be available, and
-<b>pcre2_get_mark()</b> can be called. It returns a pointer to the
-zero-terminated name, which is within the compiled pattern. Otherwise NULL is
-returned. The length of the (*MARK) name (excluding the terminating zero) is
-stored in the code unit that preceeds the name. You should use this instead of
-relying on the terminating zero if the (*MARK) name might contain a binary
-zero.
-</P>
-<P>
-After a successful match, the (*MARK) name that is returned is the
-last one encountered on the matching path through the pattern. After a "no
-match" or a partial match, the last encountered (*MARK) name is returned. For
-example, consider this pattern:
+to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be
+available. The function <b>pcre2_get_mark()</b> can be called to access this
+name. The same function applies to all three verbs. It returns a pointer to the
+zero-terminated name, which is within the compiled pattern. If no name is
+available, NULL is returned. The length of the name (excluding the terminating
+zero) is stored in the code unit that precedes the name. You should use this
+length instead of relying on the terminating zero if the name might contain a
+binary zero.
+</P>
+<P>
+After a successful match, the name that is returned is the last (*MARK),
+(*PRUNE), or (*THEN) name encountered on the matching path through the pattern.
+Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example,
+if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
+After a "no match" or a partial match, the last encountered name is returned.
+For example, consider this pattern:
 <pre>
   ^(*MARK:A)((*MARK:B)a|b)c
 </pre>
-When it matches "bc", the returned mark is A. The B mark is "seen" in the first
+When it matches "bc", the returned name is A. The B mark is "seen" in the first
 branch of the group, but it is not on the matching path. On the other hand,
-when this pattern fails to match "bx", the returned mark is B.
+when this pattern fails to match "bx", the returned name is B.
 </P>
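The note above about the stored name length can be illustrated with a short C sketch. It assumes the 8-bit library, where a code unit is one byte; mark_name_length() is a hypothetical helper, and the pointer passed to it is assumed to be the value returned by pcre2_get_mark() (or NULL).

```c
#include <stddef.h>

/* Hypothetical helper: read the length of a (*MARK), (*PRUNE), or (*THEN)
   name from the code unit that precedes it, as the text above describes.
   Assumption: 8-bit code units, so the length fits in one byte. */
size_t mark_name_length(const unsigned char *mark)
{
  if (mark == NULL) return 0;   /* no name is available */
  return (size_t)mark[-1];      /* length excludes the terminating zero */
}
```

Reading the length this way, rather than calling strlen(), gives the right answer when the name itself contains a binary zero.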
 <P>
 After a successful match, a partial match, or one of the invalid UTF errors
@@ -2425,7 +2770,7 @@ the code unit offset of the invalid UTF character. Details are given in the
 <a href="pcre2unicode.html"><b>pcre2unicode</b></a>
 page.
 <a name="errorlist"></a></P>
-<br><a name="SEC30" href="#TOC1">ERROR RETURNS FROM <b>pcre2_match()</b></a><br>
+<br><a name="SEC31" href="#TOC1">ERROR RETURNS FROM <b>pcre2_match()</b></a><br>
 <P>
 If <b>pcre2_match()</b> fails, it returns a negative number. This can be
 converted to a text string by calling the <b>pcre2_get_error_message()</b>
@@ -2457,8 +2802,9 @@ returned when the magic number is not present.
 <pre>
   PCRE2_ERROR_BADMODE
 </pre>
-This error is given when a pattern that was compiled by the 8-bit library is
-passed to a 16-bit or 32-bit library function, or vice versa.
+This error is given when a compiled pattern is passed to a function in a
+library of a different code unit width, for example, a pattern compiled by
+the 8-bit library is passed to a 16-bit or 32-bit library function.
 <pre>
   PCRE2_ERROR_BADOFFSET
 </pre>
@@ -2483,20 +2829,19 @@ use by callout functions that want to cause <b>pcre2_match()</b> or
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation for details.
 <pre>
+  PCRE2_ERROR_DEPTHLIMIT
+</pre>
+The nested backtracking depth limit was reached.
+<pre>
+  PCRE2_ERROR_HEAPLIMIT
+</pre>
+The heap limit was reached.
+<pre>
   PCRE2_ERROR_INTERNAL
 </pre>
 An unexpected internal error has occurred. This error could be caused by a bug
 in PCRE2 or by overwriting of the compiled pattern.
 <pre>
-  PCRE2_ERROR_JIT_BADOPTION
-</pre>
-This error is returned when a pattern that was successfully studied using JIT
-is being matched, but the matching mode (partial or complete match) does not
-correspond to any JIT compilation mode. When the JIT fast path function is
-used, this error may be also given for invalid options. See the
-<a href="pcre2jit.html"><b>pcre2jit</b></a>
-documentation for more details.
-<pre>
   PCRE2_ERROR_JIT_STACKLIMIT
 </pre>
 This error is returned when a pattern that was successfully studied using JIT
@@ -2507,15 +2852,14 @@ documentation for more details.
 <pre>
   PCRE2_ERROR_MATCHLIMIT
 </pre>
-The backtracking limit was reached.
+The backtracking match limit was reached.
 <pre>
   PCRE2_ERROR_NOMEMORY
 </pre>
-If a pattern contains back references, but the ovector is not big enough to
-remember the referenced substrings, PCRE2 gets a block of memory at the start
-of matching to use for this purpose. There are some other special cases where
-extra memory is needed during matching. This error is given when memory cannot
-be obtained.
+If a pattern contains many nested backtracking points, heap memory is used to
+remember them. This error is given when the memory allocation function (default
+or custom) fails. Note that a different error, PCRE2_ERROR_HEAPLIMIT, is given
+if the amount of memory needed exceeds the heap limit.
 <pre>
   PCRE2_ERROR_NULL
 </pre>
@@ -2531,12 +2875,8 @@ in the subject string. Some simple patterns that might do this are detected and
 faulted at compile time, but more complicated cases, in particular mutual
 recursions between two different subpatterns, cannot be detected until matching
 is attempted.
-<pre>
-  PCRE2_ERROR_RECURSIONLIMIT
-</pre>
-The internal recursion limit was reached.
 <a name="geterrormessage"></a></P>
-<br><a name="SEC31" href="#TOC1">OBTAINING A TEXTUAL ERROR MESSAGE</a><br>
+<br><a name="SEC32" href="#TOC1">OBTAINING A TEXTUAL ERROR MESSAGE</a><br>
 <P>
 <b>int pcre2_get_error_message(int <i>errorcode</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
 <b>  PCRE2_SIZE <i>bufflen</i>);</b>
@@ -2545,8 +2885,8 @@ The internal recursion limit was reached.
 A text message for an error code from any PCRE2 function (compile, match, or
 auxiliary) can be obtained by calling <b>pcre2_get_error_message()</b>. The code
 is passed as the first argument, with the remaining two arguments specifying a
-code unit buffer and its length, into which the text message is placed. Note
-that the message is returned in code units of the appropriate width for the
+code unit buffer and its length in code units, into which the text message is
+placed. The message is returned in code units of the appropriate width for the
 library that is being used.
 </P>
 <P>
@@ -2557,7 +2897,7 @@ returned. If the buffer is too small, the message is truncated (but still with
 a trailing zero), and the negative error code PCRE2_ERROR_NOMEMORY is returned.
 None of the messages are very long; a buffer size of 120 code units is ample.
 <a name="extractbynumber"></a></P>
-<br><a name="SEC32" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
+<br><a name="SEC33" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
 <P>
 <b>int pcre2_substring_length_bynumber(pcre2_match_data *<i>match_data</i>,</b>
 <b>  uint32_t <i>number</i>, PCRE2_SIZE *<i>length</i>);</b>
@@ -2654,7 +2994,7 @@ The substring did not participate in the match. For example, if the pattern is
 (abc)|(def) and the subject is "def", and the ovector contains at least two
 capturing slots, substring number 1 is unset.
 </P>
-<br><a name="SEC33" href="#TOC1">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a><br>
+<br><a name="SEC34" href="#TOC1">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a><br>
 <P>
 <b>int pcre2_substring_list_get(pcre2_match_data *<i>match_data</i>,</b>
 <b>  PCRE2_UCHAR ***<i>listptr</i>, PCRE2_SIZE **<i>lengthsptr</i>);</b>
@@ -2693,7 +3033,7 @@ can be distinguished from a genuine zero-length substring by inspecting the
 appropriate offset in the ovector, which contains PCRE2_UNSET for unset
 substrings, or by calling <b>pcre2_substring_length_bynumber()</b>.
 <a name="extractbyname"></a></P>
-<br><a name="SEC34" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a><br>
+<br><a name="SEC35" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a><br>
 <P>
 <b>int pcre2_substring_number_from_name(const pcre2_code *<i>code</i>,</b>
 <b>  PCRE2_SPTR <i>name</i>);</b>
@@ -2725,8 +3065,8 @@ calling <b>pcre2_substring_number_from_name()</b>. The first argument is the
 compiled pattern, and the second is the name. The yield of the function is the
 subpattern number, PCRE2_ERROR_NOSUBSTRING if there is no subpattern of that
 name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one subpattern of
-that name. Given the number, you can extract the substring directly, or use one
-of the functions described above.
+that name. Given the number, you can extract the substring directly from the
+ovector, or use one of the "bynumber" functions described above.
 </P>
 <P>
 For convenience, there are also "byname" functions that correspond to the
@@ -2753,7 +3093,7 @@ names are not included in the compiled code. The matching process uses only
 numbers. For this reason, the use of different names for subpatterns of the
 same number causes an error at compile time.
 </P>
-<br><a name="SEC35" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
+<br><a name="SEC36" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
 <P>
 <b>int pcre2_substitute(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
 <b>  PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
@@ -2800,12 +3140,12 @@ length is in code units, not bytes.
 In the replacement string, which is interpreted as a UTF string in UTF mode,
 and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
 dollar character is an escape character that can specify the insertion of
-characters from capturing groups or (*MARK) items in the pattern. The following
-forms are always recognized:
+characters from capturing groups or (*MARK), (*PRUNE), or (*THEN) items in the
+pattern. The following forms are always recognized:
 <pre>
   $$                  insert a dollar character
   $&#60;n&#62; or ${&#60;n&#62;}      insert the contents of group &#60;n&#62;
-  $*MARK or ${*MARK}  insert the name of the last (*MARK) encountered
+  $*MARK or ${*MARK}  insert a (*MARK), (*PRUNE), or (*THEN) name
 </pre>
 Either a group number or a group name can be given for &#60;n&#62;. Curly brackets are
 required only if the following character would be interpreted as part of the
@@ -2814,25 +3154,43 @@ For example, if the pattern a(b)c is matched with "=abc=" and the replacement
 string "+$1$0$1+", the result is "=+babcb+=".
 </P>
 <P>
-The facility for inserting a (*MARK) name can be used to perform simple
-simultaneous substitutions, as this <b>pcre2test</b> example shows:
+$*MARK inserts the name from the last encountered (*MARK), (*PRUNE), or (*THEN)
+on the matching path that has a name. (*MARK) must always include a name, but
+(*PRUNE) and (*THEN) need not. For example, in the case of (*MARK:A)(*PRUNE)
+the name inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B".
+This facility can be used to perform simple simultaneous substitutions, as this
+<b>pcre2test</b> example shows:
 <pre>
-  /(*:pear)apple|(*:orange)lemon/g,replace=${*MARK}
+  /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
       apple lemon
    2: pear orange
 </pre>
 As well as the usual options for <b>pcre2_match()</b>, a number of additional
-options can be set in the <i>options</i> argument.
+options can be set in the <i>options</i> argument of <b>pcre2_substitute()</b>.
 </P>
 <P>
 PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string,
-replacing every matching substring. If this is not set, only the first matching
-substring is replaced. If any matched substring has zero length, after the
-substitution has happened, an attempt to find a non-empty match at the same
-position is performed. If this is not successful, the current position is
-advanced by one character except when CRLF is a valid newline sequence and the
-next two characters are CR, LF. In this case, the current position is advanced
-by two characters.
+replacing every matching substring. If this option is not set, only the first
+matching substring is replaced. The search for matches takes place in the
+original subject string (that is, previous replacements do not affect it).
+Iteration is implemented by advancing the <i>startoffset</i> value for each
+search, which is always passed the entire subject string. If an offset limit is
+set in the match context, searching stops when that limit is reached.
+</P>
+<P>
+You can restrict the effect of a global substitution to a portion of the
+subject string by setting either or both of <i>startoffset</i> and an offset
+limit. Here is a <b>pcre2test</b> example:
+<pre>
+  /B/g,replace=!,use_offset_limit
+  ABC ABC ABC ABC\=offset=3,offset_limit=12
+   2: ABC A!C A!C ABC
+</pre>
+When continuing with global substitutions after matching a substring with zero
+length, an attempt to find a non-empty match at the same offset is performed.
+If this is not successful, the offset is advanced by one character except when
+CRLF is a valid newline sequence and the next two characters are CR, LF. In
+this case, the offset is advanced by two characters.
 </P>
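The rule for moving on after an empty match that has no non-empty alternative at the same offset can be sketched in C. This is an illustration only, not the code used by pcre2_substitute(): advance_after_empty_match() is a hypothetical helper, and it assumes a non-UTF 8-bit subject, so that one character is one code unit.

```c
#include <stddef.h>

/* Hypothetical helper: return the next offset to search from after an empty
   match at offset. When CRLF is a valid newline sequence and the next two
   characters are CR, LF, step over both; otherwise step over one character.
   Assumption: single-byte characters and offset < length. */
size_t advance_after_empty_match(const char *subject, size_t length,
                                 size_t offset, int crlf_is_newline)
{
  if (crlf_is_newline && offset + 1 < length &&
      subject[offset] == '\r' && subject[offset + 1] == '\n')
    return offset + 2;
  return offset + 1;
}
```

In UTF mode a character may occupy several code units, so a real implementation would have to step over the whole character rather than add one.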
 <P>
 PCRE2_SUBSTITUTE_OVERFLOW_LENGTH changes what happens when the output buffer is
@@ -2949,10 +3307,10 @@ default.
 <P>
 PCRE2_ERROR_BADREPLACEMENT is used for miscellaneous syntax errors in the
 replacement string, with more particular errors being PCRE2_ERROR_BADREPESCAPE
-(invalid escape sequence), PCRE2_ERROR_REPMISSING_BRACE (closing curly bracket
-not found), PCRE2_BADSUBSTITUTION (syntax error in extended group
-substitution), and PCRE2_BADSUBPATTERN (the pattern match ended before it
-started, which can happen if \K is used in an assertion).
+(invalid escape sequence), PCRE2_ERROR_REPMISSINGBRACE (closing curly bracket
+not found), PCRE2_ERROR_BADSUBSTITUTION (syntax error in extended group
+substitution), and PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before
+it started, which can happen if \K is used in an assertion).
 </P>
 <P>
 As for all PCRE2 errors, a text message that describes the error can be
@@ -2960,7 +3318,7 @@ obtained by calling the <b>pcre2_get_error_message()</b> function (see
 "Obtaining a textual error message"
 <a href="#geterrormessage">above).</a>
 </P>
-<br><a name="SEC36" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
+<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
 <P>
 <b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
 <b>  PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
@@ -3005,7 +3363,7 @@ in the section entitled <i>Information about a pattern</i>. Given all the
 relevant entries for the name, you can extract each of their numbers, and hence
 the captured data.
 </P>
-<br><a name="SEC37" href="#TOC1">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a><br>
+<br><a name="SEC38" href="#TOC1">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a><br>
 <P>
 The traditional matching function uses a similar algorithm to Perl, which stops
 when it finds the first match at a given point in the subject. If you want to
@@ -3023,7 +3381,7 @@ substring. Then return 1, which forces <b>pcre2_match()</b> to backtrack and try
 other alternatives. Ultimately, when it runs out of matches,
 <b>pcre2_match()</b> will yield PCRE2_ERROR_NOMATCH.
 <a name="dfamatch"></a></P>
-<br><a name="SEC38" href="#TOC1">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a><br>
+<br><a name="SEC39" href="#TOC1">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a><br>
 <P>
 <b>int pcre2_dfa_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
 <b>  PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
@@ -3034,11 +3392,12 @@ other alternatives. Ultimately, when it runs out of matches,
 <P>
 The function <b>pcre2_dfa_match()</b> is called to match a subject string
 against a compiled pattern, using a matching algorithm that scans the subject
-string just once, and does not backtrack. This has different characteristics to
-the normal algorithm, and is not compatible with Perl. Some of the features of
-PCRE2 patterns are not supported. Nevertheless, there are times when this kind
-of matching can be useful. For a discussion of the two matching algorithms, and
-a list of features that <b>pcre2_dfa_match()</b> does not support, see the
+string just once (not counting lookaround assertions), and does not backtrack.
+This has different characteristics to the normal algorithm, and is not
+compatible with Perl. Some of the features of PCRE2 patterns are not supported.
+Nevertheless, there are times when this kind of matching can be useful. For a
+discussion of the two matching algorithms, and a list of features that
+<b>pcre2_dfa_match()</b> does not support, see the
 <a href="pcre2matching.html"><b>pcre2matching</b></a>
 documentation.
 </P>
@@ -3066,7 +3425,7 @@ Here is an example of a simple call to <b>pcre2_dfa_match()</b>:
     11,             /* the length of the subject string */
     0,              /* start at offset 0 in the subject */
     0,              /* default options */
-    match_data,     /* the match data block */
+    md,             /* the match data block */
     NULL,           /* a match context; NULL means use defaults */
     wspace,         /* working space vector */
     20);            /* number of elements (NOT size in bytes) */
@@ -3077,11 +3436,11 @@ Option bits for <b>pcre_dfa_match()</b>
 </b><br>
 <P>
 The unused bits of the <i>options</i> argument for <b>pcre2_dfa_match()</b> must
-be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
-PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
-PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST, and
-PCRE2_DFA_RESTART. All but the last four of these are exactly the same as for
-<b>pcre2_match()</b>, so their description is not repeated here.
+be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_ENDANCHORED,
+PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
+PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST,
+and PCRE2_DFA_RESTART. All but the last four of these are exactly the same as
+for <b>pcre2_match()</b>, so their description is not repeated here.
 <pre>
   PCRE2_PARTIAL_HARD
   PCRE2_PARTIAL_SOFT
@@ -3174,7 +3533,7 @@ NOTE: PCRE2's "auto-possessification" optimization usually applies to character
 repeats at the end of a pattern (as well as internally). For example, the
 pattern "a\d+" is compiled as if it were "a\d++". For DFA matching, this
 means that only one possible match is found. If you really do want multiple
-matches in such cases, either use an ungreedy repeat auch as "a\d+?" or set
+matches in such cases, either use an ungreedy repeat such as "a\d+?" or set
 the PCRE2_NO_AUTO_POSSESS option when compiling.
 </P>
 <br><b>
@@ -3218,13 +3577,13 @@ some plausibility checks are made on the contents of the workspace, which
 should contain data about the previous partial match. If any of these checks
 fail, this error is given.
 </P>
-<br><a name="SEC39" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC40" href="#TOC1">SEE ALSO</a><br>
 <P>
 <b>pcre2build</b>(3), <b>pcre2callout</b>(3), <b>pcre2demo(3)</b>,
 <b>pcre2matching</b>(3), <b>pcre2partial</b>(3), <b>pcre2posix</b>(3),
-<b>pcre2sample</b>(3), <b>pcre2stack</b>(3), <b>pcre2unicode</b>(3).
+<b>pcre2sample</b>(3), <b>pcre2unicode</b>(3).
 </P>
-<br><a name="SEC40" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC41" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -3233,11 +3592,11 @@ University Computing Service
 Cambridge, England.
 <br>
 </P>
-<br><a name="SEC41" href="#TOC1">REVISION</a><br>
+<br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 June 2016
+Last updated: 31 December 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2build.html b/doc/html/pcre2build.html
index 6c8e1de..823e605 100644
--- a/doc/html/pcre2build.html
+++ b/doc/html/pcre2build.html
@@ -23,20 +23,21 @@ please consult the man page, in case the conversion went wrong.
 <li><a name="TOC8" href="#SEC8">NEWLINE RECOGNITION</a>
 <li><a name="TOC9" href="#SEC9">WHAT \R MATCHES</a>
 <li><a name="TOC10" href="#SEC10">HANDLING VERY LARGE PATTERNS</a>
-<li><a name="TOC11" href="#SEC11">AVOIDING EXCESSIVE STACK USAGE</a>
-<li><a name="TOC12" href="#SEC12">LIMITING PCRE2 RESOURCE USAGE</a>
-<li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a>
-<li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a>
-<li><a name="TOC15" href="#SEC15">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
-<li><a name="TOC16" href="#SEC16">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
-<li><a name="TOC17" href="#SEC17">PCRE2GREP BUFFER SIZE</a>
-<li><a name="TOC18" href="#SEC18">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
-<li><a name="TOC19" href="#SEC19">INCLUDING DEBUGGING CODE</a>
-<li><a name="TOC20" href="#SEC20">DEBUGGING WITH VALGRIND SUPPORT</a>
-<li><a name="TOC21" href="#SEC21">CODE COVERAGE REPORTING</a>
-<li><a name="TOC22" href="#SEC22">SEE ALSO</a>
-<li><a name="TOC23" href="#SEC23">AUTHOR</a>
-<li><a name="TOC24" href="#SEC24">REVISION</a>
+<li><a name="TOC11" href="#SEC11">LIMITING PCRE2 RESOURCE USAGE</a>
+<li><a name="TOC12" href="#SEC12">CREATING CHARACTER TABLES AT BUILD TIME</a>
+<li><a name="TOC13" href="#SEC13">USING EBCDIC CODE</a>
+<li><a name="TOC14" href="#SEC14">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
+<li><a name="TOC15" href="#SEC15">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
+<li><a name="TOC16" href="#SEC16">PCRE2GREP BUFFER SIZE</a>
+<li><a name="TOC17" href="#SEC17">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
+<li><a name="TOC18" href="#SEC18">INCLUDING DEBUGGING CODE</a>
+<li><a name="TOC19" href="#SEC19">DEBUGGING WITH VALGRIND SUPPORT</a>
+<li><a name="TOC20" href="#SEC20">CODE COVERAGE REPORTING</a>
+<li><a name="TOC21" href="#SEC21">SUPPORT FOR FUZZERS</a>
+<li><a name="TOC22" href="#SEC22">OBSOLETE OPTION</a>
+<li><a name="TOC23" href="#SEC23">SEE ALSO</a>
+<li><a name="TOC24" href="#SEC24">AUTHOR</a>
+<li><a name="TOC25" href="#SEC25">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br>
 <P>
@@ -77,19 +78,19 @@ running
 <pre>
   ./configure --help
 </pre>
-The following sections include descriptions of options whose names begin with
---enable or --disable. These settings specify changes to the defaults for the
-<b>configure</b> command. Because of the way that <b>configure</b> works,
---enable and --disable always come in pairs, so the complementary option always
-exists as well, but as it specifies the default, it is not described.
+The following sections include descriptions of "on/off" options whose names
+begin with --enable or --disable. Because of the way that <b>configure</b>
+works, --enable and --disable always come in pairs, so the complementary option
+always exists as well, but as it specifies the default, it is not described.
+Options that specify values have names that start with --with.
 </P>
 <br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
 <P>
 By default, a library called <b>libpcre2-8</b> is built, containing functions
-that take string arguments contained in vectors of bytes, interpreted either as
+that take string arguments contained in arrays of bytes, interpreted either as
 single-byte characters, or UTF-8 strings. You can also build two other
 libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
-strings that are contained in vectors of 16-bit and 32-bit code units,
+strings that are contained in arrays of 16-bit and 32-bit code units,
 respectively. These can be interpreted either as single-unit characters or
 UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
 the following to the <b>configure</b> command:
@@ -137,10 +138,10 @@ locked this out by setting PCRE2_NEVER_UTF.
 </P>
 <P>
 UTF support allows the libraries to process character code points up to
-0x10ffff in the strings that they handle. It also provides support for
-accessing the Unicode properties of such characters, using pattern escapes such
-as \P, \p, and \X. Only the general category properties such as <i>Lu</i> and
-<i>Nd</i> are supported. Details are given in the
+0x10ffff in the strings that they handle. Unicode support also gives access to
+the Unicode properties of characters, using pattern escapes such as \P, \p,
+and \X. Only the general category properties such as <i>Lu</i> and <i>Nd</i> are
+supported. Details are given in the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
 documentation.
 </P>
@@ -164,13 +165,18 @@ out by setting the PCRE2_NEVER_BACKSLASH_C option when calling
 </P>
 <br><a name="SEC7" href="#TOC1">JUST-IN-TIME COMPILER SUPPORT</a><br>
 <P>
-Just-in-time compiler support is included in the build by specifying
+Just-in-time (JIT) compiler support is included in the build by specifying
 <pre>
   --enable-jit
 </pre>
 This support is available only for certain hardware architectures. If this
-option is set for an unsupported architecture, a building error occurs.
-See the
+option is set for an unsupported architecture, a building error occurs. If you
+are running under SELinux you may also want to add
+<pre>
+  --enable-jit-sealloc
+</pre>
+which enables the use of an execmem allocator in JIT that is compatible with
+SELinux. This has no effect if JIT is not enabled. See the
 <a href="pcre2jit.html"><b>pcre2jit</b></a>
 documentation for a discussion of JIT usage. When JIT support is enabled,
 pcre2grep automatically makes use of it, unless you add
@@ -202,19 +208,23 @@ to the <b>configure</b> command. There is a fourth option, specified by
   --enable-newline-is-anycrlf
 </pre>
 which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
-indicating a line ending. Finally, a fifth option, specified by
+indicating a line ending. A fifth option, specified by
 <pre>
   --enable-newline-is-any
 </pre>
 causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
 sequences are the three just mentioned, plus the single characters VT (vertical
 tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
-separator, U+2028), and PS (paragraph separator, U+2029).
+separator, U+2028), and PS (paragraph separator, U+2029). The final option is
+<pre>
+  --enable-newline-is-nul
+</pre>
+which causes NUL (binary zero) to be set as the default line-ending character.
 </P>
 <P>
 Whatever default line ending convention is selected when PCRE2 is built can be
 overridden by applications that use the library. At build time it is
-conventional to use the standard for your operating system.
+recommended to use the standard for your operating system.
 </P>
 <br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
 <P>
@@ -226,7 +236,7 @@ specify
 </pre>
 the default is changed so that \R matches only CR, LF, or CRLF. Whatever is
 selected when PCRE2 is built can be overridden by applications that use the
-called.
+library.
 </P>
 <br><a name="SEC10" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>
 <P>
@@ -247,58 +257,62 @@ longer offsets slows down the operation of PCRE2 because it has to load
 additional data when handling them. For the 32-bit library the value is always
 4 and cannot be overridden; the value of --with-link-size is ignored.
 </P>
-<br><a name="SEC11" href="#TOC1">AVOIDING EXCESSIVE STACK USAGE</a><br>
+<br><a name="SEC11" href="#TOC1">LIMITING PCRE2 RESOURCE USAGE</a><br>
 <P>
-When matching with the <b>pcre2_match()</b> function, PCRE2 implements
-backtracking by making recursive calls to an internal function called
-<b>match()</b>. In environments where the size of the stack is limited, this can
-severely limit PCRE2's operation. (The Unix environment does not usually suffer
-from this problem, but it may sometimes be necessary to increase the maximum
-stack size. There is a discussion in the
-<a href="pcre2stack.html"><b>pcre2stack</b></a>
-documentation.) An alternative approach to recursion that uses memory from the
-heap to remember data, instead of using recursive function calls, has been
-implemented to work round the problem of limited stack size. If you want to
-build a version of PCRE2 that works this way, add
+The <b>pcre2_match()</b> function increments a counter each time it goes round
+its main loop. Putting a limit on this counter controls the amount of computing
+resource used by a single call to <b>pcre2_match()</b>. The limit can be changed
+at run time, as described in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+documentation. The default is 10 million, but this can be changed by adding a
+setting such as
 <pre>
-  --disable-stack-for-recursion
+  --with-match-limit=500000
 </pre>
-to the <b>configure</b> command. By default, the system functions <b>malloc()</b>
-and <b>free()</b> are called to manage the heap memory that is required, but
-custom memory management functions can be called instead. PCRE2 runs noticeably
-more slowly when built in this way. This option affects only the
-<b>pcre2_match()</b> function; it is not relevant for <b>pcre2_dfa_match()</b>.
+to the <b>configure</b> command. This setting also applies to the
+<b>pcre2_dfa_match()</b> matching function, and to JIT matching (though the
+counting is done differently).
 </P>
-<br><a name="SEC12" href="#TOC1">LIMITING PCRE2 RESOURCE USAGE</a><br>
 <P>
-Internally, PCRE2 has a function called <b>match()</b>, which it calls
-repeatedly (sometimes recursively) when matching a pattern with the
-<b>pcre2_match()</b> function. By controlling the maximum number of times this
-function may be called during a single matching operation, a limit can be
-placed on the resources used by a single call to <b>pcre2_match()</b>. The limit
-can be changed at run time, as described in the
+The <b>pcre2_match()</b> function starts out using a 20K vector on the system
+stack to record backtracking points. The more nested backtracking points there
+are (that is, the deeper the search tree), the more memory is needed. If the
+initial vector is not large enough, heap memory is used, up to a certain limit,
+which is specified in kilobytes. The limit can be changed at run time, as
+described in the
 <a href="pcre2api.html"><b>pcre2api</b></a>
-documentation. The default is 10 million, but this can be changed by adding a
-setting such as
+documentation. The default limit (in effect unlimited) is 20 million. You can
+change this by a setting such as
 <pre>
-  --with-match-limit=500000
+  --with-heap-limit=500
 </pre>
-to the <b>configure</b> command. This setting has no effect on the
-<b>pcre2_dfa_match()</b> matching function.
+which limits the amount of heap to 500 kilobytes. This limit applies only to
+interpretive matching in <b>pcre2_match()</b>. It does not apply when JIT (which has
+its own memory arrangements) is used, nor does it apply to
+<b>pcre2_dfa_match()</b>.
 </P>
 <P>
-In some environments it is desirable to limit the depth of recursive calls of
-<b>match()</b> more strictly than the total number of calls, in order to
-restrict the maximum amount of stack (or heap, if --disable-stack-for-recursion
-is specified) that is used. A second limit controls this; it defaults to the
-value that is set for --with-match-limit, which imposes no additional
-constraints. However, you can set a lower limit by adding, for example,
+You can also explicitly limit the depth of nested backtracking in the
+<b>pcre2_match()</b> interpreter. This limit defaults to the value that is set
+for --with-match-limit. You can set a lower default limit by adding, for
+example,
 <pre>
-  --with-match-limit-recursion=10000
+  --with-match-limit-depth=10000
 </pre>
-to the <b>configure</b> command. This value can also be overridden at run time.
+to the <b>configure</b> command. This value can be overridden at run time. This
+depth limit indirectly limits the amount of heap memory that is used, but
+because the size of each backtracking "frame" depends on the number of
+capturing parentheses in a pattern, the amount of heap that is used before the
+limit is reached varies from pattern to pattern. This limit was more useful in
+versions before 10.30, where function recursion was used for backtracking.
 </P>
-<br><a name="SEC13" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
+<P>
+As well as applying to <b>pcre2_match()</b>, the depth limit also controls
+the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
+used for lookaround assertions, atomic groups, and recursion within patterns.
+The limit does not apply to JIT matching.
+</P>
+<br><a name="SEC12" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
 <P>
 PCRE2 uses fixed tables for processing characters whose code points are less
 than 256. By default, PCRE2 is built with a set of tables that are distributed
@@ -310,12 +324,12 @@ only. If you add
 to the <b>configure</b> command, the distributed tables are no longer used.
 Instead, a program called <b>dftables</b> is compiled and run. This outputs the
 source for new set of tables, created in the default locale of your C run-time
-system. (This method of replacing the tables does not work if you are cross
+system. This method of replacing the tables does not work if you are cross
 compiling, because <b>dftables</b> is run on the local host. If you need to
 create alternative tables when cross compiling, you will have to do so "by
-hand".)
+hand".
 </P>
-<br><a name="SEC14" href="#TOC1">USING EBCDIC CODE</a><br>
+<br><a name="SEC13" href="#TOC1">USING EBCDIC CODE</a><br>
 <P>
 PCRE2 assumes by default that it will run in an environment where the character
 code is ASCII or Unicode, which is a superset of ASCII. This is the case for
@@ -350,7 +364,7 @@ The options that select newline behaviour, such as --enable-newline-is-cr,
 and equivalent run-time options, refer to these character values in an EBCDIC
 environment.
 </P>
-<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
+<br><a name="SEC14" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
 <P>
 By default, on non-Windows systems, <b>pcre2grep</b> supports the use of
 callouts with string arguments within the patterns it is matching, in order to
@@ -359,7 +373,7 @@ run external scripts. For details, see the
 documentation. This support can be disabled by adding
 --disable-pcre2grep-callout to the <b>configure</b> command.
 </P>
-<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
+<br><a name="SEC15" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
 <P>
 By default, <b>pcre2grep</b> reads all files as plain text. You can build it so
 that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
@@ -372,22 +386,25 @@ to the <b>configure</b> command. These options naturally require that the
 relevant libraries are installed on your system. Configuration will fail if
 they are not.
 </P>
-<br><a name="SEC17" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
+<br><a name="SEC16" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
 <P>
 <b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is
 scanning, in order to be able to output "before" and "after" lines when it
-finds a match. The size of the buffer is controlled by a parameter whose
-default value is 20K. The buffer itself is three times this size, but because
-of the way it is used for holding "before" lines, the longest line that is
-guaranteed to be processable is the parameter size. You can change the default
-parameter value by adding, for example,
+finds a match. The starting size of the buffer is controlled by a parameter
+whose default value is 20K. The buffer itself is three times this size, but
+because of the way it is used for holding "before" lines, the longest line that
+is guaranteed to be processable is the parameter size. If a longer line is
+encountered, <b>pcre2grep</b> automatically expands the buffer, up to a
+specified maximum size, whose default is 1M or the starting size, whichever is
+the larger. You can change the default parameter values by adding, for example,
 <pre>
-  --with-pcre2grep-bufsize=50K
+  --with-pcre2grep-bufsize=51200
+  --with-pcre2grep-max-bufsize=2097152
 </pre>
-to the <b>configure</b> command. The caller of \fPpcre2grep\fP can override this
-value by using --buffer-size on the command line.
+to the <b>configure</b> command. The caller of <b>pcre2grep</b> can override
+these values by using --buffer-size and --max-buffer-size on the command line.
 </P>
-<br><a name="SEC18" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
+<br><a name="SEC17" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
 <P>
 If you add one of
 <pre>
@@ -421,7 +438,7 @@ automatically included, you may need to add something like
 </pre>
 immediately before the <b>configure</b> command.
 </P>
-<br><a name="SEC19" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
+<br><a name="SEC18" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
 <P>
 If you add
 <pre>
@@ -430,7 +447,7 @@ If you add
 to the <b>configure</b> command, additional debugging code is included in the
 build. This feature is intended for use by the PCRE2 maintainers.
 </P>
-<br><a name="SEC20" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
+<br><a name="SEC19" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
 <P>
 If you add
 <pre>
@@ -440,7 +457,7 @@ to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark
 certain memory regions as unaddressable. This allows it to detect invalid
 memory accesses, and is mostly useful for debugging PCRE2 itself.
 </P>
-<br><a name="SEC21" href="#TOC1">CODE COVERAGE REPORTING</a><br>
+<br><a name="SEC20" href="#TOC1">CODE COVERAGE REPORTING</a><br>
 <P>
 If your C compiler is gcc, you can build a version of PCRE2 that can generate a
 code coverage report for its test suite. To enable this, you must install
@@ -497,11 +514,47 @@ This cleans all coverage data including the generated coverage report. For more
 information about code coverage, see the <b>gcov</b> and <b>lcov</b>
 documentation.
 </P>
-<br><a name="SEC22" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC21" href="#TOC1">SUPPORT FOR FUZZERS</a><br>
+<P>
+There is a special option for use by people who want to run fuzzing tests on
+PCRE2:
+<pre>
+  --enable-fuzz-support
+</pre>
+At present this applies only to the 8-bit library. If set, it causes an extra
+library called libpcre2-fuzzsupport.a to be built, but not installed. This
+contains a single function called LLVMFuzzerTestOneInput() whose arguments are
+a pointer to a string and the length of the string. When called, this function
+tries to compile the string as a pattern, and if that succeeds, to match it.
+This is done both with no options and with some random options bits that are
+generated from the string.
+</P>
+<P>
+Setting --enable-fuzz-support also causes a binary called <b>pcre2fuzzcheck</b>
+to be created. This is normally run under valgrind or used when PCRE2 is
+compiled with address sanitizing enabled. It calls the fuzzing function and
+outputs information about what it is doing. The input strings are specified by
+arguments: if an argument starts with "=" the rest of it is a literal input
+string. Otherwise, it is assumed to be a file name, and the contents of the
+file are the test string.
+</P>
+<br><a name="SEC22" href="#TOC1">OBSOLETE OPTION</a><br>
+<P>
+In versions of PCRE2 prior to 10.30, there were two ways of handling
+backtracking in the <b>pcre2_match()</b> function. The default was to use the
+system stack, but if
+<pre>
+  --disable-stack-for-recursion
+</pre>
+was set, memory on the heap was used. From release 10.30 onwards this has
+changed (the stack is no longer used) and this option now does nothing except
+give a warning.
+</P>
+<br><a name="SEC23" href="#TOC1">SEE ALSO</a><br>
 <P>
 <b>pcre2api</b>(3), <b>pcre2-config</b>(3).
 </P>
-<br><a name="SEC23" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC24" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -510,11 +563,11 @@ University Computing Service
 Cambridge, England.
 <br>
 </P>
-<br><a name="SEC24" href="#TOC1">REVISION</a><br>
+<br><a name="SEC25" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 01 April 2016
+Last updated: 18 July 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
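
Note on the pcre2build.html changes above: the build-time limits and buffer sizes described in the updated text can be combined in one configure invocation. The following is only a sketch that reuses the option names and sample values quoted in the patched documentation; the particular combination is an assumption chosen for illustration, not a recommended configuration.

```shell
# Sketch: options and values are the examples quoted in the documentation
# above; pick values that suit your own environment.
./configure \
  --enable-jit \
  --with-match-limit=500000 \
  --with-heap-limit=500 \
  --with-pcre2grep-bufsize=51200 \
  --with-pcre2grep-max-bufsize=2097152
```

As the documentation states, these only set defaults: applications can still override the match and heap limits at run time, and pcre2grep users can override the buffer sizes on the command line.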
diff --git a/doc/html/pcre2callout.html b/doc/html/pcre2callout.html
index 7e85c9a..2adf21a 100644
--- a/doc/html/pcre2callout.html
+++ b/doc/html/pcre2callout.html
@@ -57,16 +57,23 @@ two callout points:
 </pre>
 If the PCRE2_AUTO_CALLOUT option bit is set when a pattern is compiled, PCRE2
 automatically inserts callouts, all with number 255, before each item in the
-pattern. For example, if PCRE2_AUTO_CALLOUT is used with the pattern
+pattern except for immediately before or after an explicit callout. For
+example, if PCRE2_AUTO_CALLOUT is used with the pattern
 <pre>
-  A(\d{2}|--)
+  A(?C3)B
 </pre>
 it is processed as if it were
-<br>
-<br>
-(?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
-<br>
-<br>
+<pre>
+  (?C255)A(?C3)B(?C255)
+</pre>
+Here is a more complicated example:
+<pre>
+  A(\d{2}|--)
+</pre>
+With PCRE2_AUTO_CALLOUT, this pattern is processed as if it were
+<pre>
+  (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
+</pre>
 Notice that there is a callout before and after each parenthesis and
 alternation bar. If the pattern contains a conditional group whose condition is
 an assertion, an automatic callout is inserted immediately before the
@@ -107,10 +114,10 @@ with PCRE2_ANCHORED and PCRE2_AUTO_CALLOUT and then applied to the string
   No match
 </pre>
 This indicates that when matching [bc] fails, there is no backtracking into a+
-and therefore the callouts that would be taken for the backtracks do not occur.
-You can disable the auto-possessify feature by passing PCRE2_NO_AUTO_POSSESS to
-<b>pcre2_compile()</b>, or starting the pattern with (*NO_AUTO_POSSESS). In this
-case, the output changes to this:
+(because it is being treated as a++) and therefore the callouts that would be
+taken for the backtracks do not occur. You can disable the auto-possessify
+feature by passing PCRE2_NO_AUTO_POSSESS to <b>pcre2_compile()</b>, or starting
+the pattern with (*NO_AUTO_POSSESS). In this case, the output changes to this:
 <pre>
   ---&#62;aaaa
    +0 ^        a+
@@ -131,10 +138,14 @@ By default, an optimization is applied when .* is the first significant item in
 a pattern. If PCRE2_DOTALL is set, so that the dot can match any character, the
 pattern is automatically anchored. If PCRE2_DOTALL is not set, a match can
 start only after an internal newline or at the beginning of the subject, and
-<b>pcre2_compile()</b> remembers this. This optimization is disabled, however,
-if .* is in an atomic group or if there is a back reference to the capturing
-group in which it appears. It is also disabled if the pattern contains (*PRUNE)
-or (*SKIP). However, the presence of callouts does not affect it.
+<b>pcre2_compile()</b> remembers this. If a pattern has more than one top-level
+branch, automatic anchoring occurs if all branches are anchorable.
+</P>
+<P>
+This optimization is disabled, however, if .* is in an atomic group or if there
+is a back reference to the capturing group in which it appears. It is also
+disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
+callouts does not affect it.
 </P>
 <P>
 For example, if the pattern .*\d is compiled with PCRE2_AUTO_CALLOUT and
@@ -166,10 +177,6 @@ This shows more match attempts, starting at the second subject character.
 Another optimization, described in the next section, means that there is no
 subsequent attempt to match with an empty subject.
 </P>
-<P>
-If a pattern has more than one top-level branch, automatic anchoring occurs if
-all branches are anchorable.
-</P>
 <br><b>
 Other optimizations
 </b><br>
@@ -185,9 +192,10 @@ start, and the callout is never reached. However, with "abyd", though the
 result is still no match, the callout is obeyed.
 </P>
 <P>
-PCRE2 also knows the minimum length of a matching string, and will immediately
-give a "no match" return without actually running a match if the subject is not
-long enough, or, for unanchored patterns, if it has been scanned far enough.
+For most patterns PCRE2 also knows the minimum length of a matching string, and
+will immediately give a "no match" return without actually running a match if
+the subject is not long enough, or, for unanchored patterns, if it has been
+scanned far enough.
 </P>
 <P>
 You can disable these optimizations by passing the PCRE2_NO_START_OPTIMIZE
@@ -198,18 +206,20 @@ callouts such as the example above are obeyed.
 <br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br>
 <P>
 During matching, when PCRE2 reaches a callout point, if an external function is
-set in the match context, it is called. This applies to both normal and DFA
-matching. The first argument to the callout function is a pointer to a
-<b>pcre2_callout</b> block. The second argument is the void * callout data that
-was supplied when the callout was set up by calling <b>pcre2_set_callout()</b>
-(see the
+provided in the match context, it is called. This applies to normal,
+DFA, and JIT matching. The first argument to the callout function is a pointer
+to a <b>pcre2_callout</b> block. The second argument is the void * callout data
+that was supplied when the callout was set up by calling
+<b>pcre2_set_callout()</b> (see the
 <a href="pcre2api.html"><b>pcre2api</b></a>
-documentation). The callout block structure contains the following fields:
+documentation). The callout block structure contains the following fields, not
+necessarily in this order:
 <pre>
   uint32_t      <i>version</i>;
   uint32_t      <i>callout_number</i>;
   uint32_t      <i>capture_top</i>;
   uint32_t      <i>capture_last</i>;
+  uint32_t      <i>callout_flags</i>;
   PCRE2_SIZE   *<i>offset_vector</i>;
   PCRE2_SPTR    <i>mark</i>;
   PCRE2_SPTR    <i>subject</i>;
@@ -223,11 +233,12 @@ documentation). The callout block structure contains the following fields:
   PCRE2_SPTR    <i>callout_string</i>;
 </pre>
 The <i>version</i> field contains the version number of the block format. The
-current version is 1; the three callout string fields were added for this
-version. If you are writing an application that might use an earlier release of
-PCRE2, you should check the version number before accessing any of these
-fields. The version number will increase in future if more fields are added,
-but the intention is never to remove any of the existing fields.
+current version is 2; the three callout string fields were added for version 1,
+and the <i>callout_flags</i> field for version 2. If you are writing an
+application that might use an earlier release of PCRE2, you should check the
+version number before accessing any of these fields. The version number will
+increase in future if more fields are added, but the intention is never to
+remove any of the existing fields.
 </P>
 <br><b>
 Fields for numerical callouts
@@ -235,8 +246,8 @@ Fields for numerical callouts
 <P>
 For a numerical callout, <i>callout_string</i> is NULL, and <i>callout_number</i>
 contains the number of the callout, in the range 0-255. This is the number
-that follows (?C for manual callouts; it is 255 for automatically generated
-callouts.
+that follows (?C for callouts that are part of the pattern; it is 255 for
+automatically generated callouts.
 </P>
 <br><b>
 Fields for string callouts
@@ -267,12 +278,42 @@ The remaining fields in the callout block are the same for both kinds of
 callout.
 </P>
 <P>
-The <i>offset_vector</i> field is a pointer to the vector of capturing offsets
-(the "ovector") that was passed to the matching function in the match data
-block. When <b>pcre2_match()</b> is used, the contents can be inspected in
+The <i>offset_vector</i> field is a pointer to a vector of capturing offsets
+(the "ovector"). You may read the elements in this vector, but you must not
+change any of them.
+</P>
+<P>
+For calls to <b>pcre2_match()</b>, the <i>offset_vector</i> field is not (since
+release 10.30) a pointer to the actual ovector that was passed to the matching
+function in the match data block. Instead it points to an internal ovector of a
+size large enough to hold all possible captured substrings in the pattern. Note
+that whenever a recursion or subroutine call within a pattern completes, the
+capturing state is reset to what it was before.
+</P>
+<P>
+The <i>capture_last</i> field contains the number of the most recently captured
+substring, and the <i>capture_top</i> field contains one more than the number of
+the highest numbered captured substring so far. If no substrings have yet been
+captured, the value of <i>capture_last</i> is 0 and the value of
+<i>capture_top</i> is 1. The values of these fields do not always differ by one;
+for example, when the callout in the pattern ((a)(b))(?C2) is taken,
+<i>capture_last</i> is 1 but <i>capture_top</i> is 4.
+</P>
+<P>
+The contents of ovector[2] to ovector[&#60;capture_top&#62;*2-1] can be inspected in
 order to extract substrings that have been matched so far, in the same way as
-for extracting substrings after a match has completed. For the DFA matching
-function, this field is not useful.
+extracting substrings after a match has completed. The values in ovector[0] and
+ovector[1] are always PCRE2_UNSET because the match is by definition not
+complete. Substrings that have not been captured but whose numbers are less
+than <i>capture_top</i> also have both of their ovector slots set to
+PCRE2_UNSET.
+</P>
+<P>
+For DFA matching, the <i>offset_vector</i> field points to the ovector that was
+passed to the matching function in the match data block, but it holds no useful
+information at callout time because <b>pcre2_dfa_match()</b> does not support
+substring capturing. The value of <i>capture_top</i> is always 1 and the value
+of <i>capture_last</i> is always 0 for DFA matching.
 </P>
 <P>
 The <i>subject</i> and <i>subject_length</i> fields contain copies of the values
@@ -291,29 +332,20 @@ The <i>current_position</i> field contains the offset within the subject of the
 current match pointer.
 </P>
 <P>
-When the <b>pcre2_match()</b> is used, the <i>capture_top</i> field contains one
-more than the number of the highest numbered captured substring so far. If no
-substrings have been captured, the value of <i>capture_top</i> is one. This is
-always the case when the DFA functions are used, because they do not support
-captured substrings.
-</P>
-<P>
-The <i>capture_last</i> field contains the number of the most recently captured
-substring. However, when a recursion exits, the value reverts to what it was
-outside the recursion, as do the values of all captured substrings. If no
-substrings have been captured, the value of <i>capture_last</i> is 0. This is
-always the case for the DFA matching functions.
-</P>
-<P>
 The <i>pattern_position</i> field contains the offset in the pattern string to
 the next item to be matched.
 </P>
 <P>
 The <i>next_item_length</i> field contains the length of the next item to be
-matched in the pattern string. When the callout immediately precedes an
-alternation bar, a closing parenthesis, or the end of the pattern, the length
-is zero. When the callout precedes an opening parenthesis, the length is that
-of the entire subpattern.
+processed in the pattern string. When the callout is at the end of the pattern,
+the length is zero. When the callout precedes an opening parenthesis, the
+length includes meta characters that follow the parenthesis. For example, in a
+callout before an assertion such as (?=ab) the length is 3. For an
+alternation bar or a closing parenthesis, the length is one, unless a closing
+parenthesis is followed by a quantifier, in which case its length is included.
+(This changed in release 10.23. In earlier releases, before an opening
+parenthesis the length was that of the entire subpattern, and before an
+alternation bar or a closing parenthesis the length was zero.)
 </P>
 <P>
 The <i>pattern_position</i> and <i>next_item_length</i> fields are intended to
@@ -329,6 +361,36 @@ the zero-terminated name of the most recently passed (*MARK), (*PRUNE), or
 of (*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
 callouts from the DFA matching function this field always contains NULL.
 </P>
+<P>
+The <i>callout_flags</i> field is always zero in callouts from
+<b>pcre2_dfa_match()</b> or when JIT is being used. When <b>pcre2_match()</b>
+without JIT is used, the following bits may be set:
+<pre>
+  PCRE2_CALLOUT_STARTMATCH
+</pre>
+This is set for the first callout after the start of matching for each new
+starting position in the subject.
+<pre>
+  PCRE2_CALLOUT_BACKTRACK
+</pre>
+This is set if there has been a matching backtrack since the previous callout,
+or since the start of matching if this is the first callout from a
+<b>pcre2_match()</b> run.
+</P>
+<P>
+Both bits are set when a backtrack has caused a "bumpalong" to a new starting
+position in the subject. Output from <b>pcre2test</b> does not indicate the
+presence of these bits unless the <b>callout_extra</b> modifier is set.
+</P>
+<P>
+The information in the <b>callout_flags</b> field is provided so that
+applications can track and tell their users how matching with backtracking is
+done. This can be useful when trying to optimize patterns, or just to
+understand how PCRE2 works. There is no support in <b>pcre2_dfa_match()</b>
+because there is no backtracking in DFA matching, and there is no support in
+JIT because JIT is all about maximizing matching performance. In both these
+cases the <b>callout_flags</b> field is always zero.
+</P>
 <br><a name="SEC5" href="#TOC1">RETURN VALUES FROM CALLOUTS</a><br>
 <P>
 The external callout function returns an integer to PCRE2. If the value is
@@ -399,9 +461,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC8" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 23 March 2015
+Last updated: 22 December 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
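
Note on the pcre2callout.html changes above: the rules for reading captured substrings inside a callout (use ovector[2] to ovector[capture_top*2-1], treat PCRE2_UNSET pairs as "not captured", and check the block version before touching newer fields) can be sketched in C. To keep the sketch self-contained it does not include pcre2.h: demo_callout_block, DEMO_UNSET, demo_get_group() and demo_flags() are hypothetical stand-ins invented for this illustration, modelled on the fields described above, and are not part of the PCRE2 API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PCRE2_UNSET, which pcre2.h defines as (~(PCRE2_SIZE)0). */
#define DEMO_UNSET (~(size_t)0)

/* Hypothetical cut-down callout block holding only the fields used here. */
typedef struct {
  uint32_t      version;        /* block format version */
  uint32_t      capture_top;    /* one more than highest captured group */
  uint32_t      capture_last;   /* most recently captured group */
  uint32_t      callout_flags;  /* only meaningful for version >= 2 */
  const size_t *offset_vector;  /* read-only pairs of start/end offsets */
  const char   *subject;        /* subject string being matched */
} demo_callout_block;

/* Copy captured group n into buf as a zero-terminated string.
   Returns the length, or -1 if the group is 0 (the match is not yet
   complete), is not below capture_top, is unset, or does not fit. */
static int demo_get_group(const demo_callout_block *cb, uint32_t n,
                          char *buf, size_t bufsize)
{
  if (n == 0 || n >= cb->capture_top) return -1;
  size_t start = cb->offset_vector[2 * n];
  size_t end = cb->offset_vector[2 * n + 1];
  if (start == DEMO_UNSET || end == DEMO_UNSET || end < start) return -1;
  size_t len = end - start;
  if (len + 1 > bufsize) return -1;
  memcpy(buf, cb->subject + start, len);
  buf[len] = '\0';
  return (int)len;
}

/* Read callout_flags only when the block version says the field exists. */
static uint32_t demo_flags(const demo_callout_block *cb)
{
  return (cb->version >= 2) ? cb->callout_flags : 0;
}
```

For the documentation's example pattern ((a)(b))(?C2) applied to the subject "ab", capture_top is 4 and capture_last is 1 at the callout, so groups 1 to 3 are readable and ovector[0] and ovector[1] are still unset. In a real callout function the same logic would operate on the pcre2_callout_block fields of the same names.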
diff --git a/doc/html/pcre2compat.html b/doc/html/pcre2compat.html
index 3b29e6f..e6d2e7e 100644
--- a/doc/html/pcre2compat.html
+++ b/doc/html/pcre2compat.html
@@ -18,7 +18,8 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
 <P>
 This document describes the differences in the ways that PCRE2 and Perl handle
 regular expressions. The differences described here are with respect to Perl
-versions 5.10 and above.
+version 5.26, but as both Perl and PCRE2 are continually changing, the
+information may sometimes be out of date.
 </P>
 <P>
 1. PCRE2 has only a subset of Perl's Unicode support. Details of what it does
@@ -27,17 +28,18 @@ have are given in the
 page.
 </P>
 <P>
-2. PCRE2 allows repeat quantifiers only on parenthesized assertions, but they
-do not mean what you might think. For example, (?!a){3} does not assert that
-the next three characters are not "a". It just asserts that the next character
-is not "a" three times (in principle: PCRE2 optimizes this to run the assertion
-just once). Perl allows repeat quantifiers on other assertions such as \b, but
-these do not seem to have any use.
+2. Like Perl, PCRE2 allows repeat quantifiers on parenthesized assertions, but
+they do not mean what you might think. For example, (?!a){3} does not assert
+that the next three characters are not "a". It just asserts that the next
+character is not "a" three times (in principle: PCRE2 optimizes this to run the
+assertion just once). Perl allows some repeat quantifiers on other assertions,
+for example, \b* (but not \b{3}), but these do not seem to have any use.
 </P>
 <P>
-3. Capturing subpatterns that occur inside negative lookahead assertions are
-counted, but their entries in the offsets vector are never set. Perl sometimes
-(but not always) sets its numerical variables from inside negative assertions.
+3. Capturing subpatterns that occur inside negative lookaround assertions are
+counted, but their entries in the offsets vector are set only when a negative
+assertion is a condition that has a matching branch (that is, the condition is
+false).
 </P>
 <P>
 4. The following Perl escape sequences are not supported: \l, \u, \L,
@@ -50,13 +52,13 @@ generated by default. However, if the PCRE2_ALT_BSUX option is set,
 </P>
 <P>
 5. The Perl escape sequences \p, \P, and \X are supported only if PCRE2 is
-built with Unicode support. The properties that can be tested with \p and \P
-are limited to the general category properties such as Lu and Nd, script names
-such as Greek or Han, and the derived properties Any and L&. PCRE2 does support
-the Cs (surrogate) property, which Perl does not; the Perl documentation says
-"Because Perl hides the need for the user to understand the internal
-representation of Unicode characters, there is no need to implement the
-somewhat messy concept of surrogates."
+built with Unicode support (the default). The properties that can be tested
+with \p and \P are limited to the general category properties such as Lu and
+Nd, script names such as Greek or Han, and the derived properties Any and L&.
+PCRE2 does support the Cs (surrogate) property, which Perl does not; the Perl
+documentation says "Because Perl hides the need for the user to understand the
+internal representation of Unicode characters, there is no need to implement
+the somewhat messy concept of surrogates."
 </P>
 <P>
 6. PCRE2 does support the \Q...\E escape for quoting substrings. Characters
@@ -75,23 +77,15 @@ The \Q...\E sequence is recognized both inside and outside character classes.
 </P>
 <P>
 7. Fairly obviously, PCRE2 does not support the (?{code}) and (??{code})
-constructions. However, there is support for recursive patterns. This is not
-available in Perl 5.8, but it is in Perl 5.10. Also, the PCRE2 "callout"
-feature allows an external function to be called during pattern matching. See
-the
+constructions. However, PCRE2 has a "callout" feature, which allows an
+external function to be called during pattern matching. See the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation for details.
 </P>
 <P>
-8. Subroutine calls (whether recursive or not) are treated as atomic groups.
-Atomic recursion is like Python, but unlike Perl. Captured values that are set
-outside a subroutine call can be referenced from inside in PCRE2, but not in
-Perl. There is a discussion that explains these differences in more detail in
-the
-<a href="pcre2pattern.html#recursiondifference">section on recursion differences from Perl</a>
-in the
-<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
-page.
+8. Subroutine calls (whether recursive or not) were treated as atomic groups up
+to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
+into subroutine calls is now supported, as in Perl.
 </P>
 <P>
 9. If any of the backtracking control verbs are used in a subpattern that is
@@ -107,7 +101,7 @@ processed as anchored at the point where they are tested.
 one that is backtracked onto acts. For example, in the pattern
 A(*COMMIT)B(*PRUNE)C a failure in B triggers (*COMMIT), but a failure in C
 triggers (*PRUNE). Perl's behaviour is more complex; in many cases it is the
-same as PCRE2, but there are examples where it differs.
+same as PCRE2, but there are cases where it differs.
 </P>
 <P>
 11. Most backtracking verbs in assertions have their normal actions. They are
@@ -123,7 +117,7 @@ the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
 13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
 names is not as general as Perl's. This is a consequence of the fact that PCRE2
 works internally just with numbers, using an external table to translate
-between numbers and names. In particular, a pattern such as (?|(?&#60;a&#62;A)|(?&#60;b)B),
+between numbers and names. In particular, a pattern such as (?|(?&#60;a&#62;A)|(?&#60;b&#62;B)),
 where the two capturing parentheses have the same number but different names,
 is not supported, and causes an error at compile time. If it were allowed, it
 would not be possible to distinguish which parentheses matched, because both
@@ -131,10 +125,11 @@ names map to capturing subpattern number 1. To avoid this confusing situation,
 an error is given at compile time.
 </P>
 <P>
-14. Perl recognizes comments in some places that PCRE2 does not, for example,
-between the ( and ? at the start of a subpattern. If the /x modifier is set,
-Perl allows white space between ( and ? (though current Perls warn that this is
-deprecated) but PCRE2 never does, even if the PCRE2_EXTENDED option is set.
+14. Perl used to recognize comments in some places that PCRE2 does not, for
+example, between the ( and ? at the start of a subpattern. If the /x modifier
+is set, Perl allowed white space between ( and ? though the latest Perls give
+an error (for a while it was just deprecated). There may still be some cases
+where Perl behaves differently.
 </P>
 <P>
 15. Perl, when in warning mode, gives warnings for character classes such as
@@ -146,14 +141,14 @@ certainly user mistakes.
 16. In PCRE2, the upper/lower case character properties Lu and Ll are not
 affected when case-independent matching is specified. For example, \p{Lu}
 always matches an upper case letter. I think Perl has changed in this respect;
-in the release at the time of writing (5.16), \p{Lu} and \p{Ll} match all
+in the release at the time of writing (5.24), \p{Lu} and \p{Ll} match all
 letters, regardless of case, when case independence is specified.
 </P>
 <P>
 17. PCRE2 provides some extensions to the Perl regular expression facilities.
 Perl 5.10 includes new features that are not in earlier versions of Perl, some
-of which (such as named parentheses) have been in PCRE2 for some time. This
-list is with respect to Perl 5.10:
+of which (such as named parentheses) were in PCRE2 for some time before. This
+list is with respect to Perl 5.26:
 <br>
 <br>
 (a) Although lookbehind assertions in PCRE2 must match fixed length strings,
@@ -161,43 +156,63 @@ each alternative branch of a lookbehind assertion can match a different length
 of string. Perl requires them all to have the same length.
 <br>
 <br>
-(b) If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set, the $
+(b) From PCRE2 10.23, back references to groups of fixed length are supported
+in lookbehinds, provided that there is no possibility of referencing a
+non-unique number or name. Perl does not support backreferences in lookbehinds.
+<br>
+<br>
+(c) If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set, the $
 meta-character matches only at the very end of the string.
 <br>
 <br>
-(c) A backslash followed by a letter with no special meaning is faulted. (Perl
+(d) A backslash followed by a letter with no special meaning is faulted. (Perl
 can be made to issue a warning.)
 <br>
 <br>
-(d) If PCRE2_UNGREEDY is set, the greediness of the repetition quantifiers is
+(e) If PCRE2_UNGREEDY is set, the greediness of the repetition quantifiers is
 inverted, that is, by default they are not greedy, but if followed by a
 question mark they are.
 <br>
 <br>
-(e) PCRE2_ANCHORED can be used at matching time to force a pattern to be tried
+(f) PCRE2_ANCHORED can be used at matching time to force a pattern to be tried
 only at the first matching position in the subject string.
 <br>
 <br>
-(f) The PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, and
-PCRE2_NO_AUTO_CAPTURE options have no Perl equivalents.
+(g) The PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY and PCRE2_NOTEMPTY_ATSTART
+options have no Perl equivalents.
 <br>
 <br>
-(g) The \R escape sequence can be restricted to match only CR, LF, or CRLF
+(h) The \R escape sequence can be restricted to match only CR, LF, or CRLF
 by the PCRE2_BSR_ANYCRLF option.
 <br>
 <br>
-(h) The callout facility is PCRE2-specific.
+(i) The callout facility is PCRE2-specific. Perl supports codeblocks and
+variable interpolation, but not general hooks on every match.
 <br>
 <br>
-(i) The partial matching facility is PCRE2-specific.
+(j) The partial matching facility is PCRE2-specific.
 <br>
 <br>
-(j) The alternative matching function (<b>pcre2_dfa_match()</b> matches in a
+(k) The alternative matching function (<b>pcre2_dfa_match()</b>) matches in a
 different way and is not Perl-compatible.
 <br>
 <br>
-(k) PCRE2 recognizes some special sequences such as (*CR) at the start of
-a pattern that set overall options that cannot be changed within the pattern.
+(l) PCRE2 recognizes some special sequences such as (*CR) or (*NO_JIT) at
+the start of a pattern that set overall options that cannot be changed within
+the pattern.
+</P>
+<P>
+18. The Perl /a modifier restricts \d numbers to pure ASCII, and the /aa
+modifier restricts /i case-insensitive matching to pure ASCII, ignoring Unicode
+rules. This separation cannot be represented with PCRE2_UCP.
+</P>
+<P>
+19. Perl has different limits than PCRE2. See the
+<a href="pcre2limit.html"><b>pcre2limit</b></a>
+documentation for details. In release 5.10, Perl changed from recursion to
+iteration, keeping the intermediate matches on the heap, which is ~10% slower
+but does not run into any stack-overflow limit. PCRE2 made a similar change at
+release 10.30, and also has many build-time and run-time customizable limits.
 </P>
 <br><b>
 AUTHOR
@@ -214,9 +229,9 @@ Cambridge, England.
 REVISION
 </b><br>
 <P>
-Last updated: 15 March 2015
+Last updated: 18 April 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2convert.html b/doc/html/pcre2convert.html
new file mode 100644
index 0000000..8b4d87f
--- /dev/null
+++ b/doc/html/pcre2convert.html
@@ -0,0 +1,190 @@
+<html>
+<head>
+<title>pcre2convert specification</title>
+</head>
+<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
+<h1>pcre2convert man page</h1>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
+<p>
+This page is part of the PCRE2 HTML documentation. It was generated
+automatically from the original man page. If there is any nonsense in it,
+please consult the man page, in case the conversion went wrong.
+<br>
+<ul>
+<li><a name="TOC1" href="#SEC1">EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a>
+<li><a name="TOC2" href="#SEC2">THE CONVERT CONTEXT</a>
+<li><a name="TOC3" href="#SEC3">THE CONVERSION FUNCTION</a>
+<li><a name="TOC4" href="#SEC4">CONVERTING GLOBS</a>
+<li><a name="TOC5" href="#SEC5">CONVERTING POSIX PATTERNS</a>
+<li><a name="TOC6" href="#SEC6">AUTHOR</a>
+<li><a name="TOC7" href="#SEC7">REVISION</a>
+</ul>
+<br><a name="SEC1" href="#TOC1">EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a><br>
+<P>
+This document describes a set of functions that can be used to convert
+"foreign" patterns into PCRE2 regular expressions. This facility is currently
+experimental, and may be changed in future releases. Two kinds of pattern,
+globs and POSIX patterns, are supported.
+</P>
+<br><a name="SEC2" href="#TOC1">THE CONVERT CONTEXT</a><br>
+<P>
+<b>pcre2_convert_context *pcre2_convert_context_create(</b>
+<b>  pcre2_general_context *<i>gcontext</i>);</b>
+<br>
+<br>
+<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
+<b>  pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>escape_char</i>);</b>
+<br>
+<br>
+<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
+<b>  uint32_t <i>separator_char</i>);</b>
+<br>
+<br>
+A convert context is used to hold parameters that affect the way that pattern
+conversion works. As with all PCRE2 contexts, you need to use one only if
+you want to override the defaults. There are the usual create, copy, and free
+functions. If custom memory management functions are set in a general context
+that is passed to <b>pcre2_convert_context_create()</b>, they are used for all
+memory management within the conversion functions.
+</P>
+<P>
+There are only two parameters in the convert context at present. Both apply
+only to glob conversions. The escape character defaults to grave accent under
+Windows, otherwise backslash. It can be set to zero, meaning no escape
+character, or to any punctuation character with a code point less than 256.
+The separator character defaults to backslash under Windows, otherwise forward
+slash. It can be set to forward slash, backslash, or dot.
+</P>
+<P>
+The two setting functions return zero on success, or PCRE2_ERROR_BADDATA if
+their second argument is invalid.
+</P>
+<br><a name="SEC3" href="#TOC1">THE CONVERSION FUNCTION</a><br>
+<P>
+<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
+<b>  uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
+<b>  PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
+<br>
+<br>
+<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
+<br>
+<br>
+The first two arguments of <b>pcre2_pattern_convert()</b> define the foreign
+pattern that is to be converted. The length may be given as
+PCRE2_ZERO_TERMINATED. The <b>options</b> argument defines how the pattern is to
+be processed. If the input is UTF, the PCRE2_CONVERT_UTF option should be set.
+PCRE2_CONVERT_NO_UTF_CHECK may also be set if you are sure the input is valid.
+One or more of the glob options, or one of the following POSIX options must be
+set to define the type of conversion that is required:
+<pre>
+  PCRE2_CONVERT_GLOB
+  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
+  PCRE2_CONVERT_GLOB_NO_STARSTAR
+  PCRE2_CONVERT_POSIX_BASIC
+  PCRE2_CONVERT_POSIX_EXTENDED
+</pre>
+Details of the conversions are given below. The <b>buffer</b> and <b>blength</b>
+arguments define how the output is handled:
+</P>
+<P>
+If <b>buffer</b> is NULL, the function just returns the length of the converted
+pattern via <b>blength</b>. This is one less than the length of buffer needed,
+because a terminating zero is always added to the output.
+</P>
+<P>
+If <b>buffer</b> points to a NULL pointer, an output buffer is obtained using
+the allocator in the context or <b>malloc()</b> if no context is supplied. A
+pointer to this buffer is placed in the variable to which <b>buffer</b> points.
+When no longer needed the output buffer must be freed by calling
+<b>pcre2_converted_pattern_free()</b>.
+</P>
+<P>
+If <b>buffer</b> points to a non-NULL pointer, <b>blength</b> must be set to the
+actual length of the buffer provided (in code units).
+</P>
+<P>
+In all cases, after successful conversion, the variable pointed to by
+<b>blength</b> is updated to the length actually used (in code units), excluding
+the terminating zero that is always added.
+</P>
+<P>
+If an error occurs, the length (via <b>blength</b>) is set to the offset
+within the input pattern where the error was detected. Only gross syntax errors
+are caught; there are plenty of errors that will get passed on for
+<b>pcre2_compile()</b> to discover.
+</P>
+<P>
+The return from <b>pcre2_pattern_convert()</b> is zero on success or a non-zero
+PCRE2 error code. Note that PCRE2 error codes may be positive or negative:
+<b>pcre2_compile()</b> uses mostly positive codes and <b>pcre2_match()</b>
+negative ones; <b>pcre2_pattern_convert()</b> uses existing codes of both kinds. A
+textual error message can be obtained by calling
+<b>pcre2_get_error_message()</b>.
+</P>
+<br><a name="SEC4" href="#TOC1">CONVERTING GLOBS</a><br>
+<P>
+Globs are used to match file names, and consequently have the concept of a
+"path separator", which defaults to backslash under Windows and forward slash
+otherwise. If PCRE2_CONVERT_GLOB is set, the wildcards * and ? are not
+permitted to match separator characters, but the double-star (**) feature
+(which does match separators) is supported.
+</P>
+<P>
+PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR matches globs with wildcards allowed to
+match separator characters. PCRE2_CONVERT_GLOB_NO_STARSTAR matches globs with the
+double-star feature disabled. These options may be given together.
+</P>
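The separator rules described above can be modelled with a small stand-alone matcher. This is only a sketch under stated assumptions: toy_glob_match() and its wild_sep parameter are hypothetical names that are not part of PCRE2, and the function matches text directly instead of converting the glob into a regular expression as pcre2_pattern_convert() does. Escape characters, character classes, and PCRE2_CONVERT_GLOB_NO_STARSTAR are not modelled, and ** is treated simply as "any run of characters", which may be looser than the real conversion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2: a simplified recursive glob matcher
   that models only the separator rules. "sep" is the path separator; setting
   "wild_sep" imitates PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR by letting * and ?
   match the separator. */
bool toy_glob_match(const char *pat, const char *text, char sep, bool wild_sep)
{
  if (*pat == '\0') return *text == '\0';

  if (pat[0] == '*' && pat[1] == '*')
    {
    /* Double star: may consume any characters, separators included. */
    for (const char *t = text; ; t++)
      {
      if (toy_glob_match(pat + 2, t, sep, wild_sep)) return true;
      if (*t == '\0') return false;
      }
    }

  if (*pat == '*')
    {
    /* Single star: stops at a separator unless wild_sep is set. */
    for (const char *t = text; ; t++)
      {
      if (toy_glob_match(pat + 1, t, sep, wild_sep)) return true;
      if (*t == '\0' || (!wild_sep && *t == sep)) return false;
      }
    }

  if (*text == '\0') return false;

  if (*pat == '?')
    {
    /* Question mark: any one character, but not a separator by default. */
    if (!wild_sep && *text == sep) return false;
    return toy_glob_match(pat + 1, text + 1, sep, wild_sep);
    }

  /* Any other character matches only itself. */
  return *pat == *text && toy_glob_match(pat + 1, text + 1, sep, wild_sep);
}
```

With '/' as the separator, the pattern *.c matches main.c but not src/main.c, whereas **.c matches both. Passing wild_sep as true lets *.c match src/main.c as well.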
+<br><a name="SEC5" href="#TOC1">CONVERTING POSIX PATTERNS</a><br>
+<P>
+POSIX defines two kinds of regular expression pattern: basic and extended.
+These can be processed by setting PCRE2_CONVERT_POSIX_BASIC or
+PCRE2_CONVERT_POSIX_EXTENDED, respectively.
+</P>
+<P>
+In POSIX patterns, backslash is not special in a character class. Unmatched
+closing parentheses are treated as literals.
+</P>
+<P>
+In basic patterns, ? + | {} and () must be escaped to be recognized
+as metacharacters outside a character class. If the first character in the
+pattern is * it is treated as a literal. ^ is a metacharacter only at the start
+of a branch.
+</P>
+<P>
+In extended patterns, a backslash not in a character class always
+makes the next character literal, whatever it is. There are no backreferences.
+</P>
+<P>
+Note: POSIX mandates that the longest possible match at the first matching
+position must be found. This is not what <b>pcre2_match()</b> does; it yields
+the first match that is found. An application can use <b>pcre2_dfa_match()</b>
+to find the longest match, but that does not support backreferences (but then
+neither do POSIX extended patterns).
+</P>
+<br><a name="SEC6" href="#TOC1">AUTHOR</a><br>
+<P>
+Philip Hazel
+<br>
+University Computing Service
+<br>
+Cambridge, England.
+<br>
+</P>
+<br><a name="SEC7" href="#TOC1">REVISION</a><br>
+<P>
+Last updated: 12 July 2017
+<br>
+Copyright &copy; 1997-2017 University of Cambridge.
+<br>
+<p>
+Return to the <a href="index.html">PCRE2 index page</a>.
+</p>
diff --git a/doc/html/pcre2demo.html b/doc/html/pcre2demo.html
index d64e16b..72754d3 100644
--- a/doc/html/pcre2demo.html
+++ b/doc/html/pcre2demo.html
@@ -228,6 +228,21 @@ pcre2_match_data_create_from_pattern() above. */
 if (rc == 0)
   printf("ovector was not big enough for all the captured substrings\n");
 
+/* We must guard against patterns such as /(?=.\K)/ that use \K in an assertion
+to set the start of a match later than its end. In this demonstration program,
+we just detect this case and give up. */
+
+if (ovector[0] &gt; ovector[1])
+  {
+  printf("\\K was used in an assertion to set the match start after its end.\n"
+    "From end to start the match was: %.*s\n", (int)(ovector[0] - ovector[1]),
+      (char *)(subject + ovector[1]));
+  printf("Run abandoned\n");
+  pcre2_match_data_free(match_data);
+  pcre2_code_free(re);
+  return 1;
+  }
+
 /* Show substrings stored in the output vector by number. Obviously, in a real
 application you might want to do things other than print them. */
 
@@ -355,6 +370,29 @@ for (;;)
     options = PCRE2_NOTEMPTY_ATSTART | PCRE2_ANCHORED;
     }
 
+  /* If the previous match was not an empty string, there is one tricky case to
+  consider. If a pattern contains \K within a lookbehind assertion at the
+  start, the end of the matched string can be at the offset where the match
+  started. Without special action, this leads to a loop that keeps on matching
+  the same substring. We must detect this case and arrange to move the start on
+  by one character. The pcre2_get_startchar() function returns the starting
+  offset that was passed to pcre2_match(). */
+
+  else
+    {
+    PCRE2_SIZE startchar = pcre2_get_startchar(match_data);
+    if (start_offset &lt;= startchar)
+      {
+      if (startchar &gt;= subject_length) break;   /* Reached end of subject.   */
+      start_offset = startchar + 1;             /* Advance by one character. */
+      if (utf8)                                 /* If UTF-8, it may be more  */
+        {                                       /*   than one code unit.     */
+        for (; start_offset &lt; subject_length; start_offset++)
+          if ((subject[start_offset] &amp; 0xc0) != 0x80) break;
+        }
+      }
+    }
+
   /* Run the next matching operation */
 
   rc = pcre2_match(
@@ -419,6 +457,21 @@ for (;;)
   if (rc == 0)
     printf("ovector was not big enough for all the captured substrings\n");
 
+  /* We must guard against patterns such as /(?=.\K)/ that use \K in an
+  assertion to set the start of a match later than its end. In this
+  demonstration program, we just detect this case and give up. */
+
+  if (ovector[0] &gt; ovector[1])
+    {
+    printf("\\K was used in an assertion to set the match start after its end.\n"
+      "From end to start the match was: %.*s\n", (int)(ovector[0] - ovector[1]),
+        (char *)(subject + ovector[1]));
+    printf("Run abandoned\n");
+    pcre2_match_data_free(match_data);
+    pcre2_code_free(re);
+    return 1;
+    }
+
   /* As before, show substrings stored in the output vector by number, and then
   also any named substrings. */
 
diff --git a/doc/html/pcre2grep.html b/doc/html/pcre2grep.html
index d02d365..625a467 100644
--- a/doc/html/pcre2grep.html
+++ b/doc/html/pcre2grep.html
@@ -22,7 +22,7 @@ please consult the man page, in case the conversion went wrong.
 <li><a name="TOC7" href="#SEC7">NEWLINES</a>
 <li><a name="TOC8" href="#SEC8">OPTIONS COMPATIBILITY</a>
 <li><a name="TOC9" href="#SEC9">OPTIONS WITH DATA</a>
-<li><a name="TOC10" href="#SEC10">CALLING EXTERNAL SCRIPTS</a>
+<li><a name="TOC10" href="#SEC10">USING PCRE2'S CALLOUT FACILITY</a>
 <li><a name="TOC11" href="#SEC11">MATCHING ERRORS</a>
 <li><a name="TOC12" href="#SEC12">DIAGNOSTICS</a>
 <li><a name="TOC13" href="#SEC13">SEE ALSO</a>
@@ -80,11 +80,19 @@ span line boundaries. What defines a line boundary is controlled by the
 </P>
 <P>
 The amount of memory used for buffering files that are being scanned is
-controlled by a parameter that can be set by the <b>--buffer-size</b> option.
-The default value for this parameter is specified when <b>pcre2grep</b> is
-built, with the default default being 20K. A block of memory three times this
-size is used (to allow for buffering "before" and "after" lines). An error
-occurs if a line overflows the buffer.
+controlled by parameters that can be set by the <b>--buffer-size</b> and
+<b>--max-buffer-size</b> options. The first of these sets the size of buffer
+that is obtained at the start of processing. If an input file contains very
+long lines, a larger buffer may be needed; this is handled by automatically
+extending the buffer, up to the limit specified by <b>--max-buffer-size</b>. The
+default values for these parameters are specified when <b>pcre2grep</b> is
+built, with the default defaults being 20K and 1M respectively. An error occurs
+if a line is too long and the buffer can no longer be expanded.
+</P>
+<P>
+The block of memory that is actually used is three times the "buffer size", to
+allow for buffering "before" and "after" lines. If the buffer size is too
+small, fewer than requested "before" and "after" lines may be output.
 </P>
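The relationship between the two buffer parameters can be sketched in a few lines of C. The function names below are hypothetical, and the doubling growth step is an assumption made for illustration: the text above says only that the buffer is extended automatically up to the maximum, not by how much each time.

```c
#include <stddef.h>

/* Hypothetical sketch, not pcre2grep source. Returns the buffer size needed
   to hold a line of "needed" bytes, starting from "current" and never
   exceeding "max". Returns 0 when the line cannot fit, which corresponds to
   the "line is too long" error. Doubling is an assumed growth policy. */
size_t grow_buffer(size_t current, size_t max, size_t needed)
{
  if (current == 0) return 0;
  while (current < needed)
    {
    if (current >= max) return 0;           /* Cannot expand any further. */
    current = (current > max / 2) ? max : current * 2;
    }
  return current;
}

/* The memory block actually obtained is three times the buffer size, leaving
   room for "before" and "after" context lines. */
size_t block_bytes(size_t buffer_size)
{
  return 3 * buffer_size;
}
```

With the default sizes of 20K and 1M, a 50K line would be accommodated after two doublings (80K) under this assumed policy, while a 2M line would be rejected. The initial memory block for a 20K buffer is 60K.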
 <P>
 Patterns can be no longer than 8K or BUFSIZ bytes, whichever is the greater.
@@ -125,23 +133,27 @@ The <b>--locale</b> option can be used to override this.
 <br><a name="SEC3" href="#TOC1">SUPPORT FOR COMPRESSED FILES</a><br>
 <P>
 It is possible to compile <b>pcre2grep</b> so that it uses <b>libz</b> or
-<b>libbz2</b> to read files whose names end in <b>.gz</b> or <b>.bz2</b>,
-respectively. You can find out whether your binary has support for one or both
-of these file types by running it with the <b>--help</b> option. If the
-appropriate support is not present, files are treated as plain text. The
-standard input is always so treated.
+<b>libbz2</b> to read compressed files whose names end in <b>.gz</b> or
+<b>.bz2</b>, respectively. You can find out whether your <b>pcre2grep</b> binary
+has support for one or both of these file types by running it with the
+<b>--help</b> option. If the appropriate support is not present, all files are
+treated as plain text. The standard input is always so treated. When input is
+from a compressed .gz or .bz2 file, the <b>--line-buffered</b> option is
+ignored.
 </P>
 <br><a name="SEC4" href="#TOC1">BINARY FILES</a><br>
 <P>
 By default, a file that contains a binary zero byte within the first 1024 bytes
-is identified as a binary file, and is processed specially. (GNU grep also
-identifies binary files in this manner.) See the <b>--binary-files</b> option
-for a means of changing the way binary files are handled.
+is identified as a binary file, and is processed specially. (GNU grep
+identifies binary files in this manner.) However, if the newline type is
+specified as "nul", that is, the line terminator is a binary zero, the test for
+a binary file is not applied. See the <b>--binary-files</b> option for a means
+of changing the way binary files are handled.
 </P>
 <br><a name="SEC5" href="#TOC1">OPTIONS</a><br>
 <P>
 The order in which some of the options appear can affect the output. For
-example, both the <b>-h</b> and <b>-l</b> options affect the printing of file
+example, both the <b>-H</b> and <b>-l</b> options affect the printing of file
 names. Whichever comes later in the command line will be the one that takes
 effect. Similarly, except where noted below, if an option is given twice, the
 later setting is used. Numerical values for options may be followed by K or M,
@@ -155,12 +167,13 @@ processing of patterns and file names that start with hyphens.
 </P>
 <P>
 <b>-A</b> <i>number</i>, <b>--after-context=</b><i>number</i>
-Output <i>number</i> lines of context after each matching line. If file names
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of <i>number</i> is expected to be relatively small. However, <b>pcre2grep</b>
-guarantees to have up to 8K of following text available for context output.
+Output up to <i>number</i> lines of context after each matching line. Fewer
+lines are output if the next match or the end of the file is reached, or if the
+processing buffer size has been set too small. If file names and/or line
+numbers are being output, a hyphen separator is used instead of a colon for the
+context lines. A line containing "--" is output between each group of lines,
+unless they are in fact contiguous in the input file. The value of <i>number</i>
+is expected to be relatively small. When <b>-c</b> is used, <b>-A</b> is ignored.
 </P>
 <P>
 <b>-a</b>, <b>--text</b>
@@ -169,12 +182,14 @@ Treat binary files as text. This is equivalent to
 </P>
 <P>
 <b>-B</b> <i>number</i>, <b>--before-context=</b><i>number</i>
-Output <i>number</i> lines of context before each matching line. If file names
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of <i>number</i> is expected to be relatively small. However, <b>pcre2grep</b>
-guarantees to have up to 8K of preceding text available for context output.
+Output up to <i>number</i> lines of context before each matching line. Fewer
+lines are output if the previous match or the start of the file is within
+<i>number</i> lines, or if the processing buffer size has been set too small. If
+file names and/or line numbers are being output, a hyphen separator is used
+instead of a colon for the context lines. A line containing "--" is output
+between each group of lines, unless they are in fact contiguous in the input
+file. The value of <i>number</i> is expected to be relatively small. When
+<b>-c</b> is used, <b>-B</b> is ignored.
 </P>
 <P>
 <b>--binary-files=</b><i>word</i>
@@ -191,8 +206,9 @@ return code.
 </P>
 <P>
 <b>--buffer-size=</b><i>number</i>
-Set the parameter that controls how much memory is used for buffering files
-that are being scanned.
+Set the parameter that controls how much memory is obtained at the start of
+processing for buffering files that are being scanned. See also
+<b>--max-buffer-size</b> below.
 </P>
 <P>
 <b>-C</b> <i>number</i>, <b>--context=</b><i>number</i>
@@ -202,14 +218,16 @@ This is equivalent to setting both <b>-A</b> and <b>-B</b> to the same value.
 <P>
 <b>-c</b>, <b>--count</b>
 Do not output lines from the files that are being scanned; instead output the
-number of matches (or non-matches if <b>-v</b> is used) that would otherwise
-have caused lines to be shown. By default, this count is the same as the number
-of suppressed lines, but if the <b>-M</b> (multiline) option is used (without
-<b>-v</b>), there may be more suppressed lines than the number of matches.
+number of lines that would have been shown, either because they matched, or, if
+<b>-v</b> is set, because they failed to match. By default, this count is
+exactly the same as the number of lines that would have been output, but if the
+<b>-M</b> (multiline) option is used (without <b>-v</b>), there may be more
+suppressed lines than the count (that is, the number of matches).
 <br>
 <br>
 If no lines are selected, the number zero is output. If several files are
-being scanned, a count is output for each of them. However, if the
+being scanned, a count is output for each of them and the <b>-t</b> option can
+be used to cause a total to be output at the end. However, if the
 <b>--files-with-matches</b> option is also used, only those files whose counts
 are greater than zero are listed. When <b>-c</b> is used, the <b>-A</b>,
 <b>-B</b>, and <b>-C</b> options are ignored.
@@ -231,12 +249,23 @@ because <b>pcre2grep</b> has to search for all possible matches in a line, not
 just one, in order to colour them all.
 <br>
 <br>
-The colour that is used can be specified by setting the environment variable
-PCRE2GREP_COLOUR or PCRE2GREP_COLOR. The value of this variable should be a
-string of two numbers, separated by a semicolon. They are copied directly into
-the control string for setting colour on a terminal, so it is your
-responsibility to ensure that they make sense. If neither of the environment
-variables is set, the default is "1;31", which gives red.
+The colour that is used can be specified by setting one of the environment
+variables PCRE2GREP_COLOUR, PCRE2GREP_COLOR, PCREGREP_COLOUR, or
+PCREGREP_COLOR, which are checked in that order. If none of these are set,
+<b>pcre2grep</b> looks for GREP_COLORS or GREP_COLOR (in that order). The value
+of the variable should be a string of two numbers, separated by a semicolon,
+except in the case of GREP_COLORS, which must start with "ms=" or "mt="
+followed by two semicolon-separated colours, terminated by the end of the
+string or by a colon. If GREP_COLORS does not start with "ms=" or "mt=" it is
+ignored, and GREP_COLOR is checked.
+<br>
+<br>
+If the string obtained from one of the above variables contains any characters
+other than semicolon or digits, the setting is ignored and the default colour
+is used. The string is copied directly into the control string for setting
+colour on a terminal, so it is your responsibility to ensure that the values
+make sense. If no relevant environment variable is set, the default is "1;31",
+which gives red.
 </P>
 <P>
 <b>-D</b> <i>action</i>, <b>--devices=</b><i>action</i>
@@ -255,6 +284,10 @@ operating systems the effect of reading a directory like this is an immediate
 end-of-file; in others it may provoke an error.
 </P>
 <P>
+<b>--depth-limit</b>=<i>number</i>
+See <b>--match-limit</b> below.
+</P>
+<P>
 <b>-e</b> <i>pattern</i>, <b>--regex=</b><i>pattern</i>, <b>--regexp=</b><i>pattern</i>
 Specify a pattern to be matched. This option can be used multiple times in
 order to specify several patterns. It can also be used as a way of specifying a
@@ -321,18 +354,18 @@ files; it does not apply to patterns specified by any of the <b>--include</b> or
 </P>
 <P>
 <b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i>
-Read patterns from the file, one per line, and match them against
-each line of input. What constitutes a newline when reading the file is the
-operating system's default. The <b>--newline</b> option has no effect on this
-option. Trailing white space is removed from each line, and blank lines are
-ignored. An empty file contains no patterns and therefore matches nothing. See
-also the comments about multiple patterns versus a single pattern with
-alternatives in the description of <b>-e</b> above.
-<br>
-<br>
-If this option is given more than once, all the specified files are
-read. A data line is output if any of the patterns match it. A file name can
-be given as "-" to refer to the standard input. When <b>-f</b> is used, patterns
+Read patterns from the file, one per line, and match them against each line of
+input. What constitutes a newline when reading the file is the operating
+system's default. The <b>--newline</b> option has no effect on this option.
+Trailing white space is removed from each line, and blank lines are ignored. An
+empty file contains no patterns and therefore matches nothing. See also the
+comments about multiple patterns versus a single pattern with alternatives in
+the description of <b>-e</b> above.
+<br>
+<br>
+If this option is given more than once, all the specified files are read. A
+data line is output if any of the patterns match it. A file name can be given
+as "-" to refer to the standard input. When <b>-f</b> is used, patterns
 specified on the command line using <b>-e</b> may also be present; they are
 tested before the file's patterns. However, no other pattern is taken from the
 command line; all arguments are treated as the names of paths to be searched.
@@ -355,8 +388,8 @@ Instead of showing lines or parts of lines that match, show each match as an
 offset from the start of the file and a length, separated by a comma. In this
 mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b>
 options are ignored. If there is more than one match in a line, each of them is
-shown separately. This option is mutually exclusive with <b>--line-offsets</b>
-and <b>--only-matching</b>.
+shown separately. This option is mutually exclusive with <b>--output</b>,
+<b>--line-offsets</b>, and <b>--only-matching</b>.
 </P>
 <P>
 <b>-H</b>, <b>--with-filename</b>
@@ -365,14 +398,20 @@ searching a single file. By default, the file name is not shown in this case.
 For matching lines, the file name is followed by a colon; for context lines, a
 hyphen separator is used. If a line number is also being output, it follows the
 file name. When the <b>-M</b> option causes a pattern to match more than one
-line, only the first is preceded by the file name.
+line, only the first is preceded by the file name. This option overrides any
+previous <b>-h</b>, <b>-l</b>, or <b>-L</b> options.
 </P>
 <P>
 <b>-h</b>, <b>--no-filename</b>
 Suppress the output file names when searching multiple files. By default,
 file names are shown when multiple files are searched. For matching lines, the
 file name is followed by a colon; for context lines, a hyphen separator is used.
-If a line number is also being output, it follows the file name.
+If a line number is also being output, it follows the file name. This option
+overrides any previous <b>-H</b>, <b>-L</b>, or <b>-l</b> options.
+</P>
+<P>
+<b>--heap-limit</b>=<i>number</i>
+See <b>--match-limit</b> below.
 </P>
 <P>
 <b>--help</b>
@@ -425,17 +464,19 @@ given any number of times. If a directory matches both <b>--include-dir</b> and
 <b>-L</b>, <b>--files-without-match</b>
 Instead of outputting lines from the files, just output the names of the files
 that do not contain any lines that would have been output. Each file name is
-output once, on a separate line.
+output once, on a separate line. This option overrides any previous <b>-H</b>,
+<b>-h</b>, or <b>-l</b> options.
 </P>
 <P>
 <b>-l</b>, <b>--files-with-matches</b>
 Instead of outputting lines from the files, just output the names of the files
-containing lines that would have been output. Each file name is output
-once, on a separate line. Searching normally stops as soon as a matching line
-is found in a file. However, if the <b>-c</b> (count) option is also used,
-matching continues in order to obtain the correct count, and those files that
-have at least one match are listed along with their counts. Using this option
-with <b>-c</b> is a way of suppressing the listing of files with no matches.
+containing lines that would have been output. Each file name is output once, on
+a separate line. Searching normally stops as soon as a matching line is found
+in a file. However, if the <b>-c</b> (count) option is also used, matching
+continues in order to obtain the correct count, and those files that have at
+least one match are listed along with their counts. Using this option with
+<b>-c</b> is a way of suppressing the listing of files with no matches. This
+option overrides any previous <b>-H</b>, <b>-h</b>, or <b>-L</b> options.
 </P>
 <P>
 <b>--label</b>=<i>name</i>
@@ -445,14 +486,16 @@ short form for this option.
 </P>
 <P>
 <b>--line-buffered</b>
-When this option is given, input is read and processed line by line, and the
-output is flushed after each write. By default, input is read in large chunks,
-unless <b>pcre2grep</b> can determine that it is reading from a terminal (which
-is currently possible only in Unix-like environments). Output to terminal is
-normally automatically flushed by the operating system. This option can be
-useful when the input or output is attached to a pipe and you do not want
-<b>pcre2grep</b> to buffer up large amounts of data. However, its use will
-affect performance, and the <b>-M</b> (multiline) option ceases to work.
+When this option is given, non-compressed input is read and processed line by
+line, and the output is flushed after each write. By default, input is read in
+large chunks, unless <b>pcre2grep</b> can determine that it is reading from a
+terminal (which is currently possible only in Unix-like environments). Output
+to terminal is normally automatically flushed by the operating system. This
+option can be useful when the input or output is attached to a pipe and you do
+not want <b>pcre2grep</b> to buffer up large amounts of data. However, its use
+will affect performance, and the <b>-M</b> (multiline) option ceases to work.
+When input is from a compressed .gz or .bz2 file, <b>--line-buffered</b> is
+ignored.
 </P>
 <P>
 <b>--line-offsets</b>
@@ -462,7 +505,8 @@ number is terminated by a colon (as usual; see the <b>-n</b> option), and the
 offset and length are separated by a comma. In this mode, no context is shown.
 That is, the <b>-A</b>, <b>-B</b>, and <b>-C</b> options are ignored. If there is
 more than one match in a line, each of them is shown separately. This option is
-mutually exclusive with <b>--file-offsets</b> and <b>--only-matching</b>.
+mutually exclusive with <b>--output</b>, <b>--file-offsets</b>, and
+<b>--only-matching</b>.
 </P>
 <P>
 <b>--locale</b>=<i>locale-name</i>
@@ -473,51 +517,57 @@ used. There is no short form for this option.
 </P>
 <P>
 <b>--match-limit</b>=<i>number</i>
-Processing some regular expression patterns can require a very large amount of
-memory, leading in some cases to a program crash if not enough is available.
-Other patterns may take a very long time to search for all possible matching
-strings. The <b>pcre2_match()</b> function that is called by <b>pcre2grep</b> to
-do the matching has two parameters that can limit the resources that it uses.
+Processing some regular expression patterns may take a very long time to search
+for all possible matching strings. Others may require a very large amount of
+memory. There are three options that set resource limits for matching.
+<br>
+<br>
+The <b>--match-limit</b> option provides a means of limiting computing resource
+usage when processing patterns that are not going to match, but which have a
+very large number of possibilities in their search trees. The classic example
+is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
+counter that is incremented each time around its main processing loop. If the
+value set by <b>--match-limit</b> is reached, an error occurs.
 <br>
 <br>
-The <b>--match-limit</b> option provides a means of limiting resource usage
-when processing patterns that are not going to match, but which have a very
-large number of possibilities in their search trees. The classic example is a
-pattern that uses nested unlimited repeats. Internally, PCRE2 uses a function
-called <b>match()</b> which it calls repeatedly (sometimes recursively). The
-limit set by <b>--match-limit</b> is imposed on the number of times this
-function is called during a match, which has the effect of limiting the amount
-of backtracking that can take place.
+The <b>--heap-limit</b> option specifies, as a number of kilobytes, the amount
+of heap memory that may be used for matching. Heap memory is needed only if
+matching the pattern requires a significant number of nested backtracking
+points to be remembered. This parameter can be set to zero to forbid the use of
+heap memory altogether.
 <br>
 <br>
-The <b>--recursion-limit</b> option is similar to <b>--match-limit</b>, but
-instead of limiting the total number of times that <b>match()</b> is called, it
-limits the depth of recursive calls, which in turn limits the amount of memory
-that can be used. The recursion depth is a smaller number than the total number
-of calls, because not all calls to <b>match()</b> are recursive. This limit is
-of use only if it is set smaller than <b>--match-limit</b>.
+The <b>--depth-limit</b> option limits the depth of nested backtracking points,
+which indirectly limits the amount of memory that is used. The amount of memory
+needed for each backtracking point depends on the number of capturing
+parentheses in the pattern, so the amount of memory that is used before this
+limit acts varies from pattern to pattern. This limit is of use only if it is
+set smaller than <b>--match-limit</b>.
 <br>
 <br>
 There are no short forms for these options. The default settings are specified
-when the PCRE2 library is compiled, with the default default being 10 million.
+when the PCRE2 library is compiled, with the default defaults being very large
+and so effectively unlimited.
+</P>
+<P>
+<b>--max-buffer-size</b>=<i>number</i>
+This limits the expansion of the processing buffer, whose initial size can be
+set by <b>--buffer-size</b>. The maximum buffer size is silently forced to be no
+smaller than the starting buffer size.
 </P>
 <P>
 <b>-M</b>, <b>--multiline</b>
-Allow patterns to match more than one line. When this option is given, patterns
-may usefully contain literal newline characters and internal occurrences of ^
-and $ characters. The output for a successful match may consist of more than
-one line. The first is the line in which the match started, and the last is the
-line in which the match ended. If the matched string ends with a newline
-sequence the output ends at the end of that line.
-<br>
-<br>
-When this option is set, the PCRE2 library is called in "multiline" mode. This
-allows a matched string to extend past the end of a line and continue on one or
-more subsequent lines. However, <b>pcre2grep</b> still processes the input line
-by line. Once a match has been handled, scanning restarts at the beginning of
-the next line, just as it does when <b>-M</b> is not present. This means that it
-is possible for the second or subsequent lines in a multiline match to be
-output again as part of another match.
+Allow patterns to match more than one line. When this option is set, the PCRE2
+library is called in "multiline" mode. This allows a matched string to extend
+past the end of a line and continue on one or more subsequent lines. Patterns
+used with <b>-M</b> may usefully contain literal newline characters and internal
+occurrences of ^ and $ characters. The output for a successful match may
+consist of more than one line. The first line is the line in which the match
+started, and the last line is the line in which the match ended. If the matched
+string ends with a newline sequence, the output ends at the end of that line.
+If <b>-v</b> is set, none of the lines in a multi-line match are output. Once a
+match has been handled, scanning restarts at the beginning of the line after
+the one in which the match ended.
 <br>
 <br>
 The newline sequence that separates multiple lines must be matched as part of
@@ -533,11 +583,8 @@ well as possibly handling a two-character newline sequence.
 <br>
 <br>
 There is a limit to the number of lines that can be matched, imposed by the way
-that <b>pcre2grep</b> buffers the input file as it scans it. However,
-<b>pcre2grep</b> ensures that at least 8K characters or the rest of the file
-(whichever is the shorter) are available for forward matching, and similarly
-the previous 8K characters (or all the previous characters, if fewer than 8K)
-are guaranteed to be available for lookbehind assertions. The <b>-M</b> option
+that <b>pcre2grep</b> buffers the input file as it scans it. With a sufficiently
+large processing buffer, this should not be a problem, but the <b>-M</b> option
 does not work when input is read line by line (see <b>--line-buffered</b>).
 </P>
 <P>
@@ -581,16 +628,47 @@ use of JIT at run time. It is provided for testing and working round problems.
 It should never be needed in normal use.
 </P>
 <P>
+<b>-O</b> <i>text</i>, <b>--output</b>=<i>text</i>
+When there is a match, instead of outputting the whole line that matched,
+output just the given text. This option is mutually exclusive with
+<b>--only-matching</b>, <b>--file-offsets</b>, and <b>--line-offsets</b>. Escape
+sequences starting with a dollar character may be used to insert the contents
+of the matched part of the line and/or captured substrings into the text.
+<br>
+<br>
+$&#60;digits&#62; or ${&#60;digits&#62;} is replaced by the captured
+substring of the given decimal number; zero substitutes the whole match. If
+the number is greater than the number of capturing substrings, or if the
+capture is unset, the replacement is empty.
+<br>
+<br>
+$a is replaced by bell; $b by backspace; $e by escape; $f by form feed; $n by
+newline; $r by carriage return; $t by tab; $v by vertical tab.
+<br>
+<br>
+$o&#60;digits&#62; is replaced by the character represented by the given octal
+number; up to three digits are processed.
+<br>
+<br>
+$x&#60;digits&#62; is replaced by the character represented by the given hexadecimal
+number; up to two digits are processed.
+<br>
+<br>
+Any other character is substituted by itself. In particular, $$ is replaced by
+a single dollar.
+</P>
+<P>
 <b>-o</b>, <b>--only-matching</b>
 Show only the part of the line that matched a pattern instead of the whole
 line. In this mode, no context is shown. That is, the <b>-A</b>, <b>-B</b>, and
 <b>-C</b> options are ignored. If there is more than one match in a line, each
-of them is shown separately. If <b>-o</b> is combined with <b>-v</b> (invert the
-sense of the match to find non-matching lines), no output is generated, but the
-return code is set appropriately. If the matched portion of the line is empty,
-nothing is output unless the file name or line number are being printed, in
-which case they are shown on an otherwise empty line. This option is mutually
-exclusive with <b>--file-offsets</b> and <b>--line-offsets</b>.
+of them is shown separately, on a separate line of output. If <b>-o</b> is
+combined with <b>-v</b> (invert the sense of the match to find non-matching
+lines), no output is generated, but the return code is set appropriately. If
+the matched portion of the line is empty, nothing is output unless the file
+name or line number are being printed, in which case they are shown on an
+otherwise empty line. This option is mutually exclusive with <b>--output</b>,
+<b>--file-offsets</b> and <b>--line-offsets</b>.
 </P>
 <P>
 <b>-o</b><i>number</i>, <b>--only-matching</b>=<i>number</i>
@@ -599,15 +677,16 @@ given number. Up to 32 capturing parentheses are supported, and -o0 is
 equivalent to <b>-o</b> without a number. Because these options can be given
 without an argument (see above), if an argument is present, it must be given in
 the same shell item, for example, -o3 or --only-matching=2. The comments given
-for the non-argument case above also apply to this case. If the specified
+for the non-argument case above also apply to this option. If the specified
 capturing parentheses do not exist in the pattern, or were not set in the
 match, nothing is output unless the file name or line number are being output.
 <br>
 <br>
-If this option is given multiple times, multiple substrings are output, in the
-order the options are given. For example, -o3 -o1 -o3 causes the substrings
-matched by capturing parentheses 3 and 1 and then 3 again to be output. By
-default, there is no separator (but see the next option).
+If this option is given multiple times, multiple substrings are output for each
+match, in the order the options are given, and all on one line. For example,
+-o3 -o1 -o3 causes the substrings matched by capturing parentheses 3 and 1 and
+then 3 again to be output. By default, there is no separator (but see the next
+option).
 </P>
 <P>
 <b>--om-separator</b>=<i>text</i>
@@ -638,6 +717,18 @@ quietly skipped. However, the return code is still 2, even if matches were
 found in other files.
 </P>
 <P>
+<b>-t</b>, <b>--total-count</b>
+This option is useful when scanning more than one file. If used on its own,
+<b>-t</b> suppresses all output except for a grand total number of matching
+lines (or non-matching lines if <b>-v</b> is used) in all the files. If <b>-t</b>
+is used with <b>-c</b>, a grand total is output except when the previous output
+is just one line. In other words, it is not output when just one file's count
+is listed. If file names are being output, the grand total is preceded by
+"TOTAL:". Otherwise, it appears as just another number. The <b>-t</b> option is
+ignored when used with <b>-L</b> (list files without matches), because the grand
+total would always be zero.
+</P>
+<P>
 <b>-u</b>, <b>--utf-8</b>
 Operate in UTF-8 mode. This option is available only if PCRE2 has been compiled
 with UTF-8 support. All patterns (including those for any <b>--exclude</b> and
@@ -657,17 +748,19 @@ the patterns are the ones that are found.
 </P>
 <P>
 <b>-w</b>, <b>--word-regex</b>, <b>--word-regexp</b>
-Force the patterns to match only whole words. This is equivalent to having \b
-at the start and end of the pattern. This option applies only to the patterns
-that are matched against the contents of files; it does not apply to patterns
-specified by any of the <b>--include</b> or <b>--exclude</b> options.
+Force the patterns only to match "words". That is, there must be a word
+boundary at the start and end of each matched string. This is equivalent to
+having "\b(?:" at the start of each pattern, and ")\b" at the end. This
+option applies only to the patterns that are matched against the contents of
+files; it does not apply to patterns specified by any of the <b>--include</b> or
+<b>--exclude</b> options.
 </P>
 <P>
 <b>-x</b>, <b>--line-regex</b>, <b>--line-regexp</b>
-Force the patterns to be anchored (each must start matching at the beginning of
-a line) and in addition, require them to match entire lines. This is equivalent
-to having ^ and $ characters at the start and end of each alternative top-level
-branch in every pattern. This option applies only to the patterns that are
+Force the patterns to start matching only at the beginnings of lines, and in
+addition, require them to match entire lines. In multiline mode the match may
+be more than one line. This is equivalent to having "^(?:" at the start of each
+pattern and ")$" at the end. This option applies only to the patterns that are
 matched against the contents of files; it does not apply to patterns specified
 by any of the <b>--include</b> or <b>--exclude</b> options.
 </P>
@@ -696,10 +789,11 @@ relying on the C I/O library to convert this to an appropriate sequence.
 Many of the short and long forms of <b>pcre2grep</b>'s options are the same
 as in the GNU <b>grep</b> program. Any long option of the form
 <b>--xxx-regexp</b> (GNU terminology) is also available as <b>--xxx-regex</b>
-(PCRE2 terminology). However, the <b>--file-list</b>, <b>--file-offsets</b>,
-<b>--include-dir</b>, <b>--line-offsets</b>, <b>--locale</b>, <b>--match-limit</b>,
-<b>-M</b>, <b>--multiline</b>, <b>-N</b>, <b>--newline</b>, <b>--om-separator</b>,
-<b>--recursion-limit</b>, <b>-u</b>, and <b>--utf-8</b> options are specific to
+(PCRE2 terminology). However, the <b>--depth-limit</b>, <b>--file-list</b>,
+<b>--file-offsets</b>, <b>--heap-limit</b>, <b>--include-dir</b>,
+<b>--line-offsets</b>, <b>--locale</b>, <b>--match-limit</b>, <b>-M</b>,
+<b>--multiline</b>, <b>-N</b>, <b>--newline</b>, <b>--om-separator</b>,
+<b>--output</b>, <b>-u</b>, and <b>--utf-8</b> options are specific to
 <b>pcre2grep</b>, as is the use of the <b>--only-matching</b> option with a
 capturing parentheses number.
 </P>
@@ -742,23 +836,30 @@ The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
 options does have data, it must be given in the first form, using an equals
 character. Otherwise <b>pcre2grep</b> will assume that it has no data.
 </P>
-<br><a name="SEC10" href="#TOC1">CALLING EXTERNAL SCRIPTS</a><br>
+<br><a name="SEC10" href="#TOC1">USING PCRE2'S CALLOUT FACILITY</a><br>
 <P>
-On non-Windows systems, <b>pcre2grep</b> has, by default, support for calling
-external programs or scripts during matching by making use of PCRE2's callout
-facility. However, this support can be disabled when <b>pcre2grep</b> is built.
-You can find out whether your binary has support for callouts by running it
-with the <b>--help</b> option. If the support is not enabled, all callouts in
+<b>pcre2grep</b> has, by default, support for calling external programs or
+scripts or echoing specific strings during matching by making use of PCRE2's
+callout facility. However, this support can be disabled when <b>pcre2grep</b> is
+built. You can find out whether your binary has support for callouts by running
+it with the <b>--help</b> option. If the support is not enabled, all callouts in
 patterns are ignored by <b>pcre2grep</b>.
 </P>
 <P>
 A callout in a PCRE2 pattern is of the form (?C&#60;arg&#62;) where the argument is
 either a number or a quoted string (see the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
-documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>.
-String arguments are parsed as a list of substrings separated by pipe (vertical
-bar) characters. The first substring must be an executable name, with the
-following substrings specifying arguments:
+documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>;
+only callouts with string arguments are useful.
+</P>
+<br><b>
+Calling external programs or scripts
+</b><br>
+<P>
+If the callout string does not start with a pipe (vertical bar) character, it
+is parsed into a list of substrings separated by pipe characters. The first
+substring must be an executable name, with the following substrings specifying
+arguments:
 <pre>
   executable_name|arg1|arg2|...
 </pre>
@@ -792,6 +893,19 @@ callout to be ignored. If running the program fails for any reason (including
 the non-existence of the executable), a local matching failure occurs and the
 matcher backtracks in the normal way.
 </P>
+<br><b>
+Echoing a specific string
+</b><br>
+<P>
+If the callout string starts with a pipe (vertical bar) character, the rest of
+the string is written to the output, having been passed through the same escape
+processing as text from the <b>--output</b> option. This provides a simple echoing
+facility that avoids calling an external program or script. No terminator is
+added to the string, so if you want a newline, you must include it explicitly.
+Matching continues normally after the string is output. If you want to see only
+the callout output but not any output from an actual match, you should end the
+relevant pattern with (*FAIL).
+</P>
 <br><a name="SEC11" href="#TOC1">MATCHING ERRORS</a><br>
 <P>
 It is possible to supply a regular expression that takes a very long time to
@@ -804,9 +918,9 @@ there are more than 20 such errors, <b>pcre2grep</b> gives up.
 </P>
 <P>
 The <b>--match-limit</b> option of <b>pcre2grep</b> can be used to set the
-overall resource limit; there is a second option called <b>--recursion-limit</b>
-that sets a limit on the amount of memory (usually stack) that is used (see the
-discussion of these options above).
+overall resource limit. There are also other limits that affect the amount of
+memory used during matching; see the discussion of <b>--heap-limit</b> and
+<b>--depth-limit</b> above.
 </P>
 <br><a name="SEC12" href="#TOC1">DIAGNOSTICS</a><br>
 <P>
@@ -816,6 +930,10 @@ matches were found in other files) or too many matching errors. Using the
 <b>-s</b> option to suppress error messages about inaccessible files does not
 affect the return code.
 </P>
+<P>
+When run under VMS, the return code is placed in the symbol PCRE2GREP_RC
+because VMS does not distinguish between exit(0) and exit(1).
+</P>
 <br><a name="SEC13" href="#TOC1">SEE ALSO</a><br>
 <P>
 <b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3).
@@ -831,9 +949,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC15" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 19 June 2016
+Last updated: 13 November 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
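
The dollar-escape rules that the pcre2grep.html changes above describe for --output (and reuse for callout strings that start with a pipe character) can be modelled in a few lines. What follows is a minimal Python sketch under stated assumptions, not pcre2grep's actual C implementation. The function name expand_output is hypothetical, and a Python re match object stands in for a PCRE2 match. The documentation does not say what happens to a trailing lone "$", to "$o" or "$x" with no valid digits after it, or to an unterminated "${". The sketch leaves a lone "$" unchanged and treats the other cases as "any other character". It also produces Unicode code points with chr(), whereas pcre2grep writes code units.

```python
import re

# Single-letter escapes listed in the --output description.
SIMPLE = {"a": "\a", "b": "\b", "e": "\x1b", "f": "\f",
          "n": "\n", "r": "\r", "t": "\t", "v": "\v"}

# One alternative per documented escape form. The specific forms come first
# so that "${1}", "$o101" and "$x41" are tried before the catch-all branch.
ESCAPE = re.compile(
    r"\$(?:\{(?P<braced>[0-9]+)\}"      # ${<digits>}
    r"|(?P<num>[0-9]+)"                 # $<digits>
    r"|o(?P<oct>[0-7]{1,3})"            # $o<digits>, up to three octal digits
    r"|x(?P<hex>[0-9A-Fa-f]{1,2})"      # $x<digits>, up to two hex digits
    r"|(?P<other>.))",                  # $a, $n, ..., $$ and anything else
    re.DOTALL,
)


def expand_output(text, match):
    """Expand --output style dollar escapes in text for one match.

    Hypothetical helper: match is a Python re match object standing in
    for the PCRE2 match that pcre2grep would have at this point.
    """
    def repl(m):
        digits = m.group("braced") or m.group("num")
        if digits is not None:
            n = int(digits)
            # Too-large group numbers and unset groups expand to nothing.
            if n > match.re.groups:
                return ""
            return match.group(n) or ""
        if m.group("oct") is not None:
            return chr(int(m.group("oct"), 8))
        if m.group("hex") is not None:
            return chr(int(m.group("hex"), 16))
        ch = m.group("other")
        # Known letters map to control characters; any other character
        # (including "$") is substituted by itself.
        return SIMPLE.get(ch, ch)

    return ESCAPE.sub(repl, text)
```

For example, with the pattern (\w+)@(\w+) matched against "mail bob@example now", the text "$2:$1$n" expands to "example:bob" followed by a newline, and "${1}0" expands to "bob0". The braces are what keep the capture number 1 separate from the literal digit that follows it.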
diff --git a/doc/html/pcre2jit.html b/doc/html/pcre2jit.html
index 4a6d4ff..c53d3d9 100644
--- a/doc/html/pcre2jit.html
+++ b/doc/html/pcre2jit.html
@@ -173,7 +173,7 @@ below for a discussion of JIT stack usage.
 The error code PCRE2_ERROR_MATCHLIMIT is returned by the JIT code if searching
 a very large pattern tree goes on for too long, as it is in the same
 circumstance when JIT is not used, but the details of exactly what is counted
-are not the same. The PCRE2_ERROR_RECURSIONLIMIT error code is never returned
+are not the same. The PCRE2_ERROR_DEPTHLIMIT error code is never returned
 when JIT matching is used.
 <a name="stackcontrol"></a></P>
 <br><a name="SEC6" href="#TOC1">CONTROLLING THE JIT STACK</a><br>
@@ -194,12 +194,8 @@ allocation functions, or NULL for standard memory allocation). It returns a
 pointer to an opaque structure of type <b>pcre2_jit_stack</b>, or NULL if there
 is an error. The <b>pcre2_jit_stack_free()</b> function is used to free a stack
 that is no longer needed. (For the technically minded: the address space is
-allocated by mmap or VirtualAlloc.)
-</P>
-<P>
-JIT uses far less memory for recursion than the interpretive code,
-and a maximum stack size of 512K to 1M should be more than enough for any
-pattern.
+allocated by mmap or VirtualAlloc.) A maximum stack size of 512K to 1M should
+be more than enough for any pattern.
 </P>
 <P>
 The <b>pcre2_jit_stack_assign()</b> function specifies which stack JIT code
@@ -436,9 +432,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC13" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 05 June 2016
+Last updated: 31 March 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2limits.html b/doc/html/pcre2limits.html
index e227a30..640fe3d 100644
--- a/doc/html/pcre2limits.html
+++ b/doc/html/pcre2limits.html
@@ -44,14 +44,6 @@ integer type, usually defined as size_t. Its maximum value (that is
 and unset offsets.
 </P>
 <P>
-Note that when using the traditional matching function, PCRE2 uses recursion to
-handle subpatterns and indefinite repetition. This means that the available
-stack space may limit the size of a subject string that can be processed by
-certain patterns. For a discussion of stack issues, see the
-<a href="pcre2stack.html"><b>pcre2stack</b></a>
-documentation.
-</P>
-<P>
 All values in repeating quantifiers must be less than 65536.
 </P>
 <P>
@@ -61,14 +53,10 @@ The maximum length of a lookbehind assertion is 65535 characters.
 There is no limit to the number of parenthesized subpatterns, but there can be
 no more than 65535 capturing subpatterns. There is, however, a limit to the
 depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
-order to limit the amount of system stack used at compile time. The limit can
-be specified when PCRE2 is built; the default is 250.
-</P>
-<P>
-There is a limit to the number of forward references to subsequent subpatterns
-of around 200,000. Repeated forward references with fixed upper limits, for
-example, (?2){0,100} when subpattern number 2 is to the right, are included in
-the count. There is no limit to the number of backward references.
+order to limit the amount of system stack used at compile time. The default
+limit can be specified when PCRE2 is built; the default default is 250. An
+application can change this limit by calling pcre2_set_parens_nest_limit() to
+set the limit in a compile context.
 </P>
 <P>
 The maximum length of name for a named subpattern is 32 code units, and the
@@ -76,7 +64,12 @@ maximum number of named subpatterns is 10000.
 </P>
 <P>
 The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
-is 255 for the 8-bit library and 65535 for the 16-bit and 32-bit libraries.
+is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
+32-bit libraries.
+</P>
+<P>
+The maximum length of a string argument to a callout is the largest number a
+32-bit unsigned integer can hold.
 </P>
 <br><b>
 AUTHOR
@@ -93,9 +86,9 @@ Cambridge, England.
 REVISION
 </b><br>
 <P>
-Last updated: 05 November 2015
+Last updated: 30 March 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2pattern.html b/doc/html/pcre2pattern.html
index 797690a..c495cba 100644
--- a/doc/html/pcre2pattern.html
+++ b/doc/html/pcre2pattern.html
@@ -170,35 +170,54 @@ the application to apply the JIT optimization by calling
 <b>pcre2_jit_compile()</b> is ignored.
 </P>
 <br><b>
-Setting match and recursion limits
+Setting match resource limits
 </b><br>
 <P>
-The caller of <b>pcre2_match()</b> can set a limit on the number of times the
-internal <b>match()</b> function is called and on the maximum depth of
-recursive calls. These facilities are provided to catch runaway matches that
-are provoked by patterns with huge matching trees (a typical example is a
-pattern with nested unlimited repeats) and to avoid running out of system stack
-by too much recursion. When one of these limits is reached, <b>pcre2_match()</b>
-gives an error return. The limits can also be set by items at the start of the
-pattern of the form
+The <b>pcre2_match()</b> function contains a counter that is incremented every time it
+goes round its main loop. The caller of <b>pcre2_match()</b> can set a limit on
+this counter, which therefore limits the amount of computing resource used for
+a match. The maximum depth of nested backtracking can also be limited; this
+indirectly restricts the amount of heap memory that is used, but there is also
+an explicit memory limit that can be set.
+</P>
+<P>
+These facilities are provided to catch runaway matches that are provoked by
+patterns with huge matching trees (a typical example is a pattern with nested
+unlimited repeats applied to a long string that does not match). When one of
+these limits is reached, <b>pcre2_match()</b> gives an error return. The limits
+can also be set by items at the start of the pattern of the form
 <pre>
+  (*LIMIT_HEAP=d)
   (*LIMIT_MATCH=d)
-  (*LIMIT_RECURSION=d)
+  (*LIMIT_DEPTH=d)
 </pre>
 where d is any number of decimal digits. However, the value of the setting must
 be less than the value set (or defaulted) by the caller of <b>pcre2_match()</b>
 for it to have any effect. In other words, the pattern writer can lower the
 limits set by the programmer, but not raise them. If there is more than one
 setting of one of these limits, the lower value is used.
+</P>
+<P>
+Prior to release 10.30, LIMIT_DEPTH was called LIMIT_RECURSION. This name is
+still recognized for backwards compatibility.
+</P>
+<P>
+The heap limit applies only when the <b>pcre2_match()</b> interpreter is used
+for matching. It does not apply to JIT or DFA matching. The match limit is used
+(but in a different way) when JIT is being used, or when
+<b>pcre2_dfa_match()</b> is called, to limit computing resource usage by those
+matching functions. The depth limit is ignored by JIT but is relevant for DFA
+matching, which uses function recursion for recursions within the pattern. In
+this case, the depth limit controls the amount of system stack that is used.
 <a name="newlines"></a></P>
 <br><b>
 Newline conventions
 </b><br>
 <P>
-PCRE2 supports five different conventions for indicating line breaks in
+PCRE2 supports six different conventions for indicating line breaks in
 strings: a single CR (carriage return) character, a single LF (linefeed)
-character, the two-character sequence CRLF, any of the three preceding, or any
-Unicode newline sequence. The
+character, the two-character sequence CRLF, any of the three preceding, any
+Unicode newline sequence, or the NUL character (binary zero). The
 <a href="pcre2api.html"><b>pcre2api</b></a>
 page has
 <a href="pcre2api.html#newlines">further discussion</a>
@@ -207,13 +226,14 @@ about newlines, and shows how to set the newline convention when calling
 </P>
 <P>
 It is also possible to specify a newline convention by starting a pattern
-string with one of the following five sequences:
+string with one of the following sequences:
 <pre>
   (*CR)        carriage return
   (*LF)        linefeed
   (*CRLF)      carriage return, followed by linefeed
   (*ANYCRLF)   any of the three above
   (*ANY)       all Unicode newline sequences
+  (*NUL)       the NUL character (binary zero)
 </pre>
 These override the default and the options given to the compiling function. For
 example, on a Unix system where LF is the default newline sequence, the pattern
@@ -229,8 +249,8 @@ The newline convention affects where the circumflex and dollar assertions are
 true. It also affects the interpretation of the dot metacharacter when
 PCRE2_DOTALL is not set, and the behaviour of \N. However, it does not affect
 what the \R escape sequence matches. By default, this is any Unicode newline
-sequence, for Perl compatibility. However, this can be changed; see the
-description of \R in the section entitled
+sequence, for Perl compatibility. However, this can be changed; see the next
+section and the description of \R in the section entitled
 <a href="#newlineseq">"Newline sequences"</a>
 below. A change of \R setting can be combined with a change of newline
 convention.
@@ -248,7 +268,7 @@ corresponding to PCRE2_BSR_UNICODE.
 <br><a name="SEC3" href="#TOC1">EBCDIC CHARACTER CODES</a><br>
 <P>
 PCRE2 can be compiled to run in an environment that uses EBCDIC as its
-character code rather than ASCII or Unicode (typically a mainframe system). In
+character code instead of ASCII or Unicode (typically a mainframe system). In
 the sections below, character code values are ASCII or Unicode; in an EBCDIC
 environment these characters may have different code values, and there are no
 code points greater than 255.
@@ -312,11 +332,11 @@ that character may have. This use of backslash as an escape character applies
 both inside and outside character classes.
 </P>
 <P>
-For example, if you want to match a * character, you write \* in the pattern.
-This escaping action applies whether or not the following character would
-otherwise be interpreted as a metacharacter, so it is always safe to precede a
-non-alphanumeric with backslash to specify that it stands for itself. In
-particular, if you want to match a backslash, you write \\.
+For example, if you want to match a * character, you must write \* in the
+pattern. This escaping action applies whether or not the following character
+would otherwise be interpreted as a metacharacter, so it is always safe to
+precede a non-alphanumeric with backslash to specify that it stands for itself.
+In particular, if you want to match a backslash, you write \\.
 </P>
 <P>
 In a UTF mode, only ASCII numbers and letters have any special meaning after a
@@ -347,7 +367,7 @@ An isolated \E that is not preceded by \Q is ignored. If \Q is not followed
 by \E later in the pattern, the literal interpretation continues to the end of
 the pattern (that is, \E is assumed at the end). If the isolated \Q is inside
 a character class, this causes an error, because the character class is not
-terminated.
+terminated by a closing square bracket.
 <a name="digitsafterbackslash"></a></P>
 <br><b>
 Non-printing characters
@@ -379,32 +399,31 @@ case letter, it is converted to upper case. Then bit 6 of the character (hex
 40) is inverted. Thus \cA to \cZ become hex 01 to hex 1A (A is 41, Z is 5A),
 but \c{ becomes hex 3B ({ is 7B), and \c; becomes hex 7B (; is 3B). If the
 code unit following \c has a value less than 32 or greater than 126, a
-compile-time error occurs. This locks out non-printable ASCII characters in all
-modes.
+compile-time error occurs.
 </P>
 <P>
 When PCRE2 is compiled in EBCDIC mode, \a, \e, \f, \n, \r, and \t
 generate the appropriate EBCDIC code values. The \c escape is processed
 as specified for Perl in the <b>perlebcdic</b> document. The only characters
 that are allowed after \c are A-Z, a-z, or one of @, [, \, ], ^, _, or ?. Any
-other character provokes a compile-time error. The sequence \@ encodes
-character code 0; the letters (in either case) encode characters 1-26 (hex 01
-to hex 1A); [, \, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and
-\? becomes either 255 (hex FF) or 95 (hex 5F).
+other character provokes a compile-time error. The sequence \c@ encodes
+character code 0; after \c the letters (in either case) encode characters 1-26
+(hex 01 to hex 1A); [, \, ], ^, and _ encode characters 27-31 (hex 1B to hex
+1F), and \c? becomes either 255 (hex FF) or 95 (hex 5F).
 </P>
 <P>
-Thus, apart from \?, these escapes generate the same character code values as
+Thus, apart from \c?, these escapes generate the same character code values as
 they do in an ASCII environment, though the meanings of the values mostly
-differ. For example, \G always generates code value 7, which is BEL in ASCII
+differ. For example, \cG always generates code value 7, which is BEL in ASCII
 but DEL in EBCDIC.
 </P>
 <P>
-The sequence \? generates DEL (127, hex 7F) in an ASCII environment, but
+The sequence \c? generates DEL (127, hex 7F) in an ASCII environment, but
 because 127 is not a control character in EBCDIC, Perl makes it generate the
 APC character. Unfortunately, there are several variants of EBCDIC. In most of
 them the APC character has the value 255 (hex FF), but in the one Perl calls
 POSIX-BC its value is 95 (hex 5F). If certain other characters have POSIX-BC
-values, PCRE2 makes \? generate 95; otherwise it generates 255.
+values, PCRE2 makes \c? generate 95; otherwise it generates 255.
 </P>
 <P>
 After \0 up to two further octal digits are read. If there are fewer than two
@@ -471,9 +490,9 @@ a hexadecimal digit appears between \x{ and }, or if there is no terminating
 <P>
 If the PCRE2_ALT_BSUX option is set, the interpretation of \x is as just
 described only when it is followed by two hexadecimal digits. Otherwise, it
-matches a literal "x" character. In this mode mode, support for code points
-greater than 256 is provided by \u, which must be followed by four hexadecimal
-digits; otherwise it matches a literal "u" character.
+matches a literal "x" character. In this mode, support for code points greater
+than 255 is provided by \u, which must be followed by four hexadecimal digits;
+otherwise it matches a literal "u" character.
 </P>
 <P>
 Characters whose value is less than 256 can be defined by either of the two
@@ -488,15 +507,15 @@ Constraints on character values
 Characters that are specified using octal or hexadecimal numbers are
 limited to certain values, as follows:
 <pre>
-  8-bit non-UTF mode    less than 0x100
-  8-bit UTF-8 mode      less than 0x10ffff and a valid codepoint
-  16-bit non-UTF mode   less than 0x10000
-  16-bit UTF-16 mode    less than 0x10ffff and a valid codepoint
-  32-bit non-UTF mode   less than 0x100000000
-  32-bit UTF-32 mode    less than 0x10ffff and a valid codepoint
+  8-bit non-UTF mode    no greater than 0xff
+  16-bit non-UTF mode   no greater than 0xffff
+  32-bit non-UTF mode   no greater than 0xffffffff
+  All UTF modes         no greater than 0x10ffff and a valid codepoint
 </pre>
-Invalid Unicode codepoints are the range 0xd800 to 0xdfff (the so-called
-"surrogate" codepoints), and 0xffef.
+Invalid Unicode codepoints are all those in the range 0xd800 to 0xdfff (the
+so-called "surrogate" codepoints). The check for these can be disabled by the
+caller of <b>pcre2_compile()</b> by setting the option
+PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.
 </P>
 <br><b>
 Escape sequences in character classes
@@ -520,15 +539,15 @@ In Perl, the sequences \l, \L, \u, and \U are recognized by its string
 handler and used to modify the case of following characters. By default, PCRE2
 does not support these escape sequences. However, if the PCRE2_ALT_BSUX option
 is set, \U matches a "U" character, and \u can be used to define a character
-by code point, as described in the previous section.
+by code point, as described above.
 </P>
 <br><b>
 Absolute and relative back references
 </b><br>
 <P>
-The sequence \g followed by an unsigned or a negative number, optionally
-enclosed in braces, is an absolute or relative back reference. A named back
-reference can be coded as \g{name}. Back references are discussed
+The sequence \g followed by a signed or unsigned number, optionally enclosed
+in braces, is an absolute or relative back reference. A named back reference
+can be coded as \g{name}. Back references are discussed
 <a href="#backreferences">later,</a>
 following the discussion of
 <a href="#subpattern">parenthesized subpatterns.</a>
@@ -709,7 +728,9 @@ When PCRE2 is built with Unicode support (the default), three additional escape
 sequences that match characters with specific properties are available. In
 8-bit non-UTF-8 mode, these sequences are of course limited to testing
 characters whose codepoints are less than 256, but they do work in this mode.
-The extra escape sequences are:
+In 32-bit non-UTF mode, codepoints greater than 0x10ffff (the Unicode limit)
+may be encountered. These are all treated as being in the Common script and
+with an unassigned type. The extra escape sequences are:
 <pre>
   \p{<i>xx</i>}   a character with the <i>xx</i> property
   \P{<i>xx</i>}   a character without the <i>xx</i> property
@@ -736,6 +757,7 @@ Those that are not part of an identified script are lumped together as
 "Common". The current list of scripts is:
 </P>
 <P>
+Adlam,
 Ahom,
 Anatolian_Hieroglyphs,
 Arabic,
@@ -746,6 +768,7 @@ Bamum,
 Bassa_Vah,
 Batak,
 Bengali,
+Bhaiksuki,
 Bopomofo,
 Brahmi,
 Braille,
@@ -807,6 +830,8 @@ Mahajani,
 Malayalam,
 Mandaic,
 Manichaean,
+Marchen,
+Masaram_Gondi,
 Meetei_Mayek,
 Mende_Kikakui,
 Meroitic_Cursive,
@@ -819,7 +844,9 @@ Multani,
 Myanmar,
 Nabataean,
 New_Tai_Lue,
+Newa,
 Nko,
+Nushu,
 Ogham,
 Ol_Chiki,
 Old_Hungarian,
@@ -830,6 +857,7 @@ Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
 Oriya,
+Osage,
 Osmanya,
 Pahawh_Hmong,
 Palmyrene,
@@ -847,6 +875,7 @@ Siddham,
 SignWriting,
 Sinhala,
 Sora_Sompeng,
+Soyombo,
 Sundanese,
 Syloti_Nagri,
 Syriac,
@@ -857,6 +886,7 @@ Tai_Tham,
 Tai_Viet,
 Takri,
 Tamil,
+Tangut,
 Telugu,
 Thaana,
 Thai,
@@ -866,7 +896,8 @@ Tirhuta,
 Ugaritic,
 Vai,
 Warang_Citi,
-Yi.
+Yi,
+Zanabazar_Square.
 </P>
 <P>
 Each character has exactly one Unicode general category property, specified by
@@ -972,9 +1003,12 @@ grapheme cluster", and treats the sequence as an atomic group
 <a href="#atomicgroup">(see below).</a>
 Unicode supports various kinds of composite character by giving each character
 a grapheme breaking property, and having rules that use these properties to
-define the boundaries of extended grapheme clusters. \X always matches at
-least one character. Then it decides whether to add additional characters
-according to the following rules for ending a cluster:
+define the boundaries of extended grapheme clusters. The rules are defined in
+Unicode Standard Annex 29, "Unicode Text Segmentation".
+</P>
+<P>
+\X always matches at least one character. Then it decides whether to add
+additional characters according to the following rules for ending a cluster:
 </P>
 <P>
 1. End at the end of the subject string.
@@ -989,13 +1023,27 @@ L, V, LV, or LVT character; an LV or V character may be followed by a V or T
 character; an LVT or T character may be followed only by a T character.
 </P>
 <P>
-4. Do not end before extending characters or spacing marks. Characters with
-the "mark" property always have the "extend" grapheme breaking property.
+4. Do not end before extending characters or spacing marks or the "zero-width
+joiner" characters. Characters with the "mark" property always have the
+"extend" grapheme breaking property.
 </P>
 <P>
 5. Do not end after prepend characters.
 </P>
 <P>
+6. Do not break within emoji modifier sequences (a base character followed by a
+modifier). Extending characters are allowed before the modifier.
+</P>
+<P>
+7. Do not break within emoji zwj sequences (zero-width joiner followed by
+"glue after ZWJ" or "base glue after ZWJ").
+</P>
+<P>
+8. Do not break within emoji flag sequences. That is, do not break between
+regional indicator (RI) characters if there are an odd number of RI characters
+before the break point.
+</P>
+<P>
 9. Otherwise, end the cluster.
 <a name="extraprops"></a></P>
 <br><b>
@@ -1326,13 +1374,33 @@ whatever setting of the PCRE2_DOTALL and PCRE2_MULTILINE options is used. A
 class such as [^a] always matches one of these characters.
 </P>
 <P>
+The character escape sequences \d, \D, \h, \H, \p, \P, \s, \S, \v,
+\V, \w, and \W may appear in a character class, and add the characters that
+they match to the class. For example, [\dABCDEF] matches any hexadecimal
+digit. In UTF modes, the PCRE2_UCP option affects the meanings of \d, \s, \w
+and their upper case partners, just as it does when they appear outside a
+character class, as described in the section entitled
+<a href="#genericchartypes">"Generic character types"</a>
+above. The escape sequence \b has a different meaning inside a character
+class; it matches the backspace character. The sequences \B, \N, \R, and \X
+are not special inside a character class. Like any other unrecognized escape
+sequences, they cause an error.
+</P>
+<P>
 The minus (hyphen) character can be used to specify a range of characters in a
 character class. For example, [d-m] matches any letter between d and m,
 inclusive. If a minus character is required in a class, it must be escaped with
 a backslash or appear in a position where it cannot be interpreted as
-indicating a range, typically as the first or last character in the class, or
-immediately after a range. For example, [b-d-z] matches letters in the range b
-to d, a hyphen character, or z.
+indicating a range, typically as the first or last character in the class,
+or immediately after a range. For example, [b-d-z] matches letters in the range
+b to d, a hyphen character, or z.
+</P>
+<P>
+Perl treats a hyphen as a literal if it appears before or after a POSIX class
+(see below) or before or after a character type escape such as \d or \H.
+However, unless the hyphen is the last character in the class, Perl outputs a
+warning in its warning mode, as this is most likely a user error. As PCRE2 has
+no facility for warning, an error is given in these cases.
 </P>
 <P>
 It is not possible to have the literal character "]" as the end character of a
@@ -1344,16 +1412,14 @@ followed by two other characters. The octal or hexadecimal representation of
 "]" can also be used to end a range.
 </P>
 <P>
-An error is generated if a POSIX character class (see below) or an escape
-sequence other than one that defines a single character appears at a point
-where a range ending character is expected. For example, [z-\xff] is valid,
-but [A-\d] and [A-[:digit:]] are not.
-</P>
-<P>
 Ranges normally include all code points between the start and end characters,
 inclusive. They can also be used for code points specified numerically, for
 example [\000-\037]. Ranges can include any characters that are valid for the
-current mode.
+current mode. In any UTF mode, the so-called "surrogate" characters (those
+whose code points lie between 0xd800 and 0xdfff inclusive) may not be specified
+explicitly by default (the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES option disables
+this check). However, ranges such as [\x{d7ff}-\x{e000}], which include the
+surrogates, are always permitted.
 </P>
 <P>
 There is a special case in EBCDIC environments for ranges whose end points are
@@ -1372,19 +1438,6 @@ tables for a French locale are in use, [\xc8-\xcb] matches accented E
 characters in both cases.
 </P>
 <P>
-The character escape sequences \d, \D, \h, \H, \p, \P, \s, \S, \v,
-\V, \w, and \W may appear in a character class, and add the characters that
-they match to the class. For example, [\dABCDEF] matches any hexadecimal
-digit. In UTF modes, the PCRE2_UCP option affects the meanings of \d, \s, \w
-and their upper case partners, just as it does when they appear outside a
-character class, as described in the section entitled
-<a href="#genericchartypes">"Generic character types"</a>
-above. The escape sequence \b has a different meaning inside a character
-class; it matches the backspace character. The sequences \B, \N, \R, and \X
-are not special inside a character class. Like any other unrecognized escape
-sequences, they cause an error.
-</P>
-<P>
 A circumflex can conveniently be used with the upper case character types to
 specify a more restricted set of characters than the matching lower case type.
 For example, the class [^\W_] matches any letter or digit, but not underscore,
@@ -1526,20 +1579,26 @@ alternative in the subpattern.
 </P>
 <br><a name="SEC13" href="#TOC1">INTERNAL OPTION SETTING</a><br>
 <P>
-The settings of the PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL, and
-PCRE2_EXTENDED options (which are Perl-compatible) can be changed from within
-the pattern by a sequence of Perl option letters enclosed between "(?" and ")".
-The option letters are
+The settings of the PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL,
+PCRE2_EXTENDED, PCRE2_EXTENDED_MORE, and PCRE2_NO_AUTO_CAPTURE options (which
+are Perl-compatible) can be changed from within the pattern by a sequence of
+Perl option letters enclosed between "(?" and ")". The option letters are
 <pre>
   i  for PCRE2_CASELESS
   m  for PCRE2_MULTILINE
+  n  for PCRE2_NO_AUTO_CAPTURE
   s  for PCRE2_DOTALL
   x  for PCRE2_EXTENDED
+  xx for PCRE2_EXTENDED_MORE
 </pre>
 For example, (?im) sets caseless, multiline matching. It is also possible to
-unset these options by preceding the letter with a hyphen, and a combined
-setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS and
-PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
+unset these options by preceding the letter with a hyphen. The two "extended"
+options are not independent; unsetting either one cancels the effects of both
+of them.
+</P>
+<P>
+A combined setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS
+and PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
 permitted. If a letter appears both before and after the hyphen, the option is
 unset. An empty options setting "(?)" is allowed. Needless to say, it has no
 effect.
@@ -1552,13 +1611,8 @@ respectively.
 <P>
 When one of these option changes occurs at top level (that is, not inside
 subpattern parentheses), the change applies to the remainder of the pattern
-that follows. If the change is placed right at the start of a pattern, PCRE2
-extracts it into the global options (and it will therefore show up in data
-extracted by the <b>pcre2_pattern_info()</b> function).
-</P>
-<P>
-An option change within a subpattern (see below for a description of
-subpatterns) affects only that part of the subpattern that follows it, so
+that follows. An option change within a subpattern (see below for a description
+of subpatterns) affects only that part of the subpattern that follows it, so
 <pre>
   (a(?i)b)c
 </pre>
@@ -2093,9 +2147,9 @@ subpattern is possible using named parentheses (see below).
 </P>
 <P>
 Another way of avoiding the ambiguity inherent in the use of digits following a
-backslash is to use the \g escape sequence. This escape must be followed by an
-unsigned number or a negative number, optionally enclosed in braces. These
-examples are all identical:
+backslash is to use the \g escape sequence. This escape must be followed by a
+signed or unsigned number, optionally enclosed in braces. These examples are
+all identical:
 <pre>
   (ring), \1
   (ring), \g1
@@ -2103,8 +2157,7 @@ examples are all identical:
 </pre>
 An unsigned number specifies an absolute reference without the ambiguity that
 is present in the older syntax. It is also useful when literal digits follow
-the reference. A negative number is a relative reference. Consider this
-example:
+the reference. A signed number is a relative reference. Consider this example:
 <pre>
   (abc(def)ghi)\g{-1}
 </pre>
@@ -2115,6 +2168,11 @@ can be helpful in long patterns, and also in patterns that are created by
 joining together fragments that contain references within themselves.
 </P>
 <P>
+The sequence \g{+1} is a reference to the next capturing subpattern. This kind
+of forward reference can be useful in patterns that repeat. Perl does not
+support the use of + in this way.
+</P>
+<P>
 A back reference matches whatever actually matched the capturing subpattern in
 the current subject string, rather than anything matching the subpattern
 itself (see
@@ -2203,15 +2261,27 @@ coded as \b, \B, \A, \G, \Z, \z, ^ and $ are described
 <P>
 More complicated assertions are coded as subpatterns. There are two kinds:
 those that look ahead of the current position in the subject string, and those
-that look behind it. An assertion subpattern is matched in the normal way,
-except that it does not cause the current matching position to be changed.
+that look behind it, and in each case an assertion may be positive (must
+succeed for matching to continue) or negative (must not succeed for matching to
+continue). An assertion subpattern is matched in the normal way, except that,
+when matching continues afterwards, the matching position in the subject string
+is as it was at the start of the assertion.
 </P>
 <P>
-Assertion subpatterns are not capturing subpatterns. If such an assertion
-contains capturing subpatterns within it, these are counted for the purposes of
+Assertion subpatterns are not capturing subpatterns. If an assertion contains
+capturing subpatterns within it, these are counted for the purposes of
 numbering the capturing subpatterns in the whole pattern. However, substring
-capturing is carried out only for positive assertions. (Perl sometimes, but not
-always, does do capturing in negative assertions.)
+capturing is carried out only for positive assertions that succeed, that is,
+one of their branches matches, so matching continues after the assertion. If
+all branches of a positive assertion fail to match, nothing is captured, and
+control is passed to the previous backtracking point.
+</P>
+<P>
+No capturing is done for a negative assertion unless it is being used as a
+condition in a
+<a href="#subpatternsassubroutines">conditional subpattern</a>
+(see the discussion below). Matching continues after a non-conditional negative
+assertion only if all its branches fail to match.
 </P>
 <P>
 For compatibility with Perl, most assertion subpatterns may be repeated; though
@@ -2310,18 +2380,31 @@ match. If there are insufficient characters before the current position, the
 assertion fails.
 </P>
 <P>
-In a UTF mode, PCRE2 does not allow the \C escape (which matches a single code
-unit even in a UTF mode) to appear in lookbehind assertions, because it makes
-it impossible to calculate the length of the lookbehind. The \X and \R
-escapes, which can match different numbers of code units, are also not
-permitted.
+In UTF-8 and UTF-16 modes, PCRE2 does not allow the \C escape (which matches a
+single code unit even in a UTF mode) to appear in lookbehind assertions,
+because it makes it impossible to calculate the length of the lookbehind. The
+\X and \R escapes, which can match different numbers of code units, are never
+permitted in lookbehinds.
 </P>
 <P>
 <a href="#subpatternsassubroutines">"Subroutine"</a>
 calls (see below) such as (?2) or (?&X) are permitted in lookbehinds, as long
-as the subpattern matches a fixed-length string.
-<a href="#recursion">Recursion,</a>
-however, is not supported.
+as the subpattern matches a fixed-length string. However,
+<a href="#recursion">recursion,</a>
+that is, a "subroutine" call into a group that is already active,
+is not supported.
+</P>
+<P>
+Perl does not support back references in lookbehinds. PCRE2 does support them,
+but only if certain conditions are met. The PCRE2_MATCH_UNSET_BACKREF option
+must not be set, there must be no use of (?| in the pattern (it creates
+duplicate subpattern numbers), and if the back reference is by name, the name
+must be unique. Of course, the referenced subpattern must itself be of fixed
+length. The following pattern matches words containing at least two characters
+that begin and end with the same character:
+<pre>
+   \b(\w)\w++(?&#60;=\1)
+</PRE>
 </P>
 <P>
 Possessive quantifiers can be used in conjunction with lookbehind assertions to
@@ -2459,7 +2542,9 @@ Checking for a used subpattern by name
 <P>
 Perl uses the syntax (?(&#60;name&#62;)...) or (?('name')...) to test for a used
 subpattern by name. For compatibility with earlier versions of PCRE1, which had
-this facility before Perl, the syntax (?(name)...) is also recognized.
+this facility before Perl, the syntax (?(name)...) is also recognized. Note,
+however, that undelimited names consisting of the letter R followed by digits
+are ambiguous (see the following section).
 </P>
 <P>
 Rewriting the above example to use a named subpattern gives this:
@@ -2474,30 +2559,52 @@ matched.
 Checking for pattern recursion
 </b><br>
 <P>
-If the condition is the string (R), and there is no subpattern with the name R,
-the condition is true if a recursive call to the whole pattern or any
-subpattern has been made. If digits or a name preceded by ampersand follow the
-letter R, for example:
+"Recursion" in this sense refers to any subroutine-like call from one part of
+the pattern to another, whether or not it is actually recursive. See the
+sections entitled
+<a href="#recursion">"Recursive patterns"</a>
+and
+<a href="#subpatternsassubroutines">"Subpatterns as subroutines"</a>
+below for details of recursion and subpattern calls.
+</P>
+<P>
+If a condition is the string (R), and there is no subpattern with the name R,
+the condition is true if matching is currently in a recursion or subroutine
+call to the whole pattern or any subpattern. If digits follow the letter R, and
+there is no subpattern with that name, the condition is true if the most recent
+call is into a subpattern with the given number, which must exist somewhere in
+the overall pattern. This is a contrived example that is equivalent to a+b:
 <pre>
-  (?(R3)...) or (?(R&name)...)
+  ((?(R1)a+|(?1)b))
 </pre>
-the condition is true if the most recent recursion is into a subpattern whose
-number or name is given. This condition does not check the entire recursion
-stack. If the name used in a condition of this kind is a duplicate, the test is
-applied to all subpatterns of the same name, and is true if any one of them is
-the most recent recursion.
+However, in both cases, if there is a subpattern with a matching name, the
+condition tests for its being set, as described in the section above, instead
+of testing for recursion. For example, creating a group with the name R1 by
+adding (?&#60;R1&#62;) to the above pattern completely changes its meaning.
+</P>
+<P>
+If a name preceded by ampersand follows the letter R, for example:
+<pre>
+  (?(R&name)...)
+</pre>
+the condition is true if the most recent recursion is into a subpattern of that
+name (which must exist within the pattern).
+</P>
+<P>
+This condition does not check the entire recursion stack. It tests only the
+current level. If the name used in a condition of this kind is a duplicate, the
+test is applied to all subpatterns of the same name, and is true if any one of
+them is the most recent recursion.
 </P>
 <P>
 At "top level", all these recursion test conditions are false.
-<a href="#recursion">The syntax for recursive patterns</a>
-is described below.
 <a name="subdefine"></a></P>
 <br><b>
 Defining subpatterns for use by reference only
 </b><br>
 <P>
-If the condition is the string (DEFINE), and there is no subpattern with the
-name DEFINE, the condition is always false. In this case, there may be only one
+If the condition is the string (DEFINE), the condition is always false, even if
+there is a group with the name DEFINE. In this case, there may be only one
 alternative in the subpattern. It is always skipped if control reaches this
 point in the pattern; the idea of DEFINE is that it can be used to define
 subroutines that can be referenced from elsewhere. (The use of
@@ -2552,6 +2659,13 @@ presence of at least one letter in the subject. If a letter is found, the
 subject is matched against the first alternative; otherwise it is matched
 against the second. This pattern matches strings in one of the two forms
 dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
+</P>
+<P>
+When an assertion that is a condition contains capturing subpatterns, any
+capturing that occurs in a matching branch is retained afterwards, for both
+positive and negative assertions, because matching always continues after the
+assertion, whether it succeeds or fails. (Compare non-conditional assertions,
+when captures are retained only for positive assertions that succeed.)
 <a name="comments"></a></P>
 <br><a name="SEC22" href="#TOC1">COMMENTS</a><br>
 <P>
@@ -2724,93 +2838,57 @@ is the actual recursive call.
 Differences in recursion processing between PCRE2 and Perl
 </b><br>
 <P>
-Recursion processing in PCRE2 differs from Perl in two important ways. In PCRE2
-(like Python, but unlike Perl), a recursive subpattern call is always treated
-as an atomic group. That is, once it has matched some of the subject string, it
-is never re-entered, even if it contains untried alternatives and there is a
-subsequent matching failure. This can be illustrated by the following pattern,
-which purports to match a palindromic string that contains an odd number of
-characters (for example, "a", "aba", "abcba", "abcdcba"):
-<pre>
-  ^(.|(.)(?1)\2)$
-</pre>
-The idea is that it either matches a single character, or two identical
-characters surrounding a sub-palindrome. In Perl, this pattern works; in PCRE2
-it does not if the pattern is longer than three characters. Consider the
-subject string "abcba":
+Some former differences between PCRE2 and Perl no longer exist.
 </P>
 <P>
-At the top level, the first character is matched, but as it is not at the end
-of the string, the first alternative fails; the second alternative is taken
-and the recursion kicks in. The recursive call to subpattern 1 successfully
-matches the next character ("b"). (Note that the beginning and end of line
-tests are not part of the recursion).
+Before release 10.30, recursion processing in PCRE2 differed from Perl in that
+a recursive subpattern call was always treated as an atomic group. That is,
+once it had matched some of the subject string, it was never re-entered, even
+if it contained untried alternatives and there was a subsequent matching
+failure. (Historical note: PCRE implemented recursion before Perl did.)
 </P>
 <P>
-Back at the top level, the next character ("c") is compared with what
-subpattern 2 matched, which was "a". This fails. Because the recursion is
-treated as an atomic group, there are now no backtracking points, and so the
-entire match fails. (Perl is able, at this point, to re-enter the recursion and
-try the second alternative.) However, if the pattern is written with the
-alternatives in the other order, things are different:
-<pre>
-  ^((.)(?1)\2|.)$
-</pre>
-This time, the recursing alternative is tried first, and continues to recurse
-until it runs out of characters, at which point the recursion fails. But this
-time we do have another alternative to try at the higher level. That is the big
-difference: in the previous case the remaining alternative is at a deeper
-recursion level, which PCRE2 cannot use.
+Starting with release 10.30, recursive subroutine calls are no longer treated
+as atomic. That is, they can be re-entered to try unused alternatives if there
+is a matching failure later in the pattern. This is now compatible with the way
+Perl works. If you want a subroutine call to be atomic, you must explicitly
+enclose it in an atomic group.
 </P>
 <P>
-To change the pattern so that it matches all palindromic strings, not just
-those with an odd number of characters, it is tempting to change the pattern to
-this:
+Supporting backtracking into recursions simplifies certain types of recursive
+pattern. For example, this pattern matches palindromic strings:
 <pre>
   ^((.)(?1)\2|.?)$
 </pre>
-Again, this works in Perl, but not in PCRE2, and for the same reason. When a
-deeper recursion has matched a single character, it cannot be entered again in
-order to match an empty string. The solution is to separate the two cases, and
-write out the odd and even cases as alternatives at the higher level:
+The second branch in the group matches a single central character in the
+palindrome when there are an odd number of characters, or nothing when there
+are an even number of characters, but in order to work it has to be able to try
+the second case when the rest of the pattern match fails. If you want to match
+typical palindromic phrases, the pattern has to ignore all non-word characters,
+which can be done like this:
 <pre>
-  ^(?:((.)(?1)\2|)|((.)(?3)\4|.))
-</pre>
-If you want to match typical palindromic phrases, the pattern has to ignore all
-non-word characters, which can be done like this:
-<pre>
-  ^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$
+  ^\W*+((.)\W*+(?1)\W*+\2|\W*+.?)\W*+$
 </pre>
 If run with the PCRE2_CASELESS option, this pattern matches phrases such as "A
-man, a plan, a canal: Panama!" and it works in both PCRE2 and Perl. Note the
-use of the possessive quantifier *+ to avoid backtracking into sequences of
-non-word characters. Without this, PCRE2 takes a great deal longer (ten times
-or more) to match typical phrases, and Perl takes so long that you think it has
-gone into a loop.
+man, a plan, a canal: Panama!". Note the use of the possessive quantifier *+ to
+avoid backtracking into sequences of non-word characters. Without this, PCRE2
+takes a great deal longer (ten times or more) to match typical phrases, and
+Perl takes so long that you think it has gone into a loop.
 </P>
 <P>
-<b>WARNING</b>: The palindrome-matching patterns above work only if the subject
-string does not start with a palindrome that is shorter than the entire string.
-For example, although "abcba" is correctly matched, if the subject is "ababa",
-PCRE2 finds the palindrome "aba" at the start, then fails at top level because
-the end of the string does not follow. Once again, it cannot jump back into the
-recursion to try other alternatives, so the entire match fails.
-</P>
-<P>
-The second way in which PCRE2 and Perl differ in their recursion processing is
-in the handling of captured values. In Perl, when a subpattern is called
-recursively or as a subpattern (see the next section), it has no access to any
-values that were captured outside the recursion, whereas in PCRE2 these values
-can be referenced. Consider this pattern:
+Another way in which PCRE2 and Perl used to differ in their recursion
+processing is in the handling of captured values. Formerly in Perl, when a
+subpattern was called recursively or as a subroutine (see the next section), it
+had no access to any values that were captured outside the recursion, whereas
+in PCRE2 these values can be referenced. Consider this pattern:
 <pre>
   ^(.)(\1|a(?2))
 </pre>
-In PCRE2, this pattern matches "bab". The first capturing parentheses match "b",
-then in the second group, when the back reference \1 fails to match "b", the
-second alternative matches "a" and then recurses. In the recursion, \1 does
-now match "b" and so the whole match succeeds. In Perl, the pattern fails to
-match because inside the recursive call \1 cannot access the externally set
-value.
+This pattern matches "bab". The first capturing parentheses match "b", then in
+the second group, when the back reference \1 fails to match "b", the second
+alternative matches "a" and then recurses. In the recursion, \1 does now match
+"b" and so the whole match succeeds. This match used to fail in Perl, but in
+later versions (I tried 5.024) it now works.
 <a name="subpatternsassubroutines"></a></P>
 <br><a name="SEC24" href="#TOC1">SUBPATTERNS AS SUBROUTINES</a><br>
 <P>
@@ -2837,11 +2915,10 @@ is used, it does match "sense and responsibility" as well as the other two
 strings. Another example is given in the discussion of DEFINE above.
 </P>
 <P>
-All subroutine calls, whether recursive or not, are always treated as atomic
-groups. That is, once a subroutine has matched some of the subject string, it
-is never re-entered, even if it contains untried alternatives and there is a
-subsequent matching failure. Any capturing parentheses that are set during the
-subroutine call revert to their previous values afterwards.
+Like recursions, subroutine calls used to be treated as atomic, but this
+changed at PCRE2 release 10.30, so backtracking into subroutine calls can now
+occur. However, any capturing parentheses that are set during the subroutine
+call revert to their previous values afterwards.
 </P>
 <P>
 Processing options such as case-independence are fixed when a subpattern is
@@ -2949,28 +3026,31 @@ The doubling is removed before the string is passed to the callout function.
 <a name="backtrackcontrol"></a></P>
 <br><a name="SEC27" href="#TOC1">BACKTRACKING CONTROL</a><br>
 <P>
-Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which
-are still described in the Perl documentation as "experimental and subject to
-change or removal in a future version of Perl". It goes on to say: "Their usage
-in production code should be noted to avoid problems during upgrades." The same
-remarks apply to the PCRE2 features described in this section.
-</P>
-<P>
-The new verbs make use of what was previously invalid syntax: an opening
-parenthesis followed by an asterisk. They are generally of the form (*VERB) or
-(*VERB:NAME). Some verbs take either form, possibly behaving differently
-depending on whether or not a name is present.
+There are a number of special "Backtracking Control Verbs" (to use Perl's
+terminology) that modify the behaviour of backtracking during matching. They
+are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
+possibly behaving differently depending on whether or not a name is present.
 </P>
 <P>
 By default, for compatibility with Perl, a name is any sequence of characters
 that does not include a closing parenthesis. The name is not processed in
 any way, and it is not possible to include a closing parenthesis in the name.
-However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing
-is applied to verb names and only an unescaped closing parenthesis terminates
-the name. A closing parenthesis can be included in a name either as \) or
-between \Q and \E. If the PCRE2_EXTENDED option is set, unescaped whitespace
-in verb names is skipped and #-comments are recognized, exactly as in the rest
-of the pattern.
+This can be changed by setting the PCRE2_ALT_VERBNAMES option, but the result
+is no longer Perl-compatible.
+</P>
+<P>
+When PCRE2_ALT_VERBNAMES is set, backslash processing is applied to verb names
+and only an unescaped closing parenthesis terminates the name. However, the
+only backslash items that are permitted are \Q, \E, and sequences such as
+\x{100} that define character code points. Character type escapes such as \d
+are faulted.
+</P>
+<P>
+A closing parenthesis can be included in a name either as \) or between \Q
+and \E. In addition to backslash processing, if the PCRE2_EXTENDED option is
+also set, unescaped whitespace in verb names is skipped, and #-comments are
+recognized, exactly as in the rest of the pattern. PCRE2_EXTENDED does not
+affect verb names unless PCRE2_ALT_VERBNAMES is also set.
 </P>
 <P>
 The maximum length of a name is 255 in the 8-bit library and 65535 in the
@@ -2981,7 +3061,7 @@ not there. Any number of these verbs may occur in a pattern.
 <P>
 Since these verbs are specifically related to backtracking, most of them can be
 used only when the pattern is to be matched using the traditional matching
-function, because these use a backtracking algorithm. With the exception of
+function, because that uses a backtracking algorithm. With the exception of
 (*FAIL), which behaves like a failing negative assertion, the backtracking
 control verbs cause an error if encountered by the DFA matching function.
 </P>
@@ -3119,11 +3199,11 @@ Verbs that act after backtracking
 The following verbs do nothing when they are encountered. Matching continues
 with what follows, but if there is no subsequent match, causing a backtrack to
 the verb, a failure is forced. That is, backtracking cannot pass to the left of
-the verb. However, when one of these verbs appears inside an atomic group
-(which includes any group that is called as a subroutine) or in an assertion
-that is true, its effect is confined to that group, because once the group has
-been matched, there is never any backtracking into it. In this situation,
-backtracking has to jump to the left of the entire atomic group or assertion.
+the verb. However, when one of these verbs appears inside an atomic group or in
+an assertion that is true, its effect is confined to that group, because once
+the group has been matched, there is never any backtracking into it. In this
+situation, backtracking has to jump to the left of the entire atomic group or
+assertion.
 </P>
 <P>
 These verbs differ in exactly what kind of failure occurs when backtracking
@@ -3187,8 +3267,8 @@ expressed in any other way. In an anchored pattern (*PRUNE) has the same effect
 as (*COMMIT).
 </P>
 <P>
-The behaviour of (*PRUNE:NAME) is the not the same as (*MARK:NAME)(*PRUNE).
-It is like (*MARK:NAME) in that the name is remembered for passing back to the
+The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
+like (*MARK:NAME) in that the name is remembered for passing back to the
 caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
 ignoring those set by (*PRUNE) or (*THEN).
 <pre>
@@ -3329,28 +3409,34 @@ in the second repeat of the group acts.
 Backtracking verbs in assertions
 </b><br>
 <P>
-(*FAIL) in an assertion has its normal effect: it forces an immediate
-backtrack.
+(*FAIL) in any assertion has its normal effect: it forces an immediate
+backtrack. The behaviour of the other backtracking verbs depends on whether
+the assertion is standalone or acting as the condition in a conditional
+subpattern.
+</P>
+<P>
+(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
+without any further processing; captured strings are retained. In a standalone
+negative assertion, (*ACCEPT) causes the assertion to fail without any further
+processing; captured substrings are discarded.
+</P>
+<P>
+If the assertion is a condition, (*ACCEPT) causes the condition to be true for
+a positive assertion and false for a negative one; captured substrings are
+retained in both cases.
 </P>
 <P>
-(*ACCEPT) in a positive assertion causes the assertion to succeed without any
-further processing. In a negative assertion, (*ACCEPT) causes the assertion to
-fail without any further processing.
+The effect of (*THEN) is not allowed to escape beyond an assertion. If there
+are no more branches to try, (*THEN) causes a positive assertion to be false,
+and a negative assertion to be true.
 </P>
 <P>
 The other backtracking verbs are not treated specially if they appear in a
-positive assertion. In particular, (*THEN) skips to the next alternative in the
-innermost enclosing group that has alternations, whether or not this is within
-the assertion.
-</P>
-<P>
-Negative assertions are, however, different, in order to ensure that changing a
-positive assertion into a negative assertion changes its result. Backtracking
-into (*COMMIT), (*SKIP), or (*PRUNE) causes a negative assertion to be true,
-without considering any further alternative branches in the assertion.
-Backtracking into (*THEN) causes it to skip to the next enclosing alternative
-within the assertion (the normal behaviour), but if the assertion does not have
-such an alternative, (*THEN) behaves like (*PRUNE).
+standalone positive assertion. In a conditional positive assertion,
+backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the condition to be
+false. However, for both standalone and conditional negative assertions,
+backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the assertion to be
+true, without considering any further alternative branches.
 <a name="btsub"></a></P>
 <br><b>
 Backtracking verbs in subroutines
@@ -3393,9 +3479,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 20 June 2016
+Last updated: 12 September 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
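Note on the pcre2pattern.html hunks above: the revised text says that from release 10.30 a recursive call can be re-entered to try unused alternatives, and that the palindrome pattern ^((.)(?1)\2|.?)$ relies on this. The sketch below is a toy model of that one pattern written as a Python generator, not PCRE2 code. The names group1, matches and the atomic flag are hypothetical, and the atomic mode only approximates the pre-10.30 behaviour by keeping the first result of each recursive call.

```python
from itertools import islice

# Toy model of group 1 in ^((.)(?1)\2|.?)$ (hypothetical helper, not PCRE2 code).
# It yields, in backtracking order, every end offset at which the group can
# finish when it starts at offset i.
def group1(s, i, atomic):
    # Branch 1: (.)(?1)\2
    if i < len(s):
        c = s[i]
        inner = group1(s, i + 1, atomic)
        if atomic:
            # Assumed model of an atomic call: only its first result is
            # kept, so the call can never be re-entered later.
            inner = islice(inner, 1)
        for j in inner:
            if s[j:j + 1] == c:      # the back reference \2
                yield j + 1
    # Branch 2: .? (greedy: one character first, then nothing)
    if i < len(s):
        yield i + 1
    yield i

def matches(s, atomic=False):
    # The ^ and $ anchors: group 1 must start at 0 and end at len(s).
    return any(j == len(s) for j in group1(s, 0, atomic))

print(matches("abba"), matches("abba", atomic=True))
```

In this model the even-length subject "abba" matches only when recursive calls can be re-entered, which is the case the documentation describes; an odd-length subject such as "abcba" matches in both modes.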
diff --git a/doc/html/pcre2perform.html b/doc/html/pcre2perform.html
index ac9d23c..28f4f73 100644
--- a/doc/html/pcre2perform.html
+++ b/doc/html/pcre2perform.html
@@ -15,7 +15,7 @@ please consult the man page, in case the conversion went wrong.
 <ul>
 <li><a name="TOC1" href="#SEC1">PCRE2 PERFORMANCE</a>
 <li><a name="TOC2" href="#SEC2">COMPILED PATTERN MEMORY USAGE</a>
-<li><a name="TOC3" href="#SEC3">STACK USAGE AT RUN TIME</a>
+<li><a name="TOC3" href="#SEC3">STACK AND HEAP USAGE AT RUN TIME</a>
 <li><a name="TOC4" href="#SEC4">PROCESSING TIME</a>
 <li><a name="TOC5" href="#SEC5">AUTHOR</a>
 <li><a name="TOC6" href="#SEC6">REVISION</a>
@@ -29,11 +29,11 @@ of them.
 <br><a name="SEC2" href="#TOC1">COMPILED PATTERN MEMORY USAGE</a><br>
 <P>
 Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
-so that most simple patterns do not use much memory. However, there is one case
-where the memory usage of a compiled pattern can be unexpectedly large. If a
-parenthesized subpattern has a quantifier with a minimum greater than 1 and/or
-a limited maximum, the whole subpattern is repeated in the compiled code. For
-example, the pattern
+so that most simple patterns do not use much memory for storing the compiled
+version. However, there is one case where the memory usage of a compiled
+pattern can be unexpectedly large. If a parenthesized subpattern has a
+quantifier with a minimum greater than 1 and/or a limited maximum, the whole
+subpattern is repeated in the compiled code. For example, the pattern
 <pre>
   (abc|def){2,4}
 </pre>
@@ -52,13 +52,13 @@ example, the very simple pattern
 <pre>
   ((ab){1,1000}c){1,3}
 </pre>
-uses 51K bytes when compiled using the 8-bit library. When PCRE2 is compiled
-with its default internal pointer size of two bytes, the size limit on a
-compiled pattern is 64K code units in the 8-bit and 16-bit libraries, and this
-is reached with the above pattern if the outer repetition is increased from 3
-to 4. PCRE2 can be compiled to use larger internal pointers and thus handle
-larger compiled patterns, but it is better to try to rewrite your pattern to
-use less memory if you can.
+uses over 50K bytes when compiled using the 8-bit library. When PCRE2 is
+compiled with its default internal pointer size of two bytes, the size limit on
+a compiled pattern is 64K code units in the 8-bit and 16-bit libraries, and
+this is reached with the above pattern if the outer repetition is increased
+from 3 to 4. PCRE2 can be compiled to use larger internal pointers and thus
+handle larger compiled patterns, but it is better to try to rewrite your
+pattern to use less memory if you can.
 </P>
 <P>
 One way of reducing the memory usage for such patterns is to make use of
@@ -68,25 +68,34 @@ facility. Re-writing the above pattern as
 <pre>
   ((ab)(?2){0,999}c)(?1){0,2}
 </pre>
-reduces the memory requirements to 18K, and indeed it remains under 20K even
-with the outer repetition increased to 100. However, this pattern is not
-exactly equivalent, because the "subroutine" calls are treated as
-<a href="pcre2pattern.html#atomicgroup">atomic groups</a>
-into which there can be no backtracking if there is a subsequent matching
-failure. Therefore, PCRE2 cannot do this kind of rewriting automatically.
-Furthermore, there is a noticeable loss of speed when executing the modified
-pattern. Nevertheless, if the atomic grouping is not a problem and the loss of
-speed is acceptable, this kind of rewriting will allow you to process patterns
-that PCRE2 cannot otherwise handle.
-</P>
-<br><a name="SEC3" href="#TOC1">STACK USAGE AT RUN TIME</a><br>
-<P>
-When <b>pcre2_match()</b> is used for matching, certain kinds of pattern can
-cause it to use large amounts of the process stack. In some environments the
-default process stack is quite small, and if it runs out the result is often
-SIGSEGV. Rewriting your pattern can often help. The
-<a href="pcre2stack.html"><b>pcre2stack</b></a>
-documentation discusses this issue in detail.
+reduces the memory requirements to around 16K, and indeed it remains under 20K
+even with the outer repetition increased to 100. However, this kind of pattern
+is not always exactly equivalent, because any captures within subroutine calls
+are lost when the subroutine completes. If this is not a problem, this kind of
+rewriting will allow you to process patterns that PCRE2 cannot otherwise
+handle. The matching performance of the two different versions of the pattern
+is roughly the same. (This applies from release 10.30 - things were different
+in earlier releases.)
+</P>
+<br><a name="SEC3" href="#TOC1">STACK AND HEAP USAGE AT RUN TIME</a><br>
+<P>
+From release 10.30, the interpretive (non-JIT) version of <b>pcre2_match()</b>
+uses very little system stack at run time. In earlier releases recursive
+function calls could use a great deal of stack, and this could cause problems,
+but this usage has been eliminated. Backtracking positions are now explicitly
+remembered in memory frames controlled by the code. An initial 20K vector of
+frames is allocated on the system stack (enough for about 100 frames for small
+patterns), but if this is insufficient, heap memory is used. The amount of heap
+memory can be limited; if the limit is set to zero, only the initial stack
+vector is used. Rewriting patterns to be time-efficient, as described below,
+may also reduce the memory requirements.
+</P>
+<P>
+In contrast to <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b> does use recursive
+function calls, but only for processing atomic groups, lookaround assertions,
+and recursion within the pattern. Too much nested recursion may cause stack
+issues. The "match depth" parameter can be used to limit the depth of function
+recursion in <b>pcre2_dfa_match()</b>.
 </P>
 <br><a name="SEC4" href="#TOC1">PROCESSING TIME</a><br>
 <P>
@@ -175,7 +184,54 @@ appreciable time with strings longer than about 20 characters.
 </P>
 <P>
 In many cases, the solution to this kind of performance issue is to use an
-atomic group or a possessive quantifier.
+atomic group or a possessive quantifier. This can often reduce memory
+requirements as well. As another example, consider this pattern:
+<pre>
+  ([^&#60;]|&#60;(?!inet))+
+</pre>
+It matches from wherever it starts until it encounters "&#60;inet" or the end of
+the data, and is the kind of pattern that might be used when processing an XML
+file. Each iteration of the outer parentheses matches either one character that
+is not "&#60;" or a "&#60;" that is not followed by "inet". However, each time a
+parenthesis is processed, a backtracking position is passed, so this
+formulation uses a memory frame for each matched character. For a long string,
+a lot of memory is required. Consider now this rewritten pattern, which matches
+exactly the same strings:
+<pre>
+  ([^&#60;]++|&#60;(?!inet))+
+</pre>
+This runs much faster, because sequences of characters that do not contain "&#60;"
+are "swallowed" in one item inside the parentheses, and a possessive quantifier
+is used to stop any backtracking into the runs of non-"&#60;" characters. This
+version also uses a lot less memory because entry to a new set of parentheses
+happens only when a "&#60;" character that is not followed by "inet" is encountered
+(and we assume this is relatively rare).
+</P>
+<P>
+This example shows that one way of optimizing performance when matching long
+subject strings is to write repeated parenthesized subpatterns to match more
+than one character whenever possible.
+</P>
+<br><b>
+SETTING RESOURCE LIMITS
+</b><br>
+<P>
+You can set limits on the amount of processing that takes place when matching,
+and on the amount of heap memory that is used. The default values of the limits
+are very large, and unlikely ever to operate. They can be changed when PCRE2 is
+built, and they can also be set when <b>pcre2_match()</b> or
+<b>pcre2_dfa_match()</b> is called. For details of these interfaces, see the
+<a href="pcre2build.html"><b>pcre2build</b></a>
+documentation and the section entitled
+<a href="pcre2api.html#matchcontext">"The match context"</a>
+in the
+<a href="pcre2api.html"><b>pcre2api</b></a>
+documentation.
+</P>
+<P>
+The <b>pcre2test</b> test program has a modifier called "find_limits" which, if
+applied to a subject line, causes it to find the smallest limits that allow a
+pattern to match. This is done by repeatedly matching with different limits.
 </P>
 <br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
 <P>
@@ -188,9 +244,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 02 January 2015
+Last updated: 08 April 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
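Note on the pcre2perform.html hunks above: the new PROCESSING TIME text argues that ([^<]|<(?!inet))+ passes a backtracking position for every matched character, while ([^<]++|<(?!inet))+ enters the parentheses only once per run of non-"<" characters. The sketch below uses Python's re module, not PCRE2, and simply counts the pieces that each formulation would consume per iteration of the outer group. It is a rough proxy for the number of memory frames, under the assumption that one frame corresponds to one iteration. The sample text is invented, and a plain greedy + stands in for the possessive ++ because findall never backtracks into an earlier piece.

```python
import re

text = "<host>alpha</host><inet>10.0.0.1</inet>"

# Non-capturing form of ([^<]|<(?!inet))+ ; it stops just before "<inet".
prefix = re.match(r"(?:[^<]|<(?!inet))+", text).group()

# Each list element is what one iteration of the outer group consumes.
per_char = re.findall(r"[^<]|<(?!inet)", prefix)   # original formulation
per_run = re.findall(r"[^<]+|<(?!inet)", prefix)   # rewritten formulation

print(len(per_char), len(per_run))   # prints "18 4"
```

The gap between the two counts grows with the length of the runs between "<" characters, which is why the documentation recommends making repeated groups match more than one character where possible.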
diff --git a/doc/html/pcre2posix.html b/doc/html/pcre2posix.html
index 1d5fe63..8a4431c 100644
--- a/doc/html/pcre2posix.html
+++ b/doc/html/pcre2posix.html
@@ -69,7 +69,7 @@ replacement library. Other POSIX options are not even defined.
 <P>
 There are also some options that are not defined by POSIX. These have been
 added at the request of users who want to make use of certain PCRE2-specific
-features via the POSIX calling interface.
+features via the POSIX calling interface or to add BSD or GNU functionality.
 </P>
 <P>
 When PCRE2 is called via these functions, it is only the API that is POSIX-like
@@ -91,10 +91,11 @@ identifying error codes.
 <br><a name="SEC3" href="#TOC1">COMPILING A PATTERN</a><br>
 <P>
 The function <b>regcomp()</b> is called to compile a pattern into an
-internal form. The pattern is a C string terminated by a binary zero, and
-is passed in the argument <i>pattern</i>. The <i>preg</i> argument is a pointer
-to a <b>regex_t</b> structure that is used as a base for storing information
-about the compiled regular expression.
+internal form. By default, the pattern is a C string terminated by a binary
+zero (but see REG_PEND below). The <i>preg</i> argument is a pointer to a
+<b>regex_t</b> structure that is used as a base for storing information about
+the compiled regular expression. (It is also used for input when REG_PEND is
+set.)
 </P>
 <P>
 The argument <i>cflags</i> is either zero, or contains one or more of the bits
@@ -117,6 +118,14 @@ The PCRE2_MULTILINE option is set when the regular expression is passed for
 compilation to the native function. Note that this does <i>not</i> mimic the
 defined POSIX behaviour for REG_NEWLINE (see the following section).
 <pre>
+  REG_NOSPEC
+</pre>
+The PCRE2_LITERAL option is set when the regular expression is passed for
+compilation to the native function. This disables all meta characters in the
+pattern, causing it to be treated as a literal string. The only other options
+that are allowed with REG_NOSPEC are REG_ICASE, REG_NOSUB, REG_PEND, and
+REG_UTF. Note that REG_NOSPEC is not part of the POSIX standard.
+<pre>
   REG_NOSUB
 </pre>
 When a pattern that is compiled with this flag is passed to <b>regexec()</b> for
@@ -125,6 +134,16 @@ captured strings are returned. Versions of the PCRE library prior to 10.22 used
 to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
 because it disables the use of back references.
 <pre>
+  REG_PEND
+</pre>
+If this option is set, the <b>re_endp</b> field in the <i>preg</i> structure
+(which has the type const char *) must be set to point to the character beyond
+the end of the pattern before calling <b>regcomp()</b>. The pattern itself may
+now contain binary zeroes, which are treated as data characters. Without
+REG_PEND, a binary zero terminates the pattern and the <b>re_endp</b> field is
+ignored. This is a GNU extension to the POSIX standard and should be used with
+caution in software intended to be portable to other systems.
+<pre>
   REG_UCP
 </pre>
 The PCRE2_UCP option is set when the regular expression is passed for
@@ -156,9 +175,10 @@ class such as [^a] (they are).
 </P>
 <P>
 The yield of <b>regcomp()</b> is zero on success, and non-zero otherwise. The
-<i>preg</i> structure is filled in on success, and one member of the structure
-is public: <i>re_nsub</i> contains the number of capturing subpatterns in
-the regular expression. Various error codes are defined in the header file.
+<i>preg</i> structure is filled in on success, and one other member of the
+structure (as well as <i>re_endp</i>) is public: <i>re_nsub</i> contains the
+number of capturing subpatterns in the regular expression. Various error codes
+are defined in the header file.
 </P>
 <P>
 NOTE: If the yield of <b>regcomp()</b> is non-zero, you must not attempt to
@@ -228,15 +248,26 @@ function.
 <pre>
   REG_STARTEND
 </pre>
-The string is considered to start at <i>string</i> + <i>pmatch[0].rm_so</i> and
-to have a terminating NUL located at <i>string</i> + <i>pmatch[0].rm_eo</i>
-(there need not actually be a NUL at that location), regardless of the value of
-<i>nmatch</i>. This is a BSD extension, compatible with but not specified by
-IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
-intended to be portable to other systems. Note that a non-zero <i>rm_so</i> does
-not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
-how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL are
-mutually exclusive; the error REG_INVARG is returned.
+When this option is set, the subject string starts at <i>string</i> +
+<i>pmatch[0].rm_so</i> and ends at <i>string</i> + <i>pmatch[0].rm_eo</i>, which
+should point to the first character beyond the string. There may be binary
+zeroes within the subject string, and indeed, using REG_STARTEND is the only
+way to pass a subject string that contains a binary zero.
+</P>
+<P>
+Whatever the value of <i>pmatch[0].rm_so</i>, the offsets of the matched string
+and any captured substrings are still given relative to the start of
+<i>string</i> itself. (Before PCRE2 release 10.30 these were given relative to
+<i>string</i> + <i>pmatch[0].rm_so</i>, but this differs from other
+implementations.)
+</P>
+<P>
+This is a BSD extension, compatible with but not specified by IEEE Standard
+1003.2 (POSIX.2), and should be used with caution in software intended to be
+portable to other systems. Note that a non-zero <i>rm_so</i> does not imply
+REG_NOTBOL; REG_STARTEND affects only the location and length of the string,
+not how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL
+are mutually exclusive; the error REG_INVARG is returned.
 </P>
 <P>
 If the pattern was compiled with the REG_NOSUB flag, no data about any matched
@@ -291,9 +322,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC9" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 31 January 2016
+Last updated: 15 June 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
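Note on the REG_STARTEND hunk above: the following is a minimal sketch, not part of the upstream patch, of how a caller might pass a sub-range of a buffer with REG_STARTEND and read back offsets that are relative to the start of the buffer. The helper name match_range and its parameters are hypothetical. It is written against the system <regex.h>; the assumption is that with the PCRE2 POSIX wrapper you would include pcre2posix.h and link its library instead, and that the offsets then follow the 10.30-and-later behaviour described in the hunk.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match `pattern` against the bytes
   subject[start] .. subject[end-1] by means of REG_STARTEND.
   Returns 0 on a match and stores the match offsets in *so and *eo;
   otherwise returns the regcomp()/regexec() error code. */
int match_range(const char *pattern, const char *subject,
                size_t start, size_t end,
                regoff_t *so, regoff_t *eo)
{
    regex_t re;
    regmatch_t pm[1];

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    /* On input, pmatch[0] delimits the subject: rm_eo points to the
       first character beyond the part of the string to be searched. */
    pm[0].rm_so = (regoff_t)start;
    pm[0].rm_eo = (regoff_t)end;

    rc = regexec(&re, subject, 1, pm, REG_STARTEND);
    if (rc == 0) {
        /* On output, the offsets are relative to `subject` itself,
           not to subject + start. */
        *so = pm[0].rm_so;
        *eo = pm[0].rm_eo;
    }

    regfree(&re);
    return rc;
}
```

Calling match_range("b", "xxabcxx", 2, 5, &so, &eo) searches only "abc". Because the reported offsets count from the start of the whole string, a caller does not need to add the starting offset back in.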
diff --git a/doc/html/pcre2serialize.html b/doc/html/pcre2serialize.html
index edf415a..813b25a 100644
--- a/doc/html/pcre2serialize.html
+++ b/doc/html/pcre2serialize.html
@@ -55,7 +55,10 @@ The facility for saving and restoring compiled patterns is intended for use
 within individual applications. As such, the data supplied to
 <b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from
 arbitrary external sources. There is only some simple consistency checking, not
-complete validation of what is being re-loaded.
+complete validation of what is being re-loaded. Corrupted data may cause
+undefined results. For example, if the length field of a pattern in the
+serialized data is corrupted, the deserializing code may read beyond the end of
+the byte stream that is passed to it.
 </P>
 <br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
 <P>
@@ -190,9 +193,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 24 May 2016
+Last updated: 21 March 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2stack.html b/doc/html/pcre2stack.html
deleted file mode 100644
index 2942c7a..0000000
--- a/doc/html/pcre2stack.html
+++ /dev/null
@@ -1,207 +0,0 @@
-<html>
-<head>
-<title>pcre2stack specification</title>
-</head>
-<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
-<h1>pcre2stack man page</h1>
-<p>
-Return to the <a href="index.html">PCRE2 index page</a>.
-</p>
-<p>
-This page is part of the PCRE2 HTML documentation. It was generated
-automatically from the original man page. If there is any nonsense in it,
-please consult the man page, in case the conversion went wrong.
-<br>
-<br><b>
-PCRE2 DISCUSSION OF STACK USAGE
-</b><br>
-<P>
-When you call <b>pcre2_match()</b>, it makes use of an internal function called
-<b>match()</b>. This calls itself recursively at branch points in the pattern,
-in order to remember the state of the match so that it can back up and try a
-different alternative after a failure. As matching proceeds deeper and deeper
-into the tree of possibilities, the recursion depth increases. The
-<b>match()</b> function is also called in other circumstances, for example,
-whenever a parenthesized sub-pattern is entered, and in certain cases of
-repetition.
-</P>
-<P>
-Not all calls of <b>match()</b> increase the recursion depth; for an item such
-as a* it may be called several times at the same level, after matching
-different numbers of a's. Furthermore, in a number of cases where the result of
-the recursive call would immediately be passed back as the result of the
-current call (a "tail recursion"), the function is just restarted instead.
-</P>
-<P>
-Each time the internal <b>match()</b> function is called recursively, it uses
-memory from the process stack. For certain kinds of pattern and data, very
-large amounts of stack may be needed, despite the recognition of "tail
-recursion". Note that if PCRE2 is compiled with the -fsanitize=address option
-of the GCC compiler, the stack requirements are greatly increased.
-</P>
-<P>
-The above comments apply when <b>pcre2_match()</b> is run in its normal
-interpretive manner. If the compiled pattern was processed by
-<b>pcre2_jit_compile()</b>, and just-in-time compiling was successful, and the
-options passed to <b>pcre2_match()</b> were not incompatible, the matching
-process uses the JIT-compiled code instead of the <b>match()</b> function. In
-this case, the memory requirements are handled entirely differently. See the
-<a href="pcre2jit.html"><b>pcre2jit</b></a>
-documentation for details.
-</P>
-<P>
-The <b>pcre2_dfa_match()</b> function operates in a different way to
-<b>pcre2_match()</b>, and uses recursion only when there is a regular expression
-recursion or subroutine call in the pattern. This includes the processing of
-assertion and "once-only" subpatterns, which are handled like subroutine calls.
-Normally, these are never very deep, and the limit on the complexity of
-<b>pcre2_dfa_match()</b> is controlled by the amount of workspace it is given.
-However, it is possible to write patterns with runaway infinite recursions;
-such patterns will cause <b>pcre2_dfa_match()</b> to run out of stack. At
-present, there is no protection against this.
-</P>
-<P>
-The comments that follow do NOT apply to <b>pcre2_dfa_match()</b>; they are
-relevant only for <b>pcre2_match()</b> without the JIT optimization.
-</P>
-<br><b>
-Reducing <b>pcre2_match()</b>'s stack usage
-</b><br>
-<P>
-You can often reduce the amount of recursion, and therefore the
-amount of stack used, by modifying the pattern that is being matched. Consider,
-for example, this pattern:
-<pre>
-  ([^&#60;]|&#60;(?!inet))+
-</pre>
-It matches from wherever it starts until it encounters "&#60;inet" or the end of
-the data, and is the kind of pattern that might be used when processing an XML
-file. Each iteration of the outer parentheses matches either one character that
-is not "&#60;" or a "&#60;" that is not followed by "inet". However, each time a
-parenthesis is processed, a recursion occurs, so this formulation uses a stack
-frame for each matched character. For a long string, a lot of stack is
-required. Consider now this rewritten pattern, which matches exactly the same
-strings:
-<pre>
-  ([^&#60;]++|&#60;(?!inet))+
-</pre>
-This uses very much less stack, because runs of characters that do not contain
-"&#60;" are "swallowed" in one item inside the parentheses. Recursion happens only
-when a "&#60;" character that is not followed by "inet" is encountered (and we
-assume this is relatively rare). A possessive quantifier is used to stop any
-backtracking into the runs of non-"&#60;" characters, but that is not related to
-stack usage.
-</P>
-<P>
-This example shows that one way of avoiding stack problems when matching long
-subject strings is to write repeated parenthesized subpatterns to match more
-than one character whenever possible.
-</P>
-<br><b>
-Compiling PCRE2 to use heap instead of stack for <b>pcre2_match()</b>
-</b><br>
-<P>
-In environments where stack memory is constrained, you might want to compile
-PCRE2 to use heap memory instead of stack for remembering back-up points when
-<b>pcre2_match()</b> is running. This makes it run more slowly, however. Details
-of how to do this are given in the
-<a href="pcre2build.html"><b>pcre2build</b></a>
-documentation. When built in this way, instead of using the stack, PCRE2
-gets memory for remembering backup points from the heap. By default, the memory
-is obtained by calling the system <b>malloc()</b> function, but you can arrange
-to supply your own memory management function. For details, see the section
-entitled
-<a href="pcre2api.html#matchcontext">"The match context"</a>
-in the
-<a href="pcre2api.html"><b>pcre2api</b></a>
-documentation. Since the block sizes are always the same, it may be possible to
-implement customized a memory handler that is more efficient than the standard
-function. The memory blocks obtained for this purpose are retained and re-used
-if possible while <b>pcre2_match()</b> is running. They are all freed just
-before it exits.
-</P>
-<br><b>
-Limiting <b>pcre2_match()</b>'s stack usage
-</b><br>
-<P>
-You can set limits on the number of times the internal <b>match()</b> function
-is called, both in total and recursively. If a limit is exceeded,
-<b>pcre2_match()</b> returns an error code. Setting suitable limits should
-prevent it from running out of stack. The default values of the limits are very
-large, and unlikely ever to operate. They can be changed when PCRE2 is built,
-and they can also be set when <b>pcre2_match()</b> is called. For details of
-these interfaces, see the
-<a href="pcre2build.html"><b>pcre2build</b></a>
-documentation and the section entitled
-<a href="pcre2api.html#matchcontext">"The match context"</a>
-in the
-<a href="pcre2api.html"><b>pcre2api</b></a>
-documentation.
-</P>
-<P>
-As a very rough rule of thumb, you should reckon on about 500 bytes per
-recursion. Thus, if you want to limit your stack usage to 8Mb, you should set
-the limit at 16000 recursions. A 64Mb stack, on the other hand, can support
-around 128000 recursions.
-</P>
-<P>
-The <b>pcre2test</b> test program has a modifier called "find_limits" which, if
-applied to a subject line, causes it to find the smallest limits that allow a a
-pattern to match. This is done by calling <b>pcre2_match()</b> repeatedly with
-different limits.
-</P>
-<br><b>
-Changing stack size in Unix-like systems
-</b><br>
-<P>
-In Unix-like environments, there is not often a problem with the stack unless
-very long strings are involved, though the default limit on stack size varies
-from system to system. Values from 8Mb to 64Mb are common. You can find your
-default limit by running the command:
-<pre>
-  ulimit -s
-</pre>
-Unfortunately, the effect of running out of stack is often SIGSEGV, though
-sometimes a more explicit error message is given. You can normally increase the
-limit on stack size by code such as this:
-<pre>
-  struct rlimit rlim;
-  getrlimit(RLIMIT_STACK, &rlim);
-  rlim.rlim_cur = 100*1024*1024;
-  setrlimit(RLIMIT_STACK, &rlim);
-</pre>
-This reads the current limits (soft and hard) using <b>getrlimit()</b>, then
-attempts to increase the soft limit to 100Mb using <b>setrlimit()</b>. You must
-do this before calling <b>pcre2_match()</b>.
-</P>
-<br><b>
-Changing stack size in Mac OS X
-</b><br>
-<P>
-Using <b>setrlimit()</b>, as described above, should also work on Mac OS X. It
-is also possible to set a stack size when linking a program. There is a
-discussion about stack sizes in Mac OS X at this web site:
-<a href="http://developer.apple.com/qa/qa2005/qa1419.html">http://developer.apple.com/qa/qa2005/qa1419.html.</a>
-</P>
-<br><b>
-AUTHOR
-</b><br>
-<P>
-Philip Hazel
-<br>
-University Computing Service
-<br>
-Cambridge, England.
-<br>
-</P>
-<br><b>
-REVISION
-</b><br>
-<P>
-Last updated: 21 November 2014
-<br>
-Copyright &copy; 1997-2014 University of Cambridge.
-<br>
-<p>
-Return to the <a href="index.html">PCRE2 index page</a>.
-</p>
diff --git a/doc/html/pcre2syntax.html b/doc/html/pcre2syntax.html
index 7fdc0dc..9098f47 100644
--- a/doc/html/pcre2syntax.html
+++ b/doc/html/pcre2syntax.html
@@ -430,18 +430,21 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
   (?i)            caseless
   (?J)            allow duplicate names
   (?m)            multiline
+  (?n)            no auto capture
   (?s)            single line (dotall)
   (?U)            default ungreedy (lazy)
-  (?x)            extended (ignore white space)
+  (?x)            extended: ignore white space except in classes
+  (?xx)           as (?x) but also ignore space and tab in classes
   (?-...)         unset option(s)
 </pre>
 The following are recognized only at the very start of a pattern or after one
 of the newline or \R options with similar syntax. More than one of them may
-appear.
+appear. For the first three, d is a decimal number.
 <pre>
-  (*LIMIT_MATCH=d) set the match limit to d (decimal number)
-  (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
-  (*NOTEMPTY)     set PCRE2_NOTEMPTY when matching
+  (*LIMIT_DEPTH=d) set the backtracking limit to d
+  (*LIMIT_HEAP=d)  set the heap size limit to d kilobytes
+  (*LIMIT_MATCH=d) set the match limit to d
+  (*NOTEMPTY)      set PCRE2_NOTEMPTY when matching
   (*NOTEMPTY_ATSTART) set PCRE2_NOTEMPTY_ATSTART when matching
   (*NO_AUTO_POSSESS) no auto-possessification (PCRE2_NO_AUTO_POSSESS)
   (*NO_DOTSTAR_ANCHOR) no .* anchoring (PCRE2_NO_DOTSTAR_ANCHOR)
@@ -450,10 +453,11 @@ appear.
   (*UTF)          set appropriate UTF mode for the library in use
   (*UCP)          set PCRE2_UCP (use Unicode properties for \d etc)
 </pre>
-Note that LIMIT_MATCH and LIMIT_RECURSION can only reduce the value of the
-limits set by the caller of pcre2_match(), not increase them. The application
-can lock out the use of (*UTF) and (*UCP) by setting the PCRE2_NEVER_UTF or
-PCRE2_NEVER_UCP options, respectively, at compile time.
+Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce the value of
+the limits set by the caller of <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>,
+not increase them. LIMIT_RECURSION is an obsolete synonym for LIMIT_DEPTH. The
+application can lock out the use of (*UTF) and (*UCP) by setting the
+PCRE2_NEVER_UTF or PCRE2_NEVER_UCP options, respectively, at compile time.
 </P>
 <br><a name="SEC17" href="#TOC1">NEWLINE CONVENTION</a><br>
 <P>
@@ -465,6 +469,7 @@ settings with a similar syntax.
   (*CRLF)         carriage return followed by linefeed
   (*ANYCRLF)      all three of the above
   (*ANY)          any Unicode newline sequence
+  (*NUL)          the NUL character (binary zero)
 </PRE>
 </P>
 <br><a name="SEC18" href="#TOC1">WHAT \R MATCHES</a><br>
@@ -492,6 +497,9 @@ Each top-level branch of a look behind must be of a fixed length.
   \n              reference by number (can be ambiguous)
   \gn             reference by number
   \g{n}           reference by number
+  \g+n            relative reference by number (PCRE2 extension)
+  \g-n            relative reference by number
+  \g{+n}          relative reference by number (PCRE2 extension)
   \g{-n}          relative reference by number
   \k&#60;name&#62;        reference by name (Perl)
   \k'name'        reference by name (Perl)
@@ -530,14 +538,17 @@ Each top-level branch of a look behind must be of a fixed length.
   (?(-n)              relative reference condition
   (?(&#60;name&#62;)          named reference condition (Perl)
   (?('name')          named reference condition (Perl)
-  (?(name)            named reference condition (PCRE2)
+  (?(name)            named reference condition (PCRE2, deprecated)
   (?(R)               overall recursion condition
-  (?(Rn)              specific group recursion condition
-  (?(R&name)          specific recursion condition
+  (?(Rn)              specific numbered group recursion condition
+  (?(R&name)          specific named group recursion condition
   (?(DEFINE)          define subpattern for reference
   (?(VERSION[&#62;]=n.m)  test PCRE2 version
   (?(assert)          assertion condition
-</PRE>
+</pre>
+Note the ambiguity of (?(R) and (?(Rn) which might be named reference
+conditions or recursion tests. Such a condition is interpreted as a reference
+condition if the relevant named group exists.
 </P>
 <br><a name="SEC23" href="#TOC1">BACKTRACKING CONTROL</a><br>
 <P>
@@ -589,9 +600,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC27" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 16 October 2015
+Last updated: 17 June 2017
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2test.html b/doc/html/pcre2test.html
index 17b308e..7d98d90 100644
--- a/doc/html/pcre2test.html
+++ b/doc/html/pcre2test.html
@@ -61,7 +61,7 @@ subject is processed, and what output is produced.
 <P>
 As the original fairly simple PCRE library evolved, it acquired many different
 features, and as a result, the original <b>pcretest</b> program ended up with a
-lot of options in a messy, arcane syntax, for testing all the features. The
+lot of options in a messy, arcane syntax for testing all the features. The
 move to the new PCRE2 API provided an opportunity to re-implement the test
 program as <b>pcre2test</b>, with a cleaner modifier syntax. Nevertheless, there
 are still many obscure modifiers, some of which are specifically designed for
@@ -77,32 +77,62 @@ strings that are encoded in 8-bit, 16-bit, or 32-bit code units. One, two, or
 all three of these libraries may be simultaneously installed. The
 <b>pcre2test</b> program can be used to test all the libraries. However, its own
 input and output are always in 8-bit format. When testing the 16-bit or 32-bit
-libraries, patterns and subject strings are converted to 16- or 32-bit format
-before being passed to the library functions. Results are converted back to
-8-bit code units for output.
+libraries, patterns and subject strings are converted to 16-bit or 32-bit
+format before being passed to the library functions. Results are converted back
+to 8-bit code units for output.
 </P>
 <P>
 In the rest of this document, the names of library functions and structures
 are given in generic form, for example, <b>pcre_compile()</b>. The actual
 names used in the libraries have a suffix _8, _16, or _32, as appropriate.
-</P>
+<a name="inputencoding"></a></P>
 <br><a name="SEC3" href="#TOC1">INPUT ENCODING</a><br>
 <P>
 Input to <b>pcre2test</b> is processed line by line, either by calling the C
-library's <b>fgets()</b> function, or via the <b>libreadline</b> library (see
-below). The input is processed using using C's string functions, so must not
-contain binary zeroes, even though in Unix-like environments, <b>fgets()</b>
-treats any bytes other than newline as data characters. In some Windows
-environments character 26 (hex 1A) causes an immediate end of file, and no
-further data is read.
+library's <b>fgets()</b> function, or via the <b>libreadline</b> library. In some
+Windows environments character 26 (hex 1A) causes an immediate end of file, and
+no further data is read, so this character should be avoided unless you really
+want that action.
+</P>
+<P>
+The input is processed using C's string functions, so must not
+contain binary zeros, even though in Unix-like environments, <b>fgets()</b>
+treats any bytes other than newline as data characters. An error is generated
+if a binary zero is encountered. By default subject lines are processed for
+backslash escapes, which makes it possible to include any data value in strings
+that are passed to the library for matching. For patterns, there is a facility
+for specifying some or all of the 8-bit input characters as hexadecimal pairs,
+which makes it possible to include binary zeros.
+</P>
+<br><b>
+Input for the 16-bit and 32-bit libraries
+</b><br>
+<P>
+When testing the 16-bit or 32-bit libraries, there is a need to be able to
+generate character code points greater than 255 in the strings that are passed
+to the library. For subject lines, backslash escapes can be used. In addition,
+when the <b>utf</b> modifier (see
+<a href="#optionmodifiers">"Setting compilation options"</a>
+below) is set, the pattern and any following subject lines are interpreted as
+UTF-8 strings and translated to UTF-16 or UTF-32 as appropriate.
 </P>
 <P>
-For maximum portability, therefore, it is safest to avoid non-printing
-characters in <b>pcre2test</b> input files. There is a facility for specifying
-some or all of a pattern's characters as hexadecimal pairs, thus making it
-possible to include binary zeroes in a pattern for testing purposes. Subject
-lines are processed for backslash escapes, which makes it possible to include
-any data value.
+For non-UTF testing of wide characters, the <b>utf8_input</b> modifier can be
+used. This is mutually exclusive with <b>utf</b>, and is allowed only in 16-bit
+or 32-bit mode. It causes the pattern and following subject lines to be treated
+as UTF-8 according to the original definition (RFC 2279), which allows for
+character values up to 0x7fffffff. Each character is placed in one 16-bit or
+32-bit code unit (in the 16-bit case, values greater than 0xffff cause an error
+to occur).
+</P>
+<P>
+UTF-8 (in its original definition) is not capable of encoding values greater
+than 0x7fffffff, but such values can be handled by the 32-bit library. When
+testing this library in non-UTF mode with <b>utf8_input</b> set, if any
+character is preceded by the byte 0xff (which is an illegal byte in UTF-8),
+0x80000000 is added to the character's value. This is the only way of passing
+such code points in a pattern string. For subject strings, using an escape
+sequence is preferable.
 </P>
 <br><a name="SEC4" href="#TOC1">COMMAND LINE OPTIONS</a><br>
 <P>
@@ -124,15 +154,27 @@ the 32-bit library has been built, this is the default. If the 32-bit library
 has not been built, this option causes an error.
 </P>
 <P>
+<b>-ac</b>
+Behave as if each pattern has the <b>auto_callout</b> modifier, that is, insert
+automatic callouts into every pattern that is compiled.
+</P>
+<P>
+<b>-AC</b>
+As for <b>-ac</b>, but in addition behave as if each subject line has the
+<b>callout_extra</b> modifier, that is, show additional information from
+callouts.
+</P>
+<P>
 <b>-b</b>
-Behave as if each pattern has the <b>/fullbincode</b> modifier; the full
+Behave as if each pattern has the <b>fullbincode</b> modifier; the full
 internal binary form of the pattern is output after compilation.
 </P>
 <P>
 <b>-C</b>
 Output the version number of the PCRE2 library, and all available information
 about the optional features that are included, and then exit with zero exit
-code. All other options are ignored.
+code. All other options are ignored. If both -C and -LM are present, whichever
+is first is recognized.
 </P>
 <P>
 <b>-C</b> <i>option</i>
@@ -147,7 +189,7 @@ following options output the value and set the exit code as indicated:
   linksize   the configured internal link size (2, 3, or 4)
                exit code is set to the link size
   newline    the default newline setting:
-               CR, LF, CRLF, ANYCRLF, or ANY
+               CR, LF, CRLF, ANYCRLF, ANY, or NUL
                exit code is always 0
   bsr        the default setting for what \R matches:
                ANYCRLF or ANY
@@ -191,7 +233,7 @@ Output a brief summary these options and then exit.
 </P>
 <P>
 <b>-i</b>
-Behave as if each pattern has the <b>/info</b> modifier; information about the
+Behave as if each pattern has the <b>info</b> modifier; information about the
 compiled pattern is given after compilation.
 </P>
 <P>
@@ -200,6 +242,18 @@ Behave as if each pattern line has the <b>jit</b> modifier; after successful
 compilation, each pattern is passed to the just-in-time compiler, if available.
 </P>
 <P>
+<b>-jitverify</b>
+Behave as if each pattern line has the <b>jitverify</b> modifier; after
+successful compilation, each pattern is passed to the just-in-time compiler, if
+available, and the use of JIT is verified.
+</P>
+<P>
+<b>-LM</b>
+List modifiers: write a list of available pattern and subject modifiers to the
+standard output, then exit with zero exit code. All other options are ignored.
+If both -C and -LM are present, whichever is first is recognized.
+</P>
+<P>
 \fB-pattern\fB <i>modifier-list</i>
 Behave as if each pattern line contains the given modifiers.
 </P>
@@ -326,8 +380,8 @@ when PCRE2 is compiled with either CR or CRLF as the default newline.
 </P>
 <P>
 The #newline_default command specifies a list of newline types that are
-acceptable as the default. The types must be one of CR, LF, CRLF, ANYCRLF, or
-ANY (in upper or lower case), for example:
+acceptable as the default. The types must be one of CR, LF, CRLF, ANYCRLF,
+ANY, or NUL (in upper or lower case), for example:
 <pre>
   #newline_default LF Any anyCRLF
 </pre>
@@ -341,8 +395,9 @@ of the standard test input files.
 <P>
 When the POSIX API is being tested there is no way to override the default
 newline convention, though it is possible to set the newline convention from
-within the pattern. A warning is given if the <b>posix</b> modifier is used when
-<b>#newline_default</b> would set a default for the non-POSIX API.
+within the pattern. A warning is given if the <b>posix</b> or <b>posix_nosub</b>
+modifier is used when <b>#newline_default</b> would set a default for the
+non-POSIX API.
 <pre>
   #pattern &#60;modifier-list&#62;
 </pre>
@@ -438,8 +493,9 @@ A pattern can be followed by a modifier list (details below).
 <P>
 Before each subject line is passed to <b>pcre2_match()</b> or
 <b>pcre2_dfa_match()</b>, leading and trailing white space is removed, and the
-line is scanned for backslash escapes. The following provide a means of
-encoding non-printing characters in a visible way:
+line is scanned for backslash escapes, unless the <b>subject_literal</b>
+modifier was set for the pattern. The following provide a means of encoding
+non-printing characters in a visible way:
 <pre>
   \a         alarm (BEL, \x07)
   \b         backspace (\x08)
@@ -507,6 +563,12 @@ the very last character in the line is a backslash (and there is no modifier
 list), it is ignored. This gives a way of passing an empty line as data, since
 a real empty line terminates the data input.
 </P>
+<P>
+If the <b>subject_literal</b> modifier is set for a pattern, all subject lines
+that follow are treated as literals, with no special treatment of backslashes.
+No replication is possible, and any subject modifiers must be set as defaults
+by a <b>#subject</b> command.
+</P>
 <br><a name="SEC10" href="#TOC1">PATTERN MODIFIERS</a><br>
 <P>
 There are several types of modifier that can appear in pattern lines. Except
@@ -518,29 +580,42 @@ by a previous <b>#pattern</b> command.
 Setting compilation options
 </b><br>
 <P>
-The following modifiers set options for <b>pcre2_compile()</b>. The most common
-ones have single-letter abbreviations. See
+The following modifiers set options for <b>pcre2_compile()</b>. Most of them set
+bits in the options argument of that function, but those whose names start with
+PCRE2_EXTRA are additional options that are set in the compile context. For the
+main options, there are some single-letter abbreviations that are the same as
+Perl options. There is special handling for /x: if a second x is present,
+PCRE2_EXTENDED is converted into PCRE2_EXTENDED_MORE as in Perl. A third
+appearance adds PCRE2_EXTENDED as well, though this makes no difference to the
+way <b>pcre2_compile()</b> behaves. See
 <a href="pcre2api.html"><b>pcre2api</b></a>
-for a description of their effects.
+for a description of the effects of these options.
 <pre>
       allow_empty_class         set PCRE2_ALLOW_EMPTY_CLASS
+      allow_surrogate_escapes   set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
       alt_bsux                  set PCRE2_ALT_BSUX
       alt_circumflex            set PCRE2_ALT_CIRCUMFLEX
       alt_verbnames             set PCRE2_ALT_VERBNAMES
       anchored                  set PCRE2_ANCHORED
       auto_callout              set PCRE2_AUTO_CALLOUT
+      bad_escape_is_literal     set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
   /i  caseless                  set PCRE2_CASELESS
       dollar_endonly            set PCRE2_DOLLAR_ENDONLY
   /s  dotall                    set PCRE2_DOTALL
       dupnames                  set PCRE2_DUPNAMES
+      endanchored               set PCRE2_ENDANCHORED
   /x  extended                  set PCRE2_EXTENDED
+  /xx extended_more             set PCRE2_EXTENDED_MORE
       firstline                 set PCRE2_FIRSTLINE
+      literal                   set PCRE2_LITERAL
+      match_line                set PCRE2_EXTRA_MATCH_LINE
       match_unset_backref       set PCRE2_MATCH_UNSET_BACKREF
+      match_word                set PCRE2_EXTRA_MATCH_WORD
   /m  multiline                 set PCRE2_MULTILINE
       never_backslash_c         set PCRE2_NEVER_BACKSLASH_C
       never_ucp                 set PCRE2_NEVER_UCP
       never_utf                 set PCRE2_NEVER_UTF
-      no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
+  /n  no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
       no_auto_possess           set PCRE2_NO_AUTO_POSSESS
       no_dotstar_anchor         set PCRE2_NO_DOTSTAR_ANCHOR
       no_start_optimize         set PCRE2_NO_START_OPTIMIZE
@@ -553,19 +628,27 @@ for a description of their effects.
 As well as turning on the PCRE2_UTF option, the <b>utf</b> modifier causes all
 non-printing characters in output strings to be printed using the \x{hh...}
 notation. Otherwise, those less than 0x100 are output in hex without the curly
-brackets.
+brackets. Setting <b>utf</b> in 16-bit or 32-bit mode also causes pattern and
+subject strings to be translated to UTF-16 or UTF-32, respectively, before
+being passed to library functions.
 <a name="controlmodifiers"></a></P>
 <br><b>
 Setting compilation controls
 </b><br>
 <P>
 The following modifiers affect the compilation process or request information
-about the pattern:
+about the pattern. There are single-letter abbreviations for some that are
+heavily used in the test files.
 <pre>
       bsr=[anycrlf|unicode]     specify \R handling
   /B  bincode                   show binary code without lengths
       callout_info              show callout information
+      convert=&#60;options&#62;         request foreign pattern conversion
+      convert_glob_escape=c     set glob escape character
+      convert_glob_separator=c  set glob separator character
+      convert_length            set convert buffer length
       debug                     same as info,fullbincode
+      framesize                 show matching frame size
       fullbincode               show binary code with lengths
   /I  info                      show info about compiled pattern
       hex                       unquoted characters are hexadecimal
@@ -583,7 +666,10 @@ about the pattern:
       push                      push compiled pattern onto the stack
       pushcopy                  push a copy onto the stack
       stackguard=&#60;number&#62;       test the stackguard feature
+      subject_literal           treat all subject lines as literal
       tables=[0|1|2]            select internal tables
+      use_length                do not zero-terminate the pattern
+      utf8_input                treat input as UTF-8
 </pre>
 The effects of these modifiers are described in the following sections.
 </P>
@@ -599,7 +685,7 @@ is built, with the default default being Unicode.
 <P>
 The <b>newline</b> modifier specifies which characters are to be interpreted as
 newlines, both in the pattern and in subject lines. The type must be one of CR,
-LF, CRLF, ANYCRLF, or ANY (in upper or lower case).
+LF, CRLF, ANYCRLF, ANY, or NUL (in upper or lower case).
 </P>
 <br><b>
 Information about a pattern
@@ -651,6 +737,11 @@ not necessarily the last character. These lines are omitted if no starting or
 ending code units are recorded.
 </P>
 <P>
+The <b>framesize</b> modifier shows the size, in bytes, of the storage frames
+used by <b>pcre2_match()</b> for handling backtracking. The size depends on the
+number of capturing parentheses in the pattern.
+</P>
+<P>
 The <b>callout_info</b> modifier requests information about all the callouts in
 the pattern. A list of them is output at the end of any other information that
 is requested. For each callout, either its number or string is given, followed
@@ -684,13 +775,36 @@ nine characters, only two of which are specified in hexadecimal:
   /ab "literal" 32/hex
 </pre>
 Either single or double quotes may be used. There is no way of including
-the delimiter within a substring.
+the delimiter within a substring. The <b>hex</b> and <b>expand</b> modifiers are
+mutually exclusive.
+</P>
+<br><b>
+Specifying the pattern's length
+</b><br>
+<P>
+By default, patterns are passed to the compiling functions as zero-terminated
+strings, but they can instead be passed with an explicit length. The
+<b>use_length</b> modifier causes this to happen. Using a length happens
+automatically (whether or not <b>use_length</b> is set) when <b>hex</b> is set,
+because patterns specified in hexadecimal may contain binary zeros.
 </P>
 <P>
-By default, <b>pcre2test</b> passes patterns as zero-terminated strings to
-<b>pcre2_compile()</b>, giving the length as PCRE2_ZERO_TERMINATED. However, for
-patterns specified with the <b>hex</b> modifier, the actual length of the
-pattern is passed.
+If <b>hex</b> or <b>use_length</b> is used with the POSIX wrapper API (see
+<a href="#posixwrapper">"Using the POSIX wrapper API"</a>
+below), the REG_PEND extension is used to pass the pattern's length.
+</P>
+<br><b>
+Specifying wide characters in 16-bit and 32-bit modes
+</b><br>
+<P>
+In 16-bit and 32-bit modes, all input is automatically treated as UTF-8 and
+translated to UTF-16 or UTF-32 when the <b>utf</b> modifier is set. For testing
+the 16-bit and 32-bit libraries in non-UTF mode, the <b>utf8_input</b> modifier
+can be used. It is mutually exclusive with <b>utf</b>. Input lines are
+interpreted as UTF-8 as a means of specifying wide characters. More details are
+given in
+<a href="#inputencoding">"Input encoding"</a>
+above.
 </P>
 <br><b>
 Generating long repetitive patterns
@@ -708,7 +822,8 @@ are expanded before the pattern is passed to <b>pcre2_compile()</b>. For
 example, \[AB]{6000} is expanded to "ABAB..." 6000 times. This construction
 cannot be nested. An initial "\[" sequence is recognized only if "]{" followed
 by decimal digits and "}" is found later in the pattern. If not, the characters
-remain in the pattern unaltered.
+remain in the pattern unaltered. The <b>expand</b> and <b>hex</b> modifiers are
+mutually exclusive.
 </P>
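The expansion rule described above can be illustrated with a small, self-contained C sketch. It is not taken from pcre2test: the function name expand_repeats is hypothetical, and the way a "\[" is paired with a later "]{digits}" is one reading of the text. A "\[" with no such closing sequence is copied through unaltered, and nesting is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not pcre2test's code): returns a malloc'd copy of
   `in` with each \[text]{count} construct replaced by `text` repeated
   `count` times, or NULL if memory runs out. The caller frees the result. */
char *expand_repeats(const char *in)
{
    assert(in != NULL);
    size_t cap = strlen(in) + 1, len = 0;
    char *out = malloc(cap);
    if (out == NULL) return NULL;

    while (*in != '\0') {
        const char *body = NULL, *rbr = NULL, *end = NULL;
        unsigned long count = 0;

        if (in[0] == '\\' && in[1] == '[') {
            body = in + 2;
            /* Search for "]{" followed by decimal digits and "}". */
            for (rbr = strstr(body, "]{"); rbr != NULL;
                 rbr = strstr(rbr + 1, "]{")) {
                char *stop;
                if (!isdigit((unsigned char)rbr[2])) continue;
                count = strtoul(rbr + 2, &stop, 10);
                if (*stop == '}') { end = stop + 1; break; }
            }
        }

        /* Either one expansion, or a single character copied unaltered. */
        const char *src = (end != NULL) ? body : in;
        size_t chunk = (end != NULL) ? (size_t)(rbr - body) : 1;
        size_t reps = (end != NULL) ? (size_t)count : 1;
        size_t need = len + chunk * reps + 1; /* overflow unchecked here */

        if (need > cap) {
            char *bigger = realloc(out, need * 2);
            if (bigger == NULL) { free(out); return NULL; }
            out = bigger;
            cap = need * 2;
        }
        for (size_t i = 0; i < reps; i++) {
            memcpy(out + len, src, chunk);
            len += chunk;
        }
        in = (end != NULL) ? end : in + 1;
    }
    out[len] = '\0';
    return out;
}
```

Under these assumptions, calling expand_repeats on the C string "\\[AB]{3}" yields "ABABAB", while "\\[AB]" on its own is returned unchanged because no "]{" follows.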
 <P>
 If part of an expanded pattern looks like an expansion, but is really part of
@@ -737,7 +852,7 @@ modifier in "Subject Modifiers"
 for details of how these options are specified for each match attempt.
 </P>
 <P>
-JIT compilation is requested by the <b>/jit</b> pattern modifier, which may
+JIT compilation is requested by the <b>jit</b> pattern modifier, which may
 optionally be followed by an equals sign and a number in the range 0 to 7.
 The three bits that make up the number specify which of the three JIT operating
 modes are to be compiled:
@@ -746,7 +861,7 @@ modes are to be compiled:
   2  compile JIT code for soft partial matching
   4  compile JIT code for hard partial matching
 </pre>
-The possible values for the <b>/jit</b> modifier are therefore:
+The possible values for the <b>jit</b> modifier are therefore:
 <pre>
   0  disable JIT
   1  normal matching only
@@ -761,7 +876,7 @@ to <b>pcre2_match()</b> with either the PCRE2_PARTIAL_SOFT or the
 PCRE2_PARTIAL_HARD option set. Note that such a call may return a complete
 match; the options enable the possibility of a partial match, but do not
 require it. Note also that if you request JIT compilation only for partial
-matching (for example, /jit=2) but do not set the <b>partial</b> modifier on a
+matching (for example, jit=2) but do not set the <b>partial</b> modifier on a
 subject line, that match will not use JIT code because none was compiled for
 non-partial matching.
 </P>
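As a rough illustration of how the three bits of the jit value relate to the kind of match being attempted, here is a minimal C sketch. The enum names and the function jit_covers are local to this example and are not part of PCRE2 or pcre2test; only the bit values 1, 2, and 4 come from the table above.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values from the table above; the names are local to this sketch. */
enum { JIT_BIT_NORMAL = 1, JIT_BIT_SOFT = 2, JIT_BIT_HARD = 4 };

/* The kind of match a subject line asks for. */
enum match_kind { MATCH_NORMAL, MATCH_PARTIAL_SOFT, MATCH_PARTIAL_HARD };

/* Returns true if a pattern compiled with jit=<jit_value> has JIT code
   for the requested kind of match. */
bool jit_covers(int jit_value, enum match_kind kind)
{
    assert(jit_value >= 0 && jit_value <= 7);
    switch (kind) {
    case MATCH_NORMAL:       return (jit_value & JIT_BIT_NORMAL) != 0;
    case MATCH_PARTIAL_SOFT: return (jit_value & JIT_BIT_SOFT) != 0;
    case MATCH_PARTIAL_HARD: return (jit_value & JIT_BIT_HARD) != 0;
    }
    return false;
}
```

With jit=2, jit_covers(2, MATCH_NORMAL) is false, which corresponds to the case described above where a subject line without the partial modifier does not use JIT code.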
@@ -792,14 +907,14 @@ code was actually used in the match.
 Setting a locale
 </b><br>
 <P>
-The <b>/locale</b> modifier must specify the name of a locale, for example:
+The <b>locale</b> modifier must specify the name of a locale, for example:
 <pre>
   /pattern/locale=fr_FR
 </pre>
 The given locale is set, <b>pcre2_maketables()</b> is called to build a set of
 character tables for the locale, and this is then passed to
 <b>pcre2_compile()</b> when compiling the regular expression. The same tables
-are used when matching the following subject lines. The <b>/locale</b> modifier
+are used when matching the following subject lines. The <b>locale</b> modifier
 applies only to the pattern on which it appears, but can be given in a
 <b>#pattern</b> command if a default is needed. Setting a locale and alternate
 character tables are mutually exclusive.
@@ -808,7 +923,7 @@ character tables are mutually exclusive.
 Showing pattern memory
 </b><br>
 <P>
-The <b>/memory</b> modifier causes the size in bytes of the memory used to hold
+The <b>memory</b> modifier causes the size in bytes of the memory used to hold
 the compiled pattern to be output. This does not include the size of the
 <b>pcre2_code</b> block; it is just the actual compiled data. If the pattern is
 subsequently passed to the JIT compiler, the size of the JIT compiled code is
@@ -838,12 +953,12 @@ The <b>max_pattern_length</b> modifier sets a limit, in code units, to the
 length of pattern that <b>pcre2_compile()</b> will accept. Breaching the limit
 causes a compilation error. The default is the largest number a PCRE2_SIZE
 variable can hold (essentially unlimited).
-</P>
+<a name="posixwrapper"></a></P>
 <br><b>
 Using the POSIX wrapper API
 </b><br>
 <P>
-The <b>/posix</b> and <b>posix_nosub</b> modifiers cause <b>pcre2test</b> to call
+The <b>posix</b> and <b>posix_nosub</b> modifiers cause <b>pcre2test</b> to call
 PCRE2 via the POSIX wrapper API rather than its native API. When
 <b>posix_nosub</b> is used, the POSIX option REG_NOSUB is passed to
 <b>regcomp()</b>. The POSIX wrapper supports only the 8-bit library. Note that
@@ -873,11 +988,16 @@ The <b>aftertext</b> and <b>allaftertext</b> subject modifiers work as described
 below. All other modifiers are either ignored, with a warning message, or cause
 an error.
 </P>
+<P>
+The pattern is passed to <b>regcomp()</b> as a zero-terminated string by
+default, but if the <b>use_length</b> or <b>hex</b> modifiers are set, the
+REG_PEND extension is used to pass it by length.
+</P>
 <br><b>
 Testing the stack guard feature
 </b><br>
 <P>
-The <b>/stackguard</b> modifier is used to test the use of
+The <b>stackguard</b> modifier is used to test the use of
 <b>pcre2_set_compile_recursion_guard()</b>, a function that is provided to
 enable stack availability to be checked during compilation (see the
 <a href="pcre2api.html"><b>pcre2api</b></a>
@@ -892,7 +1012,7 @@ be aborted.
 Using alternative character tables
 </b><br>
 <P>
-The value specified for the <b>/tables</b> modifier must be one of the digits 0,
+The value specified for the <b>tables</b> modifier must be one of the digits 0,
 1, or 2. It causes a specific set of built-in character tables to be passed to
 <b>pcre2_compile()</b>. This is used in the PCRE2 tests to check behaviour with
 different character tables. The digit specifies the tables as follows:
@@ -910,17 +1030,19 @@ are mutually exclusive.
 Setting certain match controls
 </b><br>
 <P>
-The following modifiers are really subject modifiers, and are described below.
-However, they may be included in a pattern's modifier list, in which case they
-are applied to every subject line that is processed with that pattern. They may
-not appear in <b>#pattern</b> commands. These modifiers do not affect the
-compilation process.
+The following modifiers are really subject modifiers, and are described under
+"Subject Modifiers" below. However, they may be included in a pattern's
+modifier list, in which case they are applied to every subject line that is
+processed with that pattern. These modifiers do not affect the compilation
+process.
 <pre>
       aftertext                  show text after match
       allaftertext               show text after captures
       allcaptures                show all captures
       allusedtext                show all consulted text
+      altglobal                  alternative global matching
   /g  global                     global matching
+      jitstack=&#60;n&#62;               set size of JIT stack
       mark                       show mark values
       replace=&#60;string&#62;           specify a replacement string
       startchar                  show starting character when relevant
@@ -933,6 +1055,15 @@ These modifiers may not appear in a <b>#pattern</b> command. If you want them as
 defaults, set them in a <b>#subject</b> command.
 </P>
 <br><b>
+Specifying literal subject lines
+</b><br>
+<P>
+If the <b>subject_literal</b> modifier is present on a pattern, all the subject
+lines that it matches are taken as literal strings, with no interpretation of
+backslashes. It is not possible to set subject modifiers on such lines, but any
+that are set as defaults by a <b>#subject</b> command are recognized.
+</P>
+<br><b>
 Saving a compiled pattern
 </b><br>
 <P>
@@ -941,7 +1072,8 @@ pushed onto a stack of compiled patterns, and <b>pcre2test</b> expects the next
 line to contain a new pattern (or a command) instead of a subject line. This
 facility is used when saving compiled patterns to a file, as described in the
 section entitled "Saving and restoring compiled patterns"
-<a href="#saverestore">below. If <b>pushcopy</b> is used instead of <b>push</b>, a copy of the compiled</a>
+<a href="#saverestore">below.</a>
+If <b>pushcopy</b> is used instead of <b>push</b>, a copy of the compiled
 pattern is stacked, leaving the original as current, ready to match the
 following input lines. This provides a way of testing the
 <b>pcre2_code_copy()</b> function.
@@ -951,6 +1083,41 @@ are ignored (for the stacked copy), with a warning message, except for
 <b>replace</b>, which causes an error. Note that <b>jitverify</b>, which is
 allowed, does not carry through to any subsequent matching that uses a stacked
 pattern.
+</P>
+<br><b>
+Testing foreign pattern conversion
+</b><br>
+<P>
+The experimental foreign pattern conversion functions in PCRE2 can be tested by
+setting the <b>convert</b> modifier. Its argument is a colon-separated list of
+options, which set the equivalent option for the <b>pcre2_pattern_convert()</b>
+function:
+<pre>
+  glob                    PCRE2_CONVERT_GLOB
+  glob_no_starstar        PCRE2_CONVERT_GLOB_NO_STARSTAR
+  glob_no_wild_separator  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
+  posix_basic             PCRE2_CONVERT_POSIX_BASIC
+  posix_extended          PCRE2_CONVERT_POSIX_EXTENDED
+  unset                   Unset all options
+</pre>
+The "unset" value is useful for turning off a default that has been set by a
+<b>#pattern</b> command. When one of these options is set, the input pattern is
+passed to <b>pcre2_pattern_convert()</b>. If the conversion is successful, the
+result is reflected in the output and then passed to <b>pcre2_compile()</b>. The
+normal <b>utf</b> and <b>no_utf_check</b> options, if set, cause the
+PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options to be passed to
+<b>pcre2_pattern_convert()</b>.
+</P>
+<P>
+By default, the conversion function is allowed to allocate a buffer for its
+output. However, if the <b>convert_length</b> modifier is set to a value greater
+than zero, <b>pcre2test</b> passes a buffer of the given length. This makes it
+possible to test the length check.
+</P>
+<P>
+The <b>convert_glob_escape</b> and <b>convert_glob_separator</b> modifiers can be
+used to specify the escape and separator characters for glob processing,
+overriding the defaults, which are operating-system dependent.
 <a name="subjectmodifiers"></a></P>
 <br><a name="SEC11" href="#TOC1">SUBJECT MODIFIERS</a><br>
 <P>
@@ -967,6 +1134,7 @@ The following modifiers set options for <b>pcre2_match()</b> or
 for a description of their effects.
 <pre>
       anchored                  set PCRE2_ANCHORED
+      endanchored               set PCRE2_ENDANCHORED
       dfa_restart               set PCRE2_DFA_RESTART
       dfa_shortest              set PCRE2_DFA_SHORTEST
       no_jit                    set PCRE2_NO_JIT
@@ -982,11 +1150,26 @@ The partial matching modifiers are provided with abbreviations because they
 appear frequently in tests.
 </P>
 <P>
-If the <b>/posix</b> modifier was present on the pattern, causing the POSIX
-wrapper API to be used, the only option-setting modifiers that have any effect
-are <b>notbol</b>, <b>notempty</b>, and <b>noteol</b>, causing REG_NOTBOL,
-REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to <b>regexec()</b>.
-The other modifiers are ignored, with a warning message.
+If the <b>posix</b> or <b>posix_nosub</b> modifier was present on the pattern,
+causing the POSIX wrapper API to be used, the only option-setting modifiers
+that have any effect are <b>notbol</b>, <b>notempty</b>, and <b>noteol</b>,
+causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to
+<b>regexec()</b>. The other modifiers are ignored, with a warning message.
+</P>
+<P>
+There is one additional modifier that can be used with the POSIX wrapper. It is
+ignored (with a warning) if used for non-POSIX matching.
+<pre>
+      posix_startend=&#60;n&#62;[:&#60;m&#62;]
+</pre>
+This causes the subject string to be passed to <b>regexec()</b> using the
+REG_STARTEND option, which uses offsets to specify which part of the string is
+searched. If only one number is given, the end offset is passed as the end of
+the subject string. For more detail of REG_STARTEND, see the
+<a href="pcre2posix.html"><b>pcre2posix</b></a>
+documentation. If the subject string contains binary zeros (coded as escapes
+such as \x{00} because <b>pcre2test</b> does not support actual binary zeros in
+its input), you must use <b>posix_startend</b> to specify its length.
 </P>
 <br><b>
 Setting match controls
@@ -1004,23 +1187,28 @@ pattern.
       altglobal                  alternative global matching
       callout_capture            show captures at callout time
       callout_data=&#60;n&#62;           set a value to pass via callouts
+      callout_error=&#60;n&#62;[:&#60;m&#62;]    control callout error
+      callout_extra              show extra callout information
       callout_fail=&#60;n&#62;[:&#60;m&#62;]     control callout failure
+      callout_no_where           do not show position of a callout
       callout_none               do not supply a callout function
       copy=&#60;number or name&#62;      copy captured substring
+      depth_limit=&#60;n&#62;            set a depth limit
       dfa                        use <b>pcre2_dfa_match()</b>
-      find_limits                find match and recursion limits
+      find_limits                find match and depth limits
       get=&#60;number or name&#62;       extract captured substring
       getall                     extract all captured substrings
   /g  global                     global matching
+      heap_limit=&#60;n&#62;             set a limit on heap memory
       jitstack=&#60;n&#62;               set size of JIT stack
       mark                       show mark values
       match_limit=&#60;n&#62;            set a match limit
-      memory                     show memory usage
+      memory                     show heap memory usage
       null_context               match with a NULL context
       offset=&#60;n&#62;                 set starting offset
       offset_limit=&#60;n&#62;           set offset limit
       ovector=&#60;n&#62;                set size of output vector
-      recursion_limit=&#60;n&#62;        set a recursion limit
+      recursion_limit=&#60;n&#62;        obsolete synonym for depth_limit
       replace=&#60;string&#62;           specify a replacement string
       startchar                  show startchar when relevant
       startoffset=&#60;n&#62;            same as offset=&#60;n&#62;
@@ -1098,29 +1286,17 @@ Testing callouts
 </b><br>
 <P>
 A callout function is supplied when <b>pcre2test</b> calls the library matching
-functions, unless <b>callout_none</b> is specified. If <b>callout_capture</b> is
-set, the current captured groups are output when a callout occurs.
-</P>
-<P>
-The <b>callout_fail</b> modifier can be given one or two numbers. If there is
-only one number, 1 is returned instead of 0 when a callout of that number is
-reached. If two numbers are given, 1 is returned when callout &#60;n&#62; is reached
-for the &#60;m&#62;th time. Note that callouts with string arguments are always given
-the number zero. See "Callouts" below for a description of the output when a
-callout it taken.
-</P>
-<P>
-The <b>callout_data</b> modifier can be given an unsigned or a negative number.
-This is set as the "user data" that is passed to the matching function, and
-passed back when the callout function is invoked. Any value other than zero is
-used as a return from <b>pcre2test</b>'s callout function.
+functions, unless <b>callout_none</b> is specified. Its behaviour can be
+controlled by various modifiers listed above whose names begin with
+<b>callout_</b>. Details are given in the section entitled "Callouts"
+<a href="#callouts">below.</a>
 </P>
 <br><b>
 Finding all matches in a string
 </b><br>
 <P>
 Searching for all possible matches within a subject can be requested by the
-<b>global</b> or <b>/altglobal</b> modifier. After finding a match, the matching
+<b>global</b> or <b>altglobal</b> modifier. After finding a match, the matching
 function is called again to search the remainder of the subject. The difference
 between <b>global</b> and <b>altglobal</b> is that the former uses the
 <i>start_offset</i> argument to <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>
@@ -1242,41 +1418,47 @@ Setting the JIT stack size
 <P>
 The <b>jitstack</b> modifier provides a way of setting the maximum stack size
 that is used by the just-in-time optimization code. It is ignored if JIT
-optimization is not being used. The value is a number of kilobytes. Providing a
-stack that is larger than the default 32K is necessary only for very
-complicated patterns.
+optimization is not being used. The value is a number of kilobytes. Setting
+zero reverts to the default of 32K. Providing a stack that is larger than the
+default is necessary only for very complicated patterns. If <b>jitstack</b> is
+set non-zero on a subject line it overrides any value that was set on the
+pattern.
 </P>
 <br><b>
-Setting match and recursion limits
+Setting heap, match, and depth limits
 </b><br>
 <P>
-The <b>match_limit</b> and <b>recursion_limit</b> modifiers set the appropriate
-limits in the match context. These values are ignored when the
+The <b>heap_limit</b>, <b>match_limit</b>, and <b>depth_limit</b> modifiers set
+the appropriate limits in the match context. These values are ignored when the
 <b>find_limits</b> modifier is specified.
 </P>
 <br><b>
 Finding minimum limits
 </b><br>
 <P>
-If the <b>find_limits</b> modifier is present, <b>pcre2test</b> calls
-<b>pcre2_match()</b> several times, setting different values in the match
-context via <b>pcre2_set_match_limit()</b> and <b>pcre2_set_recursion_limit()</b>
-until it finds the minimum values for each parameter that allow
-<b>pcre2_match()</b> to complete without error.
+If the <b>find_limits</b> modifier is present on a subject line, <b>pcre2test</b>
+calls the relevant matching function several times, setting different values in
+the match context via <b>pcre2_set_heap_limit()</b>, <b>pcre2_set_match_limit()</b>,
+or <b>pcre2_set_depth_limit()</b> until it finds the minimum value of each
+parameter that allows the match to complete without error.
 </P>
 <P>
 If JIT is being used, only the match limit is relevant. If DFA matching is
-being used, neither limit is relevant, and this modifier is ignored (with a
-warning message).
+being used, only the depth limit is relevant.
 </P>
 <P>
 The <i>match_limit</i> number is a measure of the amount of backtracking
 that takes place, and learning the minimum value can be instructive. For most
 simple matches, the number is quite small, but for patterns with very large
 numbers of matching possibilities, it can become large very quickly with
-increasing length of subject string. The <i>match_limit_recursion</i> number is
-a measure of how much stack (or, if PCRE2 is compiled with NO_RECURSE, how much
-heap) memory is needed to complete the match attempt.
+increasing length of subject string.
+</P>
+<P>
+For non-DFA matching, the minimum <i>depth_limit</i> number is a measure of how
+much nested backtracking happens (that is, how deeply the pattern's tree is
+searched). In the case of DFA matching, <i>depth_limit</i> controls the depth of
+recursive calls of the internal function that is used for handling pattern
+recursion, lookaround assertions, and atomic groups.
 </P>
 <br><b>
 Showing MARK names
@@ -1292,8 +1474,15 @@ is added to the non-match message.
 Showing memory usage
 </b><br>
 <P>
-The <b>memory</b> modifier causes <b>pcre2test</b> to log all memory allocation
-and freeing calls that occur during a match operation.
+The <b>memory</b> modifier causes <b>pcre2test</b> to log the sizes of all heap
+memory allocation and freeing calls that occur during a call to
+<b>pcre2_match()</b>. These occur only when a match requires a bigger vector
+than the default for remembering backtracking points. In many cases there will
+be no heap memory used and therefore no additional output. No heap memory is
+allocated during matching with <b>pcre2_dfa_match()</b> or with JIT, so in those
+cases the <b>memory</b> modifier never has any effect. For this modifier to
+work, the <b>null_context</b> modifier must not be set on both the pattern and
+the subject, though it can be set on one or the other.
 </P>
 <br><b>
 Setting a starting offset
@@ -1337,8 +1526,8 @@ Passing the subject as zero-terminated
 By default, the subject string is passed to a native API matching function with
 its correct length. In order to test the facility for passing a zero-terminated
 string, the <b>zero_terminate</b> modifier is provided. It causes the length to
-be passed as PCRE2_ZERO_TERMINATED. (When matching via the POSIX interface,
-this modifier has no effect, as there is no facility for passing a length.)
+be passed as PCRE2_ZERO_TERMINATED. When matching via the POSIX interface,
+this modifier is ignored, with a warning.
 </P>
 <P>
 When testing <b>pcre2_substitute()</b>, this modifier also has the effect of
@@ -1393,7 +1582,7 @@ code unit offset of the start of the failing character is also output. Here is
 an example of an interactive <b>pcre2test</b> run.
 <pre>
   $ pcre2test
-  PCRE2 version 9.00 2014-05-10
+  PCRE2 version 10.22 2016-07-29
 
     re&#62; /^abc(\d+)/
   data&#62; abc123
@@ -1420,7 +1609,7 @@ unset substring is shown as "&#60;unset&#62;", as for the second data line.
 If the strings contain any non-printing characters, they are output as \xhh
 escapes if the value is less than 256 and UTF mode is not set. Otherwise they
 are output as \x{hh...} escapes. See below for the definition of non-printing
-characters. If the <b>/aftertext</b> modifier is set, the output for substring
+characters. If the <b>aftertext</b> modifier is set, the output for substring
 0 is followed by the rest of the subject string, identified by "0+" like
 this:
 <pre>
@@ -1508,28 +1697,14 @@ restart the match with additional subject data by means of the
 For further information about partial matching, see the
 <a href="pcre2partial.html"><b>pcre2partial</b></a>
 documentation.
-</P>
+<a name="callouts"></a></P>
 <br><a name="SEC16" href="#TOC1">CALLOUTS</a><br>
 <P>
 If the pattern contains any callout requests, <b>pcre2test</b>'s callout
-function is called during matching unless <b>callout_none</b> is specified.
-This works with both matching functions.
-</P>
-<P>
-The callout function in <b>pcre2test</b> returns zero (carry on matching) by
-default, but you can use a <b>callout_fail</b> modifier in a subject line (as
-described above) to change this and other parameters of the callout.
-</P>
-<P>
-Inserting callouts can be helpful when using <b>pcre2test</b> to check
-complicated regular expressions. For further information about callouts, see
-the
-<a href="pcre2callout.html"><b>pcre2callout</b></a>
-documentation.
-</P>
-<P>
-The output for callouts with numerical arguments and those with string
-arguments is slightly different.
+function is called during matching unless <b>callout_none</b> is specified. This
+works with both matching functions, and with JIT, though there are some
+differences in behaviour. The output for callouts with numerical arguments and
+those with string arguments is slightly different.
 </P>
 <br><b>
 Callouts with numerical arguments
@@ -1551,7 +1726,7 @@ callout is in a lookbehind assertion.
 </P>
 <P>
 Callouts numbered 255 are assumed to be automatic callouts, inserted as a
-result of the <b>/auto_callout</b> pattern modifier. In this case, instead of
+result of the <b>auto_callout</b> pattern modifier. In this case, instead of
 showing the callout number, the offset in the pattern, preceded by a plus, is
 output. For example:
 <pre>
@@ -1604,6 +1779,107 @@ example:
 
 </PRE>
 </P>
+<br><b>
+Callout modifiers
+</b><br>
+<P>
+The callout function in <b>pcre2test</b> returns zero (carry on matching) by
+default, but you can use a <b>callout_fail</b> modifier in a subject line to
+change this and other parameters of the callout (see below).
+</P>
+<P>
+If the <b>callout_capture</b> modifier is set, the current captured groups are
+output when a callout occurs. This is useful only for non-DFA matching, as
+<b>pcre2_dfa_match()</b> does not support capturing, so no captures are ever
+shown.
+</P>
+<P>
+The normal callout output, showing the callout number or pattern offset (as
+described above) is suppressed if the <b>callout_no_where</b> modifier is set.
+</P>
+<P>
+When using the interpretive matching function <b>pcre2_match()</b> without JIT,
+setting the <b>callout_extra</b> modifier causes additional output from
+<b>pcre2test</b>'s callout function to be generated. For the first callout in a
+match attempt at a new starting position in the subject, "New match attempt" is
+output. If there has been a backtrack since the last callout (or start of
+matching if this is the first callout), "Backtrack" is output, followed by "No
+other matching paths" if the backtrack ended the previous match attempt. For
+example:
+<pre>
+   re&#62; /(a+)b/auto_callout,no_start_optimize,no_auto_possess
+  data&#62; aac\=callout_extra
+  New match attempt
+  ---&#62;aac
+   +0 ^       (
+   +1 ^       a+
+   +3 ^ ^     )
+   +4 ^ ^     b
+  Backtrack
+  ---&#62;aac
+   +3 ^^      )
+   +4 ^^      b
+  Backtrack
+  No other matching paths
+  New match attempt
+  ---&#62;aac
+   +0  ^      (
+   +1  ^      a+
+   +3  ^^     )
+   +4  ^^     b
+  Backtrack
+  No other matching paths
+  New match attempt
+  ---&#62;aac
+   +0   ^     (
+   +1   ^     a+
+  Backtrack
+  No other matching paths
+  New match attempt
+  ---&#62;aac
+   +0    ^    (
+   +1    ^    a+
+  No match
+</pre>
+Notice that various optimizations must be turned off if you want all possible
+matching paths to be scanned. If <b>no_start_optimize</b> is not used, there is
+an immediate "no match", without any callouts, because the starting
+optimization fails to find "b" in the subject, which it knows must be present
+for any match. If <b>no_auto_possess</b> is not used, the "a+" item is turned
+into "a++", which reduces the number of backtracks.
+</P>
+<P>
+The <b>callout_extra</b> modifier has no effect if used with the DFA matching
+function, or with JIT.
+</P>
+<br><b>
+Return values from callouts
+</b><br>
+<P>
+The default return from the callout function is zero, which allows matching to
+continue. The <b>callout_fail</b> modifier can be given one or two numbers. If
+there is only one number, 1 is returned instead of 0 (causing matching to
+backtrack) when a callout of that number is reached. If two numbers (&#60;n&#62;:&#60;m&#62;)
+are given, 1 is returned when callout &#60;n&#62; is reached and there have been at
+least &#60;m&#62; callouts. The <b>callout_error</b> modifier is similar, except that
+PCRE2_ERROR_CALLOUT is returned, causing the entire matching process to be
+aborted. If both these modifiers are set for the same callout number,
+<b>callout_error</b> takes precedence. Note that callouts with string arguments
+are always given the number zero.
+</P>
+<P>
+The <b>callout_data</b> modifier can be given an unsigned or a negative number.
+This is set as the "user data" that is passed to the matching function, and
+passed back when the callout function is invoked. Any value other than zero is
+used as a return from <b>pcre2test</b>'s callout function.
+</P>
+<P>
+Inserting callouts can be helpful when using <b>pcre2test</b> to check
+complicated regular expressions. For further information about callouts, see
+the
+<a href="pcre2callout.html"><b>pcre2callout</b></a>
+documentation.
+</P>
 <br><a name="SEC17" href="#TOC1">NON-PRINTING CHARACTERS</a><br>
 <P>
 When <b>pcre2test</b> is outputting text in the compiled version of a pattern,
@@ -1613,7 +1889,7 @@ therefore shown as hex escapes.
 <P>
 When <b>pcre2test</b> is outputting text that is a matched part of a subject
 string, it behaves in the same way, unless a different locale has been set for
-the pattern (using the <b>/locale</b> modifier). In this case, the
+the pattern (using the <b>locale</b> modifier). In this case, the
 <b>isprint()</b> function is used to distinguish printing and non-printing
 characters.
 <a name="saverestore"></a></P>
@@ -1706,9 +1982,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC21" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 06 July 2016
+Last updated: 21 December 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/html/pcre2unicode.html b/doc/html/pcre2unicode.html
index 6ca367f..448a221 100644
--- a/doc/html/pcre2unicode.html
+++ b/doc/html/pcre2unicode.html
@@ -47,7 +47,7 @@ and
 documentation. Only the short names for properties are supported. For example,
 \p{L} matches a letter. Its Perl synonym, \p{Letter}, is not supported.
 Furthermore, in Perl, many properties may optionally be prefixed by "Is", for
-compatibility with Perl 5.6. PCRE does not support this.
+compatibility with Perl 5.6. PCRE2 does not support this.
 </P>
 <br><b>
 WIDE CHARACTERS AND UTF MODES
@@ -109,10 +109,15 @@ However, the special horizontal and vertical white space matching escapes (\h,
 \H, \v, and \V) do match all the appropriate Unicode characters, whether or
 not PCRE2_UCP is set.
 </P>
+<br><b>
+CASE-EQUIVALENCE IN UTF MODES
+</b><br>
 <P>
-Case-insensitive matching in UTF mode makes use of Unicode properties. A few
-Unicode characters such as Greek sigma have more than two codepoints that are
-case-equivalent, and these are treated as such.
+Case-insensitive matching in a UTF mode makes use of Unicode properties except
+for characters whose code points are less than 128 and that have at most two
+case-equivalent values. For these, a direct table lookup is used for speed. A
+few Unicode characters such as Greek sigma have more than two codepoints that
+are case-equivalent, and these are treated as such.
 </P>
 <br><b>
 VALIDITY OF UTF STRINGS
@@ -173,6 +178,15 @@ or <b>pcre2_dfa_match()</b>.
 <P>
 If you pass an invalid UTF string when PCRE2_NO_UTF_CHECK is set, the result
 is undefined and your program may crash or loop indefinitely.
+</P>
+<P>
+Note that setting PCRE2_NO_UTF_CHECK at compile time does not disable the error
+that is given if an escape sequence for an invalid Unicode code point is
+encountered in the pattern. If you want to allow escape sequences such as
+\x{d800} (a surrogate code point) you can set the
+PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra option. However, this is possible
+only in UTF-8 and UTF-32 modes, because these values are not representable in
+UTF-16.
 <a name="utf8strings"></a></P>
 <br><b>
 Errors in UTF-8 strings
@@ -280,9 +294,9 @@ Cambridge, England.
 REVISION
 </b><br>
 <P>
-Last updated: 03 July 2016
+Last updated: 17 May 2017
 <br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/index.html.src b/doc/index.html.src
index 703c298..b9393d9 100644
--- a/doc/index.html.src
+++ b/doc/index.html.src
@@ -35,6 +35,9 @@ first.
 <tr><td><a href="pcre2compat.html">pcre2compat</a></td>
     <td>&nbsp;&nbsp;Compatibility with Perl</td></tr>
 
+<tr><td><a href="pcre2convert.html">pcre2convert</a></td>
+    <td>&nbsp;&nbsp;Experimental foreign pattern conversion functions</td></tr>
+
 <tr><td><a href="pcre2demo.html">pcre2demo</a></td>
     <td>&nbsp;&nbsp;A demonstration C program that uses the PCRE2 library</td></tr>
 
@@ -68,9 +71,6 @@ first.
 <tr><td><a href="pcre2serialize.html">pcre2serialize</a></td>
     <td>&nbsp;&nbsp;Serializing functions for saving precompiled patterns</td></tr>
 
-<tr><td><a href="pcre2stack.html">pcre2stack</a></td>
-    <td>&nbsp;&nbsp;Discussion of PCRE2's stack usage</td></tr>
-
 <tr><td><a href="pcre2syntax.html">pcre2syntax</a></td>
     <td>&nbsp;&nbsp;Syntax quick-reference summary</td></tr>
 
@@ -94,6 +94,9 @@ in the library.
 <tr><td><a href="pcre2_code_copy.html">pcre2_code_copy</a></td>
     <td>&nbsp;&nbsp;Copy a compiled pattern</td></tr>
 
+<tr><td><a href="pcre2_code_copy_with_tables.html">pcre2_code_copy_with_tables</a></td>
+    <td>&nbsp;&nbsp;Copy a compiled pattern and its character tables</td></tr>
+
 <tr><td><a href="pcre2_code_free.html">pcre2_code_free</a></td>
     <td>&nbsp;&nbsp;Free a compiled pattern</td></tr>
 
@@ -112,6 +115,18 @@ in the library.
 <tr><td><a href="pcre2_config.html">pcre2_config</a></td>
     <td>&nbsp;&nbsp;Show build-time configuration options</td></tr>
 
+<tr><td><a href="pcre2_convert_context_copy.html">pcre2_convert_context_copy</a></td>
+    <td>&nbsp;&nbsp;Copy a convert context</td></tr>
+
+<tr><td><a href="pcre2_convert_context_create.html">pcre2_convert_context_create</a></td>
+    <td>&nbsp;&nbsp;Create a convert context</td></tr>
+
+<tr><td><a href="pcre2_convert_context_free.html">pcre2_convert_context_free</a></td>
+    <td>&nbsp;&nbsp;Free a convert context</td></tr>
+
+<tr><td><a href="pcre2_converted_pattern_free.html">pcre2_converted_pattern_free</a></td>
+    <td>&nbsp;&nbsp;Free converted foreign pattern</td></tr>
+
 <tr><td><a href="pcre2_dfa_match.html">pcre2_dfa_match</a></td>
     <td>&nbsp;&nbsp;Match a compiled pattern to a subject string
     (DFA algorithm; <i>not</i> Perl compatible)</td></tr>
@@ -183,6 +198,9 @@ in the library.
 <tr><td><a href="pcre2_match_data_free.html">pcre2_match_data_free</a></td>
     <td>&nbsp;&nbsp;Free a match data block</td></tr>
 
+<tr><td><a href="pcre2_pattern_convert.html">pcre2_pattern_convert</a></td>
+    <td>&nbsp;&nbsp;Experimental foreign pattern converter</td></tr>
+
 <tr><td><a href="pcre2_pattern_info.html">pcre2_pattern_info</a></td>
     <td>&nbsp;&nbsp;Extract information about a pattern</td></tr>
 
@@ -207,9 +225,24 @@ in the library.
 <tr><td><a href="pcre2_set_character_tables.html">pcre2_set_character_tables</a></td>
     <td>&nbsp;&nbsp;Set character tables</td></tr>
 
+<tr><td><a href="pcre2_set_compile_extra_options.html">pcre2_set_compile_extra_options</a></td>
+    <td>&nbsp;&nbsp;Set compile time extra options</td></tr>
+
 <tr><td><a href="pcre2_set_compile_recursion_guard.html">pcre2_set_compile_recursion_guard</a></td>
     <td>&nbsp;&nbsp;Set up a compile recursion guard function</td></tr>
 
+<tr><td><a href="pcre2_set_depth_limit.html">pcre2_set_depth_limit</a></td>
+    <td>&nbsp;&nbsp;Set the match backtracking depth limit</td></tr>
+
+<tr><td><a href="pcre2_set_glob_escape.html">pcre2_set_glob_escape</a></td>
+    <td>&nbsp;&nbsp;Set glob escape character</td></tr>
+
+<tr><td><a href="pcre2_set_glob_separator.html">pcre2_set_glob_separator</a></td>
+    <td>&nbsp;&nbsp;Set glob separator character</td></tr>
+
+<tr><td><a href="pcre2_set_heap_limit.html">pcre2_set_heap_limit</a></td>
+    <td>&nbsp;&nbsp;Set the match backtracking heap limit</td></tr>
+
 <tr><td><a href="pcre2_set_match_limit.html">pcre2_set_match_limit</a></td>
     <td>&nbsp;&nbsp;Set the match limit</td></tr>
 
@@ -226,10 +259,10 @@ in the library.
     <td>&nbsp;&nbsp;Set the parentheses nesting limit</td></tr>
 
 <tr><td><a href="pcre2_set_recursion_limit.html">pcre2_set_recursion_limit</a></td>
-    <td>&nbsp;&nbsp;Set the match recursion limit</td></tr>
+    <td>&nbsp;&nbsp;Obsolete: use pcre2_set_depth_limit</td></tr>
 
 <tr><td><a href="pcre2_set_recursion_memory_management.html">pcre2_set_recursion_memory_management</a></td>
-    <td>&nbsp;&nbsp;Set match recursion memory management</td></tr>
+    <td>&nbsp;&nbsp;Obsolete function that (from 10.30 onwards) does nothing</td></tr>
 
 <tr><td><a href="pcre2_substitute.html">pcre2_substitute</a></td>
     <td>&nbsp;&nbsp;Match a compiled pattern to a subject string and do
diff --git a/doc/pcre2.3 b/doc/pcre2.3
index 9a84ce3..83a7655 100644
--- a/doc/pcre2.3
+++ b/doc/pcre2.3
@@ -1,4 +1,4 @@
-.TH PCRE2 3 "16 October 2015" "PCRE2 10.21"
+.TH PCRE2 3 "01 April 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH INTRODUCTION
@@ -104,7 +104,7 @@ lose performance.
 One way of guarding against this possibility is to use the
 \fBpcre2_pattern_info()\fP function to check the compiled pattern's options for
 PCRE2_UTF. Alternatively, you can set the PCRE2_NEVER_UTF option when calling
-\fBpcre2_compile()\fP. This causes an compile time error if a pattern contains
+\fBpcre2_compile()\fP. This causes a compile time error if the pattern contains
 a UTF-setting sequence.
 .P
 The use of Unicode properties for character types such as \ed can also be
@@ -130,7 +130,8 @@ against this: see the \fBpcre2_set_match_limit()\fP function in the
 .\" HREF
 \fBpcre2api\fP
 .\"
-page.
+page. There is a similar function called \fBpcre2_set_depth_limit()\fP that can
+be used to restrict the amount of memory that is used.
 .
 .
 .SH "USER DOCUMENTATION"
@@ -163,7 +164,6 @@ listing), and the short pages for individual functions, are concatenated in
   pcre2perform       discussion of performance issues
   pcre2posix         the POSIX-compatible C API for the 8-bit library
   pcre2sample        discussion of the pcre2demo program
-  pcre2stack         discussion of stack usage
   pcre2syntax        quick syntax reference
   pcre2test          description of the \fBpcre2test\fP command
   pcre2unicode       discussion of Unicode and UTF support
@@ -189,6 +189,6 @@ use my two initials, followed by the two digits 10, at the domain cam.ac.uk.
 .rs
 .sp
 .nf
-Last updated: 16 October 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 01 April 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2.txt b/doc/pcre2.txt
index 8f4e8a1..79d94e3 100644
--- a/doc/pcre2.txt
+++ b/doc/pcre2.txt
@@ -89,8 +89,8 @@ SECURITY CONSIDERATIONS
        One  way  of guarding against this possibility is to use the pcre2_pat-
        tern_info() function  to  check  the  compiled  pattern's  options  for
        PCRE2_UTF.  Alternatively,  you can set the PCRE2_NEVER_UTF option when
-       calling pcre2_compile(). This causes an compile time error if a pattern
-       contains a UTF-setting sequence.
+       calling pcre2_compile(). This causes a compile time error if  the  pat-
+       tern contains a UTF-setting sequence.
 
        The  use  of Unicode properties for character types such as \d can also
        be enabled from within the pattern, by specifying "(*UCP)".  This  fea-
@@ -112,7 +112,9 @@ SECURITY CONSIDERATIONS
        has a very large search tree against a string that  will  never  match.
        Nested  unlimited repeats in a pattern are a common example. PCRE2 pro-
        vides some protection against  this:  see  the  pcre2_set_match_limit()
-       function in the pcre2api page.
+       function  in  the  pcre2api  page.  There  is a similar function called
+       pcre2_set_depth_limit() that can be used to restrict the amount of mem-
+       ory that is used.
 
 
 USER DOCUMENTATION
@@ -144,7 +146,6 @@ USER DOCUMENTATION
          pcre2perform       discussion of performance issues
          pcre2posix         the POSIX-compatible C API for the 8-bit library
          pcre2sample        discussion of the pcre2demo program
-         pcre2stack         discussion of stack usage
          pcre2syntax        quick syntax reference
          pcre2test          description of the pcre2test command
          pcre2unicode       discussion of Unicode and UTF support
@@ -166,8 +167,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 16 October 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 01 April 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -180,9 +181,9 @@ NAME
 
        #include <pcre2.h>
 
-       PCRE2  is  a  new API for PCRE. This document contains a description of
-       all its functions. See the pcre2 document for an overview  of  all  the
-       PCRE2 documentation.
+       PCRE2  is  a  new API for PCRE, starting at release 10.0. This document
+       contains a description of all its native functions. See the pcre2 docu-
+       ment for an overview of all the PCRE2 documentation.
 
 
 PCRE2 NATIVE API BASIC FUNCTIONS
@@ -252,6 +253,9 @@ PCRE2 NATIVE API COMPILE CONTEXT FUNCTIONS
        int pcre2_set_character_tables(pcre2_compile_context *ccontext,
          const unsigned char *tables);
 
+       int pcre2_set_compile_extra_options(pcre2_compile_context *ccontext,
+         uint32_t extra_options);
+
        int pcre2_set_max_pattern_length(pcre2_compile_context *ccontext,
          PCRE2_SIZE value);
 
@@ -279,19 +283,17 @@ PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS
          int (*callout_function)(pcre2_callout_block *, void *),
          void *callout_data);
 
-       int pcre2_set_match_limit(pcre2_match_context *mcontext,
-         uint32_t value);
-
        int pcre2_set_offset_limit(pcre2_match_context *mcontext,
          PCRE2_SIZE value);
 
-       int pcre2_set_recursion_limit(pcre2_match_context *mcontext,
+       int pcre2_set_heap_limit(pcre2_match_context *mcontext,
          uint32_t value);
 
-       int pcre2_set_recursion_memory_management(
-         pcre2_match_context *mcontext,
-         void *(*private_malloc)(PCRE2_SIZE, void *),
-         void (*private_free)(void *, void *), void *memory_data);
+       int pcre2_set_match_limit(pcre2_match_context *mcontext,
+         uint32_t value);
+
+       int pcre2_set_depth_limit(pcre2_match_context *mcontext,
+         uint32_t value);
 
 
 PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS
@@ -379,6 +381,8 @@ PCRE2 NATIVE API AUXILIARY FUNCTIONS
 
        pcre2_code *pcre2_code_copy(const pcre2_code *code);
 
+       pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *code);
+
        int pcre2_get_error_message(int errorcode, PCRE2_UCHAR *buffer,
          PCRE2_SIZE bufflen);
 
@@ -393,19 +397,64 @@ PCRE2 NATIVE API AUXILIARY FUNCTIONS
        int pcre2_config(uint32_t what, void *where);
 
 
+PCRE2 NATIVE API OBSOLETE FUNCTIONS
+
+       int pcre2_set_recursion_limit(pcre2_match_context *mcontext,
+         uint32_t value);
+
+       int pcre2_set_recursion_memory_management(
+         pcre2_match_context *mcontext,
+         void *(*private_malloc)(PCRE2_SIZE, void *),
+         void (*private_free)(void *, void *), void *memory_data);
+
+       These  functions became obsolete at release 10.30 and are retained only
+       for backward compatibility. They should not be used in  new  code.  The
+       first  is  replaced by pcre2_set_depth_limit(); the second is no longer
+       needed and has no effect (it always returns zero).
+
+
+PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS
+
+       pcre2_convert_context *pcre2_convert_context_create(
+         pcre2_general_context *gcontext);
+
+       pcre2_convert_context *pcre2_convert_context_copy(
+         pcre2_convert_context *cvcontext);
+
+       void pcre2_convert_context_free(pcre2_convert_context *cvcontext);
+
+       int pcre2_set_glob_escape(pcre2_convert_context *cvcontext,
+         uint32_t escape_char);
+
+       int pcre2_set_glob_separator(pcre2_convert_context *cvcontext,
+         uint32_t separator_char);
+
+       int pcre2_pattern_convert(PCRE2_SPTR pattern, PCRE2_SIZE length,
+         uint32_t options, PCRE2_UCHAR **buffer,
+         PCRE2_SIZE *blength, pcre2_convert_context *cvcontext);
+
+       void pcre2_converted_pattern_free(PCRE2_UCHAR *converted_pattern);
+
+       These functions provide a way of  converting  non-PCRE2  patterns  into
+       patterns  that  can  be  processed by pcre2_compile(). This facility is
+       experimental and may be changed in future releases. At present, "globs"
+       and  POSIX  basic  and  extended patterns can be converted. Details are
+       given in the pcre2convert documentation.
+
+
 PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES
 
-       There  are  three PCRE2 libraries, supporting 8-bit, 16-bit, and 32-bit
-       code units, respectively. However,  there  is  just  one  header  file,
-       pcre2.h.   This  contains the function prototypes and other definitions
+       There are three PCRE2 libraries, supporting 8-bit, 16-bit,  and  32-bit
+       code  units,  respectively.  However,  there  is  just one header file,
+       pcre2.h.  This contains the function prototypes and  other  definitions
        for all three libraries. One, two, or all three can be installed simul-
-       taneously.  On  Unix-like  systems the libraries are called libpcre2-8,
+       taneously. On Unix-like systems the libraries  are  called  libpcre2-8,
        libpcre2-16, and libpcre2-32, and they can also co-exist with the orig-
        inal PCRE libraries.
 
-       Character  strings are passed to and from a PCRE2 library as a sequence
-       of unsigned integers in code units  of  the  appropriate  width.  Every
-       PCRE2  function  comes  in three different forms, one for each library,
+       Character strings are passed to and from a PCRE2 library as a  sequence
+       of  unsigned  integers  in  code  units of the appropriate width. Every
+       PCRE2 function comes in three different forms, one  for  each  library,
        for example:
 
          pcre2_compile_8()
@@ -417,72 +466,79 @@ PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES
          PCRE2_UCHAR8, PCRE2_UCHAR16, PCRE2_UCHAR32
          PCRE2_SPTR8,  PCRE2_SPTR16,  PCRE2_SPTR32
 
-       The UCHAR types define unsigned code units of the  appropriate  widths.
-       For  example,  PCRE2_UCHAR16 is usually defined as `uint16_t'. The SPTR
-       types are constant pointers to the equivalent  UCHAR  types,  that  is,
+       The  UCHAR  types define unsigned code units of the appropriate widths.
+       For example, PCRE2_UCHAR16 is usually defined as `uint16_t'.  The  SPTR
+       types  are  constant  pointers  to the equivalent UCHAR types, that is,
        they are pointers to vectors of unsigned code units.
 
-       Many  applications use only one code unit width. For their convenience,
+       Many applications use only one code unit width. For their  convenience,
        macros are defined whose names are the generic forms such as pcre2_com-
-       pile()  and  PCRE2_SPTR.  These  macros  use  the  value  of  the macro
-       PCRE2_CODE_UNIT_WIDTH to generate the appropriate width-specific  func-
+       pile() and  PCRE2_SPTR.  These  macros  use  the  value  of  the  macro
+       PCRE2_CODE_UNIT_WIDTH  to generate the appropriate width-specific func-
        tion and macro names.  PCRE2_CODE_UNIT_WIDTH is not defined by default.
-       An application must define it to be  8,  16,  or  32  before  including
+       An  application  must  define  it  to  be 8, 16, or 32 before including
        pcre2.h in order to make use of the generic names.
 
-       Applications  that use more than one code unit width can be linked with
-       more than one PCRE2 library, but must define  PCRE2_CODE_UNIT_WIDTH  to
-       be  0  before  including pcre2.h, and then use the real function names.
-       Any code that is to be included in an environment where  the  value  of
-       PCRE2_CODE_UNIT_WIDTH  is  unknown  should  also  use the real function
+       Applications that use more than one code unit width can be linked  with
+       more  than  one PCRE2 library, but must define PCRE2_CODE_UNIT_WIDTH to
+       be 0 before including pcre2.h, and then use the  real  function  names.
+       Any  code  that  is to be included in an environment where the value of
+       PCRE2_CODE_UNIT_WIDTH is unknown should  also  use  the  real  function
        names. (Unfortunately, it is not possible in C code to save and restore
        the value of a macro.)
 
-       If  PCRE2_CODE_UNIT_WIDTH  is  not  defined before including pcre2.h, a
+       If PCRE2_CODE_UNIT_WIDTH is not defined  before  including  pcre2.h,  a
        compiler error occurs.
 
-       When using multiple libraries in an application,  you  must  take  care
-       when  processing  any  particular  pattern to use only functions from a
-       single library.  For example, if you want to run a match using  a  pat-
-       tern  that  was  compiled  with pcre2_compile_16(), you must do so with
-       pcre2_match_16(), not pcre2_match_8().
+       When  using  multiple  libraries  in an application, you must take care
+       when processing any particular pattern to use  only  functions  from  a
+       single  library.   For example, if you want to run a match using a pat-
+       tern that was compiled with pcre2_compile_16(), you  must  do  so  with
+       pcre2_match_16(), not pcre2_match_8() or pcre2_match_32().
 
-       In the function summaries above, and in the rest of this  document  and
-       other  PCRE2  documents,  functions  and data types are described using
-       their generic names, without the 8, 16, or 32 suffix.
+       In  the  function summaries above, and in the rest of this document and
+       other PCRE2 documents, functions and data  types  are  described  using
+       their generic names, without the _8, _16, or _32 suffix.
 
 
 PCRE2 API OVERVIEW
 
-       PCRE2 has its own native API, which  is  described  in  this  document.
+       PCRE2  has  its  own  native  API, which is described in this document.
        There are also some wrapper functions for the 8-bit library that corre-
-       spond to the POSIX regular expression API, but they do not give  access
-       to all the functionality. They are described in the pcre2posix documen-
-       tation. Both these APIs define a set of C function calls.
-
-       The native API C data types, function prototypes,  option  values,  and
-       error codes are defined in the header file pcre2.h, which contains def-
-       initions of PCRE2_MAJOR and PCRE2_MINOR, the major  and  minor  release
-       numbers  for the library. Applications can use these to include support
+       spond  to the POSIX regular expression API, but they do not give access
+       to all the functionality of PCRE2. They are described in the pcre2posix
+       documentation. Both these APIs define a set of C function calls.
+
+       The  native  API  C data types, function prototypes, option values, and
+       error codes are defined in the header file pcre2.h, which also contains
+       definitions of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release
+       numbers for the library. Applications can use these to include  support
        for different releases of PCRE2.
 
        In a Windows environment, if you want to statically link an application
-       program  against  a non-dll PCRE2 library, you must define PCRE2_STATIC
+       program against a non-dll PCRE2 library, you must  define  PCRE2_STATIC
        before including pcre2.h.
 
-       The functions pcre2_compile(), and pcre2_match() are used for compiling
-       and  matching regular expressions in a Perl-compatible manner. A sample
+       The  functions pcre2_compile() and pcre2_match() are used for compiling
+       and matching regular expressions in a Perl-compatible manner. A  sample
        program that demonstrates the simplest way of using them is provided in
        the file called pcre2demo.c in the PCRE2 source distribution. A listing
-       of this program is  given  in  the  pcre2demo  documentation,  and  the
+       of  this  program  is  given  in  the  pcre2demo documentation, and the
        pcre2sample documentation describes how to compile and run it.
 
-       Just-in-time  compiler support is an optional feature of PCRE2 that can
-       be built in appropriate hardware environments. It greatly speeds up the
-       matching  performance of many patterns. Programs can request that it be
-       used if available, by calling pcre2_jit_compile() after a  pattern  has
-       been successfully compiled by pcre2_compile(). This does nothing if JIT
-       support is not available.
+       The compiling and matching functions recognize various options that are
+       passed as bits in an options argument. There are also some more compli-
+       cated  parameters  such  as  custom  memory  management  functions  and
+       resource  limits  that  are passed in "contexts" (which are just memory
+       blocks, described below). Simple applications do not need to  make  use
+       of contexts.
+
+       Just-in-time  (JIT)  compiler  support  is an optional feature of PCRE2
+       that can be built in  appropriate  hardware  environments.  It  greatly
+       speeds  up  the  matching  performance  of  many patterns. Programs can
+       request that it be used if  available  by  calling  pcre2_jit_compile()
+       after a pattern has been successfully compiled by pcre2_compile(). This
+       does nothing if JIT support is not available.
 
        More complicated programs might need to  make  use  of  the  specialist
        functions    pcre2_jit_stack_create(),    pcre2_jit_stack_free(),   and
@@ -491,20 +547,21 @@ PCRE2 API OVERVIEW
 
        JIT matching is automatically used by pcre2_match() if it is available,
        unless the PCRE2_NO_JIT option is set. There is also a direct interface
-       for  JIT  matching,  which gives improved performance. The JIT-specific
-       functions are discussed in the pcre2jit documentation.
-
-       A second matching function, pcre2_dfa_match(), which is  not  Perl-com-
-       patible,  is  also  provided.  This  uses a different algorithm for the
-       matching. The alternative algorithm finds all possible  matches  (at  a
-       given  point  in  the subject), and scans the subject just once (unless
-       there are lookbehind assertions).  However,  this  algorithm  does  not
-       return  captured  substrings.  A  description of the two matching algo-
-       rithms  and  their  advantages  and  disadvantages  is  given  in   the
-       pcre2matching    documentation.   There   is   no   JIT   support   for
+       for  JIT  matching,  which gives improved performance at the expense of
+       less sanity checking. The JIT-specific functions are discussed  in  the
+       pcre2jit documentation.
+
+       A  second  matching function, pcre2_dfa_match(), which is not Perl-com-
+       patible, is also provided. This uses  a  different  algorithm  for  the
+       matching.  The  alternative  algorithm finds all possible matches (at a
+       given point in the subject), and scans the subject  just  once  (unless
+       there  are  lookaround  assertions).  However,  this algorithm does not
+       return captured substrings. A description of  the  two  matching  algo-
+       rithms   and  their  advantages  and  disadvantages  is  given  in  the
+       pcre2matching   documentation.   There   is   no   JIT   support    for
        pcre2_dfa_match().
 
-       In addition to the main compiling and  matching  functions,  there  are
+       In  addition  to  the  main compiling and matching functions, there are
        convenience functions for extracting captured substrings from a subject
        string that has been matched by pcre2_match(). They are:
 
@@ -518,74 +575,74 @@ PCRE2 API OVERVIEW
          pcre2_substring_nametable_scan()
          pcre2_substring_number_from_name()
 
-       pcre2_substring_free() and pcre2_substring_list_free()  are  also  pro-
-       vided, to free the memory used for extracted strings.
+       pcre2_substring_free()  and  pcre2_substring_list_free()  are also pro-
+       vided, to free memory used for extracted strings.
 
-       The  function  pcre2_substitute()  can be called to match a pattern and
-       return a copy of the subject string with substitutions for  parts  that
+       The function pcre2_substitute() can be called to match  a  pattern  and
+       return  a  copy of the subject string with substitutions for parts that
        were matched.
 
-       Functions  whose  names begin with pcre2_serialize_ are used for saving
+       Functions whose names begin with pcre2_serialize_ are used  for  saving
        compiled patterns on disc or elsewhere, and reloading them later.
 
-       Finally, there are functions for finding out information about  a  com-
-       piled  pattern  (pcre2_pattern_info()) and about the configuration with
+       Finally,  there  are functions for finding out information about a com-
+       piled pattern (pcre2_pattern_info()) and about the  configuration  with
        which PCRE2 was built (pcre2_config()).
 
-       Functions with names ending with _free() are used  for  freeing  memory
-       blocks  of  various  sorts.  In all cases, if one of these functions is
+       Functions  with  names  ending with _free() are used for freeing memory
+       blocks of various sorts. In all cases, if one  of  these  functions  is
        called with a NULL argument, it does nothing.
 
 
 STRING LENGTHS AND OFFSETS
 
-       The PCRE2 API uses string lengths and  offsets  into  strings  of  code
-       units  in  several  places. These values are always of type PCRE2_SIZE,
-       which is an unsigned integer type, currently always defined as  size_t.
-       The  largest  value  that  can  be  stored  in  such  a  type  (that is
-       ~(PCRE2_SIZE)0) is reserved as a special indicator for  zero-terminated
-       strings  and  unset offsets.  Therefore, the longest string that can be
+       The  PCRE2  API  uses  string  lengths and offsets into strings of code
+       units in several places. These values are always  of  type  PCRE2_SIZE,
+       which  is an unsigned integer type, currently always defined as size_t.
+       The largest  value  that  can  be  stored  in  such  a  type  (that  is
+       ~(PCRE2_SIZE)0)  is reserved as a special indicator for zero-terminated
+       strings and unset offsets.  Therefore, the longest string that  can  be
        handled is one less than this maximum.
 
 
 NEWLINES
 
        PCRE2 supports five different conventions for indicating line breaks in
-       strings:  a  single  CR (carriage return) character, a single LF (line-
+       strings: a single CR (carriage return) character, a  single  LF  (line-
        feed) character, the two-character sequence CRLF, any of the three pre-
-       ceding,  or any Unicode newline sequence. The Unicode newline sequences
-       are the three just mentioned, plus the single characters  VT  (vertical
+       ceding, or any Unicode newline sequence. The Unicode newline  sequences
+       are  the  three just mentioned, plus the single characters VT (vertical
        tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
        separator, U+2028), and PS (paragraph separator, U+2029).
 
-       Each of the first three conventions is used by at least  one  operating
+       Each  of  the first three conventions is used by at least one operating
        system as its standard newline sequence. When PCRE2 is built, a default
-       can be specified.  The default default is LF, which is the  Unix  stan-
-       dard.  However, the newline convention can be changed by an application
+       can  be  specified.  The default default is LF, which is the Unix stan-
+       dard. However, the newline convention can be changed by an  application
        when calling pcre2_compile(), or it can be specified by special text at
        the start of the pattern itself; this overrides any other settings. See
        the pcre2pattern page for details of the special character sequences.
 
-       In the PCRE2 documentation the word "newline"  is  used  to  mean  "the
+       In  the  PCRE2  documentation  the  word "newline" is used to mean "the
        character or pair of characters that indicate a line break". The choice
-       of newline convention affects the handling of the dot, circumflex,  and
+       of  newline convention affects the handling of the dot, circumflex, and
        dollar metacharacters, the handling of #-comments in /x mode, and, when
-       CRLF is a recognized line ending sequence, the match position  advance-
+       CRLF  is a recognized line ending sequence, the match position advance-
        ment for a non-anchored pattern. There is more detail about this in the
        section on pcre2_match() options below.
 
-       The choice of newline convention does not affect the interpretation  of
+       The  choice of newline convention does not affect the interpretation of
        the \n or \r escape sequences, nor does it affect what \R matches; this
        has its own separate convention.
 
 
 MULTITHREADING
 
-       In a multithreaded application it is important to keep  thread-specific
-       data  separate  from data that can be shared between threads. The PCRE2
-       library code itself is thread-safe: it contains  no  static  or  global
-       variables.  The  API  is  designed to be fairly simple for non-threaded
-       applications while at the same time ensuring that multithreaded  appli-
+       In  a multithreaded application it is important to keep thread-specific
+       data separate from data that can be shared between threads.  The  PCRE2
+       library  code  itself  is  thread-safe: it contains no static or global
+       variables. The API is designed to be  fairly  simple  for  non-threaded
+       applications  while at the same time ensuring that multithreaded appli-
        cations can use it.
 
        There are several different blocks of data that are used to pass infor-
@@ -593,19 +650,19 @@ MULTITHREADING
 
    The compiled pattern
 
-       A pointer to the compiled form of a pattern is  returned  to  the  user
+       A  pointer  to  the  compiled form of a pattern is returned to the user
        when pcre2_compile() is successful. The data in the compiled pattern is
-       fixed, and does not change when the pattern is matched.  Therefore,  it
-       is  thread-safe, that is, the same compiled pattern can be used by more
+       fixed,  and  does not change when the pattern is matched. Therefore, it
+       is thread-safe, that is, the same compiled pattern can be used by  more
        than one thread simultaneously. For example, an application can compile
        all its patterns at the start, before forking off multiple threads that
-       use them. However, if the just-in-time optimization  feature  is  being
-       used,  it  needs  separate  memory stack areas for each thread. See the
-       pcre2jit documentation for more details.
+       use  them.  However,  if the just-in-time (JIT) optimization feature is
+       being used, it needs separate memory stack areas for each  thread.  See
+       the pcre2jit documentation for more details.
 
-       In a more complicated situation, where patterns are compiled only  when
-       they  are  first needed, but are still shared between threads, pointers
-       to compiled patterns must be protected  from  simultaneous  writing  by
+       In  a more complicated situation, where patterns are compiled only when
+       they are first needed, but are still shared between  threads,  pointers
+       to  compiled  patterns  must  be protected from simultaneous writing by
        multiple threads, at least until a pattern has been compiled. The logic
        can be something like this:
 
@@ -618,16 +675,17 @@ MULTITHREADING
          Release the lock
          Use pointer in pcre2_match()
 
-       Of course, testing for compilation errors should also  be  included  in
+       Of  course,  testing  for compilation errors should also be included in
        the code.
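The locking logic above can be sketched in C as follows. This is a minimal illustration, not code from PCRE2: the names lazy_pattern, lazy_pattern_get, compile_fn, and counting_compile are hypothetical, and a single POSIX mutex is assumed in place of whatever locking primitive the application already uses. In a real program the callback would call pcre2_compile() and return the resulting pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the compile step. In a real program this
   callback would call pcre2_compile() and return the pcre2_code pointer. */
typedef void *(*compile_fn)(void *arg);

/* One lazily compiled pattern that is shared between threads. */
typedef struct {
  pthread_mutex_t lock;  /* protects 'code' while it may still be written */
  void *code;            /* NULL until the pattern has been compiled */
} lazy_pattern;

#define LAZY_PATTERN_INIT { PTHREAD_MUTEX_INITIALIZER, NULL }

/* Return the shared compiled pattern, compiling it on first use.
   Returns NULL if compilation failed, so the caller can report the error. */
void *lazy_pattern_get(lazy_pattern *lp, compile_fn compile, void *arg)
{
  void *code;
  assert(lp != NULL && compile != NULL);
  pthread_mutex_lock(&lp->lock);
  if (lp->code == NULL)
    lp->code = compile(arg);  /* runs under the lock, so never concurrently */
  code = lp->code;
  pthread_mutex_unlock(&lp->lock);
  return code;                /* read-only use by many threads is safe */
}

/* Demo callback for illustration only: counts how often it is called and
   returns a non-NULL token in place of a compiled pattern. */
void *counting_compile(void *arg)
{
  int *calls = arg;
  (*calls)++;
  return calls;
}
```

Calling lazy_pattern_get() twice on the same lazy_pattern runs the callback only once; the second call returns the stored pointer. A NULL result signals a compilation error and leaves the slot empty, so a later call tries again.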
 
        If JIT is being used, but the JIT compilation is not being done immedi-
-       ately, (perhaps waiting to see if the pattern  is  used  often  enough)
+       ately,  (perhaps  waiting  to  see if the pattern is used often enough)
        similar logic is required. JIT compilation updates a pointer within the
-       compiled code block, so a thread must gain unique write access  to  the
-       pointer     before    calling    pcre2_jit_compile().    Alternatively,
-       pcre2_code_copy() can be used to obtain a private copy of the  compiled
-       code.
+       compiled  code  block, so a thread must gain unique write access to the
+       pointer    before    calling    pcre2_jit_compile().     Alternatively,
+       pcre2_code_copy()  or  pcre2_code_copy_with_tables()  can  be  used  to
+       obtain a private copy of the compiled code before calling the JIT  com-
+       piler.
 
    Context blocks
 
@@ -646,10 +704,10 @@ MULTITHREADING
 
    Match blocks
 
-       The matching functions need a block of memory for working space and for
-       storing  the  results  of  a  match.  This includes details of what was
-       matched, as well as additional  information  such  as  the  name  of  a
-       (*MARK) setting. Each thread must provide its own copy of this memory.
+       The matching functions need a block of memory for storing  the  results
+       of a match. This includes details of what was matched, as well as addi-
+       tional information such as the name of a (*MARK) setting.  Each  thread
+       must provide its own copy of this memory.
 
 
 PCRE2 CONTEXTS
@@ -714,21 +772,22 @@ PCRE2 CONTEXTS
 
    The compile context
 
-       A compile context is required if you want to change the default  values
-       of any of the following compile-time parameters:
+       A compile context is required if you want to provide an external  func-
+       tion  for  stack  checking  during compilation or to change the default
+       values of any of the following compile-time parameters:
 
          What \R matches (Unicode newlines or CR, LF, CRLF only)
          PCRE2's character tables
          The newline character sequence
          The compile time nested parentheses limit
          The maximum length of the pattern string
-         An external function for stack checking
+         The extra options bits (none set by default)
 
-       A  compile context is also required if you are using custom memory man-
-       agement.  If none of these apply, just pass NULL as the  context  argu-
+       A compile context is also required if you are using custom memory  man-
+       agement.   If  none of these apply, just pass NULL as the context argu-
        ment of pcre2_compile().
 
-       A  compile context is created, copied, and freed by the following func-
+       A compile context is created, copied, and freed by the following  func-
        tions:
 
        pcre2_compile_context *pcre2_compile_context_create(
@@ -739,57 +798,75 @@ PCRE2 CONTEXTS
 
        void pcre2_compile_context_free(pcre2_compile_context *ccontext);
 
-       A compile context is created with default values  for  its  parameters.
+       A  compile  context  is created with default values for its parameters.
        These can be changed by calling the following functions, which return 0
        on success, or PCRE2_ERROR_BADDATA if invalid data is detected.
 
        int pcre2_set_bsr(pcre2_compile_context *ccontext,
          uint32_t value);
 
-       The value must be PCRE2_BSR_ANYCRLF, to specify that  \R  matches  only
-       CR,  LF,  or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any
+       The  value  must  be PCRE2_BSR_ANYCRLF, to specify that \R matches only
+       CR, LF, or CRLF, or PCRE2_BSR_UNICODE, to specify that \R  matches  any
        Unicode line ending sequence. The value is used by the JIT compiler and
-       by   the   two   interpreted   matching  functions,  pcre2_match()  and
+       by  the  two  interpreted   matching   functions,   pcre2_match()   and
        pcre2_dfa_match().
 
        int pcre2_set_character_tables(pcre2_compile_context *ccontext,
          const unsigned char *tables);
 
-       The value must be the result of a  call  to  pcre2_maketables(),  whose
+       The  value  must  be  the result of a call to pcre2_maketables(), whose
        only argument is a general context. This function builds a set of char-
        acter tables in the current locale.
 
+       int pcre2_set_compile_extra_options(pcre2_compile_context *ccontext,
+         uint32_t extra_options);
+
+       As  PCRE2  has developed, almost all the 32 option bits that are avail-
+       able in the options argument of pcre2_compile() have been used  up.  To
+       avoid  running  out, the compile context contains a set of extra option
+       bits which are used for some newer, assumed rarer, options. This  func-
+       tion sets those bits. It always sets all the bits (either on  or  off),
+       replacing any previous setting rather than merging with it. The  avail-
+       able options are defined in  the  section  entitled  "Extra  compile
+       options" below.
+
        int pcre2_set_max_pattern_length(pcre2_compile_context *ccontext,
          PCRE2_SIZE value);
 
-       This sets a maximum length, in code units, for the pattern string  that
-       is  to  be  compiled.  If the pattern is longer, an error is generated.
-       This facility is provided so that  applications  that  accept  patterns
-       from  external sources can limit their size. The default is the largest
-       number that a PCRE2_SIZE variable can hold, which is effectively unlim-
-       ited.
+       This  sets a maximum length, in code units, for any pattern string that
+       is compiled with this context. If the pattern is longer,  an  error  is
+       generated.   This facility is provided so that applications that accept
+       patterns from external sources can limit their size. The default is the
+       largest  number  that  a  PCRE2_SIZE variable can hold, which is effec-
+       tively unlimited.
 
        int pcre2_set_newline(pcre2_compile_context *ccontext,
          uint32_t value);
 
        This specifies which characters or character sequences are to be recog-
-       nized as newlines. The value must be one of PCRE2_NEWLINE_CR  (carriage
+       nized  as newlines. The value must be one of PCRE2_NEWLINE_CR (carriage
        return only), PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the
-       two-character sequence CR followed by LF),  PCRE2_NEWLINE_ANYCRLF  (any
-       of the above), or PCRE2_NEWLINE_ANY (any Unicode newline sequence).
+       two-character  sequence  CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any
+       of the above), PCRE2_NEWLINE_ANY (any  Unicode  newline  sequence),  or
+       PCRE2_NEWLINE_NUL (the NUL character, that is a binary zero).
 
-       When a pattern is compiled with the PCRE2_EXTENDED option, the value of
-       this parameter affects the recognition of white space and  the  end  of
-       internal comments starting with #. The value is saved with the compiled
-       pattern for subsequent use by the JIT compiler and by  the  two  inter-
-       preted matching functions, pcre2_match() and pcre2_dfa_match().
+       A pattern can override the value set in the compile context by starting
+       with a sequence such as (*CRLF). See the pcre2pattern page for details.
+
+       When   a   pattern   is   compiled   with   the    PCRE2_EXTENDED    or
+       PCRE2_EXTENDED_MORE option, the newline convention affects the recogni-
+       tion of white space and the end of internal comments starting  with  #.
+       The  value is saved with the compiled pattern for subsequent use by the
+       JIT  compiler  and  by  the   two   interpreted   matching   functions,
+       pcre2_match() and pcre2_dfa_match().
 
        int pcre2_set_parens_nest_limit(pcre2_compile_context *ccontext,
          uint32_t value);
 
        This parameter adjusts the limit, set when PCRE2 is built (default 250),
        on the depth of parenthesis nesting in  a  pattern.  This  limit  stops
-       rogue patterns using up too much system stack when being compiled.
+       rogue  patterns using up too much system stack when being compiled. The
+       limit applies to parentheses of all kinds, not just capturing parenthe-
+       ses.
 
        int pcre2_set_compile_recursion_guard(pcre2_compile_context *ccontext,
          int (*guard_function)(uint32_t, void *), void *user_data);
@@ -797,31 +874,32 @@ PCRE2 CONTEXTS
        There  is at least one application that runs PCRE2 in threads with very
        limited system stack, where running out of stack is to  be  avoided  at
        all  costs. The parenthesis limit above cannot take account of how much
-       stack is actually available. For a finer  control,  you  can  supply  a
-       function  that  is  called whenever pcre2_compile() starts to compile a
-       parenthesized part of a pattern. This function  can  check  the  actual
-       stack size (or anything else that it wants to, of course).
-
-       The  first  argument to the callout function gives the current depth of
-       nesting, and the second is user data that is set up by the  last  argu-
-       ment   of  pcre2_set_compile_recursion_guard().  The  callout  function
+       stack is actually available during compilation. For  a  finer  control,
+       you  can  supply  a  function  that  is called whenever pcre2_compile()
+       starts to compile a parenthesized part of a pattern. This function  can
+       check  the  actual  stack  size  (or anything else that it wants to, of
+       course).
+
+       The first argument to the callout function gives the current  depth  of
+       nesting,  and  the second is user data that is set up by the last argu-
+       ment  of  pcre2_set_compile_recursion_guard().  The  callout   function
        should return zero if all is well, or non-zero to force an error.
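As an illustration of the guard signature described above, here is a minimal sketch of a guard function. The names guard_config and depth_guard are hypothetical. This simplified version only compares the nesting depth with a per-thread ceiling held in the user data; a real guard might instead measure the remaining stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user data: the deepest nesting this thread will allow. */
typedef struct {
  uint32_t max_depth;
} guard_config;

/* Has the guard signature int (*)(uint32_t, void *).
   Returns zero to let compilation continue, non-zero to force an error. */
int depth_guard(uint32_t depth, void *user_data)
{
  const guard_config *cfg = user_data;
  assert(cfg != NULL);
  return depth > cfg->max_depth;
}

/* Registration, given a compile context 'ccontext' and a guard_config
   'cfg' that outlives the context:
     pcre2_set_compile_recursion_guard(ccontext, depth_guard, &cfg);  */
```

Because the ceiling is passed in as user data, each thread can supply a value suited to its own stack size without changing the guard function.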
 
    The match context
 
-       A match context is required if you want to change the default values of
-       any of the following match-time parameters:
+       A match context is required if you want to:
 
-         A callout function
-         The offset limit for matching an unanchored pattern
-         The limit for calling match() (see below)
-         The limit for calling match() recursively
+         Set up a callout function
+         Set an offset limit for matching an unanchored pattern
+         Change the limit on the amount of heap used when matching
+         Change the backtracking match limit
+         Change the backtracking depth limit
+         Set custom memory management specifically for the match
 
-       A match context is also required if you are using custom memory manage-
-       ment.  If none of these apply, just pass NULL as the  context  argument
-       of pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match().
+       If  none  of  these  apply,  just  pass NULL as the context argument of
+       pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match().
 
-       A  match  context  is created, copied, and freed by the following func-
+       A match context is created, copied, and freed by  the  following  func-
        tions:
 
        pcre2_match_context *pcre2_match_context_create(
@@ -832,7 +910,7 @@ PCRE2 CONTEXTS
 
        void pcre2_match_context_free(pcre2_match_context *mcontext);
 
-       A match context is created with  default  values  for  its  parameters.
+       A  match  context  is  created  with default values for its parameters.
        These can be changed by calling the following functions, which return 0
        on success, or PCRE2_ERROR_BADDATA if invalid data is detected.
 
@@ -840,120 +918,137 @@ PCRE2 CONTEXTS
          int (*callout_function)(pcre2_callout_block *, void *),
          void *callout_data);
 
-       This sets up a "callout" function, which PCRE2 will call  at  specified
-       points during a matching operation. Details are given in the pcre2call-
-       out documentation.
+       This sets up a "callout" function for PCRE2 to call at specified points
+       during a matching operation. Details are given in the pcre2callout doc-
+       umentation.
 
        int pcre2_set_offset_limit(pcre2_match_context *mcontext,
          PCRE2_SIZE value);
 
-       The offset_limit parameter limits how  far  an  unanchored  search  can
-       advance  in  the  subject string. The default value is PCRE2_UNSET. The
-       pcre2_match()     and      pcre2_dfa_match()      functions      return
-       PCRE2_ERROR_NOMATCH  if  a match with a starting point before or at the
-       given offset is not found. For example, if the pattern /abc/ is matched
-       against  "123abc"  with  an  offset  limit  less  than 3, the result is
-       PCRE2_ERROR_NO_MATCH.  A match can never be found  if  the  startoffset
-       argument of pcre2_match() or pcre2_dfa_match() is greater than the off-
-       set limit.
-
-       When using this facility,  you  must  set  PCRE2_USE_OFFSET_LIMIT  when
-       calling  pcre2_compile() so that when JIT is in use, different code can
-       be compiled. If a match is started with a non-default match limit  when
-       PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
-
-       The  offset limit facility can be used to track progress when searching
-       large subject strings.  See  also  the  PCRE2_FIRSTLINE  option,  which
-       requires a match to start within the first line of the subject. If this
-       is set with an offset limit, a match must occur in the first  line  and
-       also  within  the  offset limit.  In other words, whichever limit comes
-       first is used.
-
-       int pcre2_set_match_limit(pcre2_match_context *mcontext,
+       The  offset_limit  parameter  limits  how  far an unanchored search can
+       advance in the subject string. The default value  is  PCRE2_UNSET.  The
+       pcre2_match()      and      pcre2_dfa_match()      functions     return
+       PCRE2_ERROR_NOMATCH if a match with a starting point before or  at  the
+       given  offset  is  not  found. The pcre2_substitute() function makes no
+       more substitutions.
+
+       For example, if the pattern /abc/ is matched against "123abc"  with  an
+       offset  limit  less  than 3, the result is PCRE2_ERROR_NOMATCH. A match
+       can never be  found  if  the  startoffset  argument  of  pcre2_match(),
+       pcre2_dfa_match(),  or  pcre2_substitute()  is  greater than the offset
+       limit set in the match context.
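The effect of the offset limit in the /abc/ example can be modelled with a toy function. This is only a sketch of the rule "the match must start before or at the limit": a literal substring search with strstr() stands in for real pattern matching, and find_with_offset_limit is a hypothetical name, not part of PCRE2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: a literal substring search stands in for pattern matching.
   Returns the start offset of the first occurrence of 'needle' in 'subject'
   if it begins at or before 'offset_limit', or -1 for "no match". */
long find_with_offset_limit(const char *subject, const char *needle,
                            size_t offset_limit)
{
  const char *hit;
  assert(subject != NULL && needle != NULL);
  hit = strstr(subject, needle);
  /* If the first occurrence starts beyond the limit, so do all the others. */
  if (hit == NULL || (size_t)(hit - subject) > offset_limit)
    return -1;
  return (long)(hit - subject);
}
```

With the subject "123abc" and the needle "abc", a limit of 2 gives no match, whereas a limit of 3 or more finds the match at offset 3.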
+
+       When using this  facility,  you  must  set  the  PCRE2_USE_OFFSET_LIMIT
+       option when calling pcre2_compile() so that when JIT is in use, differ-
+       ent code can be compiled. If a match  is  started  with  a  non-default
+       offset limit when PCRE2_USE_OFFSET_LIMIT is not set, an error is gener-
+       ated.
+
+       The offset limit facility can be used to track progress when  searching
+       large  subject  strings or to limit the extent of global substitutions.
+       See also the PCRE2_FIRSTLINE option, which requires a  match  to  start
+       before  or  at  the first newline that follows the start of matching in
+       the subject. If this is set with an offset limit, a match must occur in
+       the first line and also within the offset limit. In other words, which-
+       ever limit comes first is used.
+
+       int pcre2_set_heap_limit(pcre2_match_context *mcontext,
          uint32_t value);
 
-       The match_limit parameter provides a means  of  preventing  PCRE2  from
-       using up too many resources when processing patterns that are not going
-       to match, but which have a very large number of possibilities in  their
-       search  trees. The classic example is a pattern that uses nested unlim-
-       ited repeats.
-
-       Internally, pcre2_match() uses a  function  called  match(),  which  it
-       calls  repeatedly (sometimes recursively). The limit set by match_limit
-       is imposed on the number of times this  function  is  called  during  a
-       match, which has the effect of limiting the amount of backtracking that
-       can take place. For patterns that are not anchored, the count  restarts
-       from  zero  for  each position in the subject string. This limit is not
-       relevant to pcre2_dfa_match(), which ignores it.
-
-       When pcre2_match() is called with a pattern that was successfully  pro-
-       cessed by pcre2_jit_compile(), the way in which matching is executed is
-       entirely different. However, there is still the possibility of  runaway
-       matching  that  goes  on  for  a very long time, and so the match_limit
-       value is also used in this case (but in a different way) to  limit  how
-       long the matching can continue.
+       The heap_limit parameter specifies, in units of kilobytes, the  maximum
+       amount  of  heap memory that pcre2_match() may use to hold backtracking
+       information when running an interpretive match.  This  limit  does  not
+       apply  to  matching with the JIT optimization, which has its own memory
+       control arrangements (see the pcre2jit documentation for more details),
+       nor  does  it apply to pcre2_dfa_match().  If the limit is reached, the
+       negative error code  PCRE2_ERROR_HEAPLIMIT  is  returned.  The  default
+       limit is set when PCRE2 is built; the default default is very large and
+       is essentially "unlimited".
 
-       The  default  value  for  the limit can be set when PCRE2 is built; the
-       default default is 10 million, which handles all but the  most  extreme
-       cases.    If    the    limit   is   exceeded,   pcre2_match()   returns
-       PCRE2_ERROR_MATCHLIMIT. A value for the match limit may  also  be  sup-
-       plied by an item at the start of a pattern of the form
+       A value for the heap limit may also be supplied by an item at the start
+       of a pattern of the form
 
-         (*LIMIT_MATCH=ddd)
+         (*LIMIT_HEAP=ddd)
 
        where  ddd  is  a  decimal  number.  However, such a setting is ignored
        unless ddd is less than the limit set by the  caller  of  pcre2_match()
        or, if no such limit is set, less than the default.
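The rule for a limit given in the pattern amounts to taking the smaller of two values: a pattern may lower the limit but never raise it. The following sketch states that rule in C. The name effective_limit and its parameters are hypothetical, and base_limit is assumed to be the caller's limit or, if the caller set none, the build-time default.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the rule for an item such as (*LIMIT_HEAP=ddd).
   'base_limit' is the caller's limit or, if none was set, the default.
   'pattern_sets_limit' is non-zero if the pattern contains such an item,
   in which case 'pattern_limit' is its ddd value. */
uint32_t effective_limit(uint32_t base_limit, int pattern_sets_limit,
                         uint32_t pattern_limit)
{
  if (pattern_sets_limit && pattern_limit < base_limit)
    return pattern_limit;  /* the pattern lowers the limit */
  return base_limit;       /* a larger or absent value is ignored */
}
```

This means that an application which accepts patterns from external sources keeps control of the upper bound, whatever the pattern requests.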
 
-       int pcre2_set_recursion_limit(pcre2_match_context *mcontext,
-         uint32_t value);
+       The  pcre2_match() function starts out using a 20K vector on the system
+       stack for recording backtracking points. The more  nested  backtracking
+       points there are (that is, the deeper the search tree), the more memory
+       is needed.  Heap memory is used only  if  the  initial  vector  is  too
+       small. If the heap limit is set to a value less than 21 (in particular,
+       zero) no heap memory will be used. In this case, only patterns that  do
+       not have a lot of nested backtracking can be successfully processed.
 
-       The recursion_limit parameter is similar to match_limit, but instead of
-       limiting the total number of times that match() is  called,  it  limits
-       the  depth  of  recursion. The recursion depth is a smaller number than
-       the total number of calls, because not all calls to match() are  recur-
-       sive.  This limit is of use only if it is set smaller than match_limit.
+       int pcre2_set_match_limit(pcre2_match_context *mcontext,
+         uint32_t value);
 
-       Limiting the recursion depth limits the amount of system stack that can
-       be used, or, when PCRE2 has been compiled to use  memory  on  the  heap
-       instead  of the stack, the amount of heap memory that can be used. This
-       limit is not relevant, and is ignored, when matching is done using  JIT
-       compiled code or by the pcre2_dfa_match() function.
+       The  match_limit  parameter  provides  a means of preventing PCRE2 from
+       using up too many computing resources when processing patterns that are
+       not going to match, but which have a very large number of possibilities
+       in their search trees. The classic  example  is  a  pattern  that  uses
+       nested unlimited repeats.
+
+       There  is an internal counter in pcre2_match() that is incremented each
+       time round its main matching loop. If  this  value  reaches  the  match
+       limit, pcre2_match() returns the negative value PCRE2_ERROR_MATCHLIMIT.
+       This has the effect of limiting the amount  of  backtracking  that  can
+       take place. For patterns that are not anchored, the count restarts from
+       zero for each position in the subject string. This limit  also  applies
+       to pcre2_dfa_match(), though the counting is done in a different way.
+
+       When  pcre2_match() is called with a pattern that was successfully pro-
+       cessed by pcre2_jit_compile(), the way in which matching is executed is
+       entirely  different. However, there is still the possibility of runaway
+       matching that goes on for a very long  time,  and  so  the  match_limit
+       value  is  also used in this case (but in a different way) to limit how
+       long the matching can continue.
 
-       The  default  value for recursion_limit can be set when PCRE2 is built;
-       the default default is the same value as the default  for  match_limit.
-       If  the limit is exceeded, pcre2_match() returns PCRE2_ERROR_RECURSION-
-       LIMIT. A value for the recursion limit may also be supplied by an  item
-       at the start of a pattern of the form
+       The default value for the limit can be set when  PCRE2  is  built;  the
+       default  default  is 10 million, which handles all but the most extreme
+       cases. A value for the match limit may also be supplied by an  item  at
+       the start of a pattern of the form
 
-         (*LIMIT_RECURSION=ddd)
+         (*LIMIT_MATCH=ddd)
 
        where  ddd  is  a  decimal  number.  However, such a setting is ignored
-       unless ddd is less than the limit set by the  caller  of  pcre2_match()
-       or, if no such limit is set, less than the default.
+       unless ddd is less than the limit set by the caller of pcre2_match() or
+       pcre2_dfa_match() or, if no such limit is set, less than the default.
 
-       int pcre2_set_recursion_memory_management(
-         pcre2_match_context *mcontext,
-         void *(*private_malloc)(PCRE2_SIZE, void *),
-         void (*private_free)(void *, void *), void *memory_data);
+       int pcre2_set_depth_limit(pcre2_match_context *mcontext,
+         uint32_t value);
 
-       This function sets up two additional custom memory management functions
-       for use by pcre2_match() when PCRE2 is compiled to  use  the  heap  for
-       remembering backtracking data, instead of recursive function calls that
-       use the system stack. There is a discussion about PCRE2's  stack  usage
-       in  the  pcre2stack documentation. See the pcre2build documentation for
-       details of how to build PCRE2.
-
-       Using the heap for recursion is a non-standard way of  building  PCRE2,
-       for  use  in  environments  that  have  limited  stacks. Because of the
-       greater use of memory management, pcre2_match() runs more slowly. Func-
-       tions  that  are  different  to the general custom memory functions are
-       provided so that special-purpose external code can  be  used  for  this
-       case,  because  the memory blocks are all the same size. The blocks are
-       retained by pcre2_match() until it is about to exit so that they can be
-       re-used  when  possible during the match. In the absence of these func-
-       tions, the normal custom memory management functions are used, if  sup-
-       plied, otherwise the system functions.
+       This   parameter   limits   the   depth   of   nested  backtracking  in
+       pcre2_match().  Each time a nested backtracking point is passed, a  new
+       memory "frame" is used to remember the state of matching at that point.
+       Thus, this parameter indirectly limits the amount  of  memory  that  is
+       used  in  a  match.  However,  because  the size of each memory "frame"
+       depends on the number of capturing parentheses, the actual memory limit
+       varies  from pattern to pattern. This limit was more useful in versions
+       before 10.30, where function recursion was used for backtracking.
+
+       The depth limit is not relevant, and is ignored, when matching is  done
+       using JIT compiled code. However, it is supported by pcre2_dfa_match(),
+       which uses it to limit the depth of internal recursive  function  calls
+       that implement atomic groups, lookaround assertions, and pattern recur-
+       sions. This is, therefore, an indirect limit on the  amount  of  system
+       stack that is used. A recursive pattern such as /(.)(?1)/, when matched
+       to a very long string using pcre2_dfa_match(), can use a great deal  of
+       stack.
+
+       The  default  value for the depth limit can be set when PCRE2 is built;
+       the default default is the same value as  the  default  for  the  match
+       limit.  If  the  limit  is exceeded, pcre2_match() or pcre2_dfa_match()
+       returns PCRE2_ERROR_DEPTHLIMIT. A value for the depth limit may also be
+       supplied by an item at the start of a pattern of the form
+
+         (*LIMIT_DEPTH=ddd)
+
+       where  ddd  is  a  decimal  number.  However, such a setting is ignored
+       unless ddd is less than the limit set by the caller of pcre2_match() or
+       pcre2_dfa_match() or, if no such limit is set, less than the default.
 
 
 CHECKING BUILD-TIME OPTIONS
@@ -987,6 +1082,26 @@ CHECKING BUILD-TIME OPTIONS
        sequence;  a  value of PCRE2_BSR_ANYCRLF means that \R matches only CR,
        LF, or CRLF. The default can be overridden when a pattern is compiled.
 
+         PCRE2_CONFIG_COMPILED_WIDTHS
+
+       The output is a uint32_t integer whose lower bits indicate  which  code
+       unit  widths  were  selected  when PCRE2 was built. The 1-bit indicates
+       8-bit support, and the 2-bit and 4-bit indicate 16-bit and 32-bit  sup-
+       port, respectively.
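The bit layout just described can be decoded with a small helper. This is a sketch: has_width is a hypothetical name, and the widths value is assumed to have been obtained beforehand by calling pcre2_config() with PCRE2_CONFIG_COMPILED_WIDTHS.

```c
#include <assert.h>
#include <stdint.h>

/* 'widths' is the uint32_t output for PCRE2_CONFIG_COMPILED_WIDTHS, e.g.
     uint32_t widths;
     pcre2_config(PCRE2_CONFIG_COMPILED_WIDTHS, &widths);
   Returns 1 if the given code unit width (8, 16, or 32) was built,
   otherwise 0. Any other width yields 0. */
int has_width(uint32_t widths, int bits)
{
  switch (bits) {
    case 8:  return (widths & 1u) != 0;  /* the 1-bit: 8-bit support */
    case 16: return (widths & 2u) != 0;  /* the 2-bit: 16-bit support */
    case 32: return (widths & 4u) != 0;  /* the 4-bit: 32-bit support */
    default: return 0;
  }
}
```

For example, a value of 5 indicates that the 8-bit and 32-bit libraries were built, but not the 16-bit library.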
+
+         PCRE2_CONFIG_DEPTHLIMIT
+
+       The  output  is a uint32_t integer that gives the default limit for the
+       depth of nested backtracking in pcre2_match() or the  depth  of  nested
+       recursions  and  lookarounds  in pcre2_dfa_match(). Further details are
+       given with pcre2_set_depth_limit() above.
+
+         PCRE2_CONFIG_HEAPLIMIT
+
+       The output is a uint32_t integer that gives, in kilobytes, the  default
+       limit  for  the  amount  of  heap memory used by pcre2_match(). Further
+       details are given with pcre2_set_heap_limit() above.
+
          PCRE2_CONFIG_JIT
 
        The output is a uint32_t integer that is set  to  one  if  support  for
@@ -1021,9 +1136,9 @@ CHECKING BUILD-TIME OPTIONS
 
          PCRE2_CONFIG_MATCHLIMIT
 
-       The output is a uint32_t integer that gives the default limit  for  the
-       number  of  internal  matching function calls in a pcre2_match() execu-
-       tion. Further details are given with pcre2_match() below.
+       The output is a uint32_t integer that gives the default match limit for
+       pcre2_match().  Further  details are given with pcre2_set_match_limit()
+       above.
 
          PCRE2_CONFIG_NEWLINE
 
@@ -1036,10 +1151,17 @@ CHECKING BUILD-TIME OPTIONS
          PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
          PCRE2_NEWLINE_ANY      Any Unicode line ending
          PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+         PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 
        The default should normally correspond to  the  standard  sequence  for
        your operating system.
 
+         PCRE2_CONFIG_NEVER_BACKSLASH_C
+
+       The  output  is  a uint32_t integer that is set to one if the use of \C
+       was permanently disabled when PCRE2 was built; otherwise it is  set  to
+       zero.
+
          PCRE2_CONFIG_PARENSLIMIT
 
        The  output is a uint32_t integer that gives the maximum depth of nest-
@@ -1050,43 +1172,32 @@ CHECKING BUILD-TIME OPTIONS
        application. For  finer  control  over  compilation  stack  usage,  see
        pcre2_set_compile_recursion_guard().
 
-         PCRE2_CONFIG_RECURSIONLIMIT
-
-       The  output  is a uint32_t integer that gives the default limit for the
-       depth of recursion when calling the internal  matching  function  in  a
-       pcre2_match()  execution.  Further details are given with pcre2_match()
-       below.
-
          PCRE2_CONFIG_STACKRECURSE
 
-       The output is a uint32_t integer that is set to one if internal  recur-
-       sion  when  running  pcre2_match() is implemented by recursive function
-       calls that use the system stack to remember their state.  This  is  the
-       usual  way that PCRE2 is compiled. The output is zero if PCRE2 was com-
-       piled to use blocks of data on the heap instead of  recursive  function
-       calls.
+       This parameter is obsolete and should not be used in new code. The out-
+       put is a uint32_t integer that is always set to zero.
 
          PCRE2_CONFIG_UNICODE_VERSION
 
-       The  where  argument  should point to a buffer that is at least 24 code
-       units long.  (The  exact  length  required  can  be  found  by  calling
-       pcre2_config()  with  where  set  to  NULL.) If PCRE2 has been compiled
-       without Unicode support, the buffer is filled with  the  text  "Unicode
-       not  supported".  Otherwise,  the  Unicode version string (for example,
-       "8.0.0") is inserted. The number of code units used is  returned.  This
+       The where argument should point to a buffer that is at  least  24  code
+       units  long.  (The  exact  length  required  can  be  found  by calling
+       pcre2_config() with where set to NULL.)  If  PCRE2  has  been  compiled
+       without  Unicode  support,  the buffer is filled with the text "Unicode
+       not supported". Otherwise, the Unicode  version  string  (for  example,
+       "8.0.0")  is  inserted. The number of code units used is returned. This
        is the length of the string plus one unit for the terminating zero.
 
          PCRE2_CONFIG_UNICODE
 
-       The  output is a uint32_t integer that is set to one if Unicode support
-       is available; otherwise it is set to zero. Unicode support implies  UTF
+       The output is a uint32_t integer that is set to one if Unicode  support
+       is  available; otherwise it is set to zero. Unicode support implies UTF
        support.
 
          PCRE2_CONFIG_VERSION
 
-       The  where  argument  should point to a buffer that is at least 12 code
-       units long.  (The  exact  length  required  can  be  found  by  calling
-       pcre2_config()  with  where set to NULL.) The buffer is filled with the
+       The where argument should point to a buffer that is at  least  24  code
+       units  long.  (The  exact  length  required  can  be  found  by calling
+       pcre2_config() with where set to NULL.) The buffer is filled  with  the
        PCRE2 version string, zero-terminated. The number of code units used is
        returned. This is the length of the string plus one unit for the termi-
        nating zero.
@@ -1102,28 +1213,41 @@ COMPILING A PATTERN
 
        pcre2_code *pcre2_code_copy(const pcre2_code *code);
 
-       The pcre2_compile() function compiles a pattern into an internal  form.
-       The  pattern  is  defined  by a pointer to a string of code units and a
-       length. If the pattern is zero-terminated, the length can be  specified
-       as  PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of
-       memory that contains the compiled pattern and related data, or NULL  if
-       an error occurred.
-
-       If  the  compile context argument ccontext is NULL, memory for the com-
-       piled pattern  is  obtained  by  calling  malloc().  Otherwise,  it  is
-       obtained  from  the  same memory function that was used for the compile
-       context. The caller must free the memory by  calling  pcre2_code_free()
+       pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *code);
+
+       The  pcre2_compile() function compiles a pattern into an internal form.
+       The pattern is defined by a pointer to a string of  code  units  and  a
+       length  (in  code units). If the pattern is zero-terminated, the length
+       can be specified  as  PCRE2_ZERO_TERMINATED.  The  function  returns  a
+       pointer  to  a  block  of memory that contains the compiled pattern and
+       related data, or NULL if an error occurred.
+
+       If the compile context argument ccontext is NULL, memory for  the  com-
+       piled  pattern  is  obtained  by  calling  malloc().  Otherwise,  it is
+       obtained from the same memory function that was used  for  the  compile
+       context.  The  caller must free the memory by calling pcre2_code_free()
        when it is no longer needed.
 
        The function pcre2_code_copy() makes a copy of the compiled code in new
-       memory, using the same memory allocator as was used for  the  original.
-       However,  if  the  code  has  been  processed  by the JIT compiler (see
-       below), the JIT information cannot be copied (because it  is  position-
+       memory,  using  the same memory allocator as was used for the original.
+       However, if the code has  been  processed  by  the  JIT  compiler  (see
+       below),  the  JIT information cannot be copied (because it is position-
        dependent).  The new copy can initially be used only for non-JIT match-
-       ing, though it can be passed to pcre2_jit_compile()  if  required.  The
-       pcre2_code_copy()  function  provides a way for individual threads in a
-       multithreaded application to acquire a private copy of shared  compiled
-       code.
+       ing, though it can be passed to pcre2_jit_compile() if required.
+
+       The pcre2_code_copy() function provides a way for individual threads in
+       a multithreaded application to acquire a private copy  of  shared  com-
+       piled  code.   However, it does not make a copy of the character tables
+       used by the compiled pattern; the new pattern code points to  the  same
+       tables  as  the original code.  (See "Locale Support" below for details
+       of these character tables.) In many applications the  same  tables  are
+       used  throughout, so this behaviour is appropriate. Nevertheless, there
+       are occasions when a copy of a compiled pattern and the relevant tables
+       are  needed;  pcre2_code_copy_with_tables()  provides   this   facility.
+       Copies of both the code and the tables are  made,  with  the  new  code
+       pointing  to the new tables. The memory for the new tables is automati-
+       cally freed when pcre2_code_free() is called for the new  copy  of  the
+       compiled code.
 
        NOTE:  When  one  of  the matching functions is called, pointers to the
        compiled pattern and the subject string are set in the match data block
@@ -1141,33 +1265,46 @@ COMPILING A PATTERN
 
        For  those options that can be different in different parts of the pat-
        tern, the contents of the options argument specifies their settings  at
-       the  start  of  compilation.  The PCRE2_ANCHORED and PCRE2_NO_UTF_CHECK
-       options can be set at the time of matching as well as at compile time.
+       the  start  of  compilation. The PCRE2_ANCHORED, PCRE2_ENDANCHORED, and
+       PCRE2_NO_UTF_CHECK options can be set at the time of matching  as  well
+       as at compile time.
 
-       Other, less frequently required compile-time parameters  (for  example,
+       Other,  less  frequently required compile-time parameters (for example,
        the newline setting) can be provided in a compile context (as described
        above).
 
        If errorcode or erroroffset is NULL, pcre2_compile() returns NULL imme-
-       diately.  Otherwise,  the  variables to which these point are set to an
-       error code and an offset (number of code  units)  within  the  pattern,
-       respectively,  when  pcre2_compile() returns NULL because a compilation
+       diately. Otherwise, the variables to which these point are  set  to  an
+       error  code  and  an  offset (number of code units) within the pattern,
+       respectively, when pcre2_compile() returns NULL because  a  compilation
        error has occurred. The values are not defined when compilation is suc-
        cessful and pcre2_compile() returns a non-NULL value.
 
-       The  pcre2_get_error_message() function (see "Obtaining a textual error
-       message" below) provides a textual message for each error code.  Compi-
-       lation errors have positive error codes; UTF formatting error codes are
-       negative. For an invalid UTF-8 or UTF-16 string, the offset is that  of
-       the first code unit of the failing character.
-
-       Some  errors are not detected until the whole pattern has been scanned;
-       in these cases, the offset passed back is the length  of  the  pattern.
-       Note  that  the  offset is in code units, not characters, even in a UTF
+       There are nearly 100 positive  error  codes  that  pcre2_compile()  may
+       return  if  it finds an error in the pattern. There are also some nega-
+       tive error codes that are used for invalid UTF strings. These  are  the
+       same as given by pcre2_match() and pcre2_dfa_match(), and are described
+       in the pcre2unicode page. There is no separate  documentation  for  the
+       positive  error  codes,  because  the  textual  error messages that are
+       obtained  by  calling  the  pcre2_get_error_message()   function   (see
+       "Obtaining  a textual error message" below) should be self-explanatory.
+       Macro names starting with PCRE2_ERROR_ are defined  for  both  positive
+       and negative error codes in pcre2.h.
+
+       The value returned in erroroffset is an indication of where in the pat-
+       tern the error occurred. It is not necessarily the  furthest  point  in
+       the  pattern  that  was  read. For example, after the error "lookbehind
+       assertion is not fixed length", the error offset points to the start of
+       the  failing assertion. For an invalid UTF-8 or UTF-16 string, the off-
+       set is that of the first code unit of the failing character.
+
+       Some errors are not detected until the whole pattern has been  scanned;
+       in  these  cases,  the offset passed back is the length of the pattern.
+       Note that the offset is in code units, not characters, even  in  a  UTF
        mode. It may sometimes point into the middle of a UTF-8 or UTF-16 char-
        acter.
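
Because the offset counts code units, an application that reports a character
position to a user has to convert it. The following sketch does so for a
UTF-8 pattern. Both function names are hypothetical and are not provided by
PCRE2, and the pattern is assumed to be valid UTF-8 up to the offset.

```c
#include <stddef.h>

/* Count the UTF-8 characters that start before code unit `offset`.
   Bytes of the form 10xxxxxx are continuation bytes and do not start
   a character. */
size_t utf8_chars_before(const unsigned char *s, size_t offset)
{
  size_t count = 0;
  for (size_t i = 0; i < offset; i++)
    if ((s[i] & 0xC0) != 0x80) count++;
  return count;
}

/* Non-zero if `offset` falls on a continuation byte, that is, in the
   middle of a multi-byte character. */
int utf8_offset_is_mid_char(const unsigned char *s, size_t length,
  size_t offset)
{
  return offset < length && (s[offset] & 0xC0) == 0x80;
}
```

For 16-bit patterns the equivalent test would look for a low surrogate
instead of a continuation byte.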
 
-       This  code  fragment shows a typical straightforward call to pcre2_com-
+       This code fragment shows a typical straightforward call  to  pcre2_com-
        pile():
 
          pcre2_code *re;
@@ -1181,77 +1318,86 @@ COMPILING A PATTERN
            &erroffset,             /* for error offset */
            NULL);                  /* no compile context */
 
-       The following names for option bits are defined in the  pcre2.h  header
+       The  following  names for option bits are defined in the pcre2.h header
        file:
 
          PCRE2_ANCHORED
 
        If this bit is set, the pattern is forced to be "anchored", that is, it
-       is constrained to match only at the first matching point in the  string
-       that  is being searched (the "subject string"). This effect can also be
-       achieved by appropriate constructs in the pattern itself, which is  the
+       is  constrained to match only at the first matching point in the string
+       that is being searched (the "subject string"). This effect can also  be
+       achieved  by appropriate constructs in the pattern itself, which is the
        only way to do it in Perl.
 
          PCRE2_ALLOW_EMPTY_CLASS
 
-       By  default, for compatibility with Perl, a closing square bracket that
-       immediately follows an opening one is treated as a data  character  for
-       the  class.  When  PCRE2_ALLOW_EMPTY_CLASS  is  set,  it terminates the
+       By default, for compatibility with Perl, a closing square bracket  that
+       immediately  follows  an opening one is treated as a data character for
+       the class. When  PCRE2_ALLOW_EMPTY_CLASS  is  set,  it  terminates  the
        class, which therefore contains no characters and so can never match.
 
          PCRE2_ALT_BSUX
 
-       This option request alternative handling  of  three  escape  sequences,
-       which  makes  PCRE2's  behaviour more like ECMAscript (aka JavaScript).
+       This  option  requests alternative  handling of three escape sequences,
+       which makes PCRE2's behaviour more like  ECMAscript  (aka  JavaScript).
        When it is set:
 
        (1) \U matches an upper case "U" character; by default \U causes a com-
        pile time error (Perl uses \U to upper case subsequent characters).
 
        (2) \u matches a lower case "u" character unless it is followed by four
-       hexadecimal digits, in which case the hexadecimal  number  defines  the
-       code  point  to match. By default, \u causes a compile time error (Perl
+       hexadecimal  digits,  in  which case the hexadecimal number defines the
+       code point to match. By default, \u causes a compile time  error  (Perl
        uses it to upper case the following character).
 
-       (3) \x matches a lower case "x" character unless it is followed by  two
-       hexadecimal  digits,  in  which case the hexadecimal number defines the
-       code point to match. By default, as in Perl, a  hexadecimal  number  is
+       (3)  \x matches a lower case "x" character unless it is followed by two
+       hexadecimal digits, in which case the hexadecimal  number  defines  the
+       code  point  to  match. By default, as in Perl, a hexadecimal number is
        always expected after \x, but it may have zero, one, or two digits (so,
        for example, \xz matches a binary zero character followed by z).
 
          PCRE2_ALT_CIRCUMFLEX
 
        In  multiline  mode  (when  PCRE2_MULTILINE  is  set),  the  circumflex
-       metacharacter  matches at the start of the subject (unless PCRE2_NOTBOL
-       is set), and also after any internal  newline.  However,  it  does  not
+       metacharacter matches at the start of the subject (unless  PCRE2_NOTBOL
+       is  set),  and  also  after  any internal newline. However, it does not
        match after a newline at the end of the subject, for compatibility with
-       Perl. If you want a multiline circumflex also to match after  a  termi-
+       Perl.  If  you want a multiline circumflex also to match after a termi-
        nating newline, you must set PCRE2_ALT_CIRCUMFLEX.
 
          PCRE2_ALT_VERBNAMES
 
-       By  default, for compatibility with Perl, the name in any verb sequence
-       such as (*MARK:NAME) is  any  sequence  of  characters  that  does  not
-       include  a  closing  parenthesis. The name is not processed in any way,
-       and it is not possible to include a closing parenthesis  in  the  name.
-       However,  if  the  PCRE2_ALT_VERBNAMES  option is set, normal backslash
-       processing is applied to verb  names  and  only  an  unescaped  closing
-       parenthesis  terminates the name. A closing parenthesis can be included
-       in a name either as \) or between \Q  and  \E.  If  the  PCRE2_EXTENDED
-       option is set, unescaped whitespace in verb names is skipped and #-com-
-       ments are recognized, exactly as in the rest of the pattern.
+       By default, for compatibility with Perl, the name in any verb  sequence
+       such  as  (*MARK:NAME)  is  any  sequence  of  characters that does not
+       include a closing parenthesis. The name is not processed  in  any  way,
+       and  it  is  not possible to include a closing parenthesis in the name.
+       However, if the PCRE2_ALT_VERBNAMES option  is  set,  normal  backslash
+       processing  is  applied  to  verb  names  and only an unescaped closing
+       parenthesis terminates the name. A closing parenthesis can be  included
+       in  a  name either as \) or between \Q and \E. If the PCRE2_EXTENDED or
+       PCRE2_EXTENDED_MORE option is set, unescaped whitespace in  verb  names
+       is  skipped  and  #-comments are recognized in this mode, exactly as in
+       the rest of the pattern.
 
          PCRE2_AUTO_CALLOUT
 
        If this bit  is  set,  pcre2_compile()  automatically  inserts  callout
-       items, all with number 255, before each pattern item. For discussion of
-       the callout facility, see the pcre2callout documentation.
+       items,  all  with  number 255, before each pattern item, except immedi-
+       ately before or after an explicit callout in the pattern.  For  discus-
+       sion of the callout facility, see the pcre2callout documentation.
 
          PCRE2_CASELESS
 
-       If this bit is set, letters in the pattern match both upper  and  lower
-       case  letters in the subject. It is equivalent to Perl's /i option, and
-       it can be changed within a pattern by a (?i) option setting.
+       If  this  bit is set, letters in the pattern match both upper and lower
+       case letters in the subject. It is equivalent to Perl's /i option,  and
+       it  can  be  changed  within  a  pattern  by  a (?i) option setting. If
+       PCRE2_UTF is set, Unicode properties are used for all  characters  with
+       more  than one other case, and for all characters whose code points are
+       greater than U+007f. For lower valued characters with  only  one  other
+       case,  a  lookup  table is used for speed. When PCRE2_UTF is not set, a
+       lookup table is used for all code points less than 256, and higher code
+       points  (available  only  in  16-bit or 32-bit mode) are treated as not
+       having another case.
 
          PCRE2_DOLLAR_ENDONLY
 
@@ -1281,178 +1427,229 @@ COMPILING A PATTERN
        matched.  There  are  more details of named subpatterns below; see also
        the pcre2pattern documentation.
 
+         PCRE2_ENDANCHORED
+
+       If this bit is set, the end of any pattern match must be right  at  the
+       end of the string being searched (the "subject string"). If the pattern
+       match succeeds by reaching (*ACCEPT), but does not reach the end of the
+       subject,  the match fails at the current starting point. For unanchored
+       patterns, a new match is then tried at the next  starting  point.  How-
+       ever, if the match succeeds by reaching the end of the pattern, but not
+       the end of the subject, backtracking occurs and  an  alternative  match
+       may be found. Consider these two patterns:
+
+         .(*ACCEPT)|..
+         .|..
+
+       If  matched against "abc" with PCRE2_ENDANCHORED set, the first matches
+       "c" whereas the second matches "bc". The  effect  of  PCRE2_ENDANCHORED
+       can  also  be achieved by appropriate constructs in the pattern itself,
+       which is the only way to do it in Perl.
+
+       For DFA matching with pcre2_dfa_match(), PCRE2_ENDANCHORED applies only
+       to  the  first  (that  is,  the longest) matched string. Other parallel
+       matches, which are necessarily substrings of the first one, must  obvi-
+       ously end before the end of the subject.
+
          PCRE2_EXTENDED
 
-       If this bit is set, most white space  characters  in  the  pattern  are
-       totally  ignored  except when escaped or inside a character class. How-
-       ever, white space is not allowed within  sequences  such  as  (?>  that
+       If  this  bit  is  set,  most white space characters in the pattern are
+       totally ignored except when escaped or inside a character  class.  How-
+       ever,  white  space  is  not  allowed within sequences such as (?> that
        introduce various parenthesized subpatterns, nor within numerical quan-
-       tifiers such as {1,3}.  Ignorable white space is permitted  between  an
-       item  and a following quantifier and between a quantifier and a follow-
+       tifiers  such  as {1,3}.  Ignorable white space is permitted between an
+       item and a following quantifier and between a quantifier and a  follow-
        ing + that indicates possessiveness.
 
-       PCRE2_EXTENDED also causes characters between an unescaped # outside  a
-       character  class  and the next newline, inclusive, to be ignored, which
+       PCRE2_EXTENDED  also causes characters between an unescaped # outside a
+       character class and the next newline, inclusive, to be  ignored,  which
        makes it possible to include comments inside complicated patterns. Note
-       that  the  end of this type of comment is a literal newline sequence in
+       that the end of this type of comment is a literal newline  sequence  in
        the pattern; escape sequences that happen to represent a newline do not
-       count.  PCRE2_EXTENDED is equivalent to Perl's /x option, and it can be
+       count. PCRE2_EXTENDED is equivalent to Perl's /x option, and it can  be
        changed within a pattern by a (?x) option setting.
 
        Which characters are interpreted as newlines can be specified by a set-
-       ting  in  the compile context that is passed to pcre2_compile() or by a
-       special sequence at the start of the pattern, as described in the  sec-
-       tion  entitled "Newline conventions" in the pcre2pattern documentation.
+       ting in the compile context that is passed to pcre2_compile() or  by  a
+       special  sequence at the start of the pattern, as described in the sec-
+       tion entitled "Newline conventions" in the pcre2pattern  documentation.
        A default is defined when PCRE2 is built.
 
+         PCRE2_EXTENDED_MORE
+
+       This  option  has  the  effect  of  PCRE2_EXTENDED,  but,  in addition,
+       unescaped space and horizontal tab  characters  are  ignored  inside  a
+       character  class.  PCRE2_EXTENDED_MORE is equivalent to Perl 5.26's /xx
+       option, and it can be changed within a pattern by a (?xx)  option  set-
+       ting.
+
          PCRE2_FIRSTLINE
 
-       If this option is set, an  unanchored  pattern  is  required  to  match
-       before  or  at  the  first  newline  in  the subject string, though the
-       matched text may continue over the  newline.  See  also  PCRE2_USE_OFF-
-       SET_LIMIT,   which  provides  a  more  general  limiting  facility.  If
-       PCRE2_FIRSTLINE is set with an offset limit, a match must occur in  the
-       first  line and also within the offset limit. In other words, whichever
-       limit comes first is used.
+       If this option is set, the start of an unanchored pattern match must be
+       before or at the first newline in  the  subject  string  following  the
+       start  of  matching, though the matched text may continue over the new-
+       line. If startoffset is non-zero, the limiting newline is not necessar-
+       ily  the  first  newline  in  the  subject. For example, if the subject
+       string is "abc\nxyz" (where \n represents a single-character newline) a
+       pattern  match for "yz" succeeds with PCRE2_FIRSTLINE if startoffset is
+       greater than 3. See also PCRE2_USE_OFFSET_LIMIT, which provides a  more
+       general  limiting  facility.  If  PCRE2_FIRSTLINE is set with an offset
+       limit, a match must occur in the first line and also within the  offset
+       limit. In other words, whichever limit comes first is used.
+
+         PCRE2_LITERAL
+
+       If this option is set, all meta-characters in the pattern are disabled,
+       and it is treated as a literal string. Matching literal strings with  a
+       regular expression engine is not the most efficient way of doing it. If
+       you are doing a lot of literal matching and  are  worried  about  effi-
+       ciency, you should consider using other approaches. The only other main
+       options  that  are  allowed  with  PCRE2_LITERAL  are:  PCRE2_ANCHORED,
+       PCRE2_ENDANCHORED, PCRE2_AUTO_CALLOUT, PCRE2_CASELESS, PCRE2_FIRSTLINE,
+       PCRE2_NO_START_OPTIMIZE,     PCRE2_NO_UTF_CHECK,     PCRE2_UTF,     and
+       PCRE2_USE_OFFSET_LIMIT.  The  extra  options PCRE2_EXTRA_MATCH_LINE and
+       PCRE2_EXTRA_MATCH_WORD are also supported. Any other options  cause  an
+       error.
 
          PCRE2_MATCH_UNSET_BACKREF
 
-       If this option is set, a back reference to an  unset  subpattern  group
-       matches  an  empty  string (by default this causes the current matching
-       alternative to fail).  A pattern such as  (\1)(a)  succeeds  when  this
-       option  is set (assuming it can find an "a" in the subject), whereas it
-       fails by default, for Perl compatibility.  Setting  this  option  makes
+       If  this  option  is set, a back reference to an unset subpattern group
+       matches an empty string (by default this causes  the  current  matching
+       alternative  to  fail).   A  pattern such as (\1)(a) succeeds when this
+       option is set (assuming it can find an "a" in the subject), whereas  it
+       fails  by  default,  for  Perl compatibility. Setting this option makes
        PCRE2 behave more like ECMAscript (aka JavaScript).
 
          PCRE2_MULTILINE
 
-       By  default,  for  the purposes of matching "start of line" and "end of
-       line", PCRE2 treats the subject string as consisting of a  single  line
-       of  characters,  even  if  it actually contains newlines. The "start of
-       line" metacharacter (^) matches only at the start of  the  string,  and
-       the  "end  of  line"  metacharacter  ($) matches only at the end of the
+       By default, for the purposes of matching "start of line"  and  "end  of
+       line",  PCRE2  treats the subject string as consisting of a single line
+       of characters, even if it actually contains  newlines.  The  "start  of
+       line"  metacharacter  (^)  matches only at the start of the string, and
+       the "end of line" metacharacter ($) matches only  at  the  end  of  the
        string,  or  before  a  terminating  newline  (except  when  PCRE2_DOL-
-       LAR_ENDONLY  is  set).  Note, however, that unless PCRE2_DOTALL is set,
+       LAR_ENDONLY is set). Note, however, that unless  PCRE2_DOTALL  is  set,
        the "any character" metacharacter (.) does not match at a newline. This
        behaviour (for ^, $, and dot) is the same as Perl.
 
-       When  PCRE2_MULTILINE  it is set, the "start of line" and "end of line"
-       constructs match immediately following or immediately  before  internal
-       newlines  in  the  subject string, respectively, as well as at the very
-       start and end. This is equivalent to Perl's /m option, and  it  can  be
+       When  PCRE2_MULTILINE  is  set, the "start of line" and "end  of  line"
+       constructs  match  immediately following or immediately before internal
+       newlines in the subject string, respectively, as well as  at  the  very
+       start  and  end.  This is equivalent to Perl's /m option, and it can be
        changed within a pattern by a (?m) option setting. Note that the "start
        of line" metacharacter does not match after a newline at the end of the
-       subject,  for compatibility with Perl.  However, you can change this by
-       setting the PCRE2_ALT_CIRCUMFLEX option. If there are no newlines in  a
-       subject  string,  or  no  occurrences  of  ^ or $ in a pattern, setting
+       subject, for compatibility with Perl.  However, you can change this  by
+       setting  the PCRE2_ALT_CIRCUMFLEX option. If there are no newlines in a
+       subject string, or no occurrences of ^  or  $  in  a  pattern,  setting
        PCRE2_MULTILINE has no effect.
 
          PCRE2_NEVER_BACKSLASH_C
 
-       This option locks out the use of \C in the pattern that is  being  com-
-       piled.   This  escape  can  cause  unpredictable  behaviour in UTF-8 or
-       UTF-16 modes, because it may leave the current matching  point  in  the
-       middle  of  a  multi-code-unit  character. This option may be useful in
-       applications that process patterns from  external  sources.  Note  that
+       This  option  locks out the use of \C in the pattern that is being com-
+       piled.  This escape can  cause  unpredictable  behaviour  in  UTF-8  or
+       UTF-16  modes,  because  it may leave the current matching point in the
+       middle of a multi-code-unit character. This option  may  be  useful  in
+       applications  that  process  patterns  from external sources. Note that
        there is also a build-time option that permanently locks out the use of
        \C.
 
          PCRE2_NEVER_UCP
 
-       This option locks out the use of Unicode properties  for  handling  \B,
+       This  option  locks  out the use of Unicode properties for handling \B,
        \b, \D, \d, \S, \s, \W, \w, and some of the POSIX character classes, as
-       described for the PCRE2_UCP option below. In  particular,  it  prevents
-       the  creator of the pattern from enabling this facility by starting the
-       pattern with (*UCP). This option may be  useful  in  applications  that
+       described  for  the  PCRE2_UCP option below. In particular, it prevents
+       the creator of the pattern from enabling this facility by starting  the
+       pattern  with  (*UCP).  This  option may be useful in applications that
        process patterns from external sources. The option combination
        PCRE2_UCP and PCRE2_NEVER_UCP causes an error.
 
          PCRE2_NEVER_UTF
 
-       This option locks out interpretation of the pattern as  UTF-8,  UTF-16,
+       This  option  locks out interpretation of the pattern as UTF-8, UTF-16,
        or UTF-32, depending on which library is in use. In particular, it pre-
-       vents the creator of the pattern from switching to  UTF  interpretation
-       by  starting  the  pattern  with  (*UTF).  This option may be useful in
-       applications that process patterns from external sources. The  combina-
+       vents  the  creator of the pattern from switching to UTF interpretation
+       by starting the pattern with (*UTF).  This  option  may  be  useful  in
+       applications  that process patterns from external sources. The combina-
        tion of PCRE2_UTF and PCRE2_NEVER_UTF causes an error.
 
          PCRE2_NO_AUTO_CAPTURE
 
        If this option is set, it disables the use of numbered capturing paren-
-       theses in the pattern. Any opening parenthesis that is not followed  by
-       ?  behaves as if it were followed by ?: but named parentheses can still
-       be used for capturing (and they acquire  numbers  in  the  usual  way).
-       There  is  no  equivalent  of  this  option in Perl. Note that, if this
-       option is set, references  to  capturing  groups  (back  references  or
-       recursion/subroutine  calls) may only refer to named groups, though the
-       reference can be by name or by number.
+       theses  in the pattern. Any opening parenthesis that is not followed by
+       ? behaves as if it were followed by ?: but named parentheses can  still
+       be used for capturing (and they acquire numbers in the usual way). This
+       is the same as Perl's /n option.  Note that, when this option  is  set,
+       references to capturing groups (back references or recursion/subroutine
+       calls) may only refer to named groups, though the reference can  be  by
+       name or by number.
 
          PCRE2_NO_AUTO_POSSESS
 
        If this option is set, it disables "auto-possessification", which is an
-       optimization  that,  for example, turns a+b into a++b in order to avoid
-       backtracks into a+ that can never be successful. However,  if  callouts
-       are  in  use,  auto-possessification means that some callouts are never
+       optimization that, for example, turns a+b into a++b in order  to  avoid
+       backtracks  into  a+ that can never be successful. However, if callouts
+       are in use, auto-possessification means that some  callouts  are  never
        taken. You can set this option if you want the matching functions to do
-       a  full  unoptimized  search and run all the callouts, but it is mainly
+       a full unoptimized search and run all the callouts, but  it  is  mainly
        provided for testing purposes.
 
          PCRE2_NO_DOTSTAR_ANCHOR
 
        If this option is set, it disables an optimization that is applied when
-       .*  is  the  first significant item in a top-level branch of a pattern,
-       and all the other branches also start with .* or with \A or  \G  or  ^.
-       The  optimization  is  automatically disabled for .* if it is inside an
-       atomic group or a capturing group that is the subject of a back  refer-
-       ence,  or  if  the pattern contains (*PRUNE) or (*SKIP). When the opti-
-       mization is not disabled, such a pattern is automatically  anchored  if
+       .* is the first significant item in a top-level branch  of  a  pattern,
+       and  all  the  other branches also start with .* or with \A or \G or ^.
+       The optimization is automatically disabled for .* if it  is  inside  an
+       atomic  group or a capturing group that is the subject of a back refer-
+       ence, or if the pattern contains (*PRUNE) or (*SKIP).  When  the  opti-
+       mization  is  not disabled, such a pattern is automatically anchored if
        PCRE2_DOTALL is set for all the .* items and PCRE2_MULTILINE is not set
-       for any ^ items. Otherwise, the fact that any match must  start  either
-       at  the start of the subject or following a newline is remembered. Like
+       for  any  ^ items. Otherwise, the fact that any match must start either
+       at the start of the subject or following a newline is remembered.  Like
        other optimizations, this can cause callouts to be skipped.
 
          PCRE2_NO_START_OPTIMIZE
 
-       This is an option whose main effect is at matching time.  It  does  not
+       This  is  an  option whose main effect is at matching time. It does not
        change what pcre2_compile() generates, but it does affect the output of
        the JIT compiler.
 
-       There are a number of optimizations that may occur at the  start  of  a
-       match,  in  order  to speed up the process. For example, if it is known
-       that an unanchored match must start  with  a  specific  character,  the
-       matching  code searches the subject for that character, and fails imme-
-       diately if it cannot find it, without actually running the main  match-
-       ing  function.  This means that a special item such as (*COMMIT) at the
-       start of a pattern is not considered until after  a  suitable  starting
-       point  for  the  match  has  been found. Also, when callouts or (*MARK)
-       items are in use, these "start-up" optimizations can cause them  to  be
-       skipped  if  the pattern is never actually used. The start-up optimiza-
-       tions are in effect a pre-scan of the subject that takes  place  before
+       There  are  a  number of optimizations that may occur at the start of a
+       match, in order to speed up the process. For example, if  it  is  known
+       that  an  unanchored  match must start with a specific code unit value,
+       the matching code searches the subject for that value, and fails  imme-
+       diately  if it cannot find it, without actually running the main match-
+       ing function. This means that a special item such as (*COMMIT)  at  the
+       start  of  a  pattern is not considered until after a suitable starting
+       point for the match has been found.  Also,  when  callouts  or  (*MARK)
+       items  are  in use, these "start-up" optimizations can cause them to be
+       skipped if the pattern is never actually used. The  start-up  optimiza-
+       tions  are  in effect a pre-scan of the subject that takes place before
        the pattern is run.
 
        The PCRE2_NO_START_OPTIMIZE option disables the start-up optimizations,
-       possibly causing performance to suffer,  but  ensuring  that  in  cases
-       where  the  result is "no match", the callouts do occur, and that items
+       possibly  causing  performance  to  suffer,  but ensuring that in cases
+       where the result is "no match", the callouts do occur, and  that  items
        such as (*COMMIT) and (*MARK) are considered at every possible starting
        position in the subject string.
 
-       Setting  PCRE2_NO_START_OPTIMIZE  may  change the outcome of a matching
+       Setting PCRE2_NO_START_OPTIMIZE may change the outcome  of  a  matching
        operation.  Consider the pattern
 
          (*COMMIT)ABC
 
-       When this is compiled, PCRE2 records the fact that a match  must  start
-       with  the  character  "A".  Suppose the subject string is "DEFABC". The
-       start-up optimization scans along the subject, finds "A" and  runs  the
-       first  match attempt from there. The (*COMMIT) item means that the pat-
-       tern must match the current starting position, which in this  case,  it
-       does.  However,  if  the same match is run with PCRE2_NO_START_OPTIMIZE
-       set, the initial scan along the subject string  does  not  happen.  The
-       first  match  attempt  is  run  starting  from "D" and when this fails,
-       (*COMMIT) prevents any further matches  being  tried,  so  the  overall
-       result is "no match". There are also other start-up optimizations.  For
-       example, a minimum length for the subject may be recorded. Consider the
-       pattern
+       When  this  is compiled, PCRE2 records the fact that a match must start
+       with the character "A". Suppose the subject  string  is  "DEFABC".  The
+       start-up  optimization  scans along the subject, finds "A" and runs the
+       first match attempt from there. The (*COMMIT) item means that the  pat-
+       tern  must  match the current starting position, which in this case, it
+       does. However, if the same match is  run  with  PCRE2_NO_START_OPTIMIZE
+       set,  the  initial  scan  along the subject string does not happen. The
+       first match attempt is run starting  from  "D"  and  when  this  fails,
+       (*COMMIT)  prevents  any  further  matches  being tried, so the overall
+       result is "no match".
+
+       There are also other start-up optimizations.  For  example,  a  minimum
+       length for the subject may be recorded. Consider the pattern
 
          (*MARK:A)(X|Y)
 
@@ -1469,63 +1666,133 @@ COMPILING A PATTERN
        When  PCRE2_UTF  is set, the validity of the pattern as a UTF string is
        automatically checked. There are  discussions  about  the  validity  of
        UTF-8  strings,  UTF-16 strings, and UTF-32 strings in the pcre2unicode
-       document.  If an invalid UTF sequence is found, pcre2_compile() returns
+       document. If an invalid UTF sequence is found, pcre2_compile()  returns
        a negative error code.
 
-       If you know that your pattern is valid, and you want to skip this check
-       for performance reasons, you can  set  the  PCRE2_NO_UTF_CHECK  option.
-       When  it  is set, the effect of passing an invalid UTF string as a pat-
-       tern is undefined. It may cause your program to  crash  or  loop.  Note
-       that   this   option   can   also   be   passed  to  pcre2_match()  and
-       pcre_dfa_match(), to suppress validity checking of the subject string.
+       If  you  know  that your pattern is a valid UTF string, and you want to
+       skip  this  check  for   performance   reasons,   you   can   set   the
+       PCRE2_NO_UTF_CHECK  option.  When  it  is set, the effect of passing an
+       invalid UTF string as a pattern is undefined. It may cause your program
+       to crash or loop.
+
+       Note  that  this  option  can  also  be  passed  to  pcre2_match()  and
+       pcre2_dfa_match(), to suppress UTF validity  checking  of  the  subject
+       string.
+
+       Note also that setting PCRE2_NO_UTF_CHECK at compile time does not dis-
+       able the error that is given if an escape sequence for an invalid  Uni-
+       code  code  point is encountered in the pattern. In particular, the so-
+       called "surrogate" code points (0xd800 to 0xdfff) are invalid.  If  you
+       want  to  allow  escape  sequences  such  as  \x{d800}  you can set the
+       PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra option, as described  in  the
+       section  entitled "Extra compile options" below.  However, this is pos-
+       sible only in UTF-8 and UTF-32 modes, because these values are not rep-
+       resentable in UTF-16.
 
          PCRE2_UCP
 
        This option changes the way PCRE2 processes \B, \b, \D, \d, \S, \s, \W,
-       \w,  and  some  of  the POSIX character classes. By default, only ASCII
-       characters are recognized, but if PCRE2_UCP is set, Unicode  properties
-       are  used instead to classify characters. More details are given in the
+       \w, and some of the POSIX character classes.  By  default,  only  ASCII
+       characters  are recognized, but if PCRE2_UCP is set, Unicode properties
+       are used instead to classify characters. More details are given in  the
        section on generic character types in the pcre2pattern page. If you set
-       PCRE2_UCP,  matching one of the items it affects takes much longer. The
-       option is available only if PCRE2 has been compiled with  Unicode  sup-
-       port.
+       PCRE2_UCP, matching one of the items it affects takes much longer.  The
+       option  is  available only if PCRE2 has been compiled with Unicode sup-
+       port (which is the default).
 
          PCRE2_UNGREEDY
 
-       This  option  inverts  the "greediness" of the quantifiers so that they
-       are not greedy by default, but become greedy if followed by "?". It  is
-       not  compatible  with Perl. It can also be set by a (?U) option setting
+       This option inverts the "greediness" of the quantifiers  so  that  they
+       are  not greedy by default, but become greedy if followed by "?". It is
+       not compatible with Perl. It can also be set by a (?U)  option  setting
        within the pattern.
 
          PCRE2_USE_OFFSET_LIMIT
 
        This option must be set for pcre2_compile() if pcre2_set_offset_limit()
-       is  going  to be used to set a non-default offset limit in a match con-
-       text for matches that use this pattern. An error  is  generated  if  an
-       offset  limit  is  set  without  this option. For more details, see the
-       description of pcre2_set_offset_limit() in the section  that  describes
+       is going to be used to set a non-default offset limit in a  match  con-
+       text  for  matches  that  use this pattern. An error is generated if an
+       offset limit is set without this option.  For  more  details,  see  the
+       description  of  pcre2_set_offset_limit() in the section that describes
        match contexts. See also the PCRE2_FIRSTLINE option above.
 
          PCRE2_UTF
 
-       This  option  causes  PCRE2  to regard both the pattern and the subject
-       strings that are subsequently processed as strings  of  UTF  characters
-       instead  of  single-code-unit  strings.  It  is available when PCRE2 is
-       built to include Unicode support (which is  the  default).  If  Unicode
-       support  is  not  available,  the use of this option provokes an error.
-       Details of how this option changes the behaviour of PCRE2 are given  in
+       This option causes PCRE2 to regard both the  pattern  and  the  subject
+       strings  that  are  subsequently processed as strings of UTF characters
+       instead of single-code-unit strings. It  is  available  when  PCRE2  is
+       built  to  include  Unicode  support (which is the default). If Unicode
+       support is not available, the use of this  option  provokes  an  error.
+       Details  of  how  PCRE2_UTF changes the behaviour of PCRE2 are given in
        the pcre2unicode page.
 
-
-COMPILATION ERROR CODES
-
-       There  are over 80 positive error codes that pcre2_compile() may return
-       (via errorcode) if it finds an error in the  pattern.  There  are  also
-       some  negative error codes that are used for invalid UTF strings. These
-       are the same as given by pcre2_match() and pcre2_dfa_match(),  and  are
-       described in the pcre2unicode page. The pcre2_get_error_message() func-
-       tion (see "Obtaining a textual error message" below) can be  called  to
-       obtain a textual error message from any error code.
+   Extra compile options
+
+       Unlike the main compile-time options, the extra options are  not  saved
+       with the compiled pattern. The option bits that can be set in a compile
+       context by calling the pcre2_set_compile_extra_options()  function  are
+       as follows:
+
+         PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
+
+       This  option  applies when compiling a pattern in UTF-8 or UTF-32 mode.
+       It is forbidden in UTF-16 mode, and ignored in non-UTF  modes.  Unicode
+       "surrogate" code points in the range 0xd800 to 0xdfff are used in pairs
+       in UTF-16 to encode code points with values in  the  range  0x10000  to
+       0x10ffff.  The  surrogates  cannot  therefore be represented in UTF-16.
+       They can be represented in UTF-8 and UTF-32, but are defined as invalid
+       code  points,  and  cause  errors  if  encountered in a UTF-8 or UTF-32
+       string that is being checked for validity by PCRE2.
+
+       These values also cause errors if encountered in escape sequences  such
+       as \x{d912} within a pattern. However, it seems that some applications,
+       when using PCRE2 to check for unwanted  characters  in  UTF-8  strings,
+       explicitly   test  for  the  surrogates  using  escape  sequences.  The
+       PCRE2_NO_UTF_CHECK option does  not  disable  the  error  that  occurs,
+       because  it applies only to the testing of input strings for UTF valid-
+       ity.
+
+       If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set,  surro-
+       gate  code  point values in UTF-8 and UTF-32 patterns no longer provoke
+       errors and are incorporated in the compiled pattern. However, they  can
+       only  match  subject characters if the matching function is called with
+       PCRE2_NO_UTF_CHECK set.
+
+         PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
+
+       This is a dangerous option. Use with care. By default, an  unrecognized
+       escape  such  as \j or a malformed one such as \x{2z} causes a compile-
+       time error when detected by pcre2_compile(). Perl is somewhat inconsis-
+       tent  in  handling  such items: for example, \j is treated as a literal
+       "j", and non-hexadecimal digits in \x{} are just ignored, though  warn-
+       ings  are given in both cases if Perl's warning switch is enabled. How-
+       ever, a malformed octal number after \o{  always  causes  an  error  in
+       Perl.
+
+       If  the  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL  extra  option  is passed to
+       pcre2_compile(), all unrecognized or  erroneous  escape  sequences  are
+       treated  as  single-character escapes. For example, \j is a literal "j"
+       and \x{2z} is treated as  the  literal  string  "x{2z}".  Setting  this
+       option  means  that  typos in patterns may go undetected and have unex-
+       pected results. This is a dangerous option. Use with care.
+
+         PCRE2_EXTRA_MATCH_LINE
+
+       This option is provided for use by  the  -x  option  of  pcre2grep.  It
+       causes  the  pattern  only to match complete lines. This is achieved by
+       automatically inserting the code for "^(?:" at the start  of  the  com-
+       piled  pattern  and ")$" at the end. Thus, when PCRE2_MULTILINE is set,
+       the matched line may be in the  middle  of  the  subject  string.  This
+       option can be used with PCRE2_LITERAL.
+
+         PCRE2_EXTRA_MATCH_WORD
+
+       This  option  is  provided  for  use  by the -w option of pcre2grep. It
+       causes the pattern only to match strings that have a word  boundary  at
+       the  start and the end. This is achieved by automatically inserting the
+       code for "\b(?:" at the start of the compiled pattern and ")\b" at  the
+       end.  The option may be used with PCRE2_LITERAL. However, it is ignored
+       if PCRE2_EXTRA_MATCH_LINE is also set.
 
 
 JUST-IN-TIME (JIT) COMPILATION
@@ -1547,53 +1814,53 @@ JUST-IN-TIME (JIT) COMPILATION
 
        void pcre2_jit_stack_free(pcre2_jit_stack *jit_stack);
 
-       These  functions  provide  support  for  JIT compilation, which, if the
-       just-in-time compiler is available, further processes a  compiled  pat-
+       These functions provide support for  JIT  compilation,  which,  if  the
+       just-in-time  compiler  is available, further processes a compiled pat-
        tern into machine code that executes much faster than the pcre2_match()
-       interpretive matching function. Full details are given in the  pcre2jit
+       interpretive  matching function. Full details are given in the pcre2jit
        documentation.
 
-       JIT  compilation  is  a heavyweight optimization. It can take some time
-       for patterns to be analyzed, and for one-off matches  and  simple  pat-
-       terns  the benefit of faster execution might be offset by a much slower
-       compilation time.  Most, but not all patterns can be optimized  by  the
+       JIT compilation is a heavyweight optimization. It can  take  some  time
+       for  patterns  to  be analyzed, and for one-off matches and simple pat-
+       terns the benefit of faster execution might be offset by a much  slower
+       compilation  time.  Most (but not all) patterns can be optimized by the
        JIT compiler.
 
 
 LOCALE SUPPORT
 
-       PCRE2  handles caseless matching, and determines whether characters are
-       letters, digits, or whatever, by reference to a set of tables,  indexed
-       by  character  code  point.  This applies only to characters whose code
-       points are less than 256. By default, higher-valued code  points  never
-       match  escapes  such  as \w or \d.  However, if PCRE2 is built with UTF
-       support, all characters can be tested with  \p  and  \P,  or,  alterna-
-       tively,  the  PCRE2_UCP  option  can be set when a pattern is compiled;
-       this causes \w and friends to use Unicode property support  instead  of
+       PCRE2 handles caseless matching, and determines whether characters  are
+       letters,  digits, or whatever, by reference to a set of tables, indexed
+       by character code point. This applies only  to  characters  whose  code
+       points  are  less than 256. By default, higher-valued code points never
+       match escapes such as \w or \d.  However, if PCRE2 is built  with  Uni-
+       code support, all characters can be tested with \p and \P, or, alterna-
+       tively, the PCRE2_UCP option can be set when  a  pattern  is  compiled;
+       this  causes  \w and friends to use Unicode property support instead of
        the built-in tables.
 
-       The  use  of  locales  with Unicode is discouraged. If you are handling
-       characters with code points greater than 128,  you  should  either  use
+       The use of locales with Unicode is discouraged.  If  you  are  handling
+       characters  with  code  points  greater than 128, you should either use
        Unicode support, or use locales, but not try to mix the two.
 
-       PCRE2  contains  an  internal  set of character tables that are used by
-       default.  These are sufficient for  many  applications.  Normally,  the
+       PCRE2 contains an internal set of character tables  that  are  used  by
+       default.   These  are  sufficient  for many applications. Normally, the
        internal tables recognize only ASCII characters. However, when PCRE2 is
        built, it is possible to cause the internal tables to be rebuilt in the
        default "C" locale of the local system, which may cause them to be dif-
        ferent.
 
-       The internal tables can be overridden by tables supplied by the  appli-
-       cation  that  calls  PCRE2.  These may be created in a different locale
-       from the default.  As more and more applications change to  using  Uni-
+       The  internal tables can be overridden by tables supplied by the appli-
+       cation that calls PCRE2. These may be created  in  a  different  locale
+       from  the  default.  As more and more applications change to using Uni-
        code, the need for this locale support is expected to die away.
 
-       External  tables  are built by calling the pcre2_maketables() function,
-       in the relevant locale. The result can be passed to pcre2_compile()  as
-       often   as  necessary,  by  creating  a  compile  context  and  calling
-       pcre2_set_character_tables() to set the  tables  pointer  therein.  For
-       example,  to  build  and use tables that are appropriate for the French
-       locale (where accented characters with  values  greater  than  128  are
+       External tables are built by calling the  pcre2_maketables()  function,
+       in  the relevant locale. The result can be passed to pcre2_compile() as
+       often  as  necessary,  by  creating  a  compile  context  and   calling
+       pcre2_set_character_tables()  to  set  the  tables pointer therein. For
+       example, to build and use tables that are appropriate  for  the  French
+       locale  (where  accented  characters  with  values greater than 128 are
        treated as letters), the following code could be used:
 
          setlocale(LC_CTYPE, "fr_FR");
@@ -1602,15 +1869,15 @@ LOCALE SUPPORT
          pcre2_set_character_tables(ccontext, tables);
          re = pcre2_compile(..., ccontext);
 
-       The  locale  name "fr_FR" is used on Linux and other Unix-like systems;
-       if you are using Windows, the name for the French locale  is  "french".
-       It  is the caller's responsibility to ensure that the memory containing
+       The locale name "fr_FR" is used on Linux and other  Unix-like  systems;
+       if  you  are using Windows, the name for the French locale is "french".
+       It is the caller's responsibility to ensure that the memory  containing
        the tables remains available for as long as it is needed.
 
        The pointer that is passed (via the compile context) to pcre2_compile()
-       is  saved  with  the  compiled pattern, and the same tables are used by
-       pcre2_match() and pcre_dfa_match(). Thus, for any single pattern,  com-
-       pilation,  and  matching  all  happen in the same locale, but different
+       is saved with the compiled pattern, and the same  tables  are  used  by
+       pcre2_match() and pcre2_dfa_match(). Thus, for any single pattern,
+       compilation and matching both happen in the same locale, but different
        patterns can be processed in different locales.
 
 
@@ -1618,14 +1885,14 @@ INFORMATION ABOUT A COMPILED PATTERN
 
        int pcre2_pattern_info(const pcre2 *code, uint32_t what, void *where);
 
-       The pcre2_pattern_info() function returns general information  about  a
+       The  pcre2_pattern_info()  function returns general information about a
        compiled pattern. For information about callouts, see the next section.
-       The first argument for pcre2_pattern_info() is a pointer  to  the  com-
+       The  first  argument  for pcre2_pattern_info() is a pointer to the com-
        piled pattern. The second argument specifies which piece of information
-       is required, and the third argument is  a  pointer  to  a  variable  to
-       receive  the data. If the third argument is NULL, the first argument is
-       ignored, and the function returns the size in  bytes  of  the  variable
-       that is required for the information requested. Otherwise, The yield of
+       is  required,  and  the  third  argument  is a pointer to a variable to
+       receive the data. If the third argument is NULL, the first argument  is
+       ignored,  and  the  function  returns the size in bytes of the variable
+       that is required for the information requested. Otherwise, the yield of
        the function is zero for success, or one of the following negative num-
        bers:
 
@@ -1634,9 +1901,9 @@ INFORMATION ABOUT A COMPILED PATTERN
          PCRE2_ERROR_BADOPTION      the value of what was invalid
          PCRE2_ERROR_UNSET          the requested field is not set
 
-       The  "magic  number" is placed at the start of each compiled pattern as
-       an simple check against passing an arbitrary memory pointer. Here is  a
-       typical  call of pcre2_pattern_info(), to obtain the length of the com-
+       The "magic number" is placed at the start of each compiled  pattern  as
+       a  simple  check against passing an arbitrary memory pointer. Here is a
+       typical call of pcre2_pattern_info(), to obtain the length of the  com-
        piled pattern:
 
          int rc;
@@ -1651,12 +1918,16 @@ INFORMATION ABOUT A COMPILED PATTERN
 
          PCRE2_INFO_ALLOPTIONS
          PCRE2_INFO_ARGOPTIONS
-
-       Return a copy of the pattern's options. The third argument should point
-       to a  uint32_t  variable.  PCRE2_INFO_ARGOPTIONS  returns  exactly  the
-       options  that were passed to pcre2_compile(), whereas PCRE2_INFO_ALLOP-
-       TIONS returns the compile options as modified by any  top-level  (*XXX)
-       option settings such as (*UTF) at the start of the pattern itself.
+         PCRE2_INFO_EXTRAOPTIONS
+
+       Return copies of the pattern's options. The third argument should point
+       to  a  uint32_t  variable.  PCRE2_INFO_ARGOPTIONS  returns  exactly the
+       options that were passed to pcre2_compile(), whereas  PCRE2_INFO_ALLOP-
+       TIONS  returns  the compile options as modified by any top-level (*XXX)
+       option settings such as (*UTF) at the  start  of  the  pattern  itself.
+       PCRE2_INFO_EXTRAOPTIONS  returns the extra options that were set in the
+       compile context by calling the pcre2_set_compile_extra_options()  func-
+       tion.
 
        For   example,   if  the  pattern  /(*UTF)abc/  is  compiled  with  the
        PCRE2_EXTENDED  option,  the  result   for   PCRE2_INFO_ALLOPTIONS   is
@@ -1681,8 +1952,8 @@ INFORMATION ABOUT A COMPILED PATTERN
          .* is not in a capturing group that is the subject
               of a back reference
          PCRE2_DOTALL is in force for .*
-         Neither (*PRUNE) nor (*SKIP) appears in the pattern.
-         PCRE2_NO_DOTSTAR_ANCHOR is not set.
+         Neither (*PRUNE) nor (*SKIP) appears in the pattern
+         PCRE2_NO_DOTSTAR_ANCHOR is not set
 
        For  patterns  that are auto-anchored, the PCRE2_ANCHORED bit is set in
        the options returned for PCRE2_INFO_ALLOPTIONS.
@@ -1711,6 +1982,16 @@ INFORMATION ABOUT A COMPILED PATTERN
        terns where (?| is not used, this is also the total number of capturing
        subpatterns.  The third argument should point to an uint32_t variable.
 
+         PCRE2_INFO_DEPTHLIMIT
+
+       If the pattern set a backtracking depth limit by including an  item  of
+       the  form  (*LIMIT_DEPTH=nnnn) at the start, the value is returned. The
+       third argument should point to an unsigned 32-bit integer. If  no  such
+       value  has been set, the call to pcre2_pattern_info() returns the error
+       PCRE2_ERROR_UNSET. Note that this limit will only be used during match-
+       ing  if it is less than the limit set or defaulted by the caller of the
+       match function.
+
          PCRE2_INFO_FIRSTBITMAP
 
        In the absence of a single first code unit for a non-anchored  pattern,
@@ -1727,33 +2008,53 @@ INFORMATION ABOUT A COMPILED PATTERN
        Return information about the first code unit of any matched string, for
        a non-anchored pattern. The third argument should point to an  uint32_t
        variable.  If there is a fixed first value, for example, the letter "c"
-       from a pattern such as (cat|cow|coyote), 1 is returned, and the charac-
-       ter  value can be retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is
-       no fixed first value, but it is known that a match can  occur  only  at
-       the  start  of  the subject or following a newline in the subject, 2 is
-       returned. Otherwise, and for anchored patterns, 0 is returned.
+       from a pattern such as (cat|cow|coyote), 1 is returned, and  the  value
+       can  be  retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed
+       first value, but it is known that a match can occur only at  the  start
+       of  the  subject  or following a newline in the subject, 2 is returned.
+       Otherwise, and for anchored patterns, 0 is returned.
 
          PCRE2_INFO_FIRSTCODEUNIT
 
-       Return the value of the first code unit of any matched  string  in  the
-       situation where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0.
+       Return the value of the first code unit of any  matched  string  for  a
+       pattern  where  PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0.
        The third argument should point to an uint32_t variable. In  the  8-bit
        library,  the  value is always less than 256. In the 16-bit library the
        value can be up to 0xffff. In the 32-bit library  in  UTF-32  mode  the
        value can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32
        mode.
 
+         PCRE2_INFO_FRAMESIZE
+
+       Return the size (in bytes) of the data frames that are used to remember
+       backtracking  positions  when the pattern is processed by pcre2_match()
+       without the use of JIT. The third argument should point  to  a  size_t
+       variable. The frame size depends on the number of capturing parentheses
+       in the pattern. Each additional capturing  group  adds  two  PCRE2_SIZE
+       variables.
+
          PCRE2_INFO_HASBACKSLASHC
 
-       Return 1 if the pattern contains any instances of \C, otherwise 0.  The
+       Return  1 if the pattern contains any instances of \C, otherwise 0. The
        third argument should point to an uint32_t variable.
 
          PCRE2_INFO_HASCRORLF
 
-       Return  1  if  the  pattern  contains any explicit matches for CR or LF
+       Return 1 if the pattern contains any explicit  matches  for  CR  or  LF
        characters, otherwise 0. The third argument should point to an uint32_t
-       variable.  An explicit match is either a literal CR or LF character, or
-       \r or \n.
+       variable. An explicit match is either a literal CR or LF character,  or
+       \r  or  \n  or  one  of  the  equivalent  hexadecimal  or  octal escape
+       sequences.
+
+         PCRE2_INFO_HEAPLIMIT
+
+       If the pattern set a heap memory limit by including an item of the form
+       (*LIMIT_HEAP=nnnn) at the start, the value is returned. The third argu-
+       ment should point to an unsigned 32-bit integer. If no such  value  has
+       been   set,   the   call  to  pcre2_pattern_info()  returns  the  error
+       PCRE2_ERROR_UNSET. Note that this limit will only be used during match-
+       ing  if it is less than the limit set or defaulted by the caller of the
+       match function.
 
          PCRE2_INFO_JCHANGED
 
@@ -1782,10 +2083,10 @@ INFORMATION ABOUT A COMPILED PATTERN
 
          PCRE2_INFO_LASTCODEUNIT
 
-       Return  the value of the rightmost literal data unit that must exist in
-       any matched string, other than at its start, if such a value  has  been
-       recorded.  The  third argument should point to an uint32_t variable. If
-       there is no such value, 0 is returned.
+       Return  the value of the rightmost literal code unit that must exist in
+       any matched string, other than  at  its  start,  for  a  pattern  where
+       PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third
+       argument should point to a uint32_t variable.
 
          PCRE2_INFO_MATCHEMPTY
 
@@ -1801,7 +2102,9 @@ INFORMATION ABOUT A COMPILED PATTERN
        (*LIMIT_MATCH=nnnn)  at  the  start,  the  value is returned. The third
        argument should point to an unsigned 32-bit integer. If no  such  value
        has  been  set,  the  call  to  pcre2_pattern_info()  returns the error
-       PCRE2_ERROR_UNSET.
+       PCRE2_ERROR_UNSET. Note that this limit will only be used during match-
+       ing  if it is less than the limit set or defaulted by the caller of the
+       match function.
 
          PCRE2_INFO_MAXLOOKBEHIND
 
@@ -1814,7 +2117,7 @@ INFORMATION ABOUT A COMPILED PATTERN
        inspect the previous character. This is to ensure  that  at  least  one
        character  from  the old segment is retained when a new segment is pro-
        cessed. Otherwise, if there are no lookbehinds in the pattern, \A might
-       match incorrectly at the start of a new segment.
+       match incorrectly at the start of a second or subsequent segment.
 
          PCRE2_INFO_MINLENGTH
 
@@ -1889,24 +2192,17 @@ INFORMATION ABOUT A COMPILED PATTERN
 
          PCRE2_INFO_NEWLINE
 
-       The output is a uint32_t with one of the following values:
+       The output is one of the following uint32_t values:
 
          PCRE2_NEWLINE_CR       Carriage return (CR)
          PCRE2_NEWLINE_LF       Linefeed (LF)
          PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
          PCRE2_NEWLINE_ANY      Any Unicode line ending
          PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+         PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 
-       This specifies the default character sequence that will  be  recognized
-       as meaning "newline" while matching.
-
-         PCRE2_INFO_RECURSIONLIMIT
-
-       If  the  pattern set a recursion limit by including an item of the form
-       (*LIMIT_RECURSION=nnnn) at the start, the value is returned. The  third
-       argument  should  point to an unsigned 32-bit integer. If no such value
-       has been set,  the  call  to  pcre2_pattern_info()  returns  the  error
-       PCRE2_ERROR_UNSET.
+       This identifies the character sequence that will be recognized as mean-
+       ing "newline" while matching.
 
          PCRE2_INFO_SIZE
 
@@ -1962,15 +2258,15 @@ THE MATCH DATA BLOCK
        match  data  block,  which  is  an opaque structure that is accessed by
        function calls. In particular, the match data block contains  a  vector
        of  offsets into the subject string that define the matched part of the
-       subject and any substrings that were captured.  This  is  know  as  the
+       subject and any substrings that were captured. This  is  known  as  the
        ovector.
 
        Before  calling  pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match()
        you must create a match data block by calling one of the creation func-
        tions  above.  For pcre2_match_data_create(), the first argument is the
        number of pairs of offsets in the  ovector.  One  pair  of  offsets  is
-       required  to  identify  the string that matched the whole pattern, with
-       another pair for each captured substring. For example,  a  value  of  4
+       required to identify the string that matched the whole pattern, with an
+       additional pair for each captured substring. For example, a value of  4
        creates  enough space to record the matched portion of the subject plus
        three captured substrings. A minimum of at least 1 pair is  imposed  by
        pcre2_match_data_create(), so it is always possible to return the over-
@@ -2038,7 +2334,7 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
            11,             /* the length of the subject string */
            0,              /* start at offset 0 in the subject */
            0,              /* default options */
-           match_data,     /* the match data block */
+           md,             /* the match data block */
            NULL);          /* a match context; NULL means use defaults */
 
        If  the  subject  string is zero-terminated, the length can be given as
@@ -2094,24 +2390,26 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
        so,  and the current character is CR followed by LF, advance the start-
        ing offset by two characters instead of one.
 
-       If a non-zero starting offset is passed when the pattern  is  anchored,
-       one attempt to match at the given offset is made. This can only succeed
-       if the pattern does not require the match to be at  the  start  of  the
-       subject.
+       If a non-zero starting offset is passed when the pattern is anchored, a
+       single attempt to match at the given offset is made. This can only suc-
+       ceed if the pattern does not require the match to be at  the  start  of
+       the  subject.  In other words, the anchoring must be the result of set-
+       ting the PCRE2_ANCHORED option or the use of .* with PCRE2_DOTALL,  not
+       by starting the pattern with ^ or \A.
 
    Option bits for pcre2_match()
 
        The unused bits of the options argument for pcre2_match() must be zero.
-       The only  bits  that  may  be  set  are  PCRE2_ANCHORED,  PCRE2_NOTBOL,
-       PCRE2_NOTEOL,   PCRE2_NOTEMPTY,  PCRE2_NOTEMPTY_ATSTART,  PCRE2_NO_JIT,
-       PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and  PCRE2_PARTIAL_SOFT.  Their
-       action is described below.
-
-       Setting  PCRE2_ANCHORED  at match time is not supported by the just-in-
-       time (JIT) compiler. If it is set, JIT matching  is  disabled  and  the
-       normal   interpretive   code   in  pcre2_match()  is  run.  Apart  from
-       PCRE2_NO_JIT (obviously), the remaining options are supported  for  JIT
-       matching.
+       The only bits that may be set  are  PCRE2_ANCHORED,  PCRE2_ENDANCHORED,
+       PCRE2_NOTBOL,   PCRE2_NOTEOL,  PCRE2_NOTEMPTY,  PCRE2_NOTEMPTY_ATSTART,
+       PCRE2_NO_JIT, PCRE2_NO_UTF_CHECK,  PCRE2_PARTIAL_HARD,  and  PCRE2_PAR-
+       TIAL_SOFT.  Their action is described below.
+
+       Setting  PCRE2_ANCHORED  or PCRE2_ENDANCHORED at match time is not sup-
+       ported by the just-in-time (JIT) compiler. If either is set, JIT matching
+       is  disabled  and  the interpretive code in pcre2_match() is run. Apart
+       from PCRE2_NO_JIT (obviously), the remaining options are supported  for
+       JIT matching.
 
          PCRE2_ANCHORED
 
@@ -2121,6 +2419,12 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
        unanchored at matching time. Note that setting the option at match time
        disables JIT matching.
 
+         PCRE2_ENDANCHORED
+
+       If  the  PCRE2_ENDANCHORED option is set, any string that pcre2_match()
+       matches must be right at the end of the subject string. Note that  set-
+       ting the option at match time disables JIT matching.
+
          PCRE2_NOTBOL
 
        This option specifies that the first character of the subject string is not
@@ -2192,11 +2496,11 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
        checks  for  performance  reasons,  you  can set the PCRE2_NO_UTF_CHECK
        option when calling pcre2_match(). You might want to do  this  for  the
        second and subsequent calls to pcre2_match() if you are making repeated
-       calls to find all the matches in a single subject string.
+       calls to find other matches in the same subject string.
 
-       NOTE: When PCRE2_NO_UTF_CHECK is set, the effect of passing an  invalid
-       string  as a subject, or an invalid value of startoffset, is undefined.
-       Your program may crash or loop indefinitely.
+       WARNING: When PCRE2_NO_UTF_CHECK is  set,  the  effect  of  passing  an
+       invalid  string  as  a  subject, or an invalid value of startoffset, is
+       undefined.  Your program may crash or loop indefinitely.
 
          PCRE2_PARTIAL_HARD
          PCRE2_PARTIAL_SOFT
@@ -2249,11 +2553,12 @@ NEWLINE HANDLING WHEN MATCHING
        acter after the first failure.
 
        An explicit match for CR or LF is either a literal appearance of one of
-       those  characters  in  the  pattern,  or  one  of  the  \r or \n escape
-       sequences. Implicit matches such as [^X] do not  count,  nor  does  \s,
-       even though it includes CR and LF in the characters that it matches.
+       those  characters  in the pattern, or one of the \r or \n or equivalent
+       octal or hexadecimal escape sequences. Implicit matches such as [^X] do
+       not  count, nor does \s, even though it includes CR and LF in the char-
+       acters that it matches.
 
-       Notwithstanding  the above, anomalous effects may still occur when CRLF
+       Notwithstanding the above, anomalous effects may still occur when  CRLF
        is a valid newline sequence and explicit \r or \n escapes appear in the
        pattern.
 
@@ -2264,85 +2569,81 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
 
        PCRE2_SIZE *pcre2_get_ovector_pointer(pcre2_match_data *match_data);
 
-       In  general, a pattern matches a certain portion of the subject, and in
-       addition, further substrings from the subject  may  be  picked  out  by
-       parenthesized  parts  of  the  pattern.  Following the usage in Jeffrey
-       Friedl's book, this is called "capturing"  in  what  follows,  and  the
-       phrase  "capturing subpattern" or "capturing group" is used for a frag-
-       ment of a pattern that picks out a substring.  PCRE2  supports  several
+       In general, a pattern matches a certain portion of the subject, and  in
+       addition,  further  substrings  from  the  subject may be picked out by
+       parenthesized parts of the pattern.  Following  the  usage  in  Jeffrey
+       Friedl's  book,  this  is  called  "capturing" in what follows, and the
+       phrase "capturing subpattern" or "capturing group" is used for a  frag-
+       ment  of  a  pattern that picks out a substring. PCRE2 supports several
        other kinds of parenthesized subpattern that do not cause substrings to
-       be captured. The pcre2_pattern_info() function can be used to find  out
+       be  captured. The pcre2_pattern_info() function can be used to find out
        how many capturing subpatterns there are in a compiled pattern.
 
-       You  can  use  auxiliary functions for accessing captured substrings by
+       You can use auxiliary functions for accessing  captured  substrings  by
        number or by name, as described in sections below.
 
        Alternatively, you can make direct use of the vector of PCRE2_SIZE val-
-       ues,  called  the  ovector,  which  contains  the  offsets  of captured
-       strings.  It  is  part  of  the  match  data   block.    The   function
-       pcre2_get_ovector_pointer()  returns  the  address  of the ovector, and
+       ues, called  the  ovector,  which  contains  the  offsets  of  captured
+       strings.   It   is   part  of  the  match  data  block.   The  function
+       pcre2_get_ovector_pointer() returns the address  of  the  ovector,  and
        pcre2_get_ovector_count() returns the number of pairs of values it con-
        tains.
 
        Within the ovector, the first in each pair of values is set to the off-
        set of the first code unit of a substring, and the second is set to the
-       offset  of the first code unit after the end of a substring. These val-
-       ues are always code unit offsets, not character offsets. That is,  they
-       are  byte  offsets  in  the 8-bit library, 16-bit offsets in the 16-bit
+       offset of the first code unit after the end of a substring. These  val-
+       ues  are always code unit offsets, not character offsets. That is, they
+       are byte offsets in the 8-bit library, 16-bit  offsets  in  the  16-bit
        library, and 32-bit offsets in the 32-bit library.
 
-       After a partial match  (error  return  PCRE2_ERROR_PARTIAL),  only  the
-       first  pair  of  offsets  (that is, ovector[0] and ovector[1]) are set.
-       They identify the part of the subject that was partially  matched.  See
+       After  a  partial  match  (error  return PCRE2_ERROR_PARTIAL), only the
+       first pair of offsets (that is, ovector[0]  and  ovector[1])  are  set.
+       They  identify  the part of the subject that was partially matched. See
        the pcre2partial documentation for details of partial matching.
 
-       After a successful match, the first pair of offsets identifies the por-
-       tion of the subject string that was matched by the entire pattern.  The
-       next  pair  is  used for the first capturing subpattern, and so on. The
-       value returned by pcre2_match() is one more than the  highest  numbered
-       pair  that  has been set. For example, if two substrings have been cap-
-       tured, the returned value is 3. If there are no capturing  subpatterns,
-       the return value from a successful match is 1, indicating that just the
-       first pair of offsets has been set.
+       After a fully successful match, the first pair  of  offsets  identifies
+       the  portion  of the subject string that was matched by the entire pat-
+       tern. The next pair is used for the first captured  substring,  and  so
+       on.  The  value  returned by pcre2_match() is one more than the highest
+       numbered pair that has been set. For example, if  two  substrings  have
+       been  captured,  the returned value is 3. If there are no captured sub-
+       strings, the return value from a successful match is 1, indicating that
+       just the first pair of offsets has been set.
 
-       If a pattern uses the \K escape sequence within a  positive  assertion,
+       If  a  pattern uses the \K escape sequence within a positive assertion,
        the reported start of a successful match can be greater than the end of
-       the match.  For example, if the pattern  (?=ab\K)  is  matched  against
+       the  match.   For  example,  if the pattern (?=ab\K) is matched against
        "ab", the start and end offset values for the match are 2 and 0.
 
-       If  a  capturing subpattern group is matched repeatedly within a single
-       match operation, it is the last portion of the subject that it  matched
+       If a capturing subpattern group is matched repeatedly within  a  single
+       match  operation, it is the last portion of the subject that it matched
        that is returned.
 
        If the ovector is too small to hold all the captured substring offsets,
-       as much as possible is filled in, and the function returns a  value  of
-       zero.  If captured substrings are not of interest, pcre2_match() may be
+       as  much  as possible is filled in, and the function returns a value of
+       zero. If captured substrings are not of interest, pcre2_match() may  be
        called with a match data block whose ovector is of minimum length (that
-       is, one pair). However, if the pattern contains back references and the
-       ovector is not big enough to remember the related substrings, PCRE2 has
-       to  get  additional  memory for use during matching. Thus it is usually
-       advisable to set up a match data block containing an ovector of reason-
-       able size.
+       is, one pair).
 
-       It  is  possible for capturing subpattern number n+1 to match some part
+       It is possible for capturing subpattern number n+1 to match  some  part
        of the subject when subpattern n has not been used at all. For example,
-       if  the  string  "abc"  is  matched against the pattern (a|(z))(bc) the
+       if the string "abc" is matched  against  the  pattern  (a|(z))(bc)  the
        return from the function is 4, and subpatterns 1 and 3 are matched, but
-       2  is  not.  When  this happens, both values in the offset pairs corre-
+       2 is not. When this happens, both values in  the  offset  pairs  corre-
        sponding to unused subpatterns are set to PCRE2_UNSET.
 
-       Offset values that correspond to unused subpatterns at the end  of  the
-       expression  are  also  set  to  PCRE2_UNSET. For example, if the string
+       Offset  values  that correspond to unused subpatterns at the end of the
+       expression are also set to PCRE2_UNSET.  For  example,  if  the  string
        "abc" is matched against the pattern (abc)(x(yz)?)? subpatterns 2 and 3
-       are  not matched.  The return from the function is 2, because the high-
+       are not matched.  The return from the function is 2, because the  high-
        est used capturing subpattern number is 1. The offsets for the sec-
-       ond  and  third  capturing  subpatterns  (assuming  the vector is large
+       ond and third capturing  subpatterns  (assuming  the  vector  is  large
        enough, of course) are set to PCRE2_UNSET.
 
        Elements in the ovector that do not correspond to capturing parentheses
        in the pattern are never changed. That is, if a pattern contains n cap-
        turing parentheses, no more than ovector[0] to ovector[2n+1] are set by
-       pcre2_match().  The  other  elements retain whatever values they previ-
+       pcre2_match(). The other elements retain whatever  values  they  previ-
        ously had.
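
The ovector layout described in this hunk can be sketched in plain C. This is
an illustrative sketch under stated assumptions, not PCRE2 code: the sketch_
names are hypothetical, size_t stands in for PCRE2_SIZE, and SKETCH_UNSET
mirrors PCRE2_UNSET (assumed to be the all-ones value) so that the block
compiles without pcre2.h. It assumes a successful match with a positive
return value.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET (assumed to be the all-ones PCRE2_SIZE value). */
#define SKETCH_UNSET (~(size_t)0)

/* Return 1 if capture group n was set by a successful match, else 0.
   rc is the positive return value of the match call, that is, one more
   than the highest-numbered pair that was set. */
int sketch_group_is_set(const size_t *ovector, int rc, int n)
{
    assert(ovector != NULL && rc > 0 && n >= 0);
    if (n >= rc) return 0;                  /* unused group at the end */
    return ovector[2 * n] != SKETCH_UNSET;  /* unused group in the middle */
}

/* Length in code units of group n. Returns 0 when the group is unset, or
   when the start offset lies beyond the end offset (the \K-in-assertion
   case described in the text). */
size_t sketch_group_length(const size_t *ovector, int rc, int n)
{
    size_t start, end;
    if (!sketch_group_is_set(ovector, rc, n)) return 0;
    start = ovector[2 * n];
    end = ovector[2 * n + 1];
    return (end > start) ? end - start : 0;
}
```

For the example in the text, matching "abc" against (a|(z))(bc) would leave
the pairs {0,3}, {0,1}, {unset,unset}, {1,3} with a return value of 4, so
group 1 has length 1, group 2 is reported as unset, and group 3 has length 2.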
 
 
@@ -2352,56 +2653,60 @@ OTHER INFORMATION ABOUT A MATCH
 
        PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *match_data);
 
-       As well as the offsets in the ovector, other information about a  match
-       is  retained  in the match data block and can be retrieved by the above
-       functions in appropriate circumstances. If they  are  called  at  other
+       As  well as the offsets in the ovector, other information about a match
+       is retained in the match data block and can be retrieved by  the  above
+       functions  in  appropriate  circumstances.  If they are called at other
        times, the result is undefined.
 
-       After  a  successful match, a partial match (PCRE2_ERROR_PARTIAL), or a
-       failure to match (PCRE2_ERROR_NOMATCH), a (*MARK) name  may  be  avail-
-       able,  and  pcre2_get_mark() can be called. It returns a pointer to the
-       zero-terminated name, which is within the compiled  pattern.  Otherwise
-       NULL  is returned. The length of the (*MARK) name (excluding the termi-
-       nating zero) is stored in the code unit that  preceeds  the  name.  You
-       should  use  this  instead  of  relying  on the terminating zero if the
-       (*MARK) name might contain a binary zero.
-
-       After a successful match, the (*MARK) name that is returned is the last
-       one  encountered  on the matching path through the pattern. After a "no
-       match" or a  partial  match,  the  last  encountered  (*MARK)  name  is
-       returned. For example, consider this pattern:
+       After a successful match, a partial match (PCRE2_ERROR_PARTIAL),  or  a
+       failure to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN)
+       name may be available. The function pcre2_get_mark() can be  called  to
+       access  this  name.  The  same  function applies to all three verbs. It
+       returns a pointer to the zero-terminated name, which is within the com-
+       piled pattern. If no name is available, NULL is returned. The length of
+       the name (excluding the terminating zero) is stored in  the  code  unit
+       that  precedes  the name. You should use this length instead of relying
+       on the terminating zero if the name might contain a binary zero.
+
+       After a successful match,  the  name  that  is  returned  is  the  last
+       (*MARK),  (*PRUNE),  or  (*THEN)  name encountered on the matching path
+       through the pattern.  Instances of (*PRUNE) and (*THEN)  without  names
+       are   ignored.  Thus,  for  example,  if  the  matching  path  contains
+       (*MARK:A)(*PRUNE), the name "A" is returned.  After a "no match"  or  a
+       partial  match,  the  last  encountered name is returned.  For example,
+       consider this pattern:
 
          ^(*MARK:A)((*MARK:B)a|b)c
 
-       When  it  matches "bc", the returned mark is A. The B mark is "seen" in
-       the first branch of the group, but it is not on the matching  path.  On
-       the  other  hand,  when  this pattern fails to match "bx", the returned
-       mark is B.
+       When it matches "bc", the returned name is A. The B mark is  "seen"  in
+       the  first  branch of the group, but it is not on the matching path. On
+       the other hand, when this pattern fails to  match  "bx",  the  returned
+       name is B.
 
-       After a successful match, a partial match, or one of  the  invalid  UTF
-       errors  (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
+       After  a  successful  match, a partial match, or one of the invalid UTF
+       errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar()  can
        be called. After a successful or partial match it returns the code unit
-       offset  of  the character at which the match started. For a non-partial
-       match, this can be different to the value of ovector[0] if the  pattern
-       contains  the  \K escape sequence. After a partial match, however, this
-       value is always the same as ovector[0] because \K does not  affect  the
+       offset of the character at which the match started. For  a  non-partial
+       match,  this can be different to the value of ovector[0] if the pattern
+       contains the \K escape sequence. After a partial match,  however,  this
+       value  is  always the same as ovector[0] because \K does not affect the
        result of a partial match.
 
-       After  a UTF check failure, pcre2_get_startchar() can be used to obtain
+       After a UTF check failure, pcre2_get_startchar() can be used to  obtain
        the code unit offset of the invalid UTF character. Details are given in
        the pcre2unicode page.
 
 
 ERROR RETURNS FROM pcre2_match()
 
-       If  pcre2_match() fails, it returns a negative number. This can be con-
-       verted to a text string by calling the pcre2_get_error_message()  func-
-       tion  (see  "Obtaining a textual error message" below).  Negative error
-       codes are also returned by other functions,  and  are  documented  with
-       them.  The codes are given names in the header file. If UTF checking is
+       If pcre2_match() fails, it returns a negative number. This can be  con-
+       verted  to a text string by calling the pcre2_get_error_message() func-
+       tion (see "Obtaining a textual error message" below).   Negative  error
+       codes  are  also  returned  by other functions, and are documented with
+       them. The codes are given names in the header file. If UTF checking  is
        in force and an invalid UTF subject string is detected, one of a number
-       of  UTF-specific negative error codes is returned. Details are given in
-       the pcre2unicode page. The following are the other errors that  may  be
+       of UTF-specific negative error codes is returned. Details are given  in
+       the  pcre2unicode  page. The following are the other errors that may be
        returned by pcre2_match():
 
          PCRE2_ERROR_NOMATCH
@@ -2410,20 +2715,21 @@ ERROR RETURNS FROM pcre2_match()
 
          PCRE2_ERROR_PARTIAL
 
-       The  subject  string did not match, but it did match partially. See the
+       The subject string did not match, but it did match partially.  See  the
        pcre2partial documentation for details of partial matching.
 
          PCRE2_ERROR_BADMAGIC
 
        PCRE2 stores a 4-byte "magic number" at the start of the compiled code,
-       to  catch  the case when it is passed a junk pointer. This is the error
+       to catch the case when it is passed a junk pointer. This is  the  error
        that is returned when the magic number is not present.
 
          PCRE2_ERROR_BADMODE
 
-       This error is given when a pattern  that  was  compiled  by  the  8-bit
-       library  is  passed  to  a  16-bit  or 32-bit library function, or vice
-       versa.
+       This  error is given when a compiled pattern is passed to a function in
+       a library of a different code unit width, for example, a  pattern  com-
+       piled  by  the  8-bit  library  is passed to a 16-bit or 32-bit library
+       function.
 
          PCRE2_ERROR_BADOFFSET
 
@@ -2447,19 +2753,19 @@ ERROR RETURNS FROM pcre2_match()
        pcre2_callout_enumerate()  to  return a distinctive error code. See the
        pcre2callout documentation for details.
 
+         PCRE2_ERROR_DEPTHLIMIT
+
+       The nested backtracking depth limit was reached.
+
+         PCRE2_ERROR_HEAPLIMIT
+
+       The heap limit was reached.
+
          PCRE2_ERROR_INTERNAL
 
        An unexpected internal error has occurred. This error could  be  caused
        by a bug in PCRE2 or by overwriting of the compiled pattern.
 
-         PCRE2_ERROR_JIT_BADOPTION
-
-       This  error  is  returned  when a pattern that was successfully studied
-       using JIT is being matched, but the matching mode (partial or  complete
-       match)  does  not  correspond to any JIT compilation mode. When the JIT
-       fast path function is used, this error may be also  given  for  invalid
-       options. See the pcre2jit documentation for more details.
-
          PCRE2_ERROR_JIT_STACKLIMIT
 
        This  error  is  returned  when a pattern that was successfully studied
@@ -2469,15 +2775,15 @@ ERROR RETURNS FROM pcre2_match()
 
          PCRE2_ERROR_MATCHLIMIT
 
-       The backtracking limit was reached.
+       The backtracking match limit was reached.
 
          PCRE2_ERROR_NOMEMORY
 
-       If a pattern contains back references,  but  the  ovector  is  not  big
-       enough  to  remember  the  referenced substrings, PCRE2 gets a block of
-       memory at the start of matching to use for this purpose. There are some
-       other  special cases where extra memory is needed during matching. This
-       error is given when memory cannot be obtained.
+       If a pattern contains many nested backtracking points, heap  memory  is
+       used  to  remember them. This error is given when the memory allocation
+       function (default or  custom)  fails.  Note  that  a  different  error,
+       PCRE2_ERROR_HEAPLIMIT,  is given if the amount of memory needed exceeds
+       the heap limit.
 
          PCRE2_ERROR_NULL
 
@@ -2493,10 +2799,6 @@ ERROR RETURNS FROM pcre2_match()
        plicated  cases,  in particular mutual recursions between two different
        subpatterns, cannot be detected until matching is attempted.
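
A caller typically sorts the return value of pcre2_match() into a few broad
cases before looking at individual error codes. The following is a minimal
sketch of that triage, assuming nothing beyond what the text states: positive
means success, zero means success with an ovector that was too small, and
negative means failure. The sketch_ names are hypothetical, and the two
SKETCH_ERROR_ constants are stand-ins for PCRE2_ERROR_NOMATCH and
PCRE2_ERROR_PARTIAL; real code should take both from pcre2.h.

```c
#include <assert.h>

/* Stand-ins for PCRE2_ERROR_NOMATCH and PCRE2_ERROR_PARTIAL. The numeric
   values are assumptions made for this sketch. */
#define SKETCH_ERROR_NOMATCH (-1)
#define SKETCH_ERROR_PARTIAL (-2)

enum sketch_outcome {
    SKETCH_MATCHED,            /* rc > 0: all set pairs are in the ovector */
    SKETCH_MATCHED_TRUNCATED,  /* rc == 0: ovector too small for all pairs */
    SKETCH_NOMATCH,            /* subject did not match */
    SKETCH_PARTIAL,            /* subject matched partially */
    SKETCH_FAILED              /* any other negative code */
};

enum sketch_outcome sketch_classify(int rc)
{
    /* Every failure code is negative, so the two named ones must be too. */
    assert(SKETCH_ERROR_NOMATCH < 0 && SKETCH_ERROR_PARTIAL < 0);

    if (rc > 0) return SKETCH_MATCHED;
    if (rc == 0) return SKETCH_MATCHED_TRUNCATED;
    if (rc == SKETCH_ERROR_NOMATCH) return SKETCH_NOMATCH;
    if (rc == SKETCH_ERROR_PARTIAL) return SKETCH_PARTIAL;
    return SKETCH_FAILED;
}
```

Only the SKETCH_FAILED case normally needs a textual message; "no match" and
"partial match" are expected outcomes that most callers handle directly.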
 
-         PCRE2_ERROR_RECURSIONLIMIT
-
-       The internal recursion limit was reached.
-
 
 OBTAINING A TEXTUAL ERROR MESSAGE
 
@@ -2506,16 +2808,17 @@ OBTAINING A TEXTUAL ERROR MESSAGE
        A text message for an error code  from  any  PCRE2  function  (compile,
        match,  or  auxiliary)  can be obtained by calling pcre2_get_error_mes-
        sage(). The code is passed as the first argument,  with  the  remaining
-       two  arguments specifying a code unit buffer and its length, into which
-       the text message is placed. Note that the message is returned  in  code
-       units of the appropriate width for the library that is being used.
+       two  arguments  specifying  a  code  unit buffer and its length in code
+       units, into which the text message is placed. The message  is  returned
+       in  code  units  of the appropriate width for the library that is being
+       used.
 
-       The  returned message is terminated with a trailing zero, and the func-
-       tion returns the number of code  units  used,  excluding  the  trailing
+       The returned message is terminated with a trailing zero, and the  func-
+       tion  returns  the  number  of  code units used, excluding the trailing
        zero.  If  the  error  number  is  unknown,  the  negative  error  code
-       PCRE2_ERROR_BADDATA is returned. If the buffer is too small,  the  mes-
-       sage  is  truncated  (but still with a trailing zero), and the negative
-       error code PCRE2_ERROR_NOMEMORY is returned.  None of the messages  are
+       PCRE2_ERROR_BADDATA  is  returned. If the buffer is too small, the mes-
+       sage is truncated (but still with a trailing zero),  and  the  negative
+       error  code PCRE2_ERROR_NOMEMORY is returned.  None of the messages are
        very long; a buffer size of 120 code units is ample.
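
The buffer contract described above (truncate, keep the trailing zero, and
return a negative code when the buffer is too small) can be modelled in plain
C. This is a sketch of the caller-visible behaviour only, not the library's
implementation: sketch_copy_message() is a hypothetical helper, it assumes
8-bit code units, and SKETCH_ERROR_NOMEMORY is a placeholder for
PCRE2_ERROR_NOMEMORY from pcre2.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder for PCRE2_ERROR_NOMEMORY; the value is an assumption. */
#define SKETCH_ERROR_NOMEMORY (-48)

/* Copy message into buffer (bufflen code units, including the trailing
   zero). Returns the number of code units used, excluding the trailing
   zero, or SKETCH_ERROR_NOMEMORY after writing a truncated message. */
int sketch_copy_message(const char *message, char *buffer, size_t bufflen)
{
    size_t len;

    assert(message != NULL && buffer != NULL && bufflen > 0);
    len = strlen(message);

    if (len >= bufflen) {
        /* Too small: keep what fits and still terminate the string. */
        memcpy(buffer, message, bufflen - 1);
        buffer[bufflen - 1] = '\0';
        return SKETCH_ERROR_NOMEMORY;
    }

    memcpy(buffer, message, len + 1);  /* includes the trailing zero */
    return (int)len;
}
```

A caller that sees the negative return can still print the buffer, because
the truncated text is always zero-terminated.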
 
 
@@ -2534,39 +2837,39 @@ EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
 
        void pcre2_substring_free(PCRE2_UCHAR *buffer);
 
-       Captured  substrings  can  be accessed directly by using the ovector as
+       Captured substrings can be accessed directly by using  the  ovector  as
        described above.  For convenience, auxiliary functions are provided for
-       extracting   captured  substrings  as  new,  separate,  zero-terminated
+       extracting  captured  substrings  as  new,  separate,   zero-terminated
        strings. A substring that contains a binary zero is correctly extracted
-       and  has  a  further  zero  added on the end, but the result is not, of
+       and has a further zero added on the end, but  the  result  is  not,  of
        course, a C string.
 
        The functions in this section identify substrings by number. The number
        zero refers to the entire matched substring, with higher numbers refer-
-       ring to substrings captured by parenthesized groups.  After  a  partial
-       match,  only  substring  zero  is  available. An attempt to extract any
-       other substring gives the error PCRE2_ERROR_PARTIAL. The  next  section
+       ring  to  substrings  captured by parenthesized groups. After a partial
+       match, only substring zero is available.  An  attempt  to  extract  any
+       other  substring  gives the error PCRE2_ERROR_PARTIAL. The next section
        describes similar functions for extracting captured substrings by name.
 
-       If  a  pattern uses the \K escape sequence within a positive assertion,
+       If a pattern uses the \K escape sequence within a  positive  assertion,
        the reported start of a successful match can be greater than the end of
-       the  match.   For  example,  if the pattern (?=ab\K) is matched against
-       "ab", the start and end offset values for the match are  2  and  0.  In
-       this  situation,  calling  these functions with a zero substring number
+       the match.  For example, if the pattern  (?=ab\K)  is  matched  against
+       "ab",  the  start  and  end offset values for the match are 2 and 0. In
+       this situation, calling these functions with a  zero  substring  number
        extracts a zero-length empty string.
 
-       You can find the length in code units of a captured  substring  without
-       extracting  it  by calling pcre2_substring_length_bynumber(). The first
-       argument is a pointer to the match data block, the second is the  group
-       number,  and the third is a pointer to a variable into which the length
-       is placed. If you just want to know whether or not  the  substring  has
+       You  can  find the length in code units of a captured substring without
+       extracting it by calling pcre2_substring_length_bynumber().  The  first
+       argument  is a pointer to the match data block, the second is the group
+       number, and the third is a pointer to a variable into which the  length
+       is  placed.  If  you just want to know whether or not the substring has
        been captured, you can pass the third argument as NULL.
 
-       The  pcre2_substring_copy_bynumber()  function  copies  a captured sub-
-       string into a supplied buffer,  whereas  pcre2_substring_get_bynumber()
-       copies  it  into  new memory, obtained using the same memory allocation
-       function that was used for the match data block. The  first  two  argu-
-       ments  of  these  functions are a pointer to the match data block and a
+       The pcre2_substring_copy_bynumber() function  copies  a  captured  sub-
+       string  into  a supplied buffer, whereas pcre2_substring_get_bynumber()
+       copies it into new memory, obtained using the  same  memory  allocation
+       function  that  was  used for the match data block. The first two argu-
+       ments of these functions are a pointer to the match data  block  and  a
        capturing group number.
 
        The final arguments of pcre2_substring_copy_bynumber() are a pointer to
@@ -2575,25 +2878,25 @@ EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
        for the extracted substring, excluding the terminating zero.
 
        For pcre2_substring_get_bynumber() the third and fourth arguments point
-       to variables that are updated with a pointer to the new memory and  the
-       number  of  code units that comprise the substring, again excluding the
-       terminating zero. When the substring is no longer  needed,  the  memory
+       to  variables that are updated with a pointer to the new memory and the
+       number of code units that comprise the substring, again  excluding  the
+       terminating  zero.  When  the substring is no longer needed, the memory
        should be freed by calling pcre2_substring_free().
 
-       The  return  value  from  all these functions is zero for success, or a
-       negative error code. If the pattern match  failed,  the  match  failure
-       code  is  returned.   If  a  substring number greater than zero is used
-       after a partial match, PCRE2_ERROR_PARTIAL is returned. Other  possible
+       The return value from all these functions is zero  for  success,  or  a
+       negative  error  code.  If  the pattern match failed, the match failure
+       code is returned.  If a substring number  greater  than  zero  is  used
+       after  a partial match, PCRE2_ERROR_PARTIAL is returned. Other possible
        error codes are:
 
          PCRE2_ERROR_NOMEMORY
 
-       The  buffer  was  too small for pcre2_substring_copy_bynumber(), or the
+       The buffer was too small for  pcre2_substring_copy_bynumber(),  or  the
        attempt to get memory failed for pcre2_substring_get_bynumber().
 
          PCRE2_ERROR_NOSUBSTRING
 
-       There is no substring with that number in the  pattern,  that  is,  the
+       There  is  no  substring  with that number in the pattern, that is, the
        number is greater than the number of capturing parentheses.
 
          PCRE2_ERROR_UNAVAILABLE
@@ -2604,8 +2907,8 @@ EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
 
          PCRE2_ERROR_UNSET
 
-       The  substring  did  not  participate in the match. For example, if the
-       pattern is (abc)|(def) and the subject is "def", and the  ovector  con-
+       The substring did not participate in the match.  For  example,  if  the
+       pattern  is  (abc)|(def) and the subject is "def", and the ovector con-
        tains at least two capturing slots, substring number 1 is unset.
 
 
@@ -2616,32 +2919,32 @@ EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS
 
        void pcre2_substring_list_free(PCRE2_SPTR *list);
 
-       The  pcre2_substring_list_get()  function  extracts  all available sub-
-       strings and builds a list of pointers to  them.  It  also  (optionally)
-       builds  a  second  list  that  contains  their lengths (in code units),
+       The pcre2_substring_list_get() function  extracts  all  available  sub-
+       strings  and  builds  a  list of pointers to them. It also (optionally)
+       builds a second list that  contains  their  lengths  (in  code  units),
        excluding a terminating zero that is added to each of them. All this is
        done in a single block of memory that is obtained using the same memory
        allocation function that was used to get the match data block.
 
-       This function must be called only after a successful match.  If  called
+       This  function  must be called only after a successful match. If called
        after a partial match, the error code PCRE2_ERROR_PARTIAL is returned.
 
-       The  address of the memory block is returned via listptr, which is also
+       The address of the memory block is returned via listptr, which is  also
        the start of the list of string pointers. The end of the list is marked
-       by  a  NULL pointer. The address of the list of lengths is returned via
-       lengthsptr. If your strings do not contain binary zeros and you do  not
+       by a NULL pointer. The address of the list of lengths is  returned  via
+       lengthsptr.  If your strings do not contain binary zeros and you do not
        therefore need the lengths, you may supply NULL as the lengthsptr argu-
-       ment to disable the creation of a list of lengths.  The  yield  of  the
-       function  is zero if all went well, or PCRE2_ERROR_NOMEMORY if the mem-
-       ory block could not be obtained. When the list is no longer needed,  it
+       ment  to  disable  the  creation of a list of lengths. The yield of the
+       function is zero if all went well, or PCRE2_ERROR_NOMEMORY if the  mem-
+       ory  block could not be obtained. When the list is no longer needed, it
        should be freed by calling pcre2_substring_list_free().
 
        If this function encounters a substring that is unset, which can happen
-       when capturing subpattern number n+1 matches some part of the  subject,
-       but  subpattern n has not been used at all, it returns an empty string.
-       This can be distinguished  from  a  genuine  zero-length  substring  by
+       when  capturing subpattern number n+1 matches some part of the subject,
+       but subpattern n has not been used at all, it returns an empty  string.
+       This  can  be  distinguished  from  a  genuine zero-length substring by
        inspecting  the  appropriate  offset  in  the ovector,  which  contains
-       PCRE2_UNSET  for   unset   substrings,   or   by   calling   pcre2_sub-
+       PCRE2_UNSET   for   unset   substrings,   or   by   calling  pcre2_sub-
        string_length_bynumber().
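
        The ovector check described above can be sketched as follows. This is
        a minimal illustration under stated assumptions, not PCRE2 code: a
        plain size_t array stands in for the real ovector, SKETCH_UNSET is a
        local stand-in for PCRE2_UNSET (assumed to be the all-ones size value),
        and classify_group() and the enum names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for PCRE2_UNSET (assumption: the all-ones size_t value). */
#define SKETCH_UNSET (~(size_t)0)

enum group_state { GROUP_UNSET, GROUP_EMPTY, GROUP_NONEMPTY };

/* The ovector holds offset pairs: element 2n is the start of group n and
   element 2n+1 is its end. pair_count is the number of pairs available. */
static enum group_state classify_group(const size_t *ovector,
                                       size_t pair_count, size_t n)
{
  assert(n < pair_count);              /* group must have a slot */
  size_t start = ovector[2 * n];
  size_t end = ovector[2 * n + 1];

  if (start == SKETCH_UNSET)           /* group took no part in the match */
    return GROUP_UNSET;
  return (start == end) ? GROUP_EMPTY : GROUP_NONEMPTY;
}
```

        In a real program the array and pair count would come from
        pcre2_get_ovector_pointer() and pcre2_get_ovector_count(), and the
        comparison would be made against PCRE2_UNSET itself.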
 
 
@@ -2661,39 +2964,39 @@ EXTRACTING CAPTURED SUBSTRINGS BY NAME
 
        void pcre2_substring_free(PCRE2_UCHAR *buffer);
 
-       To  extract a substring by name, you first have to find associated num-
+       To extract a substring by name, you first have to find associated  num-
        ber.  For example, for this pattern:
 
          (a+)b(?<xxx>\d+)...
 
        the number of the subpattern called "xxx" is 2. If the name is known to
-       be  unique  (PCRE2_DUPNAMES  was not set), you can find the number from
+       be unique (PCRE2_DUPNAMES was not set), you can find  the  number  from
        the name by calling pcre2_substring_number_from_name(). The first argu-
-       ment  is the compiled pattern, and the second is the name. The yield of
+       ment is the compiled pattern, and the second is the name. The yield  of
        the function is the subpattern number, PCRE2_ERROR_NOSUBSTRING if there
-       is  no  subpattern  of  that  name, or PCRE2_ERROR_NOUNIQUESUBSTRING if
-       there is more than one subpattern of that name. Given the  number,  you
-       can  extract  the  substring  directly,  or  use  one  of the functions
-       described above.
-
-       For convenience, there are also "byname" functions that  correspond  to
-       the  "bynumber"  functions,  the  only difference being that the second
-       argument is a name instead of a number. If PCRE2_DUPNAMES  is  set  and
+       is no subpattern of  that  name,  or  PCRE2_ERROR_NOUNIQUESUBSTRING  if
+       there  is  more than one subpattern of that name. Given the number, you
+       can extract the substring directly from the ovector, or use one of  the
+       "bynumber" functions described above.
+
+       For  convenience,  there are also "byname" functions that correspond to
+       the "bynumber" functions, the only difference  being  that  the  second
+       argument  is  a  name instead of a number. If PCRE2_DUPNAMES is set and
        there are duplicate names, these functions scan all the groups with the
        given name, and return the first named string that is set.
 
-       If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING  is
-       returned.  If  all  groups  with the name have numbers that are greater
-       than the number of slots in  the  ovector,  PCRE2_ERROR_UNAVAILABLE  is
-       returned.  If  there  is at least one group with a slot in the ovector,
+       If  there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
+       returned. If all groups with the name have  numbers  that  are  greater
+       than  the  number  of  slots in the ovector, PCRE2_ERROR_UNAVAILABLE is
+       returned. If there is at least one group with a slot  in  the  ovector,
        but no group is found to be set, PCRE2_ERROR_UNSET is returned.
 
        Warning: If the pattern uses the (?| feature to set up multiple subpat-
-       terns  with  the  same number, as described in the section on duplicate
-       subpattern numbers in the pcre2pattern page, you cannot  use  names  to
-       distinguish  the  different subpatterns, because names are not included
-       in the compiled code. The matching process uses only numbers. For  this
-       reason,  the  use of different names for subpatterns of the same number
+       terns with the same number, as described in the  section  on  duplicate
+       subpattern  numbers  in  the pcre2pattern page, you cannot use names to
+       distinguish the different subpatterns, because names are  not  included
+       in  the compiled code. The matching process uses only numbers. For this
+       reason, the use of different names for subpatterns of the  same  number
        causes an error at compile time.
 
 
@@ -2706,46 +3009,47 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
          PCRE2_SIZE rlength, PCRE2_UCHAR *outputbufferP,
          PCRE2_SIZE *outlengthptr);
 
-       This function calls pcre2_match() and then makes a copy of the  subject
-       string  in  outputbuffer,  replacing the part that was matched with the
-       replacement string, whose length is supplied in rlength.  This  can  be
+       This  function calls pcre2_match() and then makes a copy of the subject
+       string in outputbuffer, replacing the part that was  matched  with  the
+       replacement  string,  whose  length is supplied in rlength. This can be
        given as PCRE2_ZERO_TERMINATED for a zero-terminated string. Matches in
-       which a \K item in a lookahead in the pattern causes the match  to  end
+       which  a  \K item in a lookahead in the pattern causes the match to end
        before it starts are not supported, and give rise to an error return.
 
-       The  first  seven  arguments  of pcre2_substitute() are the same as for
+       The first seven arguments of pcre2_substitute() are  the  same  as  for
        pcre2_match(), except that the partial matching options are not permit-
-       ted,  and  match_data may be passed as NULL, in which case a match data
-       block is obtained and freed within this function, using memory  manage-
-       ment  functions from the match context, if provided, or else those that
+       ted, and match_data may be passed as NULL, in which case a  match  data
+       block  is obtained and freed within this function, using memory manage-
+       ment functions from the match context, if provided, or else those  that
        were used to allocate memory for the compiled code.
 
-       The outlengthptr argument must point to a variable  that  contains  the
-       length,  in  code  units, of the output buffer. If the function is suc-
-       cessful, the value is updated to contain the length of the new  string,
+       The  outlengthptr  argument  must point to a variable that contains the
+       length, in code units, of the output buffer. If the  function  is  suc-
+       cessful,  the value is updated to contain the length of the new string,
        excluding the trailing zero that is automatically added.
 
-       If  the  function  is  not  successful,  the value set via outlengthptr
-       depends on the type of error. For  syntax  errors  in  the  replacement
-       string,  the  value  is  the offset in the replacement string where the
-       error was detected. For other  errors,  the  value  is  PCRE2_UNSET  by
-       default.  This  includes the case of the output buffer being too small,
-       unless PCRE2_SUBSTITUTE_OVERFLOW_LENGTH is set (see  below),  in  which
-       case  the  value  is the minimum length needed, including space for the
-       trailing zero. Note that in  order  to  compute  the  required  length,
-       pcre2_substitute()  has  to  simulate  all  the  matching  and copying,
+       If the function is not  successful,  the  value  set  via  outlengthptr
+       depends  on  the  type  of  error. For syntax errors in the replacement
+       string, the value is the offset in the  replacement  string  where  the
+       error  was  detected.  For  other  errors,  the value is PCRE2_UNSET by
+       default. This includes the case of the output buffer being  too  small,
+       unless  PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  is  set (see below), in which
+       case the value is the minimum length needed, including  space  for  the
+       trailing  zero.  Note  that  in  order  to compute the required length,
+       pcre2_substitute() has  to  simulate  all  the  matching  and  copying,
        instead of giving an error return as soon as the buffer overflows. Note
        also that the length is in code units, not bytes.
 
-       In  the replacement string, which is interpreted as a UTF string in UTF
-       mode, and is checked for UTF  validity  unless  the  PCRE2_NO_UTF_CHECK
+       In the replacement string, which is interpreted as a UTF string in  UTF
+       mode,  and  is  checked  for UTF validity unless the PCRE2_NO_UTF_CHECK
        option is set, a dollar character is an escape character that can spec-
-       ify the insertion of characters from capturing groups or (*MARK)  items
-       in the pattern. The following forms are always recognized:
+       ify  the  insertion  of  characters  from  capturing groups or (*MARK),
+       (*PRUNE), or (*THEN) items in the  pattern.  The  following  forms  are
+       always recognized:
 
          $$                  insert a dollar character
          $<n> or ${<n>}      insert the contents of group <n>
-         $*MARK or ${*MARK}  insert the name of the last (*MARK) encountered
+         $*MARK or ${*MARK}  insert a (*MARK), (*PRUNE), or (*THEN) name
 
        Either  a  group  number  or  a  group name can be given for <n>. Curly
        brackets are required only if the following character would  be  inter-
@@ -2754,24 +3058,44 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
        matched  with "=abc=" and the replacement string "+$1$0$1+", the result
        is "=+babcb+=".
 
-       The facility for inserting a (*MARK) name can be used to perform simple
-       simultaneous substitutions, as this pcre2test example shows:
+       $*MARK inserts the name from the last encountered (*MARK), (*PRUNE), or
+       (*THEN)  on  the  matching  path  that  has a name. (*MARK) must always
+       include a name, but (*PRUNE) and (*THEN) need not. For example, in  the
+       case   of   (*MARK:A)(*PRUNE)   the  name  inserted  is  "A",  but  for
+       (*MARK:A)(*PRUNE:B) the relevant name is "B".   This  facility  can  be
+       used  to  perform  simple simultaneous substitutions, as this pcre2test
+       example shows:
 
-         /(*:pear)apple|(*:orange)lemon/g,replace=${*MARK}
+         /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
              apple lemon
           2: pear orange
 
-       As  well as the usual options for pcre2_match(), a number of additional
-       options can be set in the options argument.
+       As well as the usual options for pcre2_match(), a number of  additional
+       options can be set in the options argument of pcre2_substitute().
 
        PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject
-       string,  replacing  every  matching substring. If this is not set, only
-       the first matching substring is replaced. If any matched substring  has
-       zero  length, after the substitution has happened, an attempt to find a
-       non-empty match at the same position is performed. If this is not  suc-
-       cessful,  the current position is advanced by one character except when
-       CRLF is a valid newline sequence and the next two  characters  are  CR,
-       LF. In this case, the current position is advanced by two characters.
+       string, replacing every matching substring. If this option is not  set,
+       only  the  first matching substring is replaced. The search for matches
+       takes place in the original subject string (that is, previous  replace-
+       ments  do  not  affect  it).  Iteration is implemented by advancing the
+       startoffset value for each search, which is always  passed  the  entire
+       subject string. If an offset limit is set in the match context, search-
+       ing stops when that limit is reached.
+
+       You can restrict the effect of a global substitution to  a  portion  of
+       the subject string by setting either or both of startoffset and an off-
+       set limit. Here is a pcre2test example:
+
+         /B/g,replace=!,use_offset_limit
+         ABC ABC ABC ABC\=offset=3,offset_limit=12
+          2: ABC A!C A!C ABC
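
        The effect of the pcre2test example above can be imitated without the
        library. The sketch below is only an illustration: it handles a
        one-character literal instead of a compiled pattern, the function name
        replace_in_window() is hypothetical, and it assumes that a match may
        start at any offset from startoffset up to and including the offset
        limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy subject to out, replacing each occurrence of target whose offset lies
   in [startoffset, offset_limit]. The search always looks at the original
   subject, so earlier replacements never influence later matches.
   Returns the number of replacements made. */
static size_t replace_in_window(const char *subject, char target,
                                char replacement, size_t startoffset,
                                size_t offset_limit, char *out, size_t outsize)
{
  size_t length = strlen(subject);
  size_t count = 0;

  assert(outsize > length);            /* room for the terminating zero */
  memcpy(out, subject, length + 1);

  for (size_t i = startoffset; i < length && i <= offset_limit; i++)
    {
    if (subject[i] == target)
      {
      out[i] = replacement;
      count++;
      }
    }
  return count;
}
```

        Called with the subject "ABC ABC ABC ABC", target 'B', replacement
        '!', startoffset 3 and offset limit 12, this changes only the B
        characters at offsets 5 and 9, which agrees with the pcre2test output
        shown above.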
+
+       When continuing with global substitutions after  matching  a  substring
+       with zero length, an attempt to find a non-empty match at the same off-
+       set is performed.  If this is not successful, the offset is advanced by
+       one character except when CRLF is a valid newline sequence and the next
+       two characters are CR, LF. In this case, the offset is advanced by  two
+       characters.
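
        The advance rule for an empty match can be sketched as a small helper.
        This is an illustration only, not the library's code: the name
        advance_after_empty_match() is hypothetical, and it assumes an 8-bit,
        non-UTF subject, so that one character is one code unit. It covers
        only the step taken after the retry for a non-empty match at the same
        offset has failed.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset at which the next search should start after an empty
   match at offset, given that no non-empty match was found there.
   crlf_is_newline is nonzero when CRLF is a valid newline sequence. */
static size_t advance_after_empty_match(const char *subject, size_t length,
                                        size_t offset, int crlf_is_newline)
{
  assert(offset < length);             /* something must remain to skip */

  if (crlf_is_newline && offset + 1 < length &&
      subject[offset] == '\r' && subject[offset + 1] == '\n')
    return offset + 2;                 /* step over CR LF as one unit */

  return offset + 1;                   /* otherwise step one character */
}
```

        Stepping over CR LF as a pair avoids restarting the search between the
        two characters of a single newline.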
 
        PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  changes  what happens when the output
        buffer is too small. The default action is to return PCRE2_ERROR_NOMEM-
@@ -2883,10 +3207,10 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
        PCRE2_ERROR_BADREPLACEMENT  is  used for miscellaneous syntax errors in
        the   replacement   string,   with   more   particular   errors   being
        PCRE2_ERROR_BADREPESCAPE  (invalid  escape  sequence), PCRE2_ERROR_REP-
-       MISSING_BRACE (closing curly bracket not found),  PCRE2_BADSUBSTITUTION
-       (syntax  error in extended group substitution), and PCRE2_BADSUBPATTERN
-       (the pattern match ended before it started, which can happen if  \K  is
-       used in an assertion).
+       MISSINGBRACE (closing curly bracket not found),  PCRE2_ERROR_BADSUBSTI-
+       TUTION   (syntax   error   in   extended   group   substitution),   and
+       PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before it  started,
+       which can happen if \K is used in an assertion).
 
        As for all PCRE2 errors, a text message that describes the error can be
        obtained  by  calling  the  pcre2_get_error_message()   function   (see
@@ -2961,13 +3285,13 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 
        The function pcre2_dfa_match() is called  to  match  a  subject  string
        against  a  compiled pattern, using a matching algorithm that scans the
-       subject string just once, and does not backtrack.  This  has  different
-       characteristics  to  the  normal  algorithm, and is not compatible with
-       Perl. Some of the features of PCRE2 patterns are not supported.  Never-
-       theless,  there are times when this kind of matching can be useful. For
-       a discussion of the two matching algorithms, and  a  list  of  features
-       that pcre2_dfa_match() does not support, see the pcre2matching documen-
-       tation.
+       subject string just once (not counting lookaround assertions), and does
+       not  backtrack.  This has different characteristics to the normal algo-
+       rithm, and is not compatible with Perl. Some of the features  of  PCRE2
+       patterns  are  not  supported.  Nevertheless, there are times when this
+       kind of matching can be useful. For a discussion of  the  two  matching
+       algorithms, and a list of features that pcre2_dfa_match() does not sup-
+       port, see the pcre2matching documentation.
 
        The arguments for the pcre2_dfa_match() function are the  same  as  for
        pcre2_match(), plus two extras. The ovector within the match data block
@@ -2991,7 +3315,7 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
            11,             /* the length of the subject string */
            0,              /* start at offset 0 in the subject */
            0,              /* default options */
-           match_data,     /* the match data block */
+           md,             /* the match data block */
            NULL,           /* a match context; NULL means use defaults */
            wspace,         /* working space vector */
            20);            /* number of elements (NOT size in bytes) */
@@ -2999,12 +3323,12 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
    Option bits for pcre2_dfa_match()
 
        The unused bits of the options argument for pcre2_dfa_match()  must  be
-       zero.  The  only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
-       PCRE2_NOTEOL,          PCRE2_NOTEMPTY,          PCRE2_NOTEMPTY_ATSTART,
-       PCRE2_NO_UTF_CHECK,       PCRE2_PARTIAL_HARD,       PCRE2_PARTIAL_SOFT,
-       PCRE2_DFA_SHORTEST, and PCRE2_DFA_RESTART. All but  the  last  four  of
-       these  are  exactly the same as for pcre2_match(), so their description
-       is not repeated here.
+       zero.  The  only  bits that may be set are PCRE2_ANCHORED, PCRE2_ENDAN-
+       CHORED,       PCRE2_NOTBOL,        PCRE2_NOTEOL,        PCRE2_NOTEMPTY,
+       PCRE2_NOTEMPTY_ATSTART,     PCRE2_NO_UTF_CHECK,     PCRE2_PARTIAL_HARD,
+       PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST, and PCRE2_DFA_RESTART. All  but
+       the  last  four  of these are exactly the same as for pcre2_match(), so
+       their description is not repeated here.
 
          PCRE2_PARTIAL_HARD
          PCRE2_PARTIAL_SOFT
@@ -3093,7 +3417,7 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
        example,  the pattern "a\d+" is compiled as if it were "a\d++". For DFA
        matching, this means that only one possible  match  is  found.  If  you
        really  do  want multiple matches in such cases, either use an ungreedy
-       repeat auch as "a\d+?" or set  the  PCRE2_NO_AUTO_POSSESS  option  when
+       repeat such as "a\d+?" or set  the  PCRE2_NO_AUTO_POSSESS  option  when
        compiling.
 
    Error returns from pcre2_dfa_match()
@@ -3138,8 +3462,7 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 SEE ALSO
 
        pcre2build(3),    pcre2callout(3),    pcre2demo(3),   pcre2matching(3),
-       pcre2partial(3),    pcre2posix(3),    pcre2sample(3),    pcre2stack(3),
-       pcre2unicode(3).
+       pcre2partial(3), pcre2posix(3), pcre2sample(3), pcre2unicode(3).
 
 
 AUTHOR
@@ -3151,8 +3474,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 17 June 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 31 December 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -3198,21 +3521,21 @@ PCRE2 BUILD-TIME OPTIONS
 
          ./configure --help
 
-       The  following  sections  include  descriptions  of options whose names
-       begin with --enable or --disable. These settings specify changes to the
-       defaults  for  the configure command. Because of the way that configure
-       works, --enable and --disable always come in pairs, so  the  complemen-
-       tary  option always exists as well, but as it specifies the default, it
-       is not described.
+       The  following  sections include descriptions of "on/off" options whose
+       names begin with --enable or --disable. Because of the way that config-
+       ure  works, --enable and --disable always come in pairs, so the comple-
+       mentary option always exists as well, but as it specifies the  default,
+       it is not described.  Options that specify values have names that start
+       with --with.
 
 
 BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
 
        By default, a library called libpcre2-8 is built, containing  functions
-       that  take  string arguments contained in vectors of bytes, interpreted
+       that  take  string  arguments contained in arrays of bytes, interpreted
        either as single-byte characters, or UTF-8 strings. You can also  build
        two  other libraries, called libpcre2-16 and libpcre2-32, which process
-       strings that are contained in vectors of 16-bit and 32-bit code  units,
+       strings that are contained in arrays of 16-bit and 32-bit  code  units,
        respectively. These can be interpreted either as single-unit characters
        or UTF-16/UTF-32 strings. To build these additional libraries, add  one
        or both of the following to the configure command:
@@ -3260,11 +3583,11 @@ UNICODE AND UTF SUPPORT
        application has locked this out by setting PCRE2_NEVER_UTF.
 
        UTF support allows the libraries to process character code points up to
-       0x10ffff in the strings that they handle. It also provides support  for
-       accessing  the  Unicode  properties  of  such characters, using pattern
-       escapes such as \P, \p, and \X. Only the  general  category  properties
-       such  as Lu and Nd are supported. Details are given in the pcre2pattern
-       documentation.
+       0x10ffff in the strings that they handle. Unicode  support  also  gives
+       access  to  the Unicode properties of characters, using pattern escapes
+       such as \P, \p, and \X. Only the general category properties such as Lu
+       and  Nd are supported. Details are given in the pcre2pattern documenta-
+       tion.
 
        Pattern escapes such as \d and \w do not by default make use of Unicode
        properties.  The  application  can  request that they do by setting the
@@ -3287,15 +3610,21 @@ DISABLING THE USE OF \C
 
 JUST-IN-TIME COMPILER SUPPORT
 
-       Just-in-time compiler support is included in the build by specifying
+       Just-in-time  (JIT) compiler support is included in the build by speci-
+       fying
 
          --enable-jit
 
-       This  support  is available only for certain hardware architectures. If
-       this option is set for an unsupported architecture,  a  building  error
-       occurs.   See the pcre2jit documentation for a discussion of JIT usage.
-       When JIT support is enabled, pcre2grep automatically makes use  of  it,
-       unless you add
+       This support is available only for certain hardware  architectures.  If
+       this option is set for an unsupported architecture, a build error
+       occurs. If you are running under SELinux you may also want to add
+
+         --enable-jit-sealloc
+
+       which enables the use of an execmem allocator in JIT that is compatible
+       with  SELinux.  This  has  no  effect  if  JIT  is not enabled. See the
+       pcre2jit documentation for a discussion of JIT usage. When JIT  support
+       is enabled, pcre2grep automatically makes use of it, unless you add
 
          --disable-pcre2grep-jit
 
@@ -3325,7 +3654,7 @@ NEWLINE RECOGNITION
          --enable-newline-is-anycrlf
 
        which  causes  PCRE2 to recognize any of the three sequences CR, LF, or
-       CRLF as indicating a line ending. Finally, a fifth option, specified by
+       CRLF as indicating a line ending. A fifth option, specified by
 
          --enable-newline-is-any
 
@@ -3333,144 +3662,148 @@ NEWLINE RECOGNITION
        newline sequences are the three just mentioned, plus the single charac-
        ters VT (vertical tab, U+000B), FF (form feed, U+000C), NEL (next line,
        U+0085),  LS  (line  separator,  U+2028),  and PS (paragraph separator,
-       U+2029).
+       U+2029). The final option is
+
+         --enable-newline-is-nul
+
+       which causes NUL (binary zero) to be set as the default line-ending
+       character.
 
        Whatever default line ending convention is selected when PCRE2 is built
-       can  be  overridden by applications that use the library. At build time
-       it is conventional to use the standard for your operating system.
+       can be overridden by applications that use the library. At  build  time
+       it is recommended to use the standard for your operating system.
 
 
 WHAT \R MATCHES
 
-       By default, the sequence \R in a pattern matches  any  Unicode  newline
-       sequence,  independently  of  what has been selected as the line ending
+       By  default,  the  sequence \R in a pattern matches any Unicode newline
+       sequence, independently of what has been selected as  the  line  ending
        sequence. If you specify
 
          --enable-bsr-anycrlf
 
-       the default is changed so that \R matches only CR, LF, or  CRLF.  What-
-       ever  is selected when PCRE2 is built can be overridden by applications
-       that use the called.
+       the  default  is changed so that \R matches only CR, LF, or CRLF. What-
+       ever is selected when PCRE2 is built can be overridden by  applications
+       that use the library.
 
 
 HANDLING VERY LARGE PATTERNS
 
-       Within a compiled pattern, offset values are used  to  point  from  one
-       part  to another (for example, from an opening parenthesis to an alter-
-       nation metacharacter). By default, in the 8-bit and  16-bit  libraries,
-       two-byte  values  are used for these offsets, leading to a maximum size
-       for a compiled pattern of around 64K code units. This is sufficient  to
+       Within  a  compiled  pattern,  offset values are used to point from one
+       part to another (for example, from an opening parenthesis to an  alter-
+       nation  metacharacter).  By default, in the 8-bit and 16-bit libraries,
+       two-byte values are used for these offsets, leading to a  maximum  size
+       for  a compiled pattern of around 64K code units. This is sufficient to
        handle all but the most gigantic patterns. Nevertheless, some people do
-       want to process truly enormous patterns, so it is possible  to  compile
-       PCRE2  to  use three-byte or four-byte offsets by adding a setting such
+       want  to  process truly enormous patterns, so it is possible to compile
+       PCRE2 to use three-byte or four-byte offsets by adding a  setting  such
        as
 
          --with-link-size=3
 
-       to the configure command. The value given must be 2, 3, or 4.  For  the
-       16-bit  library,  a  value of 3 is rounded up to 4. In these libraries,
-       using longer offsets slows down the operation of PCRE2 because  it  has
-       to  load additional data when handling them. For the 32-bit library the
-       value is always 4 and cannot be overridden; the value  of  --with-link-
+       to  the  configure command. The value given must be 2, 3, or 4. For the
+       16-bit library, a value of 3 is rounded up to 4.  In  these  libraries,
+       using  longer  offsets slows down the operation of PCRE2 because it has
+       to load additional data when handling them. For the 32-bit library  the
+       value  is  always 4 and cannot be overridden; the value of --with-link-
        size is ignored.
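
        The rules for the link size can be summarised in a few lines of C.
        This is a sketch of the rules as stated above, not the configure
        logic itself: effective_link_size() is a hypothetical helper, and
        returning 0 for an out-of-range request is an assumption made for the
        illustration.

```c
#include <assert.h>

/* Return the link size in bytes that results from requesting
   --with-link-size=requested for a library with the given code unit width
   (8, 16 or 32 bits). Returns 0 when the request is outside 2..4. */
static int effective_link_size(int code_unit_bits, int requested)
{
  assert(code_unit_bits == 8 || code_unit_bits == 16 || code_unit_bits == 32);

  if (code_unit_bits == 32)
    return 4;                          /* always 4; the setting is ignored */
  if (requested < 2 || requested > 4)
    return 0;                          /* value must be 2, 3, or 4 */
  if (code_unit_bits == 16 && requested == 3)
    return 4;                          /* 3 is rounded up to 4 */
  return requested;
}
```

        For example, a request of 3 gives 3 for the 8-bit library but 4 for the
        16-bit library.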
 
 
-AVOIDING EXCESSIVE STACK USAGE
-
-       When  matching  with the pcre2_match() function, PCRE2 implements back-
-       tracking by making recursive  calls  to  an  internal  function  called
-       match().  In  environments where the size of the stack is limited, this
-       can severely limit PCRE2's operation. (The Unix  environment  does  not
-       usually  suffer from this problem, but it may sometimes be necessary to
-       increase  the  maximum  stack  size.  There  is  a  discussion  in  the
-       pcre2stack  documentation.)  An  alternative approach to recursion that
-       uses memory from the heap to remember data, instead of using  recursive
-       function  calls, has been implemented to work round the problem of lim-
-       ited stack size. If you want to build a version  of  PCRE2  that  works
-       this way, add
+LIMITING PCRE2 RESOURCE USAGE
 
-         --disable-stack-for-recursion
+       The pcre2_match() function increments a counter each time it goes round
+       its  main  loop. Putting a limit on this counter controls the amount of
+       computing resource used by a single call to  pcre2_match().  The  limit
+       can be changed at run time, as described in the pcre2api documentation.
+       The default is 10 million, but this can be changed by adding a  setting
+       such as
 
-       to the configure command. By default, the system functions malloc() and
-       free() are called to manage the heap memory that is required, but  cus-
-       tom  memory  management  functions  can  be  called instead. PCRE2 runs
-       noticeably more slowly when built in this way. This option affects only
-       the pcre2_match() function; it is not relevant for pcre2_dfa_match().
+         --with-match-limit=500000
 
+       to   the   configure   command.   This  setting  also  applies  to  the
+       pcre2_dfa_match() matching function, and to JIT  matching  (though  the
+       counting is done differently).
 
-LIMITING PCRE2 RESOURCE USAGE
+       The  pcre2_match() function starts out using a 20K vector on the system
+       stack to record  backtracking  points.  The  more  nested  backtracking
+       points there are (that is, the deeper the search tree), the more memory
+       is needed. If the initial vector is not large enough,  heap  memory  is
+       used, up to a certain limit, which is specified in kilobytes. The limit
+       can be changed at run time, as described in the pcre2api documentation.
+       The default limit (in effect unlimited) is 20 million  kilobytes.  You
+       can change this by adding a setting such as
 
-       Internally, PCRE2 has a function called match(), which it calls repeat-
-       edly  (sometimes  recursively)  when  matching  a  pattern   with   the
-       pcre2_match() function. By controlling the maximum number of times this
-       function may be called during a single matching operation, a limit  can
-       be  placed on the resources used by a single call to pcre2_match(). The
-       limit can be changed at run time, as described in the pcre2api documen-
-       tation.  The default is 10 million, but this can be changed by adding a
-       setting such as
+         --with-heap-limit=500
 
-         --with-match-limit=500000
+       which limits the amount of heap to 500 kilobytes.  This  limit  applies
+       only  to interpretive matching in pcre2_match(). It does not apply when
+       JIT (which has its own memory arrangements) is used, nor does it  apply
+       to pcre2_dfa_match().
 
-       to  the  configure  command.  This  setting  has  no  effect   on   the
-       pcre2_dfa_match() matching function.
+       You  can  also explicitly limit the depth of nested backtracking in the
+       pcre2_match() interpreter. This limit defaults to the value that is set
+       for  --with-match-limit.  You  can set a lower default limit by adding,
+       for example,
 
-       In  some  environments  it is desirable to limit the depth of recursive
-       calls of match() more strictly than the total number of calls, in order
-       to  restrict  the maximum amount of stack (or heap, if --disable-stack-
-       for-recursion is specified) that is used. A second limit controls this;
-       it  defaults  to  the  value  that is set for --with-match-limit, which
-       imposes no additional constraints. However, you can set a  lower  limit
-       by adding, for example,
+         --with-match-limit-depth=10000
 
-         --with-match-limit-recursion=10000
+       to the configure command. This value can be  overridden  at  run  time.
+       This  depth  limit  indirectly limits the amount of heap memory that is
+       used, but because the size of each backtracking "frame" depends on  the
+       number  of  capturing parentheses in a pattern, the amount of heap that
+       is used before the limit is reached varies  from  pattern  to  pattern.
+       This  limit  was  more  useful in versions before 10.30, where function
+       recursion was used for backtracking.
 
-       to  the  configure  command.  This  value can also be overridden at run
-       time.
+       As well as applying to pcre2_match(), the depth limit also controls the
+       depth  of recursive function calls in pcre2_dfa_match(). These are used
+       for lookaround assertions, atomic groups,  and  recursion  within  pat-
+       terns.  The limit does not apply to JIT matching.
 
 
 CREATING CHARACTER TABLES AT BUILD TIME
 
        PCRE2 uses fixed tables for processing characters whose code points are
        less than 256. By default, PCRE2 is built with a set of tables that are
-       distributed in the file src/pcre2_chartables.c.dist. These  tables  are
+       distributed  in  the file src/pcre2_chartables.c.dist. These tables are
        for ASCII codes only. If you add
 
          --enable-rebuild-chartables
 
-       to  the  configure  command, the distributed tables are no longer used.
-       Instead, a program called dftables is compiled and  run.  This  outputs
+       to the configure command, the distributed tables are  no  longer  used.
+       Instead,  a  program  called dftables is compiled and run. This outputs
        the source for a new set of tables, created in the default locale of your
-       C run-time system. (This method of replacing the tables does  not  work
-       if  you are cross compiling, because dftables is run on the local host.
-       If you need to create alternative tables when cross compiling, you will
-       have to do so "by hand".)
+       C run-time system. This method of replacing the tables does not work if
+       you are cross compiling, because dftables is run on the local host.  If
+       you  need  to  create alternative tables when cross compiling, you will
+       have to do so "by hand".
 
 
 USING EBCDIC CODE
 
-       PCRE2  assumes  by default that it will run in an environment where the
-       character code is ASCII or Unicode, which is a superset of ASCII.  This
+       PCRE2 assumes by default that it will run in an environment  where  the
+       character  code is ASCII or Unicode, which is a superset of ASCII. This
        is the case for most computer operating systems. PCRE2 can, however, be
        compiled to run in an 8-bit EBCDIC environment by adding
 
          --enable-ebcdic --disable-unicode
 
        to the configure command. This setting implies --enable-rebuild-charta-
-       bles.  You  should  only  use  it if you know that you are in an EBCDIC
+       bles. You should only use it if you know that  you  are  in  an  EBCDIC
        environment (for example, an IBM mainframe operating system).
 
-       It is not possible to support both EBCDIC and UTF-8 codes in  the  same
-       version  of  the  library. Consequently, --enable-unicode and --enable-
+       It  is  not possible to support both EBCDIC and UTF-8 codes in the same
+       version of the library. Consequently,  --enable-unicode  and  --enable-
        ebcdic are mutually exclusive.
 
        The EBCDIC character that corresponds to an ASCII LF is assumed to have
-       the  value  0x15 by default. However, in some EBCDIC environments, 0x25
+       the value 0x15 by default. However, in some EBCDIC  environments,  0x25
        is used. In such an environment you should use
 
          --enable-ebcdic-nl25
 
        as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR
-       has  the  same  value  as in ASCII, namely, 0x0d. Whichever of 0x15 and
+       has the same value as in ASCII, namely, 0x0d.  Whichever  of  0x15  and
        0x25 is not chosen as LF is made to correspond to the Unicode NEL char-
        acter (which, in Unicode, is 0x85).
 
@@ -3483,39 +3816,44 @@ PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS
 
        By default, on non-Windows systems, pcre2grep supports the use of call-
        outs with string arguments within the patterns it is matching, in order
-       to  run external scripts. For details, see the pcre2grep documentation.
-       This support can be disabled by adding  --disable-pcre2grep-callout  to
+       to run external scripts. For details, see the pcre2grep  documentation.
+       This  support  can be disabled by adding --disable-pcre2grep-callout to
        the configure command.
 
 
 PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT
 
-       By  default,  pcre2grep reads all files as plain text. You can build it
-       so that it recognizes files whose names end in .gz or .bz2,  and  reads
+       By default, pcre2grep reads all files as plain text. You can  build  it
+       so  that  it recognizes files whose names end in .gz or .bz2, and reads
        them with libz or libbz2, respectively, by adding one or both of
 
          --enable-pcre2grep-libz
          --enable-pcre2grep-libbz2
 
        to the configure command. These options naturally require that the rel-
-       evant libraries are installed on your system. Configuration  will  fail
+       evant  libraries  are installed on your system. Configuration will fail
        if they are not.
 
 
 PCRE2GREP BUFFER SIZE
 
-       pcre2grep  uses an internal buffer to hold a "window" on the file it is
+       pcre2grep uses an internal buffer to hold a "window" on the file it  is
        scanning, in order to be able to output "before" and "after" lines when
-       it  finds  a match. The size of the buffer is controlled by a parameter
-       whose default value is 20K. The buffer itself is three times this size,
-       but because of the way it is used for holding "before" lines, the long-
-       est line that is guaranteed to be processable is  the  parameter  size.
-       You can change the default parameter value by adding, for example,
+       it finds a match. The starting size of the buffer is  controlled  by  a
+       parameter  whose default value is 20K. The buffer itself is three times
+       this size, but because of the way  it  is  used  for  holding  "before"
+       lines,  the  longest  line  that is guaranteed to be processable is the
+       parameter size. If a longer line is  encountered,  pcre2grep  automati-
+       cally expands the buffer, up to a specified maximum size, whose default
+       is 1M or the starting size, whichever is the larger. You can change the
+       default parameter values by adding, for example,
 
-         --with-pcre2grep-bufsize=50K
+         --with-pcre2grep-bufsize=51200
+         --with-pcre2grep-max-bufsize=2097152
 
-       to  the  configure  command.  The caller of pcre2grep can override this
-       value by using --buffer-size on the command line.
+       to  the  configure  command. The caller of pcre2grep can override these
+       values by using --buffer-size  and  --max-buffer-size  on  the  command
+       line.
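
        The sizing rules above can be sketched in a few lines of C. This is
        only an illustration of the arithmetic, not code from pcre2grep: the
        function names are hypothetical, and reading "1M" as 1048576 bytes
        (and "20K" as 20480 bytes) is an assumption.

```c
#include <stddef.h>

/* Assumption for this sketch: "1M" means 1048576 bytes. */
#define SKETCH_ONE_MEG ((size_t)1048576)

/* Hypothetical helper: the default maximum buffer size is 1M or the
   starting size, whichever is the larger. */
size_t default_max_bufsize(size_t start_size)
{
    return start_size > SKETCH_ONE_MEG ? start_size : SKETCH_ONE_MEG;
}

/* Hypothetical helper: the buffer itself is three times the parameter
   value, although only one parameter's worth of line is guaranteed to
   be processable. */
size_t allocated_bytes(size_t bufsize_param)
{
    return 3 * bufsize_param;
}
```

        With the example settings shown earlier, a starting size of 51200
        keeps the 1M default maximum unless --with-pcre2grep-max-bufsize is
        also given.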
 
 
 PCRE2TEST OPTION FOR LIBREADLINE SUPPORT
@@ -3525,26 +3863,26 @@ PCRE2TEST OPTION FOR LIBREADLINE SUPPORT
          --enable-pcre2test-libreadline
          --enable-pcre2test-libedit
 
-       to the configure command, pcre2test  is  linked  with  the  libreadline
+       to  the  configure  command,  pcre2test  is linked with the libreadline
        or libedit library, respectively, and when its input is from a terminal,
-       it reads it using the readline() function. This  provides  line-editing
-       and  history  facilities.  Note that libreadline is GPL-licensed, so if
-       you distribute a binary of pcre2test linked in this way, there  may  be
+       it  reads  it using the readline() function. This provides line-editing
+       and history facilities. Note that libreadline is  GPL-licensed,  so  if
+       you  distribute  a binary of pcre2test linked in this way, there may be
        licensing issues. These can be avoided by linking instead with libedit,
        which has a BSD licence.
 
-       Setting --enable-pcre2test-libreadline causes the -lreadline option  to
-       be  added to the pcre2test build. In many operating environments with a
-       sytem-installed readline library this is sufficient. However,  in  some
+       Setting  --enable-pcre2test-libreadline causes the -lreadline option to
+       be added to the pcre2test build. In many operating environments with  a
+       system-installed readline library this is sufficient. However, in some
        environments (e.g. if an unmodified distribution version of readline is
-       in use), some extra configuration may be necessary.  The  INSTALL  file
+       in  use),  some  extra configuration may be necessary. The INSTALL file
        for libreadline says this:
 
          "Readline uses the termcap functions, but does not link with
          the termcap or curses library itself, allowing applications
          which link with readline the to choose an appropriate library."
 
-       If  your environment has not been set up so that an appropriate library
+       If your environment has not been set up so that an appropriate  library
        is automatically included, you may need to add something like
 
          LIBS="-lncurses"
@@ -3558,7 +3896,7 @@ INCLUDING DEBUGGING CODE
 
          --enable-debug
 
-       to the configure command, additional debugging code is included in  the
+       to  the configure command, additional debugging code is included in the
        build. This feature is intended for use by the PCRE2 maintainers.
 
 
@@ -3568,15 +3906,15 @@ DEBUGGING WITH VALGRIND SUPPORT
 
          --enable-valgrind
 
-       to  the  configure command, PCRE2 will use valgrind annotations to mark
-       certain memory regions as  unaddressable.  This  allows  it  to  detect
-       invalid  memory  accesses,  and  is  mostly  useful for debugging PCRE2
+       to the configure command, PCRE2 will use valgrind annotations  to  mark
+       certain  memory  regions  as  unaddressable.  This  allows it to detect
+       invalid memory accesses, and  is  mostly  useful  for  debugging  PCRE2
        itself.
 
 
 CODE COVERAGE REPORTING
 
-       If your C compiler is gcc, you can build a version of  PCRE2  that  can
+       If  your  C  compiler is gcc, you can build a version of PCRE2 that can
        generate a code coverage report for its test suite. To enable this, you
        must install lcov version 1.6 or above. Then specify
 
@@ -3585,20 +3923,20 @@ CODE COVERAGE REPORTING
        to the configure command and build PCRE2 in the usual way.
 
        Note that using ccache (a caching C compiler) is incompatible with code
-       coverage  reporting. If you have configured ccache to run automatically
+       coverage reporting. If you have configured ccache to run  automatically
        on your system, you must set the environment variable
 
          CCACHE_DISABLE=1
 
        before running make to build PCRE2, so that ccache is not used.
 
-       When --enable-coverage is used,  the  following  addition  targets  are
+       When --enable-coverage is used, the  following  additional  targets  are
        added to the Makefile:
 
          make coverage
 
-       This  creates  a  fresh coverage report for the PCRE2 test suite. It is
-       equivalent to running "make coverage-reset", "make  coverage-baseline",
+       This creates a fresh coverage report for the PCRE2 test  suite.  It  is
+       equivalent  to running "make coverage-reset", "make coverage-baseline",
        "make check", and then "make coverage-report".
 
          make coverage-reset
@@ -3615,21 +3953,59 @@ CODE COVERAGE REPORTING
 
          make coverage-clean-report
 
-       This  removes the generated coverage report without cleaning the cover-
+       This removes the generated coverage report without cleaning the  cover-
        age data itself.
 
          make coverage-clean-data
 
-       This removes the captured coverage data without removing  the  coverage
+       This  removes  the captured coverage data without removing the coverage
        files created at compile time (*.gcno).
 
          make coverage-clean
 
-       This  cleans all coverage data including the generated coverage report.
-       For more information about code coverage, see the gcov and  lcov  docu-
+       This cleans all coverage data including the generated coverage  report.
+       For  more  information about code coverage, see the gcov and lcov docu-
        mentation.
 
 
+SUPPORT FOR FUZZERS
+
+       There is a special option for use by people who  want  to  run  fuzzing
+       tests on PCRE2:
+
+         --enable-fuzz-support
+
+       At present this applies only to the 8-bit library. If set, it causes an
+       extra library  called  libpcre2-fuzzsupport.a  to  be  built,  but  not
+       installed.  This contains a single function called LLVMFuzzerTestOneIn-
+       put() whose arguments are a pointer to a string and the length  of  the
+       string.  When  called,  this  function tries to compile the string as a
+       pattern, and if that succeeds, to match it.  This is done both with  no
+       options  and  with some random option bits that are generated from the
+       string.
+
+       Setting --enable-fuzz-support also causes  a  binary  called  pcre2fuz-
+       zcheck  to be created. This is normally run under valgrind or used when
+       PCRE2 is compiled with address sanitizing enabled. It calls the fuzzing
+       function and outputs information about what it is doing. The input strings
+       are specified by arguments: if an argument starts with "=" the rest  of
+       it  is  a  literal  input string. Otherwise, it is assumed to be a file
+       name, and the contents of the file are the test string.
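
        The argument convention described above can be sketched as follows.
        This is a minimal illustration, not the pcre2fuzzcheck source: the
        enum and function names are hypothetical, and reading the file is
        left out.

```c
#include <stddef.h>

/* Hypothetical classification of one command-line argument. */
typedef enum { INPUT_LITERAL, INPUT_FILE } input_kind;

/* An argument that starts with "=" supplies a literal input string (the
   text after the "="); anything else is treated as a file name. The
   chosen string is returned through *value. */
input_kind classify_argument(const char *arg, const char **value)
{
    if (arg[0] == '=') {
        *value = arg + 1;   /* rest of the argument is the test string */
        return INPUT_LITERAL;
    }
    *value = arg;           /* whole argument is taken as a file name */
    return INPUT_FILE;
}
```

        Under this reading, an argument consisting of "=" alone would supply
        an empty test string.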
+
+
+OBSOLETE OPTION
+
+       In versions of PCRE2 prior to 10.30, there were two  ways  of  handling
+       backtracking  in the pcre2_match() function. The default was to use the
+       system stack, but if
+
+         --disable-stack-for-recursion
+
+       was set, memory on the heap was used. From release 10.30  onwards  this
+       has  changed  (the  stack  is  no longer used) and this option now does
+       nothing except give a warning.
+
+
 SEE ALSO
 
        pcre2api(3), pcre2-config(3).
@@ -3644,8 +4020,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 01 April 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 18 July 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -3689,14 +4065,22 @@ DESCRIPTION
 
        If the PCRE2_AUTO_CALLOUT option bit is set when a pattern is compiled,
        PCRE2 automatically inserts callouts, all with number 255, before  each
-       item  in  the  pattern. For example, if PCRE2_AUTO_CALLOUT is used with
-       the pattern
+       item  in the pattern except for immediately before or after an explicit
+       callout. For example, if PCRE2_AUTO_CALLOUT is used with the pattern
 
-         A(\d{2}|--)
+         A(?C3)B
 
        it is processed as if it were
 
-       (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
+         (?C255)A(?C3)B(?C255)
+
+       Here is a more complicated example:
+
+         A(\d{2}|--)
+
+       With PCRE2_AUTO_CALLOUT, this pattern is processed as if it were
+
+         (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
 
        Notice that there is a callout before and after  each  parenthesis  and
        alternation bar. If the pattern contains a conditional group whose con-
@@ -3737,10 +4121,11 @@ MISSING CALLOUTS
          No match
 
        This indicates that when matching [bc] fails, there is no  backtracking
-       into  a+  and  therefore the callouts that would be taken for the back-
-       tracks do not occur.  You can disable the  auto-possessify  feature  by
-       passing  PCRE2_NO_AUTO_POSSESS to pcre2_compile(), or starting the pat-
-       tern with (*NO_AUTO_POSSESS). In this case, the output changes to this:
+       into a+ (because it is being treated as a++) and therefore the callouts
+       that would be taken for the backtracks do not occur.  You  can  disable
+       the   auto-possessify   feature  by  passing  PCRE2_NO_AUTO_POSSESS  to
+       pcre2_compile(), or starting the pattern  with  (*NO_AUTO_POSSESS).  In
+       this case, the output changes to this:
 
          --->aaaa
           +0 ^        a+
@@ -3756,14 +4141,17 @@ MISSING CALLOUTS
    Automatic .* anchoring
 
        By default, an optimization is applied when .* is the first significant
-       item in a pattern. If PCRE2_DOTALL is set, so that the  dot  can  match
-       any  character,  the pattern is automatically anchored. If PCRE2_DOTALL
-       is not set, a match can start only after an internal newline or at  the
-       beginning  of  the  subject,  and  pcre2_compile() remembers this. This
-       optimization is disabled, however, if .* is in an atomic  group  or  if
-       there  is  a back reference to the capturing group in which it appears.
-       It is also disabled if the pattern contains (*PRUNE) or  (*SKIP).  How-
-       ever, the presence of callouts does not affect it.
+       item  in  a  pattern. If PCRE2_DOTALL is set, so that the dot can match
+       any character, the pattern is automatically anchored.  If  PCRE2_DOTALL
+       is  not set, a match can start only after an internal newline or at the
+       beginning of the subject, and pcre2_compile() remembers this. If a pat-
+       tern  has more than one top-level branch, automatic anchoring occurs if
+       all branches are anchorable.
+
+       This optimization is disabled, however, if .* is in an atomic group  or
+       if  there  is  a  back  reference  to  the  capturing group in which it
+       appears. It is also  disabled  if  the  pattern  contains  (*PRUNE)  or
+       (*SKIP). However, the presence of callouts does not affect it.
 
        For  example,  if  the pattern .*\d is compiled with PCRE2_AUTO_CALLOUT
        and applied to the string "aa", the pcre2test output is:
@@ -3795,46 +4183,45 @@ MISSING CALLOUTS
        ter.   Another  optimization, described in the next section, means that
        there is no subsequent attempt to match with an empty subject.
 
-       If a pattern has more than one top-level  branch,  automatic  anchoring
-       occurs if all branches are anchorable.
-
    Other optimizations
 
-       Other  optimizations  that  provide fast "no match" results also affect
+       Other optimizations that provide fast "no match"  results  also  affect
        callouts.  For example, if the pattern is
 
          ab(?C4)cd
 
-       PCRE2 knows that any matching string must contain the  letter  "d".  If
-       the  subject  string  is  "abyz",  the  lack of "d" means that matching
-       doesn't ever start, and the callout is  never  reached.  However,  with
+       PCRE2  knows  that  any matching string must contain the letter "d". If
+       the subject string is "abyz", the  lack  of  "d"  means  that  matching
+       doesn't  ever  start,  and  the callout is never reached. However, with
        "abyd", though the result is still no match, the callout is obeyed.
 
-       PCRE2  also  knows  the  minimum  length of a matching string, and will
-       immediately give a "no match" return without actually running  a  match
-       if  the  subject is not long enough, or, for unanchored patterns, if it
-       has been scanned far enough.
+       For most patterns PCRE2 also knows the minimum  length  of  a  matching
+       string,  and will immediately give a "no match" return without actually
+       running a match if the subject is not long enough, or,  for  unanchored
+       patterns, if it has been scanned far enough.
 
        You can disable these optimizations by passing the PCRE2_NO_START_OPTI-
-       MIZE  option  to  pcre2_compile(),  or  by  starting  the  pattern with
-       (*NO_START_OPT). This slows down the matching process, but does  ensure
+       MIZE option  to  pcre2_compile(),  or  by  starting  the  pattern  with
+       (*NO_START_OPT).  This slows down the matching process, but does ensure
        that callouts such as the example above are obeyed.
 
 
 THE CALLOUT INTERFACE
 
-       During  matching,  when  PCRE2  reaches a callout point, if an external
-       function is set in the match context, it is  called.  This  applies  to
-       both  normal  and DFA matching. The first argument to the callout func-
-       tion is a pointer to a pcre2_callout block. The second argument is  the
-       void  *  callout  data that was supplied when the callout was set up by
-       calling pcre2_set_callout() (see the pcre2api documentation). The call-
-       out block structure contains the following fields:
+       During matching, when PCRE2 reaches a callout  point,  if  an  external
+       function  is  provided in the match context, it is called. This applies
+       to normal, DFA, and JIT matching. The  first  argument  to  the  callout
+       function is a pointer to a pcre2_callout block. The second argument
+       is the void * callout data that was supplied when the callout  was  set
+       up by calling pcre2_set_callout() (see the pcre2api documentation). The
+       callout block structure contains the following fields, not  necessarily
+       in this order:
 
          uint32_t      version;
          uint32_t      callout_number;
          uint32_t      capture_top;
          uint32_t      capture_last;
+         uint32_t      callout_flags;
          PCRE2_SIZE   *offset_vector;
          PCRE2_SPTR    mark;
          PCRE2_SPTR    subject;
@@ -3848,19 +4235,19 @@ THE CALLOUT INTERFACE
          PCRE2_SPTR    callout_string;
 
        The  version field contains the version number of the block format. The
-       current version is 1; the three callout string fields  were  added  for
-       this  version. If you are writing an application that might use an ear-
-       lier release of PCRE2, you  should  check  the  version  number  before
-       accessing  any  of  these  fields.  The version number will increase in
-       future if more fields are added, but the intention is never  to  remove
-       any of the existing fields.
+       current version is 2; the three callout string fields  were  added  for
+       version  1, and the callout_flags field for version 2. If you are writ-
+       ing an application that might use an  earlier  release  of  PCRE2,  you
+       should  check  the version number before accessing any of these fields.
+       The version number will increase in future if more  fields  are  added,
+       but the intention is never to remove any of the existing fields.
 
    Fields for numerical callouts
 
        For  a  numerical  callout,  callout_string is NULL, and callout_number
        contains the number of the callout, in the range  0-255.  This  is  the
-       number  that  follows  (?C for manual callouts; it is 255 for automati-
-       cally generated callouts.
+       number that follows (?C for callouts that are part of the  pattern;  it
+       is 255 for automatically generated callouts.
 
    Fields for string callouts
 
@@ -3885,74 +4272,123 @@ THE CALLOUT INTERFACE
        The remaining fields in the callout block are the same for  both  kinds
        of callout.
 
-       The offset_vector field is a pointer to the vector of capturing offsets
-       (the "ovector") that was passed to the matching function in  the  match
-       data  block.  When pcre2_match() is used, the contents can be inspected
-       in order to extract substrings that have been matched so  far,  in  the
-       same  way as for extracting substrings after a match has completed. For
-       the DFA matching function, this field is not useful.
+       The  offset_vector  field is a pointer to a vector of capturing offsets
+       (the "ovector"). You may read the elements in this vector, but you must
+       not change any of them.
+
+       For  calls  to  pcre2_match(),  the  offset_vector  field is not (since
+       release 10.30) a pointer to the actual ovector that was passed  to  the
+       matching  function  in  the  match  data block. Instead it points to an
+       internal ovector of a size large enough to hold all  possible  captured
+       substrings in the pattern. Note that whenever a recursion or subroutine
+       call within a pattern completes, the capturing state is reset  to  what
+       it was before.
+
+       The  capture_last  field  contains the number of the most recently cap-
+       tured substring, and the capture_top field contains one more  than  the
+       number  of  the  highest numbered captured substring so far. If no sub-
+       strings have yet been captured, the value of capture_last is 0 and  the
+       value  of  capture_top  is  1. The values of these fields do not always
+       differ  by  one;  for  example,  when  the  callout  in   the   pattern
+       ((a)(b))(?C2) is taken, capture_last is 1 but capture_top is 4.
+
+       The   contents  of  ovector[2]  to  ovector[<capture_top>*2-1]  can  be
+       inspected in order to extract substrings that have been matched so far,
+       in  the  same way as extracting substrings after a match has completed.
+       The values in ovector[0] and ovector[1] are always PCRE2_UNSET  because
+       the  match is by definition not complete. Substrings that have not been
+       captured but whose numbers are less than capture_top also have both  of
+       their ovector slots set to PCRE2_UNSET.
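
        As a rough sketch of the inspection described above, a callout
        function could walk the ovector like this. The helper is
        hypothetical and deliberately avoids pcre2.h so that it stands
        alone: it assumes the 8-bit library (subject treated as char data)
        and uses a local stand-in for PCRE2_UNSET, which pcre2.h defines as
        (~(PCRE2_SIZE)0) with PCRE2_SIZE being size_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for PCRE2_UNSET; real code should include pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical helper: print every group captured so far and return how
   many were set. The first three parameters correspond to the
   offset_vector, capture_top and subject fields of the callout block. */
size_t print_captures_so_far(const size_t *ovector, uint32_t capture_top,
                             const char *subject, FILE *out)
{
    size_t count = 0;

    /* Pair 0 is always unset during a callout, so start at group 1. */
    for (uint32_t i = 1; i < capture_top; i++) {
        size_t start = ovector[2 * i];
        size_t end = ovector[2 * i + 1];

        if (start == SKETCH_UNSET || end == SKETCH_UNSET)
            continue;   /* numbered below capture_top but not captured */

        fprintf(out, "%u: %.*s\n", (unsigned)i, (int)(end - start),
                subject + start);
        count++;
    }
    return count;
}
```

        The vector is only read here, in line with the rule that a callout
        must not change any of its elements.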
+
+       For  DFA  matching,  the offset_vector field points to the ovector that
+       was passed to the matching function in the match  data  block,  but  it
+       holds  no  useful information at callout time because pcre2_dfa_match()
+       does not support substring  capturing.  The  value  of  capture_top  is
+       always 1 and the value of capture_last is always 0 for DFA matching.
 
        The subject and subject_length fields contain copies of the values that
        were passed to the matching function.
 
-       The  start_match  field normally contains the offset within the subject
-       at which the current match attempt  started.  However,  if  the  escape
-       sequence  \K has been encountered, this value is changed to reflect the
-       modified starting point. If the pattern is not  anchored,  the  callout
+       The start_match field normally contains the offset within  the  subject
+       at  which  the  current  match  attempt started. However, if the escape
+       sequence \K has been encountered, this value is changed to reflect  the
+       modified  starting  point.  If the pattern is not anchored, the callout
        function may be called several times from the same point in the pattern
        for different starting points in the subject.
 
-       The current_position field contains the offset within  the  subject  of
+       The  current_position  field  contains the offset within the subject of
        the current match pointer.
 
-       When the pcre2_match() is used, the capture_top field contains one more
-       than the number of the highest numbered captured substring so  far.  If
-       no substrings have been captured, the value of capture_top is one. This
-       is always the case when the DFA functions are used, because they do not
-       support captured substrings.
-
-       The  capture_last  field  contains the number of the most recently cap-
-       tured substring. However, when a recursion exits, the value reverts  to
-       what  it  was  outside  the recursion, as do the values of all captured
-       substrings. If no substrings have been  captured,  the  value  of  cap-
-       ture_last is 0. This is always the case for the DFA matching functions.
-
        The pattern_position field contains the offset in the pattern string to
        the next item to be matched.
 
-       The next_item_length field contains the length of the next item  to  be
-       matched in the pattern string. When the callout immediately precedes an
-       alternation bar, a closing parenthesis, or the end of the pattern,  the
-       length  is  zero. When the callout precedes an opening parenthesis, the
-       length is that of the entire subpattern.
-
-       The pattern_position and next_item_length fields are intended  to  help
-       in  distinguishing between different automatic callouts, which all have
-       the same callout number. However, they are set for  all  callouts,  and
+       The  next_item_length  field contains the length of the next item to be
+       processed in the pattern string. When the callout is at the end of  the
+       pattern,  the  length  is  zero.  When  the callout precedes an opening
+       parenthesis, the length includes meta characters that follow the paren-
+       thesis.  For  example,  in a callout before an assertion such as (?=ab)
+       the length is 3. For an alternation bar  or  a  closing  parenthesis,
+       the  length is one, unless a closing parenthesis is followed by a quan-
+       tifier, in which case its length is included.  (This changed in release
+       10.23.  In  earlier  releases, before an opening parenthesis the length
+       was that of the entire subpattern, and before an alternation bar  or  a
+       closing parenthesis the length was zero.)
+
+       The  pattern_position  and next_item_length fields are intended to help
+       in distinguishing between different automatic callouts, which all  have
+       the  same  callout  number. However, they are set for all callouts, and
        are used by pcre2test to show the next item to be matched when display-
        ing callout information.
 
        In callouts from pcre2_match() the mark field contains a pointer to the
-       zero-terminated  name of the most recently passed (*MARK), (*PRUNE), or
-       (*THEN) item in the match, or NULL if no such items have  been  passed.
-       Instances  of  (*PRUNE)  or  (*THEN) without a name do not obliterate a
+       zero-terminated name of the most recently passed (*MARK), (*PRUNE),  or
+       (*THEN)  item  in the match, or NULL if no such items have been passed.
+       Instances of (*PRUNE) or (*THEN) without a name  do  not  obliterate  a
        previous (*MARK). In callouts from the DFA matching function this field
        always contains NULL.
 
+       The   callout_flags   field   is   always   zero   in   callouts   from
+       pcre2_dfa_match() or when JIT is being used. When pcre2_match() without
+       JIT is used, the following bits may be set:
+
+         PCRE2_CALLOUT_STARTMATCH
+
+       This is set for the first callout after the start of matching for  each
+       new starting position in the subject.
+
+         PCRE2_CALLOUT_BACKTRACK
+
+       This  is  set if there has been a matching backtrack since the previous
+       callout, or since the start of matching if this is  the  first  callout
+       from a pcre2_match() run.
+
+       Both  bits  are  set when a backtrack has caused a "bumpalong" to a new
+       starting position in the subject. Output from pcre2test does not  indi-
+       cate  the  presence  of these bits unless the callout_extra modifier is
+       set.
+
+       The information in the callout_flags field is provided so that applica-
+       tions  can track and tell their users how matching with backtracking is
+       done. This can be useful when trying to optimize patterns, or  just  to
+       understand  how  PCRE2  works. There is no support in pcre2_dfa_match()
+       because there is no backtracking in  DFA  matching,  and  there  is  no
+       support in JIT because JIT is all about maximizing matching performance.
+       In both these cases the callout_flags field is always zero.
+
 
 RETURN VALUES FROM CALLOUTS
 
        The external callout function returns an integer to PCRE2. If the value
-       is zero, matching proceeds as normal. If  the  value  is  greater  than
-       zero,  matching  fails  at  the current point, but the testing of other
+       is  zero,  matching  proceeds  as  normal. If the value is greater than
+       zero, matching fails at the current point, but  the  testing  of  other
        matching possibilities goes ahead, just as if a lookahead assertion had
        failed. If the value is less than zero, the match is abandoned, and the
        matching function returns the negative value.
 
-       Negative  values  should  normally  be   chosen   from   the   set   of
-       PCRE2_ERROR_xxx  values.  In  particular,  PCRE2_ERROR_NOMATCH forces a
-       standard "no match" failure. The error  number  PCRE2_ERROR_CALLOUT  is
-       reserved  for  use by callout functions; it will never be used by PCRE2
+       Negative   values   should   normally   be   chosen  from  the  set  of
+       PCRE2_ERROR_xxx values. In  particular,  PCRE2_ERROR_NOMATCH  forces  a
+       standard  "no  match"  failure. The error number PCRE2_ERROR_CALLOUT is
+       reserved for use by callout functions; it will never be used  by  PCRE2
        itself.
 
 
@@ -3963,14 +4399,14 @@ CALLOUT ENUMERATION
          void *user_data);
 
        A script language that supports the use of string arguments in callouts
-       might  like  to  scan  all the callouts in a pattern before running the
+       might like to scan all the callouts in a  pattern  before  running  the
        match. This can be done by calling pcre2_callout_enumerate(). The first
-       argument  is  a  pointer  to a compiled pattern, the second points to a
-       callback function, and the third is arbitrary user data.  The  callback
-       function  is  called  for  every callout in the pattern in the order in
+       argument is a pointer to a compiled pattern, the  second  points  to  a
+       callback  function,  and the third is arbitrary user data. The callback
+       function is called for every callout in the pattern  in  the  order  in
        which they appear. Its first argument is a pointer to a callout enumer-
-       ation  block,  and  its second argument is the user_data value that was
-       passed to pcre2_callout_enumerate(). The data block contains  the  fol-
+       ation block, and its second argument is the user_data  value  that  was
+       passed  to  pcre2_callout_enumerate(). The data block contains the fol-
        lowing fields:
 
          version                Block version number
@@ -3981,17 +4417,17 @@ CALLOUT ENUMERATION
          callout_string_length  Length of callout string
          callout_string         Points to callout string or is NULL
 
-       The  version  number is currently 0. It will increase if new fields are
-       ever added to the block. The remaining fields are  the  same  as  their
-       namesakes  in  the pcre2_callout block that is used for callouts during
+       The version number is currently 0. It will increase if new  fields  are
+       ever  added  to  the  block. The remaining fields are the same as their
+       namesakes in the pcre2_callout block that is used for  callouts  during
        matching, as described above.
 
-       Note that the value of pattern_position is  unique  for  each  callout.
-       However,  if  a callout occurs inside a group that is quantified with a
+       Note  that  the  value  of pattern_position is unique for each callout.
+       However, if a callout occurs inside a group that is quantified  with  a
        non-zero minimum or a fixed maximum, the group is replicated inside the
-       compiled  pattern.  For example, a pattern such as /(a){2}/ is compiled
-       as if it were /(a)(a)/. This means that the callout will be  enumerated
-       more  than  once,  but with the same value for pattern_position in each
+       compiled pattern. For example, a pattern such as /(a){2}/  is  compiled
+       as  if it were /(a)(a)/. This means that the callout will be enumerated
+       more than once, but with the same value for  pattern_position  in  each
        case.
 
        The callback function should normally return zero. If it returns a non-
@@ -4008,8 +4444,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 23 March 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 22 December 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -4024,45 +4460,46 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
 
        This document describes the differences in the ways that PCRE2 and Perl
        handle regular expressions. The differences  described  here  are  with
-       respect to Perl versions 5.10 and above.
+       respect to Perl version 5.26, but as both Perl and PCRE2 are continually
+       changing, the information may sometimes be out of date.
 
-       1.  PCRE2  has only a subset of Perl's Unicode support. Details of what
+       1. PCRE2 has only a subset of Perl's Unicode support. Details  of  what
        it does have are given in the pcre2unicode page.
 
-       2. PCRE2 allows repeat quantifiers only  on  parenthesized  assertions,
-       but  they  do not mean what you might think. For example, (?!a){3} does
-       not assert that the next three characters are not "a". It just  asserts
-       that  the  next  character  is not "a" three times (in principle: PCRE2
-       optimizes this to run the assertion  just  once).  Perl  allows  repeat
-       quantifiers  on  other  assertions such as \b, but these do not seem to
-       have any use.
-
-       3. Capturing subpatterns that occur inside  negative  lookahead  asser-
-       tions  are  counted,  but their entries in the offsets vector are never
-       set. Perl sometimes (but not always) sets its numerical variables  from
-       inside negative assertions.
-
-       4.  The  following Perl escape sequences are not supported: \l, \u, \L,
-       \U, and \N when followed by a character name or Unicode value.  (\N  on
+       2.  Like  Perl, PCRE2 allows repeat quantifiers on parenthesized asser-
+       tions, but they do not mean what you might think. For example, (?!a){3}
+       does  not  assert  that  the next three characters are not "a". It just
+       asserts that the next character is not "a" three times  (in  principle:
+       PCRE2  optimizes this to run the assertion just once). Perl allows some
+       repeat quantifiers on other  assertions,  for  example,  \b*  (but  not
+       \b{3}), but these do not seem to have any use.
+
+       3.  Capturing  subpatterns that occur inside negative lookaround asser-
+       tions are counted, but their entries in the offsets vector are set only
+       when  a  negative  assertion  is a condition that has a matching branch
+       (that is, the condition is false).
+
+       4. The following Perl escape sequences are not supported: \l,  \u,  \L,
+       \U,  and  \N when followed by a character name or Unicode value. (\N on
        its own, matching a non-newline character, is supported.) In fact these
-       are implemented by Perl's general string-handling and are not  part  of
-       its  pattern matching engine. If any of these are encountered by PCRE2,
+       are  implemented  by Perl's general string-handling and are not part of
+       its pattern matching engine. If any of these are encountered by  PCRE2,
        an error is generated by default. However, if the PCRE2_ALT_BSUX option
        is set, \U and \u are interpreted as ECMAScript interprets them.
 
        5. The Perl escape sequences \p, \P, and \X are supported only if PCRE2
-       is built with Unicode support. The properties that can be  tested  with
-       \p and \P are limited to the general category properties such as Lu and
-       Nd, script names such as Greek or Han, and the derived  properties  Any
-       and L&. PCRE2 does support the Cs (surrogate) property, which Perl does
-       not; the Perl documentation says "Because Perl hides the need  for  the
-       user  to  understand the internal representation of Unicode characters,
-       there is no need to implement the  somewhat  messy  concept  of  surro-
-       gates."
-
-       6.  PCRE2 does support the \Q...\E escape for quoting substrings. Char-
-       acters in between are treated as literals. This is  slightly  different
-       from  Perl  in  that  $  and  @ are also handled as literals inside the
+       is built with Unicode support (the default). The properties that can be
+       tested with \p and \P are limited to the  general  category  properties
+       such  as  Lu and Nd, script names such as Greek or Han, and the derived
+       properties Any and L&.  PCRE2 does support the Cs (surrogate) property,
+       which  Perl  does  not; the Perl documentation says "Because Perl hides
+       the need for the user to understand the internal representation of Uni-
+       code  characters, there is no need to implement the somewhat messy con-
+       cept of surrogates."
+
+       6. PCRE2 does support the \Q...\E escape for quoting substrings.  Char-
+       acters  in  between are treated as literals. This is slightly different
+       from Perl in that $ and @ are  also  handled  as  literals  inside  the
        quotes. In Perl, they cause variable interpolation (but of course PCRE2
        does not have variables).  Note the following examples:
 
@@ -4073,22 +4510,17 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
            \Qabc\$xyz\E       abc\$xyz          abc\$xyz
            \Qabc\E\$\Qxyz\E   abc$xyz           abc$xyz
 
-       The  \Q...\E  sequence  is recognized both inside and outside character
+       The \Q...\E sequence is recognized both inside  and  outside  character
        classes.
 
-       7.  Fairly  obviously,  PCRE2  does  not  support  the  (?{code})   and
-       (??{code})  constructions. However, there is support for recursive pat-
-       terns. This is not available in Perl 5.8, but it is in Perl 5.10. Also,
-       the  PCRE2  "callout"  feature allows an external function to be called
-       during  pattern  matching.  See  the  pcre2callout  documentation   for
-       details.
+       7.   Fairly  obviously,  PCRE2  does  not  support  the  (?{code})  and
+       (??{code}) constructions. However, PCRE2 does  have  a  "callout"
+       feature,  which allows an external function to be called during pattern
+       matching. See the pcre2callout documentation for details.
 
-       8.  Subroutine  calls  (whether recursive or not) are treated as atomic
-       groups.  Atomic recursion is like Python,  but  unlike  Perl.  Captured
-       values  that  are  set outside a subroutine call can be referenced from
-       inside in PCRE2, but not in Perl. There is a discussion  that  explains
-       these  differences  in  more detail in the section on recursion differ-
-       ences from Perl in the pcre2pattern page.
+       8. Subroutine calls (whether recursive or not) were treated  as  atomic
+       groups  up to PCRE2 release 10.23, but from release 10.30 this changed,
+       and backtracking into subroutine calls is now supported, as in Perl.
 
        9. If any of the backtracking control verbs are used  in  a  subpattern
        that  is  called  as  a  subroutine (whether or not recursively), their
@@ -4103,7 +4535,7 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
        first one that is backtracked onto acts. For example,  in  the  pattern
        A(*COMMIT)B(*PRUNE)C  a  failure in B triggers (*COMMIT), but a failure
        in C triggers (*PRUNE). Perl's behaviour is more complex; in many cases
-       it is the same as PCRE2, but there are examples where it differs.
+       it is the same as PCRE2, but there are cases where it differs.
 
        11.  Most  backtracking  verbs in assertions have their normal actions.
        They are not confined to the assertion.
@@ -4117,18 +4549,18 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
        pattern names is not as general as Perl's. This is a consequence of the
        fact  the  PCRE2  works internally just with numbers, using an external
        table to translate between numbers and names. In particular, a  pattern
-       such  as  (?|(?<a>A)|(?<b)B),  where the two capturing parentheses have
+       such  as  (?|(?<a>A)|(?<b>B)), where the two capturing parentheses have
        the same number but different names, is not supported,  and  causes  an
        error  at compile time. If it were allowed, it would not be possible to
        distinguish which parentheses matched, because both names map  to  cap-
        turing subpattern number 1. To avoid this confusing situation, an error
        is given at compile time.
 
-       14. Perl recognizes comments in some places that PCRE2  does  not,  for
-       example,  between  the  ( and ? at the start of a subpattern. If the /x
-       modifier is set, Perl allows white space between ( and ?  (though  cur-
-       rent  Perls warn that this is deprecated) but PCRE2 never does, even if
-       the PCRE2_EXTENDED option is set.
+       14. Perl used to recognize comments in some places that PCRE2 does not,
+       for  example,  between the ( and ? at the start of a subpattern. If the
+       /x modifier is set, Perl allowed white space between ( and ? though the
+       latest  Perls give an error (for a while it was just deprecated). There
+       may still be some cases where Perl behaves differently.
 
        15. Perl, when in warning mode, gives warnings  for  character  classes
        such  as  [A-\d] or [a-[:digit:]]. It then treats the hyphens as liter-
@@ -4138,50 +4570,67 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
        16.  In  PCRE2, the upper/lower case character properties Lu and Ll are
        not affected when case-independent matching is specified. For  example,
        \p{Lu} always matches an upper case letter. I think Perl has changed in
-       this respect; in the release at the time of writing (5.16), \p{Lu}  and
+       this respect; in the release at the time of writing (5.24), \p{Lu}  and
        \p{Ll} match all letters, regardless of case, when case independence is
        specified.
 
        17. PCRE2 provides some  extensions  to  the  Perl  regular  expression
        facilities.   Perl  5.10  includes new features that are not in earlier
-       versions of Perl, some of which (such as named parentheses)  have  been
-       in PCRE2 for some time. This list is with respect to Perl 5.10:
+       versions of Perl, some of which (such as  named  parentheses)  were  in
+       PCRE2 for some time before. This list is with respect to Perl 5.26:
 
        (a)  Although  lookbehind  assertions  in PCRE2 must match fixed length
        strings, each alternative branch of a lookbehind assertion can match  a
        different  length  of  string.  Perl requires them all to have the same
        length.
 
-       (b) If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set,  the
+       (b) From PCRE2 10.23, back references to groups  of  fixed  length  are
+       supported in lookbehinds, provided that there is no possibility of ref-
+       erencing a non-unique number or name. Perl does not support  backrefer-
+       ences in lookbehinds.
+
+       (c)  If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set, the
        $ meta-character matches only at the very end of the string.
 
-       (c)  A  backslash  followed  by  a  letter  with  no special meaning is
+       (d) A backslash followed  by  a  letter  with  no  special  meaning  is
        faulted. (Perl can be made to issue a warning.)
 
-       (d) If PCRE2_UNGREEDY is set, the greediness of the repetition  quanti-
+       (e)  If PCRE2_UNGREEDY is set, the greediness of the repetition quanti-
        fiers is inverted, that is, by default they are not greedy, but if fol-
        lowed by a question mark they are.
 
-       (e) PCRE2_ANCHORED can be used at matching time to force a  pattern  to
+       (f)  PCRE2_ANCHORED  can be used at matching time to force a pattern to
        be tried only at the first matching position in the subject string.
 
-       (f)      The      PCRE2_NOTBOL,      PCRE2_NOTEOL,      PCRE2_NOTEMPTY,
-       PCRE2_NOTEMPTY_ATSTART, and PCRE2_NO_AUTO_CAPTURE options have no  Perl
-       equivalents.
+       (g)    The    PCRE2_NOTBOL,    PCRE2_NOTEOL,     PCRE2_NOTEMPTY     and
+       PCRE2_NOTEMPTY_ATSTART options have no Perl equivalents.
 
-       (g)  The  \R escape sequence can be restricted to match only CR, LF, or
+       (h)  The  \R escape sequence can be restricted to match only CR, LF, or
        CRLF by the PCRE2_BSR_ANYCRLF option.
 
-       (h) The callout facility is PCRE2-specific.
+       (i) The callout facility is PCRE2-specific.  Perl  supports  codeblocks
+       and variable interpolation, but not general hooks on every match.
 
-       (i) The partial matching facility is PCRE2-specific.
+       (j) The partial matching facility is PCRE2-specific.
 
-       (j) The alternative matching function (pcre2_dfa_match() matches  in  a
+       (k) The alternative matching function (pcre2_dfa_match()) matches in a
        different way and is not Perl-compatible.
 
-       (k)  PCRE2 recognizes some special sequences such as (*CR) at the start
-       of a pattern that set overall options that cannot be changed within the
-       pattern.
+       (l) PCRE2 recognizes some special sequences such as (*CR) or  (*NO_JIT)
+       at  the  start  of  a  pattern  that set overall options that cannot be
+       changed within the pattern.
+
+       18. The Perl /a modifier restricts \d numbers to pure  ASCII,  and  the
+       /aa  modifier  restricts  /i  case-insensitive  matching to pure ASCII,
+       ignoring Unicode rules. This  separation  cannot  be  represented  with
+       PCRE2_UCP.
+
+       19. Perl has different limits than PCRE2. See the pcre2limits
+       documentation for details. In release 5.10, Perl changed from recursion
+       to iteration, keeping the intermediate matches on the heap, which is
+       ~10% slower but does not run into any stack-overflow limit. PCRE2
+       made a similar change at release 10.30, and also has many build-time
+       and run-time customizable limits.
 
 
 AUTHOR
@@ -4193,8 +4642,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 15 March 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 18 April 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -4342,8 +4791,8 @@ RETURN VALUES FROM JIT MATCHING
        The error code PCRE2_ERROR_MATCHLIMIT is returned by the  JIT  code  if
        searching  a  very large pattern tree goes on for too long, as it is in
        the same circumstance when JIT is not used, but the details of  exactly
-       what  is counted are not the same. The PCRE2_ERROR_RECURSIONLIMIT error
-       code is never returned when JIT matching is used.
+       what is counted are not the same. The PCRE2_ERROR_DEPTHLIMIT error code
+       is never returned when JIT matching is used.
 
 
 CONTROLLING THE JIT STACK
@@ -4362,13 +4811,10 @@ CONTROLLING THE JIT STACK
        It returns a pointer to an opaque structure of type pcre2_jit_stack, or
        NULL if there is an error. The pcre2_jit_stack_free() function is  used
        to  free a stack that is no longer needed. (For the technically minded:
-       the address space is allocated by mmap or VirtualAlloc.)
+       the address space is allocated by  mmap  or  VirtualAlloc.)  A  maximum
+       stack size of 512K to 1M should be more than enough for any pattern.
 
-       JIT uses far less memory for recursion than the interpretive code,  and
-       a  maximum  stack size of 512K to 1M should be more than enough for any
-       pattern.
-
-       The pcre2_jit_stack_assign() function specifies which  stack  JIT  code
+       The  pcre2_jit_stack_assign()  function  specifies which stack JIT code
        should use. Its arguments are as follows:
 
          pcre2_match_context  *mcontext
@@ -4377,7 +4823,7 @@ CONTROLLING THE JIT STACK
 
        The first argument is a pointer to a match context. When this is subse-
        quently passed to a matching function, its information determines which
-       JIT  stack  is  used. There are three cases for the values of the other
+       JIT stack is used. There are three cases for the values  of  the  other
        two options:
 
          (1) If callback is NULL and data is NULL, an internal 32K block
@@ -4395,34 +4841,34 @@ CONTROLLING THE JIT STACK
              return value must be a valid JIT stack, the result of calling
              pcre2_jit_stack_create().
 
-       A callback function is obeyed whenever JIT code is about to be run;  it
+       A  callback function is obeyed whenever JIT code is about to be run; it
        is not obeyed when pcre2_match() is called with options that are incom-
-       patible for JIT matching. A callback function can therefore be used  to
-       determine  whether  a  match  operation  was  executed by JIT or by the
+       patible  for JIT matching. A callback function can therefore be used to
+       determine whether a match operation was  executed  by  JIT  or  by  the
        interpreter.
 
        You may safely use the same JIT stack for more than one pattern (either
-       by  assigning  directly  or  by  callback), as long as the patterns are
+       by assigning directly or by callback), as  long  as  the  patterns  are
        matched sequentially in the same thread. Currently, the only way to set
-       up  non-sequential matches in one thread is to use callouts: if a call-
-       out function starts another match, that match must use a different  JIT
+       up non-sequential matches in one thread is to use callouts: if a  call-
+       out  function starts another match, that match must use a different JIT
        stack to the one used for currently suspended match(es).
 
-       In  a multithread application, if you do not specify a JIT stack, or if
-       you assign or pass back NULL from  a  callback,  that  is  thread-safe,
-       because  each  thread has its own machine stack. However, if you assign
-       or pass back a non-NULL JIT stack, this must be a different  stack  for
+       In a multithread application, if you do not specify a JIT stack, or  if
+       you  assign  or  pass  back  NULL from a callback, that is thread-safe,
+       because each thread has its own machine stack. However, if  you  assign
+       or  pass  back a non-NULL JIT stack, this must be a different stack for
        each thread so that the application is thread-safe.
 
-       Strictly  speaking,  even more is allowed. You can assign the same non-
-       NULL stack to a match context that is used by any number  of  patterns,
-       as  long  as  they are not used for matching by multiple threads at the
-       same time. For example, you could use the same stack  in  all  compiled
-       patterns,  with  a global mutex in the callback to wait until the stack
+       Strictly speaking, even more is allowed. You can assign the  same  non-
+       NULL  stack  to a match context that is used by any number of patterns,
+       as long as they are not used for matching by multiple  threads  at  the
+       same  time.  For  example, you could use the same stack in all compiled
+       patterns, with a global mutex in the callback to wait until  the  stack
        is available for use. However, this is an inefficient solution, and not
        recommended.
 
-       This  is a suggestion for how a multithreaded program that needs to set
+       This is a suggestion for how a multithreaded program that needs to  set
        up non-default JIT stacks might operate:
 
          During thread initalization
@@ -4434,7 +4880,7 @@ CONTROLLING THE JIT STACK
          Use a one-line callback function
            return thread_local_var
 
-       All the functions described in this section do nothing if  JIT  is  not
+       All  the  functions  described in this section do nothing if JIT is not
        available.
 
 
@@ -4443,20 +4889,20 @@ JIT STACK FAQ
        (1) Why do we need JIT stacks?
 
        PCRE2 (and JIT) is a recursive, depth-first engine, so it needs a stack
-       where the local data of the current node is pushed before checking  its
+       where  the local data of the current node is pushed before checking its
        child nodes.  Allocating real machine stack on some platforms is diffi-
        cult. For example, the stack chain needs to be updated every time if we
-       extend  the  stack  on  PowerPC.  Although it is possible, its updating
+       extend the stack on PowerPC.  Although it  is  possible,  its  updating
        time overhead decreases performance. So we do the recursion in memory.
 
        (2) Why don't we simply allocate blocks of memory with malloc()?
 
-       Modern operating systems have a  nice  feature:  they  can  reserve  an
+       Modern  operating  systems  have  a  nice  feature: they can reserve an
        address space instead of allocating memory. We can safely allocate mem-
-       ory pages inside this address space, so the stack  could  grow  without
+       ory  pages  inside  this address space, so the stack could grow without
        moving memory data (this is important because of pointers). Thus we can
-       allocate 1M address space, and use only a single memory  page  (usually
-       4K)  if  that is enough. However, we can still grow up to 1M anytime if
+       allocate  1M  address space, and use only a single memory page (usually
+       4K) if that is enough. However, we can still grow up to 1M  anytime  if
        needed.
 
        (3) Who "owns" a JIT stack?
@@ -4464,8 +4910,8 @@ JIT STACK FAQ
        The owner of the stack is the user program, not the JIT studied pattern
        or anything else. The user program must ensure that if a stack is being
        used by pcre2_match(), (that is, it is assigned to a match context that
-       is  passed  to  the  pattern currently running), that stack must not be
-       used by any other threads (to avoid overwriting the same memory  area).
+       is passed to the pattern currently running), that  stack  must  not  be
+       used  by any other threads (to avoid overwriting the same memory area).
        The best practice for multithreaded programs is to allocate a stack for
        each thread, and return this stack through the JIT callback function.
 
@@ -4473,36 +4919,36 @@ JIT STACK FAQ
 
        You can free a JIT stack at any time, as long as it will not be used by
        pcre2_match() again. When you assign the stack to a match context, only
-       a pointer is set. There is no reference counting or  any  other  magic.
+       a  pointer  is  set. There is no reference counting or any other magic.
        You can free compiled patterns, contexts, and stacks in any order, any-
-       time. Just do not call pcre2_match() with a match context  pointing  to
+       time.  Just  do not call pcre2_match() with a match context pointing to
        an already freed stack, as that will cause SEGFAULT. (Also, do not free
-       a stack currently used by pcre2_match() in  another  thread).  You  can
-       also  replace the stack in a context at any time when it is not in use.
+       a  stack  currently  used  by pcre2_match() in another thread). You can
+       also replace the stack in a context at any time when it is not in  use.
        You should free the previous stack before assigning a replacement.
 
-       (5) Should I allocate/free a  stack  every  time  before/after  calling
+       (5)  Should  I  allocate/free  a  stack every time before/after calling
        pcre2_match()?
 
-       No,  because  this  is  too  costly in terms of resources. However, you
-       could implement some clever idea which release the stack if it  is  not
-       used  in  let's  say  two minutes. The JIT callback can help to achieve
+       No, because this is too costly in  terms  of  resources.  However,  you
+       could implement some clever idea which releases the stack if it is not
+       used in let's say two minutes. The JIT callback  can  help  to  achieve
        this without keeping a list of patterns.
 
-       (6) OK, the stack is for long term memory allocation. But what  happens
-       if  a pattern causes stack overflow with a stack of 1M? Is that 1M kept
+       (6)  OK, the stack is for long term memory allocation. But what happens
+       if a pattern causes stack overflow with a stack of 1M? Is that 1M  kept
        until the stack is freed?
 
-       Especially on embedded sytems, it might be a good idea to release  mem-
-       ory  sometimes  without  freeing the stack. There is no API for this at
-       the moment.  Probably a function call which returns with the  currently
-       allocated  memory for any stack and another which allows releasing mem-
+       Especially on embedded systems, it might be a  good  idea  to  release
+       memory sometimes without freeing the stack. There is no API for this at
+       the  moment.  Probably a function call which returns with the currently
+       allocated memory for any stack and another which allows releasing  mem-
        ory (shrinking the stack) would be a good idea if someone needs this.
 
        (7) This is too much of a headache. Isn't there any better solution for
        JIT stack handling?
 
-       No,  thanks to Windows. If POSIX threads were used everywhere, we could
+       No, thanks to Windows. If POSIX threads were used everywhere, we  could
        throw out this complicated API.
 
 
@@ -4511,18 +4957,18 @@ FREEING JIT SPECULATIVE MEMORY
        void pcre2_jit_free_unused_memory(pcre2_general_context *gcontext);
 
        The JIT executable allocator does not free all memory when it is possi-
-       ble.   It expects new allocations, and keeps some free memory around to
-       improve allocation speed. However, in low memory conditions,  it  might
-       be  better to free all possible memory. You can cause this to happen by
-       calling pcre2_jit_free_unused_memory(). Its argument is a general  con-
+       ble.  It expects new allocations, and keeps some free memory around  to
+       improve  allocation  speed. However, in low memory conditions, it might
+       be better to free all possible memory. You can cause this to happen  by
+       calling  pcre2_jit_free_unused_memory(). Its argument is a general con-
        text, for custom memory management, or NULL for standard memory manage-
        ment.
 
 
 EXAMPLE CODE
 
-       This is a single-threaded example that specifies a  JIT  stack  without
-       using  a  callback.  A real program should include error checking after
+       This  is  a  single-threaded example that specifies a JIT stack without
+       using a callback. A real program should include  error  checking  after
        all the function calls.
 
          int rc;
@@ -4550,29 +4996,29 @@ EXAMPLE CODE
 JIT FAST PATH API
 
        Because the API described above falls back to interpreted matching when
-       JIT  is  not  available, it is convenient for programs that are written
+       JIT is not available, it is convenient for programs  that  are  written
        for  general  use  in  many  environments.  However,  calling  JIT  via
        pcre2_match() does have a performance impact. Programs that are written
-       for use where JIT is known to be available, and  which  need  the  best
-       possible  performance,  can  instead  use a "fast path" API to call JIT
-       matching directly instead of calling pcre2_match() (obviously only  for
+       for  use  where  JIT  is known to be available, and which need the best
+       possible performance, can instead use a "fast path"  API  to  call  JIT
+       matching  directly instead of calling pcre2_match() (obviously only for
        patterns that have been successfully processed by pcre2_jit_compile()).
 
-       The  fast  path  function  is  called  pcre2_jit_match(),  and it takes
+       The fast path  function  is  called  pcre2_jit_match(),  and  it  takes
        exactly the same arguments as pcre2_match(). The return values are also
        the same, plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or
-       complete) is requested that was not compiled. Unsupported  option  bits
-       (for  example,  PCRE2_ANCHORED)  are  ignored,  as  is the PCRE2_NO_JIT
+       complete)  is  requested that was not compiled. Unsupported option bits
+       (for example, PCRE2_ANCHORED)  are  ignored,  as  is  the  PCRE2_NO_JIT
        option.
 
-       When you call pcre2_match(), as well as testing for invalid options,  a
+       When  you call pcre2_match(), as well as testing for invalid options, a
        number of other sanity checks are performed on the arguments. For exam-
        ple, if the subject pointer is NULL, an immediate error is given. Also,
-       unless  PCRE2_NO_UTF_CHECK  is  set, a UTF subject string is tested for
-       validity. In the interests of speed, these checks do not happen on  the
+       unless PCRE2_NO_UTF_CHECK is set, a UTF subject string  is  tested  for
+       validity.  In the interests of speed, these checks do not happen on the
        JIT fast path, and if invalid data is passed, the result is undefined.
 
-       Bypassing  the  sanity  checks  and the pcre2_match() wrapping can give
+       Bypassing the sanity checks and the  pcre2_match()  wrapping  can  give
        speedups of more than 10%.
 
 
@@ -4590,8 +5036,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 05 June 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 31 March 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -4628,12 +5074,6 @@ SIZE AND OTHER LIMITATIONS
        (that is ~(PCRE2_SIZE)0) is reserved as a special indicator  for  zero-
        terminated strings and unset offsets.
 
-       Note  that  when  using  the  traditional matching function, PCRE2 uses
-       recursion to handle subpatterns and indefinite repetition.  This  means
-       that  the  available stack space may limit the size of a subject string
-       that can be processed by certain patterns. For a  discussion  of  stack
-       issues, see the pcre2stack documentation.
-
        All values in repeating quantifiers must be less than 65536.
 
        The maximum length of a lookbehind assertion is 65535 characters.
@@ -4642,21 +5082,20 @@ SIZE AND OTHER LIMITATIONS
        can be no more than 65535 capturing subpatterns. There is,  however,  a
        limit  to  the  depth  of  nesting  of parenthesized subpatterns of all
        kinds. This is imposed in order to limit the  amount  of  system  stack
-       used  at  compile time. The limit can be specified when PCRE2 is built;
-       the default is 250.
-
-       There is a limit to the number of forward references to subsequent sub-
-       patterns  of  around  200,000.  Repeated  forward references with fixed
-       upper limits, for example, (?2){0,100} when subpattern number 2  is  to
-       the  right,  are included in the count. There is no limit to the number
-       of backward references.
+       used at compile time. The default limit can be specified when PCRE2  is
+       built; if it is not, the default is 250. An application can change this
+       limit by calling pcre2_set_parens_nest_limit() to set the  limit  in  a
+       compile context.
 
        The maximum length of name for a named subpattern is 32 code units, and
        the maximum number of named subpatterns is 10000.
 
        The  maximum  length  of  a  name  in  a (*MARK), (*PRUNE), (*SKIP), or
-       (*THEN) verb is 255 for the 8-bit library and 65535 for the 16-bit  and
-       32-bit libraries.
+       (*THEN) verb is 255 code units for the 8-bit  library  and  65535  code
+       units for the 16-bit and 32-bit libraries.
+
+       The  maximum  length  of  a string argument to a callout is the largest
+       number a 32-bit unsigned integer can hold.
 
 
 AUTHOR
@@ -4668,8 +5107,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 05 November 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 30 March 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -5444,19 +5883,26 @@ SPECIAL START-OF-PATTERN ITEMS
        attempt by the application to apply the  JIT  optimization  by  calling
        pcre2_jit_compile() is ignored.
 
-   Setting match and recursion limits
+   Setting match resource limits
 
-       The  caller of pcre2_match() can set a limit on the number of times the
-       internal match() function is called and on the maximum depth of  recur-
-       sive calls. These facilities are provided to catch runaway matches that
-       are provoked by patterns with huge matching trees (a typical example is
-       a  pattern  with  nested unlimited repeats) and to avoid running out of
-       system stack by too  much  recursion.  When  one  of  these  limits  is
-       reached,  pcre2_match()  gives  an error return. The limits can also be
-       set by items at the start of the pattern of the form
+       The pcre2_match() function contains a counter that is incremented every
+       time it goes round its main loop. The caller of pcre2_match() can set a
+       limit  on  this counter, which therefore limits the amount of computing
+       resource used for a match. The maximum depth of nested backtracking can
+       also  be  limited;  this indirectly restricts the amount of heap memory
+       that is used, but there is also an explicit memory limit  that  can  be
+       set.
+
+       These  facilities  are  provided to catch runaway matches that are pro-
+       voked by patterns with huge matching trees (a typical example is a pat-
+       tern  with  nested unlimited repeats applied to a long string that does
+       not match). When one of these limits is reached, pcre2_match() gives an
+       error  return.  The limits can also be set by items at the start of the
+       pattern of the form
 
+         (*LIMIT_HEAP=d)
          (*LIMIT_MATCH=d)
-         (*LIMIT_RECURSION=d)
+         (*LIMIT_DEPTH=d)
 
        where d is any number of decimal digits. However, the value of the set-
        ting  must  be  less than the value set (or defaulted) by the caller of
@@ -5465,23 +5911,36 @@ SPECIAL START-OF-PATTERN ITEMS
        If there is more than one setting of one of  these  limits,  the  lower
        value is used.
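
The rule above (a pattern item can only lower a limit, and the lowest of several settings wins) can be sketched as a small helper. This is an illustrative model only, not PCRE2 code: the function name effective_limit and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: models how a limit chosen by the caller combines
   with (*LIMIT_...=d) items found at the start of a pattern. A pattern
   setting takes effect only if it is lower than the current limit, so
   the result is the minimum of all the values. */
static uint32_t effective_limit(uint32_t caller_limit,
                                const uint32_t *pattern_settings,
                                size_t count)
{
    uint32_t limit = caller_limit;
    for (size_t i = 0; i < count; i++) {
        if (pattern_settings[i] < limit)
            limit = pattern_settings[i];   /* a pattern may lower, never raise */
    }
    return limit;
}
```

With a caller limit of 10000000 and pattern settings 5000 and 2000, this model yields 2000; with a caller limit of 1000 the pattern settings have no effect.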
 
+       Prior  to  release  10.30, LIMIT_DEPTH was called LIMIT_RECURSION. This
+       name is still recognized for backwards compatibility.
+
+       The heap limit applies only when the pcre2_match() interpreter is  used
+       for matching. It does not apply to JIT or DFA matching. The match limit
+       is used (but in a different way)  when  JIT  is  being  used,  or  when
+       pcre2_dfa_match() is called, to limit computing resource usage by those
+       matching functions. The depth limit is ignored by JIT but  is  relevant
+       for  DFA  matching, which uses function recursion for recursions within
+       the pattern. In this case, the depth limit controls the amount of  sys-
+       tem stack that is used.
+
    Newline conventions
 
-       PCRE2 supports five different conventions for indicating line breaks in
+       PCRE2  supports six different conventions for indicating line breaks in
        strings: a single CR (carriage return) character, a  single  LF  (line-
        feed) character, the two-character sequence CRLF, any of the three pre-
-       ceding, or any Unicode newline sequence. The pcre2api page has  further
-       discussion  about newlines, and shows how to set the newline convention
-       when calling pcre2_compile().
+       ceding, any Unicode newline sequence,  or  the  NUL  character  (binary
+       zero).  The  pcre2api  page  has further discussion about newlines, and
+       shows how to set the newline convention when calling pcre2_compile().
 
        It is also possible to specify a newline convention by starting a  pat-
-       tern string with one of the following five sequences:
+       tern string with one of the following sequences:
 
          (*CR)        carriage return
          (*LF)        linefeed
          (*CRLF)      carriage return, followed by linefeed
          (*ANYCRLF)   any of the three above
          (*ANY)       all Unicode newline sequences
+         (*NUL)       the NUL character (binary zero)
 
        These override the default and the options given to the compiling func-
        tion. For example, on a Unix system where LF  is  the  default  newline
@@ -5498,9 +5957,9 @@ SPECIAL START-OF-PATTERN ITEMS
        acter  when  PCRE2_DOTALL is not set, and the behaviour of \N. However,
        it does not affect what the \R escape  sequence  matches.  By  default,
        this  is any Unicode newline sequence, for Perl compatibility. However,
-       this can be changed; see the description of \R in the section  entitled
-       "Newline  sequences" below. A change of \R setting can be combined with
-       a change of newline convention.
+       this can be changed; see the next section and the description of \R  in
+       the  section entitled "Newline sequences" below. A change of \R setting
+       can be combined with a change of newline convention.
 
    Specifying what \R matches
 
@@ -5514,7 +5973,7 @@ SPECIAL START-OF-PATTERN ITEMS
 EBCDIC CHARACTER CODES
 
        PCRE2 can be compiled to run in an environment that uses EBCDIC as  its
-       character code rather than ASCII or Unicode (typically a mainframe sys-
+       character  code instead of ASCII or Unicode (typically a mainframe sys-
        tem). In the sections below, character code values are  ASCII  or  Uni-
        code; in an EBCDIC environment these characters may have different code
        values, and there are no code points greater than 255.
@@ -5579,11 +6038,11 @@ BACKSLASH
        meaning that character may have. This use of  backslash  as  an  escape
        character applies both inside and outside character classes.
 
-       For  example,  if  you want to match a * character, you write \* in the
-       pattern.  This escaping action applies whether  or  not  the  following
+       For  example,  if you want to match a * character, you must write \* in
+       the pattern. This escaping action applies whether or not the  following
        character  would  otherwise be interpreted as a metacharacter, so it is
        always safe to precede a non-alphanumeric  with  backslash  to  specify
-       that  it stands for itself. In particular, if you want to match a back-
+       that it stands for itself.  In particular, if you want to match a back-
        slash, you write \\.
 
        In a UTF mode, only ASCII numbers and letters have any special  meaning
@@ -5614,15 +6073,16 @@ BACKSLASH
        is not followed by \E later in the pattern, the literal  interpretation
        continues  to  the  end  of  the pattern (that is, \E is assumed at the
        end). If the isolated \Q is inside a character class,  this  causes  an
-       error, because the character class is not terminated.
+       error,  because  the  character  class  is  not terminated by a closing
+       square bracket.
 
    Non-printing characters
 
        A second use of backslash provides a way of encoding non-printing char-
-       acters in patterns in a visible manner. There is no restriction on  the
-       appearance  of non-printing characters in a pattern, but when a pattern
+       acters  in patterns in a visible manner. There is no restriction on the
+       appearance of non-printing characters in a pattern, but when a  pattern
        is being prepared by text editing, it is often easier to use one of the
-       following  escape sequences than the binary character it represents. In
+       following escape sequences than the binary character it represents.  In
        an ASCII or Unicode environment, these escapes are as follows:
 
          \a        alarm, that is, the BEL character (hex 07)
@@ -5639,51 +6099,51 @@ BACKSLASH
          \x{hhh..} character with hex code hhh.. (default mode)
          \uhhhh    character with hex code hhhh (when PCRE2_ALT_BSUX is set)
 
-       The precise effect of \cx on ASCII characters is as follows: if x is  a
-       lower  case  letter,  it  is converted to upper case. Then bit 6 of the
+       The  precise effect of \cx on ASCII characters is as follows: if x is a
+       lower case letter, it is converted to upper case. Then  bit  6  of  the
        character (hex 40) is inverted. Thus \cA to \cZ become hex 01 to hex 1A
-       (A  is  41, Z is 5A), but \c{ becomes hex 3B ({ is 7B), and \c; becomes
-       hex 7B (; is 3B). If the code unit following \c has a value  less  than
-       32  or  greater  than  126, a compile-time error occurs. This locks out
-       non-printable ASCII characters in all modes.
+       (A is 41, Z is 5A), but \c{ becomes hex 3B ({ is 7B), and  \c;  becomes
+       hex  7B  (; is 3B). If the code unit following \c has a value less than
+       32 or greater than 126, a compile-time error occurs.
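
The \cx rule for ASCII can be written out as a few lines of C. This is a sketch of the arithmetic described above, not the library's implementation; the name ctrl_escape_value is hypothetical, and an ASCII execution character set is assumed.

```c
#include <assert.h>

/* Hypothetical helper: returns the code value that \cx produces in an
   ASCII environment, or -1 where the text says a compile-time error
   occurs (code unit less than 32 or greater than 126). */
static int ctrl_escape_value(int x)
{
    if (x < 32 || x > 126)
        return -1;
    if (x >= 'a' && x <= 'z')
        x -= 32;                 /* lower case letter -> upper case */
    return x ^ 0x40;             /* invert bit 6 (hex 40) */
}

/* Checks taken from the examples in the text. */
static void ctrl_escape_demo(void)
{
    assert(ctrl_escape_value('A') == 0x01);
    assert(ctrl_escape_value('Z') == 0x1A);
    assert(ctrl_escape_value('{') == 0x3B);
    assert(ctrl_escape_value(';') == 0x7B);
}
```

Because lower case is folded first, \ca and \cA give the same value under this model.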
 
        When PCRE2 is compiled in EBCDIC mode, \a, \e, \f, \n, \r, and \t  gen-
        erate the appropriate EBCDIC code values. The \c escape is processed as
        specified for Perl in the perlebcdic document. The only characters that
        are  allowed  after  \c are A-Z, a-z, or one of @, [, \, ], ^, _, or ?.
-       Any other character provokes a  compile-time  error.  The  sequence  \@
-       encodes  character  code 0; the letters (in either case) encode charac-
-       ters 1-26 (hex 01 to hex 1A); [, \, ], ^, and _ encode characters 27-31
-       (hex 1B to hex 1F), and \? becomes either 255 (hex FF) or 95 (hex 5F).
-
-       Thus,  apart  from  \?,  these escapes generate the same character code
-       values as they do in an ASCII environment, though the meanings  of  the
-       values  mostly  differ.  For example, \G always generates code value 7,
+       Any other character provokes a compile-time  error.  The  sequence  \c@
+       encodes  character code 0; after \c the letters (in either case) encode
+       characters 1-26 (hex 01 to hex 1A); [, \, ], ^, and _ encode characters
+       27-31  (hex  1B  to  hex 1F), and \c? becomes either 255 (hex FF) or 95
+       (hex 5F).
+
+       Thus, apart from \c?, these escapes generate the  same  character  code
+       values  as  they do in an ASCII environment, though the meanings of the
+       values mostly differ. For example, \cG always generates code  value  7,
        which is BEL in ASCII but DEL in EBCDIC.
 
-       The sequence \? generates DEL (127, hex 7F) in  an  ASCII  environment,
-       but  because  127  is  not a control character in EBCDIC, Perl makes it
-       generate the APC character. Unfortunately, there are  several  variants
-       of  EBCDIC.  In  most  of them the APC character has the value 255 (hex
-       FF), but in the one Perl calls POSIX-BC its value is 95  (hex  5F).  If
-       certain  other characters have POSIX-BC values, PCRE2 makes \? generate
+       The  sequence  \c? generates DEL (127, hex 7F) in an ASCII environment,
+       but because 127 is not a control character in  EBCDIC,  Perl  makes  it
+       generate  the  APC character. Unfortunately, there are several variants
+       of EBCDIC. In most of them the APC character has  the  value  255  (hex
+       FF),  but  in  the one Perl calls POSIX-BC its value is 95 (hex 5F). If
+       certain other characters have POSIX-BC values, PCRE2 makes \c? generate
        95; otherwise it generates 255.
 
-       After \0 up to two further octal digits are read. If  there  are  fewer
-       than  two  digits,  just  those  that  are  present  are used. Thus the
+       After  \0  up  to two further octal digits are read. If there are fewer
+       than two digits, just  those  that  are  present  are  used.  Thus  the
        sequence \0\x\015 specifies two binary zeros followed by a CR character
        (code value 13). Make sure you supply two digits after the initial zero
        if the pattern character that follows is itself an octal digit.
 
-       The escape \o must be followed by a sequence of octal digits,  enclosed
-       in  braces.  An  error occurs if this is not the case. This escape is a
-       recent addition to Perl; it provides way of specifying  character  code
-       points  as  octal  numbers  greater than 0777, and it also allows octal
+       The  escape \o must be followed by a sequence of octal digits, enclosed
+       in braces. An error occurs if this is not the case. This  escape  is  a
+       recent addition to Perl; it provides a way of specifying character code
+       points as octal numbers greater than 0777, and  it  also  allows  octal
        numbers and back references to be unambiguously specified.
 
        For greater clarity and unambiguity, it is best to avoid following \ by
        a digit greater than zero. Instead, use \o{} or \x{} to specify charac-
-       ter numbers, and \g{} to specify back references. The  following  para-
+       ter  numbers,  and \g{} to specify back references. The following para-
        graphs describe the old, ambiguous syntax.
 
        The handling of a backslash followed by a digit other than 0 is compli-
@@ -5691,16 +6151,16 @@ BACKSLASH
 
        Outside a character class, PCRE2 reads the digit and any following dig-
        its as a decimal number. If the number is less than 10, begins with the
-       digit 8 or 9, or if there are at least  that  many  previous  capturing
-       left  parentheses  in the expression, the entire sequence is taken as a
+       digit  8  or  9,  or if there are at least that many previous capturing
+       left parentheses in the expression, the entire sequence is taken  as  a
        back reference. A description of how this works is given later, follow-
-       ing  the  discussion  of  parenthesized  subpatterns.  Otherwise, up to
+       ing the discussion of  parenthesized  subpatterns.   Otherwise,  up  to
        three octal digits are read to form a character code.
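
The decision described in this paragraph can be modelled as follows. This is a simplified sketch, not the PCRE2 parser: the names esc_kind, esc_result and classify_digit_escape are hypothetical, the input is assumed to be the text after the backslash starting with a digit from 1 to 9, and range checks on the resulting character code are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { ESC_BACKREF, ESC_OCTAL } esc_kind;

typedef struct {
    esc_kind kind;
    unsigned value;      /* group number, or character code */
    size_t   consumed;   /* number of digits used */
} esc_result;

/* Hypothetical helper: s points just past the backslash, outside a
   character class; groups_so_far is the number of capturing left
   parentheses seen earlier in the pattern. */
static esc_result classify_digit_escape(const char *s, unsigned groups_so_far)
{
    esc_result r = { ESC_BACKREF, 0, 0 };
    unsigned number = 0;
    size_t n = 0;

    /* Read all following digits as a decimal number. Accumulation stops
       once the value is too large to be a plausible group number. */
    while (isdigit((unsigned char)s[n])) {
        if (number < 1000000u)
            number = number * 10u + (unsigned)(s[n] - '0');
        n++;
    }

    if (number < 10 || s[0] == '8' || s[0] == '9' || number <= groups_so_far) {
        r.value = number;
        r.consumed = n;
        return r;                       /* whole sequence is a back reference */
    }

    /* Otherwise read up to three octal digits as a character code. */
    unsigned code = 0;
    size_t k = 0;
    while (k < 3 && s[k] >= '0' && s[k] <= '7') {
        code = code * 8u + (unsigned)(s[k] - '0');
        k++;
    }
    r.kind = ESC_OCTAL;
    r.value = code;
    r.consumed = k;
    return r;
}
```

Under this model, \11 is a tab (code 9) when fewer than 11 groups precede it and a back reference otherwise, while \81 is always a back reference.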
 
-       Inside a character class, PCRE2 handles \8 and \9 as the literal  char-
-       acters  "8"  and "9", and otherwise reads up to three octal digits fol-
+       Inside  a character class, PCRE2 handles \8 and \9 as the literal char-
+       acters "8" and "9", and otherwise reads up to three octal  digits  fol-
        lowing the backslash, using them to generate a data character. Any sub-
-       sequent  digits  stand for themselves. For example, outside a character
+       sequent digits stand for themselves. For example, outside  a  character
        class:
 
          \040   is another way of writing an ASCII space
@@ -5717,69 +6177,68 @@ BACKSLASH
                    the value 255 (decimal)
          \81    is always a back reference
 
-       Note that octal values of 100 or greater that are specified using  this
-       syntax  must  not be introduced by a leading zero, because no more than
+       Note  that octal values of 100 or greater that are specified using this
+       syntax must not be introduced by a leading zero, because no  more  than
        three octal digits are ever read.
 
-       By default, after \x that is not followed by {, from zero to two  hexa-
-       decimal  digits  are  read (letters can be in upper or lower case). Any
+       By  default, after \x that is not followed by {, from zero to two hexa-
+       decimal digits are read (letters can be in upper or  lower  case).  Any
        number of hexadecimal digits may appear between \x{ and }. If a charac-
-       ter  other  than  a  hexadecimal digit appears between \x{ and }, or if
+       ter other than a hexadecimal digit appears between \x{  and  },  or  if
        there is no terminating }, an error occurs.
 
-       If the PCRE2_ALT_BSUX option is set, the interpretation  of  \x  is  as
+       If  the  PCRE2_ALT_BSUX  option  is set, the interpretation of \x is as
        just described only when it is followed by two hexadecimal digits. Oth-
-       erwise, it matches a literal "x" character. In this mode mode,  support
-       for  code points greater than 256 is provided by \u, which must be fol-
-       lowed by four hexadecimal digits; otherwise it matches  a  literal  "u"
-       character.
+       erwise,  it  matches a literal "x" character. In this mode, support for
+       code points greater than 255 is provided by \u, which must be  followed
+       by  four hexadecimal digits; otherwise it matches a literal "u" charac-
+       ter.
 
        Characters whose value is less than 256 can be defined by either of the
        two syntaxes for \x (or by \u in PCRE2_ALT_BSUX mode). There is no dif-
-       ference  in  the way they are handled. For example, \xdc is exactly the
+       ference in the way they are handled. For example, \xdc is  exactly  the
        same as \x{dc} (or \u00dc in PCRE2_ALT_BSUX mode).
 
    Constraints on character values
 
-       Characters that are specified using octal or  hexadecimal  numbers  are
+       Characters  that  are  specified using octal or hexadecimal numbers are
        limited to certain values, as follows:
 
-         8-bit non-UTF mode    less than 0x100
-         8-bit UTF-8 mode      less than 0x10ffff and a valid codepoint
-         16-bit non-UTF mode   less than 0x10000
-         16-bit UTF-16 mode    less than 0x10ffff and a valid codepoint
-         32-bit non-UTF mode   less than 0x100000000
-         32-bit UTF-32 mode    less than 0x10ffff and a valid codepoint
+         8-bit non-UTF mode    no greater than 0xff
+         16-bit non-UTF mode   no greater than 0xffff
+         32-bit non-UTF mode   no greater than 0xffffffff
+         All UTF modes         no greater than 0x10ffff and a valid codepoint
 
-       Invalid  Unicode  codepoints  are  the  range 0xd800 to 0xdfff (the so-
-       called "surrogate" codepoints), and 0xffef.
+       Invalid Unicode codepoints are all those in the range 0xd800 to  0xdfff
+       (the so-called "surrogate" codepoints). The check for these can be dis-
+       abled  by  the  caller  of  pcre2_compile()  by  setting   the   option
+       PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.
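
The table and the surrogate rule can be summarised in one predicate. This is a sketch under stated assumptions, not library code: the names code_unit_width and escape_value_ok are hypothetical, and the allow_surrogates flag merely stands in for the effect the text attributes to PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { MODE_8BIT, MODE_16BIT, MODE_32BIT } code_unit_width;

/* Hypothetical helper: is a character value given in octal or
   hexadecimal acceptable for the given library width and UTF setting? */
static bool escape_value_ok(uint64_t value, code_unit_width width,
                            bool utf, bool allow_surrogates)
{
    if (utf) {
        if (value > 0x10ffffu)
            return false;                         /* beyond the Unicode limit */
        if (!allow_surrogates && value >= 0xd800u && value <= 0xdfffu)
            return false;                         /* surrogate codepoint */
        return true;
    }
    switch (width) {
        case MODE_8BIT:  return value <= 0xffu;
        case MODE_16BIT: return value <= 0xffffu;
        case MODE_32BIT: return value <= 0xffffffffu;
    }
    return false;
}
```

For example, 0xd800 is rejected in a UTF mode unless the surrogate check is disabled, but it is an ordinary value in 16-bit non-UTF mode.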
 
    Escape sequences in character classes
 
        All the sequences that define a single character value can be used both
-       inside  and  outside character classes. In addition, inside a character
+       inside and outside character classes. In addition, inside  a  character
        class, \b is interpreted as the backspace character (hex 08).
 
-       \N is not allowed in a character class. \B, \R, and \X are not  special
-       inside  a  character  class.  Like other unrecognized alphabetic escape
-       sequences, they cause  an  error.  Outside  a  character  class,  these
+       \N  is not allowed in a character class. \B, \R, and \X are not special
+       inside a character class. Like  other  unrecognized  alphabetic  escape
+       sequences,  they  cause  an  error.  Outside  a  character class, these
        sequences have different meanings.
 
    Unsupported escape sequences
 
-       In  Perl, the sequences \l, \L, \u, and \U are recognized by its string
-       handler and used  to  modify  the  case  of  following  characters.  By
+       In Perl, the sequences \l, \L, \u, and \U are recognized by its  string
+       handler  and  used  to  modify  the  case  of  following characters. By
        default, PCRE2 does not support these escape sequences. However, if the
        PCRE2_ALT_BSUX option is set, \U matches a "U" character, and \u can be
-       used  to define a character by code point, as described in the previous
-       section.
+       used to define a character by code point, as described above.
 
    Absolute and relative back references
 
-       The sequence \g followed by an unsigned or a negative  number,  option-
-       ally  enclosed  in braces, is an absolute or relative back reference. A
-       named back reference can be coded as \g{name}. Back references are dis-
-       cussed later, following the discussion of parenthesized subpatterns.
+       The sequence \g followed by a signed  or  unsigned  number,  optionally
+       enclosed  in braces, is an absolute or relative back reference. A named
+       back reference can be coded as \g{name}. Back references are  discussed
+       later, following the discussion of parenthesized subpatterns.
 
    Absolute and relative subroutine calls
 
@@ -5941,59 +6400,64 @@ BACKSLASH
        tional escape sequences that match characters with specific  properties
        are  available.  In 8-bit non-UTF-8 mode, these sequences are of course
        limited to testing characters whose codepoints are less than  256,  but
-       they do work in this mode.  The extra escape sequences are:
+       they  do work in this mode.  In 32-bit non-UTF mode, codepoints greater
+       than 0x10ffff (the Unicode limit) may be  encountered.  These  are  all
+       treated  as being in the Common script and with an unassigned type. The
+       extra escape sequences are:
 
          \p{xx}   a character with the xx property
          \P{xx}   a character without the xx property
          \X       a Unicode extended grapheme cluster
 
-       The  property  names represented by xx above are limited to the Unicode
+       The property names represented by xx above are limited to  the  Unicode
        script names, the general category properties, "Any", which matches any
        character  (including  newline),  and  some  special  PCRE2  properties
-       (described in the next section).  Other Perl properties such as  "InMu-
-       sicalSymbols"  are  not supported by PCRE2.  Note that \P{Any} does not
+       (described  in the next section).  Other Perl properties such as "InMu-
+       sicalSymbols" are not supported by PCRE2.  Note that \P{Any}  does  not
        match any characters, so always causes a match failure.
 
        Sets of Unicode characters are defined as belonging to certain scripts.
-       A  character from one of these sets can be matched using a script name.
+       A character from one of these sets can be matched using a script  name.
        For example:
 
          \p{Greek}
          \P{Han}
 
-       Those that are not part of an identified script are lumped together  as
+       Those  that are not part of an identified script are lumped together as
        "Common". The current list of scripts is:
 
-       Ahom,   Anatolian_Hieroglyphs,  Arabic,  Armenian,  Avestan,  Balinese,
-       Bamum, Bassa_Vah, Batak, Bengali, Bopomofo, Brahmi, Braille,  Buginese,
-       Buhid,  Canadian_Aboriginal,  Carian, Caucasian_Albanian, Chakma, Cham,
-       Cherokee,  Common,  Coptic,  Cuneiform,  Cypriot,  Cyrillic,   Deseret,
-       Devanagari,  Duployan,  Egyptian_Hieroglyphs,  Elbasan, Ethiopic, Geor-
-       gian, Glagolitic, Gothic,  Grantha,  Greek,  Gujarati,  Gurmukhi,  Han,
-       Hangul, Hanunoo, Hatran, Hebrew, Hiragana, Imperial_Aramaic, Inherited,
-       Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese,  Kaithi,  Kan-
-       nada,  Katakana,  Kayah_Li,  Kharoshthi, Khmer, Khojki, Khudawadi, Lao,
-       Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian,  Lydian,  Maha-
-       jani,  Malayalam,  Mandaic,  Manichaean,  Meetei_Mayek,  Mende_Kikakui,
-       Meroitic_Cursive, Meroitic_Hieroglyphs,  Miao,  Modi,  Mongolian,  Mro,
-       Multani,   Myanmar,   Nabataean,  New_Tai_Lue,  Nko,  Ogham,  Ol_Chiki,
-       Old_Hungarian, Old_Italic, Old_North_Arabian, Old_Permic,  Old_Persian,
-       Old_South_Arabian, Old_Turkic, Oriya, Osmanya, Pahawh_Hmong, Palmyrene,
-       Pau_Cin_Hau,  Phags_Pa,  Phoenician,  Psalter_Pahlavi,  Rejang,  Runic,
-       Samaritan, Saurashtra, Sharada, Shavian, Siddham, SignWriting, Sinhala,
-       Sora_Sompeng,  Sundanese,  Syloti_Nagri,  Syriac,  Tagalog,   Tagbanwa,
-       Tai_Le,   Tai_Tham,  Tai_Viet,  Takri,  Tamil,  Telugu,  Thaana,  Thai,
-       Tibetan, Tifinagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi.
+       Adlam, Ahom, Anatolian_Hieroglyphs, Arabic,  Armenian,  Avestan,  Bali-
+       nese,  Bamum,  Bassa_Vah,  Batak, Bengali, Bhaiksuki, Bopomofo, Brahmi,
+       Braille, Buginese, Buhid, Canadian_Aboriginal, Carian,  Caucasian_Alba-
+       nian,  Chakma,  Cham,  Cherokee,  Common,  Coptic,  Cuneiform, Cypriot,
+       Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hieroglyphs, Elbasan,
+       Ethiopic,  Georgian, Glagolitic, Gothic, Grantha, Greek, Gujarati, Gur-
+       mukhi, Han, Hangul, Hanunoo, Hatran,  Hebrew,  Hiragana,  Imperial_Ara-
+       maic,    Inherited,    Inscriptional_Pahlavi,   Inscriptional_Parthian,
+       Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer,  Kho-
+       jki,  Khudawadi,  Lao,  Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu,
+       Lycian, Lydian,  Mahajani,  Malayalam,  Mandaic,  Manichaean,  Marchen,
+       Masaram_Gondi,     Meetei_Mayek,    Mende_Kikakui,    Meroitic_Cursive,
+       Meroitic_Hieroglyphs, Miao, Modi,  Mongolian,  Mro,  Multani,  Myanmar,
+       Nabataean,  New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki, Old_Hungar-
+       ian,   Old_Italic,    Old_North_Arabian,    Old_Permic,    Old_Persian,
+       Old_South_Arabian,  Old_Turkic,  Oriya,  Osage,  Osmanya, Pahawh_Hmong,
+       Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi,  Rejang,
+       Runic,  Samaritan,  Saurashtra, Sharada, Shavian, Siddham, SignWriting,
+       Sinhala, Sora_Sompeng, Soyombo, Sundanese, Syloti_Nagri, Syriac,  Taga-
+       log,  Tagbanwa,  Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Tangut, Tel-
+       ugu,  Thaana,  Thai,  Tibetan,  Tifinagh,   Tirhuta,   Ugaritic,   Vai,
+       Warang_Citi, Yi, Zanabazar_Square.
 
        Each character has exactly one Unicode general category property, spec-
-       ified  by a two-letter abbreviation. For compatibility with Perl, nega-
-       tion can be specified by including a  circumflex  between  the  opening
-       brace  and  the  property  name.  For  example,  \p{^Lu} is the same as
+       ified by a two-letter abbreviation. For compatibility with Perl,  nega-
+       tion  can  be  specified  by including a circumflex between the opening
+       brace and the property name.  For  example,  \p{^Lu}  is  the  same  as
        \P{Lu}.
 
        If only one letter is specified with \p or \P, it includes all the gen-
-       eral  category properties that start with that letter. In this case, in
-       the absence of negation, the curly brackets in the escape sequence  are
+       eral category properties that start with that letter. In this case,  in
+       the  absence of negation, the curly brackets in the escape sequence are
        optional; these two examples have the same effect:
 
          \p{L}
@@ -6045,45 +6509,48 @@ BACKSLASH
          Zp    Paragraph separator
          Zs    Space separator
 
-       The  special property L& is also supported: it matches a character that
-       has the Lu, Ll, or Lt property, in other words, a letter  that  is  not
+       The special property L& is also supported: it matches a character  that
+       has  the  Lu,  Ll, or Lt property, in other words, a letter that is not
        classified as a modifier or "other".
 
-       The  Cs  (Surrogate)  property  applies only to characters in the range
-       U+D800 to U+DFFF. Such characters are not valid in Unicode strings  and
-       so  cannot  be  tested  by PCRE2, unless UTF validity checking has been
-       turned off (see the discussion of PCRE2_NO_UTF_CHECK  in  the  pcre2api
+       The Cs (Surrogate) property applies only to  characters  in  the  range
+       U+D800  to U+DFFF. Such characters are not valid in Unicode strings and
+       so cannot be tested by PCRE2, unless UTF  validity  checking  has  been
+       turned  off  (see  the discussion of PCRE2_NO_UTF_CHECK in the pcre2api
        page). Perl does not support the Cs property.
 
-       The  long  synonyms  for  property  names  that  Perl supports (such as
-       \p{Letter}) are not supported by PCRE2, nor is it permitted  to  prefix
+       The long synonyms for  property  names  that  Perl  supports  (such  as
+       \p{Letter})  are  not supported by PCRE2, nor is it permitted to prefix
        any of these properties with "Is".
 
        No character that is in the Unicode table has the Cn (unassigned) prop-
        erty.  Instead, this property is assumed for any code point that is not
        in the Unicode table.
 
-       Specifying  caseless  matching  does not affect these escape sequences.
-       For example, \p{Lu} always matches only upper  case  letters.  This  is
+       Specifying caseless matching does not affect  these  escape  sequences.
+       For  example,  \p{Lu}  always  matches only upper case letters. This is
        different from the behaviour of current versions of Perl.
 
-       Matching  characters by Unicode property is not fast, because PCRE2 has
-       to do a multistage table lookup in order to find  a  character's  prop-
+       Matching characters by Unicode property is not fast, because PCRE2  has
+       to  do  a  multistage table lookup in order to find a character's prop-
        erty. That is why the traditional escape sequences such as \d and \w do
-       not use Unicode properties in PCRE2 by default,  though  you  can  make
-       them  do  so by setting the PCRE2_UCP option or by starting the pattern
+       not  use  Unicode  properties  in PCRE2 by default, though you can make
+       them do so by setting the PCRE2_UCP option or by starting  the  pattern
        with (*UCP).
 
    Extended grapheme clusters
 
-       The \X escape matches any number of Unicode  characters  that  form  an
+       The  \X  escape  matches  any number of Unicode characters that form an
        "extended grapheme cluster", and treats the sequence as an atomic group
-       (see below).  Unicode supports various kinds of composite character  by
-       giving  each  character  a grapheme breaking property, and having rules
+       (see  below).  Unicode supports various kinds of composite character by
+       giving each character a grapheme breaking property,  and  having  rules
        that use these properties to define the boundaries of extended grapheme
-       clusters.  \X  always  matches  at least one character. Then it decides
-       whether to add additional characters according to the  following  rules
-       for ending a cluster:
+       clusters. The rules are defined in Unicode Standard Annex 29,  "Unicode
+       Text Segmentation".
+
+       \X  always  matches  at least one character. Then it decides whether to
+       add additional characters according to the following rules for ending a
+       cluster:
 
        1. End at the end of the subject string.
 
@@ -6096,20 +6563,31 @@ BACKSLASH
        be followed by a V or T character; an LVT or T character may be followed
        only by a T character.
 
-       4. Do not end before extending characters or spacing marks.  Characters
-       with  the  "mark"  property  always have the "extend" grapheme breaking
-       property.
+       4. Do not end before extending  characters  or  spacing  marks  or  the
+       "zero-width  joiner"  characters.  Characters  with the "mark" property
+       always have the "extend" grapheme breaking property.
 
        5. Do not end after prepend characters.
 
+       6. Do not break within emoji modifier sequences (a base character  fol-
+       lowed by a modifier). Extending characters are allowed before the modi-
+       fier.
+
+       7. Do not break within emoji zwj sequences (zero-width joiner followed
+       by "glue after ZWJ" or "base glue after ZWJ").
+
+       8.  Do  not  break  within  emoji flag sequences. That is, do not break
+       between regional indicator (RI) characters if there are an  odd  number
+       of RI characters before the break point.
+
        9. Otherwise, end the cluster.
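
        The regional indicator rule can be sketched in isolation. This is  an
        illustration only, not PCRE2's implementation: it covers just the flag
        sequence rule, and  the  function  names  are  hypothetical.  The  one
        assumption taken from outside this page is the  regional  indicator
        range U+1F1E6 to U+1F1FF.

```python
# Sketch of the emoji flag sequence rule only; not PCRE2 code.
RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF  # Unicode regional indicator symbols


def is_regional_indicator(ch: str) -> bool:
    """True if ch is a regional indicator (RI) character."""
    return RI_FIRST <= ord(ch) <= RI_LAST


def may_break_between_ri(text: str, pos: int) -> bool:
    """Return False when the flag rule forbids a break before text[pos].

    The rule applies only when both neighbours of the break point are RI
    characters. In that case a break is forbidden if an odd number of RI
    characters immediately precedes the break point.
    """
    if pos <= 0 or pos >= len(text):
        return True  # not between two characters, so this rule is silent
    if not (is_regional_indicator(text[pos - 1])
            and is_regional_indicator(text[pos])):
        return True  # rule does not apply
    count = 0
    i = pos - 1
    while i >= 0 and is_regional_indicator(text[i]):
        count += 1
        i -= 1
    return count % 2 == 0
```

        With four consecutive RI characters (two flags), the only  permitted
        break point inside the run is after the second character, so \X would
        be expected to match each pair as one cluster.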
 
    PCRE2's additional properties
 
-       As well as the standard Unicode properties described above, PCRE2  sup-
-       ports  four  more  that  make it possible to convert traditional escape
+       As  well as the standard Unicode properties described above, PCRE2 sup-
+       ports four more that make it possible  to  convert  traditional  escape
        sequences such as \w and \s to use Unicode properties. PCRE2 uses these
-       non-standard,  non-Perl  properties  internally  when PCRE2_UCP is set.
+       non-standard, non-Perl properties internally  when  PCRE2_UCP  is  set.
        However, they may also be used explicitly. These properties are:
 
          Xan   Any alphanumeric character
@@ -6117,53 +6595,53 @@ BACKSLASH
          Xsp   Any Perl space character
          Xwd   Any Perl "word" character
 
-       Xan matches characters that have either the L (letter) or the  N  (num-
-       ber)  property. Xps matches the characters tab, linefeed, vertical tab,
-       form feed, or carriage return, and any other character that has  the  Z
-       (separator)  property.   Xsp  is  the  same as Xps; in PCRE1 it used to
-       exclude vertical tab, for Perl compatibility,  but  Perl  changed.  Xwd
+       Xan  matches  characters that have either the L (letter) or the N (num-
+       ber) property. Xps matches the characters tab, linefeed, vertical  tab,
+       form  feed,  or carriage return, and any other character that has the Z
+       (separator) property.  Xsp is the same as Xps;  in  PCRE1  it  used  to
+       exclude  vertical  tab,  for  Perl compatibility, but Perl changed. Xwd
        matches the same characters as Xan, plus underscore.
 
-       There  is another non-standard property, Xuc, which matches any charac-
-       ter that can be represented by a Universal Character Name  in  C++  and
-       other  programming  languages.  These are the characters $, @, ` (grave
-       accent), and all characters with Unicode code points  greater  than  or
-       equal  to U+00A0, except for the surrogates U+D800 to U+DFFF. Note that
-       most base (ASCII) characters are excluded. (Universal  Character  Names
-       are  of  the  form \uHHHH or \UHHHHHHHH where H is a hexadecimal digit.
+       There is another non-standard property, Xuc, which matches any  charac-
+       ter  that  can  be represented by a Universal Character Name in C++ and
+       other programming languages. These are the characters $,  @,  `  (grave
+       accent),  and  all  characters with Unicode code points greater than or
+       equal to U+00A0, except for the surrogates U+D800 to U+DFFF. Note  that
+       most  base  (ASCII) characters are excluded. (Universal Character Names
+       are of the form \uHHHH or \UHHHHHHHH where H is  a  hexadecimal  digit.
        Note that the Xuc property does not match these sequences but the char-
        acters that they represent.)
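
        The description of Xuc translates directly into a small  predicate.
        The sketch below is an illustration of the stated definition, not the
        table lookup that PCRE2 performs; the function name is hypothetical.

```python
def has_xuc_property(ch: str) -> bool:
    """True if ch fits the Xuc description: $, @, grave accent, or any
    code point >= U+00A0 that is not a surrogate (U+D800 to U+DFFF)."""
    if ch in "$@`":
        return True
    cp = ord(ch)
    return cp >= 0xA0 and not (0xD800 <= cp <= 0xDFFF)
```

        Note that an ordinary ASCII letter such  as  "a"  is  rejected  by  this
        predicate, in line with the remark that  most  base  characters  are
        excluded.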
 
    Resetting the match start
 
-       The  escape sequence \K causes any previously matched characters not to
+       The escape sequence \K causes any previously matched characters not  to
        be included in the final matched sequence. For example, the pattern:
 
          foo\Kbar
 
-       matches "foobar", but reports that it has matched "bar".  This  feature
-       is  similar  to  a lookbehind assertion (described below).  However, in
-       this case, the part of the subject before the real match does not  have
-       to  be of fixed length, as lookbehind assertions do. The use of \K does
-       not interfere with the setting of captured  substrings.   For  example,
+       matches  "foobar",  but reports that it has matched "bar". This feature
+       is similar to a lookbehind assertion (described  below).   However,  in
+       this  case, the part of the subject before the real match does not have
+       to be of fixed length, as lookbehind assertions do. The use of \K  does
+       not  interfere  with  the setting of captured substrings.  For example,
        when the pattern
 
          (foo)\Kbar
 
        matches "foobar", the first substring is still set to "foo".
 
-       Perl  documents  that  the  use  of  \K  within assertions is "not well
-       defined". In PCRE2, \K is acted upon when  it  occurs  inside  positive
-       assertions,  but  is  ignored  in negative assertions. Note that when a
-       pattern such as (?=ab\K) matches, the reported start of the  match  can
+       Perl documents that the use  of  \K  within  assertions  is  "not  well
+       defined".  In  PCRE2,  \K  is acted upon when it occurs inside positive
+       assertions, but is ignored in negative assertions.  Note  that  when  a
+       pattern  such  as (?=ab\K) matches, the reported start of the match can
        be greater than the end of the match.
 
    Simple assertions
 
-       The  final use of backslash is for certain simple assertions. An asser-
-       tion specifies a condition that has to be met at a particular point  in
-       a  match, without consuming any characters from the subject string. The
-       use of subpatterns for more complicated assertions is described  below.
+       The final use of backslash is for certain simple assertions. An  asser-
+       tion  specifies a condition that has to be met at a particular point in
+       a match, without consuming any characters from the subject string.  The
+       use  of subpatterns for more complicated assertions is described below.
        The backslashed assertions are:
 
          \b     matches at a word boundary
@@ -6174,184 +6652,184 @@ BACKSLASH
          \z     matches only at the end of the subject
          \G     matches at the first matching position in the subject
 
-       Inside  a  character  class, \b has a different meaning; it matches the
-       backspace character. If any other of  these  assertions  appears  in  a
+       Inside a character class, \b has a different meaning;  it  matches  the
+       backspace  character.  If  any  other  of these assertions appears in a
        character class, an "invalid escape sequence" error is generated.
 
-       A  word  boundary is a position in the subject string where the current
-       character and the previous character do not both match \w or  \W  (i.e.
-       one  matches  \w  and the other matches \W), or the start or end of the
-       string if the first or last character matches \w,  respectively.  In  a
-       UTF  mode,  the  meanings  of  \w  and \W can be changed by setting the
+       A word boundary is a position in the subject string where  the  current
+       character  and  the previous character do not both match \w or \W (i.e.
+       one matches \w and the other matches \W), or the start or  end  of  the
+       string  if  the  first or last character matches \w, respectively. In a
+       UTF mode, the meanings of \w and \W  can  be  changed  by  setting  the
        PCRE2_UCP option. When this is done, it also affects \b and \B. Neither
-       PCRE2  nor Perl has a separate "start of word" or "end of word" metase-
-       quence. However, whatever follows \b normally determines which  it  is.
+       PCRE2 nor Perl has a separate "start of word" or "end of word"  metase-
+       quence.  However,  whatever follows \b normally determines which it is.
        For example, the fragment \ba matches "a" at the start of a word.
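
        The word boundary definition can be written as a comparison  of  the
        characters on either side of a position. The sketch below assumes the
        default (non-UCP) meaning of \w, taken here to be ASCII letters, dig-
        its, and underscore; the function names are hypothetical and this  is
        not how PCRE2 implements \b.

```python
import string

# Assumed default \w set: ASCII letters, digits, and underscore.
WORD_CHARS = set(string.ascii_letters + string.digits + "_")


def is_word_char(ch: str) -> bool:
    return ch in WORD_CHARS


def at_word_boundary(subject: str, pos: int) -> bool:
    """True if \\b would succeed at offset pos in subject.

    Outside the string counts as a non-word position, so the start is a
    boundary only if the first character is a word character, and the end
    only if the last character is one.
    """
    before = pos > 0 and is_word_char(subject[pos - 1])
    after = pos < len(subject) and is_word_char(subject[pos])
    return before != after
```

        \B is simply the negation of this test.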
 
-       The  \A,  \Z,  and \z assertions differ from the traditional circumflex
+       The \A, \Z, and \z assertions differ from  the  traditional  circumflex
        and dollar (described in the next section) in that they only ever match
-       at  the  very start and end of the subject string, whatever options are
-       set. Thus, they are independent of multiline mode. These  three  asser-
-       tions  are  not  affected  by the PCRE2_NOTBOL or PCRE2_NOTEOL options,
-       which affect only the behaviour of the circumflex and dollar  metachar-
-       acters.  However,  if the startoffset argument of pcre2_match() is non-
-       zero, indicating that matching is to start at a point  other  than  the
-       beginning  of  the subject, \A can never match.  The difference between
-       \Z and \z is that \Z matches before a newline at the end of the  string
+       at the very start and end of the subject string, whatever  options  are
+       set.  Thus,  they are independent of multiline mode. These three asser-
+       tions are not affected by the  PCRE2_NOTBOL  or  PCRE2_NOTEOL  options,
+       which  affect only the behaviour of the circumflex and dollar metachar-
+       acters. However, if the startoffset argument of pcre2_match()  is  non-
+       zero,  indicating  that  matching is to start at a point other than the
+       beginning of the subject, \A can never match.  The  difference  between
+       \Z  and \z is that \Z matches before a newline at the end of the string
        as well as at the very end, whereas \z matches only at the end.
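
        The difference between the three assertions can be sketched as  posi-
        tion tests. This is a simplified model that assumes a single linefeed
        is the newline convention; the function names are hypothetical.

```python
def matches_A(pos: int) -> bool:
    """\\A: only at the very start of the subject."""
    return pos == 0


def matches_z(subject: str, pos: int) -> bool:
    """\\z: only at the very end of the subject."""
    return pos == len(subject)


def matches_Z(subject: str, pos: int, newline: str = "\n") -> bool:
    """\\Z: at the very end, or just before a newline that ends the subject."""
    return pos == len(subject) or subject[pos:] == newline
```

        For the subject "abc" followed by one linefeed, \Z succeeds at  offsets
        3 and 4, whereas \z succeeds only at offset 4.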
 
-       The  \G assertion is true only when the current matching position is at
-       the start point of the match, as specified by the startoffset  argument
-       of  pcre2_match().  It differs from \A when the value of startoffset is
-       non-zero. By calling  pcre2_match()  multiple  times  with  appropriate
-       arguments,  you  can  mimic Perl's /g option, and it is in this kind of
+       The \G assertion is true only when the current matching position is  at
+       the  start point of the match, as specified by the startoffset argument
+       of pcre2_match(). It differs from \A when the value of  startoffset  is
+       non-zero.  By  calling  pcre2_match()  multiple  times with appropriate
+       arguments, you can mimic Perl's /g option, and it is in  this  kind  of
        implementation where \G can be useful.
 
-       Note, however, that PCRE2's interpretation of \G, as the start  of  the
+       Note,  however,  that PCRE2's interpretation of \G, as the start of the
        current match, is subtly different from Perl's, which defines it as the
-       end of the previous match. In Perl, these can  be  different  when  the
-       previously  matched string was empty. Because PCRE2 does just one match
+       end  of  the  previous  match. In Perl, these can be different when the
+       previously matched string was empty. Because PCRE2 does just one  match
        at a time, it cannot reproduce this behaviour.
 
-       If all the alternatives of a pattern begin with \G, the  expression  is
+       If  all  the alternatives of a pattern begin with \G, the expression is
        anchored to the starting match position, and the "anchored" flag is set
        in the compiled regular expression.
 
 
 CIRCUMFLEX AND DOLLAR
 
-       The circumflex and dollar  metacharacters  are  zero-width  assertions.
-       That  is,  they test for a particular condition being true without con-
+       The  circumflex  and  dollar  metacharacters are zero-width assertions.
+       That is, they test for a particular condition being true  without  con-
        suming any characters from the subject string. These two metacharacters
-       are  concerned  with matching the starts and ends of lines. If the new-
-       line convention is set so that only the two-character sequence CRLF  is
-       recognized  as  a newline, isolated CR and LF characters are treated as
+       are concerned with matching the starts and ends of lines. If  the  new-
+       line  convention is set so that only the two-character sequence CRLF is
+       recognized as a newline, isolated CR and LF characters are  treated  as
        ordinary data characters, and are not recognized as newlines.
 
        Outside a character class, in the default matching mode, the circumflex
-       character  is  an  assertion  that is true only if the current matching
-       point is at the start of the subject string. If the  startoffset  argu-
-       ment  of  pcre2_match() is non-zero, or if PCRE2_NOTBOL is set, circum-
-       flex can never match if the PCRE2_MULTILINE option is unset.  Inside  a
-       character  class,  circumflex  has  an  entirely different meaning (see
+       character is an assertion that is true only  if  the  current  matching
+       point  is  at the start of the subject string. If the startoffset argu-
+       ment of pcre2_match() is non-zero, or if PCRE2_NOTBOL is  set,  circum-
+       flex  can  never match if the PCRE2_MULTILINE option is unset. Inside a
+       character class, circumflex has  an  entirely  different  meaning  (see
        below).
 
-       Circumflex need not be the first character of the pattern if  a  number
-       of  alternatives are involved, but it should be the first thing in each
-       alternative in which it appears if the pattern is ever  to  match  that
-       branch.  If all possible alternatives start with a circumflex, that is,
-       if the pattern is constrained to match only at the start  of  the  sub-
-       ject,  it  is  said  to be an "anchored" pattern. (There are also other
+       Circumflex  need  not be the first character of the pattern if a number
+       of alternatives are involved, but it should be the first thing in  each
+       alternative  in  which  it appears if the pattern is ever to match that
+       branch. If all possible alternatives start with a circumflex, that  is,
+       if  the  pattern  is constrained to match only at the start of the sub-
+       ject, it is said to be an "anchored" pattern.  (There  are  also  other
        constructs that can cause a pattern to be anchored.)
 
-       The dollar character is an assertion that is true only if  the  current
-       matching  point  is  at  the  end of the subject string, or immediately
-       before a newline  at  the  end  of  the  string  (by  default),  unless
+       The  dollar  character is an assertion that is true only if the current
+       matching point is at the end of  the  subject  string,  or  immediately
+       before  a  newline  at  the  end  of  the  string  (by default), unless
        PCRE2_NOTEOL is set. Note, however, that it does not actually match the
        newline. Dollar need not be the last character of the pattern if a num-
        ber of alternatives are involved, but it should be the last item in any
-       branch in which it appears. Dollar has no special meaning in a  charac-
+       branch  in which it appears. Dollar has no special meaning in a charac-
        ter class.
 
-       The  meaning  of  dollar  can be changed so that it matches only at the
-       very end of the string, by setting the PCRE2_DOLLAR_ENDONLY  option  at
+       The meaning of dollar can be changed so that it  matches  only  at  the
+       very  end  of the string, by setting the PCRE2_DOLLAR_ENDONLY option at
        compile time. This does not affect the \Z assertion.
 
        The meanings of the circumflex and dollar metacharacters are changed if
-       the PCRE2_MULTILINE option is set. When this  is  the  case,  a  dollar
-       character  matches before any newlines in the string, as well as at the
-       very end, and a circumflex matches immediately after internal  newlines
-       as  well as at the start of the subject string. It does not match after
-       a newline that ends the string, for compatibility with  Perl.  However,
+       the  PCRE2_MULTILINE  option  is  set.  When this is the case, a dollar
+       character matches before any newlines in the string, as well as at  the
+       very  end, and a circumflex matches immediately after internal newlines
+       as well as at the start of the subject string. It does not match  after
+       a  newline  that ends the string, for compatibility with Perl. However,
        this can be changed by setting the PCRE2_ALT_CIRCUMFLEX option.
 
-       For  example, the pattern /^abc$/ matches the subject string "def\nabc"
-       (where \n represents a newline) in multiline mode, but  not  otherwise.
-       Consequently,  patterns  that  are anchored in single line mode because
-       all branches start with ^ are not anchored in  multiline  mode,  and  a
-       match  for  circumflex  is  possible  when  the startoffset argument of
-       pcre2_match() is non-zero. The PCRE2_DOLLAR_ENDONLY option  is  ignored
+       For example, the pattern /^abc$/ matches the subject string  "def\nabc"
+       (where  \n  represents a newline) in multiline mode, but not otherwise.
+       Consequently, patterns that are anchored in single  line  mode  because
+       all  branches  start  with  ^ are not anchored in multiline mode, and a
+       match for circumflex is  possible  when  the  startoffset  argument  of
+       pcre2_match()  is  non-zero. The PCRE2_DOLLAR_ENDONLY option is ignored
        if PCRE2_MULTILINE is set.
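
        The interaction of these options can be summarized as  two  position
        tests. The sketch below is a simplified model, not PCRE2 code: it as-
        sumes a single linefeed as the newline convention, a zero startoffset,
        and that PCRE2_NOTBOL and PCRE2_NOTEOL are unset. The function  and
        parameter names are hypothetical stand-ins for the options named above.

```python
def circumflex_matches(subject: str, pos: int,
                       multiline: bool = False,
                       alt_circumflex: bool = False) -> bool:
    """Model of ^ at offset pos (LF newlines, no NOTBOL, startoffset 0)."""
    if pos == 0:
        return True
    if not multiline:
        return False
    if subject[pos - 1] != "\n":
        return False
    # After a newline that ends the string, ^ matches only with the
    # alternative circumflex behaviour enabled.
    return pos < len(subject) or alt_circumflex


def dollar_matches(subject: str, pos: int,
                   multiline: bool = False,
                   dollar_endonly: bool = False) -> bool:
    """Model of $ at offset pos (LF newlines, no NOTEOL)."""
    end = len(subject)
    if pos == end:
        return True
    if multiline:
        # dollar_endonly is ignored in multiline mode.
        return subject[pos] == "\n"
    if dollar_endonly:
        return False
    return pos == end - 1 and subject[pos] == "\n"
```

        For the subject "def\nabc", the circumflex test succeeds at offset  4
        only in multiline mode, which is why /^abc$/ matches there  but  not
        otherwise.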
 
-       When  the  newline  convention (see "Newline conventions" below) recog-
-       nizes the two-character sequence CRLF as a newline, this is  preferred,
-       even  if  the  single  characters CR and LF are also recognized as new-
-       lines. For example, if the newline convention  is  "any",  a  multiline
-       mode  circumflex matches before "xyz" in the string "abc\r\nxyz" rather
-       than after CR, even though CR on its own is a valid newline.  (It  also
+       When the newline convention (see "Newline  conventions"  below)  recog-
+       nizes  the two-character sequence CRLF as a newline, this is preferred,
+       even if the single characters CR and LF are  also  recognized  as  new-
+       lines.  For  example,  if  the newline convention is "any", a multiline
+       mode circumflex matches before "xyz" in the string "abc\r\nxyz"  rather
+       than  after  CR, even though CR on its own is a valid newline. (It also
        matches at the very start of the string, of course.)
 
-       Note  that  the sequences \A, \Z, and \z can be used to match the start
-       and end of the subject in both modes, and if all branches of a  pattern
-       start  with \A it is always anchored, whether or not PCRE2_MULTILINE is
+       Note that the sequences \A, \Z, and \z can be used to match  the  start
+       and  end of the subject in both modes, and if all branches of a pattern
+       start with \A it is always anchored, whether or not PCRE2_MULTILINE  is
        set.
 
 
 FULL STOP (PERIOD, DOT) AND \N
 
        Outside a character class, a dot in the pattern matches any one charac-
-       ter  in  the subject string except (by default) a character that signi-
+       ter in the subject string except (by default) a character  that  signi-
        fies the end of a line.
 
-       When a line ending is defined as a single character, dot never  matches
-       that  character; when the two-character sequence CRLF is used, dot does
-       not match CR if it is immediately followed  by  LF,  but  otherwise  it
-       matches  all characters (including isolated CRs and LFs). When any Uni-
-       code line endings are being recognized, dot does not match CR or LF  or
+       When  a line ending is defined as a single character, dot never matches
+       that character; when the two-character sequence CRLF is used, dot  does
+       not  match  CR  if  it  is immediately followed by LF, but otherwise it
+       matches all characters (including isolated CRs and LFs). When any  Uni-
+       code  line endings are being recognized, dot does not match CR or LF or
        any of the other line ending characters.
 
-       The  behaviour  of  dot  with regard to newlines can be changed. If the
-       PCRE2_DOTALL option is set, a dot matches any  one  character,  without
-       exception.   If  the two-character sequence CRLF is present in the sub-
+       The behaviour of dot with regard to newlines can  be  changed.  If  the
+       PCRE2_DOTALL  option  is  set, a dot matches any one character, without
+       exception.  If the two-character sequence CRLF is present in  the  sub-
        ject string, it takes two dots to match it.
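
        The rules for dot can be modelled as a single test on  the  character
        at the current position. The sketch below  covers  only  the  single-
        character and CRLF conventions described above and leaves out the "any
        Unicode line ending" case; the names are hypothetical and this is not
        PCRE2's implementation.

```python
def dot_matches(subject: str, pos: int,
                newline: str = "LF", dotall: bool = False) -> bool:
    """Model of whether dot matches the character at offset pos.

    newline is one of "LF", "CR", or "CRLF" (hypothetical labels for the
    newline convention in force).
    """
    if pos >= len(subject):
        return False  # dot always needs a character to consume
    if dotall:
        return True   # with dotall, any one character matches
    ch = subject[pos]
    if newline == "LF":
        return ch != "\n"
    if newline == "CR":
        return ch != "\r"
    if newline == "CRLF":
        # Only a CR that is immediately followed by LF is refused.
        return not (ch == "\r" and subject[pos + 1:pos + 2] == "\n")
    raise ValueError("unsupported newline convention: " + newline)
```

        Under the CRLF convention an isolated CR or LF is an ordinary charac-
        ter for dot, whereas under the LF convention a linefeed  is  refused
        unless the dotall behaviour is enabled.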
 
-       The handling of dot is entirely independent of the handling of  circum-
-       flex  and  dollar,  the  only relationship being that they both involve
+       The  handling of dot is entirely independent of the handling of circum-
+       flex and dollar, the only relationship being  that  they  both  involve
        newlines. Dot has no special meaning in a character class.
 
-       The escape sequence \N behaves like  a  dot,  except  that  it  is  not
-       affected  by  the  PCRE2_DOTALL  option. In other words, it matches any
-       character except one that signifies the end of a line. Perl  also  uses
+       The  escape  sequence  \N  behaves  like  a  dot, except that it is not
+       affected by the PCRE2_DOTALL option. In other  words,  it  matches  any
+       character  except  one that signifies the end of a line. Perl also uses
        \N to match characters by name; PCRE2 does not support this.
 
 
 MATCHING A SINGLE CODE UNIT
 
-       Outside  a character class, the escape sequence \C matches any one code
-       unit, whether or not a UTF mode is set. In the 8-bit library, one  code
-       unit  is  one  byte;  in the 16-bit library it is a 16-bit unit; in the
-       32-bit library it is a 32-bit unit. Unlike a  dot,  \C  always  matches
-       line-ending  characters.  The  feature  is provided in Perl in order to
+       Outside a character class, the escape sequence \C matches any one  code
+       unit,  whether or not a UTF mode is set. In the 8-bit library, one code
+       unit is one byte; in the 16-bit library it is a  16-bit  unit;  in  the
+       32-bit  library  it  is  a 32-bit unit. Unlike a dot, \C always matches
+       line-ending characters. The feature is provided in  Perl  in  order  to
        match individual bytes in UTF-8 mode, but it is unclear how it can use-
        fully be used.
 
-       Because  \C  breaks  up characters into individual code units, matching
-       one unit with \C in UTF-8 or UTF-16 mode means that  the  rest  of  the
-       string  may  start  with  a malformed UTF character. This has undefined
+       Because \C breaks up characters into individual  code  units,  matching
+       one  unit  with  \C  in UTF-8 or UTF-16 mode means that the rest of the
+       string may start with a malformed UTF  character.  This  has  undefined
        results, because PCRE2 assumes that it is matching character by charac-
-       ter  in  a  valid UTF string (by default it checks the subject string's
-       validity at the  start  of  processing  unless  the  PCRE2_NO_UTF_CHECK
+       ter in a valid UTF string (by default it checks  the  subject  string's
+       validity  at  the  start  of  processing  unless the PCRE2_NO_UTF_CHECK
        option is used).
 
-       An   application   can   lock   out  the  use  of  \C  by  setting  the
-       PCRE2_NEVER_BACKSLASH_C option when compiling a  pattern.  It  is  also
+       An  application  can  lock  out  the  use  of   \C   by   setting   the
+       PCRE2_NEVER_BACKSLASH_C  option  when  compiling  a pattern. It is also
        possible to build PCRE2 with the use of \C permanently disabled.
 
-       PCRE2  does  not allow \C to appear in lookbehind assertions (described
-       below) in UTF-8 or UTF-16 modes, because this would make it  impossible
-       to  calculate  the  length  of  the lookbehind. Neither the alternative
+       PCRE2 does not allow \C to appear in lookbehind  assertions  (described
+       below)  in UTF-8 or UTF-16 modes, because this would make it impossible
+       to calculate the length of  the  lookbehind.  Neither  the  alternative
        matching function pcre2_dfa_match() nor the JIT optimizer support \C in
        these UTF modes.  The former gives a match-time error; the latter fails
        to optimize and so the match is always run using the interpreter.
 
-       In the 32-bit library,  however,  \C  is  always  supported  (when  not
-       explicitly  locked  out)  because it always matches a single code unit,
+       In  the  32-bit  library,  however,  \C  is  always supported (when not
+       explicitly locked out) because it always matches a  single  code  unit,
        whether or not UTF-32 is specified.
 
        In general, the \C escape sequence is best avoided. However, one way of
-       using  it  that avoids the problem of malformed UTF-8 or UTF-16 charac-
-       ters is to use a lookahead to check the length of the  next  character,
-       as  in  this  pattern,  which could be used with a UTF-8 string (ignore
+       using it that avoids the problem of malformed UTF-8 or  UTF-16  charac-
+       ters  is  to use a lookahead to check the length of the next character,
+       as in this pattern, which could be used with  a  UTF-8  string  (ignore
        white space and line breaks):
 
          (?| (?=[\x00-\x7f])(\C) |
@@ -6359,10 +6837,10 @@ MATCHING A SINGLE CODE UNIT
              (?=[\x{800}-\x{ffff}])(\C)(\C)(\C) |
              (?=[\x{10000}-\x{1fffff}])(\C)(\C)(\C)(\C))
 
-       In this example, a group that starts  with  (?|  resets  the  capturing
+       In  this  example,  a  group  that starts with (?| resets the capturing
        parentheses numbers in each alternative (see "Duplicate Subpattern Num-
        bers" below). The assertions at the start of each branch check the next
-       UTF-8  character  for  values  whose encoding uses 1, 2, 3, or 4 bytes,
+       UTF-8 character for values whose encoding uses 1, 2,  3,  or  4  bytes,
        respectively. The character's individual bytes are then captured by the
        appropriate number of \C groups.
 
@@ -6371,48 +6849,67 @@ SQUARE BRACKETS AND CHARACTER CLASSES
 
        An opening square bracket introduces a character class, terminated by a
        closing square bracket. A closing square bracket on its own is not spe-
-       cial  by  default.  If a closing square bracket is required as a member
+       cial by default.  If a closing square bracket is required as  a  member
        of the class, it should be the first data character in the class (after
-       an  initial  circumflex,  if present) or escaped with a backslash. This
-       means that, by default, an empty class cannot be defined.  However,  if
-       the  PCRE2_ALLOW_EMPTY_CLASS option is set, a closing square bracket at
+       an initial circumflex, if present) or escaped with  a  backslash.  This
+       means  that,  by default, an empty class cannot be defined. However, if
+       the PCRE2_ALLOW_EMPTY_CLASS option is set, a closing square bracket  at
        the start does end the (empty) class.
 
-       A character class matches a single character in the subject. A  matched
+       A  character class matches a single character in the subject. A matched
        character must be in the set of characters defined by the class, unless
-       the first character in the class definition is a circumflex,  in  which
+       the  first  character in the class definition is a circumflex, in which
        case the subject character must not be in the set defined by the class.
-       If a circumflex is actually required as a member of the  class,  ensure
+       If  a  circumflex is actually required as a member of the class, ensure
        it is not the first character, or escape it with a backslash.
 
-       For  example, the character class [aeiou] matches any lower case vowel,
-       while [^aeiou] matches any character that is not a  lower  case  vowel.
+       For example, the character class [aeiou] matches any lower case  vowel,
+       while  [^aeiou]  matches  any character that is not a lower case vowel.
        Note that a circumflex is just a convenient notation for specifying the
-       characters that are in the class by enumerating those that are  not.  A
-       class  that starts with a circumflex is not an assertion; it still con-
-       sumes a character from the subject string, and therefore  it  fails  if
+       characters  that  are in the class by enumerating those that are not. A
+       class that starts with a circumflex is not an assertion; it still  con-
+       sumes  a  character  from the subject string, and therefore it fails if
        the current pointer is at the end of the string.
 
-       When  caseless  matching  is set, any letters in a class represent both
-       their upper case and lower case versions, so for  example,  a  caseless
-       [aeiou]  matches  "A"  as well as "a", and a caseless [^aeiou] does not
+       When caseless matching is set, any letters in a  class  represent  both
+       their  upper  case  and lower case versions, so for example, a caseless
+       [aeiou] matches "A" as well as "a", and a caseless  [^aeiou]  does  not
        match "A", whereas a caseful version would.
 
-       Characters that might indicate line breaks are  never  treated  in  any
-       special  way  when  matching  character  classes,  whatever line-ending
-       sequence is in use,  and  whatever  setting  of  the  PCRE2_DOTALL  and
-       PCRE2_MULTILINE  options  is  used. A class such as [^a] always matches
+       Characters  that  might  indicate  line breaks are never treated in any
+       special way  when  matching  character  classes,  whatever  line-ending
+       sequence  is  in  use,  and  whatever  setting  of the PCRE2_DOTALL and
+       PCRE2_MULTILINE options is used. A class such as  [^a]  always  matches
        one of these characters.
 
-       The minus (hyphen) character can be used to specify a range of  charac-
-       ters  in  a  character  class.  For  example,  [d-m] matches any letter
-       between d and m, inclusive. If a  minus  character  is  required  in  a
-       class,  it  must  be  escaped  with a backslash or appear in a position
-       where it cannot be interpreted as indicating a range, typically as  the
+       The  character escape sequences \d, \D, \h, \H, \p, \P, \s, \S, \v, \V,
+       \w, and \W may appear in a character class, and add the characters that
+       they  match to the class. For example, [\dABCDEF] matches any hexadeci-
+       mal digit. In UTF modes, the PCRE2_UCP option affects the  meanings  of
+       \d,  \s,  \w  and  their upper case partners, just as it does when they
+       appear outside a character class, as described in the section  entitled
+       "Generic character types" above. The escape sequence \b has a different
+       meaning inside a character class; it matches the  backspace  character.
+       The  sequences  \B,  \N,  \R, and \X are not special inside a character
+       class. Like any other unrecognized  escape  sequences,  they  cause  an
+       error.
+
+       The  minus (hyphen) character can be used to specify a range of charac-
+       ters in a character  class.  For  example,  [d-m]  matches  any  letter
+       between  d  and  m,  inclusive.  If  a minus character is required in a
+       class, it must be escaped with a backslash  or  appear  in  a  position
+       where  it cannot be interpreted as indicating a range, typically as the
        first or last character in the class, or immediately after a range. For
-       example, [b-d-z] matches letters in the range b to d, a hyphen  charac-
+       example,  [b-d-z] matches letters in the range b to d, a hyphen charac-
        ter, or z.
 
+       Perl treats a hyphen as a literal if it appears before or after a POSIX
+       class (see below) or before or after a character  type  escape  such as
+       \d or \H.  However, unless the hyphen is  the  last  character  in  the
+       class,  Perl  outputs  a  warning  in its warning mode, as this is most
+       likely a user error. As PCRE2 has no facility for warning, an error  is
+       given in these cases.
+
        It is not possible to have the literal character "]" as the end charac-
        ter of a range. A pattern such as [W-]46] is interpreted as a class  of
        two  characters ("W" and "-") followed by a literal string "46]", so it
@@ -6422,15 +6919,15 @@ SQUARE BRACKETS AND CHARACTER CLASSES
        The  octal or hexadecimal representation of "]" can also be used to end
        a range.
 
-       An error is generated if a POSIX character  class  (see  below)  or  an
-       escape  sequence other than one that defines a single character appears
-       at a point where a range ending character  is  expected.  For  example,
-       [z-\xff] is valid, but [A-\d] and [A-[:digit:]] are not.
-
        Ranges normally include all code points between the start and end char-
-       acters, inclusive. They can also be  used  for  code  points  specified
+       acters,  inclusive.  They  can  also  be used for code points specified
        numerically, for example [\000-\037]. Ranges can include any characters
-       that are valid for the current mode.
+       that  are  valid  for  the current mode. In any UTF mode, the so-called
+       "surrogate" characters (those whose code points lie between 0xd800  and
+       0xdfff  inclusive)  may  not  be  specified  explicitly by default (the
+       PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES option disables this  check).  How-
+       ever, ranges such as [\x{d7ff}-\x{e000}], which include the surrogates,
+       are always permitted.
 
        There is a special case in EBCDIC environments  for  ranges  whose  end
        points are both specified as literal letters in the same case. For com-
@@ -6446,18 +6943,6 @@ SQUARE BRACKETS AND CHARACTER CLASSES
        character  tables  for  a French locale are in use, [\xc8-\xcb] matches
        accented E characters in both cases.
 
-       The character escape sequences \d, \D, \h, \H, \p, \P, \s, \S, \v,  \V,
-       \w, and \W may appear in a character class, and add the characters that
-       they match to the class. For example, [\dABCDEF] matches any  hexadeci-
-       mal  digit.  In UTF modes, the PCRE2_UCP option affects the meanings of
-       \d, \s, \w and their upper case partners, just as  it  does  when  they
-       appear  outside a character class, as described in the section entitled
-       "Generic character types" above. The escape sequence \b has a different
-       meaning  inside  a character class; it matches the backspace character.
-       The sequences \B, \N, \R, and \X are not  special  inside  a  character
-       class.  Like  any  other  unrecognized  escape sequences, they cause an
-       error.
-
        A circumflex can conveniently be used with  the  upper  case  character
        types  to specify a more restricted set of characters than the matching
        lower case type.  For example, the class [^\W_] matches any  letter  or
@@ -6594,20 +7079,26 @@ VERTICAL BAR
 
 INTERNAL OPTION SETTING
 
-       The  settings of the PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL, and
-       PCRE2_EXTENDED options (which are Perl-compatible) can be changed  from
-       within  the  pattern  by  a  sequence  of  Perl option letters enclosed
-       between "(?" and ")".  The option letters are
+       The  settings  of  the  PCRE2_CASELESS,  PCRE2_MULTILINE, PCRE2_DOTALL,
+       PCRE2_EXTENDED, PCRE2_EXTENDED_MORE, and PCRE2_NO_AUTO_CAPTURE  options
+       (which are Perl-compatible) can be changed from within the pattern by a
+       sequence of Perl option letters enclosed  between  "(?"  and  ")".  The
+       option letters are
 
          i  for PCRE2_CASELESS
          m  for PCRE2_MULTILINE
+         n  for PCRE2_NO_AUTO_CAPTURE
          s  for PCRE2_DOTALL
          x  for PCRE2_EXTENDED
+         xx for PCRE2_EXTENDED_MORE
 
        For example, (?im) sets caseless, multiline matching. It is also possi-
-       ble to unset these options by preceding the letter with a hyphen, and a
-       combined setting and unsetting such as (?im-sx), which sets PCRE2_CASE-
-       LESS    and    PCRE2_MULTILINE   while   unsetting   PCRE2_DOTALL   and
+       ble to unset these options by preceding the letter with a  hyphen.  The
+       two  "extended"  options are not independent; unsetting either one can-
+       cels the effects of both of them.
+
+       A  combined  setting  and  unsetting  such  as  (?im-sx),  which   sets
+       PCRE2_CASELESS  and  PCRE2_MULTILINE  while  unsetting PCRE2_DOTALL and
        PCRE2_EXTENDED, is also permitted. If a letter appears both before  and
        after  the  hyphen, the option is unset. An empty options setting "(?)"
        is allowed. Needless to say, it has no effect.
@@ -6618,32 +7109,27 @@ INTERNAL OPTION SETTING
 
        When one of these option changes occurs at  top  level  (that  is,  not
        inside  subpattern parentheses), the change applies to the remainder of
-       the pattern that follows. If the change is placed right at the start of
-       a  pattern,  PCRE2  extracts  it  into  the global options (and it will
-       therefore show up in data extracted by the  pcre2_pattern_info()  func-
-       tion).
-
-       An  option  change  within a subpattern (see below for a description of
-       subpatterns) affects only that part of the subpattern that follows  it,
-       so
+       the pattern that follows. An option change  within  a  subpattern  (see
+       below  for  a description of subpatterns) affects only that part of the
+       subpattern that follows it, so
 
          (a(?i)b)c
 
-       matches  abc  and  aBc and no other strings (assuming PCRE2_CASELESS is
-       not used).  By this means, options can be made to have  different  set-
+       matches abc and aBc and no other strings  (assuming  PCRE2_CASELESS  is
+       not  used).   By this means, options can be made to have different set-
        tings in different parts of the pattern. Any changes made in one alter-
        native do carry on into subsequent branches within the same subpattern.
        For example,
 
          (a(?i)b|c)
 
-       matches  "ab",  "aB",  "c",  and "C", even though when matching "C" the
-       first branch is abandoned before the option setting.  This  is  because
-       the  effects  of option settings happen at compile time. There would be
+       matches "ab", "aB", "c", and "C", even though  when  matching  "C"  the
+       first  branch  is  abandoned before the option setting. This is because
+       the effects of option settings happen at compile time. There  would  be
        some very weird behaviour otherwise.
 
-       As a convenient shorthand, if any option settings are required  at  the
-       start  of a non-capturing subpattern (see the next section), the option
+       As  a  convenient shorthand, if any option settings are required at the
+       start of a non-capturing subpattern (see the next section), the  option
        letters may appear between the "?" and the ":". Thus the two patterns
 
          (?i:saturday|sunday)
@@ -6651,14 +7137,14 @@ INTERNAL OPTION SETTING
 
        match exactly the same set of strings.
 
-       Note: There are other PCRE2-specific options that can  be  set  by  the
+       Note:  There  are  other  PCRE2-specific options that can be set by the
        application when the compiling function is called. The pattern can con-
-       tain special leading sequences such as (*CRLF)  to  override  what  the
-       application  has  set  or what has been defaulted. Details are given in
-       the section entitled "Newline sequences"  above.  There  are  also  the
-       (*UTF)  and  (*UCP)  leading  sequences that can be used to set UTF and
-       Unicode property modes; they are equivalent to  setting  the  PCRE2_UTF
-       and  PCRE2_UCP  options, respectively. However, the application can set
+       tain  special  leading  sequences  such as (*CRLF) to override what the
+       application has set or what has been defaulted. Details  are  given  in
+       the  section  entitled  "Newline  sequences"  above. There are also the
+       (*UTF) and (*UCP) leading sequences that can be used  to  set  UTF  and
+       Unicode  property  modes;  they are equivalent to setting the PCRE2_UTF
+       and PCRE2_UCP options, respectively. However, the application  can  set
        the PCRE2_NEVER_UTF and PCRE2_NEVER_UCP options, which lock out the use
        of the (*UTF) and (*UCP) sequences.
 
@@ -6672,18 +7158,18 @@ SUBPATTERNS
 
          cat(aract|erpillar|)
 
-       matches "cataract", "caterpillar", or "cat". Without  the  parentheses,
+       matches  "cataract",  "caterpillar", or "cat". Without the parentheses,
        it would match "cataract", "erpillar" or an empty string.
 
-       2.  It  sets  up  the  subpattern as a capturing subpattern. This means
+       2. It sets up the subpattern as  a  capturing  subpattern.  This  means
        that, when the whole pattern matches, the portion of the subject string
-       that  matched  the  subpattern is passed back to the caller, separately
-       from the portion that matched the whole pattern. (This applies only  to
-       the  traditional  matching function; the DFA matching function does not
+       that matched the subpattern is passed back to  the  caller,  separately
+       from  the portion that matched the whole pattern. (This applies only to
+       the traditional matching function; the DFA matching function  does  not
        support capturing.)
 
        Opening parentheses are counted from left to right (starting from 1) to
-       obtain  numbers  for  the  capturing  subpatterns.  For example, if the
+       obtain numbers for the  capturing  subpatterns.  For  example,  if  the
        string "the red king" is matched against the pattern
 
          the ((red|white) (king|queen))
@@ -6691,12 +7177,12 @@ SUBPATTERNS
        the captured substrings are "red king", "red", and "king", and are num-
        bered 1, 2, and 3, respectively.
 
-       The  fact  that  plain  parentheses  fulfil two functions is not always
-       helpful.  There are often times when a grouping subpattern is  required
-       without  a capturing requirement. If an opening parenthesis is followed
-       by a question mark and a colon, the subpattern does not do any  captur-
-       ing,  and  is  not  counted when computing the number of any subsequent
-       capturing subpatterns. For example, if the string "the white queen"  is
+       The fact that plain parentheses fulfil  two  functions  is  not  always
+       helpful.   There are often times when a grouping subpattern is required
+       without a capturing requirement. If an opening parenthesis is  followed
+       by  a question mark and a colon, the subpattern does not do any captur-
+       ing, and is not counted when computing the  number  of  any  subsequent
+       capturing  subpatterns. For example, if the string "the white queen" is
        matched against the pattern
 
          the ((?:red|white) (king|queen))
@@ -6704,37 +7190,37 @@ SUBPATTERNS
        the captured substrings are "white queen" and "queen", and are numbered
        1 and 2. The maximum number of capturing subpatterns is 65535.
 
-       As a convenient shorthand, if any option settings are required  at  the
-       start  of  a  non-capturing  subpattern,  the option letters may appear
+       As  a  convenient shorthand, if any option settings are required at the
+       start of a non-capturing subpattern,  the  option  letters  may  appear
        between the "?" and the ":". Thus the two patterns
 
          (?i:saturday|sunday)
          (?:(?i)saturday|sunday)
 
        match exactly the same set of strings. Because alternative branches are
-       tried  from  left  to right, and options are not reset until the end of
-       the subpattern is reached, an option setting in one branch does  affect
-       subsequent  branches,  so  the above patterns match "SUNDAY" as well as
+       tried from left to right, and options are not reset until  the  end  of
+       the  subpattern is reached, an option setting in one branch does affect
+       subsequent branches, so the above patterns match "SUNDAY"  as  well  as
        "Saturday".
 
 
 DUPLICATE SUBPATTERN NUMBERS
 
        Perl 5.10 introduced a feature whereby each alternative in a subpattern
-       uses  the same numbers for its capturing parentheses. Such a subpattern
-       starts with (?| and is itself a non-capturing subpattern. For  example,
+       uses the same numbers for its capturing parentheses. Such a  subpattern
+       starts  with (?| and is itself a non-capturing subpattern. For example,
        consider this pattern:
 
          (?|(Sat)ur|(Sun))day
 
-       Because  the two alternatives are inside a (?| group, both sets of cap-
-       turing parentheses are numbered one. Thus, when  the  pattern  matches,
-       you  can  look  at captured substring number one, whichever alternative
-       matched. This construct is useful when you want to  capture  part,  but
+       Because the two alternatives are inside a (?| group, both sets of  cap-
+       turing  parentheses  are  numbered one. Thus, when the pattern matches,
+       you can look at captured substring number  one,  whichever  alternative
+       matched.  This  construct  is useful when you want to capture part, but
        not all, of one of a number of alternatives. Inside a (?| group, paren-
-       theses are numbered as usual, but the number is reset at the  start  of
-       each  branch.  The numbers of any capturing parentheses that follow the
-       subpattern start after the highest number used in any branch. The  fol-
+       theses  are  numbered as usual, but the number is reset at the start of
+       each branch. The numbers of any capturing parentheses that  follow  the
+       subpattern  start after the highest number used in any branch. The fol-
        lowing example is taken from the Perl documentation. The numbers under-
        neath show in which buffer the captured content will be stored.
 
@@ -6742,14 +7228,14 @@ DUPLICATE SUBPATTERN NUMBERS
          / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
          # 1            2         2  3        2     3     4
 
-       A back reference to a numbered subpattern uses the  most  recent  value
-       that  is  set  for that number by any subpattern. The following pattern
+       A  back  reference  to a numbered subpattern uses the most recent value
+       that is set for that number by any subpattern.  The  following  pattern
        matches "abcabc" or "defdef":
 
          /(?|(abc)|(def))\1/
 
-       In contrast, a subroutine call to a numbered subpattern  always  refers
-       to  the  first  one in the pattern with the given number. The following
+       In  contrast,  a subroutine call to a numbered subpattern always refers
+       to the first one in the pattern with the given  number.  The  following
        pattern matches "abcabc" or "defabc":
 
          /(?|(abc)|(def))(?1)/
@@ -6757,47 +7243,47 @@ DUPLICATE SUBPATTERN NUMBERS
        A relative reference such as (?-1) is no different: it is just a conve-
        nient way of computing an absolute group number.
 
-       If  a condition test for a subpattern's having matched refers to a non-
-       unique number, the test is true if any of the subpatterns of that  num-
+       If a condition test for a subpattern's having matched refers to a  non-
+       unique  number, the test is true if any of the subpatterns of that num-
        ber have matched.
 
-       An  alternative approach to using this "branch reset" feature is to use
+       An alternative approach to using this "branch reset" feature is to  use
        duplicate named subpatterns, as described in the next section.
 
 
 NAMED SUBPATTERNS
 
-       Identifying capturing parentheses by number is simple, but  it  can  be
-       very  hard  to keep track of the numbers in complicated regular expres-
-       sions. Furthermore, if an  expression  is  modified,  the  numbers  may
+       Identifying  capturing  parentheses  by number is simple, but it can be
+       very hard to keep track of the numbers in complicated  regular  expres-
+       sions.  Furthermore,  if  an  expression  is  modified, the numbers may
        change. To help with this difficulty, PCRE2 supports the naming of sub-
        patterns. This feature was not added to Perl until release 5.10. Python
-       had  the feature earlier, and PCRE1 introduced it at release 4.0, using
-       the Python syntax. PCRE2 supports both the Perl and the Python  syntax.
-       Perl  allows  identically numbered subpatterns to have different names,
+       had the feature earlier, and PCRE1 introduced it at release 4.0,  using
+       the  Python syntax. PCRE2 supports both the Perl and the Python syntax.
+       Perl allows identically numbered subpatterns to have  different  names,
        but PCRE2 does not.
 
-       In PCRE2, a subpattern can be named in one of three ways:  (?<name>...)
-       or  (?'name'...)  as in Perl, or (?P<name>...) as in Python. References
-       to capturing parentheses from other parts of the pattern, such as  back
-       references,  recursion,  and conditions, can be made by name as well as
+       In  PCRE2, a subpattern can be named in one of three ways: (?<name>...)
+       or (?'name'...) as in Perl, or (?P<name>...) as in  Python.  References
+       to  capturing parentheses from other parts of the pattern, such as back
+       references, recursion, and conditions, can be made by name as  well  as
        by number.
 
-       Names consist of up to 32 alphanumeric characters and underscores,  but
-       must  start  with  a  non-digit.  Named capturing parentheses are still
-       allocated numbers as well as names, exactly as if the  names  were  not
+       Names  consist of up to 32 alphanumeric characters and underscores, but
+       must start with a non-digit.  Named  capturing  parentheses  are  still
+       allocated  numbers  as  well as names, exactly as if the names were not
        present. The PCRE2 API provides function calls for extracting the name-
-       to-number translation table from a compiled  pattern.  There  are  also
+       to-number  translation  table  from  a compiled pattern. There are also
        convenience functions for extracting a captured substring by name.
 
-       By  default, a name must be unique within a pattern, but it is possible
-       to relax this constraint by setting the PCRE2_DUPNAMES option  at  com-
-       pile  time.  (Duplicate names are also always permitted for subpatterns
-       with the same number, set up as described  in  the  previous  section.)
-       Duplicate  names  can be useful for patterns where only one instance of
+       By default, a name must be unique within a pattern, but it is  possible
+       to  relax  this constraint by setting the PCRE2_DUPNAMES option at com-
+       pile time.  (Duplicate names are also always permitted for  subpatterns
+       with  the  same  number,  set up as described in the previous section.)
+       Duplicate names can be useful for patterns where only one  instance  of
        the named parentheses can match.  Suppose you want to match the name of
-       a  weekday,  either as a 3-letter abbreviation or as the full name, and
-       in both cases you  want  to  extract  the  abbreviation.  This  pattern
+       a weekday, either as a 3-letter abbreviation or as the full  name,  and
+       in  both  cases  you  want  to  extract  the abbreviation. This pattern
        (ignoring the line breaks) does the job:
 
          (?<DN>Mon|Fri|Sun)(?:day)?|
@@ -6806,18 +7292,18 @@ NAMED SUBPATTERNS
          (?<DN>Thu)(?:rsday)?|
          (?<DN>Sat)(?:urday)?
 
-       There  are  five capturing substrings, but only one is ever set after a
+       There are five capturing substrings, but only one is ever set  after  a
        match.  (An alternative way of solving this problem is to use a "branch
        reset" subpattern, as described in the previous section.)
 
-       The  convenience  functions for extracting the data by name returns the
-       substring for the first (and in this example, the only)  subpattern  of
-       that  name  that  matched.  This saves searching to find which numbered
+       The convenience functions for extracting the data  by  name  return  the
+       substring  for  the first (and in this example, the only) subpattern of
+       that name that matched. This saves searching  to  find  which  numbered
        subpattern it was.
 
-       If you make a back reference to  a  non-unique  named  subpattern  from
-       elsewhere  in the pattern, the subpatterns to which the name refers are
-       checked in the order in which they appear in the overall  pattern.  The
+       If  you  make  a  back  reference to a non-unique named subpattern from
+       elsewhere in the pattern, the subpatterns to which the name refers  are
+       checked  in  the order in which they appear in the overall pattern. The
        first one that is set is used for the reference. For example, this pat-
        tern matches both "foofoo" and "barbar" but not "foobar" or "barfoo":
 
@@ -6825,29 +7311,29 @@ NAMED SUBPATTERNS
 
 
        If you make a subroutine call to a non-unique named subpattern, the one
-       that  corresponds  to  the first occurrence of the name is used. In the
+       that corresponds to the first occurrence of the name is  used.  In  the
        absence of duplicate numbers (see the previous section) this is the one
        with the lowest number.
 
        If you use a named reference in a condition test (see the section about
        conditions below), either to check whether a subpattern has matched, or
-       to  check for recursion, all subpatterns with the same name are tested.
-       If the condition is true for any one of them, the overall condition  is
-       true.  This  is  the  same  behaviour as testing by number. For further
-       details of the interfaces  for  handling  named  subpatterns,  see  the
+       to check for recursion, all subpatterns with the same name are  tested.
+       If  the condition is true for any one of them, the overall condition is
+       true. This is the same behaviour as  testing  by  number.  For  further
+       details  of  the  interfaces  for  handling  named subpatterns, see the
        pcre2api documentation.
 
        Warning: You cannot use different names to distinguish between two sub-
-       patterns with the same number because PCRE2 uses only the numbers  when
+       patterns  with the same number because PCRE2 uses only the numbers when
        matching. For this reason, an error is given at compile time if differ-
-       ent names are given to subpatterns with the same number.  However,  you
+       ent  names  are given to subpatterns with the same number. However, you
        can always give the same name to subpatterns with the same number, even
        when PCRE2_DUPNAMES is not set.
 
 
 REPETITION
 
-       Repetition is specified by quantifiers, which can  follow  any  of  the
+       Repetition  is  specified  by  quantifiers, which can follow any of the
        following items:
 
          a literal data character
@@ -6861,17 +7347,17 @@ REPETITION
          a parenthesized subpattern (including most assertions)
          a subroutine call to a subpattern (recursive or otherwise)
 
-       The  general repetition quantifier specifies a minimum and maximum num-
-       ber of permitted matches, by giving the two numbers in  curly  brackets
-       (braces),  separated  by  a comma. The numbers must be less than 65536,
+       The general repetition quantifier specifies a minimum and maximum  num-
+       ber  of  permitted matches, by giving the two numbers in curly brackets
+       (braces), separated by a comma. The numbers must be  less  than  65536,
        and the first must be less than or equal to the second. For example:
 
          z{2,4}
 
-       matches "zz", "zzz", or "zzzz". A closing brace on its  own  is  not  a
-       special  character.  If  the second number is omitted, but the comma is
-       present, there is no upper limit; if the second number  and  the  comma
-       are  both omitted, the quantifier specifies an exact number of required
+       matches  "zz",  "zzz",  or  "zzzz". A closing brace on its own is not a
+       special character. If the second number is omitted, but  the  comma  is
+       present,  there  is  no upper limit; if the second number and the comma
+       are both omitted, the quantifier specifies an exact number of  required
        matches. Thus
 
          [aeiou]{3,}
@@ -6880,50 +7366,50 @@ REPETITION
 
          \d{8}
 
-       matches exactly 8 digits. An opening curly bracket that  appears  in  a
-       position  where a quantifier is not allowed, or one that does not match
-       the syntax of a quantifier, is taken as a literal character. For  exam-
+       matches  exactly  8  digits. An opening curly bracket that appears in a
+       position where a quantifier is not allowed, or one that does not  match
+       the  syntax of a quantifier, is taken as a literal character. For exam-
        ple, {,6} is not a quantifier, but a literal string of four characters.
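
As an illustration of the counted quantifiers described above, here is a
minimal sketch. It assumes Python's standard re module as a stand-in for
PCRE2, so that it runs without extra libraries; the two engines agree on
these particular patterns. The {,6} case is deliberately left out, because
Python reads it as {0,6} rather than as a literal string.

```python
import re

# z{2,4}: between two and four repetitions of "z"
candidates = ("z", "zz", "zzz", "zzzz", "zzzzz")
bounded = [s for s in candidates if re.fullmatch(r"z{2,4}", s)]

# [aeiou]{3,}: at least three vowels, with no upper limit
open_ended = re.fullmatch(r"[aeiou]{3,}", "aeiouaeiou") is not None

# \d{8}: exactly eight digits, so seven digits must be rejected
exact_ok = re.fullmatch(r"\d{8}", "20180224") is not None
exact_short = re.fullmatch(r"\d{8}", "2018022") is not None

print(bounded, open_ended, exact_ok, exact_short)
```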
 
        In UTF modes, quantifiers apply to characters rather than to individual
-       code units. Thus, for example, \x{100}{2} matches two characters,  each
+       code  units. Thus, for example, \x{100}{2} matches two characters, each
        of which is represented by a two-byte sequence in a UTF-8 string. Simi-
-       larly, \X{3} matches three Unicode extended grapheme clusters, each  of
-       which  may  be  several  code  units long (and they may be of different
+       larly,  \X{3} matches three Unicode extended grapheme clusters, each of
+       which may be several code units long (and  they  may  be  of  different
        lengths).
 
        The quantifier {0} is permitted, causing the expression to behave as if
        the previous item and the quantifier were not present. This may be use-
-       ful for subpatterns that are referenced as subroutines  from  elsewhere
+       ful  for  subpatterns that are referenced as subroutines from elsewhere
        in the pattern (but see also the section entitled "Defining subpatterns
-       for use by reference only" below). Items other  than  subpatterns  that
+       for  use  by  reference only" below). Items other than subpatterns that
        have a {0} quantifier are omitted from the compiled pattern.
 
-       For  convenience, the three most common quantifiers have single-charac-
+       For convenience, the three most common quantifiers have  single-charac-
        ter abbreviations:
 
          *    is equivalent to {0,}
          +    is equivalent to {1,}
          ?    is equivalent to {0,1}
 
-       It is possible to construct infinite loops by  following  a  subpattern
+       It  is  possible  to construct infinite loops by following a subpattern
        that can match no characters with a quantifier that has no upper limit,
        for example:
 
          (a?)*
 
-       Earlier versions of Perl and PCRE1 used to give  an  error  at  compile
+       Earlier  versions  of  Perl  and PCRE1 used to give an error at compile
        time for such patterns. However, because there are cases where this can
        be useful, such patterns are now accepted, but if any repetition of the
-       subpattern  does in fact match no characters, the loop is forcibly bro-
+       subpattern does in fact match no characters, the loop is forcibly  bro-
        ken.
 
-       By default, the quantifiers are "greedy", that is, they match  as  much
-       as  possible  (up  to  the  maximum number of permitted times), without
-       causing the rest of the pattern to fail. The classic example  of  where
+       By  default,  the quantifiers are "greedy", that is, they match as much
+       as possible (up to the maximum  number  of  permitted  times),  without
+       causing  the  rest of the pattern to fail. The classic example of where
        this gives problems is in trying to match comments in C programs. These
-       appear between /* and */ and within the comment,  individual  *  and  /
-       characters  may  appear. An attempt to match C comments by applying the
+       appear  between  /*  and  */ and within the comment, individual * and /
+       characters may appear. An attempt to match C comments by  applying  the
        pattern
 
          /\*.*\*/
@@ -6932,19 +7418,19 @@ REPETITION
 
          /* first comment */  not comment  /* second comment */
 
-       fails, because it matches the entire string owing to the greediness  of
+       fails,  because it matches the entire string owing to the greediness of
        the .*  item.
 
        If a quantifier is followed by a question mark, it ceases to be greedy,
-       and instead matches the minimum number of times possible, so  the  pat-
+       and  instead  matches the minimum number of times possible, so the pat-
        tern
 
          /\*.*?\*/
 
-       does  the  right  thing with the C comments. The meaning of the various
-       quantifiers is not otherwise changed,  just  the  preferred  number  of
-       matches.   Do  not  confuse this use of question mark with its use as a
-       quantifier in its own right. Because it has two uses, it can  sometimes
+       does the right thing with the C comments. The meaning  of  the  various
+       quantifiers  is  not  otherwise  changed,  just the preferred number of
+       matches.  Do not confuse this use of question mark with its  use  as  a
+       quantifier  in its own right. Because it has two uses, it can sometimes
        appear doubled, as in
 
          \d??\d
@@ -6953,45 +7439,45 @@ REPETITION
        only way the rest of the pattern matches.
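
The difference between greedy and minimizing quantifiers can be sketched as
follows. This again assumes Python's re module in place of PCRE2; for these
patterns the behaviour is the same.

```python
import re

subject = "/* first comment */  not comment  /* second comment */"

# Greedy .* runs to the last "*/" in the subject.
greedy = re.search(r"/\*.*\*/", subject).group(0)

# Minimizing .*? stops at the first "*/" that lets the match succeed.
lazy = re.search(r"/\*.*?\*/", subject).group(0)

# \d??\d prefers one digit, but takes two when the whole string must match.
one_digit = re.match(r"\d??\d", "12").group(0)
two_digits = re.fullmatch(r"\d??\d", "12").group(0)

print(repr(greedy), repr(lazy), one_digit, two_digits)
```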
 
        If the PCRE2_UNGREEDY option is set (an option that is not available in
-       Perl),  the  quantifiers are not greedy by default, but individual ones
-       can be made greedy by following them with a  question  mark.  In  other
+       Perl), the quantifiers are not greedy by default, but  individual  ones
+       can  be  made  greedy  by following them with a question mark. In other
        words, it inverts the default behaviour.
 
-       When  a  parenthesized  subpattern  is quantified with a minimum repeat
-       count that is greater than 1 or with a limited maximum, more memory  is
-       required  for  the  compiled  pattern, in proportion to the size of the
+       When a parenthesized subpattern is quantified  with  a  minimum  repeat
+       count  that is greater than 1 or with a limited maximum, more memory is
+       required for the compiled pattern, in proportion to  the  size  of  the
        minimum or maximum.
 
-       If a pattern starts with  .*  or  .{0,}  and  the  PCRE2_DOTALL  option
-       (equivalent  to  Perl's /s) is set, thus allowing the dot to match new-
-       lines, the pattern is implicitly  anchored,  because  whatever  follows
-       will  be  tried against every character position in the subject string,
-       so there is no point in retrying the  overall  match  at  any  position
+       If  a  pattern  starts  with  .*  or  .{0,} and the PCRE2_DOTALL option
+       (equivalent to Perl's /s) is set, thus allowing the dot to  match  new-
+       lines,  the  pattern  is  implicitly anchored, because whatever follows
+       will be tried against every character position in the  subject  string,
+       so  there  is  no  point  in retrying the overall match at any position
        after the first. PCRE2 normally treats such a pattern as though it were
        preceded by \A.
 
-       In cases where it is known that the subject  string  contains  no  new-
-       lines,  it  is worth setting PCRE2_DOTALL in order to obtain this opti-
+       In  cases  where  it  is known that the subject string contains no new-
+       lines, it is worth setting PCRE2_DOTALL in order to obtain  this  opti-
        mization, or alternatively, using ^ to indicate anchoring explicitly.
 
-       However, there are some cases where the optimization  cannot  be  used.
+       However,  there  are  some cases where the optimization cannot be used.
        When .*  is inside capturing parentheses that are the subject of a back
        reference elsewhere in the pattern, a match at the start may fail where
        a later one succeeds. Consider, for example:
 
          (.*)abc\1
 
-       If  the subject is "xyz123abc123" the match point is the fourth charac-
+       If the subject is "xyz123abc123" the match point is the fourth  charac-
        ter. For this reason, such a pattern is not implicitly anchored.
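
The match point in this example can be checked with a short sketch. It
assumes Python's re module rather than PCRE2; re.match is used here only as
a way of trying the pattern at the first position and nowhere else.

```python
import re

pattern = re.compile(r"(.*)abc\1")
subject = "xyz123abc123"

# An unanchored search has to move past "xyz" before the back reference
# can be satisfied.
found = pattern.search(subject)
match_start = found.start()   # 0-based offset of the match point
captured = found.group(1)

# Trying only at the start of the subject fails, which is why the pattern
# must not be treated as implicitly anchored.
anchored_attempt = pattern.match(subject)

print(match_start, captured, anchored_attempt)
```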
 
-       Another case where implicit anchoring is not applied is when the  lead-
-       ing  .* is inside an atomic group. Once again, a match at the start may
+       Another  case where implicit anchoring is not applied is when the lead-
+       ing .* is inside an atomic group. Once again, a match at the start  may
        fail where a later one succeeds. Consider this pattern:
 
          (?>.*?a)b
 
-       It matches "ab" in the subject "aab". The use of the backtracking  con-
-       trol  verbs  (*PRUNE)  and  (*SKIP) also disable this optimization, and
+       It  matches "ab" in the subject "aab". The use of the backtracking con-
+       trol verbs (*PRUNE) and (*SKIP) also  disables this  optimization,  and
        there is an option, PCRE2_NO_DOTSTAR_ANCHOR, to do so explicitly.
 
        When a capturing subpattern is repeated, the value captured is the sub-
@@ -7000,8 +7486,8 @@ REPETITION
          (tweedle[dume]{3}\s*)+
 
        has matched "tweedledum tweedledee" the value of the captured substring
-       is "tweedledee". However, if there are  nested  capturing  subpatterns,
-       the  corresponding captured values may have been set in previous itera-
+       is  "tweedledee".  However,  if there are nested capturing subpatterns,
+       the corresponding captured values may have been set in previous  itera-
        tions. For example, after
 
          (a|(b))+
@@ -7011,53 +7497,53 @@ REPETITION
 
 ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS
 
-       With both maximizing ("greedy") and minimizing ("ungreedy"  or  "lazy")
-       repetition,  failure  of what follows normally causes the repeated item
-       to be re-evaluated to see if a different number of repeats  allows  the
-       rest  of  the pattern to match. Sometimes it is useful to prevent this,
-       either to change the nature of the match, or to cause it  fail  earlier
-       than  it otherwise might, when the author of the pattern knows there is
+       With  both  maximizing ("greedy") and minimizing ("ungreedy" or "lazy")
+       repetition, failure of what follows normally causes the  repeated  item
+       to  be  re-evaluated to see if a different number of repeats allows the
+       rest of the pattern to match. Sometimes it is useful to  prevent  this,
+       either to change the nature of the match, or to cause it to fail earlier
+       than it otherwise might, when the author of the pattern knows there  is
        no point in carrying on.
 
-       Consider, for example, the pattern \d+foo when applied to  the  subject
+       Consider,  for  example, the pattern \d+foo when applied to the subject
        line
 
          123456bar
 
        After matching all 6 digits and then failing to match "foo", the normal
-       action of the matcher is to try again with only 5 digits  matching  the
-       \d+  item,  and  then  with  4,  and  so on, before ultimately failing.
-       "Atomic grouping" (a term taken from Jeffrey  Friedl's  book)  provides
-       the  means for specifying that once a subpattern has matched, it is not
+       action  of  the matcher is to try again with only 5 digits matching the
+       \d+ item, and then with  4,  and  so  on,  before  ultimately  failing.
+       "Atomic  grouping"  (a  term taken from Jeffrey Friedl's book) provides
+       the means for specifying that once a subpattern has matched, it is  not
        to be re-evaluated in this way.
 
-       If we use atomic grouping for the previous example, the  matcher  gives
-       up  immediately  on failing to match "foo" the first time. The notation
+       If  we  use atomic grouping for the previous example, the matcher gives
+       up immediately on failing to match "foo" the first time.  The  notation
        is a kind of special parenthesis, starting with (?> as in this example:
 
          (?>\d+)foo
 
-       This kind of parenthesis "locks up" the  part of the  pattern  it  con-
-       tains  once  it  has matched, and a failure further into the pattern is
-       prevented from backtracking into it. Backtracking past it  to  previous
+       This  kind  of  parenthesis "locks up" the  part of the pattern it con-
+       tains once it has matched, and a failure further into  the  pattern  is
+       prevented  from  backtracking into it. Backtracking past it to previous
        items, however, works as normal.
 
-       An  alternative  description  is that a subpattern of this type matches
-       exactly the string of characters that an identical  standalone  pattern
+       An alternative description is that a subpattern of  this  type  matches
+       exactly  the  string of characters that an identical standalone pattern
        would match, if anchored at the current point in the subject string.
 
        Atomic grouping subpatterns are not capturing subpatterns. Simple cases
        such as the above example can be thought of as a maximizing repeat that
-       must  swallow  everything  it can. So, while both \d+ and \d+? are pre-
-       pared to adjust the number of digits they match in order  to  make  the
+       must swallow everything it can. So, while both \d+ and  \d+?  are  pre-
+       pared  to  adjust  the number of digits they match in order to make the
        rest of the pattern match, (?>\d+) can only match an entire sequence of
        digits.
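
The "locking up" effect can be observed when the result of the match, and
not just its speed, depends on backtracking. The sketch below is an
illustration under stated assumptions, not the PCRE2 implementation: it uses
Python's re module, and it emulates the atomic group with a capturing
lookahead followed by a back reference, a substitute technique that works
because a lookahead is not re-entered once it has succeeded. Python 3.11 and
later also accept the (?>...) syntax directly.

```python
import re

subject = "1234"

# Ordinary repeat: \d+ first takes all four digits, then gives one back so
# that the final \d can match.
backtracking = re.search(r"\d+\d", subject)

# Emulated atomic group: the lookahead captures the whole digit run and \1
# consumes it. Nothing can be given back, so the final \d never matches.
emulated_atomic = re.search(r"(?=(\d+))\1\d", subject)

print(backtracking.group(0), emulated_atomic)
```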
 
-       Atomic groups in general can of course contain arbitrarily  complicated
-       subpatterns,  and  can  be  nested. However, when the subpattern for an
+       Atomic  groups in general can of course contain arbitrarily complicated
+       subpatterns, and can be nested. However, when  the  subpattern  for  an
        atomic group is just a single repeated item, as in the example above, a
-       simpler  notation,  called  a "possessive quantifier" can be used. This
-       consists of an additional + character  following  a  quantifier.  Using
+       simpler notation, called a "possessive quantifier" can  be  used.  This
+       consists  of  an  additional  + character following a quantifier. Using
        this notation, the previous example can be rewritten as
 
          \d++foo
@@ -7067,46 +7553,46 @@ ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS
 
          (abc|xyz){2,3}+
 
-       Possessive  quantifiers  are  always  greedy;  the   setting   of   the
-       PCRE2_UNGREEDY  option  is  ignored. They are a convenient notation for
-       the simpler forms of atomic group. However, there is no  difference  in
+       Possessive   quantifiers   are   always  greedy;  the  setting  of  the
+       PCRE2_UNGREEDY option is ignored. They are a  convenient  notation  for
+       the  simpler  forms of atomic group. However, there is no difference in
        the meaning of a possessive quantifier and the equivalent atomic group,
-       though there may be a performance  difference;  possessive  quantifiers
+       though  there  may  be a performance difference; possessive quantifiers
        should be slightly faster.
 
-       The  possessive  quantifier syntax is an extension to the Perl 5.8 syn-
-       tax.  Jeffrey Friedl originated the idea (and the name)  in  the  first
+       The possessive quantifier syntax is an extension to the Perl  5.8  syn-
+       tax.   Jeffrey  Friedl  originated the idea (and the name) in the first
        edition of his book. Mike McCloskey liked it, so implemented it when he
        built Sun's Java package, and PCRE1 copied it from there. It ultimately
        found its way into Perl at release 5.10.
 
-       PCRE2  has  an  optimization  that automatically "possessifies" certain
-       simple pattern constructs. For example, the sequence A+B is treated  as
-       A++B  because  there is no point in backtracking into a sequence of A's
+       PCRE2 has an optimization  that  automatically  "possessifies"  certain
+       simple  pattern constructs. For example, the sequence A+B is treated as
+       A++B because there is no point in backtracking into a sequence  of  A's
        when B must follow.  This feature can be disabled by the PCRE2_NO_AUTO-
        POSSESS option, or starting the pattern with (*NO_AUTO_POSSESS).
 
-       When  a  pattern  contains an unlimited repeat inside a subpattern that
-       can itself be repeated an unlimited number of  times,  the  use  of  an
-       atomic  group  is  the  only way to avoid some failing matches taking a
+       When a pattern contains an unlimited repeat inside  a  subpattern  that
+       can  itself  be  repeated  an  unlimited number of times, the use of an
+       atomic group is the only way to avoid some  failing  matches  taking  a
        very long time indeed. The pattern
 
          (\D+|<\d+>)*[!?]
 
-       matches an unlimited number of substrings that either consist  of  non-
-       digits,  or  digits  enclosed in <>, followed by either ! or ?. When it
+       matches  an  unlimited number of substrings that either consist of non-
+       digits, or digits enclosed in <>, followed by either ! or  ?.  When  it
        matches, it runs quickly. However, if it is applied to
 
          aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
-       it takes a long time before reporting  failure.  This  is  because  the
-       string  can be divided between the internal \D+ repeat and the external
-       * repeat in a large number of ways, and all  have  to  be  tried.  (The
-       example  uses  [!?]  rather than a single character at the end, because
-       both PCRE2 and Perl have an optimization that allows for  fast  failure
-       when  a single character is used. They remember the last single charac-
-       ter that is required for a match, and fail early if it is  not  present
-       in  the  string.)  If  the pattern is changed so that it uses an atomic
+       it  takes  a  long  time  before reporting failure. This is because the
+       string can be divided between the internal \D+ repeat and the  external
+       *  repeat  in  a  large  number of ways, and all have to be tried. (The
+       example uses [!?] rather than a single character at  the  end,  because
+       both  PCRE2  and Perl have an optimization that allows for fast failure
+       when a single character is used. They remember the last single  charac-
+       ter  that  is required for a match, and fail early if it is not present
+       in the string.) If the pattern is changed so that  it  uses  an  atomic
        group, like this:
 
          ((?>\D+)|<\d+>)*[!?]
@@ -7118,71 +7604,75 @@ BACK REFERENCES
 
        Outside a character class, a backslash followed by a digit greater than
        0 (and possibly further digits) is a back reference to a capturing sub-
-       pattern earlier (that is, to its left) in the pattern,  provided  there
+       pattern  earlier  (that is, to its left) in the pattern, provided there
        have been that many previous capturing left parentheses.
 
-       However,  if the decimal number following the backslash is less than 8,
-       it is always taken as a back reference, and causes  an  error  only  if
-       there  are  not that many capturing left parentheses in the entire pat-
-       tern. In other words, the parentheses that are referenced need  not  be
-       to  the  left of the reference for numbers less than 8. A "forward back
-       reference" of this type can make sense when a  repetition  is  involved
-       and  the  subpattern to the right has participated in an earlier itera-
+       However, if the decimal number following the backslash is less than  8,
+       it  is  always  taken  as a back reference, and causes an error only if
+       there are not that many capturing left parentheses in the  entire  pat-
+       tern.  In  other words, the parentheses that are referenced need not be
+       to the left of the reference for numbers less than 8. A  "forward  back
+       reference"  of  this  type can make sense when a repetition is involved
+       and the subpattern to the right has participated in an  earlier  itera-
        tion.
 
-       It is not possible to have a numerical "forward back  reference"  to  a
-       subpattern  whose  number  is  8  or  more  using this syntax because a
-       sequence such as \50 is interpreted as a character  defined  in  octal.
+       It  is  not  possible to have a numerical "forward back reference" to a
+       subpattern whose number is 8  or  more  using  this  syntax  because  a
+       sequence  such  as  \50 is interpreted as a character defined in octal.
        See the subsection entitled "Non-printing characters" above for further
-       details of the handling of digits following a backslash.  There  is  no
-       such  problem  when named parentheses are used. A back reference to any
+       details  of  the  handling of digits following a backslash. There is no
+       such problem when named parentheses are used. A back reference  to  any
        subpattern is possible using named parentheses (see below).
 
-       Another way of avoiding the ambiguity inherent in  the  use  of  digits
-       following  a  backslash  is  to use the \g escape sequence. This escape
-       must be followed by an unsigned number or a negative number, optionally
-       enclosed in braces. These examples are all identical:
+       Another  way  of  avoiding  the ambiguity inherent in the use of digits
+       following a backslash is to use the \g  escape  sequence.  This  escape
+       must be followed by a signed or unsigned number, optionally enclosed in
+       braces. These examples are all identical:
 
          (ring), \1
          (ring), \g1
          (ring), \g{1}
 
-       An  unsigned number specifies an absolute reference without the ambigu-
+       An unsigned number specifies an absolute reference without the  ambigu-
        ity that is present in the older syntax. It is also useful when literal
-       digits follow the reference. A negative number is a relative reference.
+       digits follow the reference. A signed number is a  relative  reference.
        Consider this example:
 
          (abc(def)ghi)\g{-1}
 
        The sequence \g{-1} is a reference to the most recently started captur-
        ing subpattern before \g, that is, it is equivalent to \2 in this exam-
-       ple.  Similarly, \g{-2} would be equivalent to \1. The use of  relative
-       references  can  be helpful in long patterns, and also in patterns that
-       are created by  joining  together  fragments  that  contain  references
+       ple.   Similarly, \g{-2} would be equivalent to \1. The use of relative
+       references can be helpful in long patterns, and also in  patterns  that
+       are  created  by  joining  together  fragments  that contain references
        within themselves.
 
-       A  back  reference matches whatever actually matched the capturing sub-
-       pattern in the current subject string, rather  than  anything  matching
+       The sequence \g{+1} is a reference to the  next  capturing  subpattern.
+       This  kind  of forward reference can be useful in patterns that repeat.
+       Perl does not support the use of + in this way.
+
+       A back reference matches whatever actually matched the  capturing  sub-
+       pattern  in  the  current subject string, rather than anything matching
        the subpattern itself (see "Subpatterns as subroutines" below for a way
        of doing that). So the pattern
 
          (sens|respons)e and \1ibility
 
-       matches "sense and sensibility" and "response and responsibility",  but
-       not  "sense and responsibility". If caseful matching is in force at the
-       time of the back reference, the case of letters is relevant. For  exam-
+       matches  "sense and sensibility" and "response and responsibility", but
+       not "sense and responsibility". If caseful matching is in force at  the
+       time  of the back reference, the case of letters is relevant. For exam-
        ple,
 
          ((?i)rah)\s+\1
 
-       matches  "rah  rah"  and  "RAH RAH", but not "RAH rah", even though the
+       matches "rah rah" and "RAH RAH", but not "RAH  rah",  even  though  the
        original capturing subpattern is matched caselessly.
 
-       There are several different ways of writing back  references  to  named
-       subpatterns.  The  .NET syntax \k{name} and the Perl syntax \k<name> or
-       \k'name' are supported, as is the Python syntax (?P=name). Perl  5.10's
+       There  are  several  different ways of writing back references to named
+       subpatterns. The .NET syntax \k{name} and the Perl syntax  \k<name>  or
+       \k'name'  are supported, as is the Python syntax (?P=name). Perl 5.10's
        unified back reference syntax, in which \g can be used for both numeric
-       and named references, is also supported. We  could  rewrite  the  above
+       and  named  references,  is  also supported. We could rewrite the above
        example in any of the following ways:
 
          (?<p1>(?i)rah)\s+\k<p1>
@@ -7190,86 +7680,96 @@ BACK REFERENCES
          (?P<p1>(?i)rah)\s+(?P=p1)
          (?<p1>(?i)rah)\s+\g{p1}
 
-       A  subpattern  that  is  referenced  by  name may appear in the pattern
+       A subpattern that is referenced by  name  may  appear  in  the  pattern
        before or after the reference.
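
Numbered and named back references can be compared with a small sketch. It
assumes Python's re module in place of PCRE2 and therefore uses only the
Python named-group syntax, which PCRE2 also accepts as noted above. The
group name "stem" is a hypothetical choice made for this illustration.

```python
import re

numbered = re.compile(r"(sens|respons)e and \1ibility")
named = re.compile(r"(?P<stem>sens|respons)e and (?P=stem)ibility")

phrases = [
    "sense and sensibility",
    "response and responsibility",
    "sense and responsibility",
]

# A back reference repeats the text that was actually captured, so the
# mixed phrase is rejected by both forms.
numbered_hits = [p for p in phrases if numbered.fullmatch(p)]
named_hits = [p for p in phrases if named.fullmatch(p)]

print(numbered_hits)
print(named_hits)
```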
 
-       There may be more than one back reference to the same subpattern. If  a
-       subpattern  has  not actually been used in a particular match, any back
+       There  may be more than one back reference to the same subpattern. If a
+       subpattern has not actually been used in a particular match,  any  back
        references to it always fail by default. For example, the pattern
 
          (a|(bc))\2
 
-       always fails if it starts to match "a" rather than  "bc".  However,  if
-       the  PCRE2_MATCH_UNSET_BACKREF  option  is  set at compile time, a back
+       always  fails  if  it starts to match "a" rather than "bc". However, if
+       the PCRE2_MATCH_UNSET_BACKREF option is set at  compile  time,  a  back
        reference to an unset value matches an empty string.
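
The default treatment of a back reference to an unset group can be sketched
as follows, assuming Python's re module as a stand-in for PCRE2. The sketch
shows only the default behaviour; it does not exercise the
PCRE2_MATCH_UNSET_BACKREF option.

```python
import re

pattern = re.compile(r"(a|(bc))\2")

# When the second alternative is taken, group 2 is set and \2 can match.
via_bc = pattern.fullmatch("bcbc")

# When "a" is taken, group 2 is unset and the back reference fails.
via_a = pattern.search("a")
via_a_then_bc = pattern.search("abc")

print(via_bc.group(0), via_a, via_a_then_bc)
```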
 
-       Because there may be many capturing parentheses in a pattern, all  dig-
-       its  following a backslash are taken as part of a potential back refer-
-       ence number.  If the pattern continues with  a  digit  character,  some
-       delimiter  must  be  used  to  terminate  the  back  reference.  If the
-       PCRE2_EXTENDED option is set, this can be white space.  Otherwise,  the
+       Because  there may be many capturing parentheses in a pattern, all dig-
+       its following a backslash are taken as part of a potential back  refer-
+       ence  number.   If  the  pattern continues with a digit character, some
+       delimiter must  be  used  to  terminate  the  back  reference.  If  the
+       PCRE2_EXTENDED  option  is set, this can be white space. Otherwise, the
        \g{ syntax or an empty comment (see "Comments" below) can be used.
 
    Recursive back references
 
-       A  back reference that occurs inside the parentheses to which it refers
-       fails when the subpattern is first used, so, for example,  (a\1)  never
-       matches.   However,  such references can be useful inside repeated sub-
+       A back reference that occurs inside the parentheses to which it  refers
+       fails  when  the subpattern is first used, so, for example, (a\1) never
+       matches.  However, such references can be useful inside  repeated  sub-
        patterns. For example, the pattern
 
          (a|b\1)+
 
        matches any number of "a"s and also "aba", "ababbaa" etc. At each iter-
-       ation  of  the  subpattern,  the  back  reference matches the character
-       string corresponding to the previous iteration. In order  for  this  to
-       work,  the  pattern must be such that the first iteration does not need
-       to match the back reference. This can be done using alternation, as  in
+       ation of the subpattern,  the  back  reference  matches  the  character
+       string  corresponding  to  the previous iteration. In order for this to
+       work, the pattern must be such that the first iteration does  not  need
+       to  match the back reference. This can be done using alternation, as in
        the example above, or by a quantifier with a minimum of zero.
 
-       Back  references of this type cause the group that they reference to be
-       treated as an atomic group.  Once the whole group has been  matched,  a
-       subsequent  matching  failure cannot cause backtracking into the middle
+       Back references of this type cause the group that they reference to  be
+       treated  as  an atomic group.  Once the whole group has been matched, a
+       subsequent matching failure cannot cause backtracking into  the  middle
        of the group.
 
 
 ASSERTIONS
 
-       An assertion is a test on the characters  following  or  preceding  the
+       An  assertion  is  a  test on the characters following or preceding the
        current matching point that does not consume any characters. The simple
-       assertions coded as \b, \B, \A, \G, \Z,  \z,  ^  and  $  are  described
+       assertions  coded  as  \b,  \B,  \A,  \G, \Z, \z, ^ and $ are described
        above.
 
-       More  complicated  assertions  are  coded as subpatterns. There are two
-       kinds: those that look ahead of the current  position  in  the  subject
-       string,  and  those  that  look  behind  it. An assertion subpattern is
-       matched in the normal way, except that it does not  cause  the  current
-       matching position to be changed.
-
-       Assertion  subpatterns are not capturing subpatterns. If such an asser-
-       tion contains capturing subpatterns within it, these  are  counted  for
-       the  purposes  of numbering the capturing subpatterns in the whole pat-
-       tern. However, substring capturing is carried  out  only  for  positive
-       assertions. (Perl sometimes, but not always, does do capturing in nega-
-       tive assertions.)
-
-       For  compatibility  with  Perl,  most  assertion  subpatterns  may   be
-       repeated;  though  it  makes  no sense to assert the same thing several
-       times, the side effect of capturing  parentheses  may  occasionally  be
-       useful.  However,  an  assertion  that forms the condition for a condi-
-       tional subpattern may not be quantified. In practice, for other  asser-
+       More complicated assertions are coded as  subpatterns.  There  are  two
+       kinds:  those  that  look  ahead of the current position in the subject
+       string, and those that look behind it, and in each  case  an  assertion
+       may  be  positive  (must  succeed for matching to continue) or negative
+       (must not succeed for matching to continue). An assertion subpattern is
+       matched  in the normal way, except that, when matching continues after-
+       wards, the matching position in the subject string is as it was at  the
+       start of the assertion.
+
+       Assertion  subpatterns  are  not capturing subpatterns. If an assertion
+       contains capturing subpatterns within it, these  are  counted  for  the
+       purposes  of  numbering the capturing subpatterns in the whole pattern.
+       However, substring capturing is carried out only  for  positive  asser-
+       tions that succeed, that is, one of their branches matches, so matching
+       continues after the assertion. If all branches of a positive  assertion
+       fail to match, nothing is captured, and control is passed to the previ-
+       ous backtracking point.
+
+       No capturing is done for a negative assertion unless it is  being  used
+       as  a condition in a conditional subpattern (see the discussion below).
+       Matching continues after a non-conditional negative assertion  only  if
+       all its branches fail to match.
+
+       For   compatibility  with  Perl,  most  assertion  subpatterns  may  be
+       repeated; though it makes no sense to assert  the  same  thing  several
+       times,  the  side  effect  of capturing parentheses may occasionally be
+       useful. However, an assertion that forms the  condition  for  a  condi-
+       tional  subpattern may not be quantified. In practice, for other asser-
        tions, there are only three cases:
 
-       (1)  If  the  quantifier  is  {0}, the assertion is never obeyed during
-       matching.  However, it may  contain  internal  capturing  parenthesized
+       (1) If the quantifier is {0}, the  assertion  is  never  obeyed  during
+       matching.   However,  it  may  contain internal capturing parenthesized
        groups that are called from elsewhere via the subroutine mechanism.
 
-       (2)  If quantifier is {0,n} where n is greater than zero, it is treated
-       as if it were {0,1}. At run time, the rest  of  the  pattern  match  is
+       (2) If the quantifier is {0,n} where n is greater than zero, it is treated
+       as  if  it  were  {0,1}.  At run time, the rest of the pattern match is
        tried with and without the assertion, the order depending on the greed-
        iness of the quantifier.
 
-       (3) If the minimum repetition is greater than zero, the  quantifier  is
-       ignored.   The  assertion  is  obeyed just once when encountered during
+       (3)  If  the minimum repetition is greater than zero, the quantifier is
+       ignored.  The assertion is obeyed just  once  when  encountered  during
        matching.
 
    Lookahead assertions
@@ -7279,38 +7779,38 @@ ASSERTIONS
 
          \w+(?=;)
 
-       matches  a word followed by a semicolon, but does not include the semi-
+       matches a word followed by a semicolon, but does not include the  semi-
        colon in the match, and
 
          foo(?!bar)
 
-       matches any occurrence of "foo" that is not  followed  by  "bar".  Note
+       matches  any  occurrence  of  "foo" that is not followed by "bar". Note
        that the apparently similar pattern
 
          (?!foo)bar
 
-       does  not  find  an  occurrence  of "bar" that is preceded by something
-       other than "foo"; it finds any occurrence of "bar" whatsoever,  because
+       does not find an occurrence of "bar"  that  is  preceded  by  something
+       other  than "foo"; it finds any occurrence of "bar" whatsoever, because
        the assertion (?!foo) is always true when the next three characters are
        "bar". A lookbehind assertion is needed to achieve the other effect.
 
        If you want to force a matching failure at some point in a pattern, the
-       most  convenient  way  to  do  it  is with (?!) because an empty string
-       always matches, so an assertion that requires there not to be an  empty
+       most convenient way to do it is  with  (?!)  because  an  empty  string
+       always  matches, so an assertion that requires there not to be an empty
        string must always fail.  The backtracking control verb (*FAIL) or (*F)
        is a synonym for (?!).
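
The lookahead examples above can be tried with a short script. This is only a sketch: it uses Python's built-in re module as a stand-in engine, not PCRE2 itself, on the assumption that the two agree for these simple assertions. The subject strings are made up for illustration.

```python
import re

subject = "foobar foobaz bar"

# foo(?!bar): "foo" not followed by "bar"; only the "foo" in "foobaz" qualifies.
not_followed = [m.start() for m in re.finditer(r"foo(?!bar)", subject)]

# (?!foo)bar: the assertion looks at the text where "bar" itself must begin,
# so it can never fail there and every "bar" is found.
misplaced = [m.start() for m in re.finditer(r"(?!foo)bar", subject)]

# (?<!foo)bar: the lookbehind form gives the "not preceded by foo" effect.
not_preceded = [m.start() for m in re.finditer(r"(?<!foo)bar", subject)]

# \w+(?=;): the semicolon is required but is not part of the match.
word = re.search(r"\w+(?=;)", "return value;").group()

# (?!) always fails, so any pattern that reaches it cannot match.
forced_failure = re.search(r"a(?!)", "aaa")

print(not_followed, misplaced, not_preceded, word, forced_failure)
```

The lists hold the offsets at which each pattern matches, which makes the difference between the three "foo"/"bar" patterns visible.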
 
    Lookbehind assertions
 
-       Lookbehind assertions start with (?<= for positive assertions and  (?<!
+       Lookbehind  assertions start with (?<= for positive assertions and (?<!
        for negative assertions. For example,
 
          (?<!foo)bar
 
-       does  find  an  occurrence  of "bar" that is not preceded by "foo". The
-       contents of a lookbehind assertion are restricted  such  that  all  the
+       does find an occurrence of "bar" that is not  preceded  by  "foo".  The
+       contents  of  a  lookbehind  assertion are restricted such that all the
        strings it matches must have a fixed length. However, if there are sev-
-       eral top-level alternatives, they do not all  have  to  have  the  same
+       eral  top-level  alternatives,  they  do  not all have to have the same
        fixed length. Thus
 
          (?<=bullock|donkey)
@@ -7319,62 +7819,74 @@ ASSERTIONS
 
          (?<!dogs?|cats?)
 
-       causes  an  error at compile time. Branches that match different length
-       strings are permitted only at the top level of a lookbehind  assertion.
+       causes an error at compile time. Branches that match  different  length
+       strings  are permitted only at the top level of a lookbehind assertion.
        This is an extension compared with Perl, which requires all branches to
        match the same length of string. An assertion such as
 
          (?<=ab(c|de))
 
-       is not permitted, because its single top-level  branch  can  match  two
-       different  lengths,  but  it is acceptable to PCRE2 if rewritten to use
+       is  not  permitted,  because  its single top-level branch can match two
+       different lengths, but it is acceptable to PCRE2 if  rewritten  to  use
        two top-level branches:
 
          (?<=abc|abde)
 
-       In some cases, the escape sequence \K (see above) can be  used  instead
+       In  some  cases, the escape sequence \K (see above) can be used instead
        of a lookbehind assertion to get round the fixed-length restriction.
 
-       The  implementation  of lookbehind assertions is, for each alternative,
-       to temporarily move the current position back by the fixed  length  and
+       The implementation of lookbehind assertions is, for  each  alternative,
+       to  temporarily  move the current position back by the fixed length and
        then try to match. If there are insufficient characters before the cur-
        rent position, the assertion fails.
 
-       In a UTF mode, PCRE2 does not allow the \C escape (which matches a sin-
-       gle  code  unit even in a UTF mode) to appear in lookbehind assertions,
-       because it makes it impossible to calculate the length of  the  lookbe-
-       hind.  The \X and \R escapes, which can match different numbers of code
-       units, are also not permitted.
-
-       "Subroutine" calls (see below) such as (?2) or (?&X) are  permitted  in
-       lookbehinds,  as  long as the subpattern matches a fixed-length string.
-       Recursion, however, is not supported.
-
-       Possessive quantifiers can  be  used  in  conjunction  with  lookbehind
+       In  UTF-8  and  UTF-16 modes, PCRE2 does not allow the \C escape (which
+       matches a single code unit even in a UTF mode) to appear in  lookbehind
+       assertions,  because  it makes it impossible to calculate the length of
+       the lookbehind. The \X and \R escapes, which can match  different  num-
+       bers of code units, are never permitted in lookbehinds.
+
+       "Subroutine"  calls  (see below) such as (?2) or (?&X) are permitted in
+       lookbehinds, as long as the subpattern matches a  fixed-length  string.
+       However,  recursion,  that is, a "subroutine" call into a group that is
+       already active, is not supported.
+
+       Perl does not support back references in lookbehinds. PCRE2  does  sup-
+       port   them,   but   only   if   certain   conditions   are   met.  The
+       PCRE2_MATCH_UNSET_BACKREF option must not be set, there must be no  use
+       of (?| in the pattern (it creates duplicate subpattern numbers), and if
+       the back reference is by name, the name must be unique. Of course,  the
+       referenced  subpattern  must  itself  be of fixed length. The following
+       pattern matches words containing at least two characters that begin and
+       end with the same character:
+
+          \b(\w)\w++(?<=\1)
+
+       Possessive  quantifiers  can  be  used  in  conjunction with lookbehind
        assertions to specify efficient matching of fixed-length strings at the
        end of subject strings. Consider a simple pattern such as
 
          abcd$
 
-       when applied to a long string that does  not  match.  Because  matching
-       proceeds  from  left to right, PCRE2 will look for each "a" in the sub-
-       ject and then see if what follows matches the rest of the  pattern.  If
+       when  applied  to  a  long string that does not match. Because matching
+       proceeds from left to right, PCRE2 will look for each "a" in  the  sub-
+       ject  and  then see if what follows matches the rest of the pattern. If
        the pattern is specified as
 
          ^.*abcd$
 
-       the  initial .* matches the entire string at first, but when this fails
+       the initial .* matches the entire string at first, but when this  fails
        (because there is no following "a"), it backtracks to match all but the
-       last  character,  then all but the last two characters, and so on. Once
-       again the search for "a" covers the entire string, from right to  left,
+       last character, then all but the last two characters, and so  on.  Once
+       again  the search for "a" covers the entire string, from right to left,
        so we are no better off. However, if the pattern is written as
 
          ^.*+(?<=abcd)
 
        there can be no backtracking for the .*+ item because of the possessive
        quantifier; it can match only the entire string. The subsequent lookbe-
-       hind  assertion  does  a single test on the last four characters. If it
-       fails, the match fails immediately. For  long  strings,  this  approach
+       hind assertion does a single test on the last four  characters.  If  it
+       fails,  the  match  fails  immediately. For long strings, this approach
        makes a significant difference to the processing time.
 
    Using multiple assertions
@@ -7383,18 +7895,18 @@ ASSERTIONS
 
          (?<=\d{3})(?<!999)foo
 
-       matches  "foo" preceded by three digits that are not "999". Notice that
-       each of the assertions is applied independently at the  same  point  in
-       the  subject  string.  First  there  is a check that the previous three
-       characters are all digits, and then there is  a  check  that  the  same
+       matches "foo" preceded by three digits that are not "999". Notice  that
+       each  of  the  assertions is applied independently at the same point in
+       the subject string. First there is a  check  that  the  previous  three
+       characters  are  all  digits,  and  then there is a check that the same
        three characters are not "999".  This pattern does not match "foo" pre-
-       ceded by six characters, the first of which are  digits  and  the  last
-       three  of  which  are not "999". For example, it doesn't match "123abc-
+       ceded by six characters, the first three of which are digits and the last
+       three of which are not "999". For example, it  doesn't  match  "123abc-
        foo". A pattern to do that is
 
          (?<=\d{3}...)(?<!999)foo
 
-       This time the first assertion looks at the  preceding  six  characters,
+       This  time  the  first assertion looks at the preceding six characters,
        checking that the first three are digits, and then the second assertion
        checks that the preceding three characters are not "999".
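
The difference between the two patterns above can be checked as follows. This is a sketch under stated assumptions: it runs on Python's re module rather than PCRE2, relying on both engines treating these fixed-length lookbehinds the same way. The helper name matches() is hypothetical.

```python
import re

# Both assertions are applied at the same point, just before "foo".
independent = re.compile(r"(?<=\d{3})(?<!999)foo")

# The first assertion now looks back six characters, the second only three.
six_back = re.compile(r"(?<=\d{3}...)(?<!999)foo")


def matches(pattern, subject):
    """Return True if the pattern matches anywhere in the subject."""
    return pattern.search(subject) is not None


results = {
    "independent/123foo": matches(independent, "123foo"),
    "independent/999foo": matches(independent, "999foo"),
    "independent/123abcfoo": matches(independent, "123abcfoo"),
    "six_back/123abcfoo": matches(six_back, "123abcfoo"),
    "six_back/123999foo": matches(six_back, "123999foo"),
    # Only three characters precede "foo", so the six-character
    # lookbehind has too little text and fails.
    "six_back/123foo": matches(six_back, "123foo"),
}
print(results)
```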
 
@@ -7402,29 +7914,29 @@ ASSERTIONS
 
          (?<=(?<!foo)bar)baz
 
-       matches an occurrence of "baz" that is preceded by "bar" which in  turn
+       matches  an occurrence of "baz" that is preceded by "bar" which in turn
        is not preceded by "foo", while
 
          (?<=\d{3}(?!999)...)foo
 
-       is  another pattern that matches "foo" preceded by three digits and any
+       is another pattern that matches "foo" preceded by three digits and  any
        three characters that are not "999".
 
 
 CONDITIONAL SUBPATTERNS
 
-       It is possible to cause the matching process to obey a subpattern  con-
-       ditionally  or to choose between two alternative subpatterns, depending
-       on the result of an assertion, or whether a specific capturing  subpat-
-       tern  has  already  been matched. The two possible forms of conditional
+       It  is possible to cause the matching process to obey a subpattern con-
+       ditionally or to choose between two alternative subpatterns,  depending
+       on  the result of an assertion, or whether a specific capturing subpat-
+       tern has already been matched. The two possible  forms  of  conditional
        subpattern are:
 
          (?(condition)yes-pattern)
          (?(condition)yes-pattern|no-pattern)
 
-       If the condition is satisfied, the yes-pattern is used;  otherwise  the
-       no-pattern  (if  present)  is used. If there are more than two alterna-
-       tives in the subpattern, a compile-time error occurs. Each of  the  two
+       If  the  condition is satisfied, the yes-pattern is used; otherwise the
+       no-pattern (if present) is used. If there are more  than  two  alterna-
+       tives  in  the subpattern, a compile-time error occurs. Each of the two
        alternatives may itself contain nested subpatterns of any form, includ-
        ing  conditional  subpatterns;  the  restriction  to  two  alternatives
        applies only at the level of the condition. This pattern fragment is an
@@ -7433,93 +7945,114 @@ CONDITIONAL SUBPATTERNS
          (?(1) (A|B|C) | (D | (?(2)E|F) | E) )
 
 
-       There are five kinds of condition: references  to  subpatterns,  refer-
-       ences  to  recursion,  two pseudo-conditions called DEFINE and VERSION,
+       There  are  five  kinds of condition: references to subpatterns, refer-
+       ences to recursion, two pseudo-conditions called  DEFINE  and  VERSION,
        and assertions.
 
    Checking for a used subpattern by number
 
-       If the text between the parentheses consists of a sequence  of  digits,
+       If  the  text between the parentheses consists of a sequence of digits,
        the condition is true if a capturing subpattern of that number has pre-
-       viously matched. If there is more than one  capturing  subpattern  with
-       the  same  number  (see  the earlier section about duplicate subpattern
-       numbers), the condition is true if any of them have matched. An  alter-
-       native  notation is to precede the digits with a plus or minus sign. In
-       this case, the subpattern number is relative rather than absolute.  The
-       most  recently opened parentheses can be referenced by (?(-1), the next
-       most recent by (?(-2), and so on. Inside loops it can also  make  sense
+       viously  matched.  If  there is more than one capturing subpattern with
+       the same number (see the earlier  section  about  duplicate  subpattern
+       numbers),  the condition is true if any of them have matched. An alter-
+       native notation is to precede the digits with a plus or minus sign.  In
+       this  case, the subpattern number is relative rather than absolute. The
+       most recently opened parentheses can be referenced by (?(-1), the  next
+       most  recent  by (?(-2), and so on. Inside loops it can also make sense
        to refer to subsequent groups. The next parentheses to be opened can be
-       referenced as (?(+1), and so on. (The value zero in any of these  forms
+       referenced  as (?(+1), and so on. (The value zero in any of these forms
        is not used; it provokes a compile-time error.)
 
-       Consider  the  following  pattern, which contains non-significant white
-       space to make it more readable (assume the PCRE2_EXTENDED  option)  and
+       Consider the following pattern, which  contains  non-significant  white
+       space  to  make it more readable (assume the PCRE2_EXTENDED option) and
        to divide it into three parts for ease of discussion:
 
          ( \( )?    [^()]+    (?(1) \) )
 
-       The  first  part  matches  an optional opening parenthesis, and if that
+       The first part matches an optional opening  parenthesis,  and  if  that
        character is present, sets it as the first captured substring. The sec-
-       ond  part  matches one or more characters that are not parentheses. The
-       third part is a conditional subpattern that tests whether  or  not  the
-       first  set  of  parentheses  matched.  If they did, that is, if subject
-       started with an opening parenthesis, the condition is true, and so  the
-       yes-pattern  is  executed and a closing parenthesis is required. Other-
-       wise, since no-pattern is not present, the subpattern matches  nothing.
-       In  other  words,  this  pattern matches a sequence of non-parentheses,
+       ond part matches one or more characters that are not  parentheses.  The
+       third  part  is  a conditional subpattern that tests whether or not the
+       first set of parentheses matched. If they did, that is, if the subject
+       started  with an opening parenthesis, the condition is true, and so the
+       yes-pattern is executed and a closing parenthesis is  required.  Other-
+       wise,  since no-pattern is not present, the subpattern matches nothing.
+       In other words, this pattern matches  a  sequence  of  non-parentheses,
        optionally enclosed in parentheses.
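
The behaviour of this conditional pattern can be sketched as follows. The code uses Python's re module, whose (?(1)...) syntax is assumed here to behave like PCRE2's for this case, with re.VERBOSE standing in for PCRE2_EXTENDED. It is an illustration, not a PCRE2 API example, and the test strings are made up.

```python
import re

# Same three parts as in the text: optional "(", non-parentheses,
# then ")" only if group 1 took part in the match.
pattern = re.compile(r"( \( )?    [^()]+    (?(1) \) )", re.VERBOSE)

# fullmatch() is used so that the whole subject has to be consumed.
enclosed = pattern.fullmatch("(abc)") is not None
bare = pattern.fullmatch("abc") is not None
missing_close = pattern.fullmatch("(abc") is not None
stray_close = pattern.fullmatch("abc)") is not None

print(enclosed, bare, missing_close, stray_close)
```

Requiring a full match is a choice made for this sketch. An unanchored search on "(abc" would still find "abc", because the opening parenthesis is optional.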
 
-       If you were embedding this pattern in a larger one,  you  could  use  a
+       If  you  were  embedding  this pattern in a larger one, you could use a
        relative reference:
 
          ...other stuff... ( \( )?    [^()]+    (?(-1) \) ) ...
 
-       This  makes  the  fragment independent of the parentheses in the larger
+       This makes the fragment independent of the parentheses  in  the  larger
        pattern.
 
    Checking for a used subpattern by name
 
-       Perl uses the syntax (?(<name>)...) or (?('name')...)  to  test  for  a
-       used  subpattern  by  name.  For compatibility with earlier versions of
-       PCRE1, which had this facility before Perl, the syntax (?(name)...)  is
-       also recognized.
+       Perl  uses  the  syntax  (?(<name>)...) or (?('name')...) to test for a
+       used subpattern by name. For compatibility  with  earlier  versions  of
+       PCRE1,  which had this facility before Perl, the syntax (?(name)...) is
+       also recognized. Note, however, that undelimited  names  consisting  of
+       the  letter  R followed by digits are ambiguous (see the following sec-
+       tion).
 
        Rewriting the above example to use a named subpattern gives this:
 
          (?<OPEN> \( )?    [^()]+    (?(<OPEN>) \) )
 
-       If  the  name used in a condition of this kind is a duplicate, the test
-       is applied to all subpatterns of the same name, and is true if any  one
+       If the name used in a condition of this kind is a duplicate,  the  test
+       is  applied to all subpatterns of the same name, and is true if any one
        of them has matched.
 
    Checking for pattern recursion
 
-       If the condition is the string (R), and there is no subpattern with the
-       name R, the condition is true if a recursive call to the whole  pattern
-       or any subpattern has been made. If digits or a name preceded by amper-
-       sand follow the letter R, for example:
+       "Recursion" in this sense refers to any subroutine-like call  from  one
+       part  of  the  pattern to another, whether or not it is actually recur-
+       sive. See the sections entitled "Recursive patterns"  and  "Subpatterns
+       as subroutines" below for details of recursion and subpattern calls.
 
-         (?(R3)...) or (?(R&name)...)
+       If  a  condition is the string (R), and there is no subpattern with the
+       name R, the condition is true if matching is currently in  a  recursion
+       or  subroutine  call  to the whole pattern or any subpattern. If digits
+       follow the letter R, and there is no subpattern  with  that  name,  the
+       condition is true if the most recent call is into a subpattern with the
+       given number, which must exist somewhere in the overall  pattern.  This
+       is a contrived example that is equivalent to a+b:
+
+         ((?(R1)a+|(?1)b))
+
+       However,  in both cases, if there is a subpattern with a matching name,
+       the condition tests for its being set,  as  described  in  the  section
+       above,  instead of testing for recursion. For example, creating a group
+       with the name R1 by adding (?<R1>)  to  the  above  pattern  completely
+       changes its meaning.
+
+       If a name preceded by ampersand follows the letter R, for example:
+
+         (?(R&name)...)
 
        the condition is true if the most recent recursion is into a subpattern
-       whose number or name is given. This condition does not check the entire
-       recursion stack. If the name used in a condition  of  this  kind  is  a
+       of that name (which must exist within the pattern).
+
+       This condition does not check the entire recursion stack. It tests only
+       the  current  level.  If the name used in a condition of this kind is a
        duplicate, the test is applied to all subpatterns of the same name, and
        is true if any one of them is the most recent recursion.
 
-       At "top level", all these recursion test  conditions  are  false.   The
-       syntax for recursive patterns is described below.
+       At "top level", all these recursion test conditions are false.
 
    Defining subpatterns for use by reference only
 
-       If  the  condition  is  the string (DEFINE), and there is no subpattern
-       with the name DEFINE, the condition is  always  false.  In  this  case,
-       there  may  be  only  one  alternative  in the subpattern. It is always
-       skipped if control reaches this point  in  the  pattern;  the  idea  of
-       DEFINE  is that it can be used to define subroutines that can be refer-
-       enced from elsewhere. (The use of subroutines is described below.)  For
-       example,  a  pattern  to match an IPv4 address such as "192.168.23.245"
-       could be written like this (ignore white space and line breaks):
+       If the condition is the string (DEFINE), the condition is always false,
+       even if there is a group with the name DEFINE. In this case, there  may
+       be only one alternative in the subpattern. It is always skipped if con-
+       trol reaches this point in the pattern; the idea of DEFINE is  that  it
+       can  be  used  to  define subroutines that can be referenced from else-
+       where. (The use of subroutines is described below.) For example, a pat-
+       tern to match an IPv4 address such as "192.168.23.245" could be written
+       like this (ignore white space and line breaks):
 
          (?(DEFINE) (?<byte> 2[0-4]\d | 25[0-5] | 1\d\d | [1-9]?\d) )
          \b (?&byte) (\.(?&byte)){3} \b
@@ -7566,48 +8099,55 @@ CONDITIONAL SUBPATTERNS
        strings in one of the two forms dd-aaa-dd or dd-dd-dd,  where  aaa  are
        letters and dd are digits.
 
+       When  an  assertion that is a condition contains capturing subpatterns,
+       any capturing that occurs in a matching branch is retained  afterwards,
+       for both positive and negative assertions, because matching always con-
+       tinues after the assertion, whether it succeeds or fails. (Compare non-
+       conditional  assertions,  when  captures are retained only for positive
+       assertions that succeed.)
+
 
 COMMENTS
 
        There are two ways of including comments in patterns that are processed
-       by PCRE2. In both cases, the start of the comment  must  not  be  in  a
-       character  class,  nor  in  the middle of any other sequence of related
-       characters such as (?: or a subpattern name or number.  The  characters
+       by  PCRE2.  In  both  cases,  the start of the comment must not be in a
+       character class, nor in the middle of any  other  sequence  of  related
+       characters  such  as (?: or a subpattern name or number. The characters
        that make up a comment play no part in the pattern matching.
 
-       The  sequence (?# marks the start of a comment that continues up to the
-       next closing parenthesis. Nested parentheses are not permitted. If  the
-       PCRE2_EXTENDED  option is set, an unescaped # character also introduces
-       a comment, which in this case continues to immediately after  the  next
-       newline  character  or character sequence in the pattern. Which charac-
-       ters are interpreted as newlines is controlled by an option  passed  to
-       the  compiling  function  or  by a special sequence at the start of the
-       pattern, as described in the  section  entitled  "Newline  conventions"
-       above.  Note  that the end of this type of comment is a literal newline
-       sequence in the pattern; escape sequences that happen  to  represent  a
-       newline   do  not  count.  For  example,  consider  this  pattern  when
-       PCRE2_EXTENDED is set, and the default  newline  convention  (a  single
+       The sequence (?# marks the start of a comment that continues up to  the
+       next  closing parenthesis. Nested parentheses are not permitted. If the
+       PCRE2_EXTENDED option is set, an unescaped # character also  introduces
+       a  comment,  which in this case continues to immediately after the next
+       newline character or character sequence in the pattern.  Which  charac-
+       ters  are  interpreted as newlines is controlled by an option passed to
+       the compiling function or by a special sequence at  the  start  of  the
+       pattern,  as  described  in  the section entitled "Newline conventions"
+       above. Note that the end of this type of comment is a  literal  newline
+       sequence  in  the  pattern; escape sequences that happen to represent a
+       newline  do  not  count.  For  example,  consider  this  pattern   when
+       PCRE2_EXTENDED  is  set,  and  the default newline convention (a single
        linefeed character) is in force:
 
          abc #comment \n still comment
 
-       On  encountering  the # character, pcre2_compile() skips along, looking
-       for a newline in the pattern. The sequence \n is still literal at  this
-       stage,  so  it does not terminate the comment. Only an actual character
+       On encountering the # character, pcre2_compile() skips  along,  looking
+       for  a newline in the pattern. The sequence \n is still literal at this
+       stage, so it does not terminate the comment. Only an  actual  character
        with the code value 0x0a (the default newline) does so.
 
 
 RECURSIVE PATTERNS
 
-       Consider the problem of matching a string in parentheses, allowing  for
-       unlimited  nested  parentheses.  Without the use of recursion, the best
-       that can be done is to use a pattern that  matches  up  to  some  fixed
-       depth  of  nesting.  It  is not possible to handle an arbitrary nesting
+       Consider  the problem of matching a string in parentheses, allowing for
+       unlimited nested parentheses. Without the use of  recursion,  the  best
+       that  can  be  done  is  to use a pattern that matches up to some fixed
+       depth of nesting. It is not possible to  handle  an  arbitrary  nesting
        depth.
 
        For some time, Perl has provided a facility that allows regular expres-
-       sions  to recurse (amongst other things). It does this by interpolating
-       Perl code in the expression at run time, and the code can refer to  the
+       sions to recurse (amongst other things). It does this by  interpolating
+       Perl  code in the expression at run time, and the code can refer to the
        expression itself. A Perl pattern using code interpolation to solve the
        parentheses problem can be created like this:
 
@@ -7617,206 +8157,171 @@ RECURSIVE PATTERNS
        refers recursively to the pattern in which it appears.
 
        Obviously,  PCRE2  cannot  support  the  interpolation  of  Perl  code.
-       Instead, it supports special syntax for recursion of  the  entire  pat-
+       Instead,  it  supports  special syntax for recursion of the entire pat-
        tern, and also for individual subpattern recursion. After its introduc-
-       tion in PCRE1 and Python,  this  kind  of  recursion  was  subsequently
+       tion  in  PCRE1  and  Python,  this  kind of recursion was subsequently
        introduced into Perl at release 5.10.
 
-       A  special  item  that consists of (? followed by a number greater than
-       zero and a closing parenthesis is a recursive subroutine  call  of  the
-       subpattern  of  the  given  number, provided that it occurs inside that
-       subpattern. (If not, it is a non-recursive subroutine  call,  which  is
-       described  in  the  next  section.)  The special item (?R) or (?0) is a
+       A special item that consists of (? followed by a  number  greater  than
+       zero  and  a  closing parenthesis is a recursive subroutine call of the
+       subpattern of the given number, provided that  it  occurs  inside  that
+       subpattern.  (If  not,  it is a non-recursive subroutine call, which is
+       described in the next section.) The special item  (?R)  or  (?0)  is  a
        recursive call of the entire regular expression.
 
-       This PCRE2 pattern solves the nested parentheses  problem  (assume  the
+       This  PCRE2  pattern  solves the nested parentheses problem (assume the
        PCRE2_EXTENDED option is set so that white space is ignored):
 
          \( ( [^()]++ | (?R) )* \)
 
-       First  it matches an opening parenthesis. Then it matches any number of
-       substrings which can either be a  sequence  of  non-parentheses,  or  a
-       recursive  match  of the pattern itself (that is, a correctly parenthe-
+       First it matches an opening parenthesis. Then it matches any number  of
+       substrings  which  can  either  be  a sequence of non-parentheses, or a
+       recursive match of the pattern itself (that is, a  correctly  parenthe-
        sized substring).  Finally there is a closing parenthesis. Note the use
        of a possessive quantifier to avoid backtracking into sequences of non-
        parentheses.
 
-       If this were part of a larger pattern, you would not  want  to  recurse
+       If  this  were  part of a larger pattern, you would not want to recurse
        the entire pattern, so instead you could use this:
 
          ( \( ( [^()]++ | (?1) )* \) )
 
-       We  have  put the pattern into parentheses, and caused the recursion to
+       We have put the pattern into parentheses, and caused the  recursion  to
        refer to them instead of the whole pattern.
 
-       In a larger pattern,  keeping  track  of  parenthesis  numbers  can  be
-       tricky.  This is made easier by the use of relative references. Instead
+       In  a  larger  pattern,  keeping  track  of  parenthesis numbers can be
+       tricky. This is made easier by the use of relative references.  Instead
        of (?1) in the pattern above you can write (?-2) to refer to the second
-       most  recently  opened  parentheses  preceding  the recursion. In other
-       words, a negative number counts capturing  parentheses  leftwards  from
+       most recently opened parentheses  preceding  the  recursion.  In  other
+       words,  a  negative  number counts capturing parentheses leftwards from
        the point at which it is encountered.
 
        Be aware, however, that if duplicate subpattern numbers are in use, rel-
-       ative references refer to the earliest subpattern with the  appropriate
+       ative  references refer to the earliest subpattern with the appropriate
        number. Consider, for example:
 
          (?|(a)|(b)) (c) (?-2)
 
-       The  first  two  capturing  groups (a) and (b) are both numbered 1, and
-       group (c) is number 2. When the reference  (?-2)  is  encountered,  the
+       The first two capturing groups (a) and (b) are  both  numbered  1,  and
+       group  (c)  is  number  2. When the reference (?-2) is encountered, the
        second most recently opened parentheses has the number 1, but it is the
-       first such group (the (a) group) to which the  recursion  refers.  This
-       would  be  the  same  if  an absolute reference (?1) was used. In other
-       words, relative references are just a shorthand for computing  a  group
+       first  such  group  (the (a) group) to which the recursion refers. This
+       would be the same if an absolute reference  (?1)  was  used.  In  other
+       words,  relative  references are just a shorthand for computing a group
        number.
 
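        The arithmetic behind a relative reference can be sketched in  Python.
        This is only an illustration: the function name and its "opened" param-
        eter (assumed to be the capturing group count at the point  where  the
        reference appears) are hypothetical and are not part of the PCRE2 API.

```python
def resolve_relative(opened, offset):
    """Return the absolute group number for a relative reference.

    opened: assumed capturing group count at the point of the reference
    offset: signed value from the reference, e.g. -2 for (?-2), 2 for (?+2)
    """
    if offset == 0:
        raise ValueError("a relative reference needs a non-zero offset")
    # Negative offsets count leftwards from the reference; positive ones
    # count groups that have not been opened yet.
    target = opened + offset + 1 if offset < 0 else opened + offset
    if target < 1:
        raise ValueError("reference to a non-existent group")
    return target

# ( \( ( [^()]++ | (?-2) )* \) ): two groups are open, so (?-2) means (?1).
print(resolve_relative(2, -2))
```

        The same call covers the duplicate-number example: after (?|(a)|(b)) (c)
        the count is 2, so (?-2) again resolves to group 1.
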
-       It  is  also  possible  to refer to subsequently opened parentheses, by
-       writing references such as (?+2). However, these  cannot  be  recursive
-       because  the  reference  is  not inside the parentheses that are refer-
-       enced. They are always non-recursive subroutine calls, as described  in
+       It is also possible to refer to  subsequently  opened  parentheses,  by
+       writing  references  such  as (?+2). However, these cannot be recursive
+       because the reference is not inside the  parentheses  that  are  refer-
+       enced.  They are always non-recursive subroutine calls, as described in
        the next section.
 
-       An  alternative  approach  is to use named parentheses. The Perl syntax
-       for this is (?&name); PCRE1's earlier syntax  (?P>name)  is  also  sup-
+       An alternative approach is to use named parentheses.  The  Perl  syntax
+       for  this  is  (?&name);  PCRE1's earlier syntax (?P>name) is also sup-
        ported. We could rewrite the above example as follows:
 
          (?<pn> \( ( [^()]++ | (?&pn) )* \) )
 
-       If  there  is more than one subpattern with the same name, the earliest
+       If there is more than one subpattern with the same name,  the  earliest
        one is used.
 
        The example pattern that we have been looking at contains nested unlim-
-       ited  repeats,  and  so the use of a possessive quantifier for matching
-       strings of non-parentheses is important when applying  the  pattern  to
+       ited repeats, and so the use of a possessive  quantifier  for  matching
+       strings  of  non-parentheses  is important when applying the pattern to
        strings that do not match. For example, when this pattern is applied to
 
          (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()
 
-       it  yields  "no  match" quickly. However, if a possessive quantifier is
-       not used, the match runs for a very long time indeed because there  are
-       so  many  different  ways the + and * repeats can carve up the subject,
+       it yields "no match" quickly. However, if a  possessive  quantifier  is
+       not  used, the match runs for a very long time indeed because there are
+       so many different ways the + and * repeats can carve  up  the  subject,
        and all have to be tested before failure can be reported.
 
-       At the end of a match, the values of capturing  parentheses  are  those
-       from  the outermost level. If you want to obtain intermediate values, a
+       At  the  end  of a match, the values of capturing parentheses are those
+       from the outermost level. If you want to obtain intermediate values,  a
        callout function can be used (see below and the pcre2callout documenta-
        tion). If the pattern above is matched against
 
          (ab(cd)ef)
 
-       the  value  for  the  inner capturing parentheses (numbered 2) is "ef",
-       which is the last value taken on at the top level. If a capturing  sub-
-       pattern  is  not  matched at the top level, its final captured value is
-       unset, even if it was (temporarily) set at a deeper  level  during  the
+       the value for the inner capturing parentheses  (numbered  2)  is  "ef",
+       which  is the last value taken on at the top level. If a capturing sub-
+       pattern is not matched at the top level, its final  captured  value  is
+       unset,  even  if  it was (temporarily) set at a deeper level during the
        matching process.
 
        If there are more than 15 capturing parentheses in a pattern, PCRE2 has
-       to obtain extra memory from the heap to store data during a  recursion.
-       If   no   memory   can   be   obtained,   the   match  fails  with  the
+       to  obtain extra memory from the heap to store data during a recursion.
+       If  no  memory  can   be   obtained,   the   match   fails   with   the
        PCRE2_ERROR_NOMEMORY error.
 
-       Do not confuse the (?R) item with the condition (R),  which  tests  for
-       recursion.   Consider  this pattern, which matches text in angle brack-
-       ets, allowing for arbitrary nesting. Only digits are allowed in  nested
-       brackets  (that is, when recursing), whereas any characters are permit-
+       Do  not  confuse  the (?R) item with the condition (R), which tests for
+       recursion.  Consider this pattern, which matches text in  angle  brack-
+       ets,  allowing for arbitrary nesting. Only digits are allowed in nested
+       brackets (that is, when recursing), whereas any characters are  permit-
        ted at the outer level.
 
          < (?: (?(R) \d++  | [^<>]*+) | (?R)) * >
 
-       In this pattern, (?(R) is the start of a conditional  subpattern,  with
-       two  different  alternatives for the recursive and non-recursive cases.
+       In  this  pattern, (?(R) is the start of a conditional subpattern, with
+       two different alternatives for the recursive and  non-recursive  cases.
        The (?R) item is the actual recursive call.
 
    Differences in recursion processing between PCRE2 and Perl
 
-       Recursion processing in PCRE2 differs from Perl in two important  ways.
-       In PCRE2 (like Python, but unlike Perl), a recursive subpattern call is
-       always treated as an atomic group. That is, once it has matched some of
-       the subject string, it is never re-entered, even if it contains untried
-       alternatives and there is a subsequent matching failure.  This  can  be
-       illustrated  by the following pattern, which purports to match a palin-
-       dromic string that contains an odd number of characters  (for  example,
-       "a", "aba", "abcba", "abcdcba"):
-
-         ^(.|(.)(?1)\2)$
-
-       The idea is that it either matches a single character, or two identical
-       characters surrounding a sub-palindrome. In Perl, this  pattern  works;
-       in  PCRE2  it  does not if the pattern is longer than three characters.
-       Consider the subject string "abcba":
-
-       At the top level, the first character is matched, but as it is  not  at
-       the end of the string, the first alternative fails; the second alterna-
-       tive is taken and the recursion kicks in. The recursive call to subpat-
-       tern  1  successfully  matches the next character ("b"). (Note that the
-       beginning and end of line tests are not part of the recursion).
-
-       Back at the top level, the next character ("c") is compared  with  what
-       subpattern  2 matched, which was "a". This fails. Because the recursion
-       is treated as an atomic group, there are now  no  backtracking  points,
-       and  so  the  entire  match fails. (Perl is able, at this point, to re-
-       enter the recursion and try the second alternative.)  However,  if  the
-       pattern is written with the alternatives in the other order, things are
-       different:
+       Some former differences between PCRE2 and Perl no longer exist.
 
-         ^((.)(?1)\2|.)$
+       Before  release 10.30, recursion processing in PCRE2 differed from Perl
+       in that a recursive subpattern call was always  treated  as  an  atomic
+       group.  That is, once it had matched some of the subject string, it was
+       never re-entered, even if it contained untried alternatives  and  there
+       was  a  subsequent matching failure. (Historical note: PCRE implemented
+       recursion before Perl did.)
 
-       This time, the recursing alternative is tried first, and  continues  to
-       recurse  until  it runs out of characters, at which point the recursion
-       fails. But this time we do have  another  alternative  to  try  at  the
-       higher  level.  That  is  the  big difference: in the previous case the
-       remaining alternative is at a deeper recursion level, which PCRE2  can-
-       not use.
+       Starting with release 10.30, recursive subroutine calls are  no  longer
+       treated as atomic. That is, they can be re-entered to try unused alter-
+       natives if there is a matching failure later in the  pattern.  This  is
+       now  compatible  with the way Perl works. If you want a subroutine call
+       to be atomic, you must explicitly enclose it in an atomic group.
 
-       To  change  the pattern so that it matches all palindromic strings, not
-       just those with an odd number of characters, it is tempting  to  change
-       the pattern to this:
+       Supporting backtracking into recursions  simplifies  certain  types  of
+       recursive  pattern.  For  example,  this  pattern  matches  palindromic
+       strings:
 
          ^((.)(?1)\2|.?)$
 
-       Again,  this  works in Perl, but not in PCRE2, and for the same reason.
-       When a deeper recursion has matched a single character,  it  cannot  be
-       entered  again  in  order  to match an empty string. The solution is to
-       separate the two cases, and write out the odd and even cases as  alter-
-       natives at the higher level:
-
-         ^(?:((.)(?1)\2|)|((.)(?3)\4|.))
-
-       If  you  want  to match typical palindromic phrases, the pattern has to
-       ignore all non-word characters, which can be done like this:
+       The second branch in the group matches a single  central  character  in
+       the  palindrome  when  there is an odd number of characters, or nothing
+       when there is an even number of characters, but in order to work it has
+       to  be  able to try the second case when the rest of the pattern fails
+       to match. If you want to match typical palindromic phrases,  the  pat-
+       tern  has  to  ignore  all  non-word characters, which can be done like
+       this:
 
-         ^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$
+         ^\W*+((.)\W*+(?1)\W*+\2|\W*+.?)\W*+$
 
        If run with the PCRE2_CASELESS option,  this  pattern  matches  phrases
-       such  as  "A  man, a plan, a canal: Panama!" and it works in both PCRE2
-       and Perl. Note the use of the possessive quantifier *+ to  avoid  back-
-       tracking  into  sequences  of  non-word characters. Without this, PCRE2
-       takes a great deal longer (ten times or more) to match typical phrases,
-       and Perl takes so long that you think it has gone into a loop.
-
-       WARNING:  The  palindrome-matching patterns above work only if the sub-
-       ject string does not start with a palindrome that is shorter  than  the
-       entire  string.  For example, although "abcba" is correctly matched, if
-       the subject is "ababa", PCRE2 finds the palindrome "aba" at the  start,
-       then  fails at top level because the end of the string does not follow.
-       Once again, it cannot jump back into the recursion to try other  alter-
-       natives, so the entire match fails.
-
-       The  second  way in which PCRE2 and Perl differ in their recursion pro-
-       cessing is in the handling of captured values. In Perl, when a  subpat-
-       tern  is  called recursively or as a subpattern (see the next section),
-       it has no access to any values that were captured  outside  the  recur-
-       sion,  whereas  in  PCRE2 these values can be referenced. Consider this
-       pattern:
+       such  as "A man, a plan, a canal: Panama!". Note the use of the posses-
+       sive quantifier *+ to avoid backtracking  into  sequences  of  non-word
+       characters. Without this, PCRE2 takes a great deal longer (ten times or
+       more) to match typical phrases, and Perl takes so long that  you  think
+       it has gone into a loop.
+
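        To see why backtracking into the recursion  matters,  the  pattern
        ^((.)(?1)\2|.?)$ can be imitated by a small Python generator. This is a
        hand-written sketch, not PCRE2's matching algorithm: the function names
        are hypothetical, and the "atomic" flag only approximates the pre-10.30
        behaviour by keeping the first way in which each recursive call matches.

```python
from itertools import islice

def group1(s, i, atomic=False):
    """Yield each offset where group 1 of ^((.)(?1)\\2|.?)$ can end."""
    if i < len(s):
        c = s[i]                          # (.) captures one character
        inner = group1(s, i + 1, atomic)  # (?1) recursive call
        if atomic:
            # Emulate an atomic call: keep only its first way of matching.
            inner = islice(inner, 1)
        for j in inner:
            if j < len(s) and s[j] == c:  # \2 must repeat the capture
                yield j + 1
        yield i + 1                       # second branch: .? takes a character
    yield i                               # second branch: .? matches nothing

def is_palindrome(s, atomic=False):
    # The ^ and $ anchors: group 1 must span the whole subject.
    return any(j == len(s) for j in group1(s, 0, atomic))

print(is_palindrome("abba"), is_palindrome("abba", atomic=True))
```

        With backtracking allowed, "abba" is accepted; when each recursive call
        is limited to its first result, the empty alternative is  never  tried
        at the centre and the sketch reports no match.
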
+       Another  way  in which PCRE2 and Perl used to differ in their recursion
+       processing was in the handling of captured values.  Formerly  in  Perl,
+       when  a  subpattern  was called recursively or as a subroutine (see the
+       next section), it had no access to any values that were  captured  out-
+       side  the  recursion,  whereas in PCRE2 these values can be referenced.
+       Consider this pattern:
 
          ^(.)(\1|a(?2))
 
-       In PCRE2, this pattern matches "bab". The first  capturing  parentheses
-       match  "b",  then in the second group, when the back reference \1 fails
-       to match "b", the second alternative matches "a" and then recurses.  In
-       the  recursion,  \1 does now match "b" and so the whole match succeeds.
-       In Perl, the pattern fails to match because inside the  recursive  call
-       \1 cannot access the externally set value.
+       This pattern matches "bab". The first capturing parentheses match  "b",
+       then  in  the  second  group, when the back reference \1 fails to match
+       "b", the second alternative matches  "a"  and  then  recurses.  In  the
+       recursion,  \1 does now match "b" and so the whole match succeeds. This
+       match used to fail in Perl, but in later versions (I  tried  5.024)  it
+       now works.
 
 
 SUBPATTERNS AS SUBROUTINES
@@ -7844,12 +8349,10 @@ SUBPATTERNS AS SUBROUTINES
        two  strings.  Another  example  is  given  in the discussion of DEFINE
        above.
 
-       All subroutine calls, whether recursive or not, are always  treated  as
-       atomic  groups. That is, once a subroutine has matched some of the sub-
-       ject string, it is never re-entered, even if it contains untried alter-
-       natives  and  there  is  a  subsequent  matching failure. Any capturing
-       parentheses that are set during the subroutine  call  revert  to  their
-       previous values afterwards.
+       Like recursions, subroutine calls used to be  treated  as  atomic,  but
+       this  changed  at  PCRE2 release 10.30, so backtracking into subroutine
+       calls can now occur. However, any capturing parentheses  that  are  set
+       during the subroutine call revert to their previous values afterwards.
 
        Processing  options  such as case-independence are fixed when a subpat-
        tern is defined, so if it is used as a subroutine, such options  cannot
@@ -7956,43 +8459,46 @@ CALLOUTS
 
 BACKTRACKING CONTROL
 
-       Perl 5.10 introduced a number of "Special Backtracking Control  Verbs",
-       which  are  still  described in the Perl documentation as "experimental
-       and subject to change or removal in a future version of Perl". It  goes
-       on  to  say:  "Their  usage in production code should be noted to avoid
-       problems during upgrades." The same remarks apply to the PCRE2 features
-       described in this section.
-
-       The  new verbs make use of what was previously invalid syntax: an open-
-       ing parenthesis followed by an asterisk. They are generally of the form
-       (*VERB) or (*VERB:NAME). Some verbs take either form, possibly behaving
-       differently depending on whether or not a name is present.
+       There are a number of special  "Backtracking  Control  Verbs"  (to  use
+       Perl's  terminology)  that  modify the behaviour of backtracking during
+       matching. They are generally of the form (*VERB) or (*VERB:NAME).  Some
+       verbs  take  either  form,  possibly  behaving differently depending on
+       whether or not a name is present.
 
        By default, for compatibility with Perl, a  name  is  any  sequence  of
        characters that does not include a closing parenthesis. The name is not
        processed in any way, and it is  not  possible  to  include  a  closing
-       parenthesis in the name.  However, if the PCRE2_ALT_VERBNAMES option is
-       set, normal backslash processing is applied to verb names and  only  an
-       unescaped  closing parenthesis terminates the name. A closing parenthe-
-       sis can be included in a name either as \) or between \Q and \E. If the
-       PCRE2_EXTENDED  option  is  set,  unescaped whitespace in verb names is
-       skipped and #-comments are recognized, exactly as in the  rest  of  the
-       pattern.
-
-       The  maximum  length of a name is 255 in the 8-bit library and 65535 in
-       the 16-bit and 32-bit libraries. If the name is empty, that is, if  the
-       closing  parenthesis immediately follows the colon, the effect is as if
+       parenthesis   in  the  name.   This  can  be  changed  by  setting  the
+       PCRE2_ALT_VERBNAMES option, but the result is no  longer  Perl-compati-
+       ble.
+
+       When  PCRE2_ALT_VERBNAMES  is  set,  backslash processing is applied to
+       verb names and only an unescaped  closing  parenthesis  terminates  the
+       name.  However, the only backslash items that are permitted are \Q, \E,
+       and sequences such as \x{100} that define character code points.  Char-
+       acter type escapes such as \d are faulted.
+
+       A closing parenthesis can be included in a name either as \) or between
+       \Q and \E. In addition to backslash processing, if  the  PCRE2_EXTENDED
+       option  is also set, unescaped whitespace in verb names is skipped, and
+       #-comments are recognized, exactly as  in  the  rest  of  the  pattern.
+       PCRE2_EXTENDED does not affect verb names unless PCRE2_ALT_VERBNAMES is
+       also set.
+
+       The maximum length of a name is 255 in the 8-bit library and  65535  in
+       the  16-bit and 32-bit libraries. If the name is empty, that is, if the
+       closing parenthesis immediately follows the colon, the effect is as  if
        the colon were not there. Any number of these verbs may occur in a pat-
        tern.
 
-       Since  these  verbs  are  specifically related to backtracking, most of
-       them can be used only when the pattern is to be matched using the  tra-
-       ditional matching function, because these use a backtracking algorithm.
-       With the exception of (*FAIL), which behaves like  a  failing  negative
+       Since these verbs are specifically related  to  backtracking,  most  of
+       them  can be used only when the pattern is to be matched using the tra-
+       ditional matching function, because that uses a backtracking algorithm.
+       With  the  exception  of (*FAIL), which behaves like a failing negative
        assertion, the backtracking control verbs cause an error if encountered
        by the DFA matching function.
 
-       The behaviour of these verbs in repeated  groups,  assertions,  and  in
+       The  behaviour  of  these  verbs in repeated groups, assertions, and in
        subpatterns called as subroutines (whether or not recursively) is docu-
        mented below.
 
@@ -8000,71 +8506,71 @@ BACKTRACKING CONTROL
 
        PCRE2 contains some optimizations that are used to speed up matching by
        running some checks at the start of each match attempt. For example, it
-       may know the minimum length of matching subject, or that  a  particular
+       may  know  the minimum length of matching subject, or that a particular
        character must be present. When one of these optimizations bypasses the
-       running of a match,  any  included  backtracking  verbs  will  not,  of
+       running  of  a  match,  any  included  backtracking  verbs will not, of
        course, be processed. You can suppress the start-of-match optimizations
-       by setting the PCRE2_NO_START_OPTIMIZE option when  calling  pcre2_com-
-       pile(),  or by starting the pattern with (*NO_START_OPT). There is more
+       by  setting  the PCRE2_NO_START_OPTIMIZE option when calling pcre2_com-
+       pile(), or by starting the pattern with (*NO_START_OPT). There is  more
        discussion of this option in the section entitled "Compiling a pattern"
        in the pcre2api documentation.
 
-       Experiments  with  Perl  suggest that it too has similar optimizations,
+       Experiments with Perl suggest that it too  has  similar  optimizations,
        sometimes leading to anomalous results.
 
    Verbs that act immediately
 
-       The following verbs act as soon as they are encountered. They  may  not
+       The  following  verbs act as soon as they are encountered. They may not
        be followed by a name.
 
           (*ACCEPT)
 
-       This  verb causes the match to end successfully, skipping the remainder
-       of the pattern. However, when it is inside a subpattern that is  called
-       as  a  subroutine, only that subpattern is ended successfully. Matching
+       This verb causes the match to end successfully, skipping the  remainder
+       of  the pattern. However, when it is inside a subpattern that is called
+       as a subroutine, only that subpattern is ended  successfully.  Matching
        then continues at the outer level. If (*ACCEPT) is triggered in a posi-
-       tive  assertion,  the  assertion succeeds; in a negative assertion, the
+       tive assertion, the assertion succeeds; in a  negative  assertion,  the
        assertion fails.
 
-       If (*ACCEPT) is inside capturing parentheses, the data so far  is  cap-
+       If  (*ACCEPT)  is inside capturing parentheses, the data so far is cap-
        tured. For example:
 
          A((?:A|B(*ACCEPT)|C)D)
 
-       This  matches  "AB", "AAD", or "ACD"; when it matches "AB", "B" is cap-
+       This matches "AB", "AAD", or "ACD"; when it matches "AB", "B"  is  cap-
        tured by the outer parentheses.
 
          (*FAIL) or (*F)
 
-       This verb causes a matching failure, forcing backtracking to occur.  It
-       is  equivalent to (?!) but easier to read. The Perl documentation notes
-       that it is probably useful only when combined  with  (?{})  or  (??{}).
-       Those  are, of course, Perl features that are not present in PCRE2. The
-       nearest equivalent is the callout feature, as for example in this  pat-
+       This  verb causes a matching failure, forcing backtracking to occur. It
+       is equivalent to (?!) but easier to read. The Perl documentation  notes
+       that  it  is  probably  useful only when combined with (?{}) or (??{}).
+       Those are, of course, Perl features that are not present in PCRE2.  The
+       nearest  equivalent is the callout feature, as for example in this pat-
        tern:
 
          a+(?C)(*FAIL)
 
-       A  match  with the string "aaaa" always fails, but the callout is taken
+       A match with the string "aaaa" always fails, but the callout  is  taken
        before each backtrack happens (in this example, 10 times).
 
    Recording which path was taken
 
-       There is one verb whose main purpose  is  to  track  how  a  match  was
-       arrived  at,  though  it  also  has a secondary use in conjunction with
+       There  is  one  verb  whose  main  purpose  is to track how a match was
+       arrived at, though it also has a  secondary  use  in  conjunction  with
        advancing the match starting point (see (*SKIP) below).
 
          (*MARK:NAME) or (*:NAME)
 
-       A name is always  required  with  this  verb.  There  may  be  as  many
-       instances  of  (*MARK) as you like in a pattern, and their names do not
+       A  name  is  always  required  with  this  verb.  There  may be as many
+       instances of (*MARK) as you like in a pattern, and their names  do  not
        have to be unique.
 
-       When a match succeeds, the name of the  last-encountered  (*MARK:NAME),
-       (*PRUNE:NAME),  or  (*THEN:NAME) on the matching path is passed back to
-       the caller as described in  the  section  entitled  "Other  information
-       about  the  match" in the pcre2api documentation. Here is an example of
-       pcre2test output, where the "mark" modifier requests the retrieval  and
+       When  a  match succeeds, the name of the last-encountered (*MARK:NAME),
+       (*PRUNE:NAME), or (*THEN:NAME) on the matching path is passed  back  to
+       the  caller  as  described  in  the section entitled "Other information
+       about the match" in the pcre2api documentation. Here is an  example  of
+       pcre2test  output, where the "mark" modifier requests the retrieval and
        outputting of (*MARK) data:
 
            re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
@@ -8076,72 +8582,72 @@ BACKTRACKING CONTROL
          MK: B
 
        The (*MARK) name is tagged with "MK:" in this output, and in this exam-
-       ple it indicates which of the two alternatives matched. This is a  more
-       efficient  way of obtaining this information than putting each alterna-
+       ple  it indicates which of the two alternatives matched. This is a more
+       efficient way of obtaining this information than putting each  alterna-
        tive in its own capturing parentheses.
 
-       If a verb with a name is encountered in a positive  assertion  that  is
-       true,  the  name  is recorded and passed back if it is the last-encoun-
+       If  a  verb  with a name is encountered in a positive assertion that is
+       true, the name is recorded and passed back if it  is  the  last-encoun-
        tered. This does not happen for negative assertions or failing positive
        assertions.
 
-       After  a  partial match or a failed match, the last encountered name in
+       After a partial match or a failed match, the last encountered  name  in
        the entire match process is returned. For example:
 
            re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
          data> XP
          No match, mark = B
 
-       Note that in this unanchored example the  mark  is  retained  from  the
+       Note  that  in  this  unanchored  example the mark is retained from the
        match attempt that started at the letter "X" in the subject. Subsequent
        match attempts starting at "P" and then with an empty string do not get
        as far as the (*MARK) item, but nevertheless do not reset it.
 
-       If  you  are  interested  in  (*MARK)  values after failed matches, you
-       should probably set the PCRE2_NO_START_OPTIMIZE option (see  above)  to
+       If you are interested in  (*MARK)  values  after  failed  matches,  you
+       should  probably  set the PCRE2_NO_START_OPTIMIZE option (see above) to
        ensure that the match is always attempted.
 
    Verbs that act after backtracking
 
        The following verbs do nothing when they are encountered. Matching con-
-       tinues with what follows, but if there is no subsequent match,  causing
-       a  backtrack  to  the  verb, a failure is forced. That is, backtracking
-       cannot pass to the left of the verb. However, when one of  these  verbs
-       appears inside an atomic group (which includes any group that is called
-       as a subroutine) or in an assertion that is true, its  effect  is  con-
-       fined  to that group, because once the group has been matched, there is
-       never any backtracking into it. In this situation, backtracking has  to
-       jump to the left of the entire atomic group or assertion.
-
-       These  verbs  differ  in exactly what kind of failure occurs when back-
-       tracking reaches them. The behaviour described below  is  what  happens
-       when  the  verb is not in a subroutine or an assertion. Subsequent sec-
+       tinues  with what follows, but if there is no subsequent match, causing
+       a backtrack to the verb, a failure is  forced.  That  is,  backtracking
+       cannot  pass  to the left of the verb. However, when one of these verbs
+       appears inside an atomic group or in an assertion  that  is  true,  its
+       effect  is  confined  to  that  group,  because once the group has been
+       matched, there is never any backtracking into it.  In  this  situation,
+       backtracking  has  to  jump  to  the left of the entire atomic group or
+       assertion.
+
+       These verbs differ in exactly what kind of failure  occurs  when  back-
+       tracking  reaches  them.  The behaviour described below is what happens
+       when the verb is not in a subroutine or an assertion.  Subsequent  sec-
        tions cover these special cases.
 
          (*COMMIT)
 
-       This verb, which may not be followed by a name, causes the whole  match
+       This  verb, which may not be followed by a name, causes the whole match
        to fail outright if there is a later matching failure that causes back-
-       tracking to reach it. Even if the pattern  is  unanchored,  no  further
+       tracking  to  reach  it.  Even if the pattern is unanchored, no further
        attempts to find a match by advancing the starting point take place. If
-       (*COMMIT) is the only backtracking verb that is  encountered,  once  it
-       has  been  passed  pcre2_match() is committed to finding a match at the
+       (*COMMIT)  is  the  only backtracking verb that is encountered, once it
+       has been passed pcre2_match() is committed to finding a  match  at  the
        current starting point, or not at all. For example:
 
          a+(*COMMIT)b
 
-       This matches "xxaab" but not "aacaab". It can be thought of as  a  kind
+       This  matches  "xxaab" but not "aacaab". It can be thought of as a kind
        of dynamic anchor, or "I've started, so I must finish." The name of the
-       most recently passed (*MARK) in the path is passed back when  (*COMMIT)
+       most  recently passed (*MARK) in the path is passed back when (*COMMIT)
        forces a match failure.
 
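        The effect of (*COMMIT) on the "bumpalong" loop can be imitated by hand
        for the pattern a+(*COMMIT)b. The Python function below is a simplified
        sketch with a hypothetical name; it ignores start-of-match  optimiza-
        tions and assumes that (*COMMIT) is the only backtracking verb present.

```python
def match_commit(subject):
    """Emulate a+(*COMMIT)b; return (start, end) or None."""
    for start in range(len(subject)):
        end = start
        while end < len(subject) and subject[end] == "a":
            end += 1                      # greedy a+
        if end == start:
            continue                      # a+ failed before (*COMMIT): bump along
        # (*COMMIT) has now been passed for this starting point.
        if end < len(subject) and subject[end] == "b":
            return (start, end + 1)
        return None                       # backtracking reached (*COMMIT)
    return None

print(match_commit("xxaab"), match_commit("aacaab"))
```

        For "xxaab" the first two starting points fail before the verb is  met,
        so the search moves on and finds "aab". For "aacaab" the verb is passed
        at the first starting point, so the later failure ends the whole match.
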
-       If  there  is more than one backtracking verb in a pattern, a different
-       one that follows (*COMMIT) may be triggered first,  so  merely  passing
+       If there is more than one backtracking verb in a pattern,  a  different
+       one  that  follows  (*COMMIT) may be triggered first, so merely passing
        (*COMMIT) during a match does not always guarantee that a match must be
        at this starting point.
 
-       Note that (*COMMIT) at the start of a pattern is not  the  same  as  an
-       anchor,  unless PCRE2's start-of-match optimizations are turned off, as
+       Note  that  (*COMMIT)  at  the start of a pattern is not the same as an
+       anchor, unless PCRE2's start-of-match optimizations are turned off,  as
        shown in this output from pcre2test:
 
            re> /(*COMMIT)abc/
@@ -8152,33 +8658,32 @@ BACKTRACKING CONTROL
          data> xyzabc
          No match
 
-       For the first pattern, PCRE2 knows that any match must start with  "a",
-       so  the optimization skips along the subject to "a" before applying the
-       pattern to the first set of data. The match attempt then succeeds.  The
-       second  pattern disables the optimization that skips along to the first
-       character. The pattern is now applied  starting  at  "x",  and  so  the
-       (*COMMIT)  causes  the  match to fail without trying any other starting
+       For  the first pattern, PCRE2 knows that any match must start with "a",
+       so the optimization skips along the subject to "a" before applying  the
+       pattern  to the first set of data. The match attempt then succeeds. The
+       second pattern disables the optimization that skips along to the  first
+       character.  The  pattern  is  now  applied  starting at "x", and so the
+       (*COMMIT) causes the match to fail without trying  any  other  starting
        points.
 
          (*PRUNE) or (*PRUNE:NAME)
 
-       This verb causes the match to fail at the current starting position  in
+       This  verb causes the match to fail at the current starting position in
        the subject if there is a later matching failure that causes backtrack-
-       ing to reach it. If the pattern is unanchored, the  normal  "bumpalong"
-       advance  to  the next starting character then happens. Backtracking can
-       occur as usual to the left of (*PRUNE), before it is reached,  or  when
-       matching  to  the  right  of  (*PRUNE), but if there is no match to the
-       right, backtracking cannot cross (*PRUNE). In simple cases, the use  of
-       (*PRUNE)  is just an alternative to an atomic group or possessive quan-
+       ing  to  reach it. If the pattern is unanchored, the normal "bumpalong"
+       advance to the next starting character then happens.  Backtracking  can
+       occur  as  usual to the left of (*PRUNE), before it is reached, or when
+       matching to the right of (*PRUNE), but if there  is  no  match  to  the
+       right,  backtracking cannot cross (*PRUNE). In simple cases, the use of
+       (*PRUNE) is just an alternative to an atomic group or possessive  quan-
        tifier, but there are some uses of (*PRUNE) that cannot be expressed in
-       any  other  way. In an anchored pattern (*PRUNE) has the same effect as
+       any other way. In an anchored pattern (*PRUNE) has the same  effect  as
        (*COMMIT).
 
-       The   behaviour   of   (*PRUNE:NAME)   is   the   not   the   same   as
-       (*MARK:NAME)(*PRUNE).   It  is  like  (*MARK:NAME)  in that the name is
-       remembered for  passing  back  to  the  caller.  However,  (*SKIP:NAME)
-       searches  only  for  names  set  with  (*MARK),  ignoring  those set by
-       (*PRUNE) or (*THEN).
+       The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE).
+       It is like (*MARK:NAME) in that the name is remembered for passing back
+       to  the  caller. However, (*SKIP:NAME) searches only for names set with
+       (*MARK), ignoring those set by (*PRUNE) or (*THEN).
 
          (*SKIP)
 
@@ -8311,50 +8816,55 @@ BACKTRACKING CONTROL
 
    Backtracking verbs in assertions
 
-       (*FAIL)  in  an assertion has its normal effect: it forces an immediate
-       backtrack.
+       (*FAIL)  in any assertion has its normal effect: it forces an immediate
+       backtrack. The behaviour of the other  backtracking  verbs  depends  on
+       whether  the  assertion  is  standalone  or  acting  as the condition
+       in a conditional subpattern.
 
-       (*ACCEPT) in a positive assertion causes the assertion to succeed with-
-       out  any  further processing. In a negative assertion, (*ACCEPT) causes
-       the assertion to fail without any further processing.
+       (*ACCEPT) in a standalone positive assertion causes  the  assertion  to
+       succeed  without any further processing; captured strings are retained.
+       In a standalone negative assertion, (*ACCEPT) causes the  assertion  to
+       fail without any further processing; captured substrings are discarded.
 
-       The other backtracking verbs are not treated specially if  they  appear
-       in  a  positive  assertion.  In  particular,  (*THEN) skips to the next
-       alternative in the innermost enclosing  group  that  has  alternations,
-       whether or not this is within the assertion.
+       If  the  assertion is a condition, (*ACCEPT) causes the condition to be
+       true for a positive assertion and false for a  negative  one;  captured
+       substrings are retained in both cases.
 
-       Negative  assertions  are,  however, different, in order to ensure that
-       changing a positive assertion into a  negative  assertion  changes  its
-       result. Backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes a neg-
-       ative assertion to be true, without considering any further alternative
-       branches in the assertion.  Backtracking into (*THEN) causes it to skip
-       to the next enclosing alternative within the assertion (the normal  be-
-       haviour),  but  if  the  assertion  does  not have such an alternative,
-       (*THEN) behaves like (*PRUNE).
+       The  effect of (*THEN) is not allowed to escape beyond an assertion. If
+       there are no more branches to try, (*THEN) causes a positive  assertion
+       to be false, and a negative assertion to be true.
+
+       The  other  backtracking verbs are not treated specially if they appear
+       in a standalone positive assertion. In a  conditional  positive  asser-
+       tion, backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the con-
+       dition to be false. However, for both standalone and conditional  nega-
+       tive  assertions,  backtracking  into  (*COMMIT),  (*SKIP), or (*PRUNE)
+       causes the assertion to be true, without considering any further alter-
+       native branches.
 
    Backtracking verbs in subroutines
 
-       These behaviours occur whether or not the subpattern is  called  recur-
+       These  behaviours  occur whether or not the subpattern is called recur-
        sively.  Perl's treatment of subroutines is different in some cases.
 
-       (*FAIL)  in  a subpattern called as a subroutine has its normal effect:
+       (*FAIL) in a subpattern called as a subroutine has its  normal  effect:
        it forces an immediate backtrack.
 
-       (*ACCEPT) in a subpattern called as a subroutine causes the  subroutine
-       match  to succeed without any further processing. Matching then contin-
+       (*ACCEPT)  in a subpattern called as a subroutine causes the subroutine
+       match to succeed without any further processing. Matching then  contin-
        ues after the subroutine call.
 
        (*COMMIT), (*SKIP), and (*PRUNE) in a subpattern called as a subroutine
        cause the subroutine match to fail.
 
-       (*THEN)  skips to the next alternative in the innermost enclosing group
-       within the subpattern that has alternatives. If there is no such  group
+       (*THEN) skips to the next alternative in the innermost enclosing  group
+       within  the subpattern that has alternatives. If there is no such group
        within the subpattern, (*THEN) causes the subroutine match to fail.
 
 
 SEE ALSO
 
-       pcre2api(3),    pcre2callout(3),    pcre2matching(3),   pcre2syntax(3),
+       pcre2api(3),   pcre2callout(3),    pcre2matching(3),    pcre2syntax(3),
        pcre2(3).
 
 
@@ -8367,8 +8877,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 20 June 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 12 September 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -8389,11 +8899,12 @@ PCRE2 PERFORMANCE
 COMPILED PATTERN MEMORY USAGE
 
        Patterns are compiled by PCRE2 into a reasonably efficient interpretive
-       code, so that most simple patterns do not  use  much  memory.  However,
-       there  is  one case where the memory usage of a compiled pattern can be
-       unexpectedly large. If a parenthesized subpattern has a quantifier with
-       a minimum greater than 1 and/or a limited maximum, the whole subpattern
-       is repeated in the compiled code. For example, the pattern
+       code, so that most simple patterns do not use much memory  for  storing
+       the compiled version. However, there is one case where the memory usage
+       of a compiled pattern can be unexpectedly  large.  If  a  parenthesized
+       subpattern has a quantifier with a minimum greater than 1 and/or a lim-
+       ited maximum, the whole subpattern is repeated in  the  compiled  code.
+       For example, the pattern
 
          (abc|def){2,4}
 
@@ -8401,134 +8912,188 @@ COMPILED PATTERN MEMORY USAGE
 
          (abc|def)(abc|def)((abc|def)(abc|def)?)?
 
-       (Technical aside: It is done this way so that backtrack  points  within
+       (Technical  aside:  It is done this way so that backtrack points within
        each of the repetitions can be independently maintained.)
 
-       For  regular expressions whose quantifiers use only small numbers, this
-       is not usually a problem. However, if the numbers are large,  and  par-
-       ticularly  if  such repetitions are nested, the memory usage can become
+       For regular expressions whose quantifiers use only small numbers,  this
+       is  not  usually a problem. However, if the numbers are large, and par-
+       ticularly if such repetitions are nested, the memory usage  can  become
        an embarrassment. For example, the very simple pattern
 
          ((ab){1,1000}c){1,3}
 
-       uses 51K bytes when compiled using the 8-bit  library.  When  PCRE2  is
-       compiled  with its default internal pointer size of two bytes, the size
-       limit on a compiled pattern is 64K code units in the 8-bit  and  16-bit
-       libraries, and this is reached with the above pattern if the outer rep-
-       etition is increased from 3 to 4. PCRE2 can be compiled to  use  larger
-       internal  pointers  and thus handle larger compiled patterns, but it is
-       better to try to rewrite your pattern to use less memory if you can.
+       uses  over  50K bytes when compiled using the 8-bit library. When PCRE2
+       is compiled with its default internal pointer size of  two  bytes,  the
+       size  limit  on  a  compiled pattern is 64K code units in the 8-bit and
+       16-bit libraries, and this is reached with the  above  pattern  if  the
+       outer repetition is increased from 3 to 4. PCRE2 can be compiled to use
+       larger internal pointers and thus handle larger compiled patterns,  but
+       it  is  better to try to rewrite your pattern to use less memory if you
+       can.
 
        One way of reducing the memory usage for such patterns is to  make  use
        of PCRE2's "subroutine" facility. Re-writing the above pattern as
 
          ((ab)(?2){0,999}c)(?1){0,2}
 
-       reduces the memory requirements to 18K, and indeed it remains under 20K
-       even with the outer repetition increased to 100. However, this  pattern
-       is  not  exactly equivalent, because the "subroutine" calls are treated
-       as atomic groups into which there can be no backtracking if there is  a
-       subsequent  matching  failure.  Therefore, PCRE2 cannot do this kind of
-       rewriting automatically.  Furthermore, there is a  noticeable  loss  of
-       speed  when executing the modified pattern. Nevertheless, if the atomic
-       grouping is not a problem and the loss of  speed  is  acceptable,  this
-       kind  of rewriting will allow you to process patterns that PCRE2 cannot
-       otherwise handle.
-
-
-STACK USAGE AT RUN TIME
-
-       When pcre2_match() is used for matching, certain kinds of  pattern  can
-       cause  it  to  use large amounts of the process stack. In some environ-
-       ments the default process stack is quite small, and if it runs out  the
-       result  is  often  SIGSEGV.  Rewriting your pattern can often help. The
-       pcre2stack documentation discusses this issue in detail.
+       reduces  the  memory  requirements to around 16K, and indeed it remains
+       under 20K even with the outer repetition  increased  to  100.  However,
+       this kind of pattern is not always exactly equivalent, because any cap-
+       tures within subroutine calls are lost when the  subroutine  completes.
+       If  this  is  not  a  problem, this kind of rewriting will allow you to
+       process patterns that PCRE2 cannot otherwise handle. The matching  per-
+       formance  of  the two different versions of the pattern is roughly  the
+       same. (This applies from release 10.30 - things were different in  ear-
+       lier releases.)
+
+
+STACK AND HEAP USAGE AT RUN TIME
+
+       From release 10.30, the interpretive (non-JIT) version of pcre2_match()
+       uses very little system stack at run time. In earlier  releases  recur-
+       sive  function  calls  could  use a great deal of stack, and this could
+       cause problems, but this usage has been eliminated. Backtracking  posi-
+       tions  are now explicitly remembered in memory frames controlled by the
+       code. An initial 20K vector of frames is allocated on the system  stack
+       (enough for about 100 frames for small patterns), but if this is insuf-
+       ficient, heap memory is used. The amount of heap memory can be limited;
+       if  the  limit  is  set to zero, only the initial stack vector is used.
+       Rewriting patterns to be time-efficient, as described below,  may  also
+       reduce the memory requirements.
+
+       In  contrast  to  pcre2_match(),  pcre2_dfa_match()  does use recursive
+       function calls, but  only  for  processing  atomic  groups,  lookaround
+       assertions, and recursion within the pattern. Too much nested recursion
+       may cause stack issues. The "match depth"  parameter  can  be  used  to
+       limit the depth of function recursion in pcre2_dfa_match().
 
 
 PROCESSING TIME
 
-       Certain items in regular expression patterns are processed  more  effi-
+       Certain  items  in regular expression patterns are processed more effi-
        ciently than others. It is more efficient to use a character class like
-       [aeiou]  than  a  set  of   single-character   alternatives   such   as
-       (a|e|i|o|u).  In  general,  the simplest construction that provides the
+       [aeiou]   than   a   set   of  single-character  alternatives  such  as
+       (a|e|i|o|u). In general, the simplest construction  that  provides  the
        required behaviour is usually the most efficient. Jeffrey Friedl's book
-       contains  a  lot  of useful general discussion about optimizing regular
-       expressions for efficient performance. This  document  contains  a  few
+       contains a lot of useful general discussion  about  optimizing  regular
+       expressions  for  efficient  performance.  This document contains a few
        observations about PCRE2.
 
-       Using  Unicode  character  properties  (the  \p, \P, and \X escapes) is
-       slow, because PCRE2 has to use a multi-stage table lookup  whenever  it
-       needs  a  character's  property. If you can find an alternative pattern
+       Using Unicode character properties (the \p,  \P,  and  \X  escapes)  is
+       slow,  because  PCRE2 has to use a multi-stage table lookup whenever it
+       needs a character's property. If you can find  an  alternative  pattern
        that does not use character properties, it will probably be faster.
 
-       By default, the escape sequences \b, \d, \s,  and  \w,  and  the  POSIX
-       character  classes  such  as  [:alpha:]  do not use Unicode properties,
+       By  default,  the  escape  sequences  \b, \d, \s, and \w, and the POSIX
+       character classes such as [:alpha:]  do  not  use  Unicode  properties,
        partly for backwards compatibility, and partly for performance reasons.
-       However,  you  can  set  the PCRE2_UCP option or start the pattern with
-       (*UCP) if you want Unicode character properties to be  used.  This  can
-       double  the  matching  time  for  items  such  as \d, when matched with
-       pcre2_match(); the performance loss is less with a DFA  matching  func-
+       However, you can set the PCRE2_UCP option or  start  the  pattern  with
+       (*UCP)  if  you  want Unicode character properties to be used. This can
+       double the matching time for  items  such  as  \d,  when  matched  with
+       pcre2_match();  the  performance loss is less with a DFA matching func-
        tion, and in both cases there is not much difference for \b.
 
-       When  a pattern begins with .* not in atomic parentheses, nor in paren-
-       theses that are the subject of a backreference,  and  the  PCRE2_DOTALL
-       option  is  set,  the pattern is implicitly anchored by PCRE2, since it
-       can match only at the start of a subject string.  If  the  pattern  has
+       When a pattern begins with .* not in atomic parentheses, nor in  paren-
+       theses  that  are  the subject of a backreference, and the PCRE2_DOTALL
+       option is set, the pattern is implicitly anchored by  PCRE2,  since  it
+       can  match  only  at  the start of a subject string. If the pattern has
        multiple top-level branches, they must all be anchorable. The optimiza-
-       tion can be disabled by  the  PCRE2_NO_DOTSTAR_ANCHOR  option,  and  is
+       tion  can  be  disabled  by  the PCRE2_NO_DOTSTAR_ANCHOR option, and is
        automatically disabled if the pattern contains (*PRUNE) or (*SKIP).
 
-       If  PCRE2_DOTALL  is  not  set,  PCRE2  cannot  make this optimization,
+       If PCRE2_DOTALL is  not  set,  PCRE2  cannot  make  this  optimization,
        because the dot metacharacter does not then match a newline, and if the
-       subject  string contains newlines, the pattern may match from the char-
+       subject string contains newlines, the pattern may match from the  char-
        acter immediately following one of them instead of from the very start.
        For example, the pattern
 
          .*second
 
-       matches  the subject "first\nand second" (where \n stands for a newline
-       character), with the match starting at the seventh character. In  order
-       to  do  this, PCRE2 has to retry the match starting after every newline
+       matches the subject "first\nand second" (where \n stands for a  newline
+       character),  with the match starting at the seventh character. In order
+       to do this, PCRE2 has to retry the match starting after  every  newline
        in the subject.
 
-       If you are using such a pattern with subject strings that do  not  con-
-       tain   newlines,   the   best   performance   is  obtained  by  setting
-       PCRE2_DOTALL, or starting the pattern with  ^.*  or  ^.*?  to  indicate
+       If  you  are using such a pattern with subject strings that do not con-
+       tain  newlines,  the  best   performance   is   obtained   by   setting
+       PCRE2_DOTALL,  or  starting  the  pattern  with ^.* or ^.*? to indicate
        explicit anchoring. That saves PCRE2 from having to scan along the sub-
        ject looking for a newline to restart at.
 
-       Beware of patterns that contain nested indefinite  repeats.  These  can
-       take  a  long time to run when applied to a string that does not match.
+       Beware  of  patterns  that contain nested indefinite repeats. These can
+       take a long time to run when applied to a string that does  not  match.
        Consider the pattern fragment
 
          ^(a+)*
 
-       This can match "aaaa" in 16 different ways, and this  number  increases
-       very  rapidly  as the string gets longer. (The * repeat can match 0, 1,
-       2, 3, or 4 times, and for each of those cases other than 0 or 4, the  +
-       repeats  can  match  different numbers of times.) When the remainder of
-       the pattern is such that the entire match is going to fail,  PCRE2  has
-       in  principle  to  try  every  possible variation, and this can take an
+       This  can  match "aaaa" in 16 different ways, and this number increases
+       very rapidly as the string gets longer. (The * repeat can match  0,  1,
+       2,  3, or 4 times, and for each of those cases other than 0 or 4, the +
+       repeats can match different numbers of times.) When  the  remainder  of
+       the  pattern  is such that the entire match is going to fail, PCRE2 has
+       in principle to try every possible variation,  and  this  can  take  an
        extremely long time, even for relatively short strings.
 
        An optimization catches some of the more simple cases such as
 
          (a+)*b
 
-       where a literal character follows. Before  embarking  on  the  standard
-       matching  procedure, PCRE2 checks that there is a "b" later in the sub-
-       ject string, and if there is not, it fails the match immediately.  How-
-       ever,  when  there  is no following literal this optimization cannot be
+       where  a  literal  character  follows. Before embarking on the standard
+       matching procedure, PCRE2 checks that there is a "b" later in the  sub-
+       ject  string, and if there is not, it fails the match immediately. How-
+       ever, when there is no following literal this  optimization  cannot  be
        used. You can see the difference by comparing the behaviour of
 
          (a+)*\d
 
-       with the pattern above. The former gives  a  failure  almost  instantly
-       when  applied  to  a  whole  line of "a" characters, whereas the latter
+       with the pattern above. When applied to a whole line of "a" characters,
+       (a+)*b  gives  a  failure  almost  instantly,   whereas   (a+)*\d
        takes an appreciable time with strings longer than about 20 characters.
 
        In many cases, the solution to this kind of performance issue is to use
-       an atomic group or a possessive quantifier.
+       an atomic group or a possessive quantifier. This can often reduce  mem-
+       ory requirements as well. As another example, consider this pattern:
+
+         ([^<]|<(?!inet))+
+
+       It  matches  from wherever it starts until it encounters "<inet" or the
+       end of the data, and is the kind of pattern that  might  be  used  when
+       processing an XML file. Each iteration of the outer parentheses matches
+       either one character that is not "<" or a "<" that is not  followed  by
+       "inet".  However,  each time a parenthesis is processed, a backtracking
+       position is passed, so this formulation uses a memory  frame  for  each
+       matched character. For a long string, a lot of memory is required. Con-
+       sider now this  rewritten  pattern,  which  matches  exactly  the  same
+       strings:
+
+         ([^<]++|<(?!inet))+
+
+       This runs much faster, because sequences of characters that do not con-
+       tain "<" are "swallowed" in one item inside the parentheses, and a pos-
+       sessive  quantifier  is  used to stop any backtracking into the runs of
+       non-"<" characters. This version also uses a lot  less  memory  because
+       entry  to  a  new  set of parentheses happens only when a "<" character
+       that is not followed by "inet" is encountered (and we  assume  this  is
+       relatively rare).
+
+       This example shows that one way of optimizing performance when matching
+       long subject strings is to write repeated parenthesized subpatterns  to
+       match more than one character whenever possible.
+
+   SETTING RESOURCE LIMITS
+
+       You  can  set  limits on the amount of processing that takes place when
+       matching, and on the amount of heap memory that is  used.  The  default
+       values of the limits are very large, and unlikely ever to operate. They
+       can be changed when PCRE2 is built, and  they  can  also  be  set  when
+       pcre2_match()  or  pcre2_dfa_match()  is  called.  For details of these
+       interfaces, see the pcre2build documentation and the  section  entitled
+       "The match context" in the pcre2api documentation.
+
+       The  pcre2test  test program has a modifier called "find_limits" which,
+       if applied to a subject line, causes it to  find  the  smallest  limits
+       that allow a pattern to match. This is done by repeatedly matching with
+       different limits.
 
 
 AUTHOR
@@ -8540,8 +9105,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 02 January 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 08 April 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -8593,32 +9158,34 @@ DESCRIPTION
 
        There are also some options that are not defined by POSIX.  These  have
        been  added  at  the  request  of users who want to make use of certain
-       PCRE2-specific features via the POSIX calling interface.
-
-       When PCRE2 is called via these functions, it is only the  API  that  is
-       POSIX-like  in  style.  The syntax and semantics of the regular expres-
-       sions themselves are still those of Perl, subject  to  the  setting  of
-       various  PCRE2 options, as described below. "POSIX-like in style" means
-       that the API approximates to the POSIX  definition;  it  is  not  fully
-       POSIX-compatible,  and  in  multi-unit  encoding domains it is probably
+       PCRE2-specific features via the POSIX calling interface or to  add  BSD
+       or GNU functionality.
+
+       When  PCRE2  is  called via these functions, it is only the API that is
+       POSIX-like in style. The syntax and semantics of  the  regular  expres-
+       sions  themselves  are  still  those of Perl, subject to the setting of
+       various PCRE2 options, as described below. "POSIX-like in style"  means
+       that  the  API  approximates  to  the POSIX definition; it is not fully
+       POSIX-compatible, and in multi-unit encoding  domains  it  is  probably
        even less compatible.
 
        The header for these functions is supplied as pcre2posix.h to avoid any
-       potential  clash  with  other  POSIX  libraries.  It can, of course, be
+       potential clash with other POSIX  libraries.  It  can,  of  course,  be
        renamed or aliased as regex.h, which is the "correct" name. It provides
-       two  structure  types,  regex_t  for  compiled internal forms, and reg-
-       match_t for returning captured substrings. It also  defines  some  con-
-       stants  whose  names  start  with  "REG_";  these  are used for setting
+       two structure types, regex_t for  compiled  internal  forms,  and  reg-
+       match_t  for  returning  captured substrings. It also defines some con-
+       stants whose names start  with  "REG_";  these  are  used  for  setting
        options and identifying error codes.
 
 
 COMPILING A PATTERN
 
-       The function regcomp() is called to compile a pattern into an  internal
-       form.  The  pattern  is  a C string terminated by a binary zero, and is
-       passed in the argument pattern. The preg argument is  a  pointer  to  a
-       regex_t  structure that is used as a base for storing information about
-       the compiled regular expression.
+       The  function regcomp() is called to compile a pattern into an internal
+       form. By default, the pattern is a C string terminated by a binary zero
+       (but  see  REG_PEND below). The preg argument is a pointer to a regex_t
+       structure that is used as a base for storing information about the com-
+       piled  regular  expression. (It is also used for input when REG_PEND is
+       set.)
 
        The argument cflags is either zero, or contains one or more of the bits
        defined by the following macros:
@@ -8641,14 +9208,34 @@ COMPILING A PATTERN
        the defined POSIX behaviour for REG_NEWLINE  (see  the  following  sec-
        tion).
 
+         REG_NOSPEC
+
+       The  PCRE2_LITERAL  option is set when the regular expression is passed
+       for compilation to the native function. This disables all meta  charac-
+       ters  in the pattern, causing it to be treated as a literal string. The
+       only other options that are  allowed  with  REG_NOSPEC  are  REG_ICASE,
+       REG_NOSUB,  REG_PEND,  and REG_UTF. Note that REG_NOSPEC is not part of
+       the POSIX standard.
+
          REG_NOSUB
 
-       When  a  pattern that is compiled with this flag is passed to regexec()
-       for matching, the nmatch and pmatch arguments are ignored, and no  cap-
+       When a pattern that is compiled with this flag is passed  to  regexec()
+       for  matching, the nmatch and pmatch arguments are ignored, and no cap-
        tured strings are returned. Versions of the PCRE library prior to 10.22
-       used to set the  PCRE2_NO_AUTO_CAPTURE  compile  option,  but  this  no
+       used  to  set  the  PCRE2_NO_AUTO_CAPTURE  compile  option, but this no
        longer happens because it disables the use of back references.
 
+         REG_PEND
+
+       If this option is set, the re_endp field in the  preg structure  (which
+       has the type const char *) must be set to point to the character beyond
+       the end of the pattern before calling regcomp(). The pattern itself may
+       now  contain binary zeroes, which are treated as data characters. With-
+       out REG_PEND, a binary zero terminates  the  pattern  and  the  re_endp
+       field  is  ignored.  This  is a GNU extension to the POSIX standard and
+       should be used with caution in software  intended  to  be  portable  to
+       other systems.
+
          REG_UCP
 
        The  PCRE2_UCP  option is set when the regular expression is passed for
@@ -8678,11 +9265,12 @@ COMPILING A PATTERN
        ter (they are not) or by a negative class such as [^a] (they are).
 
        The  yield of regcomp() is zero on success, and non-zero otherwise. The
-       preg structure is filled in on success, and one member of the structure
-       is  public: re_nsub contains the number of capturing subpatterns in the
-       regular expression. Various error codes are defined in the header file.
+       preg structure is filled in on success, and one  other  member  of  the
+       structure  (as  well as re_endp) is public: re_nsub contains the number
+       of capturing subpatterns in the regular expression. Various error codes
+       are defined in the header file.
 
-       NOTE: If the yield of regcomp() is non-zero, you must  not  attempt  to
+       NOTE:  If  the  yield of regcomp() is non-zero, you must not attempt to
        use the contents of the preg structure. If, for example, you pass it to
        regexec(), the result is undefined and your program is likely to crash.
 
@@ -8690,9 +9278,9 @@ COMPILING A PATTERN
 MATCHING NEWLINE CHARACTERS
 
        This area is not simple, because POSIX and Perl take different views of
-       things.   It  is not possible to get PCRE2 to obey POSIX semantics, but
+       things.  It is not possible to get PCRE2 to obey POSIX  semantics,  but
        then PCRE2 was never intended to be a POSIX engine. The following table
-       lists  the  different  possibilities for matching newline characters in
+       lists the different possibilities for matching  newline  characters  in
        Perl and PCRE2:
 
                                  Default   Change with
@@ -8713,25 +9301,25 @@ MATCHING NEWLINE CHARACTERS
          $ matches \n in middle     no     REG_NEWLINE
          ^ matches \n in middle     no     REG_NEWLINE
 
-       This behaviour is not what happens when PCRE2 is called via  its  POSIX
-       API.  By  default, PCRE2's behaviour is the same as Perl's, except that
-       there is no equivalent for PCRE2_DOLLAR_ENDONLY in Perl. In both  PCRE2
+       This  behaviour  is not what happens when PCRE2 is called via its POSIX
+       API. By default, PCRE2's behaviour is the same as Perl's,  except  that
+       there  is no equivalent for PCRE2_DOLLAR_ENDONLY in Perl. In both PCRE2
        and Perl, there is no way to stop newline from matching [^a].
 
-       Default  POSIX newline handling can be obtained by setting PCRE2_DOTALL
-       and PCRE2_DOLLAR_ENDONLY when  calling  pcre2_compile()  directly,  but
-       there  is  no  way  to make PCRE2 behave exactly as for the REG_NEWLINE
-       action. When using the POSIX API, passing REG_NEWLINE to  PCRE2's  reg-
+       Default POSIX newline handling can be obtained by setting  PCRE2_DOTALL
+       and  PCRE2_DOLLAR_ENDONLY  when  calling  pcre2_compile() directly, but
+       there is no way to make PCRE2 behave exactly  as  for  the  REG_NEWLINE
+       action.  When  using the POSIX API, passing REG_NEWLINE to PCRE2's reg-
        comp() function causes PCRE2_MULTILINE to be passed to pcre2_compile(),
-       and REG_DOTALL passes PCRE2_DOTALL. There is no way to pass  PCRE2_DOL-
+       and  REG_DOTALL passes PCRE2_DOTALL. There is no way to pass PCRE2_DOL-
        LAR_ENDONLY.
 
 
 MATCHING A PATTERN
 
-       The  function  regexec()  is  called  to  match a compiled pattern preg
-       against a given string, which is by default terminated by a  zero  byte
-       (but  see  REG_STARTEND below), subject to the options in eflags. These
+       The function regexec() is called  to  match  a  compiled  pattern  preg
+       against  a  given string, which is by default terminated by a zero byte
+       (but see REG_STARTEND below), subject to the options in  eflags.  These
        can be:
 
          REG_NOTBOL
@@ -8741,9 +9329,9 @@ MATCHING A PATTERN
 
          REG_NOTEMPTY
 
-       The  PCRE2_NOTEMPTY  option  is  set  when calling the underlying PCRE2
-       matching function. Note that REG_NOTEMPTY is  not  part  of  the  POSIX
-       standard.  However, setting this option can give more POSIX-like behav-
+       The PCRE2_NOTEMPTY option is set  when  calling  the  underlying  PCRE2
+       matching  function.  Note  that  REG_NOTEMPTY  is not part of the POSIX
+       standard. However, setting this option can give more POSIX-like  behav-
        iour in some situations.
 
          REG_NOTEOL
@@ -8753,15 +9341,24 @@ MATCHING A PATTERN
 
          REG_STARTEND
 
-       The  string  is  considered to start at string + pmatch[0].rm_so and to
-       have a terminating NUL located at string + pmatch[0].rm_eo (there  need
-       not  actually  be  a  NUL at that location), regardless of the value of
-       nmatch. This is a BSD extension, compatible with but not  specified  by
-       IEEE  Standard  1003.2  (POSIX.2),  and  should be used with caution in
-       software intended to be portable to other systems. Note that a non-zero
-       rm_so does not imply REG_NOTBOL; REG_STARTEND affects only the location
-       of the string, not how it is matched. Setting REG_STARTEND and  passing
-       pmatch  as  NULL  are  mutually  exclusive;  the  error  REG_INVARG  is
+       When  this  option  is  set,  the  subject string starts at string +
+       pmatch[0].rm_so  and  ends  at  string  + pmatch[0].rm_eo, which should
+       point to the first character beyond the string.  There  may  be  binary
+       zeroes within the subject string, and indeed, using REG_STARTEND is the
+       only way to pass a subject string that contains a binary zero.
+
+       Whatever the value of  pmatch[0].rm_so,  the  offsets  of  the  matched
+       string  and  any  captured  substrings  are still given relative to the
+       start of string itself. (Before PCRE2 release 10.30  these  were  given
+       relative  to  string  +  pmatch[0].rm_so,  but  this differs from other
+       implementations.)
+
+       This is a BSD extension, compatible with  but  not  specified  by  IEEE
+       Standard  1003.2 (POSIX.2), and should be used with caution in software
+       intended to be portable to other systems. Note that  a  non-zero  rm_so
+       does  not  imply REG_NOTBOL; REG_STARTEND affects only the location and
+       length of the string, not how it is matched. Setting  REG_STARTEND  and
+       passing  pmatch as NULL are mutually exclusive; the error REG_INVARG is
        returned.
 
        If the pattern was compiled with the REG_NOSUB flag, no data about  any
@@ -8816,8 +9413,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 31 January 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 15 June 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -8949,26 +9546,29 @@ SECURITY CONCERNS
        use within individual applications.  As  such,  the  data  supplied  to
        pcre2_serialize_decode()  is expected to be trusted data, not data from
        arbitrary external sources.  There  is  only  some  simple  consistency
-       checking, not complete validation of what is being re-loaded.
+       checking, not complete validation of what is being re-loaded. Corrupted
+       data may cause undefined results. For example, if the length field of a
+       pattern in the serialized data is corrupted, the deserializing code may
+       read beyond the end of the byte stream that is passed to it.
 
 
 SAVING COMPILED PATTERNS
 
        Before compiled patterns can be saved they must be serialized, that is,
-       converted to a stream of bytes. A single byte stream  may  contain  any
-       number  of  compiled patterns, but they must all use the same character
+       converted  to  a  stream of bytes. A single byte stream may contain any
+       number of compiled patterns, but they must all use the  same  character
        tables. A single copy of the tables is included in the byte stream (its
        size is 1088 bytes). For more details of character tables, see the sec-
        tion on locale support in the pcre2api documentation.
 
-       The function pcre2_serialize_encode() creates a serialized byte  stream
-       from  a  list of compiled patterns. Its first two arguments specify the
+       The  function pcre2_serialize_encode() creates a serialized byte stream
+       from a list of compiled patterns. Its first two arguments  specify  the
        list, being a pointer to a vector of pointers to compiled patterns, and
        the length of the vector. The third and fourth arguments point to vari-
        ables which are set to point to the created byte stream and its length,
-       respectively.  The  final  argument  is a pointer to a general context,
-       which can be used to specify custom memory  mangagement  functions.  If
-       this  argument  is NULL, malloc() is used to obtain memory for the byte
+       respectively. The final argument is a pointer  to  a  general  context,
+       which  can  be  used  to specify custom memory management functions. If
+       this argument is NULL, malloc() is used to obtain memory for  the  byte
        stream. The yield of the function is the number of serialized patterns,
        or one of the following negative error codes:
 
@@ -8978,12 +9578,12 @@ SAVING COMPILED PATTERNS
          PCRE2_ERROR_MIXEDTABLES  the patterns do not all use the same tables
          PCRE2_ERROR_NULL         the 1st, 3rd, or 4th argument is NULL
 
-       PCRE2_ERROR_BADMAGIC  means  either that a pattern's code has been cor-
-       rupted, or that a slot in the vector does not point to a compiled  pat-
+       PCRE2_ERROR_BADMAGIC means either that a pattern's code has  been  cor-
+       rupted,  or that a slot in the vector does not point to a compiled pat-
        tern.
 
        Once a set of patterns has been serialized you can save the data in any
-       appropriate manner. Here is sample code that compiles two patterns  and
+       appropriate  manner. Here is sample code that compiles two patterns and
        writes them to a file. It assumes that the variable fd refers to a file
        that is open for output. The error checking that should be present in a
        real application has been omitted for simplicity.
@@ -9001,13 +9601,13 @@ SAVING COMPILED PATTERNS
            &bytescount, NULL);
          errorcode = fwrite(bytes, 1, bytescount, fd);
 
-       Note  that  the  serialized data is binary data that may contain any of
-       the 256 possible byte  values.  On  systems  that  make  a  distinction
+       Note that the serialized data is binary data that may  contain  any  of
+       the  256  possible  byte  values.  On  systems  that make a distinction
        between binary and non-binary data, be sure that the file is opened for
        binary output.
 
-       Serializing a set of patterns leaves the original  data  untouched,  so
-       they  can  still  be used for matching. Their memory must eventually be
+       Serializing  a  set  of patterns leaves the original data untouched, so
+       they can still be used for matching. Their memory  must  eventually  be
        freed in the usual way by calling pcre2_code_free(). When you have fin-
        ished with the byte stream, it too must be freed by calling pcre2_seri-
        alize_free().
@@ -9015,11 +9615,11 @@ SAVING COMPILED PATTERNS
 
 RE-USING PRECOMPILED PATTERNS
 
-       In order to re-use a set of saved patterns  you  must  first  make  the
-       serialized  byte stream available in main memory (for example, by read-
-       ing from a file). The management of this memory  block  is  up  to  the
+       In  order  to  re-use  a  set of saved patterns you must first make the
+       serialized byte stream available in main memory (for example, by  read-
+       ing  from  a  file).  The  management of this memory block is up to the
        application.  You  can  use  the  pcre2_serialize_get_number_of_codes()
-       function to find out how many compiled patterns are in  the  serialized
+       function  to  find out how many compiled patterns are in the serialized
        data without actually decoding the patterns:
 
          uint8_t *bytes = <serialized data>;
@@ -9027,10 +9627,10 @@ RE-USING PRECOMPILED PATTERNS
 
        The pcre2_serialize_decode() function reads a byte stream and recreates
        the compiled patterns in new memory blocks, setting pointers to them in
-       a  vector.  The  first two arguments are a pointer to a suitable vector
-       and its length, and the third argument points to  a  byte  stream.  The
-       final  argument is a pointer to a general context, which can be used to
-       specify custom memory mangagement functions for the  decoded  patterns.
+       a vector. The first two arguments are a pointer to  a  suitable  vector
+       and  its  length,  and  the third argument points to a byte stream. The
+       final argument is a pointer to a general context, which can be used  to
+       specify  custom  memory  management functions for the decoded patterns.
        If this argument is NULL, malloc() and free() are used. After deserial-
        ization, the byte stream is no longer needed and can be discarded.
 
@@ -9040,9 +9640,9 @@ RE-USING PRECOMPILED PATTERNS
          int32_t number_of_codes =
            pcre2_serialize_decode(list_of_codes, 2, bytes, NULL);
 
-       If the vector is not large enough for all  the  patterns  in  the  byte
-       stream,  it  is  filled  with  those  that  fit,  and the remainder are
-       ignored. The yield of the function is the number of  decoded  patterns,
+       If  the  vector  is  not  large enough for all the patterns in the byte
+       stream, it is filled  with  those  that  fit,  and  the  remainder  are
+       ignored.  The  yield of the function is the number of decoded patterns,
        or one of the following negative error codes:
 
          PCRE2_ERROR_BADDATA    second argument is zero or less
@@ -9052,24 +9652,24 @@ RE-USING PRECOMPILED PATTERNS
          PCRE2_ERROR_MEMORY     memory allocation failed
          PCRE2_ERROR_NULL       first or third argument is NULL
 
-       PCRE2_ERROR_BADMAGIC  may mean that the data is corrupt, or that it was
+       PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it  was
        compiled on a system with different endianness.
 
        Decoded patterns can be used for matching in the usual way, and must be
-       freed  by  calling pcre2_code_free(). However, be aware that there is a
-       potential race issue if you  are  using  multiple  patterns  that  were
-       decoded  from  a  single  byte stream in a multithreaded application. A
+       freed by calling pcre2_code_free(). However, be aware that there  is  a
+       potential  race  issue  if  you  are  using multiple patterns that were
+       decoded from a single byte stream in  a  multithreaded  application.  A
        single copy of the character tables is used by all the decoded patterns
        and a reference count is used to arrange for its memory to be automati-
-       cally freed when the last pattern is freed, but there is no locking  on
-       this  reference count. Therefore, if you want to call pcre2_code_free()
-       for these patterns in different threads,  you  must  arrange  your  own
-       locking,  and  ensure  that  pcre2_code_free()  cannot be called by two
+       cally  freed when the last pattern is freed, but there is no locking on
+       this reference count. Therefore, if you want to call  pcre2_code_free()
+       for  these  patterns  in  different  threads, you must arrange your own
+       locking, and ensure that pcre2_code_free()  cannot  be  called  by  two
        threads at the same time.
 
-       If a pattern was processed by pcre2_jit_compile() before being  serial-
-       ized,  the  JIT data is discarded and so is no longer available after a
-       save/restore cycle. You can, however, process a restored  pattern  with
+       If  a pattern was processed by pcre2_jit_compile() before being serial-
+       ized, the JIT data is discarded and so is no longer available  after  a
+       save/restore  cycle.  You can, however, process a restored pattern with
        pcre2_jit_compile() if you wish.
 
 
@@ -9082,174 +9682,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 24 May 2016
-       Copyright (c) 1997-2016 University of Cambridge.
-------------------------------------------------------------------------------
-
-
-PCRE2STACK(3)              Library Functions Manual              PCRE2STACK(3)
-
-
-
-NAME
-       PCRE2 - Perl-compatible regular expressions (revised API)
-
-PCRE2 DISCUSSION OF STACK USAGE
-
-       When  you  call  pcre2_match(),  it  makes  use of an internal function
-       called match(). This calls itself recursively at branch points  in  the
-       pattern,  in  order  to  remember the state of the match so that it can
-       back up and try a different alternative after a  failure.  As  matching
-       proceeds  deeper  and deeper into the tree of possibilities, the recur-
-       sion depth increases. The match() function is also called in other cir-
-       cumstances,  for  example,  whenever  a  parenthesized  sub-pattern  is
-       entered, and in certain cases of repetition.
-
-       Not all calls of match() increase the recursion depth; for an item such
-       as  a* it may be called several times at the same level, after matching
-       different numbers of a's. Furthermore, in a number of cases  where  the
-       result  of  the  recursive call would immediately be passed back as the
-       result of the current call (a "tail recursion"), the function  is  just
-       restarted instead.
-
-       Each  time the internal match() function is called recursively, it uses
-       memory from the process stack. For certain kinds of pattern  and  data,
-       very  large  amounts of stack may be needed, despite the recognition of
-       "tail recursion". Note that if  PCRE2  is  compiled  with  the  -fsani-
-       tize=address  option  of  the  GCC compiler, the stack requirements are
-       greatly increased.
-
-       The above comments apply when pcre2_match() is run in its normal inter-
-       pretive manner. If the compiled pattern was processed by pcre2_jit_com-
-       pile(), and just-in-time compiling  was  successful,  and  the  options
-       passed  to  pcre2_match()  were  not incompatible, the matching process
-       uses the JIT-compiled code instead of the  match()  function.  In  this
-       case, the memory requirements are handled entirely differently. See the
-       pcre2jit documentation for details.
-
-       The  pcre2_dfa_match()  function  operates  in  a  different   way   to
-       pcre2_match(),  and uses recursion only when there is a regular expres-
-       sion recursion or subroutine call in the  pattern.  This  includes  the
-       processing  of assertion and "once-only" subpatterns, which are handled
-       like subroutine calls.  Normally, these are never very  deep,  and  the
-       limit  on  the  complexity  of  pcre2_dfa_match()  is controlled by the
-       amount of workspace it is given.  However, it is possible to write pat-
-       terns  with  runaway  infinite  recursions;  such  patterns  will cause
-       pcre2_dfa_match() to run out of stack. At present, there is no  protec-
-       tion against this.
-
-       The  comments  that  follow do NOT apply to pcre2_dfa_match(); they are
-       relevant only for pcre2_match() without the JIT optimization.
-
-   Reducing pcre2_match()'s stack usage
-
-       You can often reduce the amount of recursion, and therefore the  amount
-       of  stack  used,  by  modifying the pattern that is being matched. Con-
-       sider, for example, this pattern:
-
-         ([^<]|<(?!inet))+
-
-       It matches from wherever it starts until it encounters "<inet"  or  the
-       end  of  the  data,  and is the kind of pattern that might be used when
-       processing an XML file. Each iteration of the outer parentheses matches
-       either  one  character that is not "<" or a "<" that is not followed by
-       "inet". However, each time a  parenthesis  is  processed,  a  recursion
-       occurs, so this formulation uses a stack frame for each matched charac-
-       ter. For a long string, a lot of stack is required. Consider  now  this
-       rewritten pattern, which matches exactly the same strings:
-
-         ([^<]++|<(?!inet))+
-
-       This  uses very much less stack, because runs of characters that do not
-       contain "<" are "swallowed" in one item inside the parentheses.  Recur-
-       sion  happens  only when a "<" character that is not followed by "inet"
-       is encountered (and we assume this is relatively  rare).  A  possessive
-       quantifier  is  used  to stop any backtracking into the runs of non-"<"
-       characters, but that is not related to stack usage.
-
-       This example shows that one way of avoiding stack problems when  match-
-       ing long subject strings is to write repeated parenthesized subpatterns
-       to match more than one character whenever possible.
-
-   Compiling PCRE2 to use heap instead of stack for pcre2_match()
-
-       In environments where stack memory is constrained, you  might  want  to
-       compile PCRE2 to use heap memory instead of stack for remembering back-
-       up points when pcre2_match() is running. This makes it run more slowly,
-       however. Details of how to do this are given in the pcre2build documen-
-       tation. When built in this way, instead of using the stack, PCRE2  gets
-       memory  for  remembering  backup  points from the heap. By default, the
-       memory is obtained by calling the system malloc() function, but you can
-       arrange to supply your own memory management function. For details, see
-       the section entitled "The match context" in the pcre2api documentation.
-       Since the block sizes are always the same, it may be possible to imple-
-       ment customized a memory handler that is more efficient than the  stan-
-       dard function. The memory blocks obtained for this purpose are retained
-       and re-used if possible while pcre2_match() is running.  They  are  all
-       freed just before it exits.
-
-   Limiting pcre2_match()'s stack usage
-
-       You can set limits on the number of times the internal match() function
-       is called, both in total and  recursively.  If  a  limit  is  exceeded,
-       pcre2_match()  returns  an  error  code. Setting suitable limits should
-       prevent it from running out of stack. The default values of the  limits
-       are  very large, and unlikely ever to operate. They can be changed when
-       PCRE2 is built, and they can also be set when pcre2_match() is  called.
-       For  details  of these interfaces, see the pcre2build documentation and
-       the section entitled "The match context" in the pcre2api documentation.
-
-       As a very rough rule of thumb, you should reckon on about 500 bytes per
-       recursion.  Thus,  if  you  want  to limit your stack usage to 8Mb, you
-       should set the limit at 16000 recursions. A 64Mb stack,  on  the  other
-       hand, can support around 128000 recursions.
-
-       The  pcre2test  test program has a modifier called "find_limits" which,
-       if applied to a subject line, causes it to  find  the  smallest  limits
-       that  allow a a pattern to match. This is done by calling pcre2_match()
-       repeatedly with different limits.
-
-   Changing stack size in Unix-like systems
-
-       In Unix-like environments, there is not often a problem with the  stack
-       unless  very  long  strings  are  involved, though the default limit on
-       stack size varies from system to system. Values from 8Mb  to  64Mb  are
-       common. You can find your default limit by running the command:
-
-         ulimit -s
-
-       Unfortunately,  the  effect  of  running out of stack is often SIGSEGV,
-       though sometimes a more explicit error message is given. You  can  nor-
-       mally increase the limit on stack size by code such as this:
-
-         struct rlimit rlim;
-         getrlimit(RLIMIT_STACK, &rlim);
-         rlim.rlim_cur = 100*1024*1024;
-         setrlimit(RLIMIT_STACK, &rlim);
-
-       This  reads  the current limits (soft and hard) using getrlimit(), then
-       attempts to increase the soft limit to  100Mb  using  setrlimit().  You
-       must do this before calling pcre2_match().
-
-   Changing stack size in Mac OS X
-
-       Using setrlimit(), as described above, should also work on Mac OS X. It
-       is also possible to set a stack size when linking a program. There is a
-       discussion   about   stack  sizes  in  Mac  OS  X  at  this  web  site:
-       http://developer.apple.com/qa/qa2005/qa1419.html.
-
-
-AUTHOR
-
-       Philip Hazel
-       University Computing Service
-       Cambridge, England.
-
-
-REVISION
-
-       Last updated: 21 November 2014
-       Copyright (c) 1997-2014 University of Cambridge.
+       Last updated: 21 March 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -9526,18 +9960,21 @@ OPTION SETTING
          (?i)            caseless
          (?J)            allow duplicate names
          (?m)            multiline
+         (?n)            no auto capture
          (?s)            single line (dotall)
          (?U)            default ungreedy (lazy)
-         (?x)            extended (ignore white space)
+         (?x)            extended: ignore white space except in classes
+         (?xx)           as (?x) but also ignore space and tab in classes
          (?-...)         unset option(s)
 
        The following are recognized only at the very start  of  a  pattern  or
        after  one  of the newline or \R options with similar syntax. More than
-       one of them may appear.
+       one of them may appear. For the first three, d is a decimal number.
 
-         (*LIMIT_MATCH=d) set the match limit to d (decimal number)
-         (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
-         (*NOTEMPTY)     set PCRE2_NOTEMPTY when matching
+         (*LIMIT_DEPTH=d) set the backtracking limit to d
+         (*LIMIT_HEAP=d)  set the heap size limit to d kilobytes
+         (*LIMIT_MATCH=d) set the match limit to d
+         (*NOTEMPTY)      set PCRE2_NOTEMPTY when matching
          (*NOTEMPTY_ATSTART) set PCRE2_NOTEMPTY_ATSTART when matching
          (*NO_AUTO_POSSESS) no auto-possessification (PCRE2_NO_AUTO_POSSESS)
          (*NO_DOTSTAR_ANCHOR) no .* anchoring (PCRE2_NO_DOTSTAR_ANCHOR)
@@ -9546,16 +9983,17 @@ OPTION SETTING
          (*UTF)          set appropriate UTF mode for the library in use
          (*UCP)          set PCRE2_UCP (use Unicode properties for \d etc)
 
-       Note that LIMIT_MATCH and LIMIT_RECURSION can only reduce the value  of
-       the  limits  set by the caller of pcre2_match(), not increase them. The
-       application can lock out the use of (*UTF) and (*UCP)  by  setting  the
-       PCRE2_NEVER_UTF  or  PCRE2_NEVER_UCP  options, respectively, at compile
-       time.
+       Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce  the
+       value   of   the   limits   set  by  the  caller  of  pcre2_match()  or
+       pcre2_dfa_match(), not increase them. LIMIT_RECURSION  is  an  obsolete
+       synonym for LIMIT_DEPTH. The application can lock out the use of (*UTF)
+       and (*UCP) by setting the PCRE2_NEVER_UTF or  PCRE2_NEVER_UCP  options,
+       respectively, at compile time.
 
 
 NEWLINE CONVENTION
 
-       These are recognized only at the very start of  the  pattern  or  after
+       These  are  recognized  only  at the very start of the pattern or after
        option settings with a similar syntax.
 
          (*CR)           carriage return only
@@ -9563,11 +10001,12 @@ NEWLINE CONVENTION
          (*CRLF)         carriage return followed by linefeed
          (*ANYCRLF)      all three of the above
          (*ANY)          any Unicode newline sequence
+         (*NUL)          the NUL character (binary zero)
 
 
 WHAT \R MATCHES
 
-       These  are  recognized  only  at the very start of the pattern or after
+       These are recognized only at the very start of  the  pattern  or  after
        option setting with a similar syntax.
 
          (*BSR_ANYCRLF)  CR, LF, or CRLF
@@ -9589,6 +10028,9 @@ BACKREFERENCES
          \n              reference by number (can be ambiguous)
          \gn             reference by number
          \g{n}           reference by number
+         \g+n            relative reference by number (PCRE2 extension)
+         \g-n            relative reference by number
+         \g{+n}          relative reference by number (PCRE2 extension)
          \g{-n}          relative reference by number
          \k<name>        reference by name (Perl)
          \k'name'        reference by name (Perl)
@@ -9625,14 +10067,18 @@ CONDITIONAL PATTERNS
          (?(-n)              relative reference condition
          (?(<name>)          named reference condition (Perl)
          (?('name')          named reference condition (Perl)
-         (?(name)            named reference condition (PCRE2)
+         (?(name)            named reference condition (PCRE2, deprecated)
          (?(R)               overall recursion condition
-         (?(Rn)              specific group recursion condition
-         (?(R&name)          specific recursion condition
+         (?(Rn)              specific numbered group recursion condition
+         (?(R&name)          specific named group recursion condition
          (?(DEFINE)          define subpattern for reference
          (?(VERSION[>]=n.m)  test PCRE2 version
          (?(assert)          assertion condition
 
+       Note  the  ambiguity of (?(R) and (?(Rn) which might be named reference
+       conditions or recursion tests. Such a condition  is  interpreted  as  a
+       reference condition if the relevant named group exists.
+
 
 BACKTRACKING CONTROL
 
@@ -9642,7 +10088,7 @@ BACKTRACKING CONTROL
          (*FAIL)         force backtrack; synonym (*F)
          (*MARK:NAME)    set name to be passed back; synonym (*:NAME)
 
-       The following act only when a subsequent match failure causes  a  back-
+       The  following  act only when a subsequent match failure causes a back-
        track to reach them. They all force a match failure, but they differ in
        what happens afterwards. Those that advance the start-of-match point do
        so only if the pattern is not anchored.
@@ -9664,14 +10110,14 @@ CALLOUTS
          (?C"text")      callout with string data
 
        The allowed string delimiters are ` ' " ^ % # $ (which are the same for
-       the start and the end), and the starting delimiter { matched  with  the
-       ending  delimiter  }. To encode the ending delimiter within the string,
+       the  start  and the end), and the starting delimiter { matched with the
+       ending delimiter }. To encode the ending delimiter within  the  string,
        double it.
 
 
 SEE ALSO
 
-       pcre2pattern(3),   pcre2api(3),   pcre2callout(3),    pcre2matching(3),
+       pcre2pattern(3),    pcre2api(3),   pcre2callout(3),   pcre2matching(3),
        pcre2(3).
 
 
@@ -9684,8 +10130,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 16 October 2015
-       Copyright (c) 1997-2015 University of Cambridge.
+       Last updated: 17 June 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -9724,7 +10170,7 @@ UNICODE PROPERTY SUPPORT
        names  for  properties are supported. For example, \p{L} matches a let-
        ter. Its Perl synonym, \p{Letter}, is not supported.   Furthermore,  in
        Perl,  many properties may optionally be prefixed by "Is", for compati-
-       bility with Perl 5.6. PCRE does not support this.
+       bility with Perl 5.6. PCRE2 does not support this.
 
 
 WIDE CHARACTERS AND UTF MODES
@@ -9775,64 +10221,78 @@ WIDE CHARACTERS AND UTF MODES
        escapes (\h, \H, \v, and \V) do match all the appropriate Unicode char-
        acters, whether or not PCRE2_UCP is set.
 
-       Case-insensitive matching in UTF mode makes use of Unicode  properties.
-       A  few  Unicode characters such as Greek sigma have more than two code-
-       points that are case-equivalent, and these are treated as such.
+
+CASE-EQUIVALENCE IN UTF MODES
+
+       Case-insensitive matching in a UTF mode makes use of Unicode properties
+       except for characters whose code points are less than 128 and that have
+       at most two case-equivalent values. For these, a direct table lookup is
+       used  for speed. A few Unicode characters such as Greek sigma have more
+       than two codepoints that are case-equivalent, and these are treated  as
+       such.
 
 
 VALIDITY OF UTF STRINGS
 
-       When the PCRE2_UTF option is set, the strings passed  as  patterns  and
+       When  the  PCRE2_UTF  option is set, the strings passed as patterns and
        subjects are (by default) checked for validity on entry to the relevant
-       functions.  If an invalid UTF string is passed, an negative error  code
-       is  returned.  The  code  unit offset to the offending character can be
-       extracted from the match data block by  calling  pcre2_get_startchar(),
+       functions.  If an invalid UTF string is passed, a negative  error  code
+       is returned. The code unit offset to the  offending  character  can  be
+       extracted  from  the match data block by calling pcre2_get_startchar(),
        which is used for this purpose after a UTF error.
 
        UTF-16 and UTF-32 strings can indicate their endianness by special code
-       knows as a byte-order mark (BOM). The PCRE2  functions  do  not  handle
+       known  as  a  byte-order  mark (BOM). The PCRE2 functions do not handle
        this, expecting strings to be in host byte order.
 
        A UTF string is checked before any other processing takes place. In the
-       case of pcre2_match()  and  pcre2_dfa_match()  calls  with  a  non-zero
-       starting  offset, the check is applied only to that part of the subject
-       that could be inspected during matching, and there is a check that  the
-       starting  offset points to the first code unit of a character or to the
-       end of the subject. If there are no lookbehind assertions in  the  pat-
-       tern,  the check starts at the starting offset. Otherwise, it starts at
-       the length of the longest lookbehind before the starting offset, or  at
-       the  start  of the subject if there are not that many characters before
-       the starting offset. Note that the sequences \b and \B are  one-charac-
+       case  of  pcre2_match()  and  pcre2_dfa_match()  calls  with a non-zero
+       starting offset, the check is applied only to that part of the  subject
+       that  could be inspected during matching, and there is a check that the
+       starting offset points to the first code unit of a character or to  the
+       end  of  the subject. If there are no lookbehind assertions in the pat-
+       tern, the check starts at the starting offset. Otherwise, it starts  at
+       the  length of the longest lookbehind before the starting offset, or at
+       the start of the subject if there are not that many  characters  before
+       the  starting offset. Note that the sequences \b and \B are one-charac-
        ter lookbehinds.
 
-       In  addition  to checking the format of the string, there is a check to
+       In addition to checking the format of the string, there is a  check  to
        ensure that all code points lie in the range U+0 to U+10FFFF, excluding
-       the  surrogate  area. The so-called "non-character" code points are not
+       the surrogate area. The so-called "non-character" code points  are  not
        excluded because Unicode corrigendum #9 makes it clear that they should
        not be.
 
-       Characters  in  the "Surrogate Area" of Unicode are reserved for use by
-       UTF-16, where they are used in pairs to encode code points with  values
-       greater  than  0xFFFF. The code points that are encoded by UTF-16 pairs
-       are available independently in the  UTF-8  and  UTF-32  encodings.  (In
-       other  words,  the  whole  surrogate  thing is a fudge for UTF-16 which
+       Characters in the "Surrogate Area" of Unicode are reserved for  use  by
+       UTF-16,  where they are used in pairs to encode code points with values
+       greater than 0xFFFF. The code points that are encoded by  UTF-16  pairs
+       are  available  independently  in  the  UTF-8 and UTF-32 encodings. (In
+       other words, the whole surrogate thing is  a  fudge  for  UTF-16  which
        unfortunately messes up UTF-8 and UTF-32.)
 
-       In some situations, you may already know that your strings  are  valid,
-       and  therefore  want  to  skip these checks in order to improve perfor-
-       mance, for example in the case of a long subject string that  is  being
-       scanned  repeatedly.   If you set the PCRE2_NO_UTF_CHECK option at com-
-       pile time or at match time, PCRE2 assumes that the pattern  or  subject
+       In  some  situations, you may already know that your strings are valid,
+       and therefore want to skip these checks in  order  to  improve  perfor-
+       mance,  for  example in the case of a long subject string that is being
+       scanned repeatedly.  If you set the PCRE2_NO_UTF_CHECK option  at  com-
+       pile  time  or at match time, PCRE2 assumes that the pattern or subject
        it is given (respectively) contains only valid UTF code unit sequences.
 
-       Passing  PCRE2_NO_UTF_CHECK  to pcre2_compile() just disables the check
+       Passing PCRE2_NO_UTF_CHECK to pcre2_compile() just disables  the  check
        for the pattern; it does not also apply to subject strings. If you want
-       to  disable the check for a subject string you must pass this option to
+       to disable the check for a subject string you must pass this option  to
        pcre2_match() or pcre2_dfa_match().
 
-       If you pass an invalid UTF string when PCRE2_NO_UTF_CHECK is  set,  the
+       If  you  pass an invalid UTF string when PCRE2_NO_UTF_CHECK is set, the
        result is undefined and your program may crash or loop indefinitely.
 
+       Note that setting PCRE2_NO_UTF_CHECK at compile time does  not  disable
+       the  error  that  is given if an escape sequence for an invalid Unicode
+       code point is encountered in the pattern. If you want to  allow  escape
+       sequences  such  as  \x{d800}  (a surrogate code point) you can set the
+       PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra option. However, this is pos-
+       sible only in UTF-8 and UTF-32 modes, because these values are not rep-
+       resentable in UTF-16.
+
    Errors in UTF-8 strings
 
        The following negative error codes are given for invalid UTF-8 strings:
@@ -9843,10 +10303,10 @@ VALIDITY OF UTF STRINGS
          PCRE2_ERROR_UTF8_ERR4
          PCRE2_ERROR_UTF8_ERR5
 
-       The  string  ends  with a truncated UTF-8 character; the code specifies
-       how many bytes are missing (1 to 5). Although RFC 3629 restricts  UTF-8
-       characters  to  be  no longer than 4 bytes, the encoding scheme (origi-
-       nally defined by RFC 2279) allows for  up  to  6  bytes,  and  this  is
+       The string ends with a truncated UTF-8 character;  the  code  specifies
+       how  many bytes are missing (1 to 5). Although RFC 3629 restricts UTF-8
+       characters to be no longer than 4 bytes, the  encoding  scheme  (origi-
+       nally  defined  by  RFC  2279)  allows  for  up to 6 bytes, and this is
        checked first; hence the possibility of 4 or 5 missing bytes.
 
          PCRE2_ERROR_UTF8_ERR6
@@ -9856,24 +10316,24 @@ VALIDITY OF UTF STRINGS
          PCRE2_ERROR_UTF8_ERR10
 
        The two most significant bits of the 2nd, 3rd, 4th, 5th, or 6th byte of
-       the character do not have the binary value 0b10 (that  is,  either  the
+       the  character  do  not have the binary value 0b10 (that is, either the
        most significant bit is 0, or the next bit is 1).
 
          PCRE2_ERROR_UTF8_ERR11
          PCRE2_ERROR_UTF8_ERR12
 
-       A  character that is valid by the RFC 2279 rules is either 5 or 6 bytes
+       A character that is valid by the RFC 2279 rules is either 5 or 6  bytes
        long; these code points are excluded by RFC 3629.
 
          PCRE2_ERROR_UTF8_ERR13
 
-       A 4-byte character has a value greater than 0x10fff; these code  points
+       A 4-byte character has a value greater than 0x10ffff; these code points
        are excluded by RFC 3629.
 
          PCRE2_ERROR_UTF8_ERR14
 
-       A  3-byte  character  has  a  value in the range 0xd800 to 0xdfff; this
-       range of code points are reserved by RFC 3629 for use with UTF-16,  and
+       A 3-byte character has a value in the  range  0xd800  to  0xdfff;  this
+       range  of code points are reserved by RFC 3629 for use with UTF-16, and
        so are excluded from UTF-8.
 
          PCRE2_ERROR_UTF8_ERR15
@@ -9882,26 +10342,26 @@ VALIDITY OF UTF STRINGS
          PCRE2_ERROR_UTF8_ERR18
          PCRE2_ERROR_UTF8_ERR19
 
-       A  2-, 3-, 4-, 5-, or 6-byte character is "overlong", that is, it codes
-       for a value that can be represented by fewer bytes, which  is  invalid.
-       For  example,  the two bytes 0xc0, 0xae give the value 0x2e, whose cor-
+       A 2-, 3-, 4-, 5-, or 6-byte character is "overlong", that is, it  codes
+       for  a  value that can be represented by fewer bytes, which is invalid.
+       For example, the two bytes 0xc0, 0xae give the value 0x2e,  whose  cor-
        rect coding uses just one byte.
 
          PCRE2_ERROR_UTF8_ERR20
 
        The two most significant bits of the first byte of a character have the
-       binary  value 0b10 (that is, the most significant bit is 1 and the sec-
-       ond is 0). Such a byte can only validly occur as the second  or  subse-
+       binary value 0b10 (that is, the most significant bit is 1 and the  sec-
+       ond  is  0). Such a byte can only validly occur as the second or subse-
        quent byte of a multi-byte character.
 
          PCRE2_ERROR_UTF8_ERR21
 
-       The  first byte of a character has the value 0xfe or 0xff. These values
+       The first byte of a character has the value 0xfe or 0xff. These  values
        can never occur in a valid UTF-8 string.
 
    Errors in UTF-16 strings
 
-       The following  negative  error  codes  are  given  for  invalid  UTF-16
+       The  following  negative  error  codes  are  given  for  invalid UTF-16
        strings:
 
          PCRE2_ERROR_UTF16_ERR1  Missing low surrogate at end of string
@@ -9911,7 +10371,7 @@ VALIDITY OF UTF STRINGS
 
    Errors in UTF-32 strings
 
-       The  following  negative  error  codes  are  given  for  invalid UTF-32
+       The following  negative  error  codes  are  given  for  invalid  UTF-32
        strings:
 
          PCRE2_ERROR_UTF32_ERR1  Surrogate character (0xd800 to 0xdfff)
@@ -9927,8 +10387,8 @@ AUTHOR
 
 REVISION
 
-       Last updated: 03 July 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 17 May 2017
+       Copyright (c) 1997-2017 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
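The UTF-8 checks described in the pcre2unicode text above (truncation, bad continuation bytes, 5- and 6-byte forms, values above 0x10ffff, surrogates, overlong encodings, stray continuation bytes, and 0xfe/0xff lead bytes) can be sketched in standalone C. This is an illustrative sketch only, not PCRE2's own validator: the function name utf8_check, the enum utf8_status, and its values are hypothetical and do not correspond to the PCRE2_ERROR_UTF8_ERRnn codes, and the order of the checks after the truncation test is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result codes for this sketch only; NOT the PCRE2_ERROR_UTF8_ERRnn values. */
enum utf8_status {
  UTF8_OK = 0,
  UTF8_TRUNCATED,          /* string ends inside a character */
  UTF8_BAD_CONTINUATION,   /* a following byte is not 0b10xxxxxx */
  UTF8_TOO_LONG,           /* 5- or 6-byte form, valid only under RFC 2279 */
  UTF8_OUT_OF_RANGE,       /* value greater than 0x10ffff */
  UTF8_SURROGATE,          /* value in the range 0xd800 to 0xdfff */
  UTF8_OVERLONG,           /* value could be coded in fewer bytes */
  UTF8_STRAY_CONTINUATION, /* first byte is 0b10xxxxxx */
  UTF8_BAD_LEAD            /* first byte is 0xfe or 0xff */
};

/* Check len bytes at s. On failure, *erroroffset is set to the offset of the
   first byte of the offending character. */
enum utf8_status utf8_check(const unsigned char *s, size_t len,
                            size_t *erroroffset)
{
  /* Smallest value that needs 1, 2, 3, 4, or 5 additional bytes. */
  static const uint32_t minimum[] =
    { 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000 };
  size_t i = 0;

  assert(s != NULL && erroroffset != NULL);
  while (i < len) {
    unsigned char c = s[i];
    size_t extra, k;
    uint32_t value;

    if (c < 0x80) { i++; continue; }   /* single-byte (ASCII) character */
    *erroroffset = i;
    if ((c & 0xc0) == 0x80) return UTF8_STRAY_CONTINUATION;
    if (c >= 0xfe) return UTF8_BAD_LEAD;

    /* The lead byte gives the number of additional bytes (RFC 2279 scheme). */
    if ((c & 0xe0) == 0xc0)      { extra = 1; value = c & 0x1f; }
    else if ((c & 0xf0) == 0xe0) { extra = 2; value = c & 0x0f; }
    else if ((c & 0xf8) == 0xf0) { extra = 3; value = c & 0x07; }
    else if ((c & 0xfc) == 0xf8) { extra = 4; value = c & 0x03; }
    else                         { extra = 5; value = c & 0x01; }

    /* Truncation is tested first, so 4 or 5 missing bytes are possible. */
    if (len - i - 1 < extra) return UTF8_TRUNCATED;
    for (k = 1; k <= extra; k++) {
      if ((s[i + k] & 0xc0) != 0x80) return UTF8_BAD_CONTINUATION;
      value = (value << 6) | (uint32_t)(s[i + k] & 0x3f);
    }
    if (value < minimum[extra]) return UTF8_OVERLONG;
    if (extra > 3) return UTF8_TOO_LONG;
    if (value > 0x10ffff) return UTF8_OUT_OF_RANGE;
    if (value >= 0xd800 && value <= 0xdfff) return UTF8_SURROGATE;
    i += extra + 1;
  }
  return UTF8_OK;
}
```

For instance, the two bytes 0xc0, 0xae from the text are reported as overlong by this sketch, and the three bytes 0xed, 0xa0, 0x80 (U+D800) are reported as a surrogate.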
diff --git a/doc/pcre2_callout_enumerate.3 b/doc/pcre2_callout_enumerate.3
index 4573bb4..109c9be 100644
--- a/doc/pcre2_callout_enumerate.3
+++ b/doc/pcre2_callout_enumerate.3
@@ -1,4 +1,4 @@
-.TH PCRE2_COMPILE 3 "23 March 2015" "PCRE2 10.20"
+.TH PCRE2_CALLOUT_ENUMERATE 3 "23 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -24,20 +24,21 @@ for success and non-zero otherwise. The arguments are:
   \fIcallout_data\fP   User data that is passed to the callback
 .sp
 The \fIcallback()\fP function is passed a pointer to a data block containing
-the following fields:
+the following fields (not necessarily in this order):
 .sp
-  \fIversion\fP                Block version number
-  \fIpattern_position\fP       Offset to next item in pattern
-  \fInext_item_length\fP       Length of next item in pattern
-  \fIcallout_number\fP         Number for numbered callouts
-  \fIcallout_string_offset\fP  Offset to string within pattern
-  \fIcallout_string_length\fP  Length of callout string
-  \fIcallout_string\fP         Points to callout string or is NULL
+  uint32_t   \fIversion\fP                Block version number
+  uint32_t   \fIcallout_number\fP         Number for numbered callouts
+  PCRE2_SIZE \fIpattern_position\fP       Offset to next item in pattern
+  PCRE2_SIZE \fInext_item_length\fP       Length of next item in pattern
+  PCRE2_SIZE \fIcallout_string_offset\fP  Offset to string within pattern
+  PCRE2_SIZE \fIcallout_string_length\fP  Length of callout string
+  PCRE2_SPTR \fIcallout_string\fP         Points to callout string or is NULL
 .sp
-The second argument is the callout data that was passed to
-\fBpcre2_callout_enumerate()\fP. The \fBcallback()\fP function must return zero
-for success. Any other value causes the pattern scan to stop, with the value
-being passed back as the result of \fBpcre2_callout_enumerate()\fP.
+The second argument passed to the \fBcallback()\fP function is the callout data
+that was passed to \fBpcre2_callout_enumerate()\fP. The \fBcallback()\fP
+function must return zero for success. Any other value causes the pattern scan
+to stop, with the value being passed back as the result of
+\fBpcre2_callout_enumerate()\fP.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
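The callback contract described in the pcre2_callout_enumerate.3 hunk above (zero means continue, any other value stops the scan and becomes the overall result) can be modelled with a toy enumerator. This is a sketch under stated assumptions: every toy_ name is hypothetical, the block layout only mirrors the documented field list with standard C types standing in for PCRE2_SIZE and PCRE2_SPTR, and the real function walks a compiled pattern rather than an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the documented callout enumeration block. */
typedef struct toy_callout_enumerate_block {
  uint32_t version;               /* Block version number */
  uint32_t callout_number;        /* Number for numbered callouts */
  size_t pattern_position;        /* Offset to next item in pattern */
  size_t next_item_length;        /* Length of next item in pattern */
  size_t callout_string_offset;   /* Offset to string within pattern */
  size_t callout_string_length;   /* Length of callout string */
  const unsigned char *callout_string; /* Callout string or NULL */
} toy_callout_enumerate_block;

typedef int (*toy_callback)(toy_callout_enumerate_block *block,
                            void *callout_data);

/* Call callback for each block in turn. A non-zero return from the callback
   stops the scan and is passed back as the result; otherwise the result is 0. */
int toy_callout_enumerate(toy_callout_enumerate_block *blocks, size_t count,
                          toy_callback callback, void *callout_data)
{
  size_t i;
  assert(callback != NULL);
  for (i = 0; i < count; i++) {
    int rc = callback(&blocks[i], callout_data);
    if (rc != 0) return rc;
  }
  return 0;
}

/* Example callback: count callouts through the user data pointer, and stop
   with the arbitrary value 42 when callout number 255 is seen. */
int toy_count_callback(toy_callout_enumerate_block *block, void *callout_data)
{
  size_t *counter = (size_t *)callout_data;
  if (block->callout_number == 255) return 42;
  (*counter)++;
  return 0;
}
```

The user data pointer is the natural place to accumulate results, because the return value is reserved for signalling whether the scan should continue.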
diff --git a/doc/pcre2_code_copy.3 b/doc/pcre2_code_copy.3
index 270b3a6..09b4705 100644
--- a/doc/pcre2_code_copy.3
+++ b/doc/pcre2_code_copy.3
@@ -1,4 +1,4 @@
-.TH PCRE2_CODE_COPY 3 "26 February 2016" "PCRE2 10.22"
+.TH PCRE2_CODE_COPY 3 "22 November 2016" "PCRE2 10.23"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -16,8 +16,9 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 This function makes a copy of the memory used for a compiled pattern, excluding
 any memory used by the JIT compiler. Without a subsequent call to
 \fBpcre2_jit_compile()\fP, the copy can be used only for non-JIT matching. The
-yield of the function is NULL if \fIcode\fP is NULL or if sufficient memory
-cannot be obtained.
+pointer to the character tables is copied, not the tables themselves (see
+\fBpcre2_code_copy_with_tables()\fP). The yield of the function is NULL if
+\fIcode\fP is NULL or if sufficient memory cannot be obtained.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
diff --git a/doc/pcre2_code_copy_with_tables.3 b/doc/pcre2_code_copy_with_tables.3
new file mode 100644
index 0000000..cfbddb3
--- /dev/null
+++ b/doc/pcre2_code_copy_with_tables.3
@@ -0,0 +1,32 @@
+.TH PCRE2_CODE_COPY_WITH_TABLES 3 "22 November 2016" "PCRE2 10.23"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *\fIcode\fP);
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function makes a copy of the memory used for a compiled pattern, excluding
+any memory used by the JIT compiler. Without a subsequent call to
+\fBpcre2_jit_compile()\fP, the copy can be used only for non-JIT matching.
+Unlike \fBpcre2_code_copy()\fP, a separate copy of the character tables is also
+made, with the new code pointing to it. This memory will be automatically freed
+when \fBpcre2_code_free()\fP is called. The yield of the function is NULL if
+\fIcode\fP is NULL or if sufficient memory cannot be obtained.
+.P
+There is a complete description of the PCRE2 native API in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+page and a description of the POSIX API in the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+page.
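The difference between the two copy functions documented above is whether the character tables are shared or duplicated. The following toy model illustrates that ownership rule; it is a sketch, not PCRE2's implementation. The toy_code structure, the owns_tables flag, and the table size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_TABLES_SIZE 256   /* arbitrary size chosen for this sketch */

/* Toy stand-in for a compiled pattern. */
typedef struct toy_code {
  const unsigned char *tables;  /* character tables used by the pattern */
  int owns_tables;              /* non-zero if freeing the code frees tables */
} toy_code;

/* Like pcre2_code_copy(): the pointer to the tables is copied, not the
   tables themselves. Returns NULL if code is NULL or memory runs out. */
toy_code *toy_code_copy(const toy_code *code)
{
  toy_code *copy;
  if (code == NULL) return NULL;
  copy = (toy_code *)malloc(sizeof *copy);
  if (copy == NULL) return NULL;
  copy->tables = code->tables;
  copy->owns_tables = 0;
  return copy;
}

/* Like pcre2_code_copy_with_tables(): a separate copy of the tables is made
   and the new code points to it. */
toy_code *toy_code_copy_with_tables(const toy_code *code)
{
  toy_code *copy;
  unsigned char *newtables;
  if (code == NULL) return NULL;
  copy = (toy_code *)malloc(sizeof *copy);
  if (copy == NULL) return NULL;
  newtables = (unsigned char *)malloc(TOY_TABLES_SIZE);
  if (newtables == NULL) { free(copy); return NULL; }
  memcpy(newtables, code->tables, TOY_TABLES_SIZE);
  copy->tables = newtables;
  copy->owns_tables = 1;
  return copy;
}

/* Like pcre2_code_free(): private tables are freed along with the code. */
void toy_code_free(toy_code *code)
{
  if (code == NULL) return;
  if (code->owns_tables) free((void *)code->tables);
  free(code);
}
```

In this model a shallow copy stays valid only while the original tables exist, whereas a copy made with tables remains usable after the original tables are changed or released.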
diff --git a/doc/pcre2_code_free.3 b/doc/pcre2_code_free.3
index 5127081..7376869 100644
--- a/doc/pcre2_code_free.3
+++ b/doc/pcre2_code_free.3
@@ -1,4 +1,4 @@
-.TH PCRE2_CODE_FREE 3 "29 July 2015" "PCRE2 10.21"
+.TH PCRE2_CODE_FREE 3 "23 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -14,7 +14,9 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .rs
 .sp
 This function frees the memory used for a compiled pattern, including any
-memory used by the JIT compiler.
+memory used by the JIT compiler. If the compiled pattern was created by a call
+to \fBpcre2_code_copy_with_tables()\fP, the memory for the character tables is
+also freed.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
diff --git a/doc/pcre2_compile.3 b/doc/pcre2_compile.3
index 1e0dca5..19f35c3 100644
--- a/doc/pcre2_compile.3
+++ b/doc/pcre2_compile.3
@@ -1,4 +1,4 @@
-.TH PCRE2_COMPILE 3 "22 April 2015" "PCRE2 10.20"
+.TH PCRE2_COMPILE 3 "16 June 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -25,26 +25,34 @@ arguments are:
   \fIerroffset\fP     Where to put an error offset
   \fIccontext\fP      Pointer to a compile context or NULL
 .sp
-The length of the string and any error offset that is returned are in code
-units, not characters. A compile context is needed only if you want to change
+The length of the pattern and any error offset that is returned are in code
+units, not characters. A compile context is needed only if you want to provide
+custom memory allocation functions, or to provide an external function for
+system stack size checking, or to change one or more of these parameters:
 .sp
-  What \eR matches (Unicode newlines or CR, LF, CRLF only)
-  PCRE2's character tables
-  The newline character sequence
-  The compile time nested parentheses limit
+  What \eR matches (Unicode newlines, or CR, LF, CRLF only);
+  PCRE2's character tables;
+  The newline character sequence;
+  The compile time nested parentheses limit;
+  The maximum pattern length (in code units) that is allowed;
+  The additional options bits (see pcre2_set_compile_extra_options()).
 .sp
-or provide an external function for stack size checking. The option bits are:
+The option bits are:
 .sp
   PCRE2_ANCHORED           Force pattern anchoring
+  PCRE2_ALLOW_EMPTY_CLASS  Allow empty classes
   PCRE2_ALT_BSUX           Alternative handling of \eu, \eU, and \ex
   PCRE2_ALT_CIRCUMFLEX     Alternative handling of ^ in multiline mode
+  PCRE2_ALT_VERBNAMES      Process backslashes in verb names
   PCRE2_AUTO_CALLOUT       Compile automatic callouts
   PCRE2_CASELESS           Do caseless matching
   PCRE2_DOLLAR_ENDONLY     $ not to match newline at end
   PCRE2_DOTALL             . matches anything including NL
   PCRE2_DUPNAMES           Allow duplicate names for subpatterns
+  PCRE2_ENDANCHORED        Pattern can match only at end of subject
   PCRE2_EXTENDED           Ignore white space and # comments
   PCRE2_FIRSTLINE          Force matching to be before newline
+  PCRE2_LITERAL            Pattern characters are all literal
   PCRE2_MATCH_UNSET_BACKREF  Match unset back references
   PCRE2_MULTILINE          ^ and $ match newlines within data
   PCRE2_NEVER_BACKSLASH_C  Lock out the use of \eC in patterns
@@ -59,19 +67,21 @@ or provide an external function for stack size checking. The option bits are:
                              (only relevant if PCRE2_UTF is set)
   PCRE2_UCP                Use Unicode properties for \ed, \ew, etc.
   PCRE2_UNGREEDY           Invert greediness of quantifiers
+  PCRE2_USE_OFFSET_LIMIT   Enable offset limit for unanchored matching
   PCRE2_UTF                Treat pattern and subjects as UTF strings
 .sp
-PCRE2 must be built with Unicode support in order to use PCRE2_UTF, PCRE2_UCP
-and related options.
+PCRE2 must be built with Unicode support (the default) in order to use
+PCRE2_UTF, PCRE2_UCP and related options.
 .P
 The yield of the function is a pointer to a private data structure that
 contains the compiled pattern, or NULL if an error was detected.
 .P
-There is a complete description of the PCRE2 native API in the
+There is a complete description of the PCRE2 native API, with more detail on
+each option, in the
 .\" HREF
 \fBpcre2api\fP
 .\"
-page and a description of the POSIX API in the
+page, and a description of the POSIX API in the
 .\" HREF
 \fBpcre2posix\fP
 .\"
diff --git a/doc/pcre2_config.3 b/doc/pcre2_config.3
index 0c29ce6..ab9623d 100644
--- a/doc/pcre2_config.3
+++ b/doc/pcre2_config.3
@@ -1,4 +1,4 @@
-.TH PCRE2_CONFIG 3 "20 April 2014" "PCRE2 10.0"
+.TH PCRE2_CONFIG 3 "16 September 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -31,22 +31,29 @@ point to a uint32_t integer variable. The available codes are:
   PCRE2_CONFIG_BSR             Indicates what \eR matches by default:
                                  PCRE2_BSR_UNICODE
                                  PCRE2_BSR_ANYCRLF
+  PCRE2_CONFIG_COMPILED_WIDTHS Which of 8/16/32 support was compiled
+  PCRE2_CONFIG_DEPTHLIMIT      Default backtracking depth limit
+  PCRE2_CONFIG_HEAPLIMIT       Default heap memory limit
+.\" JOIN
   PCRE2_CONFIG_JIT             Availability of just-in-time compiler
                                 support (1=yes 0=no)
-  PCRE2_CONFIG_JITTARGET       Information about the target archi-
-                                 tecture for the JIT compiler
+.\" JOIN
+  PCRE2_CONFIG_JITTARGET       Information (a string) about the target
+                                 architecture for the JIT compiler
   PCRE2_CONFIG_LINKSIZE        Configured internal link size (2, 3, 4)
   PCRE2_CONFIG_MATCHLIMIT      Default internal resource limit
+  PCRE2_CONFIG_NEVER_BACKSLASH_C  Whether or not \eC is disabled
   PCRE2_CONFIG_NEWLINE         Code for the default newline sequence:
                                  PCRE2_NEWLINE_CR
                                  PCRE2_NEWLINE_LF
                                  PCRE2_NEWLINE_CRLF
                                  PCRE2_NEWLINE_ANY
                                  PCRE2_NEWLINE_ANYCRLF
+                                 PCRE2_NEWLINE_NUL
   PCRE2_CONFIG_PARENSLIMIT     Default parentheses nesting limit
-  PCRE2_CONFIG_RECURSIONLIMIT  Internal recursion depth limit
-  PCRE2_CONFIG_STACKRECURSE    Recursion implementation (1=stack
-                                 0=heap)
+  PCRE2_CONFIG_RECURSIONLIMIT  Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
+  PCRE2_CONFIG_STACKRECURSE    Obsolete: always returns 0
+.\" JOIN
   PCRE2_CONFIG_UNICODE         Availability of Unicode support (1=yes
                                  0=no)
   PCRE2_CONFIG_UNICODE_VERSION The Unicode version (a string)
diff --git a/doc/pcre2_convert_context_copy.3 b/doc/pcre2_convert_context_copy.3
new file mode 100644
index 0000000..827c3e9
--- /dev/null
+++ b/doc/pcre2_convert_context_copy.3
@@ -0,0 +1,26 @@
+.TH PCRE2_CONVERT_CONTEXT_COPY 3 "10 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B pcre2_convert_context *pcre2_convert_context_copy(
+.B "  pcre2_convert_context *\fIcvcontext\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It makes a new copy of a convert context, using the memory allocation function
+that was used for the original context. The result is NULL if the memory cannot
+be obtained.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_convert_context_create.3 b/doc/pcre2_convert_context_create.3
new file mode 100644
index 0000000..91c17fb
--- /dev/null
+++ b/doc/pcre2_convert_context_create.3
@@ -0,0 +1,27 @@
+.TH PCRE2_CONVERT_CONTEXT_CREATE 3 "10 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B pcre2_convert_context *pcre2_convert_context_create(
+.B "  pcre2_general_context *\fIgcontext\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It creates and initializes a new convert context. If its argument is
+NULL, \fBmalloc()\fP is used to get the necessary memory; otherwise the memory
+allocation function within the general context is used. The result is NULL if
+the memory could not be obtained.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_convert_context_free.3 b/doc/pcre2_convert_context_free.3
new file mode 100644
index 0000000..fd5b13c
--- /dev/null
+++ b/doc/pcre2_convert_context_free.3
@@ -0,0 +1,25 @@
+.TH PCRE2_CONVERT_CONTEXT_FREE 3 "10 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B void pcre2_convert_context_free(pcre2_convert_context *\fIcvcontext\fP);
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It frees the memory occupied by a convert context, using the memory
+freeing function from the general context with which it was created, or
+\fBfree()\fP if that was not set.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_converted_pattern_free.3 b/doc/pcre2_converted_pattern_free.3
new file mode 100644
index 0000000..687e078
--- /dev/null
+++ b/doc/pcre2_converted_pattern_free.3
@@ -0,0 +1,25 @@
+.TH PCRE2_CONVERTED_PATTERN_FREE 3 "11 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B void pcre2_converted_pattern_free(PCRE2_UCHAR *\fIconverted_pattern\fP);
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It frees the memory occupied by a converted pattern that was obtained by
+calling \fBpcre2_pattern_convert()\fP with arguments that caused it to place
+the converted pattern into newly obtained heap memory.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_dfa_match.3 b/doc/pcre2_dfa_match.3
index f45da0d..7839145 100644
--- a/doc/pcre2_dfa_match.3
+++ b/doc/pcre2_dfa_match.3
@@ -1,4 +1,4 @@
-.TH PCRE2_DFA_MATCH 3 "12 May 2013" "PCRE2 10.00"
+.TH PCRE2_DFA_MATCH 3 "30 May 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -19,8 +19,9 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 This function matches a compiled regular expression against a given subject
 string, using an alternative matching algorithm that scans the subject string
-just once (\fInot\fP Perl-compatible). (The Perl-compatible matching function
-is \fBpcre2_match()\fP.) The arguments for this function are:
+just once (except when processing lookaround assertions). This function is
+\fInot\fP Perl-compatible (the Perl-compatible matching function is
+\fBpcre2_match()\fP). The arguments for this function are:
 .sp
   \fIcode\fP         Points to the compiled pattern
   \fIsubject\fP      Points to the subject string
@@ -33,22 +34,28 @@ is \fBpcre2_match()\fP.) The arguments for this function are:
   \fIwscount\fP      Number of elements in the vector
 .sp
 For \fBpcre2_dfa_match()\fP, a match context is needed only if you want to set
-up a callout function. The \fIlength\fP and \fIstartoffset\fP values are code
-units, not characters. The options are:
+up a callout function or specify the match and/or the recursion depth limits.
+The \fIlength\fP and \fIstartoffset\fP values are code units, not characters.
+The options are:
 .sp
   PCRE2_ANCHORED          Match only at the first position
+  PCRE2_ENDANCHORED       Pattern can match only at end of subject
   PCRE2_NOTBOL            Subject is not the beginning of a line
   PCRE2_NOTEOL            Subject is not the end of a line
   PCRE2_NOTEMPTY          An empty string is not a valid match
+.\" JOIN
   PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject
                            is not a valid match
+.\" JOIN
   PCRE2_NO_UTF_CHECK      Do not check the subject for UTF
                            validity (only relevant if PCRE2_UTF
                            was set at compile time)
+.\" JOIN
+  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial
+                           match even if there is a full match
+.\" JOIN
   PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial
-                            match if no full matches are found
-  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match
-                           even if there is a full match as well
+                           match if no full matches are found
   PCRE2_DFA_RESTART       Restart after a partial match
   PCRE2_DFA_SHORTEST      Return only the shortest match
 .sp
diff --git a/doc/pcre2_get_error_message.3 b/doc/pcre2_get_error_message.3
index 9378b18..3d3e0de 100644
--- a/doc/pcre2_get_error_message.3
+++ b/doc/pcre2_get_error_message.3
@@ -1,4 +1,4 @@
-.TH PCRE2_GET_ERROR_MESSAGE 3 "17 June 2016" "PCRE2 10.22"
+.TH PCRE2_GET_ERROR_MESSAGE 3 "24 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -22,11 +22,11 @@ errors are negative numbers. The arguments are:
   \fIbuffer\fP      where to put the message
   \fIbufflen\fP     the length of the buffer (code units)
 .sp
-The function returns the length of the message, excluding the trailing zero, or
-the negative error code PCRE2_ERROR_NOMEMORY if the buffer is too small. In
-this case, the returned message is truncated (but still with a trailing zero).
-If \fIerrorcode\fP does not contain a recognized error code number, the
-negative value PCRE2_ERROR_BADDATA is returned.
+The function returns the length of the message in code units, excluding the
+trailing zero, or the negative error code PCRE2_ERROR_NOMEMORY if the buffer is
+too small. In this case, the returned message is truncated (but still with a
+trailing zero). If \fIerrorcode\fP does not contain a recognized error code
+number, the negative value PCRE2_ERROR_BADDATA is returned.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
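The return convention described in the pcre2_get_error_message.3 hunk above (message length on success, a negative code plus a truncated but still zero-terminated message when the buffer is too small, and a different negative code for an unrecognized error number) can be modelled as follows. This is a sketch with hypothetical names and values: the TOY_ constants and message texts are invented and are not the values defined in pcre2.h, and 8-bit code units are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values for this sketch only, not those in pcre2.h. */
#define TOY_CODE_NOMATCH   (-1)
#define TOY_CODE_PARTIAL   (-2)
#define TOY_ERROR_NOMEMORY (-100)
#define TOY_ERROR_BADDATA  (-101)

/* Copy the text for errorcode into buffer (bufflen code units). Returns the
   message length excluding the trailing zero, TOY_ERROR_NOMEMORY if the
   buffer is too small (the message is then truncated but zero-terminated),
   or TOY_ERROR_BADDATA if errorcode is not recognized. */
int toy_get_error_message(int errorcode, char *buffer, size_t bufflen)
{
  const char *message;
  size_t length;

  assert(buffer != NULL);
  if (errorcode == TOY_CODE_NOMATCH) message = "no match";
  else if (errorcode == TOY_CODE_PARTIAL) message = "partial match";
  else return TOY_ERROR_BADDATA;

  if (bufflen == 0) return TOY_ERROR_NOMEMORY;
  length = strlen(message);
  if (length >= bufflen) {
    memcpy(buffer, message, bufflen - 1);
    buffer[bufflen - 1] = '\0';
    return TOY_ERROR_NOMEMORY;
  }
  memcpy(buffer, message, length + 1);
  return (int)length;
}
```

A caller following this convention can treat any non-negative result as the usable length and still print the truncated text when the "no memory" code comes back.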
diff --git a/doc/pcre2_get_mark.3 b/doc/pcre2_get_mark.3
index e741dfe..dce377d 100644
--- a/doc/pcre2_get_mark.3
+++ b/doc/pcre2_get_mark.3
@@ -1,4 +1,4 @@
-.TH PCRE2_GET_MARK 3 "24 October 2014" "PCRE2 10.00"
+.TH PCRE2_GET_MARK 3 "13 October 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -14,11 +14,14 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .rs
 .sp
 After a call of \fBpcre2_match()\fP that was passed the match block that is
-this function's argument, this function returns a pointer to the last (*MARK)
-name that was encountered. The name is zero-terminated, and is within the
-compiled pattern. If no (*MARK) name is available, NULL is returned. A (*MARK)
-name may be available after a failed match or a partial match, as well as after
-a successful one.
+this function's argument, this function returns a pointer to the last (*MARK),
+(*PRUNE), or (*THEN) name that was encountered during the matching process. The
+name is zero-terminated, and is within the compiled pattern. The length of the
+name is in the preceding code unit. If no name is available, NULL is returned.
+.P
+After a successful match, the name that is returned is the last one on the
+matching path. After a failed match or a partial match, the last encountered
+name is returned.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
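The pcre2_get_mark.3 text above says that the length of the returned name is held in the code unit that precedes it. A minimal sketch of reading that length, assuming 8-bit code units, is shown below. The helper name toy_mark_length is hypothetical; with the real library the pointer would come from pcre2_get_mark() and would never be built by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Return the name length stored in the code unit immediately before name.
   Assumes 8-bit code units and a non-NULL pointer that really is preceded
   by a length code unit. */
size_t toy_mark_length(const unsigned char *name)
{
  assert(name != NULL);
  return (size_t)name[-1];
}
```

Because the name is also zero-terminated, the stored length is mainly useful when the name itself may contain binary zeros or when a length-based copy is preferred over scanning for the terminator.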
diff --git a/doc/pcre2_jit_stack_create.3 b/doc/pcre2_jit_stack_create.3
index d530d50..61ccf79 100644
--- a/doc/pcre2_jit_stack_create.3
+++ b/doc/pcre2_jit_stack_create.3
@@ -1,4 +1,4 @@
-.TH PCRE2_JIT_STACK_CREATE 3 "03 November 2014" "PCRE2 10.00"
+.TH PCRE2_JIT_STACK_CREATE 3 "24 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -20,10 +20,9 @@ maximum size to which it is allowed to grow. The final argument is a general
 context, for memory allocation functions, or NULL for standard memory
 allocation. The result can be passed to the JIT run-time code by calling
 \fBpcre2_jit_stack_assign()\fP to associate the stack with a compiled pattern,
-which can then be processed by \fBpcre2_match()\fP. If the "fast path" JIT
-matcher, \fBpcre2_jit_match()\fP is used, the stack can be passed directly as
-an argument. A maximum stack size of 512K to 1M should be more than enough for
-any pattern. For more details, see the
+which can then be processed by \fBpcre2_match()\fP or \fBpcre2_jit_match()\fP.
+A maximum stack size of 512K to 1M should be more than enough for any pattern.
+For more details, see the
 .\" HREF
 \fBpcre2jit\fP
 .\"
diff --git a/doc/pcre2_maketables.3 b/doc/pcre2_maketables.3
index 322dba7..740954b 100644
--- a/doc/pcre2_maketables.3
+++ b/doc/pcre2_maketables.3
@@ -1,4 +1,4 @@
-.TH PCRE2_MAKETABLES 3 "21 October 2014" "PCRE2 10.00"
+.TH PCRE2_MAKETABLES 3 "17 April 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -7,15 +7,15 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .B #include <pcre2.h>
 .PP
 .SM
-.B const unsigned char *pcre2_maketables(pcre22_general_context *\fIgcontext\fP);
+.B const unsigned char *pcre2_maketables(pcre2_general_context *\fIgcontext\fP);
 .
 .SH DESCRIPTION
 .rs
 .sp
-This function builds a set of character tables for character values less than
-256. These can be passed to \fBpcre2_compile()\fP in a compile context in order
-to override the internal, built-in tables (which were either defaulted or made
-by \fBpcre2_maketables()\fP when PCRE2 was compiled). See the
+This function builds a set of character tables for character code points that
+are less than 256. These can be passed to \fBpcre2_compile()\fP in a compile
+context in order to override the internal, built-in tables (which were either
+defaulted or made by \fBpcre2_maketables()\fP when PCRE2 was compiled). See the
 .\" HREF
 \fBpcre2_set_character_tables()\fP
 .\"
diff --git a/doc/pcre2_match.3 b/doc/pcre2_match.3
index f25cace..6f7aefb 100644
--- a/doc/pcre2_match.3
+++ b/doc/pcre2_match.3
@@ -1,4 +1,4 @@
-.TH PCRE2_MATCH 3 "21 October 2014" "PCRE2 10.00"
+.TH PCRE2_MATCH 3 "14 November 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -18,7 +18,13 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 This function matches a compiled regular expression against a given subject
 string, using a matching algorithm that is similar to Perl's. It returns
-offsets to captured substrings. Its arguments are:
+offsets to what it has matched and to captured substrings via the
+\fBmatch_data\fP block, which can be processed by functions with names that
+start with \fBpcre2_get_ovector_...()\fP or \fBpcre2_substring_...()\fP. The
+return from \fBpcre2_match()\fP is one more than the highest numbered capturing
+pair that has been set (for example, 1 if there are no captures), zero if the
+vector of offsets is too small, or a negative error code for no match and other
+errors. The function arguments are:
 .sp
   \fIcode\fP         Points to the compiled pattern
   \fIsubject\fP      Points to the subject string
@@ -31,26 +37,35 @@ offsets to captured substrings. Its arguments are:
 A match context is needed only if you want to:
 .sp
   Set up a callout function
-  Change the limit for calling the internal function \fImatch()\fP
-  Change the limit for calling \fImatch()\fP recursively
-  Set custom memory management when the heap is used for recursion
+  Set a matching offset limit
+  Change the heap memory limit
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management specifically for the match
 .sp
 The \fIlength\fP and \fIstartoffset\fP values are code
-units, not characters. The options are:
+units, not characters. The length may be given as PCRE2_ZERO_TERMINATED for a
+subject that is terminated by a binary zero code unit. The options are:
 .sp
   PCRE2_ANCHORED          Match only at the first position
+  PCRE2_ENDANCHORED       Pattern can match only at end of subject
   PCRE2_NOTBOL            Subject string is not the beginning of a line
   PCRE2_NOTEOL            Subject string is not the end of a line
   PCRE2_NOTEMPTY          An empty string is not a valid match
+.\" JOIN
   PCRE2_NOTEMPTY_ATSTART  An empty string at the start of the subject
                            is not a valid match
+  PCRE2_NO_JIT            Do not use JIT matching
+.\" JOIN
   PCRE2_NO_UTF_CHECK      Do not check the subject for UTF
                            validity (only relevant if PCRE2_UTF
                            was set at compile time)
+.\" JOIN
+  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial
+                           match even if there is a full match
+.\" JOIN
   PCRE2_PARTIAL_SOFT      Return PCRE2_ERROR_PARTIAL for a partial
                             match if no full matches are found
-  PCRE2_PARTIAL_HARD      Return PCRE2_ERROR_PARTIAL for a partial match
-                           if that is found before a full match
 .sp
 For details of partial matching, see the
 .\" HREF
diff --git a/doc/pcre2_match_data_free.3 b/doc/pcre2_match_data_free.3
index 5e4bc62..e22074b 100644
--- a/doc/pcre2_match_data_free.3
+++ b/doc/pcre2_match_data_free.3
@@ -1,4 +1,4 @@
-.TH PCRE2_MATCH_DATA_FREE 3 "24 October 2014" "PCRE2 10.00"
+.TH PCRE2_MATCH_DATA_FREE 3 "25 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -14,8 +14,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .rs
 .sp
 This function frees the memory occupied by a match data block, using the memory
-freeing function from the general context with which it was created, or
-\fBfree()\fP if that was not set.
+freeing function from the general context or compiled pattern with which it was
+created, or \fBfree()\fP if that was not set.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
diff --git a/doc/pcre2_pattern_convert.3 b/doc/pcre2_pattern_convert.3
new file mode 100644
index 0000000..b72acb7
--- /dev/null
+++ b/doc/pcre2_pattern_convert.3
@@ -0,0 +1,55 @@
+.TH PCRE2_PATTERN_CONVERT 3 "11 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_pattern_convert(PCRE2_SPTR \fIpattern\fP, PCRE2_SIZE \fIlength\fP,
+.B "  uint32_t \fIoptions\fP, PCRE2_UCHAR **\fIbuffer\fP,"
+.B "  PCRE2_SIZE *\fIblength\fP, pcre2_convert_context *\fIcvcontext\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It converts a foreign pattern (for example, a glob) into a PCRE2 regular
+expression pattern. Its arguments are:
+.sp
+  \fIpattern\fP     The foreign pattern
+  \fIlength\fP      The length of the input pattern or PCRE2_ZERO_TERMINATED
+  \fIoptions\fP     Option bits
+  \fIbuffer\fP      Pointer to pointer to output buffer, or NULL
+  \fIblength\fP     Pointer to output length field
+  \fIcvcontext\fP   Pointer to a convert context or NULL
+.sp
+The length of the converted pattern (excluding the terminating zero) is
+returned via \fIblength\fP. If \fIbuffer\fP is NULL, the function just returns
+the output length. If \fIbuffer\fP points to a NULL pointer, heap memory is
+obtained for the converted pattern, using the allocator in the context if
+present (or else \fBmalloc()\fP), and the field pointed to by \fIbuffer\fP is
+updated. If \fIbuffer\fP points to a non-NULL field, that must point to a
+buffer whose size is in the variable pointed to by \fIblength\fP. This value is
+updated.
+.P
+The option bits are:
+.sp
+  PCRE2_CONVERT_UTF                     Input is UTF
+  PCRE2_CONVERT_NO_UTF_CHECK            Do not check UTF validity
+  PCRE2_CONVERT_POSIX_BASIC             Convert POSIX basic pattern
+  PCRE2_CONVERT_POSIX_EXTENDED          Convert POSIX extended pattern
+  PCRE2_CONVERT_GLOB                    ) Convert
+  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR  )   various types
+  PCRE2_CONVERT_GLOB_NO_STARSTAR        )     of glob
+.sp
+The return value from \fBpcre2_pattern_convert()\fP is zero on success or a
+non-zero PCRE2 error code.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_pattern_info.3 b/doc/pcre2_pattern_info.3
index 575840b..64bfc45 100644
--- a/doc/pcre2_pattern_info.3
+++ b/doc/pcre2_pattern_info.3
@@ -1,4 +1,4 @@
-.TH PCRE2_PATTERN_INFO 3 "21 November 2015" "PCRE2 10.21"
+.TH PCRE2_PATTERN_INFO 3 "16 December 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -15,7 +15,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 This function returns information about a compiled pattern. Its arguments are:
 .sp
-  \fIcode\fP     Pointer to a compiled regular expression
+  \fIcode\fP     Pointer to a compiled regular expression pattern
   \fIwhat\fP     What information is required
   \fIwhere\fP    Where to put the information
 .sp
@@ -29,25 +29,38 @@ request are as follows:
                                PCRE2_BSR_UNICODE: Unicode line endings
                                PCRE2_BSR_ANYCRLF: CR, LF, or CRLF only
   PCRE2_INFO_CAPTURECOUNT    Number of capturing subpatterns
+.\" JOIN
+  PCRE2_INFO_DEPTHLIMIT      Backtracking depth limit if set,
+                               otherwise PCRE2_ERROR_UNSET
+  PCRE2_INFO_EXTRAOPTIONS    Extra options that were passed in the
+                               compile context
   PCRE2_INFO_FIRSTBITMAP     Bitmap of first code units, or NULL
   PCRE2_INFO_FIRSTCODETYPE   Type of start-of-match information
                                0 nothing set
                                1 first code unit is set
                                2 start of string or after newline
   PCRE2_INFO_FIRSTCODEUNIT   First code unit when type is 1
+  PCRE2_INFO_FRAMESIZE       Size of backtracking frame
   PCRE2_INFO_HASBACKSLASHC   Return 1 if pattern contains \eC
+.\" JOIN
   PCRE2_INFO_HASCRORLF       Return 1 if explicit CR or LF matches
                                exist in the pattern
+.\" JOIN
+  PCRE2_INFO_HEAPLIMIT       Heap memory limit if set,
+                               otherwise PCRE2_ERROR_UNSET
   PCRE2_INFO_JCHANGED        Return 1 if (?J) or (?-J) was used
   PCRE2_INFO_JITSIZE         Size of JIT compiled code, or 0
   PCRE2_INFO_LASTCODETYPE    Type of must-be-present information
                                0 nothing set
                                1 code unit is set
   PCRE2_INFO_LASTCODEUNIT    Last code unit when type is 1
+.\" JOIN
   PCRE2_INFO_MATCHEMPTY      1 if the pattern can match an
                                empty string, 0 otherwise
+.\" JOIN
   PCRE2_INFO_MATCHLIMIT      Match limit if set,
                                otherwise PCRE2_ERROR_UNSET
+.\" JOIN
   PCRE2_INFO_MAXLOOKBEHIND   Length (in characters) of the longest
                                lookbehind assertion
   PCRE2_INFO_MINLENGTH       Lower bound length of matching strings
@@ -60,8 +73,8 @@ request are as follows:
                                PCRE2_NEWLINE_CRLF
                                PCRE2_NEWLINE_ANY
                                PCRE2_NEWLINE_ANYCRLF
-  PCRE2_INFO_RECURSIONLIMIT  Recursion limit if set,
-                               otherwise PCRE2_ERROR_UNSET
+                               PCRE2_NEWLINE_NUL
+  PCRE2_INFO_RECURSIONLIMIT  Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
   PCRE2_INFO_SIZE            Size of compiled pattern
 .sp
 If \fIwhere\fP is NULL, the function returns the amount of memory needed for
diff --git a/doc/pcre2_set_callout.3 b/doc/pcre2_set_callout.3
index 2f86f69..cb48e14 100644
--- a/doc/pcre2_set_callout.3
+++ b/doc/pcre2_set_callout.3
@@ -1,4 +1,4 @@
-.TH PCRE2_SET_CALLOUT 3 "24 October 2014" "PCRE2 10.00"
+.TH PCRE2_SET_CALLOUT 3 "21 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -17,7 +17,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 This function sets the callout fields in a match context (the first argument).
 The second argument specifies a callout function, and the third argument is an
-opaque data time that is passed to it. The result of this function is always
+opaque data item that is passed to it. The result of this function is always
 zero.
 .P
 There is a complete description of the PCRE2 native API in the
diff --git a/doc/pcre2_set_compile_extra_options.3 b/doc/pcre2_set_compile_extra_options.3
new file mode 100644
index 0000000..1d73a8f
--- /dev/null
+++ b/doc/pcre2_set_compile_extra_options.3
@@ -0,0 +1,38 @@
+.TH PCRE2_SET_COMPILE_EXTRA_OPTIONS 3 "16 June 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_compile_extra_options(pcre2_compile_context *\fIccontext\fP,
+.B "  uint32_t \fIextra_options\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function sets additional option bits for \fBpcre2_compile()\fP that are
+housed in a compile context. It completely replaces all the bits. The extra
+options are:
+.sp
+.\" JOIN
+  PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES  Allow \ex{d800} to \ex{dfff}
+                                         in UTF-8 and UTF-32 modes
+.\" JOIN
+  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL    Treat each invalid escape as the
+                                         literal character that follows
+  PCRE2_EXTRA_MATCH_LINE               Pattern matches whole lines
+  PCRE2_EXTRA_MATCH_WORD               Pattern matches "words"
+.sp
+There is a complete description of the PCRE2 native API in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+page and a description of the POSIX API in the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+page.
diff --git a/doc/pcre2_set_depth_limit.3 b/doc/pcre2_set_depth_limit.3
new file mode 100644
index 0000000..62bc7fe
--- /dev/null
+++ b/doc/pcre2_set_depth_limit.3
@@ -0,0 +1,28 @@
+.TH PCRE2_SET_DEPTH_LIMIT 3 "25 March 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_depth_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function sets the backtracking depth limit field in a match context. The
+result is always zero.
+.P
+There is a complete description of the PCRE2 native API in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+page and a description of the POSIX API in the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+page.
diff --git a/doc/pcre2_set_glob_escape.3 b/doc/pcre2_set_glob_escape.3
new file mode 100644
index 0000000..d5637af
--- /dev/null
+++ b/doc/pcre2_set_glob_escape.3
@@ -0,0 +1,29 @@
+.TH PCRE2_SET_GLOB_ESCAPE 3 "11 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_glob_escape(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIescape_char\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It sets the escape character that is used when converting globs. The second
+argument must either be zero (meaning there is no escape character) or a
+punctuation character whose code point is less than 256. The default is grave
+accent if running under Windows, otherwise backslash. The result of the
+function is zero for success or PCRE2_ERROR_BADDATA if the second argument is
+invalid.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_set_glob_separator.3 b/doc/pcre2_set_glob_separator.3
new file mode 100644
index 0000000..273b515
--- /dev/null
+++ b/doc/pcre2_set_glob_separator.3
@@ -0,0 +1,28 @@
+.TH PCRE2_SET_GLOB_SEPARATOR 3 "11 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_glob_separator(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIseparator_char\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function is part of an experimental set of pattern conversion functions.
+It sets the component separator character that is used when converting globs.
+The second argument must be one of the characters forward slash, backslash, or
+dot. The default is backslash when running under Windows, otherwise forward
+slash. The result of the function is zero for success or PCRE2_ERROR_BADDATA if
+the second argument is invalid.
+.P
+The pattern conversion functions are described in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
diff --git a/doc/pcre2_set_heap_limit.3 b/doc/pcre2_set_heap_limit.3
new file mode 100644
index 0000000..a99b4ab
--- /dev/null
+++ b/doc/pcre2_set_heap_limit.3
@@ -0,0 +1,28 @@
+.TH PCRE2_SET_HEAP_LIMIT 3 "11 April 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_heap_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function sets the backtracking heap limit field in a match context. The
+result is always zero.
+.P
+There is a complete description of the PCRE2 native API in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+page and a description of the POSIX API in the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+page.
diff --git a/doc/pcre2_set_max_pattern_length.3 b/doc/pcre2_set_max_pattern_length.3
new file mode 100644
index 0000000..7aa01c7
--- /dev/null
+++ b/doc/pcre2_set_max_pattern_length.3
@@ -0,0 +1,31 @@
+.TH PCRE2_SET_MAX_PATTERN_LENGTH 3 "05 October 2016" "PCRE2 10.23"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH SYNOPSIS
+.rs
+.sp
+.B #include <pcre2.h>
+.PP
+.nf
+.B int pcre2_set_max_pattern_length(pcre2_compile_context *\fIccontext\fP,
+.B "  PCRE2_SIZE \fIvalue\fP);"
+.fi
+.
+.SH DESCRIPTION
+.rs
+.sp
+This function sets, in a compile context, the maximum text length (in code
+units) of the pattern that can be compiled. The result is always zero. If a
+longer pattern is passed to \fBpcre2_compile()\fP there is an immediate error
+return. The default is effectively unlimited, being the largest value a
+PCRE2_SIZE variable can hold.
+.P
+There is a complete description of the PCRE2 native API in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+page and a description of the POSIX API in the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+page.
diff --git a/doc/pcre2_set_newline.3 b/doc/pcre2_set_newline.3
index 8237500..0bccfc7 100644
--- a/doc/pcre2_set_newline.3
+++ b/doc/pcre2_set_newline.3
@@ -1,4 +1,4 @@
-.TH PCRE2_SET_NEWLINE 3 "22 October 2014" "PCRE2 10.00"
+.TH PCRE2_SET_NEWLINE 3 "26 May 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -23,6 +23,7 @@ matching patterns. The second argument must be one of:
   PCRE2_NEWLINE_CRLF      CR followed by LF only
   PCRE2_NEWLINE_ANYCRLF   Any of the above
   PCRE2_NEWLINE_ANY       Any Unicode newline sequence
+  PCRE2_NEWLINE_NUL       The NUL character (binary zero)
 .sp
 The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
 invalid.
diff --git a/doc/pcre2_set_recursion_limit.3 b/doc/pcre2_set_recursion_limit.3
index ab1f3cd..26f4257 100644
--- a/doc/pcre2_set_recursion_limit.3
+++ b/doc/pcre2_set_recursion_limit.3
@@ -1,4 +1,4 @@
-.TH PCRE2_SET_RECURSION_LIMIT 3 "24 October 2014" "PCRE2 10.00"
+.TH PCRE2_SET_RECURSION_LIMIT 3 "25 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -14,8 +14,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .SH DESCRIPTION
 .rs
 .sp
-This function sets the recursion limit field in a match context. The result is
-always zero.
+This function is obsolete and should not be used in new code. Use
+\fBpcre2_set_depth_limit()\fP instead.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
diff --git a/doc/pcre2_set_recursion_memory_management.3 b/doc/pcre2_set_recursion_memory_management.3
index 9b5887a..12f175d 100644
--- a/doc/pcre2_set_recursion_memory_management.3
+++ b/doc/pcre2_set_recursion_memory_management.3
@@ -1,4 +1,4 @@
-.TH PCRE2_SET_RECURSION_MEMORY_MANAGEMENT 3 "24 October 2014" "PCRE2 10.00"
+.TH PCRE2_SET_RECURSION_MEMORY_MANAGEMENT 3 "25 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -16,13 +16,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .SH DESCRIPTION
 .rs
 .sp
-This function sets the match context fields for custom memory management when
-PCRE2 is compiled to use the heap instead of the system stack for recursive
-function calls while matching. When PCRE2 is compiled to use the stack (the
-default) this function does nothing. The first argument is a match context, the
-second and third specify the memory allocation and freeing functions, and the
-final argument is an opaque value that is passed to them whenever they are
-called. The result of this function is always zero.
+From release 10.30 onwards, this function is obsolete and does nothing. The
+result is always zero.
 .P
 There is a complete description of the PCRE2 native API in the
 .\" HREF
diff --git a/doc/pcre2_substitute.3 b/doc/pcre2_substitute.3
index e69e0cc..7da668c 100644
--- a/doc/pcre2_substitute.3
+++ b/doc/pcre2_substitute.3
@@ -1,4 +1,4 @@
-.TH PCRE2_SUBSTITUTE 3 "12 December 2015" "PCRE2 10.21"
+.TH PCRE2_SUBSTITUTE 3 "04 April 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -35,24 +35,32 @@ Its arguments are:
   \fIoutputbuffer\fP  Points to the output buffer
   \fIoutlengthptr\fP  Points to the length of the output buffer
 .sp
-A match context is needed only if you want to:
+A match data block is needed only if you want to inspect the data from the
+match that is returned in that block. A match context is needed only if you
+want to:
 .sp
   Set up a callout function
-  Change the limit for calling the internal function \fImatch()\fP
-  Change the limit for calling \fImatch()\fP recursively
-  Set custom memory management when the heap is used for recursion
+  Set a matching offset limit
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management in the match context
 .sp
 The \fIlength\fP, \fIstartoffset\fP and \fIrlength\fP values are code
 units, not characters, as is the contents of the variable pointed at by
 \fIoutlengthptr\fP, which is updated to the actual length of the new string.
-The options are:
+The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
+zero-terminated strings. The options are:
 .sp
   PCRE2_ANCHORED             Match only at the first position
+  PCRE2_ENDANCHORED          Pattern can match only at end of subject
   PCRE2_NOTBOL               Subject is not the beginning of a line
   PCRE2_NOTEOL               Subject is not the end of a line
   PCRE2_NOTEMPTY             An empty string is not a valid match
+.\" JOIN
   PCRE2_NOTEMPTY_ATSTART     An empty string at the start of the
                               subject is not a valid match
+  PCRE2_NO_JIT               Do not use JIT matching
+.\" JOIN
   PCRE2_NO_UTF_CHECK         Do not check the subject or replacement
                               for UTF validity (only relevant if
                               PCRE2_UTF was set at compile time)
diff --git a/doc/pcre2api.3 b/doc/pcre2api.3
index db61ea0..786b314 100644
--- a/doc/pcre2api.3
+++ b/doc/pcre2api.3
@@ -1,11 +1,11 @@
-.TH PCRE2API 3 "17 June 2016" "PCRE2 10.22"
+.TH PCRE2API 3 "31 December 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 .B #include <pcre2.h>
 .sp
-PCRE2 is a new API for PCRE. This document contains a description of all its
-functions. See the
+PCRE2 is a new API for PCRE, starting at release 10.0. This document contains a
+description of all its native functions. See the
 .\" HREF
 \fBpcre2\fP
 .\"
@@ -90,6 +90,9 @@ document for an overview of all the PCRE2 documentation.
 .B int pcre2_set_character_tables(pcre2_compile_context *\fIccontext\fP,
 .B "  const unsigned char *\fItables\fP);"
 .sp
+.B int pcre2_set_compile_extra_options(pcre2_compile_context *\fIccontext\fP,
+.B "  uint32_t \fIextra_options\fP);"
+.sp
 .B int pcre2_set_max_pattern_length(pcre2_compile_context *\fIccontext\fP,
 .B "  PCRE2_SIZE \fIvalue\fP);"
 .sp
@@ -120,19 +123,17 @@ document for an overview of all the PCRE2 documentation.
 .B "  int (*\fIcallout_function\fP)(pcre2_callout_block *, void *),"
 .B "  void *\fIcallout_data\fP);"
 .sp
-.B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
-.B "  uint32_t \fIvalue\fP);"
-.sp
 .B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
 .B "  PCRE2_SIZE \fIvalue\fP);"
 .sp
-.B int pcre2_set_recursion_limit(pcre2_match_context *\fImcontext\fP,
+.B int pcre2_set_heap_limit(pcre2_match_context *\fImcontext\fP,
 .B "  uint32_t \fIvalue\fP);"
 .sp
-.B int pcre2_set_recursion_memory_management(
-.B "  pcre2_match_context *\fImcontext\fP,"
-.B "  void *(*\fIprivate_malloc\fP)(PCRE2_SIZE, void *),"
-.B "  void (*\fIprivate_free\fP)(void *, void *), void *\fImemory_data\fP);"
+.B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
+.sp
+.B int pcre2_set_depth_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
 .fi
 .
 .
@@ -235,6 +236,8 @@ document for an overview of all the PCRE2 documentation.
 .nf
 .B pcre2_code *pcre2_code_copy(const pcre2_code *\fIcode\fP);
 .sp
+.B pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *\fIcode\fP);
+.sp
 .B int pcre2_get_error_message(int \fIerrorcode\fP, PCRE2_UCHAR *\fIbuffer\fP,
 .B "  PCRE2_SIZE \fIbufflen\fP);"
 .sp
@@ -250,6 +253,60 @@ document for an overview of all the PCRE2 documentation.
 .fi
 .
 .
+.SH "PCRE2 NATIVE API OBSOLETE FUNCTIONS"
+.rs
+.sp
+.nf
+.B int pcre2_set_recursion_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
+.sp
+.B int pcre2_set_recursion_memory_management(
+.B "  pcre2_match_context *\fImcontext\fP,"
+.B "  void *(*\fIprivate_malloc\fP)(PCRE2_SIZE, void *),"
+.B "  void (*\fIprivate_free\fP)(void *, void *), void *\fImemory_data\fP);"
+.fi
+.sp
+These functions became obsolete at release 10.30 and are retained only for
+backward compatibility. They should not be used in new code. The first is
+replaced by \fBpcre2_set_depth_limit()\fP; the second is no longer needed and
+has no effect (it always returns zero).
+.
+.
+.SH "PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS"
+.rs
+.sp
+.nf
+.B pcre2_convert_context *pcre2_convert_context_create(
+.B "  pcre2_general_context *\fIgcontext\fP);"
+.sp
+.B pcre2_convert_context *pcre2_convert_context_copy(
+.B "  pcre2_convert_context *\fIcvcontext\fP);"
+.sp
+.B void pcre2_convert_context_free(pcre2_convert_context *\fIcvcontext\fP);
+.sp
+.B int pcre2_set_glob_escape(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIescape_char\fP);"
+.sp
+.B int pcre2_set_glob_separator(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIseparator_char\fP);"
+.sp
+.B int pcre2_pattern_convert(PCRE2_SPTR \fIpattern\fP, PCRE2_SIZE \fIlength\fP,
+.B "  uint32_t \fIoptions\fP, PCRE2_UCHAR **\fIbuffer\fP,"
+.B "  PCRE2_SIZE *\fIblength\fP, pcre2_convert_context *\fIcvcontext\fP);"
+.sp
+.B void pcre2_converted_pattern_free(PCRE2_UCHAR *\fIconverted_pattern\fP);
+.fi
+.sp
+These functions provide a way of converting non-PCRE2 patterns into
+patterns that can be processed by \fBpcre2_compile()\fP. This facility is
+experimental and may be changed in future releases. At present, "globs" and
+POSIX basic and extended patterns can be converted. Details are given in the
+.\" HREF
+\fBpcre2convert\fP
+.\"
+documentation.
+.
+.
 .SH "PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES"
 .rs
 .sp
@@ -300,11 +357,11 @@ When using multiple libraries in an application, you must take care when
 processing any particular pattern to use only functions from a single library.
 For example, if you want to run a match using a pattern that was compiled with
 \fBpcre2_compile_16()\fP, you must do so with \fBpcre2_match_16()\fP, not
-\fBpcre2_match_8()\fP.
+\fBpcre2_match_8()\fP or \fBpcre2_match_32()\fP.
 .P
 In the function summaries above, and in the rest of this document and other
 PCRE2 documents, functions and data types are described using their generic
-names, without the 8, 16, or 32 suffix.
+names, without the _8, _16, or _32 suffix.
 .
 .
 .SH "PCRE2 API OVERVIEW"
@@ -313,23 +370,23 @@ names, without the 8, 16, or 32 suffix.
 PCRE2 has its own native API, which is described in this document. There are
 also some wrapper functions for the 8-bit library that correspond to the
 POSIX regular expression API, but they do not give access to all the
-functionality. They are described in the
+functionality of PCRE2. They are described in the
 .\" HREF
 \fBpcre2posix\fP
 .\"
 documentation. Both these APIs define a set of C function calls.
 .P
 The native API C data types, function prototypes, option values, and error
-codes are defined in the header file \fBpcre2.h\fP, which contains definitions
-of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers for the
-library. Applications can use these to include support for different releases
-of PCRE2.
+codes are defined in the header file \fBpcre2.h\fP, which also contains
+definitions of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers
+for the library. Applications can use these to include support for different
+releases of PCRE2.
 .P
 In a Windows environment, if you want to statically link an application program
 against a non-dll PCRE2 library, you must define PCRE2_STATIC before including
 \fBpcre2.h\fP.
 .P
-The functions \fBpcre2_compile()\fP, and \fBpcre2_match()\fP are used for
+The functions \fBpcre2_compile()\fP and \fBpcre2_match()\fP are used for
 compiling and matching regular expressions in a Perl-compatible manner. A
 sample program that demonstrates the simplest way of using them is provided in
 the file called \fIpcre2demo.c\fP in the PCRE2 source distribution. A listing
@@ -343,10 +400,16 @@ documentation, and the
 .\"
 documentation describes how to compile and run it.
 .P
-Just-in-time compiler support is an optional feature of PCRE2 that can be built
-in appropriate hardware environments. It greatly speeds up the matching
+The compiling and matching functions recognize various options that are passed
+as bits in an options argument. There are also some more complicated parameters
+such as custom memory management functions and resource limits that are passed
+in "contexts" (which are just memory blocks, described below). Simple
+applications do not need to make use of contexts.
+.P
+Just-in-time (JIT) compiler support is an optional feature of PCRE2 that can be
+built in appropriate hardware environments. It greatly speeds up the matching
 performance of many patterns. Programs can request that it be used if
-available, by calling \fBpcre2_jit_compile()\fP after a pattern has been
+available by calling \fBpcre2_jit_compile()\fP after a pattern has been
 successfully compiled by \fBpcre2_compile()\fP. This does nothing if JIT
 support is not available.
 .P
@@ -356,8 +419,8 @@ More complicated programs might need to make use of the specialist functions
 .P
 JIT matching is automatically used by \fBpcre2_match()\fP if it is available,
 unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
-matching, which gives improved performance. The JIT-specific functions are
-discussed in the
+matching, which gives improved performance at the expense of less sanity
+checking. The JIT-specific functions are discussed in the
 .\" HREF
 \fBpcre2jit\fP
 .\"
@@ -367,7 +430,7 @@ A second matching function, \fBpcre2_dfa_match()\fP, which is not
 Perl-compatible, is also provided. This uses a different algorithm for the
 matching. The alternative algorithm finds all possible matches (at a given
 point in the subject), and scans the subject just once (unless there are
-lookbehind assertions). However, this algorithm does not return captured
+lookaround assertions). However, this algorithm does not return captured
 substrings. A description of the two matching algorithms and their advantages
 and disadvantages is given in the
 .\" HREF
@@ -390,7 +453,7 @@ been matched by \fBpcre2_match()\fP. They are:
   \fBpcre2_substring_number_from_name()\fP
 .sp
 \fBpcre2_substring_free()\fP and \fBpcre2_substring_list_free()\fP are also
-provided, to free the memory used for extracted strings.
+provided, to free memory used for extracted strings.
 .P
 The function \fBpcre2_substitute()\fP can be called to match a pattern and
 return a copy of the subject string with substitutions for parts that were
@@ -482,8 +545,8 @@ and does not change when the pattern is matched. Therefore, it is thread-safe,
 that is, the same compiled pattern can be used by more than one thread
 simultaneously. For example, an application can compile all its patterns at the
 start, before forking off multiple threads that use them. However, if the
-just-in-time optimization feature is being used, it needs separate memory stack
-areas for each thread. See the
+just-in-time (JIT) optimization feature is being used, it needs separate memory
+stack areas for each thread. See the
 .\" HREF
 \fBpcre2jit\fP
 .\"
@@ -509,8 +572,9 @@ If JIT is being used, but the JIT compilation is not being done immediately,
 (perhaps waiting to see if the pattern is used often enough) similar logic is
 required. JIT compilation updates a pointer within the compiled code block, so
 a thread must gain unique write access to the pointer before calling
-\fBpcre2_jit_compile()\fP. Alternatively, \fBpcre2_code_copy()\fP can be used
-to obtain a private copy of the compiled code.
+\fBpcre2_jit_compile()\fP. Alternatively, \fBpcre2_code_copy()\fP or
+\fBpcre2_code_copy_with_tables()\fP can be used to obtain a private copy of the
+compiled code before calling the JIT compiler.
 .
 .
 .SS "Context blocks"
@@ -533,10 +597,10 @@ thread-specific copy.
 .SS "Match blocks"
 .rs
 .sp
-The matching functions need a block of memory for working space and for storing
-the results of a match. This includes details of what was matched, as well as
-additional information such as the name of a (*MARK) setting. Each thread must
-provide its own copy of this memory.
+The matching functions need a block of memory for storing the results of a
+match. This includes details of what was matched, as well as additional
+information such as the name of a (*MARK) setting. Each thread must provide its
+own copy of this memory.
 .
 .
 .SH "PCRE2 CONTEXTS"
@@ -608,15 +672,16 @@ The memory used for a general context should be freed by calling:
 .SS "The compile context"
 .rs
 .sp
-A compile context is required if you want to change the default values of any
-of the following compile-time parameters:
+A compile context is required if you want to provide an external function for
+stack checking during compilation or to change the default values of any of the
+following compile-time parameters:
 .sp
   What \eR matches (Unicode newlines or CR, LF, CRLF only)
   PCRE2's character tables
   The newline character sequence
   The compile time nested parentheses limit
   The maximum length of the pattern string
-  An external function for stack checking
+  The extra options bits (none set by default)
 .sp
 A compile context is also required if you are using custom memory management.
 If none of these apply, just pass NULL as the context argument of
@@ -659,15 +724,32 @@ argument is a general context. This function builds a set of character tables
 in the current locale.
 .sp
 .nf
+.B int pcre2_set_compile_extra_options(pcre2_compile_context *\fIccontext\fP,
+.B "  uint32_t \fIextra_options\fP);"
+.fi
+.sp
+As PCRE2 has developed, almost all the 32 option bits that are available in
+the \fIoptions\fP argument of \fBpcre2_compile()\fP have been used up. To avoid
+running out, the compile context contains a set of extra option bits which are
+used for some newer, assumed rarer, options. This function sets those bits. It
+always sets all the bits (either on or off); that is, the new value replaces
+any existing setting rather than being combined with it. The available options
+are defined in the section entitled "Extra compile options"
+.\" HTML <a href="#extracompileoptions">
+.\" </a>
+below.
+.\"
+.sp
+.nf
 .B int pcre2_set_max_pattern_length(pcre2_compile_context *\fIccontext\fP,
 .B "  PCRE2_SIZE \fIvalue\fP);"
 .fi
 .sp
-This sets a maximum length, in code units, for the pattern string that is to be
-compiled. If the pattern is longer, an error is generated. This facility is
-provided so that applications that accept patterns from external sources can
-limit their size. The default is the largest number that a PCRE2_SIZE variable
-can hold, which is effectively unlimited.
+This sets a maximum length, in code units, for any pattern string that is
+compiled with this context. If the pattern is longer, an error is generated.
+This facility is provided so that applications that accept patterns from
+external sources can limit their size. The default is the largest number that a
+PCRE2_SIZE variable can hold, which is effectively unlimited.
 .sp
 .nf
 .B int pcre2_set_newline(pcre2_compile_context *\fIccontext\fP,
@@ -677,14 +759,22 @@ can hold, which is effectively unlimited.
 This specifies which characters or character sequences are to be recognized as
 newlines. The value must be one of PCRE2_NEWLINE_CR (carriage return only),
 PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the two-character
-sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above), or
-PCRE2_NEWLINE_ANY (any Unicode newline sequence).
+sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above),
+PCRE2_NEWLINE_ANY (any Unicode newline sequence), or PCRE2_NEWLINE_NUL (the
+NUL character, that is, a binary zero).
 .P
-When a pattern is compiled with the PCRE2_EXTENDED option, the value of this
-parameter affects the recognition of white space and the end of internal
-comments starting with #. The value is saved with the compiled pattern for
-subsequent use by the JIT compiler and by the two interpreted matching
-functions, \fIpcre2_match()\fP and \fIpcre2_dfa_match()\fP.
+A pattern can override the value set in the compile context by starting with a
+sequence such as (*CRLF). See the
+.\" HREF
+\fBpcre2pattern\fP
+.\"
+page for details.
+.P
+When a pattern is compiled with the PCRE2_EXTENDED or PCRE2_EXTENDED_MORE
+option, the newline convention affects the recognition of white space and the
+end of internal comments starting with #. The value is saved with the compiled
+pattern for subsequent use by the JIT compiler and by the two interpreted
+matching functions, \fBpcre2_match()\fP and \fBpcre2_dfa_match()\fP.
 .sp
 .nf
 .B int pcre2_set_parens_nest_limit(pcre2_compile_context *\fIccontext\fP,
@@ -693,7 +783,8 @@ functions, \fIpcre2_match()\fP and \fIpcre2_dfa_match()\fP.
 .sp
 This parameter adjusts the limit, set when PCRE2 is built (default 250), on the
 depth of parenthesis nesting in a pattern. This limit stops rogue patterns
-using up too much system stack when being compiled.
+using up too much system stack when being compiled. The limit applies to
+parentheses of all kinds, not just capturing parentheses.
 .sp
 .nf
 .B int pcre2_set_compile_recursion_guard(pcre2_compile_context *\fIccontext\fP,
@@ -703,10 +794,10 @@ using up too much system stack when being compiled.
 There is at least one application that runs PCRE2 in threads with very limited
 system stack, where running out of stack is to be avoided at all costs. The
 parenthesis limit above cannot take account of how much stack is actually
-available. For a finer control, you can supply a function that is called
-whenever \fBpcre2_compile()\fP starts to compile a parenthesized part of a
-pattern. This function can check the actual stack size (or anything else that
-it wants to, of course).
+available during compilation. For finer control, you can supply a function
+that is called whenever \fBpcre2_compile()\fP starts to compile a parenthesized
+part of a pattern. This function can check the actual stack size (or anything
+else that it wants to, of course).
 .P
 The first argument to the callout function gives the current depth of
 nesting, and the second is user data that is set up by the last argument of
@@ -718,15 +809,15 @@ zero if all is well, or non-zero to force an error.
 .SS "The match context"
 .rs
 .sp
-A match context is required if you want to change the default values of any
-of the following match-time parameters:
+A match context is required if you want to:
 .sp
-  A callout function
-  The offset limit for matching an unanchored pattern
-  The limit for calling \fBmatch()\fP (see below)
-  The limit for calling \fBmatch()\fP recursively
+  Set up a callout function
+  Set an offset limit for matching an unanchored pattern
+  Change the limit on the amount of heap used when matching
+  Change the backtracking match limit
+  Change the backtracking depth limit
+  Set custom memory management specifically for the match
 .sp
-A match context is also required if you are using custom memory management.
 If none of these apply, just pass NULL as the context argument of
 \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP, or \fBpcre2_jit_match()\fP.
 .P
@@ -752,7 +843,7 @@ PCRE2_ERROR_BADDATA if invalid data is detected.
 .B "  void *\fIcallout_data\fP);"
 .fi
 .sp
-This sets up a "callout" function, which PCRE2 will call at specified points
+This sets up a "callout" function for PCRE2 to call at specified points
 during a matching operation. Details are given in the
 .\" HREF
 \fBpcre2callout\fP
@@ -768,22 +859,61 @@ The \fIoffset_limit\fP parameter limits how far an unanchored search can
 advance in the subject string. The default value is PCRE2_UNSET. The
 \fBpcre2_match()\fP and \fBpcre2_dfa_match()\fP functions return
 PCRE2_ERROR_NOMATCH if a match with a starting point before or at the given
-offset is not found. For example, if the pattern /abc/ is matched against
-"123abc" with an offset limit less than 3, the result is PCRE2_ERROR_NO_MATCH.
-A match can never be found if the \fIstartoffset\fP argument of
-\fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP is greater than the offset
-limit.
-.P
-When using this facility, you must set PCRE2_USE_OFFSET_LIMIT when calling
-\fBpcre2_compile()\fP so that when JIT is in use, different code can be
+offset is not found. The \fBpcre2_substitute()\fP function makes no more
+substitutions.
+.P
+For example, if the pattern /abc/ is matched against "123abc" with an offset
+limit less than 3, the result is PCRE2_ERROR_NOMATCH. A match can never be
+found if the \fIstartoffset\fP argument of \fBpcre2_match()\fP,
+\fBpcre2_dfa_match()\fP, or \fBpcre2_substitute()\fP is greater than the offset
+limit set in the match context.
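The offset-limit rule described above can be modelled with a plain substring search. This is a sketch only: it matches a literal needle rather than a pattern, and find_with_offset_limit(), NO_LIMIT, and NOT_FOUND are hypothetical names that are not part of PCRE2. NO_LIMIT stands in for the role that PCRE2_UNSET plays for the real offset limit.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: a literal search, not a regular expression match.
   These names are illustrative and are not defined by PCRE2. */
#define NO_LIMIT  ((size_t)-1)
#define NOT_FOUND ((size_t)-1)

/* Return the offset of the first occurrence of needle that starts at or
   after startoffset and at or before offset_limit, or NOT_FOUND. */
size_t find_with_offset_limit(const char *subject, const char *needle,
                              size_t startoffset, size_t offset_limit)
{
    size_t slen = strlen(subject);
    size_t nlen = strlen(needle);
    size_t pos;

    if (nlen > slen) return NOT_FOUND;
    for (pos = startoffset; pos + nlen <= slen; pos++) {
        /* A match may start at the limit itself, but not beyond it. */
        if (offset_limit != NO_LIMIT && pos > offset_limit) break;
        if (memcmp(subject + pos, needle, nlen) == 0) return pos;
    }
    return NOT_FOUND;
}
```

With the subject "123abc" and the needle "abc", a limit of 2 gives NOT_FOUND and a limit of 3 gives offset 3. A startoffset greater than the limit can never produce a match.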
+.P
+When using this facility, you must set the PCRE2_USE_OFFSET_LIMIT option when
+calling \fBpcre2_compile()\fP so that when JIT is in use, different code can be
 compiled. If a match is started with a non-default offset limit when
 PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
 .P
 The offset limit facility can be used to track progress when searching large
-subject strings. See also the PCRE2_FIRSTLINE option, which requires a match to
-start within the first line of the subject. If this is set with an offset
-limit, a match must occur in the first line and also within the offset limit.
-In other words, whichever limit comes first is used.
+subject strings or to limit the extent of global substitutions. See also the
+PCRE2_FIRSTLINE option, which requires a match to start before or at the first
+newline that follows the start of matching in the subject. If this is set with
+an offset limit, a match must occur in the first line and also within the
+offset limit. In other words, whichever limit comes first is used.
+.sp
+.nf
+.B int pcre2_set_heap_limit(pcre2_match_context *\fImcontext\fP,
+.B "  uint32_t \fIvalue\fP);"
+.fi
+.sp
+The \fIheap_limit\fP parameter specifies, in units of kilobytes, the maximum
+amount of heap memory that \fBpcre2_match()\fP may use to hold backtracking
+information when running an interpretive match. This limit does not apply to
+matching with the JIT optimization, which has its own memory control
+arrangements (see the
+.\" HREF
+\fBpcre2jit\fP
+.\"
+documentation for more details), nor does it apply to \fBpcre2_dfa_match()\fP.
+If the limit is reached, the negative error code PCRE2_ERROR_HEAPLIMIT is
+returned. The default limit is set when PCRE2 is built; the default default is
+very large and is essentially "unlimited".
+.P
+A value for the heap limit may also be supplied by an item at the start of a
+pattern of the form
+.sp
+  (*LIMIT_HEAP=ddd)
+.sp
+where ddd is a decimal number. However, such a setting is ignored unless ddd is
+less than the limit set by the caller of \fBpcre2_match()\fP or, if no such
+limit is set, less than the default.
+.P
+The \fBpcre2_match()\fP function starts out using a 20K vector on the system
+stack for recording backtracking points. The more nested backtracking points
+there are (that is, the deeper the search tree), the more memory is needed.
+Heap memory is used only if the initial vector is too small. If the heap limit
+is set to a value less than 21 (in particular, zero) no heap memory will be
+used. In this case, only patterns that do not have a lot of nested backtracking
+can be successfully processed.
 .sp
 .nf
 .B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
@@ -791,17 +921,17 @@ In other words, whichever limit comes first is used.
 .fi
 .sp
 The \fImatch_limit\fP parameter provides a means of preventing PCRE2 from using
-up too many resources when processing patterns that are not going to match, but
-which have a very large number of possibilities in their search trees. The
-classic example is a pattern that uses nested unlimited repeats.
-.P
-Internally, \fBpcre2_match()\fP uses a function called \fBmatch()\fP, which it
-calls repeatedly (sometimes recursively). The limit set by \fImatch_limit\fP is
-imposed on the number of times this function is called during a match, which
-has the effect of limiting the amount of backtracking that can take place. For
+up too many computing resources when processing patterns that are not going to
+match, but which have a very large number of possibilities in their search
+trees. The classic example is a pattern that uses nested unlimited repeats.
+.P
+There is an internal counter in \fBpcre2_match()\fP that is incremented each
+time round its main matching loop. If this value reaches the match limit,
+\fBpcre2_match()\fP returns the negative value PCRE2_ERROR_MATCHLIMIT. This has
+the effect of limiting the amount of backtracking that can take place. For
 patterns that are not anchored, the count restarts from zero for each position
-in the subject string. This limit is not relevant to \fBpcre2_dfa_match()\fP,
-which ignores it.
+in the subject string. This limit also applies to \fBpcre2_dfa_match()\fP,
+though the counting is done in a different way.
 .P
 When \fBpcre2_match()\fP is called with a pattern that was successfully
 processed by \fBpcre2_jit_compile()\fP, the way in which matching is executed
@@ -811,75 +941,49 @@ is also used in this case (but in a different way) to limit how long the
 matching can continue.
 .P
 The default value for the limit can be set when PCRE2 is built; the default
-default is 10 million, which handles all but the most extreme cases. If the
-limit is exceeded, \fBpcre2_match()\fP returns PCRE2_ERROR_MATCHLIMIT. A value
+default is 10 million, which handles all but the most extreme cases. A value
 for the match limit may also be supplied by an item at the start of a pattern
 of the form
 .sp
   (*LIMIT_MATCH=ddd)
 .sp
 where ddd is a decimal number. However, such a setting is ignored unless ddd is
-less than the limit set by the caller of \fBpcre2_match()\fP or, if no such
-limit is set, less than the default.
+less than the limit set by the caller of \fBpcre2_match()\fP or
+\fBpcre2_dfa_match()\fP or, if no such limit is set, less than the default.
 .sp
 .nf
-.B int pcre2_set_recursion_limit(pcre2_match_context *\fImcontext\fP,
+.B int pcre2_set_depth_limit(pcre2_match_context *\fImcontext\fP,
 .B "  uint32_t \fIvalue\fP);"
 .fi
 .sp
-The \fIrecursion_limit\fP parameter is similar to \fImatch_limit\fP, but
-instead of limiting the total number of times that \fBmatch()\fP is called, it
-limits the depth of recursion. The recursion depth is a smaller number than the
-total number of calls, because not all calls to \fBmatch()\fP are recursive.
-This limit is of use only if it is set smaller than \fImatch_limit\fP.
-.P
-Limiting the recursion depth limits the amount of system stack that can be
-used, or, when PCRE2 has been compiled to use memory on the heap instead of the
-stack, the amount of heap memory that can be used. This limit is not relevant,
-and is ignored, when matching is done using JIT compiled code or by the
-\fBpcre2_dfa_match()\fP function.
-.P
-The default value for \fIrecursion_limit\fP can be set when PCRE2 is built; the
-default default is the same value as the default for \fImatch_limit\fP. If the
-limit is exceeded, \fBpcre2_match()\fP returns PCRE2_ERROR_RECURSIONLIMIT. A
-value for the recursion limit may also be supplied by an item at the start of a
-pattern of the form
-.sp
-  (*LIMIT_RECURSION=ddd)
+This parameter limits the depth of nested backtracking in \fBpcre2_match()\fP.
+Each time a nested backtracking point is passed, a new memory "frame" is used
+to remember the state of matching at that point. Thus, this parameter
+indirectly limits the amount of memory that is used in a match. However,
+because the size of each memory "frame" depends on the number of capturing
+parentheses, the actual memory limit varies from pattern to pattern. This limit
+was more useful in versions before 10.30, where function recursion was used for
+backtracking.
+.P
+The depth limit is not relevant, and is ignored, when matching is done using
+JIT compiled code. However, it is supported by \fBpcre2_dfa_match()\fP, which
+uses it to limit the depth of internal recursive function calls that implement
+atomic groups, lookaround assertions, and pattern recursions. This is,
+therefore, an indirect limit on the amount of system stack that is used. A
+recursive pattern such as /(.)(?1)/, when matched to a very long string using
+\fBpcre2_dfa_match()\fP, can use a great deal of stack.
+.P
+The default value for the depth limit can be set when PCRE2 is built; the
+default default is the same value as the default for the match limit. If the
+limit is exceeded, \fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP returns
+PCRE2_ERROR_DEPTHLIMIT. A value for the depth limit may also be supplied by an
+item at the start of a pattern of the form
+.sp
+  (*LIMIT_DEPTH=ddd)
 .sp
 where ddd is a decimal number. However, such a setting is ignored unless ddd is
-less than the limit set by the caller of \fBpcre2_match()\fP or, if no such
-limit is set, less than the default.
-.sp
-.nf
-.B int pcre2_set_recursion_memory_management(
-.B "  pcre2_match_context *\fImcontext\fP,"
-.B "  void *(*\fIprivate_malloc\fP)(PCRE2_SIZE, void *),"
-.B "  void (*\fIprivate_free\fP)(void *, void *), void *\fImemory_data\fP);"
-.fi
-.sp
-This function sets up two additional custom memory management functions for use
-by \fBpcre2_match()\fP when PCRE2 is compiled to use the heap for remembering
-backtracking data, instead of recursive function calls that use the system
-stack. There is a discussion about PCRE2's stack usage in the
-.\" HREF
-\fBpcre2stack\fP
-.\"
-documentation. See the
-.\" HREF
-\fBpcre2build\fP
-.\"
-documentation for details of how to build PCRE2.
-.P
-Using the heap for recursion is a non-standard way of building PCRE2, for use
-in environments that have limited stacks. Because of the greater use of memory
-management, \fBpcre2_match()\fP runs more slowly. Functions that are different
-to the general custom memory functions are provided so that special-purpose
-external code can be used for this case, because the memory blocks are all the
-same size. The blocks are retained by \fBpcre2_match()\fP until it is about to
-exit so that they can be re-used when possible during the match. In the absence
-of these functions, the normal custom memory management functions are used, if
-supplied, otherwise the system functions.
+less than the limit set by the caller of \fBpcre2_match()\fP or
+\fBpcre2_dfa_match()\fP or, if no such limit is set, less than the default.
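The text states the same precedence rule for (*LIMIT_HEAP=ddd), (*LIMIT_MATCH=ddd), and (*LIMIT_DEPTH=ddd): a value given in the pattern can only lower the limit, never raise it. The following sketch expresses that rule. The function effective_limit() and its parameters are hypothetical and are not part of the PCRE2 API; the flags say whether the caller or the pattern supplied a value.

```c
#include <stdint.h>

/* Hypothetical helper, not a PCRE2 function. It models how a limit given
   at the start of a pattern interacts with the caller's limit. */
uint32_t effective_limit(uint32_t build_default,
                         int caller_set, uint32_t caller_limit,
                         int pattern_set, uint32_t pattern_limit)
{
    /* The caller's limit, if any, replaces the build-time default. */
    uint32_t limit = caller_set ? caller_limit : build_default;

    /* A pattern setting is ignored unless it is less than that limit. */
    if (pattern_set && pattern_limit < limit) limit = pattern_limit;
    return limit;
}
```

For example, with a build-time default of 10 million and no caller limit, a pattern value of 500 takes effect, whereas a pattern value of 20 million is ignored.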
 .
 .
 .SH "CHECKING BUILD-TIME OPTIONS"
@@ -915,6 +1019,25 @@ PCRE2_BSR_UNICODE means that \eR matches any Unicode line ending sequence; a
 value of PCRE2_BSR_ANYCRLF means that \eR matches only CR, LF, or CRLF. The
 default can be overridden when a pattern is compiled.
 .sp
+  PCRE2_CONFIG_COMPILED_WIDTHS
+.sp
+The output is a uint32_t integer whose lower bits indicate which code unit
+widths were selected when PCRE2 was built. The 1-bit indicates 8-bit support,
+and the 2-bit and 4-bit indicate 16-bit and 32-bit support, respectively.
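As an illustration of the bit layout just described, the following sketch decodes a widths value. It assumes the value has already been obtained through PCRE2_CONFIG_COMPILED_WIDTHS; has_width() is a hypothetical helper and not a PCRE2 function.

```c
#include <stdint.h>

/* Hypothetical helper: report whether the given code unit width (8, 16,
   or 32) is flagged in a value laid out as described in the text, where
   the 1-bit means 8-bit, the 2-bit 16-bit, and the 4-bit 32-bit support. */
int has_width(uint32_t widths, unsigned code_unit_bits)
{
    switch (code_unit_bits) {
        case 8:  return (widths & 1u) != 0;
        case 16: return (widths & 2u) != 0;
        case 32: return (widths & 4u) != 0;
        default: return 0;   /* no other widths exist */
    }
}
```

A value of 5, for instance, indicates a build with 8-bit and 32-bit support but without 16-bit support.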
+.sp
+  PCRE2_CONFIG_DEPTHLIMIT
+.sp
+The output is a uint32_t integer that gives the default limit for the depth of
+nested backtracking in \fBpcre2_match()\fP or the depth of nested recursions
+and lookarounds in \fBpcre2_dfa_match()\fP. Further details are given with
+\fBpcre2_set_depth_limit()\fP above.
+.sp
+  PCRE2_CONFIG_HEAPLIMIT
+.sp
+The output is a uint32_t integer that gives, in kilobytes, the default limit
+for the amount of heap memory used by \fBpcre2_match()\fP. Further details are
+given with \fBpcre2_set_heap_limit()\fP above.
+.sp
   PCRE2_CONFIG_JIT
 .sp
 The output is a uint32_t integer that is set to one if support for just-in-time
@@ -948,9 +1071,9 @@ be compiled by those two libraries, but at the expense of slower matching.
 .sp
   PCRE2_CONFIG_MATCHLIMIT
 .sp
-The output is a uint32_t integer that gives the default limit for the number of
-internal matching function calls in a \fBpcre2_match()\fP execution. Further
-details are given with \fBpcre2_match()\fP below.
+The output is a uint32_t integer that gives the default match limit for
+\fBpcre2_match()\fP. Further details are given with
+\fBpcre2_set_match_limit()\fP above.
 .sp
   PCRE2_CONFIG_NEWLINE
 .sp
@@ -962,10 +1085,16 @@ sequence that is recognized as meaning "newline". The values are:
   PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
   PCRE2_NEWLINE_ANY      Any Unicode line ending
   PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 .sp
 The default should normally correspond to the standard sequence for your
 operating system.
 .sp
+  PCRE2_CONFIG_NEVER_BACKSLASH_C
+.sp
+The output is a uint32_t integer that is set to one if the use of \eC was
+permanently disabled when PCRE2 was built; otherwise it is set to zero.
+.sp
   PCRE2_CONFIG_PARENSLIMIT
 .sp
 The output is a uint32_t integer that gives the maximum depth of nesting
@@ -975,19 +1104,10 @@ PCRE2 is built; the default is 250. This limit does not take into account the
 stack that may already be used by the calling application. For finer control
 over compilation stack usage, see \fBpcre2_set_compile_recursion_guard()\fP.
 .sp
-  PCRE2_CONFIG_RECURSIONLIMIT
-.sp
-The output is a uint32_t integer that gives the default limit for the depth of
-recursion when calling the internal matching function in a \fBpcre2_match()\fP
-execution. Further details are given with \fBpcre2_match()\fP below.
-.sp
   PCRE2_CONFIG_STACKRECURSE
 .sp
-The output is a uint32_t integer that is set to one if internal recursion when
-running \fBpcre2_match()\fP is implemented by recursive function calls that use
-the system stack to remember their state. This is the usual way that PCRE2 is
-compiled. The output is zero if PCRE2 was compiled to use blocks of data on the
-heap instead of recursive function calls.
+This parameter is obsolete and should not be used in new code. The output is a
+uint32_t integer that is always set to zero.
 .sp
   PCRE2_CONFIG_UNICODE_VERSION
 .sp
@@ -1006,7 +1126,7 @@ available; otherwise it is set to zero. Unicode support implies UTF support.
 .sp
   PCRE2_CONFIG_VERSION
 .sp
-The \fIwhere\fP argument should point to a buffer that is at least 12 code
+The \fIwhere\fP argument should point to a buffer that is at least 24 code
 units long. (The exact length required can be found by calling
 \fBpcre2_config()\fP with \fIwhere\fP set to NULL.) The buffer is filled with
 the PCRE2 version string, zero-terminated. The number of code units used is
@@ -1026,11 +1146,13 @@ zero.
 .B void pcre2_code_free(pcre2_code *\fIcode\fP);
 .sp
 .B pcre2_code *pcre2_code_copy(const pcre2_code *\fIcode\fP);
+.sp
+.B pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *\fIcode\fP);
 .fi
 .P
 The \fBpcre2_compile()\fP function compiles a pattern into an internal form.
-The pattern is defined by a pointer to a string of code units and a length. If
-the pattern is zero-terminated, the length can be specified as
+The pattern is defined by a pointer to a string of code units and a length (in
+code units). If the pattern is zero-terminated, the length can be specified as
 PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that
 contains the compiled pattern and related data, or NULL if an error occurred.
 .P
@@ -1048,9 +1170,24 @@ below),
 .\"
 the JIT information cannot be copied (because it is position-dependent).
 The new copy can initially be used only for non-JIT matching, though it can be
-passed to \fBpcre2_jit_compile()\fP if required. The \fBpcre2_code_copy()\fP
-function provides a way for individual threads in a multithreaded application
-to acquire a private copy of shared compiled code.
+passed to \fBpcre2_jit_compile()\fP if required.
+.P
+The \fBpcre2_code_copy()\fP function provides a way for individual threads in a
+multithreaded application to acquire a private copy of shared compiled code.
+However, it does not make a copy of the character tables used by the compiled
+pattern; the new pattern code points to the same tables as the original code.
+(See
+.\" HTML <a href="#localesupport">
+.\" </a>
+"Locale Support"
+.\"
+below for details of these character tables.) In many applications the same
+tables are used throughout, so this behaviour is appropriate. Nevertheless,
+there are occasions when copies of both a compiled pattern and the relevant
+tables are needed. The \fBpcre2_code_copy_with_tables()\fP function provides
+this facility.
+Copies of both the code and the tables are made, with the new code pointing to
+the new tables. The memory for the new tables is automatically freed when
+\fBpcre2_code_free()\fP is called for the new copy of the compiled code.
 .P
 NOTE: When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that they can
@@ -1076,8 +1213,8 @@ documentation).
 .P
 For those options that can be different in different parts of the pattern, the
 contents of the \fIoptions\fP argument specifies their settings at the start of
-compilation. The PCRE2_ANCHORED and PCRE2_NO_UTF_CHECK options can be set at
-the time of matching as well as at compile time.
+compilation. The PCRE2_ANCHORED, PCRE2_ENDANCHORED, and PCRE2_NO_UTF_CHECK
+options can be set at the time of matching as well as at compile time.
 .P
 Other, less frequently required compile-time parameters (for example, the
 newline setting) can be provided in a compile context (as described
@@ -1093,16 +1230,30 @@ respectively, when \fBpcre2_compile()\fP returns NULL because a compilation
 error has occurred. The values are not defined when compilation is successful
 and \fBpcre2_compile()\fP returns a non-NULL value.
 .P
-The \fBpcre2_get_error_message()\fP function (see "Obtaining a textual error
+There are nearly 100 positive error codes that \fBpcre2_compile()\fP may return
+if it finds an error in the pattern. There are also some negative error codes
+that are used for invalid UTF strings. These are the same as given by
+\fBpcre2_match()\fP and \fBpcre2_dfa_match()\fP, and are described in the
+.\" HREF
+\fBpcre2unicode\fP
+.\"
+page. There is no separate documentation for the positive error codes, because
+the textual error messages that are obtained by calling the
+\fBpcre2_get_error_message()\fP function (see "Obtaining a textual error
 message"
 .\" HTML <a href="#geterrormessage">
 .\" </a>
 below)
 .\"
-provides a textual message for each error code. Compilation errors have
-positive error codes; UTF formatting error codes are negative. For an invalid
-UTF-8 or UTF-16 string, the offset is that of the first code unit of the
-failing character.
+should be self-explanatory. Macro names starting with PCRE2_ERROR_ are defined
+for both positive and negative error codes in \fBpcre2.h\fP.
+.P
+The value returned in \fIerroroffset\fP is an indication of where in the
+pattern the error occurred. It is not necessarily the furthest point in the
+pattern that was read. For example, after the error "lookbehind assertion is
+not fixed length", the error offset points to the start of the failing
+assertion. For an invalid UTF-8 or UTF-16 string, the offset is that of the
+first code unit of the failing character.
 .P
 Some errors are not detected until the whole pattern has been scanned; in these
 cases, the offset passed back is the length of the pattern. Note that the
@@ -1178,13 +1329,15 @@ include a closing parenthesis in the name. However, if the PCRE2_ALT_VERBNAMES
 option is set, normal backslash processing is applied to verb names and only an
 unescaped closing parenthesis terminates the name. A closing parenthesis can be
 included in a name either as \e) or between \eQ and \eE. If the PCRE2_EXTENDED
-option is set, unescaped whitespace in verb names is skipped and #-comments are
-recognized, exactly as in the rest of the pattern.
+or PCRE2_EXTENDED_MORE option is set, unescaped whitespace in verb names is
+skipped and #-comments are recognized, exactly as in the rest of the pattern.
 .sp
   PCRE2_AUTO_CALLOUT
 .sp
 If this bit is set, \fBpcre2_compile()\fP automatically inserts callout items,
-all with number 255, before each pattern item. For discussion of the callout
+all with number 255, before each pattern item, except immediately before or
+after an explicit callout in the pattern. For discussion of the callout
 facility, see the
 .\" HREF
 \fBpcre2callout\fP
@@ -1195,7 +1348,13 @@ documentation.
 .sp
 If this bit is set, letters in the pattern match both upper and lower case
 letters in the subject. It is equivalent to Perl's /i option, and it can be
-changed within a pattern by a (?i) option setting.
+changed within a pattern by a (?i) option setting. If PCRE2_UTF is set, Unicode
+properties are used for all characters with more than one other case, and for
+all characters whose code points are greater than U+007f. For lower valued
+characters with only one other case, a lookup table is used for speed. When
+PCRE2_UTF is not set, a lookup table is used for all code points less than 256,
+and higher code points (available only in 16-bit or 32-bit mode) are treated as
+not having another case.
 .sp
   PCRE2_DOLLAR_ENDONLY
 .sp
@@ -1227,6 +1386,29 @@ details of named subpatterns below; see also the
 .\"
 documentation.
 .sp
+  PCRE2_ENDANCHORED
+.sp
+If this bit is set, the end of any pattern match must be right at the end of
+the string being searched (the "subject string"). If the pattern match
+succeeds by reaching (*ACCEPT), but does not reach the end of the subject, the
+match fails at the current starting point. For unanchored patterns, a new match
+is then tried at the next starting point. However, if the match succeeds by
+reaching the end of the pattern, but not the end of the subject, backtracking
+occurs and an alternative match may be found. Consider these two patterns:
+.sp
+  .(*ACCEPT)|..
+  .|..
+.sp
+If matched against "abc" with PCRE2_ENDANCHORED set, the first matches "c"
+whereas the second matches "bc". The effect of PCRE2_ENDANCHORED can also be
+achieved by appropriate constructs in the pattern itself, which is the only way
+to do it in Perl.
+.P
+For DFA matching with \fBpcre2_dfa_match()\fP, PCRE2_ENDANCHORED applies only
+to the first (that is, the longest) matched string. Other parallel matches,
+which are necessarily substrings of the first one, must obviously end before
+the end of the subject.
+.sp
   PCRE2_EXTENDED
 .sp
 If this bit is set, most white space characters in the pattern are totally
@@ -1254,14 +1436,39 @@ sequence at the start of the pattern, as described in the section entitled
 in the \fBpcre2pattern\fP documentation. A default is defined when PCRE2 is
 built.
 .sp
+  PCRE2_EXTENDED_MORE
+.sp
+This option has the effect of PCRE2_EXTENDED, but, in addition, unescaped space
+and horizontal tab characters are ignored inside a character class.
+PCRE2_EXTENDED_MORE is equivalent to Perl 5.26's /xx option, and it can be
+changed within a pattern by a (?xx) option setting.
+.sp
   PCRE2_FIRSTLINE
 .sp
-If this option is set, an unanchored pattern is required to match before or at
-the first newline in the subject string, though the matched text may continue
-over the newline. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
-general limiting facility. If PCRE2_FIRSTLINE is set with an offset limit, a
-match must occur in the first line and also within the offset limit. In other
-words, whichever limit comes first is used.
+If this option is set, the start of an unanchored pattern match must be before
+or at the first newline in the subject string following the start of matching,
+though the matched text may continue over the newline. If \fIstartoffset\fP is
+non-zero, the limiting newline is not necessarily the first newline in the
+subject. For example, if the subject string is "abc\enxyz" (where \en
+represents a single-character newline) a pattern match for "yz" succeeds with
+PCRE2_FIRSTLINE if \fIstartoffset\fP is greater than 3. See also
+PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
+PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
+line and also within the offset limit. In other words, whichever limit comes
+first is used.
+.sp
+  PCRE2_LITERAL
+.sp
+If this option is set, all meta-characters in the pattern are disabled, and it
+is treated as a literal string. Matching literal strings with a regular
+expression engine is not the most efficient way of doing it. If you are doing a
+lot of literal matching and are worried about efficiency, you should consider
+using other approaches. The only other main options that are allowed with
+PCRE2_LITERAL are: PCRE2_ANCHORED, PCRE2_ENDANCHORED, PCRE2_AUTO_CALLOUT,
+PCRE2_CASELESS, PCRE2_FIRSTLINE, PCRE2_NO_START_OPTIMIZE, PCRE2_NO_UTF_CHECK,
+PCRE2_UTF, and PCRE2_USE_OFFSET_LIMIT. The extra options PCRE2_EXTRA_MATCH_LINE
+and PCRE2_EXTRA_MATCH_WORD are also supported. Any other options cause an
+error.
 .sp
   PCRE2_MATCH_UNSET_BACKREF
 .sp
@@ -1325,8 +1532,8 @@ PCRE2_NEVER_UTF causes an error.
 If this option is set, it disables the use of numbered capturing parentheses in
 the pattern. Any opening parenthesis that is not followed by ? behaves as if it
 were followed by ?: but named parentheses can still be used for capturing (and
-they acquire numbers in the usual way). There is no equivalent of this option
-in Perl. Note that, if this option is set, references to capturing groups (back
+they acquire numbers in the usual way). This is the same as Perl's /n option.
+Note that, when this option is set, references to capturing groups (back
 references or recursion/subroutine calls) may only refer to named groups,
 though the reference can be by name or by number.
 .sp
@@ -1361,8 +1568,8 @@ compiler.
 .P
 There are a number of optimizations that may occur at the start of a match, in
 order to speed up the process. For example, if it is known that an unanchored
-match must start with a specific character, the matching code searches the
-subject for that character, and fails immediately if it cannot find it, without
+match must start with a specific code unit value, the matching code searches
+the subject for that value, and fails immediately if it cannot find it, without
 actually running the main matching function. This means that a special item
 such as (*COMMIT) at the start of a pattern is not considered until after a
 suitable starting point for the match has been found. Also, when callouts or
@@ -1389,9 +1596,10 @@ current starting position, which in this case, it does. However, if the same
 match is run with PCRE2_NO_START_OPTIMIZE set, the initial scan along the
 subject string does not happen. The first match attempt is run starting from
 "D" and when this fails, (*COMMIT) prevents any further matches being tried, so
-the overall result is "no match". There are also other start-up optimizations.
-For example, a minimum length for the subject may be recorded. Consider the
-pattern
+the overall result is "no match".
+.P
+There are also other start-up optimizations. For example, a minimum length for
+the subject may be recorded. Consider the pattern
 .sp
   (*MARK:A)(X|Y)
 .sp
@@ -1423,16 +1631,30 @@ in the
 .\" HREF
 \fBpcre2unicode\fP
 .\"
-document.
-If an invalid UTF sequence is found, \fBpcre2_compile()\fP returns a negative
-error code.
+document. If an invalid UTF sequence is found, \fBpcre2_compile()\fP returns a
+negative error code.
 .P
-If you know that your pattern is valid, and you want to skip this check for
-performance reasons, you can set the PCRE2_NO_UTF_CHECK option. When it is set,
-the effect of passing an invalid UTF string as a pattern is undefined. It may
-cause your program to crash or loop. Note that this option can also be passed
-to \fBpcre2_match()\fP and \fBpcre_dfa_match()\fP, to suppress validity
-checking of the subject string.
+If you know that your pattern is a valid UTF string, and you want to skip this
+check for performance reasons, you can set the PCRE2_NO_UTF_CHECK option. When
+it is set, the effect of passing an invalid UTF string as a pattern is
+undefined. It may cause your program to crash or loop.
+.P
+Note that this option can also be passed to \fBpcre2_match()\fP and
+\fBpcre2_dfa_match()\fP, to suppress UTF validity checking of the subject
+string.
+.P
+Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
+error that is given if an escape sequence for an invalid Unicode code point is
+encountered in the pattern. In particular, the so-called "surrogate" code
+points (0xd800 to 0xdfff) are invalid. If you want to allow escape sequences
+such as \ex{d800} you can set the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra
+option, as described in the section entitled "Extra compile options"
+.\" HTML <a href="#extracompileoptions">
+.\" </a>
+below.
+.\"
+However, this is possible only in UTF-8 and UTF-32 modes, because these values
+are not representable in UTF-16.
 .sp
   PCRE2_UCP
 .sp
@@ -1450,7 +1672,7 @@ in the
 .\"
 page. If you set PCRE2_UCP, matching one of the items it affects takes much
 longer. The option is available only if PCRE2 has been compiled with Unicode
-support.
+support (which is the default).
 .sp
   PCRE2_UNGREEDY
 .sp
@@ -1478,32 +1700,78 @@ This option causes PCRE2 to regard both the pattern and the subject strings
 that are subsequently processed as strings of UTF characters instead of
 single-code-unit strings. It is available when PCRE2 is built to include
 Unicode support (which is the default). If Unicode support is not available,
-the use of this option provokes an error. Details of how this option changes
-the behaviour of PCRE2 are given in the
+the use of this option provokes an error. Details of how PCRE2_UTF changes the
+behaviour of PCRE2 are given in the
 .\" HREF
 \fBpcre2unicode\fP
 .\"
 page.
 .
 .
-.SH "COMPILATION ERROR CODES"
+.\" HTML <a name="extracompileoptions"></a>
+.SS "Extra compile options"
 .rs
 .sp
-There are over 80 positive error codes that \fBpcre2_compile()\fP may return
-(via \fIerrorcode\fP) if it finds an error in the pattern. There are also some
-negative error codes that are used for invalid UTF strings. These are the same
-as given by \fBpcre2_match()\fP and \fBpcre2_dfa_match()\fP, and are described
-in the
-.\" HREF
-\fBpcre2unicode\fP
-.\"
-page. The \fBpcre2_get_error_message()\fP function (see "Obtaining a textual
-error message"
-.\" HTML <a href="#geterrormessage">
-.\" </a>
-below)
-.\"
-can be called to obtain a textual error message from any error code.
+Unlike the main compile-time options, the extra options are not saved with the
+compiled pattern. The option bits that can be set in a compile context by
+calling the \fBpcre2_set_compile_extra_options()\fP function are as follows:
+.sp
+  PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
+.sp
+This option applies when compiling a pattern in UTF-8 or UTF-32 mode. It is
+forbidden in UTF-16 mode, and ignored in non-UTF modes. Unicode "surrogate"
+code points in the range 0xd800 to 0xdfff are used in pairs in UTF-16 to encode
+code points with values in the range 0x10000 to 0x10ffff. The surrogates cannot
+therefore be represented in UTF-16. They can be represented in UTF-8 and
+UTF-32, but are defined as invalid code points, and cause errors if encountered
+in a UTF-8 or UTF-32 string that is being checked for validity by PCRE2.
+.P
+These values also cause errors if encountered in escape sequences such as
+\ex{d912} within a pattern. However, it seems that some applications, when
+using PCRE2 to check for unwanted characters in UTF-8 strings, explicitly test
+for the surrogates using escape sequences. The PCRE2_NO_UTF_CHECK option does
+not disable the error that occurs, because it applies only to the testing of
+input strings for UTF validity.
+.P
+If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set, surrogate code
+point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
+incorporated in the compiled pattern. However, they can only match subject
+characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
+.sp
+  PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
+.sp
+This is a dangerous option. Use with care. By default, an unrecognized escape
+such as \ej or a malformed one such as \ex{2z} causes a compile-time error when
+detected by \fBpcre2_compile()\fP. Perl is somewhat inconsistent in handling
+such items: for example, \ej is treated as a literal "j", and non-hexadecimal
+digits in \ex{} are just ignored, though warnings are given in both cases if
+Perl's warning switch is enabled. However, a malformed octal number after \eo{
+always causes an error in Perl.
+.P
+If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL extra option is passed to
+\fBpcre2_compile()\fP, all unrecognized or erroneous escape sequences are
+treated as single-character escapes. For example, \ej is a literal "j" and
+\ex{2z} is treated as the literal string "x{2z}". Setting this option means
+that typos in patterns may go undetected and have unexpected results. This is a
+dangerous option. Use with care.
+.sp
+  PCRE2_EXTRA_MATCH_LINE
+.sp
+This option is provided for use by the \fB-x\fP option of \fBpcre2grep\fP. It
+causes the pattern only to match complete lines. This is achieved by
+automatically inserting the code for "^(?:" at the start of the compiled
+pattern and ")$" at the end. Thus, when PCRE2_MULTILINE is set, the matched
+line may be in the middle of the subject string. This option can be used with
+PCRE2_LITERAL.
+.sp
+  PCRE2_EXTRA_MATCH_WORD
+.sp
+This option is provided for use by the \fB-w\fP option of \fBpcre2grep\fP. It
+causes the pattern only to match strings that have a word boundary at the start
+and the end. This is achieved by automatically inserting the code for "\eb(?:"
+at the start of the compiled pattern and ")\eb" at the end. The option may be
+used with PCRE2_LITERAL. However, it is ignored if PCRE2_EXTRA_MATCH_LINE is
+also set.
 .
 .
 .\" HTML <a name="jitcompiling"></a>
@@ -1541,7 +1809,7 @@ documentation.
 JIT compilation is a heavyweight optimization. It can take some time for
 patterns to be analyzed, and for one-off matches and simple patterns the
 benefit of faster execution might be offset by a much slower compilation time.
-Most, but not all patterns can be optimized by the JIT compiler.
+Most (but not all) patterns can be optimized by the JIT compiler.
 .
 .
 .\" HTML <a name="localesupport"></a>
@@ -1552,10 +1820,10 @@ PCRE2 handles caseless matching, and determines whether characters are letters,
 digits, or whatever, by reference to a set of tables, indexed by character code
 point. This applies only to characters whose code points are less than 256. By
 default, higher-valued code points never match escapes such as \ew or \ed.
-However, if PCRE2 is built with UTF support, all characters can be tested with
-\ep and \eP, or, alternatively, the PCRE2_UCP option can be set when a pattern
-is compiled; this causes \ew and friends to use Unicode property support
-instead of the built-in tables.
+However, if PCRE2 is built with Unicode support, all characters can be tested
+with \ep and \eP, or, alternatively, the PCRE2_UCP option can be set when a
+pattern is compiled; this causes \ew and friends to use Unicode property
+support instead of the built-in tables.
 .P
 The use of locales with Unicode is discouraged. If you are handling characters
 with code points greater than 128, you should either use Unicode support, or
@@ -1594,7 +1862,7 @@ available for as long as it is needed.
 The pointer that is passed (via the compile context) to \fBpcre2_compile()\fP
 is saved with the compiled pattern, and the same tables are used by
 \fBpcre2_match()\fP and \fBpcre_dfa_match()\fP. Thus, for any single pattern,
-compilation, and matching all happen in the same locale, but different patterns
+compilation and matching both happen in the same locale, but different patterns
 can be processed in different locales.
 .
 .
@@ -1617,7 +1885,7 @@ pattern. The second argument specifies which piece of information is required,
 and the third argument is a pointer to a variable to receive the data. If the
 third argument is NULL, the first argument is ignored, and the function returns
 the size in bytes of the variable that is required for the information
-requested. Otherwise, The yield of the function is zero for success, or one of
+requested. Otherwise, the yield of the function is zero for success, or one of
 the following negative numbers:
 .sp
   PCRE2_ERROR_NULL           the argument \fIcode\fP was NULL
@@ -1641,12 +1909,15 @@ are as follows:
 .sp
   PCRE2_INFO_ALLOPTIONS
   PCRE2_INFO_ARGOPTIONS
+  PCRE2_INFO_EXTRAOPTIONS
 .sp
-Return a copy of the pattern's options. The third argument should point to a
+Return copies of the pattern's options. The third argument should point to a
 \fBuint32_t\fP variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that
 were passed to \fBpcre2_compile()\fP, whereas PCRE2_INFO_ALLOPTIONS returns
 the compile options as modified by any top-level (*XXX) option settings such as
-(*UTF) at the start of the pattern itself.
+(*UTF) at the start of the pattern itself. PCRE2_INFO_EXTRAOPTIONS returns the
+extra options that were set in the compile context by calling the
+\fBpcre2_set_compile_extra_options()\fP function.
 .P
 For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
 option, the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED and PCRE2_UTF.
@@ -1670,8 +1941,8 @@ following are true:
   .* is not in a capturing group that is the subject
        of a back reference
   PCRE2_DOTALL is in force for .*
-  Neither (*PRUNE) nor (*SKIP) appears in the pattern.
-  PCRE2_NO_DOTSTAR_ANCHOR is not set.
+  Neither (*PRUNE) nor (*SKIP) appears in the pattern
+  PCRE2_NO_DOTSTAR_ANCHOR is not set
 .sp
 For patterns that are auto-anchored, the PCRE2_ANCHORED bit is set in the
 options returned for PCRE2_INFO_ALLOPTIONS.
@@ -1699,6 +1970,15 @@ Return the highest capturing subpattern number in the pattern. In patterns
 where (?| is not used, this is also the total number of capturing subpatterns.
 The third argument should point to an \fBuint32_t\fP variable.
 .sp
+  PCRE2_INFO_DEPTHLIMIT
+.sp
+If the pattern set a backtracking depth limit by including an item of the form
+(*LIMIT_DEPTH=nnnn) at the start, the value is returned. The third argument
+should point to an unsigned 32-bit integer. If no such value has been set, the
+call to \fBpcre2_pattern_info()\fP returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
+.sp
   PCRE2_INFO_FIRSTBITMAP
 .sp
 In the absence of a single first code unit for a non-anchored pattern,
@@ -1715,21 +1995,29 @@ returned. Otherwise NULL is returned. The third argument should point to an
 Return information about the first code unit of any matched string, for a
 non-anchored pattern. The third argument should point to an \fBuint32_t\fP
 variable. If there is a fixed first value, for example, the letter "c" from a
-pattern such as (cat|cow|coyote), 1 is returned, and the character value can be
-retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed first value, but
-it is known that a match can occur only at the start of the subject or
-following a newline in the subject, 2 is returned. Otherwise, and for anchored
-patterns, 0 is returned.
+pattern such as (cat|cow|coyote), 1 is returned, and the value can be retrieved
+using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed first value, but it is
+known that a match can occur only at the start of the subject or following a
+newline in the subject, 2 is returned. Otherwise, and for anchored patterns, 0
+is returned.
 .sp
   PCRE2_INFO_FIRSTCODEUNIT
 .sp
-Return the value of the first code unit of any matched string in the situation
+Return the value of the first code unit of any matched string for a pattern
 where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0. The third
 argument should point to an \fBuint32_t\fP variable. In the 8-bit library, the
 value is always less than 256. In the 16-bit library the value can be up to
 0xffff. In the 32-bit library in UTF-32 mode the value can be up to 0x10ffff,
 and up to 0xffffffff when not using UTF-32 mode.
 .sp
+  PCRE2_INFO_FRAMESIZE
+.sp
+Return the size (in bytes) of the data frames that are used to remember
+backtracking positions when the pattern is processed by \fBpcre2_match()\fP
+without the use of JIT. The third argument should point to a \fBsize_t\fP
+variable. The frame size depends on the number of capturing parentheses in the
+pattern. Each additional capturing group adds two PCRE2_SIZE variables.
+.sp
   PCRE2_INFO_HASBACKSLASHC
 .sp
 Return 1 if the pattern contains any instances of \eC, otherwise 0. The third
@@ -1739,7 +2027,17 @@ argument should point to an \fBuint32_t\fP variable.
 .sp
 Return 1 if the pattern contains any explicit matches for CR or LF characters,
 otherwise 0. The third argument should point to an \fBuint32_t\fP variable. An
-explicit match is either a literal CR or LF character, or \er or \en.
+explicit match is either a literal CR or LF character, or \er or \en or one of
+the equivalent hexadecimal or octal escape sequences.
+.sp
+  PCRE2_INFO_HEAPLIMIT
+.sp
+If the pattern set a heap memory limit by including an item of the form
+(*LIMIT_HEAP=nnnn) at the start, the value is returned. The third argument
+should point to an unsigned 32-bit integer. If no such value has been set, the
+call to \fBpcre2_pattern_info()\fP returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
 .sp
   PCRE2_INFO_JCHANGED
 .sp
@@ -1766,10 +2064,10 @@ PCRE2_INFO_LASTCODEUNIT), but for /^a\edz\ed/ the returned value is 0.
 .sp
   PCRE2_INFO_LASTCODEUNIT
 .sp
-Return the value of the rightmost literal data unit that must exist in any
-matched string, other than at its start, if such a value has been recorded. The
-third argument should point to an \fBuint32_t\fP variable. If there is no such
-value, 0 is returned.
+Return the value of the rightmost literal code unit that must exist in any
+matched string, other than at its start, for a pattern where
+PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third argument
+should point to an \fBuint32_t\fP variable.
 .sp
   PCRE2_INFO_MATCHEMPTY
 .sp
@@ -1784,7 +2082,9 @@ in such cases.
 If the pattern set a match limit by including an item of the form
 (*LIMIT_MATCH=nnnn) at the start, the value is returned. The third argument
 should point to an unsigned 32-bit integer. If no such value has been set, the
-call to \fBpcre2_pattern_info()\fP returns the error PCRE2_ERROR_UNSET.
+call to \fBpcre2_pattern_info()\fP returns the error PCRE2_ERROR_UNSET. Note
+that this limit will only be used during matching if it is less than the limit
+set or defaulted by the caller of the match function.
 .sp
   PCRE2_INFO_MAXLOOKBEHIND
 .sp
@@ -1796,7 +2096,8 @@ require a one-character lookbehind. \eA also registers a one-character
 lookbehind, though it does not actually inspect the previous character. This is
 to ensure that at least one character from the old segment is retained when a
 new segment is processed. Otherwise, if there are no lookbehinds in the
-pattern, \eA might match incorrectly at the start of a new segment.
+pattern, \eA might match incorrectly at the start of a second or subsequent
+segment.
 .sp
   PCRE2_INFO_MINLENGTH
 .sp
@@ -1878,23 +2179,17 @@ different for each compiled pattern.
 .sp
   PCRE2_INFO_NEWLINE
 .sp
-The output is a \fBuint32_t\fP with one of the following values:
+The output is one of the following \fBuint32_t\fP values:
 .sp
   PCRE2_NEWLINE_CR       Carriage return (CR)
   PCRE2_NEWLINE_LF       Linefeed (LF)
   PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
   PCRE2_NEWLINE_ANY      Any Unicode line ending
   PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_NUL      The NUL character (binary zero)
 .sp
-This specifies the default character sequence that will be recognized as
-meaning "newline" while matching.
-.sp
-  PCRE2_INFO_RECURSIONLIMIT
-.sp
-If the pattern set a recursion limit by including an item of the form
-(*LIMIT_RECURSION=nnnn) at the start, the value is returned. The third
-argument should point to an unsigned 32-bit integer. If no such value has been
-set, the call to \fBpcre2_pattern_info()\fP returns the error PCRE2_ERROR_UNSET.
+This identifies the character sequence that will be recognized as meaning
+"newline" while matching.
 .sp
   PCRE2_INFO_SIZE
 .sp
@@ -1964,16 +2259,16 @@ Information about a successful or unsuccessful match is placed in a match
 data block, which is an opaque structure that is accessed by function calls. In
 particular, the match data block contains a vector of offsets into the subject
 string that define the matched part of the subject and any substrings that were
-captured. This is know as the \fIovector\fP.
+captured. This is known as the \fIovector\fP.
 .P
 Before calling \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP, or
 \fBpcre2_jit_match()\fP you must create a match data block by calling one of
 the creation functions above. For \fBpcre2_match_data_create()\fP, the first
 argument is the number of pairs of offsets in the \fIovector\fP. One pair of
 offsets is required to identify the string that matched the whole pattern, with
-another pair for each captured substring. For example, a value of 4 creates
-enough space to record the matched portion of the subject plus three captured
-substrings. A minimum of at least 1 pair is imposed by
+an additional pair for each captured substring. For example, a value of 4
+creates enough space to record the matched portion of the subject plus three
+captured substrings. A minimum of 1 pair is imposed by
 \fBpcre2_match_data_create()\fP, so it is always possible to return the overall
 matched string.
 .P
@@ -2052,7 +2347,7 @@ Here is an example of a simple call to \fBpcre2_match()\fP:
     11,             /* the length of the subject string */
     0,              /* start at offset 0 in the subject */
     0,              /* default options */
-    match_data,     /* the match data block */
+    md,             /* the match data block */
     NULL);          /* a match context; NULL means use defaults */
 .sp
 If the subject string is zero-terminated, the length can be given as
@@ -2116,9 +2411,11 @@ newline convention recognizes CRLF as a newline, and if so, and the current
 character is CR followed by LF, advance the starting offset by two characters
 instead of one.
 .P
-If a non-zero starting offset is passed when the pattern is anchored, one
+If a non-zero starting offset is passed when the pattern is anchored, a single
 attempt to match at the given offset is made. This can only succeed if the
-pattern does not require the match to be at the start of the subject.
+pattern does not require the match to be at the start of the subject. In other
+words, the anchoring must be the result of setting the PCRE2_ANCHORED option or
+the use of .* with PCRE2_DOTALL, not by starting the pattern with ^ or \eA.
 .
 .
 .\" HTML <a name="matchoptions"></a>
@@ -2126,15 +2423,15 @@ pattern does not require the match to be at the start of the subject.
 .rs
 .sp
 The unused bits of the \fIoptions\fP argument for \fBpcre2_match()\fP must be
-zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
-PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
-PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
-described below.
+zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_ENDANCHORED,
+PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
+PCRE2_NO_JIT, PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT.
+Their action is described below.
 .P
-Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
-compiler. If it is set, JIT matching is disabled and the normal interpretive
-code in \fBpcre2_match()\fP is run. Apart from PCRE2_NO_JIT (obviously), the
-remaining options are supported for JIT matching.
+Setting PCRE2_ANCHORED or PCRE2_ENDANCHORED at match time is not supported by
+the just-in-time (JIT) compiler. If either is set, JIT matching is disabled and the
+interpretive code in \fBpcre2_match()\fP is run. Apart from PCRE2_NO_JIT
+(obviously), the remaining options are supported for JIT matching.
 .sp
   PCRE2_ANCHORED
 .sp
@@ -2144,6 +2441,12 @@ to be anchored by virtue of its contents, it cannot be made unachored at
 matching time. Note that setting the option at match time disables JIT
 matching.
 .sp
+  PCRE2_ENDANCHORED
+.sp
+If the PCRE2_ENDANCHORED option is set, any string that \fBpcre2_match()\fP
+matches must be right at the end of the subject string. Note that setting the
+option at match time disables JIT matching.
+.sp
   PCRE2_NOTBOL
 .sp
 This option specifies that first character of the subject string is not the
@@ -2228,12 +2531,12 @@ page.
 If you know that your subject is valid, and you want to skip these checks for
 performance reasons, you can set the PCRE2_NO_UTF_CHECK option when calling
 \fBpcre2_match()\fP. You might want to do this for the second and subsequent
-calls to \fBpcre2_match()\fP if you are making repeated calls to find all the
-matches in a single subject string.
+calls to \fBpcre2_match()\fP if you are making repeated calls to find other
+matches in the same subject string.
 .P
-NOTE: When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid string
-as a subject, or an invalid value of \fIstartoffset\fP, is undefined. Your
-program may crash or loop indefinitely.
+WARNING: When PCRE2_NO_UTF_CHECK is set, the effect of passing an invalid
+string as a subject, or an invalid value of \fIstartoffset\fP, is undefined.
+Your program may crash or loop indefinitely.
 .sp
   PCRE2_PARTIAL_HARD
   PCRE2_PARTIAL_SOFT
@@ -2300,9 +2603,9 @@ start, it skips both the CR and the LF before retrying. However, the pattern
 reference, and so advances only by one character after the first failure.
 .P
 An explicit match for CR of LF is either a literal appearance of one of those
-characters in the pattern, or one of the \er or \en escape sequences. Implicit
-matches such as [^X] do not count, nor does \es, even though it includes CR and
-LF in the characters that it matches.
+characters in the pattern, or one of the \er or \en or equivalent octal or
+hexadecimal escape sequences. Implicit matches such as [^X] do not count, nor
+does \es, even though it includes CR and LF in the characters that it matches.
 .P
 Notwithstanding the above, anomalous effects may still occur when CRLF is a
 valid newline sequence and explicit \er or \en escapes appear in the pattern.
@@ -2366,12 +2669,12 @@ identify the part of the subject that was partially matched. See the
 .\"
 documentation for details of partial matching.
 .P
-After a successful match, the first pair of offsets identifies the portion of
-the subject string that was matched by the entire pattern. The next pair is
-used for the first capturing subpattern, and so on. The value returned by
+After a fully successful match, the first pair of offsets identifies the
+portion of the subject string that was matched by the entire pattern. The next
+pair is used for the first captured substring, and so on. The value returned by
 \fBpcre2_match()\fP is one more than the highest numbered pair that has been
 set. For example, if two substrings have been captured, the returned value is
-3. If there are no capturing subpatterns, the return value from a successful
+3. If there are no captured substrings, the return value from a successful
 match is 1, indicating that just the first pair of offsets has been set.
 .P
 If a pattern uses the \eK escape sequence within a positive assertion, the
@@ -2386,11 +2689,7 @@ returned.
 If the ovector is too small to hold all the captured substring offsets, as much
 as possible is filled in, and the function returns a value of zero. If captured
 substrings are not of interest, \fBpcre2_match()\fP may be called with a match
-data block whose ovector is of minimum length (that is, one pair). However, if
-the pattern contains back references and the \fIovector\fP is not big enough to
-remember the related substrings, PCRE2 has to get additional memory for use
-during matching. Thus it is usually advisable to set up a match data block
-containing an ovector of reasonable size.
+data block whose ovector is of minimum length (that is, one pair).
 .P
 It is possible for capturing subpattern number \fIn+1\fP to match some part of
 the subject when subpattern \fIn\fP has not been used at all. For example, if
@@ -2430,24 +2729,27 @@ appropriate circumstances. If they are called at other times, the result is
 undefined.
 .P
 After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
-to match (PCRE2_ERROR_NOMATCH), a (*MARK) name may be available, and
-\fBpcre2_get_mark()\fP can be called. It returns a pointer to the
-zero-terminated name, which is within the compiled pattern. Otherwise NULL is
-returned. The length of the (*MARK) name (excluding the terminating zero) is
-stored in the code unit that preceeds the name. You should use this instead of
-relying on the terminating zero if the (*MARK) name might contain a binary
-zero.
-.P
-After a successful match, the (*MARK) name that is returned is the
-last one encountered on the matching path through the pattern. After a "no
-match" or a partial match, the last encountered (*MARK) name is returned. For
-example, consider this pattern:
+to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be
+available. The function \fBpcre2_get_mark()\fP can be called to access this
+name. The same function applies to all three verbs. It returns a pointer to the
+zero-terminated name, which is within the compiled pattern. If no name is
+available, NULL is returned. The length of the name (excluding the terminating
+zero) is stored in the code unit that precedes the name. You should use this
+length instead of relying on the terminating zero if the name might contain a
+binary zero.
+.P
+After a successful match, the name that is returned is the last (*MARK),
+(*PRUNE), or (*THEN) name encountered on the matching path through the pattern.
+Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example,
+if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
+After a "no match" or a partial match, the last encountered name is returned.
+For example, consider this pattern:
 .sp
   ^(*MARK:A)((*MARK:B)a|b)c
 .sp
-When it matches "bc", the returned mark is A. The B mark is "seen" in the first
+When it matches "bc", the returned name is A. The B mark is "seen" in the first
 branch of the group, but it is not on the matching path. On the other hand,
-when this pattern fails to match "bx", the returned mark is B.
+when this pattern fails to match "bx", the returned name is B.
 .P
 After a successful match, a partial match, or one of the invalid UTF errors
 (for example, PCRE2_ERROR_UTF8_ERR5), \fBpcre2_get_startchar()\fP can be
@@ -2506,8 +2808,9 @@ returned when the magic number is not present.
 .sp
   PCRE2_ERROR_BADMODE
 .sp
-This error is given when a pattern that was compiled by the 8-bit library is
-passed to a 16-bit or 32-bit library function, or vice versa.
+This error is given when a compiled pattern is passed to a function in a
+library of a different code unit width, for example, when a pattern compiled
+by the 8-bit library is passed to a 16-bit or 32-bit library function.
 .sp
   PCRE2_ERROR_BADOFFSET
 .sp
@@ -2534,22 +2837,19 @@ use by callout functions that want to cause \fBpcre2_match()\fP or
 .\"
 documentation for details.
 .sp
+  PCRE2_ERROR_DEPTHLIMIT
+.sp
+The nested backtracking depth limit was reached.
+.sp
+  PCRE2_ERROR_HEAPLIMIT
+.sp
+The heap limit was reached.
+.sp
   PCRE2_ERROR_INTERNAL
 .sp
 An unexpected internal error has occurred. This error could be caused by a bug
 in PCRE2 or by overwriting of the compiled pattern.
 .sp
-  PCRE2_ERROR_JIT_BADOPTION
-.sp
-This error is returned when a pattern that was successfully studied using JIT
-is being matched, but the matching mode (partial or complete match) does not
-correspond to any JIT compilation mode. When the JIT fast path function is
-used, this error may be also given for invalid options. See the
-.\" HREF
-\fBpcre2jit\fP
-.\"
-documentation for more details.
-.sp
   PCRE2_ERROR_JIT_STACKLIMIT
 .sp
 This error is returned when a pattern that was successfully studied using JIT
@@ -2562,15 +2862,14 @@ documentation for more details.
 .sp
   PCRE2_ERROR_MATCHLIMIT
 .sp
-The backtracking limit was reached.
+The backtracking match limit was reached.
 .sp
   PCRE2_ERROR_NOMEMORY
 .sp
-If a pattern contains back references, but the ovector is not big enough to
-remember the referenced substrings, PCRE2 gets a block of memory at the start
-of matching to use for this purpose. There are some other special cases where
-extra memory is needed during matching. This error is given when memory cannot
-be obtained.
+If a pattern contains many nested backtracking points, heap memory is used to
+remember them. This error is given when the memory allocation function (default
+or custom) fails. Note that a different error, PCRE2_ERROR_HEAPLIMIT, is given
+if the amount of memory needed exceeds the heap limit.
 .sp
   PCRE2_ERROR_NULL
 .sp
@@ -2586,10 +2885,6 @@ in the subject string. Some simple patterns that might do this are detected and
 faulted at compile time, but more complicated cases, in particular mutual
 recursions between two different subpatterns, cannot be detected until matching
 is attempted.
-.sp
-  PCRE2_ERROR_RECURSIONLIMIT
-.sp
-The internal recursion limit was reached.
 .
 .
 .\" HTML <a name="geterrormessage"></a>
@@ -2604,8 +2899,8 @@ The internal recursion limit was reached.
 A text message for an error code from any PCRE2 function (compile, match, or
 auxiliary) can be obtained by calling \fBpcre2_get_error_message()\fP. The code
 is passed as the first argument, with the remaining two arguments specifying a
-code unit buffer and its length, into which the text message is placed. Note
-that the message is returned in code units of the appropriate width for the
+code unit buffer and its length in code units, into which the text message is
+placed. The message is returned in code units of the appropriate width for the
 library that is being used.
 .P
 The returned message is terminated with a trailing zero, and the function
@@ -2779,8 +3074,8 @@ calling \fBpcre2_substring_number_from_name()\fP. The first argument is the
 compiled pattern, and the second is the name. The yield of the function is the
 subpattern number, PCRE2_ERROR_NOSUBSTRING if there is no subpattern of that
 name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one subpattern of
-that name. Given the number, you can extract the substring directly, or use one
-of the functions described above.
+that name. Given the number, you can extract the substring directly from the
+ovector, or use one of the "bynumber" functions described above.
 .P
 For convenience, there are also "byname" functions that correspond to the
 "bynumber" functions, the only difference being that the second argument is a
@@ -2855,12 +3150,12 @@ length is in code units, not bytes.
 In the replacement string, which is interpreted as a UTF string in UTF mode,
 and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
 dollar character is an escape character that can specify the insertion of
-characters from capturing groups or (*MARK) items in the pattern. The following
-forms are always recognized:
+characters from capturing groups or (*MARK), (*PRUNE), or (*THEN) items in the
+pattern. The following forms are always recognized:
 .sp
   $$                  insert a dollar character
   $<n> or ${<n>}      insert the contents of group <n>
-  $*MARK or ${*MARK}  insert the name of the last (*MARK) encountered
+  $*MARK or ${*MARK}  insert a (*MARK), (*PRUNE), or (*THEN) name
 .sp
 Either a group number or a group name can be given for <n>. Curly brackets are
 required only if the following character would be interpreted as part of the
@@ -2868,24 +3163,41 @@ number or name. The number may be zero to include the entire matched string.
 For example, if the pattern a(b)c is matched with "=abc=" and the replacement
 string "+$1$0$1+", the result is "=+babcb+=".
 .P
-The facility for inserting a (*MARK) name can be used to perform simple
-simultaneous substitutions, as this \fBpcre2test\fP example shows:
+$*MARK inserts the name from the last encountered (*MARK), (*PRUNE), or (*THEN)
+on the matching path that has a name. (*MARK) must always include a name, but
+(*PRUNE) and (*THEN) need not. For example, in the case of (*MARK:A)(*PRUNE)
+the name inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B".
+This facility can be used to perform simple simultaneous substitutions, as this
+\fBpcre2test\fP example shows:
 .sp
-  /(*:pear)apple|(*:orange)lemon/g,replace=${*MARK}
+  /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
       apple lemon
    2: pear orange
 .sp
 As well as the usual options for \fBpcre2_match()\fP, a number of additional
-options can be set in the \fIoptions\fP argument.
+options can be set in the \fIoptions\fP argument of \fBpcre2_substitute()\fP.
 .P
 PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string,
-replacing every matching substring. If this is not set, only the first matching
-substring is replaced. If any matched substring has zero length, after the
-substitution has happened, an attempt to find a non-empty match at the same
-position is performed. If this is not successful, the current position is
-advanced by one character except when CRLF is a valid newline sequence and the
-next two characters are CR, LF. In this case, the current position is advanced
-by two characters.
+replacing every matching substring. If this option is not set, only the first
+matching substring is replaced. The search for matches takes place in the
+original subject string (that is, previous replacements do not affect it).
+Iteration is implemented by advancing the \fIstartoffset\fP value for each
+search, which is always passed the entire subject string. If an offset limit is
+set in the match context, searching stops when that limit is reached.
+.P
+You can restrict the effect of a global substitution to a portion of the
+subject string by setting either or both of \fIstartoffset\fP and an offset
+limit. Here is a \fBpcre2test\fP example:
+.sp
+  /B/g,replace=!,use_offset_limit
+  ABC ABC ABC ABC\e=offset=3,offset_limit=12
+   2: ABC A!C A!C ABC
+.sp
+When continuing with global substitutions after matching a substring with zero
+length, an attempt to find a non-empty match at the same offset is performed.
+If this is not successful, the offset is advanced by one character except when
+CRLF is a valid newline sequence and the next two characters are CR, LF. In
+this case, the offset is advanced by two characters.
 .P
 PCRE2_SUBSTITUTE_OVERFLOW_LENGTH changes what happens when the output buffer is
 too small. The default action is to return PCRE2_ERROR_NOMEMORY immediately. If
@@ -2987,10 +3299,10 @@ default.
 .P
 PCRE2_ERROR_BADREPLACEMENT is used for miscellaneous syntax errors in the
 replacement string, with more particular errors being PCRE2_ERROR_BADREPESCAPE
-(invalid escape sequence), PCRE2_ERROR_REPMISSING_BRACE (closing curly bracket
-not found), PCRE2_BADSUBSTITUTION (syntax error in extended group
-substitution), and PCRE2_BADSUBPATTERN (the pattern match ended before it
-started, which can happen if \eK is used in an assertion).
+(invalid escape sequence), PCRE2_ERROR_REPMISSINGBRACE (closing curly bracket
+not found), PCRE2_ERROR_BADSUBSTITUTION (syntax error in extended group
+substitution), and PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before
+it started, which can happen if \eK is used in an assertion).
 .P
 As for all PCRE2 errors, a text message that describes the error can be
 obtained by calling the \fBpcre2_get_error_message()\fP function (see
@@ -3084,11 +3396,12 @@ other alternatives. Ultimately, when it runs out of matches,
 .P
 The function \fBpcre2_dfa_match()\fP is called to match a subject string
 against a compiled pattern, using a matching algorithm that scans the subject
-string just once, and does not backtrack. This has different characteristics to
-the normal algorithm, and is not compatible with Perl. Some of the features of
-PCRE2 patterns are not supported. Nevertheless, there are times when this kind
-of matching can be useful. For a discussion of the two matching algorithms, and
-a list of features that \fBpcre2_dfa_match()\fP does not support, see the
+string just once (not counting lookaround assertions), and does not backtrack.
+This has different characteristics to the normal algorithm, and is not
+compatible with Perl. Some of the features of PCRE2 patterns are not supported.
+Nevertheless, there are times when this kind of matching can be useful. For a
+discussion of the two matching algorithms, and a list of features that
+\fBpcre2_dfa_match()\fP does not support, see the
 .\" HREF
 \fBpcre2matching\fP
 .\"
@@ -3115,7 +3428,7 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
     11,             /* the length of the subject string */
     0,              /* start at offset 0 in the subject */
     0,              /* default options */
-    match_data,     /* the match data block */
+    md,             /* the match data block */
     NULL,           /* a match context; NULL means use defaults */
     wspace,         /* working space vector */
     20);            /* number of elements (NOT size in bytes) */
@@ -3124,11 +3437,11 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
 .rs
 .sp
 The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must
-be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
-PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
-PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST, and
-PCRE2_DFA_RESTART. All but the last four of these are exactly the same as for
-\fBpcre2_match()\fP, so their description is not repeated here.
+be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_ENDANCHORED,
+PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
+PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST,
+and PCRE2_DFA_RESTART. All but the last four of these are exactly the same as
+for \fBpcre2_match()\fP, so their description is not repeated here.
 .sp
   PCRE2_PARTIAL_HARD
   PCRE2_PARTIAL_SOFT
@@ -3222,7 +3535,7 @@ NOTE: PCRE2's "auto-possessification" optimization usually applies to character
 repeats at the end of a pattern (as well as internally). For example, the
 pattern "a\ed+" is compiled as if it were "a\ed++". For DFA matching, this
 means that only one possible match is found. If you really do want multiple
-matches in such cases, either use an ungreedy repeat auch as "a\ed+?" or set
+matches in such cases, either use an ungreedy repeat such as "a\ed+?" or set
 the PCRE2_NO_AUTO_POSSESS option when compiling.
 .
 .
@@ -3275,7 +3588,7 @@ fail, this error is given.
 .sp
 \fBpcre2build\fP(3), \fBpcre2callout\fP(3), \fBpcre2demo(3)\fP,
 \fBpcre2matching\fP(3), \fBpcre2partial\fP(3), \fBpcre2posix\fP(3),
-\fBpcre2sample\fP(3), \fBpcre2stack\fP(3), \fBpcre2unicode\fP(3).
+\fBpcre2sample\fP(3), \fBpcre2unicode\fP(3).
 .
 .
 .SH AUTHOR
@@ -3292,6 +3605,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 17 June 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 31 December 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2build.3 b/doc/pcre2build.3
index 11b1c57..7586d22 100644
--- a/doc/pcre2build.3
+++ b/doc/pcre2build.3
@@ -1,4 +1,4 @@
-.TH PCRE2BUILD 3 "01 April 2016" "PCRE2 10.22"
+.TH PCRE2BUILD 3 "18 July 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .
@@ -55,21 +55,21 @@ running
 .sp
   ./configure --help
 .sp
-The following sections include descriptions of options whose names begin with
---enable or --disable. These settings specify changes to the defaults for the
-\fBconfigure\fP command. Because of the way that \fBconfigure\fP works,
---enable and --disable always come in pairs, so the complementary option always
-exists as well, but as it specifies the default, it is not described.
+The following sections include descriptions of "on/off" options whose names
+begin with --enable or --disable. Because of the way that \fBconfigure\fP
+works, --enable and --disable always come in pairs, so the complementary option
+always exists as well, but as it specifies the default, it is not described.
+Options that specify values have names that start with --with.
 .
 .
 .SH "BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES"
 .rs
 .sp
 By default, a library called \fBlibpcre2-8\fP is built, containing functions
-that take string arguments contained in vectors of bytes, interpreted either as
+that take string arguments contained in arrays of bytes, interpreted either as
 single-byte characters, or UTF-8 strings. You can also build two other
 libraries, called \fBlibpcre2-16\fP and \fBlibpcre2-32\fP, which process
-strings that are contained in vectors of 16-bit and 32-bit code units,
+strings that are contained in arrays of 16-bit and 32-bit code units,
 respectively. These can be interpreted either as single-unit characters or
 UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
 the following to the \fBconfigure\fP command:
@@ -119,10 +119,10 @@ Alternatively, patterns may be started with (*UTF) unless the application has
 locked this out by setting PCRE2_NEVER_UTF.
 .P
 UTF support allows the libraries to process character code points up to
-0x10ffff in the strings that they handle. It also provides support for
-accessing the Unicode properties of such characters, using pattern escapes such
-as \eP, \ep, and \eX. Only the general category properties such as \fILu\fP and
-\fINd\fP are supported. Details are given in the
+0x10ffff in the strings that they handle. Unicode support also gives access to
+the Unicode properties of characters, using pattern escapes such as \eP, \ep,
+and \eX. Only the general category properties such as \fILu\fP and \fINd\fP are
+supported. Details are given in the
 .\" HREF
 \fBpcre2pattern\fP
 .\"
@@ -151,13 +151,18 @@ out by setting the PCRE2_NEVER_BACKSLASH_C option when calling
 .SH "JUST-IN-TIME COMPILER SUPPORT"
 .rs
 .sp
-Just-in-time compiler support is included in the build by specifying
+Just-in-time (JIT) compiler support is included in the build by specifying
 .sp
   --enable-jit
 .sp
 This support is available only for certain hardware architectures. If this
-option is set for an unsupported architecture, a building error occurs.
-See the
+option is set for an unsupported architecture, a build error occurs. If you
+are running under SELinux you may also want to add
+.sp
+  --enable-jit-sealloc
+.sp
+which enables the use of an execmem allocator in JIT that is compatible with
+SELinux. This has no effect if JIT is not enabled. See the
 .\" HREF
 \fBpcre2jit\fP
 .\"
@@ -192,18 +197,22 @@ to the \fBconfigure\fP command. There is a fourth option, specified by
   --enable-newline-is-anycrlf
 .sp
 which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
-indicating a line ending. Finally, a fifth option, specified by
+indicating a line ending. A fifth option, specified by
 .sp
   --enable-newline-is-any
 .sp
 causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
 sequences are the three just mentioned, plus the single characters VT (vertical
 tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
-separator, U+2028), and PS (paragraph separator, U+2029).
+separator, U+2028), and PS (paragraph separator, U+2029). The final option is
+.sp
+  --enable-newline-is-nul
+.sp
+which causes NUL (binary zero) to be set as the default line-ending character.
 .P
 Whatever default line ending convention is selected when PCRE2 is built can be
 overridden by applications that use the library. At build time it is
-conventional to use the standard for your operating system.
+recommended to use the standard for your operating system.
 .
 .
 .SH "WHAT \eR MATCHES"
@@ -217,7 +226,7 @@ specify
 .sp
 the default is changed so that \eR matches only CR, LF, or CRLF. Whatever is
 selected when PCRE2 is built can be overridden by applications that use the
-called.
+library.
 .
 .
 .SH "HANDLING VERY LARGE PATTERNS"
@@ -241,41 +250,13 @@ additional data when handling them. For the 32-bit library the value is always
 4 and cannot be overridden; the value of --with-link-size is ignored.
 .
 .
-.SH "AVOIDING EXCESSIVE STACK USAGE"
-.rs
-.sp
-When matching with the \fBpcre2_match()\fP function, PCRE2 implements
-backtracking by making recursive calls to an internal function called
-\fBmatch()\fP. In environments where the size of the stack is limited, this can
-severely limit PCRE2's operation. (The Unix environment does not usually suffer
-from this problem, but it may sometimes be necessary to increase the maximum
-stack size. There is a discussion in the
-.\" HREF
-\fBpcre2stack\fP
-.\"
-documentation.) An alternative approach to recursion that uses memory from the
-heap to remember data, instead of using recursive function calls, has been
-implemented to work round the problem of limited stack size. If you want to
-build a version of PCRE2 that works this way, add
-.sp
-  --disable-stack-for-recursion
-.sp
-to the \fBconfigure\fP command. By default, the system functions \fBmalloc()\fP
-and \fBfree()\fP are called to manage the heap memory that is required, but
-custom memory management functions can be called instead. PCRE2 runs noticeably
-more slowly when built in this way. This option affects only the
-\fBpcre2_match()\fP function; it is not relevant for \fBpcre2_dfa_match()\fP.
-.
-.
 .SH "LIMITING PCRE2 RESOURCE USAGE"
 .rs
 .sp
-Internally, PCRE2 has a function called \fBmatch()\fP, which it calls
-repeatedly (sometimes recursively) when matching a pattern with the
-\fBpcre2_match()\fP function. By controlling the maximum number of times this
-function may be called during a single matching operation, a limit can be
-placed on the resources used by a single call to \fBpcre2_match()\fP. The limit
-can be changed at run time, as described in the
+The \fBpcre2_match()\fP function increments a counter each time it goes round
+its main loop. Putting a limit on this counter controls the amount of computing
+resource used by a single call to \fBpcre2_match()\fP. The limit can be changed
+at run time, as described in the
 .\" HREF
 \fBpcre2api\fP
 .\"
@@ -284,19 +265,47 @@ setting such as
 .sp
   --with-match-limit=500000
 .sp
-to the \fBconfigure\fP command. This setting has no effect on the
-\fBpcre2_dfa_match()\fP matching function.
+to the \fBconfigure\fP command. This setting also applies to the
+\fBpcre2_dfa_match()\fP matching function, and to JIT matching (though the
+counting is done differently).
 .P
-In some environments it is desirable to limit the depth of recursive calls of
-\fBmatch()\fP more strictly than the total number of calls, in order to
-restrict the maximum amount of stack (or heap, if --disable-stack-for-recursion
-is specified) that is used. A second limit controls this; it defaults to the
-value that is set for --with-match-limit, which imposes no additional
-constraints. However, you can set a lower limit by adding, for example,
+The \fBpcre2_match()\fP function starts out using a 20K vector on the system
+stack to record backtracking points. The more nested backtracking points there
+are (that is, the deeper the search tree), the more memory is needed. If the
+initial vector is not large enough, heap memory is used, up to a certain limit,
+which is specified in kilobytes. The limit can be changed at run time, as
+described in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+documentation. The default limit (in effect unlimited) is 20 million. You can
+change this by a setting such as
 .sp
-  --with-match-limit-recursion=10000
+  --with-heap-limit=500
 .sp
-to the \fBconfigure\fP command. This value can also be overridden at run time.
+which limits the amount of heap to 500 kilobytes. This limit applies only to
+interpretive matching in \fBpcre2_match()\fP. It does not apply when JIT (which has
+its own memory arrangements) is used, nor does it apply to
+\fBpcre2_dfa_match()\fP.
+.P
+You can also explicitly limit the depth of nested backtracking in the
+\fBpcre2_match()\fP interpreter. This limit defaults to the value that is set
+for --with-match-limit. You can set a lower default limit by adding, for
+example,
+.sp
+  --with-match-limit-depth=10000
+.sp
+to the \fBconfigure\fP command. This value can be overridden at run time. This
+depth limit indirectly limits the amount of heap memory that is used, but
+because the size of each backtracking "frame" depends on the number of
+capturing parentheses in a pattern, the amount of heap that is used before the
+limit is reached varies from pattern to pattern. This limit was more useful in
+versions before 10.30, where function recursion was used for backtracking.
+.P
+As well as applying to \fBpcre2_match()\fP, the depth limit also controls
+the depth of recursive function calls in \fBpcre2_dfa_match()\fP. These are
+used for lookaround assertions, atomic groups, and recursion within patterns.
+The limit does not apply to JIT matching.
 .
 .
 .SH "CREATING CHARACTER TABLES AT BUILD TIME"
@@ -312,10 +321,10 @@ only. If you add
 to the \fBconfigure\fP command, the distributed tables are no longer used.
 Instead, a program called \fBdftables\fP is compiled and run. This outputs the
 source for new set of tables, created in the default locale of your C run-time
-system. (This method of replacing the tables does not work if you are cross
+system. This method of replacing the tables does not work if you are cross
 compiling, because \fBdftables\fP is run on the local host. If you need to
 create alternative tables when cross compiling, you will have to do so "by
-hand".)
+hand".
 .
 .
 .SH "USING EBCDIC CODE"
@@ -385,16 +394,19 @@ they are not.
 .sp
 \fBpcre2grep\fP uses an internal buffer to hold a "window" on the file it is
 scanning, in order to be able to output "before" and "after" lines when it
-finds a match. The size of the buffer is controlled by a parameter whose
-default value is 20K. The buffer itself is three times this size, but because
-of the way it is used for holding "before" lines, the longest line that is
-guaranteed to be processable is the parameter size. You can change the default
-parameter value by adding, for example,
+finds a match. The starting size of the buffer is controlled by a parameter
+whose default value is 20K. The buffer itself is three times this size, but
+because of the way it is used for holding "before" lines, the longest line that
+is guaranteed to be processable is the parameter size. If a longer line is
+encountered, \fBpcre2grep\fP automatically expands the buffer, up to a
+specified maximum size, whose default is 1M or the starting size, whichever is
+the larger. You can change the default parameter values by adding, for example,
 .sp
-  --with-pcre2grep-bufsize=50K
+  --with-pcre2grep-bufsize=51200
+  --with-pcre2grep-max-bufsize=2097152
 .sp
-to the \fBconfigure\fP command. The caller of \fPpcre2grep\fP can override this
-value by using --buffer-size on the command line.
+to the \fBconfigure\fP command. The caller of \fBpcre2grep\fP can override
+these values by using --buffer-size and --max-buffer-size on the command line.
 .
 .
 .SH "PCRE2TEST OPTION FOR LIBREADLINE SUPPORT"
@@ -512,6 +524,44 @@ information about code coverage, see the \fBgcov\fP and \fBlcov\fP
 documentation.
 .
 .
+.SH "SUPPORT FOR FUZZERS"
+.rs
+.sp
+There is a special option for use by people who want to run fuzzing tests on
+PCRE2:
+.sp
+  --enable-fuzz-support
+.sp
+At present this applies only to the 8-bit library. If set, it causes an extra
+library called libpcre2-fuzzsupport.a to be built, but not installed. This
+contains a single function called LLVMFuzzerTestOneInput() whose arguments are
+a pointer to a string and the length of the string. When called, this function
+tries to compile the string as a pattern, and if that succeeds, to match it.
+This is done both with no options and with some random option bits that are
+generated from the string.
+.P
+Setting --enable-fuzz-support also causes a binary called \fBpcre2fuzzcheck\fP
+to be created. This is normally run under valgrind or used when PCRE2 is
+compiled with address sanitizing enabled. It calls the fuzzing function and
+outputs information about what it is doing. The input strings are specified by
+arguments: if an argument starts with "=" the rest of it is a literal input
+string. Otherwise, it is assumed to be a file name, and the contents of the
+file are the test string.
+.
+.
+.SH "OBSOLETE OPTION"
+.rs
+.sp
+In versions of PCRE2 prior to 10.30, there were two ways of handling
+backtracking in the \fBpcre2_match()\fP function. The default was to use the
+system stack, but if
+.sp
+  --disable-stack-for-recursion
+.sp
+was set, memory on the heap was used. From release 10.30 onwards this has
+changed (the stack is no longer used) and this option now does nothing except
+give a warning.
+.
 .SH "SEE ALSO"
 .rs
 .sp
@@ -532,6 +582,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 01 April 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 18 July 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2callout.3 b/doc/pcre2callout.3
index 6919f5a..e3fd600 100644
--- a/doc/pcre2callout.3
+++ b/doc/pcre2callout.3
@@ -1,4 +1,4 @@
-.TH PCRE2CALLOUT 3 "23 March 2015" "PCRE2 10.20"
+.TH PCRE2CALLOUT 3 "22 December 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -40,13 +40,22 @@ two callout points:
 .sp
 If the PCRE2_AUTO_CALLOUT option bit is set when a pattern is compiled, PCRE2
 automatically inserts callouts, all with number 255, before each item in the
-pattern. For example, if PCRE2_AUTO_CALLOUT is used with the pattern
+pattern except for immediately before or after an explicit callout. For
+example, if PCRE2_AUTO_CALLOUT is used with the pattern
 .sp
-  A(\ed{2}|--)
+  A(?C3)B
 .sp
 it is processed as if it were
 .sp
-(?C255)A(?C255)((?C255)\ed{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
+  (?C255)A(?C3)B(?C255)
+.sp
+Here is a more complicated example:
+.sp
+  A(\ed{2}|--)
+.sp
+With PCRE2_AUTO_CALLOUT, this pattern is processed as if it were
+.sp
+  (?C255)A(?C255)((?C255)\ed{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
 .sp
 Notice that there is a callout before and after each parenthesis and
 alternation bar. If the pattern contains a conditional group whose condition is
@@ -91,10 +100,10 @@ with PCRE2_ANCHORED and PCRE2_AUTO_CALLOUT and then applied to the string
   No match
 .sp
 This indicates that when matching [bc] fails, there is no backtracking into a+
-and therefore the callouts that would be taken for the backtracks do not occur.
-You can disable the auto-possessify feature by passing PCRE2_NO_AUTO_POSSESS to
-\fBpcre2_compile()\fP, or starting the pattern with (*NO_AUTO_POSSESS). In this
-case, the output changes to this:
+(because it is being treated as a++) and therefore the callouts that would be
+taken for the backtracks do not occur. You can disable the auto-possessify
+feature by passing PCRE2_NO_AUTO_POSSESS to \fBpcre2_compile()\fP, or starting
+the pattern with (*NO_AUTO_POSSESS). In this case, the output changes to this:
 .sp
   --->aaaa
    +0 ^        a+
@@ -115,10 +124,13 @@ By default, an optimization is applied when .* is the first significant item in
 a pattern. If PCRE2_DOTALL is set, so that the dot can match any character, the
 pattern is automatically anchored. If PCRE2_DOTALL is not set, a match can
 start only after an internal newline or at the beginning of the subject, and
-\fBpcre2_compile()\fP remembers this. This optimization is disabled, however,
-if .* is in an atomic group or if there is a back reference to the capturing
-group in which it appears. It is also disabled if the pattern contains (*PRUNE)
-or (*SKIP). However, the presence of callouts does not affect it.
+\fBpcre2_compile()\fP remembers this. If a pattern has more than one top-level
+branch, automatic anchoring occurs if all branches are anchorable.
+.P
+This optimization is disabled, however, if .* is in an atomic group or if there
+is a back reference to the capturing group in which it appears. It is also
+disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
+callouts does not affect it.
 .P
 For example, if the pattern .*\ed is compiled with PCRE2_AUTO_CALLOUT and
 applied to the string "aa", the \fBpcre2test\fP output is:
@@ -148,9 +160,6 @@ pattern with (*NO_DOTSTAR_ANCHOR). In this case, the output changes to:
 This shows more match attempts, starting at the second subject character.
 Another optimization, described in the next section, means that there is no
 subsequent attempt to match with an empty subject.
-.P
-If a pattern has more than one top-level branch, automatic anchoring occurs if
-all branches are anchorable.
 .
 .
 .SS "Other optimizations"
@@ -166,9 +175,10 @@ subject string is "abyz", the lack of "d" means that matching doesn't ever
 start, and the callout is never reached. However, with "abyd", though the
 result is still no match, the callout is obeyed.
 .P
-PCRE2 also knows the minimum length of a matching string, and will immediately
-give a "no match" return without actually running a match if the subject is not
-long enough, or, for unanchored patterns, if it has been scanned far enough.
+For most patterns PCRE2 also knows the minimum length of a matching string, and
+will immediately give a "no match" return without actually running a match if
+the subject is not long enough, or, for unanchored patterns, if it has been
+scanned far enough.
 .P
 You can disable these optimizations by passing the PCRE2_NO_START_OPTIMIZE
 option to \fBpcre2_compile()\fP, or by starting the pattern with
@@ -181,20 +191,22 @@ callouts such as the example above are obeyed.
 .rs
 .sp
 During matching, when PCRE2 reaches a callout point, if an external function is
-set in the match context, it is called. This applies to both normal and DFA
-matching. The first argument to the callout function is a pointer to a
-\fBpcre2_callout\fP block. The second argument is the void * callout data that
-was supplied when the callout was set up by calling \fBpcre2_set_callout()\fP
-(see the
+provided in the match context, it is called. This applies to normal, DFA,
+and JIT matching. The first argument to the callout function is a pointer
+to a \fBpcre2_callout\fP block. The second argument is the void * callout data
+that was supplied when the callout was set up by calling
+\fBpcre2_set_callout()\fP (see the
 .\" HREF
 \fBpcre2api\fP
 .\"
-documentation). The callout block structure contains the following fields:
+documentation). The callout block structure contains the following fields, not
+necessarily in this order:
 .sp
   uint32_t      \fIversion\fP;
   uint32_t      \fIcallout_number\fP;
   uint32_t      \fIcapture_top\fP;
   uint32_t      \fIcapture_last\fP;
+  uint32_t      \fIcallout_flags\fP;
   PCRE2_SIZE   *\fIoffset_vector\fP;
   PCRE2_SPTR    \fImark\fP;
   PCRE2_SPTR    \fIsubject\fP;
@@ -208,11 +220,12 @@ documentation). The callout block structure contains the following fields:
   PCRE2_SPTR    \fIcallout_string\fP;
 .sp
 The \fIversion\fP field contains the version number of the block format. The
-current version is 1; the three callout string fields were added for this
-version. If you are writing an application that might use an earlier release of
-PCRE2, you should check the version number before accessing any of these
-fields. The version number will increase in future if more fields are added,
-but the intention is never to remove any of the existing fields.
+current version is 2; the three callout string fields were added for version 1,
+and the \fIcallout_flags\fP field for version 2. If you are writing an
+application that might use an earlier release of PCRE2, you should check the
+version number before accessing any of these fields. The version number will
+increase in future if more fields are added, but the intention is never to
+remove any of the existing fields.
 .
 .
 .SS "Fields for numerical callouts"
@@ -220,8 +233,8 @@ but the intention is never to remove any of the existing fields.
 .sp
 For a numerical callout, \fIcallout_string\fP is NULL, and \fIcallout_number\fP
 contains the number of the callout, in the range 0-255. This is the number
-that follows (?C for manual callouts; it is 255 for automatically generated
-callouts.
+that follows (?C for callouts that are part of the pattern; it is 255 for
+automatically generated callouts.
 .
 .
 .SS "Fields for string callouts"
@@ -250,12 +263,38 @@ need to report errors in the callout string within the pattern.
 The remaining fields in the callout block are the same for both kinds of
 callout.
 .P
-The \fIoffset_vector\fP field is a pointer to the vector of capturing offsets
-(the "ovector") that was passed to the matching function in the match data
-block. When \fBpcre2_match()\fP is used, the contents can be inspected in
+The \fIoffset_vector\fP field is a pointer to a vector of capturing offsets
+(the "ovector"). You may read the elements in this vector, but you must not
+change any of them.
+.P
+For calls to \fBpcre2_match()\fP, the \fIoffset_vector\fP field is not (since
+release 10.30) a pointer to the actual ovector that was passed to the matching
+function in the match data block. Instead it points to an internal ovector of a
+size large enough to hold all possible captured substrings in the pattern. Note
+that whenever a recursion or subroutine call within a pattern completes, the
+capturing state is reset to what it was before.
+.P
+The \fIcapture_last\fP field contains the number of the most recently captured
+substring, and the \fIcapture_top\fP field contains one more than the number of
+the highest numbered captured substring so far. If no substrings have yet been
+captured, the value of \fIcapture_last\fP is 0 and the value of
+\fIcapture_top\fP is 1. The values of these fields do not always differ by one;
+for example, when the callout in the pattern ((a)(b))(?C2) is taken,
+\fIcapture_last\fP is 1 but \fIcapture_top\fP is 4.
+.P
+The contents of ovector[2] to ovector[<capture_top>*2-1] can be inspected in
 order to extract substrings that have been matched so far, in the same way as
-for extracting substrings after a match has completed. For the DFA matching
-function, this field is not useful.
+extracting substrings after a match has completed. The values in ovector[0] and
+ovector[1] are always PCRE2_UNSET because the match is by definition not
+complete. Substrings that have not been captured but whose numbers are less
+than \fIcapture_top\fP also have both of their ovector slots set to
+PCRE2_UNSET.
+.P
+For DFA matching, the \fIoffset_vector\fP field points to the ovector that was
+passed to the matching function in the match data block, but it holds no useful
+information at callout time because \fBpcre2_dfa_match()\fP does not support
+substring capturing. The value of \fIcapture_top\fP is always 1 and the value
+of \fIcapture_last\fP is always 0 for DFA matching.
 .P
 The \fIsubject\fP and \fIsubject_length\fP fields contain copies of the values
 that were passed to the matching function.
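The ovector rules in the hunk above can be sketched in plain C. This is only an illustration: report_captures() and DEMO_UNSET are hypothetical names, with DEMO_UNSET standing in for PCRE2_UNSET so that the sketch compiles without the PCRE2 headers. In a real callout the vector and the count would come from the offset_vector and capture_top fields of the callout block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for PCRE2_UNSET, which pcre2.h defines as (~(PCRE2_SIZE)0). */
#define DEMO_UNSET (~(size_t)0)

/* Hypothetical helper: print every group captured so far and return how many
   groups are set. Slots 0 and 1 are skipped because they are always unset
   while the match is still in progress. */
static unsigned report_captures(const char *subject, const size_t *ovector,
                                uint32_t capture_top)
{
  unsigned set_count = 0;
  assert(subject != NULL && ovector != NULL);

  for (uint32_t i = 1; i < capture_top; i++)
    {
    size_t start = ovector[2*i];
    size_t end = ovector[2*i + 1];

    /* A group numbered below capture_top may still be unset. */
    if (start == DEMO_UNSET || end == DEMO_UNSET)
      {
      printf("%2u: <unset>\n", (unsigned)i);
      continue;
      }

    printf("%2u: %.*s\n", (unsigned)i, (int)(end - start), subject + start);
    set_count++;
    }

  return set_count;
}
```

For the pattern ((a)(b))(?C2) matched against "ab", the vector at the callout would hold the pairs (unset, unset), (0, 2), (0, 1), (1, 2) with capture_top equal to 4, so the helper reports three groups.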
@@ -270,26 +309,19 @@ in the subject.
 The \fIcurrent_position\fP field contains the offset within the subject of the
 current match pointer.
 .P
-When the \fBpcre2_match()\fP is used, the \fIcapture_top\fP field contains one
-more than the number of the highest numbered captured substring so far. If no
-substrings have been captured, the value of \fIcapture_top\fP is one. This is
-always the case when the DFA functions are used, because they do not support
-captured substrings.
-.P
-The \fIcapture_last\fP field contains the number of the most recently captured
-substring. However, when a recursion exits, the value reverts to what it was
-outside the recursion, as do the values of all captured substrings. If no
-substrings have been captured, the value of \fIcapture_last\fP is 0. This is
-always the case for the DFA matching functions.
-.P
 The \fIpattern_position\fP field contains the offset in the pattern string to
 the next item to be matched.
 .P
 The \fInext_item_length\fP field contains the length of the next item to be
-matched in the pattern string. When the callout immediately precedes an
-alternation bar, a closing parenthesis, or the end of the pattern, the length
-is zero. When the callout precedes an opening parenthesis, the length is that
-of the entire subpattern.
+processed in the pattern string. When the callout is at the end of the pattern,
+the length is zero. When the callout precedes an opening parenthesis, the
+length includes meta characters that follow the parenthesis. For example, in a
+callout before an assertion such as (?=ab) the length is 3. For an
+alternation bar or a closing parenthesis, the length is one, unless a closing
+parenthesis is followed by a quantifier, in which case its length is included.
+(This changed in release 10.23. In earlier releases, before an opening
+parenthesis the length was that of the entire subpattern, and before an
+alternation bar or a closing parenthesis the length was zero.)
 .P
 The \fIpattern_position\fP and \fInext_item_length\fP fields are intended to
 help in distinguishing between different automatic callouts, which all have the
@@ -302,6 +334,33 @@ the zero-terminated name of the most recently passed (*MARK), (*PRUNE), or
 (*THEN) item in the match, or NULL if no such items have been passed. Instances
 of (*PRUNE) or (*THEN) without a name do not obliterate a previous (*MARK). In
 callouts from the DFA matching function this field always contains NULL.
+.P
+The \fIcallout_flags\fP field is always zero in callouts from
+\fBpcre2_dfa_match()\fP or when JIT is being used. When \fBpcre2_match()\fP
+without JIT is used, the following bits may be set:
+.sp
+  PCRE2_CALLOUT_STARTMATCH
+.sp
+This is set for the first callout after the start of matching for each new
+starting position in the subject.
+.sp
+  PCRE2_CALLOUT_BACKTRACK
+.sp
+This is set if there has been a matching backtrack since the previous callout,
+or since the start of matching if this is the first callout from a
+\fBpcre2_match()\fP run.
+.P
+Both bits are set when a backtrack has caused a "bumpalong" to a new starting
+position in the subject. Output from \fBpcre2test\fP does not indicate the
+presence of these bits unless the \fBcallout_extra\fP modifier is set.
+.P
+The information in the \fIcallout_flags\fP field is provided so that
+applications can track and tell their users how matching with backtracking is
+done. This can be useful when trying to optimize patterns, or just to
+understand how PCRE2 works. There is no support in \fBpcre2_dfa_match()\fP
+because there is no backtracking in DFA matching, and there is no support in
+JIT because JIT is all about maximizing matching performance. In both these
+cases the \fIcallout_flags\fP field is always zero.
 .
 .
 .SH "RETURN VALUES FROM CALLOUTS"
@@ -382,6 +441,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 23 March 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 22 December 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
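The version check and the two callout_flags bits described in this file's changes can be combined as in the following sketch. The helper and the enum are hypothetical. The two bit values are assumed to match those in pcre2.h from release 10.30, and are defined here only if the header has not already supplied them.

```c
#include <stdint.h>

/* Assumed values, used only when pcre2.h has not been included. */
#ifndef PCRE2_CALLOUT_STARTMATCH
#define PCRE2_CALLOUT_STARTMATCH 0x00000001u
#endif
#ifndef PCRE2_CALLOUT_BACKTRACK
#define PCRE2_CALLOUT_BACKTRACK  0x00000002u
#endif

/* Hypothetical classification of a single callout. */
enum demo_callout_kind
  {
  DEMO_FLAGS_UNAVAILABLE,  /* block version < 2: the field does not exist    */
  DEMO_CONTINUING,         /* no bits set (always so for DFA and JIT)        */
  DEMO_START_OF_MATCH,     /* first callout at a new starting position       */
  DEMO_BACKTRACKED,        /* a backtrack happened since the last callout    */
  DEMO_BUMPALONG           /* both bits: backtrack moved the start position  */
  };

/* Takes the version and callout_flags values read from a callout block. The
   caller must pass 0 for callout_flags when version is below 2, because the
   field must not be read from older blocks. */
static enum demo_callout_kind classify_callout(uint32_t version,
                                               uint32_t callout_flags)
{
  uint32_t bits;

  if (version < 2) return DEMO_FLAGS_UNAVAILABLE;

  bits = callout_flags &
    (PCRE2_CALLOUT_STARTMATCH | PCRE2_CALLOUT_BACKTRACK);

  if (bits == (PCRE2_CALLOUT_STARTMATCH | PCRE2_CALLOUT_BACKTRACK))
    return DEMO_BUMPALONG;
  if (bits == PCRE2_CALLOUT_STARTMATCH) return DEMO_START_OF_MATCH;
  if (bits == PCRE2_CALLOUT_BACKTRACK) return DEMO_BACKTRACKED;
  return DEMO_CONTINUING;
}
```

Masking with the two known bits keeps the classification stable if later releases define further flag bits.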
diff --git a/doc/pcre2compat.3 b/doc/pcre2compat.3
index a3306d7..8094ebd 100644
--- a/doc/pcre2compat.3
+++ b/doc/pcre2compat.3
@@ -1,4 +1,4 @@
-.TH PCRE2COMPAT 3 "15 March 2015" "PCRE2 10.20"
+.TH PCRE2COMPAT 3 "18 April 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
@@ -6,7 +6,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
 This document describes the differences in the ways that PCRE2 and Perl handle
 regular expressions. The differences described here are with respect to Perl
-versions 5.10 and above.
+version 5.26, but as both Perl and PCRE2 are continually changing, the
+information may sometimes be out of date.
 .P
 1. PCRE2 has only a subset of Perl's Unicode support. Details of what it does
 have are given in the
@@ -15,16 +16,17 @@ have are given in the
 .\"
 page.
 .P
-2. PCRE2 allows repeat quantifiers only on parenthesized assertions, but they
-do not mean what you might think. For example, (?!a){3} does not assert that
-the next three characters are not "a". It just asserts that the next character
-is not "a" three times (in principle: PCRE2 optimizes this to run the assertion
-just once). Perl allows repeat quantifiers on other assertions such as \eb, but
-these do not seem to have any use.
+2. Like Perl, PCRE2 allows repeat quantifiers on parenthesized assertions, but
+they do not mean what you might think. For example, (?!a){3} does not assert
+that the next three characters are not "a". It just asserts that the next
+character is not "a" three times (in principle: PCRE2 optimizes this to run the
+assertion just once). Perl allows some repeat quantifiers on other assertions,
+for example, \eb* (but not \eb{3}), but these do not seem to have any use.
 .P
-3. Capturing subpatterns that occur inside negative lookahead assertions are
-counted, but their entries in the offsets vector are never set. Perl sometimes
-(but not always) sets its numerical variables from inside negative assertions.
+3. Capturing subpatterns that occur inside negative lookaround assertions are
+counted, but their entries in the offsets vector are set only when a negative
+assertion is a condition that has a matching branch (that is, the condition is
+false).
 .P
 4. The following Perl escape sequences are not supported: \el, \eu, \eL,
 \eU, and \eN when followed by a character name or Unicode value. (\eN on its
@@ -35,13 +37,13 @@ generated by default. However, if the PCRE2_ALT_BSUX option is set,
 \eU and \eu are interpreted as ECMAScript interprets them.
 .P
 5. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE2 is
-built with Unicode support. The properties that can be tested with \ep and \eP
-are limited to the general category properties such as Lu and Nd, script names
-such as Greek or Han, and the derived properties Any and L&. PCRE2 does support
-the Cs (surrogate) property, which Perl does not; the Perl documentation says
-"Because Perl hides the need for the user to understand the internal
-representation of Unicode characters, there is no need to implement the
-somewhat messy concept of surrogates."
+built with Unicode support (the default). The properties that can be tested
+with \ep and \eP are limited to the general category properties such as Lu and
+Nd, script names such as Greek or Han, and the derived properties Any and L&.
+PCRE2 does support the Cs (surrogate) property, which Perl does not; the Perl
+documentation says "Because Perl hides the need for the user to understand the
+internal representation of Unicode characters, there is no need to implement
+the somewhat messy concept of surrogates."
 .P
 6. PCRE2 does support the \eQ...\eE escape for quoting substrings. Characters
 in between are treated as literals. This is slightly different from Perl in
@@ -60,29 +62,16 @@ Note the following examples:
 The \eQ...\eE sequence is recognized both inside and outside character classes.
 .P
 7. Fairly obviously, PCRE2 does not support the (?{code}) and (??{code})
-constructions. However, there is support for recursive patterns. This is not
-available in Perl 5.8, but it is in Perl 5.10. Also, the PCRE2 "callout"
-feature allows an external function to be called during pattern matching. See
-the
+constructions. However, PCRE2 has a "callout" feature, which allows an
+external function to be called during pattern matching. See the
 .\" HREF
 \fBpcre2callout\fP
 .\"
 documentation for details.
 .P
-8. Subroutine calls (whether recursive or not) are treated as atomic groups.
-Atomic recursion is like Python, but unlike Perl. Captured values that are set
-outside a subroutine call can be referenced from inside in PCRE2, but not in
-Perl. There is a discussion that explains these differences in more detail in
-the
-.\" HTML <a href="pcre2pattern.html#recursiondifference">
-.\" </a>
-section on recursion differences from Perl
-.\"
-in the
-.\" HREF
-\fBpcre2pattern\fP
-.\"
-page.
+8. Subroutine calls (whether recursive or not) were treated as atomic groups up
+to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
+into subroutine calls is now supported, as in Perl.
 .P
 9. If any of the backtracking control verbs are used in a subpattern that is
 called as a subroutine (whether or not recursively), their effect is confined
@@ -96,7 +85,7 @@ processed as anchored at the point where they are tested.
 one that is backtracked onto acts. For example, in the pattern
 A(*COMMIT)B(*PRUNE)C a failure in B triggers (*COMMIT), but a failure in C
 triggers (*PRUNE). Perl's behaviour is more complex; in many cases it is the
-same as PCRE2, but there are examples where it differs.
+same as PCRE2, but there are cases where it differs.
 .P
 11. Most backtracking verbs in assertions have their normal actions. They are
 not confined to the assertion.
@@ -109,17 +98,18 @@ the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
 13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
 names is not as general as Perl's. This is a consequence of the fact the PCRE2
 works internally just with numbers, using an external table to translate
-between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b)B),
+between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B)),
 where the two capturing parentheses have the same number but different names,
 is not supported, and causes an error at compile time. If it were allowed, it
 would not be possible to distinguish which parentheses matched, because both
 names map to capturing subpattern number 1. To avoid this confusing situation,
 an error is given at compile time.
 .P
-14. Perl recognizes comments in some places that PCRE2 does not, for example,
-between the ( and ? at the start of a subpattern. If the /x modifier is set,
-Perl allows white space between ( and ? (though current Perls warn that this is
-deprecated) but PCRE2 never does, even if the PCRE2_EXTENDED option is set.
+14. Perl used to recognize comments in some places that PCRE2 does not, for
+example, between the ( and ? at the start of a subpattern. If the /x modifier
+is set, Perl allowed white space between ( and ? though the latest Perls give
+an error (for a while it was just deprecated). There may still be some cases
+where Perl behaves differently.
 .P
 15. Perl, when in warning mode, gives warnings for character classes such as
 [A-\ed] or [a-[:digit:]]. It then treats the hyphens as literals. PCRE2 has no
@@ -129,46 +119,65 @@ certainly user mistakes.
 16. In PCRE2, the upper/lower case character properties Lu and Ll are not
 affected when case-independent matching is specified. For example, \ep{Lu}
 always matches an upper case letter. I think Perl has changed in this respect;
-in the release at the time of writing (5.16), \ep{Lu} and \ep{Ll} match all
+in the release at the time of writing (5.24), \ep{Lu} and \ep{Ll} match all
 letters, regardless of case, when case independence is specified.
 .P
 17. PCRE2 provides some extensions to the Perl regular expression facilities.
 Perl 5.10 includes new features that are not in earlier versions of Perl, some
-of which (such as named parentheses) have been in PCRE2 for some time. This
-list is with respect to Perl 5.10:
+of which (such as named parentheses) were in PCRE2 for some time before. This
+list is with respect to Perl 5.26:
 .sp
 (a) Although lookbehind assertions in PCRE2 must match fixed length strings,
 each alternative branch of a lookbehind assertion can match a different length
 of string. Perl requires them all to have the same length.
 .sp
-(b) If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set, the $
+(b) From PCRE2 10.23, back references to groups of fixed length are supported
+in lookbehinds, provided that there is no possibility of referencing a
+non-unique number or name. Perl does not support backreferences in lookbehinds.
+.sp
+(c) If PCRE2_DOLLAR_ENDONLY is set and PCRE2_MULTILINE is not set, the $
 meta-character matches only at the very end of the string.
 .sp
-(c) A backslash followed by a letter with no special meaning is faulted. (Perl
+(d) A backslash followed by a letter with no special meaning is faulted. (Perl
 can be made to issue a warning.)
 .sp
-(d) If PCRE2_UNGREEDY is set, the greediness of the repetition quantifiers is
+(e) If PCRE2_UNGREEDY is set, the greediness of the repetition quantifiers is
 inverted, that is, by default they are not greedy, but if followed by a
 question mark they are.
 .sp
-(e) PCRE2_ANCHORED can be used at matching time to force a pattern to be tried
+(f) PCRE2_ANCHORED can be used at matching time to force a pattern to be tried
 only at the first matching position in the subject string.
 .sp
-(f) The PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, and
-PCRE2_NO_AUTO_CAPTURE options have no Perl equivalents.
+(g) The PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY and PCRE2_NOTEMPTY_ATSTART
+options have no Perl equivalents.
 .sp
-(g) The \eR escape sequence can be restricted to match only CR, LF, or CRLF
+(h) The \eR escape sequence can be restricted to match only CR, LF, or CRLF
 by the PCRE2_BSR_ANYCRLF option.
 .sp
-(h) The callout facility is PCRE2-specific.
+(i) The callout facility is PCRE2-specific. Perl supports codeblocks and
+variable interpolation, but not general hooks on every match.
 .sp
-(i) The partial matching facility is PCRE2-specific.
+(j) The partial matching facility is PCRE2-specific.
 .sp
-(j) The alternative matching function (\fBpcre2_dfa_match()\fP matches in a
+(k) The alternative matching function (\fBpcre2_dfa_match()\fP) matches in a
 different way and is not Perl-compatible.
 .sp
-(k) PCRE2 recognizes some special sequences such as (*CR) at the start of
-a pattern that set overall options that cannot be changed within the pattern.
+(l) PCRE2 recognizes some special sequences such as (*CR) or (*NO_JIT) at
+the start of a pattern that set overall options that cannot be changed within
+the pattern.
+.P
+18. The Perl /a modifier restricts \ed to matching ASCII digits, and the /aa
+modifier restricts /i case-insensitive matching to pure ASCII, ignoring Unicode
+rules. This separation cannot be represented with PCRE2_UCP.
+.P
+19. Perl has different limits than PCRE2. See the
+.\" HREF
+\fBpcre2limit\fP
+.\"
+documentation for details. At release 5.10, Perl changed from recursion to
+iteration, keeping the intermediate matches on the heap, which is about 10%
+slower but is not subject to a stack-overflow limit. PCRE2 made a similar change
+at release 10.30, and also has many build-time and run-time customizable limits.
 .
 .
 .SH AUTHOR
@@ -185,6 +194,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 15 March 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 18 April 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2convert.3 b/doc/pcre2convert.3
new file mode 100644
index 0000000..3dadf6e
--- /dev/null
+++ b/doc/pcre2convert.3
@@ -0,0 +1,163 @@
+.TH PCRE2CONVERT 3 "12 July 2017" "PCRE2 10.30"
+.SH NAME
+PCRE2 - Perl-compatible regular expressions (revised API)
+.SH "EXPERIMENTAL PATTERN CONVERSION FUNCTIONS"
+.rs
+.sp
+This document describes a set of functions that can be used to convert
+"foreign" patterns into PCRE2 regular expressions. This facility is currently
+experimental, and may be changed in future releases. Two kinds of pattern,
+globs and POSIX patterns, are supported.
+.
+.
+.SH "THE CONVERT CONTEXT"
+.rs
+.sp
+.nf
+.B pcre2_convert_context *pcre2_convert_context_create(
+.B "  pcre2_general_context *\fIgcontext\fP);"
+.sp
+.B pcre2_convert_context *pcre2_convert_context_copy(
+.B "  pcre2_convert_context *\fIcvcontext\fP);"
+.sp
+.B void pcre2_convert_context_free(pcre2_convert_context *\fIcvcontext\fP);
+.sp
+.B int pcre2_set_glob_escape(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIescape_char\fP);"
+.sp
+.B int pcre2_set_glob_separator(pcre2_convert_context *\fIcvcontext\fP,
+.B "  uint32_t \fIseparator_char\fP);"
+.fi
+.sp
+A convert context is used to hold parameters that affect the way that pattern
+conversion works. Like all PCRE2 contexts, you need to use a context only if
+you want to override the defaults. There are the usual create, copy, and free
+functions. If custom memory management functions are set in a general context
+that is passed to \fBpcre2_convert_context_create()\fP, they are used for all
+memory management within the conversion functions.
+.P
+There are only two parameters in the convert context at present. Both apply
+only to glob conversions. The escape character defaults to grave accent under
+Windows, otherwise backslash. It can be set to zero, meaning no escape
+character, or to any punctuation character with a code point less than 256.
+The separator character defaults to backslash under Windows, otherwise forward
+slash. It can be set to forward slash, backslash, or dot.
+.P
+The two setting functions return zero on success, or PCRE2_ERROR_BADDATA if
+their second argument is invalid.
+.
+.
+.SH "THE CONVERSION FUNCTION"
+.rs
+.sp
+.nf
+.B int pcre2_pattern_convert(PCRE2_SPTR \fIpattern\fP, PCRE2_SIZE \fIlength\fP,
+.B "  uint32_t \fIoptions\fP, PCRE2_UCHAR **\fIbuffer\fP,"
+.B "  PCRE2_SIZE *\fIblength\fP, pcre2_convert_context *\fIcvcontext\fP);"
+.sp
+.B void pcre2_converted_pattern_free(PCRE2_UCHAR *\fIconverted_pattern\fP);
+.fi
+.sp
+The first two arguments of \fBpcre2_pattern_convert()\fP define the foreign
+pattern that is to be converted. The length may be given as
+PCRE2_ZERO_TERMINATED. The \fBoptions\fP argument defines how the pattern is to
+be processed. If the input is UTF, the PCRE2_CONVERT_UTF option should be set.
+PCRE2_CONVERT_NO_UTF_CHECK may also be set if you are sure the input is valid.
+One or more of the glob options, or one of the following POSIX options must be
+set to define the type of conversion that is required:
+.sp
+  PCRE2_CONVERT_GLOB
+  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
+  PCRE2_CONVERT_GLOB_NO_STARSTAR
+  PCRE2_CONVERT_POSIX_BASIC
+  PCRE2_CONVERT_POSIX_EXTENDED
+.sp
+Details of the conversions are given below. The \fBbuffer\fP and \fBblength\fP
+arguments define how the output is handled:
+.P
+If \fBbuffer\fP is NULL, the function just returns the length of the converted
+pattern via \fBblength\fP. This is one less than the length of buffer needed,
+because a terminating zero is always added to the output.
+.P
+If \fBbuffer\fP points to a NULL pointer, an output buffer is obtained using
+the allocator in the context or \fBmalloc()\fP if no context is supplied. A
+pointer to this buffer is placed in the variable to which \fBbuffer\fP points.
+When no longer needed the output buffer must be freed by calling
+\fBpcre2_converted_pattern_free()\fP.
+.P
+If \fBbuffer\fP points to a non-NULL pointer, \fBblength\fP must be set to the
+actual length of the buffer provided (in code units).
+.P
+In all cases, after successful conversion, the variable pointed to by
+\fBblength\fP is updated to the length actually used (in code units), excluding
+the terminating zero that is always added.
+.P
+If an error occurs, the length (via \fBblength\fP) is set to the offset
+within the input pattern where the error was detected. Only gross syntax errors
+are caught; there are plenty of errors that will get passed on for
+\fBpcre2_compile()\fP to discover.
+.P
+The return from \fBpcre2_pattern_convert()\fP is zero on success or a non-zero
+PCRE2 error code. Note that PCRE2 error codes may be positive or negative:
+\fBpcre2_compile()\fP uses mostly positive codes and \fBpcre2_match()\fP
+negative ones; \fBpcre2_pattern_convert()\fP uses existing codes of both kinds. A
+textual error message can be obtained by calling
+\fBpcre2_get_error_message()\fP.
+.
+.
+.SH "CONVERTING GLOBS"
+.rs
+.sp
+Globs are used to match file names, and consequently have the concept of a
+"path separator", which defaults to backslash under Windows and forward slash
+otherwise. If PCRE2_CONVERT_GLOB is set, the wildcards * and ? are not
+permitted to match separator characters, but the double-star (**) feature
+(which does match separators) is supported.
+.P
+PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR matches globs with wildcards allowed to
+match separator characters. PCRE2_CONVERT_GLOB_NO_STARSTAR matches globs with the
+double-star feature disabled. These options may be given together.
+.
+.
+.SH "CONVERTING POSIX PATTERNS"
+.rs
+.sp
+POSIX defines two kinds of regular expression pattern: basic and extended.
+These can be processed by setting PCRE2_CONVERT_POSIX_BASIC or
+PCRE2_CONVERT_POSIX_EXTENDED, respectively.
+.P
+In POSIX patterns, backslash is not special in a character class. Unmatched
+closing parentheses are treated as literals.
+.P
+In basic patterns, ? + | {} and () must be escaped to be recognized
+as metacharacters outside a character class. If the first character in the
+pattern is * it is treated as a literal. ^ is a metacharacter only at the start
+of a branch.
+.P
+In extended patterns, a backslash not in a character class always
+makes the next character literal, whatever it is. There are no backreferences.
+.P
+Note: POSIX mandates that the longest possible match at the first matching
+position must be found. This is not what \fBpcre2_match()\fP does; it yields
+the first match that is found. An application can use \fBpcre2_dfa_match()\fP
+to find the longest match, but that does not support backreferences (but then
+neither do POSIX extended patterns).
+.
+.
+.SH AUTHOR
+.rs
+.sp
+.nf
+Philip Hazel
+University Computing Service
+Cambridge, England.
+.fi
+.
+.
+.SH REVISION
+.rs
+.sp
+.nf
+Last updated: 12 July 2017
+Copyright (c) 1997-2017 University of Cambridge.
+.fi
diff --git a/doc/pcre2demo.3 b/doc/pcre2demo.3
index c02dcd9..a9e58e2 100644
--- a/doc/pcre2demo.3
+++ b/doc/pcre2demo.3
@@ -228,6 +228,21 @@ pcre2_match_data_create_from_pattern() above. */
 if (rc == 0)
   printf("ovector was not big enough for all the captured substrings\en");
 
+/* We must guard against patterns such as /(?=.\eK)/ that use \eK in an assertion
+to set the start of a match later than its end. In this demonstration program,
+we just detect this case and give up. */
+
+if (ovector[0] > ovector[1])
+  {
+  printf("\e\eK was used in an assertion to set the match start after its end.\en"
+    "From end to start the match was: %.*s\en", (int)(ovector[0] - ovector[1]),
+      (char *)(subject + ovector[1]));
+  printf("Run abandoned\en");
+  pcre2_match_data_free(match_data);
+  pcre2_code_free(re);
+  return 1;
+  }
+
 /* Show substrings stored in the output vector by number. Obviously, in a real
 application you might want to do things other than print them. */
 
@@ -355,6 +370,29 @@ for (;;)
     options = PCRE2_NOTEMPTY_ATSTART | PCRE2_ANCHORED;
     }
 
+  /* If the previous match was not an empty string, there is one tricky case to
+  consider. If a pattern contains \eK within a lookbehind assertion at the
+  start, the end of the matched string can be at the offset where the match
+  started. Without special action, this leads to a loop that keeps on matching
+  the same substring. We must detect this case and arrange to move the start on
+  by one character. The pcre2_get_startchar() function returns the starting
+  offset that was passed to pcre2_match(). */
+
+  else
+    {
+    PCRE2_SIZE startchar = pcre2_get_startchar(match_data);
+    if (start_offset <= startchar)
+      {
+      if (startchar >= subject_length) break;   /* Reached end of subject.   */
+      start_offset = startchar + 1;             /* Advance by one character. */
+      if (utf8)                                 /* If UTF-8, it may be more  */
+        {                                       /*   than one code unit.     */
+        for (; start_offset < subject_length; start_offset++)
+          if ((subject[start_offset] & 0xc0) != 0x80) break;
+        }
+      }
+    }
+
   /* Run the next matching operation */
 
   rc = pcre2_match(
@@ -419,6 +457,21 @@ for (;;)
   if (rc == 0)
     printf("ovector was not big enough for all the captured substrings\en");
 
+  /* We must guard against patterns such as /(?=.\eK)/ that use \eK in an
+  assertion to set the start of a match later than its end. In this
+  demonstration program, we just detect this case and give up. */
+
+  if (ovector[0] > ovector[1])
+    {
+    printf("\e\eK was used in an assertion to set the match start after its end.\en"
+      "From end to start the match was: %.*s\en", (int)(ovector[0] - ovector[1]),
+        (char *)(subject + ovector[1]));
+    printf("Run abandoned\en");
+    pcre2_match_data_free(match_data);
+    pcre2_code_free(re);
+    return 1;
+    }
+
   /* As before, show substrings stored in the output vector by number, and then
   also any named substrings. */
 
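The start-offset adjustment added to the demo loop above can be isolated as a small helper. This is a sketch only: advance_one_char() is a hypothetical name, and it assumes an 8-bit subject that is valid UTF-8 when the utf8 argument is non-zero. It moves on by one code unit and then, in UTF-8 mode, skips any continuation bytes (those of the form 10xxxxxx).

```c
#include <stddef.h>

/* Hypothetical helper: return the offset of the character that follows the
   one starting at 'offset'. The result never exceeds subject_length. */
static size_t advance_one_char(const unsigned char *subject,
                               size_t subject_length, size_t offset, int utf8)
{
  if (offset >= subject_length) return subject_length;

  offset++;                          /* Always move by at least one unit. */

  if (utf8)
    {
    /* Continuation bytes have the top two bits set to 10. */
    while (offset < subject_length && (subject[offset] & 0xc0) == 0x80)
      offset++;
    }

  return offset;
}
```

With the four-byte subject "a", U+00E9 (encoded as 0xC3 0xA9), "z", advancing from offset 1 gives 3 in UTF-8 mode and 2 otherwise.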
diff --git a/doc/pcre2grep.1 b/doc/pcre2grep.1
index 6d27780..5e5cbea 100644
--- a/doc/pcre2grep.1
+++ b/doc/pcre2grep.1
@@ -1,4 +1,4 @@
-.TH PCRE2GREP 1 "19 June 2016" "PCRE2 10.22"
+.TH PCRE2GREP 1 "13 November 2017" "PCRE2 10.31"
 .SH NAME
 pcre2grep - a grep with Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -52,11 +52,18 @@ span line boundaries. What defines a line boundary is controlled by the
 \fB-N\fP (\fB--newline\fP) option.
 .P
 The amount of memory used for buffering files that are being scanned is
-controlled by a parameter that can be set by the \fB--buffer-size\fP option.
-The default value for this parameter is specified when \fBpcre2grep\fP is
-built, with the default default being 20K. A block of memory three times this
-size is used (to allow for buffering "before" and "after" lines). An error
-occurs if a line overflows the buffer.
+controlled by parameters that can be set by the \fB--buffer-size\fP and
+\fB--max-buffer-size\fP options. The first of these sets the size of buffer
+that is obtained at the start of processing. If an input file contains very
+long lines, a larger buffer may be needed; this is handled by automatically
+extending the buffer, up to the limit specified by \fB--max-buffer-size\fP. The
+default values for these parameters are specified when \fBpcre2grep\fP is
+built, with the default defaults being 20K and 1M respectively. An error occurs
+if a line is too long and the buffer can no longer be expanded.
+.P
+The block of memory that is actually used is three times the "buffer size", to
+allow for buffering "before" and "after" lines. If the buffer size is too
+small, fewer than requested "before" and "after" lines may be output.
 .P
 Patterns can be no longer than 8K or BUFSIZ bytes, whichever is the greater.
 BUFSIZ is defined in \fB<stdio.h>\fP. When there is more than one pattern
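The buffer behaviour described in the hunk above can be sketched as a small Python model. This is an illustration under stated assumptions, not pcre2grep's code: the function names are hypothetical, the doubling growth step is an assumption (the text only says the buffer is extended automatically), and a line is assumed to fit when its length does not exceed the buffer size. The 20K and 1M defaults, the factor of three, and the rule that the maximum is forced to be no smaller than the starting size come from the man page text.

```python
K = 1024
M = 1024 * K


def grow_buffer(line_length, buffer_size=20 * K, max_buffer_size=1 * M):
    """Return a buffer size able to hold the line, or raise if none is allowed."""
    # The maximum is silently forced to be no smaller than the starting size.
    max_buffer_size = max(max_buffer_size, buffer_size)
    size = buffer_size
    while line_length > size:
        if size >= max_buffer_size:
            # Mirrors "an error occurs if a line is too long and the buffer
            # can no longer be expanded".
            raise OverflowError("line too long; buffer cannot be expanded")
        # ASSUMPTION: doubling is used here purely for illustration.
        size = min(size * 2, max_buffer_size)
    return size


def allocated_bytes(buffer_size):
    """The block actually used is three times the "buffer size"."""
    return 3 * buffer_size
```

With the defaults, a 30K line would lead this model to a 40K buffer, and any line longer than 1M raises the error.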
@@ -94,27 +101,31 @@ The \fB--locale\fP option can be used to override this.
 .rs
 .sp
 It is possible to compile \fBpcre2grep\fP so that it uses \fBlibz\fP or
-\fBlibbz2\fP to read files whose names end in \fB.gz\fP or \fB.bz2\fP,
-respectively. You can find out whether your binary has support for one or both
-of these file types by running it with the \fB--help\fP option. If the
-appropriate support is not present, files are treated as plain text. The
-standard input is always so treated.
+\fBlibbz2\fP to read compressed files whose names end in \fB.gz\fP or
+\fB.bz2\fP, respectively. You can find out whether your \fBpcre2grep\fP binary
+has support for one or both of these file types by running it with the
+\fB--help\fP option. If the appropriate support is not present, all files are
+treated as plain text. The standard input is always so treated. When input is
+from a compressed .gz or .bz2 file, the \fB--line-buffered\fP option is
+ignored.
 .
 .
 .SH "BINARY FILES"
 .rs
 .sp
 By default, a file that contains a binary zero byte within the first 1024 bytes
-is identified as a binary file, and is processed specially. (GNU grep also
-identifies binary files in this manner.) See the \fB--binary-files\fP option
-for a means of changing the way binary files are handled.
+is identified as a binary file, and is processed specially. (GNU grep
+identifies binary files in this manner.) However, if the newline type is
+specified as "nul", that is, the line terminator is a binary zero, the test for
+a binary file is not applied. See the \fB--binary-files\fP option for a means
+of changing the way binary files are handled.
 .
 .
 .SH OPTIONS
 .rs
 .sp
 The order in which some of the options appear can affect the output. For
-example, both the \fB-h\fP and \fB-l\fP options affect the printing of file
+example, both the \fB-H\fP and \fB-l\fP options affect the printing of file
 names. Whichever comes later in the command line will be the one that takes
 effect. Similarly, except where noted below, if an option is given twice, the
 later setting is used. Numerical values for options may be followed by K or M,
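The binary-file test described under BINARY FILES in the hunk above can be modelled in a few lines of Python. This is a sketch only: the function name and the default value "lf" for the newline type are placeholders, and the exact comparison pcre2grep performs on the newline type is not shown in the text.

```python
def looks_binary(data: bytes, newline_type: str = "lf") -> bool:
    """Model of the default test: a zero byte within the first 1024 bytes."""
    # When the line terminator itself is a binary zero, the test is not applied.
    if newline_type == "nul":
        return False
    return b"\x00" in data[:1024]
```

A zero byte that first appears after offset 1023 does not mark the file as binary in this model.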
@@ -126,24 +137,27 @@ command line starts with a hyphen but is not an option. This allows for the
 processing of patterns and file names that start with hyphens.
 .TP
 \fB-A\fP \fInumber\fP, \fB--after-context=\fP\fInumber\fP
-Output \fInumber\fP lines of context after each matching line. If file names
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of \fInumber\fP is expected to be relatively small. However, \fBpcre2grep\fP
-guarantees to have up to 8K of following text available for context output.
+Output up to \fInumber\fP lines of context after each matching line. Fewer
+lines are output if the next match or the end of the file is reached, or if the
+processing buffer size has been set too small. If file names and/or line
+numbers are being output, a hyphen separator is used instead of a colon for the
+context lines. A line containing "--" is output between each group of lines,
+unless they are in fact contiguous in the input file. The value of \fInumber\fP
+is expected to be relatively small. When \fB-c\fP is used, \fB-A\fP is ignored.
 .TP
 \fB-a\fP, \fB--text\fP
 Treat binary files as text. This is equivalent to
 \fB--binary-files\fP=\fItext\fP.
 .TP
 \fB-B\fP \fInumber\fP, \fB--before-context=\fP\fInumber\fP
-Output \fInumber\fP lines of context before each matching line. If file names
-and/or line numbers are being output, a hyphen separator is used instead of a
-colon for the context lines. A line containing "--" is output between each
-group of lines, unless they are in fact contiguous in the input file. The value
-of \fInumber\fP is expected to be relatively small. However, \fBpcre2grep\fP
-guarantees to have up to 8K of preceding text available for context output.
+Output up to \fInumber\fP lines of context before each matching line. Fewer
+lines are output if the previous match or the start of the file is within
+\fInumber\fP lines, or if the processing buffer size has been set too small. If
+file names and/or line numbers are being output, a hyphen separator is used
+instead of a colon for the context lines. A line containing "--" is output
+between each group of lines, unless they are in fact contiguous in the input
+file. The value of \fInumber\fP is expected to be relatively small. When
+\fB-c\fP is used, \fB-B\fP is ignored.
 .TP
 \fB--binary-files=\fP\fIword\fP
 Specify how binary files are to be processed. If the word is "binary" (the
@@ -158,8 +172,9 @@ be of interest and are skipped without causing any output or affecting the
 return code.
 .TP
 \fB--buffer-size=\fP\fInumber\fP
-Set the parameter that controls how much memory is used for buffering files
-that are being scanned.
+Set the parameter that controls how much memory is obtained at the start of
+processing for buffering files that are being scanned. See also
+\fB--max-buffer-size\fP below.
 .TP
 \fB-C\fP \fInumber\fP, \fB--context=\fP\fInumber\fP
 Output \fInumber\fP lines of context both before and after each matching line.
@@ -167,13 +182,15 @@ This is equivalent to setting both \fB-A\fP and \fB-B\fP to the same value.
 .TP
 \fB-c\fP, \fB--count\fP
 Do not output lines from the files that are being scanned; instead output the
-number of matches (or non-matches if \fB-v\fP is used) that would otherwise
-have caused lines to be shown. By default, this count is the same as the number
-of suppressed lines, but if the \fB-M\fP (multiline) option is used (without
-\fB-v\fP), there may be more suppressed lines than the number of matches.
+number of lines that would have been shown, either because they matched, or, if
+\fB-v\fP is set, because they failed to match. By default, this count is
+exactly the same as the number of lines that would have been output, but if the
+\fB-M\fP (multiline) option is used (without \fB-v\fP), there may be more
+suppressed lines than the count (that is, the number of matches).
 .sp
 If no lines are selected, the number zero is output. If several files are
-being scanned, a count is output for each of them. However, if the
+being scanned, a count is output for each of them and the \fB-t\fP option can
+be used to cause a total to be output at the end. However, if the
 \fB--files-with-matches\fP option is also used, only those files whose counts
 are greater than zero are listed. When \fB-c\fP is used, the \fB-A\fP,
 \fB-B\fP, and \fB-C\fP options are ignored.
@@ -192,12 +209,22 @@ connected to a terminal. More resources are used when colouring is enabled,
 because \fBpcre2grep\fP has to search for all possible matches in a line, not
 just one, in order to colour them all.
 .sp
-The colour that is used can be specified by setting the environment variable
-PCRE2GREP_COLOUR or PCRE2GREP_COLOR. The value of this variable should be a
-string of two numbers, separated by a semicolon. They are copied directly into
-the control string for setting colour on a terminal, so it is your
-responsibility to ensure that they make sense. If neither of the environment
-variables is set, the default is "1;31", which gives red.
+The colour that is used can be specified by setting one of the environment
+variables PCRE2GREP_COLOUR, PCRE2GREP_COLOR, PCREGREP_COLOUR, or
+PCREGREP_COLOR, which are checked in that order. If none of these are set,
+\fBpcre2grep\fP looks for GREP_COLORS or GREP_COLOR (in that order). The value
+of the variable should be a string of two numbers, separated by a semicolon,
+except in the case of GREP_COLORS, which must start with "ms=" or "mt="
+followed by two semicolon-separated colours, terminated by the end of the
+string or by a colon. If GREP_COLORS does not start with "ms=" or "mt=" it is
+ignored, and GREP_COLOR is checked.
+.sp
+If the string obtained from one of the above variables contains any characters
+other than semicolon or digits, the setting is ignored and the default colour
+is used. The string is copied directly into the control string for setting
+colour on a terminal, so it is your responsibility to ensure that the values
+make sense. If no relevant environment variable is set, the default is "1;31",
+which gives red.
 .TP
 \fB-D\fP \fIaction\fP, \fB--devices=\fP\fIaction\fP
 If an input path is not a regular file or a directory, "action" specifies how
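The colour-variable lookup described in the colour option text of the hunk above can be sketched as follows. This is one reading of the text, not pcre2grep's implementation: the function name is hypothetical, the environment is passed in as a plain dict, and a variable counts as "set" whenever its key is present. In particular, the sketch assumes that a well-formed GREP_COLORS value containing other characters falls back to the default colour rather than to GREP_COLOR.

```python
DEFAULT_COLOUR = "1;31"  # the documented default, which gives red
OWN_VARS = (
    "PCRE2GREP_COLOUR",
    "PCRE2GREP_COLOR",
    "PCREGREP_COLOUR",
    "PCREGREP_COLOR",
)


def pick_colour(env):
    """Return the colour string chosen from an environment mapping."""
    value = None
    # The four pcre2grep-specific variables are checked first, in order.
    for name in OWN_VARS:
        if name in env:
            value = env[name]
            break
    if value is None:
        grep_colors = env.get("GREP_COLORS")
        if grep_colors is not None and grep_colors.startswith(("ms=", "mt=")):
            # The colours end at the end of the string or at a colon.
            value = grep_colors[3:].split(":", 1)[0]
        elif "GREP_COLOR" in env:
            # GREP_COLORS is absent or ignored, so GREP_COLOR is checked.
            value = env["GREP_COLOR"]
    if value is None:
        return DEFAULT_COLOUR
    # Anything other than digits and semicolons means the setting is ignored.
    if any(ch not in "0123456789;" for ch in value):
        return DEFAULT_COLOUR
    return value
```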
@@ -213,6 +240,9 @@ compatibility with GNU grep), "recurse" (equivalent to the \fB-r\fP option), or
 operating systems the effect of reading a directory like this is an immediate
 end-of-file; in others it may provoke an error.
 .TP
+\fB--depth-limit\fP=\fInumber\fP
+See \fB--match-limit\fP below.
+.TP
 \fB-e\fP \fIpattern\fP, \fB--regex=\fP\fIpattern\fP, \fB--regexp=\fP\fIpattern\fP
 Specify a pattern to be matched. This option can be used multiple times in
 order to specify several patterns. It can also be used as a way of specifying a
@@ -273,17 +303,17 @@ files; it does not apply to patterns specified by any of the \fB--include\fP or
 \fB--exclude\fP options.
 .TP
 \fB-f\fP \fIfilename\fP, \fB--file=\fP\fIfilename\fP
-Read patterns from the file, one per line, and match them against
-each line of input. What constitutes a newline when reading the file is the
-operating system's default. The \fB--newline\fP option has no effect on this
-option. Trailing white space is removed from each line, and blank lines are
-ignored. An empty file contains no patterns and therefore matches nothing. See
-also the comments about multiple patterns versus a single pattern with
-alternatives in the description of \fB-e\fP above.
-.sp
-If this option is given more than once, all the specified files are
-read. A data line is output if any of the patterns match it. A file name can
-be given as "-" to refer to the standard input. When \fB-f\fP is used, patterns
+Read patterns from the file, one per line, and match them against each line of
+input. What constitutes a newline when reading the file is the operating
+system's default. The \fB--newline\fP option has no effect on this option.
+Trailing white space is removed from each line, and blank lines are ignored. An
+empty file contains no patterns and therefore matches nothing. See also the
+comments about multiple patterns versus a single pattern with alternatives in
+the description of \fB-e\fP above.
+.sp
+If this option is given more than once, all the specified files are read. A
+data line is output if any of the patterns match it. A file name can be given
+as "-" to refer to the standard input. When \fB-f\fP is used, patterns
 specified on the command line using \fB-e\fP may also be present; they are
 tested before the file's patterns. However, no other pattern is taken from the
 command line; all arguments are treated as the names of paths to be searched.
@@ -304,8 +334,8 @@ Instead of showing lines or parts of lines that match, show each match as an
 offset from the start of the file and a length, separated by a comma. In this
 mode, no context is shown. That is, the \fB-A\fP, \fB-B\fP, and \fB-C\fP
 options are ignored. If there is more than one match in a line, each of them is
-shown separately. This option is mutually exclusive with \fB--line-offsets\fP
-and \fB--only-matching\fP.
+shown separately. This option is mutually exclusive with \fB--output\fP,
+\fB--line-offsets\fP, and \fB--only-matching\fP.
 .TP
 \fB-H\fP, \fB--with-filename\fP
 Force the inclusion of the file name at the start of output lines when
@@ -313,13 +343,18 @@ searching a single file. By default, the file name is not shown in this case.
 For matching lines, the file name is followed by a colon; for context lines, a
 hyphen separator is used. If a line number is also being output, it follows the
 file name. When the \fB-M\fP option causes a pattern to match more than one
-line, only the first is preceded by the file name.
+line, only the first is preceded by the file name. This option overrides any
+previous \fB-h\fP, \fB-l\fP, or \fB-L\fP options.
 .TP
 \fB-h\fP, \fB--no-filename\fP
 Suppress the output file names when searching multiple files. By default,
 file names are shown when multiple files are searched. For matching lines, the
 file name is followed by a colon; for context lines, a hyphen separator is used.
-If a line number is also being output, it follows the file name.
+If a line number is also being output, it follows the file name. This option
+overrides any previous \fB-H\fP, \fB-L\fP, or \fB-l\fP options.
+.TP
+\fB--heap-limit\fP=\fInumber\fP
+See \fB--match-limit\fP below.
 .TP
 \fB--help\fP
 Output a help message, giving brief details of the command options and file
@@ -365,16 +400,18 @@ given any number of times. If a directory matches both \fB--include-dir\fP and
 \fB-L\fP, \fB--files-without-match\fP
 Instead of outputting lines from the files, just output the names of the files
 that do not contain any lines that would have been output. Each file name is
-output once, on a separate line.
+output once, on a separate line. This option overrides any previous \fB-H\fP,
+\fB-h\fP, or \fB-l\fP options.
 .TP
 \fB-l\fP, \fB--files-with-matches\fP
 Instead of outputting lines from the files, just output the names of the files
-containing lines that would have been output. Each file name is output
-once, on a separate line. Searching normally stops as soon as a matching line
-is found in a file. However, if the \fB-c\fP (count) option is also used,
-matching continues in order to obtain the correct count, and those files that
-have at least one match are listed along with their counts. Using this option
-with \fB-c\fP is a way of suppressing the listing of files with no matches.
+containing lines that would have been output. Each file name is output once, on
+a separate line. Searching normally stops as soon as a matching line is found
+in a file. However, if the \fB-c\fP (count) option is also used, matching
+continues in order to obtain the correct count, and those files that have at
+least one match are listed along with their counts. Using this option with
+\fB-c\fP is a way of suppressing the listing of files with no matches. This
+option overrides any previous \fB-H\fP, \fB-h\fP, or \fB-L\fP options.
 .TP
 \fB--label\fP=\fIname\fP
 This option supplies a name to be used for the standard input when file names
@@ -382,14 +419,16 @@ are being output. If not supplied, "(standard input)" is used. There is no
 short form for this option.
 .TP
 \fB--line-buffered\fP
-When this option is given, input is read and processed line by line, and the
-output is flushed after each write. By default, input is read in large chunks,
-unless \fBpcre2grep\fP can determine that it is reading from a terminal (which
-is currently possible only in Unix-like environments). Output to terminal is
-normally automatically flushed by the operating system. This option can be
-useful when the input or output is attached to a pipe and you do not want
-\fBpcre2grep\fP to buffer up large amounts of data. However, its use will
-affect performance, and the \fB-M\fP (multiline) option ceases to work.
+When this option is given, non-compressed input is read and processed line by
+line, and the output is flushed after each write. By default, input is read in
+large chunks, unless \fBpcre2grep\fP can determine that it is reading from a
+terminal (which is currently possible only in Unix-like environments). Output
+to terminal is normally automatically flushed by the operating system. This
+option can be useful when the input or output is attached to a pipe and you do
+not want \fBpcre2grep\fP to buffer up large amounts of data. However, its use
+will affect performance, and the \fB-M\fP (multiline) option ceases to work.
+When input is from a compressed .gz or .bz2 file, \fB--line-buffered\fP is
+ignored.
 .TP
 \fB--line-offsets\fP
 Instead of showing lines or parts of lines that match, show each match as a
@@ -398,7 +437,8 @@ number is terminated by a colon (as usual; see the \fB-n\fP option), and the
 offset and length are separated by a comma. In this mode, no context is shown.
 That is, the \fB-A\fP, \fB-B\fP, and \fB-C\fP options are ignored. If there is
 more than one match in a line, each of them is shown separately. This option is
-mutually exclusive with \fB--file-offsets\fP and \fB--only-matching\fP.
+mutually exclusive with \fB--output\fP, \fB--file-offsets\fP, and
+\fB--only-matching\fP.
 .TP
 \fB--locale\fP=\fIlocale-name\fP
 This option specifies a locale to be used for pattern matching. It overrides
@@ -407,46 +447,51 @@ locale is specified, the PCRE2 library's default (usually the "C" locale) is
 used. There is no short form for this option.
 .TP
 \fB--match-limit\fP=\fInumber\fP
-Processing some regular expression patterns can require a very large amount of
-memory, leading in some cases to a program crash if not enough is available.
-Other patterns may take a very long time to search for all possible matching
-strings. The \fBpcre2_match()\fP function that is called by \fBpcre2grep\fP to
-do the matching has two parameters that can limit the resources that it uses.
-.sp
-The \fB--match-limit\fP option provides a means of limiting resource usage
-when processing patterns that are not going to match, but which have a very
-large number of possibilities in their search trees. The classic example is a
-pattern that uses nested unlimited repeats. Internally, PCRE2 uses a function
-called \fBmatch()\fP which it calls repeatedly (sometimes recursively). The
-limit set by \fB--match-limit\fP is imposed on the number of times this
-function is called during a match, which has the effect of limiting the amount
-of backtracking that can take place.
-.sp
-The \fB--recursion-limit\fP option is similar to \fB--match-limit\fP, but
-instead of limiting the total number of times that \fBmatch()\fP is called, it
-limits the depth of recursive calls, which in turn limits the amount of memory
-that can be used. The recursion depth is a smaller number than the total number
-of calls, because not all calls to \fBmatch()\fP are recursive. This limit is
-of use only if it is set smaller than \fB--match-limit\fP.
+Processing some regular expression patterns may take a very long time to search
+for all possible matching strings. Others may require a very large amount of
+memory. There are three options that set resource limits for matching.
+.sp
+The \fB--match-limit\fP option provides a means of limiting computing resource
+usage when processing patterns that are not going to match, but which have a
+very large number of possibilities in their search trees. The classic example
+is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
+counter that is incremented each time around its main processing loop. If the
+value set by \fB--match-limit\fP is reached, an error occurs.
+.sp
+The \fB--heap-limit\fP option specifies, as a number of kilobytes, the amount
+of heap memory that may be used for matching. Heap memory is needed only if
+matching the pattern requires a significant number of nested backtracking
+points to be remembered. This parameter can be set to zero to forbid the use of
+heap memory altogether.
+.sp
+The \fB--depth-limit\fP option limits the depth of nested backtracking points,
+which indirectly limits the amount of memory that is used. The amount of memory
+needed for each backtracking point depends on the number of capturing
+parentheses in the pattern, so the amount of memory that is used before this
+limit acts varies from pattern to pattern. This limit is of use only if it is
+set smaller than \fB--match-limit\fP.
 .sp
 There are no short forms for these options. The default settings are specified
-when the PCRE2 library is compiled, with the default default being 10 million.
+when the PCRE2 library is compiled, with the default defaults being very large
+and so effectively unlimited.
+.TP
+\fB--max-buffer-size=\fP\fInumber\fP
+This limits the expansion of the processing buffer, whose initial size can be
+set by \fB--buffer-size\fP. The maximum buffer size is silently forced to be no
+smaller than the starting buffer size.
 .TP
 \fB-M\fP, \fB--multiline\fP
-Allow patterns to match more than one line. When this option is given, patterns
-may usefully contain literal newline characters and internal occurrences of ^
-and $ characters. The output for a successful match may consist of more than
-one line. The first is the line in which the match started, and the last is the
-line in which the match ended. If the matched string ends with a newline
-sequence the output ends at the end of that line.
-.sp
-When this option is set, the PCRE2 library is called in "multiline" mode. This
-allows a matched string to extend past the end of a line and continue on one or
-more subsequent lines. However, \fBpcre2grep\fP still processes the input line
-by line. Once a match has been handled, scanning restarts at the beginning of
-the next line, just as it does when \fB-M\fP is not present. This means that it
-is possible for the second or subsequent lines in a multiline match to be
-output again as part of another match.
+Allow patterns to match more than one line. When this option is set, the PCRE2
+library is called in "multiline" mode. This allows a matched string to extend
+past the end of a line and continue on one or more subsequent lines. Patterns
+used with \fB-M\fP may usefully contain literal newline characters and internal
+occurrences of ^ and $ characters. The output for a successful match may
+consist of more than one line. The first line is the line in which the match
+started, and the last line is the line in which the match ended. If the matched
+string ends with a newline sequence, the output ends at the end of that line.
+If \fB-v\fP is set, none of the lines in a multi-line match are output. Once a
+match has been handled, scanning restarts at the beginning of the line after
+the one in which the match ended.
 .sp
 The newline sequence that separates multiple lines must be matched as part of
 the pattern. For example, to find the phrase "regular expression" in a file
@@ -460,11 +505,8 @@ and is followed by + so as to match trailing white space on the first line as
 well as possibly handling a two-character newline sequence.
 .sp
 There is a limit to the number of lines that can be matched, imposed by the way
-that \fBpcre2grep\fP buffers the input file as it scans it. However,
-\fBpcre2grep\fP ensures that at least 8K characters or the rest of the file
-(whichever is the shorter) are available for forward matching, and similarly
-the previous 8K characters (or all the previous characters, if fewer than 8K)
-are guaranteed to be available for lookbehind assertions. The \fB-M\fP option
+that \fBpcre2grep\fP buffers the input file as it scans it. With a sufficiently
+large processing buffer, this should not be a problem, but the \fB-M\fP option
 does not work when input is read line by line (see \fB--line-buffered\fP).
 .TP
 \fB-N\fP \fInewline-type\fP, \fB--newline\fP=\fInewline-type\fP
@@ -503,16 +545,41 @@ was explicitly disabled at build time. This option can be used to disable the
 use of JIT at run time. It is provided for testing and working round problems.
 It should never be needed in normal use.
 .TP
+\fB-O\fP \fItext\fP, \fB--output\fP=\fItext\fP
+When there is a match, instead of outputting the whole line that matched,
+output just the given text. This option is mutually exclusive with
+\fB--only-matching\fP, \fB--file-offsets\fP, and \fB--line-offsets\fP. Escape
+sequences starting with a dollar character may be used to insert the contents
+of the matched part of the line and/or captured substrings into the text.
+.sp
+$<digits> or ${<digits>} is replaced by the captured
+substring of the given decimal number; zero substitutes the whole match. If
+the number is greater than the number of capturing substrings, or if the
+capture is unset, the replacement is empty.
+.sp
+$a is replaced by bell; $b by backspace; $e by escape; $f by form feed; $n by
+newline; $r by carriage return; $t by tab; $v by vertical tab.
+.sp
+$o<digits> is replaced by the character represented by the given octal
+number; up to three digits are processed.
+.sp
+$x<digits> is replaced by the character represented by the given hexadecimal
+number; up to two digits are processed.
+.sp
+Any other character is substituted by itself. In particular, $$ is replaced by
+a single dollar.
+.TP
 \fB-o\fP, \fB--only-matching\fP
 Show only the part of the line that matched a pattern instead of the whole
 line. In this mode, no context is shown. That is, the \fB-A\fP, \fB-B\fP, and
 \fB-C\fP options are ignored. If there is more than one match in a line, each
-of them is shown separately. If \fB-o\fP is combined with \fB-v\fP (invert the
-sense of the match to find non-matching lines), no output is generated, but the
-return code is set appropriately. If the matched portion of the line is empty,
-nothing is output unless the file name or line number are being printed, in
-which case they are shown on an otherwise empty line. This option is mutually
-exclusive with \fB--file-offsets\fP and \fB--line-offsets\fP.
+of them is shown separately, on a separate line of output. If \fB-o\fP is
+combined with \fB-v\fP (invert the sense of the match to find non-matching
+lines), no output is generated, but the return code is set appropriately. If
+the matched portion of the line is empty, nothing is output unless the file
+name or line number are being printed, in which case they are shown on an
+otherwise empty line. This option is mutually exclusive with \fB--output\fP,
+\fB--file-offsets\fP and \fB--line-offsets\fP.
 .TP
 \fB-o\fP\fInumber\fP, \fB--only-matching\fP=\fInumber\fP
 Show only the part of the line that matched the capturing parentheses of the
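The dollar escapes listed for the new -O (--output) option earlier in this hunk can be modelled with a short Python function. It is a sketch under assumptions, not the pcre2grep source: all names are hypothetical, an unbraced $<digits> is assumed to consume every consecutive digit, a lone trailing dollar is kept as is, and malformed ${...}, $o or $x escapes raise an error here because the text does not say how they are treated.

```python
DIGITS = "0123456789"
SIMPLE = {"a": "\a", "b": "\b", "e": "\x1b", "f": "\f",
          "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def _take(text, pos, allowed, limit):
    """Return (run, new_pos) for up to `limit` characters drawn from `allowed`."""
    end = pos
    while end < len(text) and end - pos < limit and text[end] in allowed:
        end += 1
    return text[pos:end], end


def _capture(groups, number):
    """Out-of-range or unset captures are replaced by the empty string."""
    if number < len(groups) and groups[number] is not None:
        return groups[number]
    return ""


def expand_output(template, groups):
    """groups[0] is the whole match; groups[n] is capture n, or None if unset."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        i += 1
        if ch != "$" or i == len(template):
            out.append(ch)
            continue
        ch = template[i]
        if ch in DIGITS:                      # $<digits>
            run, i = _take(template, i, DIGITS, len(template))
            out.append(_capture(groups, int(run)))
        elif ch == "{":                       # ${<digits>}
            run, end = _take(template, i + 1, DIGITS, len(template))
            if not run or end == len(template) or template[end] != "}":
                raise ValueError("malformed ${...} escape")  # ASSUMPTION
            out.append(_capture(groups, int(run)))
            i = end + 1
        elif ch in SIMPLE:                    # $a $b $e $f $n $r $t $v
            out.append(SIMPLE[ch])
            i += 1
        elif ch in "ox":                      # $o: 3 octal, $x: 2 hex digits
            if ch == "o":
                allowed, limit, base = "01234567", 3, 8
            else:
                allowed, limit, base = "0123456789abcdefABCDEF", 2, 16
            run, end = _take(template, i + 1, allowed, limit)
            if not run:
                raise ValueError("missing digits after $" + ch)  # ASSUMPTION
            out.append(chr(int(run, base)))
            i = end
        else:                                 # any other character, e.g. $$
            out.append(ch)
            i += 1
    return "".join(out)
```

For a match "ab" with captures "a" and "b", the template "$2-$1" expands to "b-a" in this model, and "${1}0" expands to "a0" because the braces end the capture number.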
@@ -520,14 +587,15 @@ given number. Up to 32 capturing parentheses are supported, and -o0 is
 equivalent to \fB-o\fP without a number. Because these options can be given
 without an argument (see above), if an argument is present, it must be given in
 the same shell item, for example, -o3 or --only-matching=2. The comments given
-for the non-argument case above also apply to this case. If the specified
+for the non-argument case above also apply to this option. If the specified
 capturing parentheses do not exist in the pattern, or were not set in the
 match, nothing is output unless the file name or line number are being output.
 .sp
-If this option is given multiple times, multiple substrings are output, in the
-order the options are given. For example, -o3 -o1 -o3 causes the substrings
-matched by capturing parentheses 3 and 1 and then 3 again to be output. By
-default, there is no separator (but see the next option).
+If this option is given multiple times, multiple substrings are output for each
+match, in the order the options are given, and all on one line. For example,
+-o3 -o1 -o3 causes the substrings matched by capturing parentheses 3 and 1 and
+then 3 again to be output. By default, there is no separator (but see the next
+option).
 .TP
 \fB--om-separator\fP=\fItext\fP
 Specify a separating string for multiple occurrences of \fB-o\fP. The default
@@ -552,6 +620,17 @@ Suppress error messages about non-existent or unreadable files. Such files are
 quietly skipped. However, the return code is still 2, even if matches were
 found in other files.
 .TP
+\fB-t\fP, \fB--total-count\fP
+This option is useful when scanning more than one file. If used on its own,
+\fB-t\fP suppresses all output except for a grand total number of matching
+lines (or non-matching lines if \fB-v\fP is used) in all the files. If \fB-t\fP
+is used with \fB-c\fP, a grand total is output except when the previous output
+is just one line. In other words, it is not output when just one file's count
+is listed. If file names are being output, the grand total is preceded by
+"TOTAL:". Otherwise, it appears as just another number. The \fB-t\fP option is
+ignored when used with \fB-L\fP (list files without matches), because the grand
+total would always be zero.
+.TP
 \fB-u\fP, \fB--utf-8\fP
 Operate in UTF-8 mode. This option is available only if PCRE2 has been compiled
 with UTF-8 support. All patterns (including those for any \fB--exclude\fP and
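The interaction of -t with -c described in this hunk can be sketched as a small reporting function. This is a hedged model only: the function name and arguments are hypothetical, the "name:count" layout and the absence of a space after "TOTAL:" are assumptions, and the sketch assumes the "TOTAL:" prefix applies only when per-file counts with file names precede it.

```python
def count_report(counts, count_flag, total_flag, show_names):
    """counts is a list of (file_name, count) pairs in scan order."""
    lines = []
    if count_flag:
        # -c: one count per file, optionally preceded by the file name.
        for name, count in counts:
            lines.append(f"{name}:{count}" if show_names else str(count))
    if total_flag:
        total = sum(count for _, count in counts)
        # With -c, the total is omitted when only one count line was output.
        if not (count_flag and len(lines) == 1):
            prefix = "TOTAL:" if (count_flag and show_names) else ""
            lines.append(prefix + str(total))
    return lines
```

Used on its own, -t yields a single number in this model; combined with -c over several files it appends the grand total after the per-file counts.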
@@ -568,16 +647,18 @@ Invert the sense of the match, so that lines which do \fInot\fP match any of
 the patterns are the ones that are found.
 .TP
 \fB-w\fP, \fB--word-regex\fP, \fB--word-regexp\fP
-Force the patterns to match only whole words. This is equivalent to having \eb
-at the start and end of the pattern. This option applies only to the patterns
-that are matched against the contents of files; it does not apply to patterns
-specified by any of the \fB--include\fP or \fB--exclude\fP options.
+Force the patterns only to match "words". That is, there must be a word
+boundary at the start and end of each matched string. This is equivalent to
+having "\eb(?:" at the start of each pattern, and ")\eb" at the end. This
+option applies only to the patterns that are matched against the contents of
+files; it does not apply to patterns specified by any of the \fB--include\fP or
+\fB--exclude\fP options.
 .TP
 \fB-x\fP, \fB--line-regex\fP, \fB--line-regexp\fP
-Force the patterns to be anchored (each must start matching at the beginning of
-a line) and in addition, require them to match entire lines. This is equivalent
-to having ^ and $ characters at the start and end of each alternative top-level
-branch in every pattern. This option applies only to the patterns that are
+Force the patterns to start matching only at the beginnings of lines, and in
+addition, require them to match entire lines. In multiline mode the match may
+be more than one line. This is equivalent to having "^(?:" at the start of each
+pattern and ")$" at the end. This option applies only to the patterns that are
 matched against the contents of files; it does not apply to patterns specified
 by any of the \fB--include\fP or \fB--exclude\fP options.
 .
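The revised wording for -w and -x in this hunk wraps the whole pattern in a non-capturing group, which matters when the pattern has top-level alternatives. The sketch below shows the difference using Python's re module as a stand-in; it is not PCRE2, the helper names are hypothetical, and only the wrapper strings themselves come from the man page text.

```python
import re


def word_wrap(pattern):
    """Pattern as described for -w: a word boundary on both sides of the group."""
    return r"\b(?:" + pattern + r")\b"


def line_wrap(pattern):
    """Pattern as described for -x: the group must match the entire line."""
    return "^(?:" + pattern + ")$"


# Without the group, \b or ^ and $ bind only to the first and last alternative.
naive_word = r"\bcat|dog\b"
naive_line = "^ab|cd$"

print(re.search(naive_word, "hotdog") is not None)            # naive form matches
print(re.search(word_wrap("cat|dog"), "hotdog") is not None)  # grouped form does not
print(re.search(naive_line, "abx") is not None)               # naive form matches
print(re.search(line_wrap("ab|cd"), "abx") is not None)       # grouped form does not
```

The grouped forms reject "hotdog" and "abx", whereas the naive concatenations accept them because the anchors apply to only one alternative each.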
@@ -612,10 +693,11 @@ relying on the C I/O library to convert this to an appropriate sequence.
 Many of the short and long forms of \fBpcre2grep\fP's options are the same
 as in the GNU \fBgrep\fP program. Any long option of the form
 \fB--xxx-regexp\fP (GNU terminology) is also available as \fB--xxx-regex\fP
-(PCRE2 terminology). However, the \fB--file-list\fP, \fB--file-offsets\fP,
-\fB--include-dir\fP, \fB--line-offsets\fP, \fB--locale\fP, \fB--match-limit\fP,
-\fB-M\fP, \fB--multiline\fP, \fB-N\fP, \fB--newline\fP, \fB--om-separator\fP,
-\fB--recursion-limit\fP, \fB-u\fP, and \fB--utf-8\fP options are specific to
+(PCRE2 terminology). However, the \fB--depth-limit\fP, \fB--file-list\fP,
+\fB--file-offsets\fP, \fB--heap-limit\fP, \fB--include-dir\fP,
+\fB--line-offsets\fP, \fB--locale\fP, \fB--match-limit\fP, \fB-M\fP,
+\fB--multiline\fP, \fB-N\fP, \fB--newline\fP, \fB--om-separator\fP,
+\fB--output\fP, \fB-u\fP, and \fB--utf-8\fP options are specific to
 \fBpcre2grep\fP, as is the use of the \fB--only-matching\fP option with a
 capturing parentheses number.
 .P
@@ -658,14 +740,14 @@ options does have data, it must be given in the first form, using an equals
 character. Otherwise \fBpcre2grep\fP will assume that it has no data.
 .
 .
-.SH "CALLING EXTERNAL SCRIPTS"
+.SH "USING PCRE2'S CALLOUT FACILITY"
 .rs
 .sp
-On non-Windows systems, \fBpcre2grep\fP has, by default, support for calling
-external programs or scripts during matching by making use of PCRE2's callout
-facility. However, this support can be disabled when \fBpcre2grep\fP is built.
-You can find out whether your binary has support for callouts by running it
-with the \fB--help\fP option. If the support is not enabled, all callouts in
+\fBpcre2grep\fP has, by default, support for calling external programs or
+scripts or echoing specific strings during matching by making use of PCRE2's
+callout facility. However, this support can be disabled when \fBpcre2grep\fP is
+built. You can find out whether your binary has support for callouts by running
+it with the \fB--help\fP option. If the support is not enabled, all callouts in
 patterns are ignored by \fBpcre2grep\fP.
 .P
 A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is
@@ -673,10 +755,17 @@ either a number or a quoted string (see the
 .\" HREF
 \fBpcre2callout\fP
 .\"
-documentation for details). Numbered callouts are ignored by \fBpcre2grep\fP.
-String arguments are parsed as a list of substrings separated by pipe (vertical
-bar) characters. The first substring must be an executable name, with the
-following substrings specifying arguments:
+documentation for details). Numbered callouts are ignored by \fBpcre2grep\fP;
+only callouts with string arguments are useful.
+.
+.
+.SS "Calling external programs or scripts"
+.rs
+.sp
+If the callout string does not start with a pipe (vertical bar) character, it
+is parsed into a list of substrings separated by pipe characters. The first
+substring must be an executable name, with the following substrings specifying
+arguments:
 .sp
   executable_name|arg1|arg2|...
 .sp
@@ -710,6 +799,19 @@ the non-existence of the executable), a local matching failure occurs and the
 matcher backtracks in the normal way.
 .
 .
+.SS "Echoing a specific string"
+.rs
+.sp
+If the callout string starts with a pipe (vertical bar) character, the rest of
+the string is written to the output, having been passed through the same escape
+processing as text from the --output option. This provides a simple echoing
+facility that avoids calling an external program or script. No terminator is
+added to the string, so if you want a newline, you must include it explicitly.
+Matching continues normally after the string is output. If you want to see only
+the callout output but not any output from an actual match, you should end the
+relevant pattern with (*FAIL).
+.
+.
 .SH "MATCHING ERRORS"
 .rs
 .sp
@@ -722,9 +824,9 @@ message and the line that caused the problem to the standard error stream. If
 there are more than 20 such errors, \fBpcre2grep\fP gives up.
 .P
 The \fB--match-limit\fP option of \fBpcre2grep\fP can be used to set the
-overall resource limit; there is a second option called \fB--recursion-limit\fP
-that sets a limit on the amount of memory (usually stack) that is used (see the
-discussion of these options above).
+overall resource limit. There are also other limits that affect the amount of
+memory used during matching; see the discussion of \fB--heap-limit\fP and
+\fB--depth-limit\fP above.
 .
 .
 .SH DIAGNOSTICS
@@ -735,6 +837,9 @@ for syntax errors, overlong lines, non-existent or inaccessible files (even if
 matches were found in other files) or too many matching errors. Using the
 \fB-s\fP option to suppress error messages about inaccessible files does not
 affect the return code.
+.P
+When run under VMS, the return code is placed in the symbol PCRE2GREP_RC
+because VMS does not distinguish between exit(0) and exit(1).
 .
 .
 .SH "SEE ALSO"
@@ -757,6 +862,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 19 June 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 13 November 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2grep.txt b/doc/pcre2grep.txt
index 31aa610..30517b4 100644
--- a/doc/pcre2grep.txt
+++ b/doc/pcre2grep.txt
@@ -51,61 +51,73 @@ DESCRIPTION
        boundary is controlled by the -N (--newline) option.
 
        The amount of memory used for buffering files that are being scanned is
-       controlled  by a parameter that can be set by the --buffer-size option.
-       The default value for this parameter is  specified  when  pcre2grep  is
-       built,  with  the  default  default  being 20K. A block of memory three
-       times this size is used (to allow for buffering  "before"  and  "after"
-       lines). An error occurs if a line overflows the buffer.
-
-       Patterns  can  be  no  longer than 8K or BUFSIZ bytes, whichever is the
-       greater.  BUFSIZ is defined in <stdio.h>. When there is more  than  one
+       controlled  by  parameters  that  can  be  set by the --buffer-size and
+       --max-buffer-size options. The first of these sets the size  of  buffer
+       that  is obtained at the start of processing. If an input file contains
+       very long lines, a larger buffer may be  needed;  this  is  handled  by
+       automatically extending the buffer, up to the limit specified by --max-
+       buffer-size. The default values for these parameters are specified when
+       pcre2grep  is built, with the default defaults being 20K and 1M respec-
+       tively. An error occurs if a line is too long and  the  buffer  can  no
+       longer be expanded.
+
+       The  block  of  memory that is actually used is three times the "buffer
+       size", to allow for buffering "before" and "after" lines. If the buffer
+       size  is too small, fewer than requested "before" and "after" lines may
+       be output.
+
+       Patterns can be no longer than 8K or BUFSIZ  bytes,  whichever  is  the
+       greater.   BUFSIZ  is defined in <stdio.h>. When there is more than one
        pattern (specified by the use of -e and/or -f), each pattern is applied
-       to each line in the order in which they are defined,  except  that  all
+       to  each  line  in the order in which they are defined, except that all
        the -e patterns are tried before the -f patterns.
 
-       By  default, as soon as one pattern matches a line, no further patterns
+       By default, as soon as one pattern matches a line, no further  patterns
        are considered. However, if --colour (or --color) is used to colour the
-       matching  substrings, or if --only-matching, --file-offsets, or --line-
-       offsets is used to output only  the  part  of  the  line  that  matched
+       matching substrings, or if --only-matching, --file-offsets, or  --line-
+       offsets  is  used  to  output  only  the  part of the line that matched
        (either shown literally, or as an offset), scanning resumes immediately
-       following the match, so that further matches on the same  line  can  be
-       found.  If  there  are  multiple  patterns,  they  are all tried on the
-       remainder of the line, but patterns that follow the  one  that  matched
+       following  the  match,  so that further matches on the same line can be
+       found. If there are multiple  patterns,  they  are  all  tried  on  the
+       remainder  of  the  line, but patterns that follow the one that matched
        are not tried on the earlier part of the line.
 
-       This  behaviour  means  that  the  order in which multiple patterns are
-       specified can affect the output when one of the above options is  used.
-       This  is no longer the same behaviour as GNU grep, which now manages to
-       display earlier matches for later patterns (as  long  as  there  is  no
+       This behaviour means that the order  in  which  multiple  patterns  are
+       specified  can affect the output when one of the above options is used.
+       This is no longer the same behaviour as GNU grep, which now manages  to
+       display  earlier  matches  for  later  patterns (as long as there is no
        overlap).
 
-       Patterns  that can match an empty string are accepted, but empty string
+       Patterns that can match an empty string are accepted, but empty  string
        matches   are   never   recognized.   An   example   is   the   pattern
-       "(super)?(man)?",  in  which  all components are optional. This pattern
-       finds all occurrences of both "super" and  "man";  the  output  differs
-       from  matching  with  "super|man" when only the matching substrings are
+       "(super)?(man)?", in which all components are  optional.  This  pattern
+       finds  all  occurrences  of  both "super" and "man"; the output differs
+       from matching with "super|man" when only the  matching  substrings  are
        being shown.
 
-       If the LC_ALL or LC_CTYPE environment variable is set,  pcre2grep  uses
+       If  the  LC_ALL or LC_CTYPE environment variable is set, pcre2grep uses
        the value to set a locale when calling the PCRE2 library.  The --locale
        option can be used to override this.
 
 
 SUPPORT FOR COMPRESSED FILES
 
-       It is possible to compile pcre2grep so that it uses libz or  libbz2  to
-       read  files  whose names end in .gz or .bz2, respectively. You can find
-       out whether your binary has support for one or both of these file types
-       by running it with the --help option. If the appropriate support is not
-       present, files are treated as plain text. The standard input is  always
-       so treated.
+       It  is  possible to compile pcre2grep so that it uses libz or libbz2 to
+       read compressed files whose names end in .gz or .bz2, respectively. You
+       can  find out whether your pcre2grep binary has support for one or both
+       of these file types by running it with the --help option. If the appro-
+       priate support is not present, all files are treated as plain text. The
+       standard input is always so treated. When input is  from  a  compressed
+       .gz or .bz2 file, the --line-buffered option is ignored.
 
 
 BINARY FILES
 
        By  default,  a  file that contains a binary zero byte within the first
        1024 bytes is identified as a binary file, and is processed  specially.
-       (GNU  grep  also  identifies  binary  files  in  this  manner.) See the
+       (GNU grep identifies binary files in this manner.) However, if the new-
+       line type is specified as "nul", that is,  the  line  terminator  is  a
+       binary  zero,  the  test  for  a  binary  file  is not applied. See the
        --binary-files option for a means of changing the way binary files  are
        handled.
 
@@ -113,7 +125,7 @@ BINARY FILES
 OPTIONS
 
        The  order  in  which some of the options appear can affect the output.
-       For example, both the -h and -l options affect  the  printing  of  file
+       For example, both the -H and -l options affect  the  printing  of  file
        names.  Whichever  comes later in the command line will be the one that
        takes effect. Similarly, except where noted  below,  if  an  option  is
        given  twice,  the  later setting is used. Numerical values for options
@@ -126,46 +138,50 @@ OPTIONS
                  names that start with hyphens.
 
        -A number, --after-context=number
-                 Output number lines of context after each matching  line.  If
-                 file  names  and/or  line  numbers are being output, a hyphen
-                 separator is used instead of a colon for the context lines. A
-                 line  containing  "--" is output between each group of lines,
-                 unless they are in fact contiguous in  the  input  file.  The
-                 value  of number is expected to be relatively small. However,
-                 pcre2grep guarantees to have  up  to  8K  of  following  text
-                 available for context output.
+                 Output up to number lines  of  context  after  each  matching
+                 line.  Fewer lines are output if the next match or the end of
+                 the file is reached, or if the  processing  buffer  size  has
+                 been  set  too  small.  If file names and/or line numbers are
+                 being output, a hyphen separator is used instead of  a  colon
+                 for  the  context  lines.  A  line  containing "--" is output
+                 between each group of lines, unless they are in fact contigu-
+                 ous  in the input file. The value of number is expected to be
+                 relatively small. When -c is used, -A is ignored.
 
        -a, --text
-                 Treat  binary  files as text. This is equivalent to --binary-
+                 Treat binary files as text. This is equivalent  to  --binary-
                  files=text.
 
        -B number, --before-context=number
-                 Output number lines of context before each matching line.  If
-                 file  names  and/or  line  numbers are being output, a hyphen
-                 separator is used instead of a colon for the context lines. A
-                 line  containing  "--" is output between each group of lines,
-                 unless they are in fact contiguous in  the  input  file.  The
-                 value  of number is expected to be relatively small. However,
-                 pcre2grep guarantees to have  up  to  8K  of  preceding  text
-                 available for context output.
+                 Output  up  to  number  lines of context before each matching
+                 line. Fewer lines are output if the  previous  match  or  the
+                 start  of the file is within number lines, or if the process-
+                 ing buffer size has been set too small. If file names  and/or
+                 line  numbers  are  being  output, a hyphen separator is used
+                 instead of a colon for the context lines. A  line  containing
+                 "--"  is  output between each group of lines, unless they are
+                 in fact contiguous in the input file. The value of number  is
+                 expected  to  be  relatively  small.  When  -c is used, -B is
+                 ignored.
 
        --binary-files=word
-                 Specify  how binary files are to be processed. If the word is
-                 "binary" (the default),  pattern  matching  is  performed  on
-                 binary  files,  but  the  only  output is "Binary file <name>
-                 matches" when a match succeeds. If the word is "text",  which
-                 is  equivalent  to  the -a or --text option, binary files are
-                 processed in the same way as any other file.  In  this  case,
-                 when  a  match  succeeds,  the  output may be binary garbage,
-                 which can have nasty effects if sent to a  terminal.  If  the
-                 word  is  "without-match",  which  is  equivalent  to  the -I
-                 option, binary files are  not  processed  at  all;  they  are
+                 Specify how binary files are to be processed. If the word  is
+                 "binary"  (the  default),  pattern  matching  is performed on
+                 binary files, but the only  output  is  "Binary  file  <name>
+                 matches"  when a match succeeds. If the word is "text", which
+                 is equivalent to the -a or --text option,  binary  files  are
+                 processed  in  the  same way as any other file. In this case,
+                 when a match succeeds, the  output  may  be  binary  garbage,
+                 which  can  have  nasty effects if sent to a terminal. If the
+                 word is  "without-match",  which  is  equivalent  to  the  -I
+                 option,  binary  files  are  not  processed  at all; they are
                  assumed not to be of interest and are skipped without causing
                  any output or affecting the return code.
 
        --buffer-size=number
-                 Set the parameter that controls how much memory is  used  for
-                 buffering files that are being scanned.
+                 Set  the  parameter that controls how much memory is obtained
+                 at the start of processing for buffering files that are being
+                 scanned. See also --max-buffer-size below.
 
        -C number, --context=number
                  Output  number  lines  of  context both before and after each
@@ -174,19 +190,21 @@ OPTIONS
 
        -c, --count
                  Do  not  output  lines from the files that are being scanned;
-                 instead output the number of matches (or non-matches if -v is
-                 used)  that would otherwise have caused lines to be shown. By
-                 default, this count is the same as the number  of  suppressed
-                 lines, but if the -M (multiline) option is used (without -v),
-                 there may  be  more  suppressed  lines  than  the  number  of
-                 matches.
-
-                 If  no lines are selected, the number zero is output. If sev-
-                 eral files are are being scanned, a count is output for  each
-                 of  them. However, if the --files-with-matches option is also
-                 used, only those files whose counts are greater than zero are
-                 listed.  When  -c  is  used,  the  -A, -B, and -C options are
-                 ignored.
+                 instead output the number  of  lines  that  would  have  been
+                 shown, either because they matched, or, if -v is set, because
+                 they failed to match. By default, this count is  exactly  the
+                 same  as the number of lines that would have been output, but
+                 if the -M (multiline) option is used (without -v), there  may
+                 be  more suppressed lines than the count (that is, the number
+                 of matches).
+
+                 If no lines are selected, the number zero is output. If  sev-
+                 eral  files  are  being  scanned,  a  count is output for each
+                 of them and the -t option can be used to cause a total to  be
+                 output  at  the  end.  However,  if  the --files-with-matches
+                 option is also  used,  only  those  files  whose  counts  are
+                 greater  than  zero  are listed. When -c is used, the -A, -B,
+                 and -C options are ignored.
 
        --colour, --color
                  If this option is given without any data, it is equivalent to
@@ -204,205 +222,225 @@ OPTIONS
                  possible  matches in a line, not just one, in order to colour
                  them all.
 
-                 The colour that is used can be specified by setting the envi-
-                 ronment  variable  PCRE2GREP_COLOUR  or  PCRE2GREP_COLOR. The
-                 value of this variable should be a  string  of  two  numbers,
-                 separated  by  a semicolon. They are copied directly into the
-                 control string for setting colour on a  terminal,  so  it  is
-                 your  responsibility  to ensure that they make sense. If nei-
-                 ther of the environment variables  is  set,  the  default  is
-                 "1;31", which gives red.
+                 The colour that is used can be specified by  setting  one  of
+                 the  environment variables PCRE2GREP_COLOUR, PCRE2GREP_COLOR,
+                 PCREGREP_COLOUR, or PCREGREP_COLOR, which are checked in that
+                 order.  If  none  of  these  are  set,  pcre2grep  looks  for
+                 GREP_COLORS or GREP_COLOR (in that order). The value  of  the
+                 variable  should  be  a string of two numbers, separated by a
+                 semicolon, except in the  case  of  GREP_COLORS,  which  must
+                 start with "ms=" or "mt=" followed by two semicolon-separated
+                 colours, terminated by the end of the string or by  a  colon.
+                 If  GREP_COLORS  does  not  start  with  "ms=" or "mt=" it is
+                 ignored, and GREP_COLOR is checked.
+
+                 If the string obtained from one of the above  variables  con-
+                 tains any characters other than semicolon or digits, the set-
+                 ting is ignored and the default colour is used. The string is
+                 copied directly into the control string for setting colour on
+                 a terminal, so it is your responsibility to ensure  that  the
+                 values  make  sense.  If  no relevant environment variable is
+                 set, the default is "1;31", which gives red.
 
        -D action, --devices=action
-                 If  an  input  path  is  not  a  regular file or a directory,
-                 "action" specifies how it is to be  processed.  Valid  values
+                 If an input path is  not  a  regular  file  or  a  directory,
+                 "action"  specifies  how  it is to be processed. Valid values
                  are "read" (the default) or "skip" (silently skip the path).
 
        -d action, --directories=action
                  If an input path is a directory, "action" specifies how it is
-                 to be processed.  Valid values are  "read"  (the  default  in
-                 non-Windows  environments,  for compatibility with GNU grep),
-                 "recurse" (equivalent to the -r option), or "skip"  (silently
-                 skip  the  path, the default in Windows environments). In the
-                 "read" case, directories are read as if  they  were  ordinary
-                 files.  In  some  operating  systems  the effect of reading a
+                 to  be  processed.   Valid  values are "read" (the default in
+                 non-Windows environments, for compatibility with  GNU  grep),
+                 "recurse"  (equivalent to the -r option), or "skip" (silently
+                 skip the path, the default in Windows environments).  In  the
+                 "read"  case,  directories  are read as if they were ordinary
+                 files. In some operating systems  the  effect  of  reading  a
                  directory like this is an immediate end-of-file; in others it
                  may provoke an error.
 
+       --depth-limit=number
+                 See --match-limit below.
+
        -e pattern, --regex=pattern, --regexp=pattern
                  Specify a pattern to be matched. This option can be used mul-
                  tiple times in order to specify several patterns. It can also
-                 be  used  as a way of specifying a single pattern that starts
-                 with a hyphen. When -e is used, no argument pattern is  taken
-                 from  the  command  line;  all  arguments are treated as file
-                 names. There is no limit to the number of patterns. They  are
-                 applied  to  each line in the order in which they are defined
+                 be used as a way of specifying a single pattern  that  starts
+                 with  a hyphen. When -e is used, no argument pattern is taken
+                 from the command line; all  arguments  are  treated  as  file
+                 names.  There is no limit to the number of patterns. They are
+                 applied to each line in the order in which they  are  defined
                  until one matches.
 
-                 If -f is used with -e, the command line patterns are  matched
+                 If  -f is used with -e, the command line patterns are matched
                  first, followed by the patterns from the file(s), independent
-                 of the order in which these options are specified. Note  that
-                 multiple  use  of -e is not the same as a single pattern with
+                 of  the order in which these options are specified. Note that
+                 multiple use of -e is not the same as a single  pattern  with
                  alternatives. For example, X|Y finds the first character in a
-                 line  that  is  X or Y, whereas if the two patterns are given
+                 line that is X or Y, whereas if the two  patterns  are  given
                  separately, with X first, pcre2grep finds X if it is present,
                  even if it follows Y in the line. It finds Y only if there is
-                 no X in the line. This matters only if you are  using  -o  or
+                 no  X  in  the line. This matters only if you are using -o or
                  --colo(u)r to show the part(s) of the line that matched.
 
        --exclude=pattern
                  Files (but not directories) whose names match the pattern are
-                 skipped without being processed. This applies to  all  files,
-                 whether  listed  on  the  command line, obtained from --file-
+                 skipped  without  being processed. This applies to all files,
+                 whether listed on the command  line,  obtained  from  --file-
                  list, or by scanning a directory. The pattern is a PCRE2 reg-
-                 ular  expression,  and is matched against the final component
-                 of the file name, not the entire path. The  -F,  -w,  and  -x
+                 ular expression, and is matched against the  final  component
+                 of  the  file  name,  not the entire path. The -F, -w, and -x
                  options do not apply to this pattern. The option may be given
                  any number of times in order to specify multiple patterns. If
-                 a  file  name matches both an --include and an --exclude pat-
+                 a file name matches both an --include and an  --exclude  pat-
                  tern, it is excluded. There is no short form for this option.
 
        --exclude-from=filename
-                 Treat each non-empty line of the file  as  the  data  for  an
+                 Treat  each  non-empty  line  of  the file as the data for an
                  --exclude option. What constitutes a newline when reading the
-                 file is the operating system's default. The --newline  option
-                 has  no  effect on this option. This option may be given more
+                 file  is the operating system's default. The --newline option
+                 has no effect on this option. This option may be  given  more
                  than once in order to specify a number of files to read.
 
        --exclude-dir=pattern
                  Directories whose names match the pattern are skipped without
-                 being  processed,  whatever  the  setting  of the --recursive
-                 option. This applies to all directories,  whether  listed  on
+                 being processed, whatever  the  setting  of  the  --recursive
+                 option.  This  applies  to all directories, whether listed on
                  the command line, obtained from --file-list, or by scanning a
-                 parent directory. The pattern is a PCRE2 regular  expression,
-                 and  is  matched against the final component of the directory
-                 name, not the entire path. The -F, -w, and -x options do  not
-                 apply  to this pattern. The option may be given any number of
-                 times in order to specify more than one pattern. If a  direc-
-                 tory  matches  both  --include-dir  and  --exclude-dir, it is
+                 parent  directory. The pattern is a PCRE2 regular expression,
+                 and is matched against the final component of  the  directory
+                 name,  not the entire path. The -F, -w, and -x options do not
+                 apply to this pattern. The option may be given any number  of
+                 times  in order to specify more than one pattern. If a direc-
+                 tory matches both  --include-dir  and  --exclude-dir,  it  is
                  excluded. There is no short form for this option.
 
        -F, --fixed-strings
-                 Interpret each data-matching  pattern  as  a  list  of  fixed
-                 strings,  separated  by  newlines,  instead  of  as a regular
-                 expression. What constitutes a newline for  this  purpose  is
-                 controlled  by the --newline option. The -w (match as a word)
-                 and -x (match whole line) options can be used with -F.   They
+                 Interpret  each  data-matching  pattern  as  a  list of fixed
+                 strings, separated by  newlines,  instead  of  as  a  regular
+                 expression.  What  constitutes  a newline for this purpose is
+                 controlled by the --newline option. The -w (match as a  word)
+                 and  -x (match whole line) options can be used with -F.  They
                  apply to each of the fixed strings. A line is selected if any
                  of the fixed strings are found in it (subject to -w or -x, if
-                 present).  This  option applies only to the patterns that are
-                 matched against the contents of files; it does not  apply  to
-                 patterns  specified  by  any  of  the  --include or --exclude
+                 present). This option applies only to the patterns  that  are
+                 matched  against  the contents of files; it does not apply to
+                 patterns specified by  any  of  the  --include  or  --exclude
                  options.
 
        -f filename, --file=filename
-                 Read patterns from the file, one per  line,  and  match  them
-                 against  each  line of input. What constitutes a newline when
-                 reading the file  is  the  operating  system's  default.  The
-                 --newline option has no effect on this option. Trailing white
-                 space is removed from each line, and blank lines are ignored.
-                 An  empty  file  contains  no  patterns and therefore matches
-                 nothing. See also the comments about multiple patterns versus
-                 a  single  pattern with alternatives in the description of -e
-                 above.
-
-                 If this option is given more than  once,  all  the  specified
-                 files  are read. A data line is output if any of the patterns
-                 match it. A file name can be given as "-"  to  refer  to  the
-                 standard  input.  When  -f is used, patterns specified on the
-                 command line using -e may also be present;  they  are  tested
-                 before  the  file's  patterns.  However,  no other pattern is
+                 Read  patterns  from  the  file, one per line, and match them
+                 against each line of input. What constitutes a  newline  when
+                 reading  the  file  is  the  operating  system's default. The
+                 --newline option has no  effect  on  this  option.   Trailing
+                 white  space  is  removed from each line, and blank lines are
+                 ignored. An empty file contains  no  patterns  and  therefore
+                 matches  nothing.  See  also the comments about multiple pat-
+                 terns versus  a  single  pattern  with  alternatives  in  the
+                 description of -e above.
+
+                 If  this  option  is  given more than once, all the specified
+                 files are read. A data line is output if any of the  patterns
+                 match  it.  A  file  name can be given as "-" to refer to the
+                 standard input. When -f is used, patterns  specified  on  the
+                 command  line  using  -e may also be present; they are tested
+                 before the file's patterns.  However,  no  other  pattern  is
                  taken from the command line; all arguments are treated as the
                  names of paths to be searched.
 
        --file-list=filename
-                 Read  a  list  of  files  and/or  directories  that are to be
-                 scanned from the given file, one  per  line.  Trailing  white
+                 Read a list of  files  and/or  directories  that  are  to  be
+                 scanned  from  the  given  file, one per line. Trailing white
                  space is removed from each line, and blank lines are ignored.
-                 These paths are processed before any that are listed  on  the
-                 command  line.  The file name can be given as "-" to refer to
+                 These  paths  are processed before any that are listed on the
+                 command line. The file name can be given as "-" to  refer  to
                  the standard input.  If --file and --file-list are both spec-
-                 ified  as  "-",  patterns are read first. This is useful only
-                 when the standard input is a  terminal,  from  which  further
-                 lines  (the  list  of files) can be read after an end-of-file
-                 indication. If this option is given more than once,  all  the
+                 ified as "-", patterns are read first. This  is  useful  only
+                 when  the  standard  input  is a terminal, from which further
+                 lines (the list of files) can be read  after  an  end-of-file
+                 indication.  If  this option is given more than once, all the
                  specified files are read.
 
        --file-offsets
-                 Instead  of  showing lines or parts of lines that match, show
-                 each match as an offset from the start  of  the  file  and  a
-                 length,  separated  by  a  comma. In this mode, no context is
-                 shown. That is, the -A, -B, and -C options  are  ignored.  If
+                 Instead of showing lines or parts of lines that  match,  show
+                 each  match  as  an  offset  from the start of the file and a
+                 length, separated by a comma. In this  mode,  no  context  is
+                 shown.  That  is,  the -A, -B, and -C options are ignored. If
                  there is more than one match in a line, each of them is shown
-                 separately. This option is mutually  exclusive  with  --line-
-                 offsets and --only-matching.
+                 separately.  This option is mutually exclusive with --output,
+                 --line-offsets, and --only-matching.
 
        -H, --with-filename
-                 Force  the  inclusion of the file name at the start of output
+                 Force the inclusion of the file name at the start  of  output
                  lines when searching a single file. By default, the file name
                  is not shown in this case.  For matching lines, the file name
                  is followed by a colon; for context lines, a hyphen separator
-                 is  used.  If  a line number is also being output, it follows
-                 the file name. When the -M option causes a pattern  to  match
-                 more  than  one  line, only the first is preceded by the file
-                 name.
+                 is used. If a line number is also being  output,  it  follows
+                 the  file  name. When the -M option causes a pattern to match
+                 more than one line, only the first is preceded  by  the  file
+                 name.  This  option  overrides  any  previous  -h,  -l, or -L
+                 options.
 
        -h, --no-filename
                  Suppress the output file names when searching multiple files.
                  By  default,  file  names  are  shown when multiple files are
                  searched. For matching lines, the file name is followed by  a
                  colon;  for  context lines, a hyphen separator is used.  If a
-                 line number is also being output, it follows the file name.
+                 line number is also being output, it follows the  file  name.
+                 This option overrides any previous -H, -L, or -l options.
+
+       --heap-limit=number
+                 See --match-limit below.
 
-       --help    Output a help message, giving brief details  of  the  command
-                 options  and  file type support, and then exit. Anything else
+       --help    Output  a  help  message, giving brief details of the command
+                 options and file type support, and then exit.  Anything  else
                  on the command line is ignored.
 
-       -I        Ignore  binary  files.  This  is  equivalent   to   --binary-
+       -I        Ignore   binary   files.  This  is  equivalent  to  --binary-
                  files=without-match.
 
        -i, --ignore-case
                  Ignore upper/lower case distinctions during comparisons.
 
        --include=pattern
-                 If  any --include patterns are specified, the only files that
-                 are processed are those that match one of the  patterns  (and
-                 do  not  match  an  --exclude  pattern). This option does not
-                 affect directories, but it  applies  to  all  files,  whether
-                 listed  on the command line, obtained from --file-list, or by
-                 scanning a directory. The pattern is a PCRE2 regular  expres-
-                 sion,  and is matched against the final component of the file
-                 name, not the entire path. The -F, -w, and -x options do  not
-                 apply  to this pattern. The option may be given any number of
-                 times. If a file  name  matches  both  an  --include  and  an
-                 --exclude  pattern,  it  is excluded.  There is no short form
+                 If any --include patterns are specified, the only files  that
+                 are  processed  are those that match one of the patterns (and
+                 do not match an --exclude  pattern).  This  option  does  not
+                 affect  directories,  but  it  applies  to all files, whether
+                 listed on the command line, obtained from --file-list, or  by
+                 scanning  a directory. The pattern is a PCRE2 regular expres-
+                 sion, and is matched against the final component of the  file
+                 name,  not the entire path. The -F, -w, and -x options do not
+                 apply to this pattern. The option may be given any number  of
+                 times.  If  a  file  name  matches  both  an --include and an
+                 --exclude pattern, it is excluded.  There is  no  short  form
                  for this option.
 
        --include-from=filename
-                 Treat each non-empty line of the file  as  the  data  for  an
+                 Treat  each  non-empty  line  of  the file as the data for an
                  --include option. What constitutes a newline for this purpose
-                 is the operating system's default. The --newline  option  has
+                 is  the  operating system's default. The --newline option has
                  no effect on this option. This option may be given any number
                  of times; all the files are read.
 
        --include-dir=pattern
-                 If any --include-dir patterns are specified, the only  direc-
-                 tories  that  are  processed  are those that match one of the
-                 patterns (and do not match an  --exclude-dir  pattern).  This
-                 applies  to  all  directories,  whether listed on the command
-                 line, obtained from --file-list,  or  by  scanning  a  parent
-                 directory.  The pattern is a PCRE2 regular expression, and is
-                 matched against the final component of  the  directory  name,
-                 not  the entire path. The -F, -w, and -x options do not apply
+                 If  any --include-dir patterns are specified, the only direc-
+                 tories that are processed are those that  match  one  of  the
+                 patterns  (and  do  not match an --exclude-dir pattern). This
+                 applies to all directories, whether  listed  on  the  command
+                 line,  obtained  from  --file-list,  or  by scanning a parent
+                 directory. The pattern is a PCRE2 regular expression, and  is
+                 matched  against  the  final component of the directory name,
+                 not the entire path. The -F, -w, and -x options do not  apply
                  to this pattern. The option may be given any number of times.
-                 If  a directory matches both --include-dir and --exclude-dir,
+                 If a directory matches both --include-dir and  --exclude-dir,
                  it is excluded. There is no short form for this option.
 
        -L, --files-without-match
-                 Instead of outputting lines from the files, just  output  the
-                 names  of  the files that do not contain any lines that would
-                 have been output. Each file name is output once, on  a  sepa-
-                 rate line.
+                 Instead  of  outputting lines from the files, just output the
+                 names of the files that do not contain any lines  that  would
+                 have  been  output. Each file name is output once, on a sepa-
+                 rate line. This option overrides any previous -H, -h,  or  -l
+                 options.
 
        -l, --files-with-matches
                  Instead  of  outputting lines from the files, just output the
@@ -413,7 +451,8 @@ OPTIONS
                  matching continues in order to obtain the correct count,  and
                  those  files  that  have  at least one match are listed along
                  with their counts. Using this option with -c is a way of sup-
-                 pressing the listing of files with no matches.
+                 pressing  the  listing  of files with no matches. This option
+                 overrides any previous -H, -h, or -L options.
 
        --label=name
                  This option supplies a name to be used for the standard input
@@ -421,163 +460,194 @@ OPTIONS
                  input)" is used. There is no short form for this option.
 
        --line-buffered
-                 When  this  option is given, input is read and processed line
-                 by line, and the output  is  flushed  after  each  write.  By
-                 default,  input is read in large chunks, unless pcre2grep can
-                 determine that it is reading from a terminal (which  is  cur-
-                 rently  possible  only  in Unix-like environments). Output to
-                 terminal is normally automatically flushed by  the  operating
-                 system. This option can be useful when the input or output is
-                 attached to a pipe and you do not want pcre2grep to buffer up
-                 large  amounts  of data. However, its use will affect perfor-
-                 mance, and the -M (multiline) option ceases to work.
+                 When this option is given, non-compressed input is  read  and
+                 processed  line by line, and the output is flushed after each
+                 write. By default, input is  read  in  large  chunks,  unless
+                 pcre2grep  can  determine  that it is reading from a terminal
+                 (which is currently possible only in Unix-like environments).
+                 Output  to  terminal is normally automatically flushed by the
+                 operating system. This option can be useful when the input or
+                 output is attached to a pipe and you do not want pcre2grep to
+                 buffer up large amounts of data. However, its use will affect
+                 performance,  and  the  -M (multiline) option ceases to work.
+                 When input is from a compressed .gz  or  .bz2  file,  --line-
+                 buffered is ignored.
 
        --line-offsets
-                 Instead of showing lines or parts of lines that  match,  show
+                 Instead  of  showing lines or parts of lines that match, show
                  each match as a line number, the offset from the start of the
-                 line, and a length. The line number is terminated by a  colon
-                 (as  usual; see the -n option), and the offset and length are
-                 separated by a comma. In this  mode,  no  context  is  shown.
-                 That  is, the -A, -B, and -C options are ignored. If there is
-                 more than one match in a line, each of them  is  shown  sepa-
-                 rately. This option is mutually exclusive with --file-offsets
-                 and --only-matching.
+                 line,  and a length. The line number is terminated by a colon
+                 (as usual; see the -n option), and the offset and length  are
+                 separated  by  a  comma.  In  this mode, no context is shown.
+                 That is, the -A, -B, and -C options are ignored. If there  is
+                 more  than  one  match in a line, each of them is shown sepa-
+                 rately. This option  is  mutually  exclusive  with  --output,
+                 --file-offsets, and --only-matching.
 
        --locale=locale-name
-                 This option specifies a locale to be used for pattern  match-
-                 ing.  It  overrides the value in the LC_ALL or LC_CTYPE envi-
-                 ronment variables. If  no  locale  is  specified,  the  PCRE2
-                 library's  default (usually the "C" locale) is used. There is
+                 This  option specifies a locale to be used for pattern match-
+                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
+                 ronment  variables.  If  no  locale  is  specified, the PCRE2
+                 library's default (usually the "C" locale) is used. There  is
                  no short form for this option.
 
        --match-limit=number
-                 Processing some regular expression  patterns  can  require  a
-                 very  large amount of memory, leading in some cases to a pro-
-                 gram crash if not enough is available.   Other  patterns  may
-                 take  a  very  long  time to search for all possible matching
-                 strings.  The  pcre2_match()  function  that  is  called   by
-                 pcre2grep  to  do  the  matching  has two parameters that can
-                 limit the resources that it uses.
-
-                 The  --match-limit  option  provides  a  means  of   limiting
-                 resource usage when processing patterns that are not going to
-                 match, but which have a very large number of possibilities in
-                 their  search  trees.  The  classic example is a pattern that
-                 uses nested unlimited repeats. Internally, PCRE2 uses a func-
-                 tion  called  match()  which  it  calls repeatedly (sometimes
-                 recursively). The limit set by --match-limit  is  imposed  on
-                 the  number  of times this function is called during a match,
-                 which has the effect of limiting the amount  of  backtracking
-                 that can take place.
-
-                 The --recursion-limit option is similar to --match-limit, but
-                 instead of limiting the total number of times that match() is
-                 called, it limits the depth of recursive calls, which in turn
-                 limits the amount of memory that can be used.  The  recursion
-                 depth  is  a  smaller  number than the total number of calls,
-                 because not all calls to match() are recursive. This limit is
-                 of use only if it is set smaller than --match-limit.
-
-                 There  are no short forms for these options. The default set-
-                 tings are specified when the PCRE2 library is compiled,  with
-                 the default default being 10 million.
+                 Processing  some  regular expression patterns may take a very
+                 long time to search for all possible matching strings. Others
+                 may  require  a  very large amount of memory. There are three
+                 options that set resource limits for matching.
+
+                 The --match-limit option provides a means of limiting comput-
+                 ing  resource  usage  when  processing  patterns that are not
+                 going to match, but which have a very large number of  possi-
+                 bilities in their search trees. The classic example is a pat-
+                 tern that uses nested unlimited  repeats.  Internally,  PCRE2
+                 has  a  counter that is incremented each time around its main
+                 processing  loop.  If  the  value  set  by  --match-limit  is
+                 reached, an error occurs.
+
+                 The  --heap-limit option specifies, as a number of kilobytes,
+                 the amount of heap memory that may be used for matching. Heap
+                 memory is needed only if matching the pattern requires a sig-
+                 nificant number of nested backtracking points  to  be  remem-
+                 bered. This parameter can be set to zero to forbid the use of
+                 heap memory altogether.
+
+                 The --depth-limit option limits the  depth  of  nested  back-
+                 tracking points, which indirectly limits the amount of memory
+                 that is used. The amount of memory needed for each backtrack-
+                 ing  point  depends on the number of capturing parentheses in
+                 the pattern, so the amount of memory that is used before this
+                 limit  acts  varies from pattern to pattern. This limit is of
+                 use only if it is set smaller than --match-limit.
+
+                 There are no short forms for these options. The default  set-
+                 tings  are specified when the PCRE2 library is compiled, with
+                 the  built-in  defaults  being  very large and so effectively
+                 unlimited.
+
+       --max-buffer-size=number
+                 This  limits  the  expansion  of the processing buffer, whose
+                 initial size can be set by --buffer-size. The maximum  buffer
+                 size  is  silently  forced to be no smaller than the starting
+                 buffer size.
 
        -M, --multiline
-                 Allow  patterns to match more than one line. When this option
-                 is given, patterns may usefully contain literal newline char-
-                 acters  and  internal  occurrences of ^ and $ characters. The
-                 output for a successful match may consist of  more  than  one
-                 line.  The  first is the line in which the match started, and
-                 the last is the line in which the match ended. If the matched
-                 string  ends  with  a newline sequence the output ends at the
-                 end of that line.
-
-                 When this option is set, the PCRE2 library is called in "mul-
-                 tiline" mode. This allows a matched string to extend past the
-                 end of a line and continue on one or more  subsequent  lines.
-                 However,  pcre2grep  still  processes the input line by line.
-                 Once a match has  been  handled,  scanning  restarts  at  the
-                 beginning  of  the  next line, just as it does when -M is not
-                 present. This means that it is possible  for  the  second  or
-                 subsequent  lines  in a multiline match to be output again as
-                 part of another match.
-
-                 The newline sequence that separates multiple  lines  must  be
-                 matched  as  part  of  the  pattern. For example, to find the
-                 phrase "regular expression" in a file where  "regular"  might
-                 be  at the end of a line and "expression" at the start of the
+                 Allow patterns to match more than one line. When this  option
+                 is set, the PCRE2 library is called in "multiline" mode. This
+                 allows a matched string to extend past the end of a line  and
+                 continue  on one or more subsequent lines. Patterns used with
+                 -M may usefully contain literal newline characters and inter-
+                 nal  occurrences of ^ and $ characters. The output for a suc-
+                 cessful match may consist of more than one  line.  The  first
+                 line  is  the  line  in which the match started, and the last
+                 line is the line in which the match  ended.  If  the  matched
+                 string  ends  with a newline sequence, the output ends at the
+                 end of that line.  If -v is set,  none  of  the  lines  in  a
+                 multi-line  match  are output. Once a match has been handled,
+                 scanning restarts at the beginning of the line after the  one
+                 in which the match ended.
+
+                 The  newline  sequence  that separates multiple lines must be
+                 matched as part of the pattern.  For  example,  to  find  the
+                 phrase  "regular  expression" in a file where "regular" might
+                 be at the end of a line and "expression" at the start of  the
                  next line, you could use this command:
 
                    pcre2grep -M 'regular\s+expression' <file>
 
-                 The \s escape sequence matches  any  white  space  character,
-                 including  newlines,  and  is  followed  by  + so as to match
-                 trailing white space on the first line as  well  as  possibly
+                 The  \s  escape  sequence  matches any white space character,
+                 including newlines, and is followed  by  +  so  as  to  match
+                 trailing  white  space  on the first line as well as possibly
                  handling a two-character newline sequence.
 
-                 There  is a limit to the number of lines that can be matched,
-                 imposed by the way that pcre2grep buffers the input  file  as
-                 it  scans  it.  However,  pcre2grep  ensures that at least 8K
-                 characters or the rest of the file (whichever is the shorter)
-                 are  available for forward matching, and similarly the previ-
-                 ous 8K characters (or all the previous characters,  if  fewer
-                 than 8K) are guaranteed to be available for lookbehind asser-
-                 tions. The -M option does not work when input is read line by
-                 line (see --line-buffered.)
+                 There is a limit to the number of lines that can be  matched,
+                 imposed  by  the way that pcre2grep buffers the input file as
+                 it scans it. With a  sufficiently  large  processing  buffer,
+                 this should not be a problem, but the -M option does not work
+                 when input is read line by line (see --line-buffered).
 
        -N newline-type, --newline=newline-type
-                 The  PCRE2  library  supports  five different conventions for
-                 indicating the ends of lines. They are  the  single-character
-                 sequences  CR  (carriage  return) and LF (linefeed), the two-
-                 character sequence CRLF, an "anycrlf" convention, which  rec-
-                 ognizes  any  of the preceding three types, and an "any" con-
+                 The PCRE2 library supports  five  different  conventions  for
+                 indicating  the  ends of lines. They are the single-character
+                 sequences CR (carriage return) and LF  (linefeed),  the  two-
+                 character  sequence CRLF, an "anycrlf" convention, which rec-
+                 ognizes any of the preceding three types, and an  "any"  con-
                  vention, in which any Unicode line ending sequence is assumed
-                 to  end a line. The Unicode sequences are the three just men-
-                 tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,
-                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
+                 to end a line. The Unicode sequences are the three just  men-
+                 tioned,  plus  VT  (vertical  tab,  U+000B),  FF  (form feed,
+                 U+000C),  NEL  (next  line,  U+0085),  LS  (line   separator,
                  U+2028), and PS (paragraph separator, U+2029).
 
-                 When the  PCRE2  library  is  built,  a  default  line-ending
-                 sequence   is  specified.   This  is  normally  the  standard
+                 When  the  PCRE2  library  is  built,  a  default line-ending
+                 sequence  is  specified.   This  is  normally  the   standard
                  sequence for the operating system. Unless otherwise specified
-                 by  this  option,  pcre2grep uses the library's default.  The
+                 by this option, pcre2grep uses the  library's  default.   The
                  possible values for this option are CR, LF, CRLF, ANYCRLF, or
-                 ANY.  This  makes  it possible to use pcre2grep to scan files
+                 ANY. This makes it possible to use pcre2grep  to  scan  files
                  that have come from other environments without having to mod-
-                 ify  their  line  endings.  If the data that is being scanned
-                 does not agree  with  the  convention  set  by  this  option,
-                 pcre2grep  may  behave in strange ways. Note that this option
-                 does not apply to files specified by the -f,  --exclude-from,
-                 or  --include-from  options,  which  are  expected to use the
+                 ify their line endings. If the data  that  is  being  scanned
+                 does  not  agree  with  the  convention  set  by this option,
+                 pcre2grep may behave in strange ways. Note that  this  option
+                 does  not apply to files specified by the -f, --exclude-from,
+                 or --include-from options, which  are  expected  to  use  the
                  operating system's standard newline sequence.
 
        -n, --line-number
                  Precede each output line by its line number in the file, fol-
-                 lowed  by  a colon for matching lines or a hyphen for context
+                 lowed by a colon for matching lines or a hyphen  for  context
                  lines. If the file name is also being output, it precedes the
-                 line  number.  When  the  -M option causes a pattern to match
-                 more than one line, only the first is preceded  by  its  line
+                 line number. When the -M option causes  a  pattern  to  match
+                 more  than  one  line, only the first is preceded by its line
                  number. This option is forced if --line-offsets is used.
 
-       --no-jit  If  the  PCRE2 library is built with support for just-in-time
+       --no-jit  If the PCRE2 library is built with support  for  just-in-time
                  compiling (which speeds up matching), pcre2grep automatically
                  makes use of this, unless it was explicitly disabled at build
-                 time. This option can be used to disable the use  of  JIT  at
-                 run  time. It is provided for testing and working round prob-
+                 time.  This  option  can be used to disable the use of JIT at
+                 run time. It is provided for testing and working round  prob-
                  lems.  It should never be needed in normal use.
 
+       -O text, --output=text
+                 When  there  is a match, instead of outputting the whole line
+                 that matched, output just the  given  text.  This  option  is
+                 mutually  exclusive with --only-matching, --file-offsets, and
+                 --line-offsets. Escape sequences starting with a dollar char-
+                 acter  may be used to insert the contents of the matched part
+                 of the line and/or captured substrings into the text.
+
+                 $<digits> or ${<digits>} is replaced  by  the  captured  sub-
+                 string  of  the  given  decimal  number; zero substitutes the
+                 whole match. If the number is greater than the number of cap-
+                 turing  substrings,  or if the capture is unset, the replace-
+                 ment is empty.
+
+                 $a is replaced by bell; $b by backspace; $e by escape; $f  by
+                 form  feed;  $n by newline; $r by carriage return; $t by tab;
+                 $v by vertical tab.
+
+                 $o<digits> is replaced by the character  represented  by  the
+                 given octal number; up to three digits are processed.
+
+                 $x<digits>  is  replaced  by the character represented by the
+                 given hexadecimal number; up to two digits are processed.
+
+                 Any other character is substituted by itself. In  particular,
+                 $$ is replaced by a single dollar.
+
        -o, --only-matching
                  Show only the part of the line that matched a pattern instead
-                 of  the  whole  line. In this mode, no context is shown. That
-                 is, the -A, -B, and -C options are ignored. If there is  more
-                 than  one  match in a line, each of them is shown separately.
-                 If -o is combined with -v (invert the sense of the  match  to
-                 find  non-matching  lines),  no  output is generated, but the
-                 return code is set appropriately. If the matched  portion  of
-                 the  line is empty, nothing is output unless the file name or
-                 line number are being printed, in which case they  are  shown
-                 on an otherwise empty line. This option is mutually exclusive
-                 with --file-offsets and --line-offsets.
+                 of the whole line. In this mode, no context  is  shown.  That
+                 is,  the -A, -B, and -C options are ignored. If there is more
+                 than one match in a line, each of them is  shown  separately,
+                 on  a  separate  line  of  output.  If -o is combined with -v
+                 (invert the sense of the match to find  non-matching  lines),
+                 no  output is generated, but the return code is set appropri-
+                 ately. If the matched portion of the line is  empty,  nothing
+                 is  output  unless  the  file  name  or line number are being
+                 printed, in which case they are shown on an  otherwise  empty
+                 line.  This  option  is  mutually  exclusive  with  --output,
+                 --file-offsets and --line-offsets.
 
        -onumber, --only-matching=number
                  Show only the part of the line  that  matched  the  capturing
@@ -587,82 +657,98 @@ OPTIONS
                  (see above), if an argument is present, it must be  given  in
                  the  same  shell item, for example, -o3 or --only-matching=2.
                  The comments given for the non-argument case above also apply
-                 to  this  case. If the specified capturing parentheses do not
+                 to this option. If the specified capturing parentheses do not
                  exist in the pattern, or were not set in the  match,  nothing
                  is  output unless the file name or line number are being out-
                  put.
 
                  If this option is given multiple times,  multiple  substrings
-                 are  output, in the order the options are given. For example,
-                 -o3 -o1 -o3 causes the substrings matched by capturing paren-
-                 theses  3  and  1  and then 3 again to be output. By default,
-                 there is no separator (but see the next option).
+                 are  output  for  each  match,  in  the order the options are
+                 given, and all on one line. For example, -o3 -o1  -o3  causes
+                 the  substrings  matched by capturing parentheses 3 and 1 and
+                 then 3 again to be output. By default, there is no  separator
+                 (but see the next option).
 
        --om-separator=text
-                 Specify a separating string for multiple occurrences  of  -o.
-                 The  default is an empty string. Separating strings are never
+                 Specify  a  separating string for multiple occurrences of -o.
+                 The default is an empty string. Separating strings are  never
                  coloured.
 
        -q, --quiet
                  Work quietly, that is, display nothing except error messages.
-                 The  exit  status  indicates  whether or not any matches were
+                 The exit status indicates whether or  not  any  matches  were
                  found.
 
        -r, --recursive
-                 If any given path is a directory, recursively scan the  files
-                 it  contains, taking note of any --include and --exclude set-
-                 tings. By default, a directory is read as a normal  file;  in
-                 some  operating  systems this gives an immediate end-of-file.
-                 This option is a shorthand  for  setting  the  -d  option  to
+                 If  any given path is a directory, recursively scan the files
+                 it contains, taking note of any --include and --exclude  set-
+                 tings.  By  default, a directory is read as a normal file; in
+                 some operating systems this gives an  immediate  end-of-file.
+                 This  option  is  a  shorthand  for  setting the -d option to
                  "recurse".
 
        --recursion-limit=number
                  See --match-limit above.
 
        -s, --no-messages
-                 Suppress  error  messages  about  non-existent  or unreadable
-                 files. Such files are quietly skipped.  However,  the  return
+                 Suppress error  messages  about  non-existent  or  unreadable
+                 files.  Such  files  are quietly skipped. However, the return
                  code is still 2, even if matches were found in other files.
 
+       -t, --total-count
+                 This option is useful when scanning more than  one  file.  If
+                 used  on its own, -t suppresses all output except for a grand
+                 total number of matching lines (or non-matching lines  if  -v
+                 is  used)  in  all  the files. If -t is used with -c, a grand
+                 total is output except when the previous output is  just  one
+                 line.  In  other words, it is not output when just one file's
+                 count is listed. If file names are being  output,  the  grand
+                 total  is preceded by "TOTAL:". Otherwise, it appears as just
+                 another number. The -t option is ignored when  used  with  -L
+                 (list  files  without matches), because the grand total would
+                 always be zero.
+
        -u, --utf-8
                  Operate in UTF-8 mode. This option is available only if PCRE2
                  has been compiled with UTF-8 support. All patterns (including
-                 those  for  any --exclude and --include options) and all sub-
-                 ject lines that are scanned must be valid  strings  of  UTF-8
+                 those for any --exclude and --include options) and  all  sub-
+                 ject  lines  that  are scanned must be valid strings of UTF-8
                  characters.
 
        -V, --version
-                 Write  the version numbers of pcre2grep and the PCRE2 library
-                 to the standard output and then exit. Anything  else  on  the
+                 Write the version numbers of pcre2grep and the PCRE2  library
+                 to  the  standard  output and then exit. Anything else on the
                  command line is ignored.
 
        -v, --invert-match
-                 Invert  the  sense  of  the match, so that lines which do not
+                 Invert the sense of the match, so that  lines  which  do  not
                  match any of the patterns are the ones that are found.
 
        -w, --word-regex, --word-regexp
-                 Force the patterns to match only whole words. This is equiva-
-                 lent  to  having \b at the start and end of the pattern. This
-                 option applies only to the patterns that are matched  against
-                 the  contents  of files; it does not apply to patterns speci-
-                 fied by any of the --include or --exclude options.
+                 Force the patterns only to match "words". That is, there must
+                 be a word boundary at the  start  and  end  of  each  matched
+                 string.  This is equivalent to having "\b(?:" at the start of
+                 each pattern, and ")\b" at the end. This option applies  only
+                 to  the  patterns  that  are  matched against the contents of
+                 files; it does not apply to patterns specified by any of  the
+                 --include or --exclude options.
 
        -x, --line-regex, --line-regexp
-                 Force the patterns to be anchored (each must  start  matching
-                 at  the beginning of a line) and in addition, require them to
-                 match entire lines. This is equivalent  to  having  ^  and  $
-                 characters at the start and end of each alternative top-level
-                 branch in every pattern. This option applies only to the pat-
-                 terns that are matched against the contents of files; it does
-                 not apply to patterns specified by any of  the  --include  or
-                 --exclude options.
+                 Force  the  patterns to start matching only at the beginnings
+                 of lines, and in  addition,  require  them  to  match  entire
+                 lines. In multiline mode the match may be more than one line.
+                 This is equivalent to having "^(?:" at the start of each pat-
+                 tern  and  ")$"  at  the end. This option applies only to the
+                 patterns that are matched against the contents of  files;  it
+                 does  not apply to patterns specified by any of the --include
+                 or --exclude options.
 
 
 ENVIRONMENT VARIABLES
 
-       The  environment  variables  LC_ALL  and LC_CTYPE are examined, in that
-       order, for a locale. The first one that is set is  used.  This  can  be
-       overridden  by  the  --locale  option.  If  no locale is set, the PCRE2
+       The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
+       order,  for  a  locale.  The first one that is set is used. This can be
+       overridden by the --locale option. If  no  locale  is  set,  the  PCRE2
        library's default (usually the "C" locale) is used.
 
 
@@ -670,82 +756,87 @@ NEWLINES
 
        The -N (--newline) option allows pcre2grep to scan files with different
        newline conventions from the default. Any parts of the input files that
-       are written to the standard output are copied identically,  with  what-
-       ever  newline sequences they have in the input. However, the setting of
-       this option does not affect the interpretation of  files  specified  by
+       are  written  to the standard output are copied identically, with what-
+       ever newline sequences they have in the input. However, the setting  of
+       this  option  does  not affect the interpretation of files specified by
        the -f, --exclude-from, or --include-from options, which are assumed to
-       use the operating system's  standard  newline  sequence,  nor  does  it
-       affect  the way in which pcre2grep writes informational messages to the
+       use  the  operating  system's  standard  newline  sequence, nor does it
+       affect the way in which pcre2grep writes informational messages to  the
        standard error and output streams. For these it uses the string "\n" to
-       indicate  newlines,  relying on the C I/O library to convert this to an
+       indicate newlines, relying on the C I/O library to convert this  to  an
        appropriate sequence.
 
 
 OPTIONS COMPATIBILITY
 
        Many of the short and long forms of pcre2grep's options are the same as
-       in  the GNU grep program. Any long option of the form --xxx-regexp (GNU
+       in the GNU grep program. Any long option of the form --xxx-regexp  (GNU
        terminology) is also available as --xxx-regex (PCRE2 terminology). How-
-       ever,  the  --file-list, --file-offsets, --include-dir, --line-offsets,
-       --locale, --match-limit, -M, --multiline, -N,  --newline,  --om-separa-
-       tor,  --recursion-limit,  -u,  and  --utf-8  options  are  specific  to
-       pcre2grep, as is the use of the --only-matching option with a capturing
-       parentheses number.
-
-       Although  most  of the common options work the same way, a few are dif-
-       ferent in pcre2grep. For example, the --include option's argument is  a
-       glob  for GNU grep, but a regular expression for pcre2grep. If both the
-       -c and -l options are given, GNU grep lists only  file  names,  without
+       ever, the  --depth-limit,  --file-list,  --file-offsets,  --heap-limit,
+       --include-dir,  --line-offsets,  --locale,  --match-limit, -M, --multi-
+       line, -N, --newline, --om-separator, --output, -u, and --utf-8  options
+       are  specific to pcre2grep, as is the use of the --only-matching option
+       with a capturing parentheses number.
+
+       Although most of the common options work the same way, a few  are  dif-
+       ferent  in pcre2grep. For example, the --include option's argument is a
+       glob for GNU grep, but a regular expression for pcre2grep. If both  the
+       -c  and  -l  options are given, GNU grep lists only file names, without
        counts, but pcre2grep gives the counts as well.
 
 
 OPTIONS WITH DATA
 
        There are four different ways in which an option with data can be spec-
-       ified.  If a short form option is used, the  data  may  follow  immedi-
+       ified.   If  a  short  form option is used, the data may follow immedi-
        ately, or (with one exception) in the next command line item. For exam-
        ple:
 
          -f/some/file
          -f /some/file
 
-       The exception is the -o option, which may appear with or without  data.
-       Because  of this, if data is present, it must follow immediately in the
+       The  exception is the -o option, which may appear with or without data.
+       Because of this, if data is present, it must follow immediately in  the
        same item, for example -o3.
 
-       If a long form option is used, the data may appear in the same  command
-       line  item,  separated by an equals character, or (with two exceptions)
+       If  a long form option is used, the data may appear in the same command
+       line item, separated by an equals character, or (with  two  exceptions)
        it may appear in the next command line item. For example:
 
          --file=/some/file
          --file /some/file
 
-       Note, however, that if you want to supply a file name beginning with  ~
-       as  data  in  a  shell  command,  and have the shell expand ~ to a home
+       Note,  however, that if you want to supply a file name beginning with ~
+       as data in a shell command, and have the  shell  expand  ~  to  a  home
        directory, you must separate the file name from the option, because the
        shell does not treat ~ specially unless it is at the start of an item.
 
-       The  exceptions  to the above are the --colour (or --color) and --only-
-       matching options, for which the data  is  optional.  If  one  of  these
-       options  does  have  data, it must be given in the first form, using an
+       The exceptions to the above are the --colour (or --color)  and  --only-
+       matching  options,  for  which  the  data  is optional. If one of these
+       options does have data, it must be given in the first  form,  using  an
        equals character. Otherwise pcre2grep will assume that it has no data.
 
 
-CALLING EXTERNAL SCRIPTS
+USING PCRE2'S CALLOUT FACILITY
+
+       pcre2grep  has,  by  default,  support for calling external programs or
+       scripts or echoing specific strings during matching by  making  use  of
+       PCRE2's  callout  facility.  However, this support can be disabled when
+       pcre2grep is built. You can find out whether your  binary  has  support
+       for  callouts  by  running it with the --help option. If the support is
+       not enabled, all callouts in patterns are ignored by pcre2grep.
+
+       A callout in a PCRE2 pattern is of the form (?C<arg>) where  the  argu-
+       ment  is either a number or a quoted string (see the pcre2callout docu-
+       mentation for details). Numbered callouts  are  ignored  by  pcre2grep;
+       only callouts with string arguments are useful.
 
-       On non-Windows systems, pcre2grep has, by default, support for  calling
-       external  programs  or scripts during matching by making use of PCRE2's
-       callout facility. However, this support can be disabled when  pcre2grep
-       is  built.   You can find out whether your binary has support for call-
-       outs by running it with the  --help  option.  If  the  support  is  not
-       enabled, all callouts in patterns are ignored by pcre2grep.
+   Calling external programs or scripts
 
-       A  callout  in a PCRE2 pattern is of the form (?C<arg>) where the argu-
-       ment is either a number or a quoted string (see the pcre2callout  docu-
-       mentation  for  details).  Numbered  callouts are ignored by pcre2grep.
-       String arguments are parsed as a list of substrings separated  by  pipe
-       (vertical  bar)  characters.  The first substring must be an executable
-       name, with the following substrings specifying arguments:
+       If the callout string does not start with a pipe (vertical bar) charac-
+       ter, it is parsed into a list of substrings separated by  pipe  charac-
+       ters.  The first substring must be an executable name, with the follow-
+       ing substrings specifying arguments:
 
          executable_name|arg1|arg2|...
 
@@ -781,6 +872,18 @@ CALLING EXTERNAL SCRIPTS
        local matching failure occurs and the matcher backtracks in the  normal
        way.
 
+   Echoing a specific string
+
+       If  the callout string starts with a pipe (vertical bar) character, the
+       rest of the string is written to the output, having been passed through
+       the  same escape processing as text from the --output option. This pro-
+       vides a simple echoing facility that avoids calling an external program
+       or  script. No terminator is added to the string, so if you want a new-
+       line, you must include  it  explicitly.   Matching  continues  normally
+       after  the string is output. If you want to see only the callout output
+       but not any output from an actual match, you should  end  the  relevant
+       pattern with (*FAIL).
+
 
 MATCHING ERRORS
 
@@ -794,9 +897,9 @@ MATCHING ERRORS
        such errors, pcre2grep gives up.
 
        The --match-limit option of pcre2grep can be used to  set  the  overall
-       resource  limit; there is a second option called --recursion-limit that
-       sets a limit on the amount of memory (usually stack) that is used  (see
-       the discussion of these options above).
+       resource  limit.  There are also other limits that affect the amount of
+       memory used during matching; see the  discussion  of  --heap-limit  and
+       --depth-limit above.
 
 
 DIAGNOSTICS
@@ -807,6 +910,10 @@ DIAGNOSTICS
        errors. Using the -s option to suppress error messages about inaccessi-
        ble files does not affect the return code.
 
+       When   run  under  VMS,  the  return  code  is  placed  in  the  symbol
+       PCRE2GREP_RC because VMS  does  not  distinguish  between  exit(0)  and
+       exit(1).
+
 
 SEE ALSO
 
@@ -822,5 +929,5 @@ AUTHOR
 
 REVISION
 
-       Last updated: 19 June 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 13 November 2017
+       Copyright (c) 1997-2017 University of Cambridge.
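
Note on the revised -w and -x wording in the pcre2grep text above: the new description wraps each pattern in a non-capturing group, so the boundary or anchor applies to the whole pattern and not just to its first and last alternative. The sketch below illustrates the difference. It is an assumption-laden illustration only: it uses Python's re module as a stand-in for PCRE2 (the two agree on \b, ^, $ and (?:...) for these simple patterns), and the helper names wrap_word and wrap_line are hypothetical, not part of pcre2grep.

```python
import re


def wrap_word(pattern: str) -> str:
    """Wrap a pattern as the revised -w text describes (hypothetical helper)."""
    return r"\b(?:" + pattern + r")\b"


def wrap_line(pattern: str) -> str:
    """Wrap a pattern as the revised -x text describes (hypothetical helper)."""
    return "^(?:" + pattern + ")$"


# Putting \b only at the very start and very end of the pattern text binds
# each boundary to a single alternative: \bcat | dog\b
naive_word = r"\bcat|dog\b"
grouped_word = wrap_word("cat|dog")

for subject in ["cat", "dog", "hotdog", "catalog"]:
    naive_hit = re.search(naive_word, subject) is not None
    grouped_hit = re.search(grouped_word, subject) is not None
    print(f"{subject}: naive={naive_hit} grouped={grouped_hit}")
```

With the grouped form, "hotdog" and "catalog" are no longer reported, because the word boundary is required on both sides of whichever alternative matches.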
diff --git a/doc/pcre2jit.3 b/doc/pcre2jit.3
index 0b95b4d..f6d17ca 100644
--- a/doc/pcre2jit.3
+++ b/doc/pcre2jit.3
@@ -1,4 +1,4 @@
-.TH PCRE2JIT 3 "05 June 2016" "PCRE2 10.22"
+.TH PCRE2JIT 3 "31 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 JUST-IN-TIME COMPILER SUPPORT"
@@ -152,7 +152,7 @@ below for a discussion of JIT stack usage.
 The error code PCRE2_ERROR_MATCHLIMIT is returned by the JIT code if searching
 a very large pattern tree goes on for too long, as it is in the same
 circumstance when JIT is not used, but the details of exactly what is counted
-are not the same. The PCRE2_ERROR_RECURSIONLIMIT error code is never returned
+are not the same. The PCRE2_ERROR_DEPTHLIMIT error code is never returned
 when JIT matching is used.
 .
 .
@@ -178,11 +178,8 @@ allocation functions, or NULL for standard memory allocation). It returns a
 pointer to an opaque structure of type \fBpcre2_jit_stack\fP, or NULL if there
 is an error. The \fBpcre2_jit_stack_free()\fP function is used to free a stack
 that is no longer needed. (For the technically minded: the address space is
-allocated by mmap or VirtualAlloc.)
-.P
-JIT uses far less memory for recursion than the interpretive code,
-and a maximum stack size of 512K to 1M should be more than enough for any
-pattern.
+allocated by mmap or VirtualAlloc.) A maximum stack size of 512K to 1M should
+be more than enough for any pattern.
 .P
 The \fBpcre2_jit_stack_assign()\fP function specifies which stack JIT code
 should use. Its arguments are as follows:
@@ -413,6 +410,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 05 June 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 31 March 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2limits.3 b/doc/pcre2limits.3
index a5bab81..88944db 100644
--- a/doc/pcre2limits.3
+++ b/doc/pcre2limits.3
@@ -1,4 +1,4 @@
-.TH PCRE2LIMITS 3 "05 November 2015" "PCRE2 10.21"
+.TH PCRE2LIMITS 3 "30 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "SIZE AND OTHER LIMITATIONS"
@@ -30,15 +30,6 @@ integer type, usually defined as size_t. Its maximum value (that is
 ~(PCRE2_SIZE)0) is reserved as a special indicator for zero-terminated strings
 and unset offsets.
 .P
-Note that when using the traditional matching function, PCRE2 uses recursion to
-handle subpatterns and indefinite repetition. This means that the available
-stack space may limit the size of a subject string that can be processed by
-certain patterns. For a discussion of stack issues, see the
-.\" HREF
-\fBpcre2stack\fP
-.\"
-documentation.
-.P
 All values in repeating quantifiers must be less than 65536.
 .P
 The maximum length of a lookbehind assertion is 65535 characters.
@@ -46,19 +37,20 @@ The maximum length of a lookbehind assertion is 65535 characters.
 There is no limit to the number of parenthesized subpatterns, but there can be
 no more than 65535 capturing subpatterns. There is, however, a limit to the
 depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
-order to limit the amount of system stack used at compile time. The limit can
-be specified when PCRE2 is built; the default is 250.
-.P
-There is a limit to the number of forward references to subsequent subpatterns
-of around 200,000. Repeated forward references with fixed upper limits, for
-example, (?2){0,100} when subpattern number 2 is to the right, are included in
-the count. There is no limit to the number of backward references.
+order to limit the amount of system stack used at compile time. The default
+limit can be specified when PCRE2 is built; if it is not, the default is 250.
+An application can change this limit by calling
+\fBpcre2_set_parens_nest_limit()\fP to set the limit in a compile context.
 .P
 The maximum length of name for a named subpattern is 32 code units, and the
 maximum number of named subpatterns is 10000.
 .P
 The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
-is 255 for the 8-bit library and 65535 for the 16-bit and 32-bit libraries.
+is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
+32-bit libraries.
+.P
+The maximum length of a string argument to a callout is the largest number a
+32-bit unsigned integer can hold.
 .
 .
 .SH AUTHOR
@@ -75,6 +67,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 05 November 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 30 March 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
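
Note on the pcre2limits.3 change above: the (*MARK), (*PRUNE), (*SKIP) and (*THEN) name limits are now stated in code units, not characters, which matters for non-ASCII names. The following is a minimal sketch of that distinction. It assumes a UTF mode, where a character may occupy more than one code unit; the function names are hypothetical and nothing here calls PCRE2 itself.

```python
# Limits taken from the text above: 255 code units for the 8-bit library,
# 65535 code units for the 16-bit and 32-bit libraries.
VERB_NAME_LIMIT = {8: 255, 16: 65535, 32: 65535}

# Encodings assumed for the three code unit widths in UTF mode.
ENCODING = {8: "utf-8", 16: "utf-16-le", 32: "utf-32-le"}


def code_units(name: str, width: int) -> int:
    """Count the code units needed to hold name at the given width."""
    bytes_per_unit = width // 8
    return len(name.encode(ENCODING[width])) // bytes_per_unit


def verb_name_fits(name: str, width: int) -> bool:
    """Check a verb name against the documented limit (hypothetical helper)."""
    return code_units(name, width) <= VERB_NAME_LIMIT[width]


name = "\u00e9" * 200  # 200 characters, each two code units in UTF-8
for width in (8, 16, 32):
    print(width, code_units(name, width), verb_name_fits(name, width))
```

A 200-character name can therefore exceed the 8-bit limit while fitting comfortably in the 16-bit and 32-bit libraries.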
diff --git a/doc/pcre2pattern.3 b/doc/pcre2pattern.3
index 70ac14a..5c0daa8 100644
--- a/doc/pcre2pattern.3
+++ b/doc/pcre2pattern.3
@@ -1,4 +1,4 @@
-.TH PCRE2PATTERN 3 "20 June 2016" "PCRE2 10.22"
+.TH PCRE2PATTERN 3 "12 September 2017" "PCRE2 10.31"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -138,36 +138,52 @@ the application to apply the JIT optimization by calling
 \fBpcre2_jit_compile()\fP is ignored.
 .
 .
-.SS "Setting match and recursion limits"
+.SS "Setting match resource limits"
 .rs
 .sp
-The caller of \fBpcre2_match()\fP can set a limit on the number of times the
-internal \fBmatch()\fP function is called and on the maximum depth of
-recursive calls. These facilities are provided to catch runaway matches that
-are provoked by patterns with huge matching trees (a typical example is a
-pattern with nested unlimited repeats) and to avoid running out of system stack
-by too much recursion. When one of these limits is reached, \fBpcre2_match()\fP
-gives an error return. The limits can also be set by items at the start of the
-pattern of the form
+The \fBpcre2_match()\fP function contains a counter that is incremented every
+time it goes round its main loop. The caller of \fBpcre2_match()\fP can set a
+limit on this counter, which therefore limits the amount of computing resource
+used for a match. The maximum depth of nested backtracking can also be limited;
+this indirectly restricts the amount of heap memory that is used, but there is
+also an explicit memory limit that can be set.
+.P
+These facilities are provided to catch runaway matches that are provoked by
+patterns with huge matching trees (a typical example is a pattern with nested
+unlimited repeats applied to a long string that does not match). When one of
+these limits is reached, \fBpcre2_match()\fP gives an error return. The limits
+can also be set by items at the start of the pattern of the form
 .sp
+  (*LIMIT_HEAP=d)
   (*LIMIT_MATCH=d)
-  (*LIMIT_RECURSION=d)
+  (*LIMIT_DEPTH=d)
 .sp
 where d is any number of decimal digits. However, the value of the setting must
 be less than the value set (or defaulted) by the caller of \fBpcre2_match()\fP
 for it to have any effect. In other words, the pattern writer can lower the
 limits set by the programmer, but not raise them. If there is more than one
 setting of one of these limits, the lower value is used.
+.P
+Prior to release 10.30, LIMIT_DEPTH was called LIMIT_RECURSION. This name is
+still recognized for backwards compatibility.
+.P
+The heap limit applies only when the \fBpcre2_match()\fP interpreter is used
+for matching. It does not apply to JIT or DFA matching. The match limit is used
+(but in a different way) when JIT is being used, or when
+\fBpcre2_dfa_match()\fP is called, to limit computing resource usage by those
+matching functions. The depth limit is ignored by JIT but is relevant for DFA
+matching, which uses function recursion for recursions within the pattern. In
+this case, the depth limit controls the amount of system stack that is used.
 .
 .
 .\" HTML <a name="newlines"></a>
 .SS "Newline conventions"
 .rs
 .sp
-PCRE2 supports five different conventions for indicating line breaks in
+PCRE2 supports six different conventions for indicating line breaks in
 strings: a single CR (carriage return) character, a single LF (linefeed)
-character, the two-character sequence CRLF, any of the three preceding, or any
-Unicode newline sequence. The
+character, the two-character sequence CRLF, any of the three preceding, any
+Unicode newline sequence, or the NUL character (binary zero). The
 .\" HREF
 \fBpcre2api\fP
 .\"
@@ -180,13 +196,14 @@ about newlines, and shows how to set the newline convention when calling
 \fBpcre2_compile()\fP.
 .P
 It is also possible to specify a newline convention by starting a pattern
-string with one of the following five sequences:
+string with one of the following sequences:
 .sp
   (*CR)        carriage return
   (*LF)        linefeed
   (*CRLF)      carriage return, followed by linefeed
   (*ANYCRLF)   any of the three above
   (*ANY)       all Unicode newline sequences
+  (*NUL)       the NUL character (binary zero)
 .sp
 These override the default and the options given to the compiling function. For
 example, on a Unix system where LF is the default newline sequence, the pattern
@@ -201,8 +218,8 @@ The newline convention affects where the circumflex and dollar assertions are
 true. It also affects the interpretation of the dot metacharacter when
 PCRE2_DOTALL is not set, and the behaviour of \eN. However, it does not affect
 what the \eR escape sequence matches. By default, this is any Unicode newline
-sequence, for Perl compatibility. However, this can be changed; see the
-description of \eR in the section entitled
+sequence, for Perl compatibility. However, this can be changed; see the next
+section and the description of \eR in the section entitled
 .\" HTML <a href="#newlineseq">
 .\" </a>
 "Newline sequences"
@@ -225,7 +242,7 @@ corresponding to PCRE2_BSR_UNICODE.
 .rs
 .sp
 PCRE2 can be compiled to run in an environment that uses EBCDIC as its
-character code rather than ASCII or Unicode (typically a mainframe system). In
+character code instead of ASCII or Unicode (typically a mainframe system). In
 the sections below, character code values are ASCII or Unicode; in an EBCDIC
 environment these characters may have different code values, and there are no
 code points greater than 255.
@@ -292,11 +309,11 @@ character that is not a number or a letter, it takes away any special meaning
 that character may have. This use of backslash as an escape character applies
 both inside and outside character classes.
 .P
-For example, if you want to match a * character, you write \e* in the pattern.
-This escaping action applies whether or not the following character would
-otherwise be interpreted as a metacharacter, so it is always safe to precede a
-non-alphanumeric with backslash to specify that it stands for itself. In
-particular, if you want to match a backslash, you write \e\e.
+For example, if you want to match a * character, you must write \e* in the
+pattern. This escaping action applies whether or not the following character
+would otherwise be interpreted as a metacharacter, so it is always safe to
+precede a non-alphanumeric with backslash to specify that it stands for itself.
+In particular, if you want to match a backslash, you write \e\e.
 .P
 In a UTF mode, only ASCII numbers and letters have any special meaning after a
 backslash. All other characters (in particular, those whose codepoints are
@@ -326,7 +343,7 @@ An isolated \eE that is not preceded by \eQ is ignored. If \eQ is not followed
 by \eE later in the pattern, the literal interpretation continues to the end of
 the pattern (that is, \eE is assumed at the end). If the isolated \eQ is inside
 a character class, this causes an error, because the character class is not
-terminated.
+terminated by a closing square bracket.
 .
 .
 .\" HTML <a name="digitsafterbackslash"></a>
@@ -359,29 +376,28 @@ case letter, it is converted to upper case. Then bit 6 of the character (hex
 40) is inverted. Thus \ecA to \ecZ become hex 01 to hex 1A (A is 41, Z is 5A),
 but \ec{ becomes hex 3B ({ is 7B), and \ec; becomes hex 7B (; is 3B). If the
 code unit following \ec has a value less than 32 or greater than 126, a
-compile-time error occurs. This locks out non-printable ASCII characters in all
-modes.
+compile-time error occurs.
 .P
 When PCRE2 is compiled in EBCDIC mode, \ea, \ee, \ef, \en, \er, and \et
 generate the appropriate EBCDIC code values. The \ec escape is processed
 as specified for Perl in the \fBperlebcdic\fP document. The only characters
 that are allowed after \ec are A-Z, a-z, or one of @, [, \e, ], ^, _, or ?. Any
-other character provokes a compile-time error. The sequence \e@ encodes
-character code 0; the letters (in either case) encode characters 1-26 (hex 01
-to hex 1A); [, \e, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and
-\e? becomes either 255 (hex FF) or 95 (hex 5F).
+other character provokes a compile-time error. The sequence \ec@ encodes
+character code 0; after \ec the letters (in either case) encode characters 1-26
+(hex 01 to hex 1A); [, \e, ], ^, and _ encode characters 27-31 (hex 1B to hex
+1F), and \ec? becomes either 255 (hex FF) or 95 (hex 5F).
 .P
-Thus, apart from \e?, these escapes generate the same character code values as
+Thus, apart from \ec?, these escapes generate the same character code values as
 they do in an ASCII environment, though the meanings of the values mostly
-differ. For example, \eG always generates code value 7, which is BEL in ASCII
+differ. For example, \ecG always generates code value 7, which is BEL in ASCII
 but DEL in EBCDIC.
 .P
-The sequence \e? generates DEL (127, hex 7F) in an ASCII environment, but
+The sequence \ec? generates DEL (127, hex 7F) in an ASCII environment, but
 because 127 is not a control character in EBCDIC, Perl makes it generate the
 APC character. Unfortunately, there are several variants of EBCDIC. In most of
 them the APC character has the value 255 (hex FF), but in the one Perl calls
 POSIX-BC its value is 95 (hex 5F). If certain other characters have POSIX-BC
-values, PCRE2 makes \e? generate 95; otherwise it generates 255.
+values, PCRE2 makes \ec? generate 95; otherwise it generates 255.
 .P
 After \e0 up to two further octal digits are read. If there are fewer than two
 digits, just those that are present are used. Thus the sequence \e0\ex\e015
@@ -455,9 +471,9 @@ a hexadecimal digit appears between \ex{ and }, or if there is no terminating
 .P
 If the PCRE2_ALT_BSUX option is set, the interpretation of \ex is as just
 described only when it is followed by two hexadecimal digits. Otherwise, it
-matches a literal "x" character. In this mode mode, support for code points
-greater than 256 is provided by \eu, which must be followed by four hexadecimal
-digits; otherwise it matches a literal "u" character.
+matches a literal "x" character. In this mode, support for code points greater
+than 256 is provided by \eu, which must be followed by four hexadecimal digits;
+otherwise it matches a literal "u" character.
 .P
 Characters whose value is less than 256 can be defined by either of the two
 syntaxes for \ex (or by \eu in PCRE2_ALT_BSUX mode). There is no difference in
@@ -471,15 +487,15 @@ the way they are handled. For example, \exdc is exactly the same as \ex{dc} (or
 Characters that are specified using octal or hexadecimal numbers are
 limited to certain values, as follows:
 .sp
-  8-bit non-UTF mode    less than 0x100
-  8-bit UTF-8 mode      less than 0x10ffff and a valid codepoint
-  16-bit non-UTF mode   less than 0x10000
-  16-bit UTF-16 mode    less than 0x10ffff and a valid codepoint
-  32-bit non-UTF mode   less than 0x100000000
-  32-bit UTF-32 mode    less than 0x10ffff and a valid codepoint
+  8-bit non-UTF mode    no greater than 0xff
+  16-bit non-UTF mode   no greater than 0xffff
+  32-bit non-UTF mode   no greater than 0xffffffff
+  All UTF modes         no greater than 0x10ffff and a valid codepoint
 .sp
-Invalid Unicode codepoints are the range 0xd800 to 0xdfff (the so-called
-"surrogate" codepoints), and 0xffef.
+Invalid Unicode codepoints are all those in the range 0xd800 to 0xdfff (the
+so-called "surrogate" codepoints). The check for these can be disabled by the
+caller of \fBpcre2_compile()\fP by setting the option
+PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.
 .
 .
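Reviewer's illustration (not part of the patch): the revised limits table and the surrogate rule can be expressed as a short check. This is a sketch under stated assumptions: the function names are hypothetical, and allow_surrogates merely stands in for the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES option named above; nothing here calls PCRE2.

```python
SURROGATES = range(0xD800, 0xE000)   # 0xd800 to 0xdfff inclusive


def max_code_point(code_unit_bits, utf):
    """Largest value an octal or hex escape may specify in the given mode."""
    if utf:
        return 0x10FFFF                   # all UTF modes share the Unicode limit
    return (1 << code_unit_bits) - 1      # 0xff, 0xffff or 0xffffffff


def escape_value_allowed(value, code_unit_bits, utf, allow_surrogates=False):
    """Apply the table: a range check, plus the surrogate check in UTF modes."""
    if value < 0 or value > max_code_point(code_unit_bits, utf):
        return False
    if utf and value in SURROGATES and not allow_surrogates:
        return False
    return True
```

Note that the surrogate check applies only in UTF modes, so 0xd800 is acceptable in 16-bit non-UTF mode.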
 .SS "Escape sequences in character classes"
@@ -502,15 +518,15 @@ In Perl, the sequences \el, \eL, \eu, and \eU are recognized by its string
 handler and used to modify the case of following characters. By default, PCRE2
 does not support these escape sequences. However, if the PCRE2_ALT_BSUX option
 is set, \eU matches a "U" character, and \eu can be used to define a character
-by code point, as described in the previous section.
+by code point, as described above.
 .
 .
 .SS "Absolute and relative back references"
 .rs
 .sp
-The sequence \eg followed by an unsigned or a negative number, optionally
-enclosed in braces, is an absolute or relative back reference. A named back
-reference can be coded as \eg{name}. Back references are discussed
+The sequence \eg followed by a signed or unsigned number, optionally enclosed
+in braces, is an absolute or relative back reference. A named back reference
+can be coded as \eg{name}. Back references are discussed
 .\" HTML <a href="#backreferences">
 .\" </a>
 later,
@@ -710,7 +726,9 @@ When PCRE2 is built with Unicode support (the default), three additional escape
 sequences that match characters with specific properties are available. In
 8-bit non-UTF-8 mode, these sequences are of course limited to testing
 characters whose codepoints are less than 256, but they do work in this mode.
-The extra escape sequences are:
+In 32-bit non-UTF mode, codepoints greater than 0x10ffff (the Unicode limit)
+may be encountered. These are all treated as being in the Common script and
+with an unassigned type. The extra escape sequences are:
 .sp
   \ep{\fIxx\fP}   a character with the \fIxx\fP property
   \eP{\fIxx\fP}   a character without the \fIxx\fP property
@@ -738,6 +756,7 @@ example:
 Those that are not part of an identified script are lumped together as
 "Common". The current list of scripts is:
 .P
+Adlam,
 Ahom,
 Anatolian_Hieroglyphs,
 Arabic,
@@ -748,6 +767,7 @@ Bamum,
 Bassa_Vah,
 Batak,
 Bengali,
+Bhaiksuki,
 Bopomofo,
 Brahmi,
 Braille,
@@ -809,6 +829,8 @@ Mahajani,
 Malayalam,
 Mandaic,
 Manichaean,
+Marchen,
+Masaram_Gondi,
 Meetei_Mayek,
 Mende_Kikakui,
 Meroitic_Cursive,
@@ -821,7 +843,9 @@ Multani,
 Myanmar,
 Nabataean,
 New_Tai_Lue,
+Newa,
 Nko,
+Nushu,
 Ogham,
 Ol_Chiki,
 Old_Hungarian,
@@ -832,6 +856,7 @@ Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
 Oriya,
+Osage,
 Osmanya,
 Pahawh_Hmong,
 Palmyrene,
@@ -849,6 +874,7 @@ Siddham,
 SignWriting,
 Sinhala,
 Sora_Sompeng,
+Soyombo,
 Sundanese,
 Syloti_Nagri,
 Syriac,
@@ -859,6 +885,7 @@ Tai_Tham,
 Tai_Viet,
 Takri,
 Tamil,
+Tangut,
 Telugu,
 Thaana,
 Thai,
@@ -868,7 +895,8 @@ Tirhuta,
 Ugaritic,
 Vai,
 Warang_Citi,
-Yi.
+Yi,
+Zanabazar_Square.
 .P
 Each character has exactly one Unicode general category property, specified by
 a two-letter abbreviation. For compatibility with Perl, negation can be
@@ -972,9 +1000,11 @@ grapheme cluster", and treats the sequence as an atomic group
 .\"
 Unicode supports various kinds of composite character by giving each character
 a grapheme breaking property, and having rules that use these properties to
-define the boundaries of extended grapheme clusters. \eX always matches at
-least one character. Then it decides whether to add additional characters
-according to the following rules for ending a cluster:
+define the boundaries of extended grapheme clusters. The rules are defined in
+Unicode Standard Annex 29, "Unicode Text Segmentation".
+.P
+\eX always matches at least one character. Then it decides whether to add
+additional characters according to the following rules for ending a cluster:
 .P
 1. End at the end of the subject string.
 .P
@@ -985,11 +1015,22 @@ are of five types: L, V, T, LV, and LVT. An L character may be followed by an
 L, V, LV, or LVT character; an LV or V character may be followed by a V or T
 character; an LVT or T character may be followed only by a T character.
 .P
-4. Do not end before extending characters or spacing marks. Characters with
-the "mark" property always have the "extend" grapheme breaking property.
+4. Do not end before extending characters or spacing marks or the "zero-width
+joiner" characters. Characters with the "mark" property always have the
+"extend" grapheme breaking property.
 .P
 5. Do not end after prepend characters.
 .P
+6. Do not break within emoji modifier sequences (a base character followed by a
+modifier). Extending characters are allowed before the modifier.
+.P
+7. Do not break within emoji zwj sequences (zero-width joiner followed by
+"glue after ZWJ" or "base glue after ZWJ").
+.P
+8. Do not break within emoji flag sequences. That is, do not break between
+regional indicator (RI) characters if there are an odd number of RI characters
+before the break point.
+.P
 6. Otherwise, end the cluster.
 .
 .
@@ -1325,13 +1366,34 @@ when matching character classes, whatever line-ending sequence is in use, and
 whatever setting of the PCRE2_DOTALL and PCRE2_MULTILINE options is used. A
 class such as [^a] always matches one of these characters.
 .P
+The character escape sequences \ed, \eD, \eh, \eH, \ep, \eP, \es, \eS, \ev,
+\eV, \ew, and \eW may appear in a character class, and add the characters that
+they match to the class. For example, [\edABCDEF] matches any hexadecimal
+digit. In UTF modes, the PCRE2_UCP option affects the meanings of \ed, \es, \ew
+and their upper case partners, just as it does when they appear outside a
+character class, as described in the section entitled
+.\" HTML <a href="#genericchartypes">
+.\" </a>
+"Generic character types"
+.\"
+above. The escape sequence \eb has a different meaning inside a character
+class; it matches the backspace character. The sequences \eB, \eN, \eR, and \eX
+are not special inside a character class. Like any other unrecognized escape
+sequences, they cause an error.
+.P
 The minus (hyphen) character can be used to specify a range of characters in a
 character class. For example, [d-m] matches any letter between d and m,
 inclusive. If a minus character is required in a class, it must be escaped with
 a backslash or appear in a position where it cannot be interpreted as
-indicating a range, typically as the first or last character in the class, or
-immediately after a range. For example, [b-d-z] matches letters in the range b
-to d, a hyphen character, or z.
+indicating a range, typically as the first or last character in the class,
+or immediately after a range. For example, [b-d-z] matches letters in the range
+b to d, a hyphen character, or z.
+.P
+Perl treats a hyphen as a literal if it appears before or after a POSIX class
+(see below) or before or after a character type escape such as \ed or \eH.
+However, unless the hyphen is the last character in the class, Perl outputs a
+warning in its warning mode, as this is most likely a user error. As PCRE2 has
+no facility for warning, an error is given in these cases.
 .P
 It is not possible to have the literal character "]" as the end character of a
 range. A pattern such as [W-]46] is interpreted as a class of two characters
@@ -1341,15 +1403,14 @@ the end of range, so [W-\e]46] is interpreted as a class containing a range
 followed by two other characters. The octal or hexadecimal representation of
 "]" can also be used to end a range.
 .P
-An error is generated if a POSIX character class (see below) or an escape
-sequence other than one that defines a single character appears at a point
-where a range ending character is expected. For example, [z-\exff] is valid,
-but [A-\ed] and [A-[:digit:]] are not.
-.P
 Ranges normally include all code points between the start and end characters,
 inclusive. They can also be used for code points specified numerically, for
 example [\e000-\e037]. Ranges can include any characters that are valid for the
-current mode.
+current mode. In any UTF mode, the so-called "surrogate" characters (those
+whose code points lie between 0xd800 and 0xdfff inclusive) may not be specified
+explicitly by default (the PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES option disables
+this check). However, ranges such as [\ex{d7ff}-\ex{e000}], which include the
+surrogates, are always permitted.
 .P
 There is a special case in EBCDIC environments for ranges whose end points are
 both specified as literal letters in the same case. For compatibility with
@@ -1365,21 +1426,6 @@ matches the letters in either case. For example, [W-c] is equivalent to
 tables for a French locale are in use, [\exc8-\excb] matches accented E
 characters in both cases.
 .P
-The character escape sequences \ed, \eD, \eh, \eH, \ep, \eP, \es, \eS, \ev,
-\eV, \ew, and \eW may appear in a character class, and add the characters that
-they match to the class. For example, [\edABCDEF] matches any hexadecimal
-digit. In UTF modes, the PCRE2_UCP option affects the meanings of \ed, \es, \ew
-and their upper case partners, just as it does when they appear outside a
-character class, as described in the section entitled
-.\" HTML <a href="#genericchartypes">
-.\" </a>
-"Generic character types"
-.\"
-above. The escape sequence \eb has a different meaning inside a character
-class; it matches the backspace character. The sequences \eB, \eN, \eR, and \eX
-are not special inside a character class. Like any other unrecognized escape
-sequences, they cause an error.
-.P
 A circumflex can conveniently be used with the upper case character types to
 specify a more restricted set of characters than the matching lower case type.
 For example, the class [^\eW_] matches any letter or digit, but not underscore,
@@ -1527,20 +1573,25 @@ alternative in the subpattern.
 .SH "INTERNAL OPTION SETTING"
 .rs
 .sp
-The settings of the PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL, and
-PCRE2_EXTENDED options (which are Perl-compatible) can be changed from within
-the pattern by a sequence of Perl option letters enclosed between "(?" and ")".
-The option letters are
+The settings of the PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL,
+PCRE2_EXTENDED, PCRE2_EXTENDED_MORE, and PCRE2_NO_AUTO_CAPTURE options (which
+are Perl-compatible) can be changed from within the pattern by a sequence of
+Perl option letters enclosed between "(?" and ")". The option letters are
 .sp
   i  for PCRE2_CASELESS
   m  for PCRE2_MULTILINE
+  n  for PCRE2_NO_AUTO_CAPTURE
   s  for PCRE2_DOTALL
   x  for PCRE2_EXTENDED
+  xx for PCRE2_EXTENDED_MORE
 .sp
 For example, (?im) sets caseless, multiline matching. It is also possible to
-unset these options by preceding the letter with a hyphen, and a combined
-setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS and
-PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
+unset these options by preceding the letter with a hyphen. The two "extended"
+options are not independent; unsetting either one cancels the effects of both
+of them.
+.P
+A combined setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS
+and PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
 permitted. If a letter appears both before and after the hyphen, the option is
 unset. An empty options setting "(?)" is allowed. Needless to say, it has no
 effect.
@@ -1551,12 +1602,8 @@ respectively.
 .P
 When one of these option changes occurs at top level (that is, not inside
 subpattern parentheses), the change applies to the remainder of the pattern
-that follows. If the change is placed right at the start of a pattern, PCRE2
-extracts it into the global options (and it will therefore show up in data
-extracted by the \fBpcre2_pattern_info()\fP function).
-.P
-An option change within a subpattern (see below for a description of
-subpatterns) affects only that part of the subpattern that follows it, so
+that follows. An option change within a subpattern (see below for a description
+of subpatterns) affects only that part of the subpattern that follows it, so
 .sp
   (a(?i)b)c
 .sp
@@ -2096,9 +2143,9 @@ no such problem when named parentheses are used. A back reference to any
 subpattern is possible using named parentheses (see below).
 .P
 Another way of avoiding the ambiguity inherent in the use of digits following a
-backslash is to use the \eg escape sequence. This escape must be followed by an
-unsigned number or a negative number, optionally enclosed in braces. These
-examples are all identical:
+backslash is to use the \eg escape sequence. This escape must be followed by a
+signed or unsigned number, optionally enclosed in braces. These examples are
+all identical:
 .sp
   (ring), \e1
   (ring), \eg1
@@ -2106,8 +2153,7 @@ examples are all identical:
 .sp
 An unsigned number specifies an absolute reference without the ambiguity that
 is present in the older syntax. It is also useful when literal digits follow
-the reference. A negative number is a relative reference. Consider this
-example:
+the reference. A signed number is a relative reference. Consider this example:
 .sp
   (abc(def)ghi)\eg{-1}
 .sp
@@ -2117,6 +2163,10 @@ Similarly, \eg{-2} would be equivalent to \e1. The use of relative references
 can be helpful in long patterns, and also in patterns that are created by
 joining together fragments that contain references within themselves.
 .P
+The sequence \eg{+1} is a reference to the next capturing subpattern. This kind
+of forward reference can be useful in patterns that repeat. Perl does not
+support the use of + in this way.
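Reviewer's illustration (not part of the patch): the arithmetic behind signed \g references can be sketched as follows. The function name and its parameters are hypothetical; groups_opened_before is the number of capturing groups whose opening parenthesis precedes the reference.

```python
def resolve_relative_reference(groups_opened_before, offset):
    """Turn a signed \\g offset into an absolute group number.

    A negative offset counts back from the most recently started group;
    a positive offset counts forward to groups that have not started yet.
    """
    if offset == 0:
        raise ValueError("a relative reference needs a non-zero signed number")
    if offset < 0:
        target = groups_opened_before + offset + 1
    else:
        target = groups_opened_before + offset
    if target < 1:
        raise ValueError("reference points before the first group")
    return target
```

In (abc(def)ghi)\g{-1} two groups have been opened before the reference, so -1 resolves to group 2 and -2 to group 1; at the same point +1 would resolve to group 3, the next group to be opened.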
+.P
 A back reference matches whatever actually matched the capturing subpattern in
 the current subject string, rather than anything matching the subpattern
 itself (see
@@ -2215,14 +2265,28 @@ above.
 .P
 More complicated assertions are coded as subpatterns. There are two kinds:
 those that look ahead of the current position in the subject string, and those
-that look behind it. An assertion subpattern is matched in the normal way,
-except that it does not cause the current matching position to be changed.
-.P
-Assertion subpatterns are not capturing subpatterns. If such an assertion
-contains capturing subpatterns within it, these are counted for the purposes of
+that look behind it, and in each case an assertion may be positive (must
+succeed for matching to continue) or negative (must not succeed for matching to
+continue). An assertion subpattern is matched in the normal way, except that,
+when matching continues afterwards, the matching position in the subject string
+is as it was at the start of the assertion.
+.P
+Assertion subpatterns are not capturing subpatterns. If an assertion contains
+capturing subpatterns within it, these are counted for the purposes of
 numbering the capturing subpatterns in the whole pattern. However, substring
-capturing is carried out only for positive assertions. (Perl sometimes, but not
-always, does do capturing in negative assertions.)
+capturing is carried out only for positive assertions that succeed, that is,
+one of their branches matches, so matching continues after the assertion. If
+all branches of a positive assertion fail to match, nothing is captured, and
+control is passed to the previous backtracking point.
+.P
+No capturing is done for a negative assertion unless it is being used as a
+condition in a
+.\" HTML <a href="#subpatternsassubroutines">
+.\" </a>
+conditional subpattern
+.\"
+(see the discussion below). Matching continues after a non-conditional negative
+assertion only if all its branches fail to match.
 .P
 For compatibility with Perl, most assertion subpatterns may be repeated; though
 it makes no sense to assert the same thing several times, the side effect of
@@ -2321,23 +2385,34 @@ temporarily move the current position back by the fixed length and then try to
 match. If there are insufficient characters before the current position, the
 assertion fails.
 .P
-In a UTF mode, PCRE2 does not allow the \eC escape (which matches a single code
-unit even in a UTF mode) to appear in lookbehind assertions, because it makes
-it impossible to calculate the length of the lookbehind. The \eX and \eR
-escapes, which can match different numbers of code units, are also not
-permitted.
+In UTF-8 and UTF-16 modes, PCRE2 does not allow the \eC escape (which matches a
+single code unit even in a UTF mode) to appear in lookbehind assertions,
+because it makes it impossible to calculate the length of the lookbehind. The
+\eX and \eR escapes, which can match different numbers of code units, are never
+permitted in lookbehinds.
 .P
 .\" HTML <a href="#subpatternsassubroutines">
 .\" </a>
 "Subroutine"
 .\"
 calls (see below) such as (?2) or (?&X) are permitted in lookbehinds, as long
-as the subpattern matches a fixed-length string.
+as the subpattern matches a fixed-length string. However,
 .\" HTML <a href="#recursion">
 .\" </a>
-Recursion,
+recursion,
 .\"
-however, is not supported.
+that is, a "subroutine" call into a group that is already active,
+is not supported.
+.P
+Perl does not support back references in lookbehinds. PCRE2 does support them,
+but only if certain conditions are met. The PCRE2_MATCH_UNSET_BACKREF option
+must not be set, there must be no use of (?| in the pattern (it creates
+duplicate subpattern numbers), and if the back reference is by name, the name
+must be unique. Of course, the referenced subpattern must itself be of fixed
+length. The following pattern matches words containing at least two characters
+that begin and end with the same character:
+.sp
+   \eb(\ew)\ew++(?<=\e1)
 .P
 Possessive quantifiers can be used in conjunction with lookbehind assertions to
 specify efficient matching of fixed-length strings at the end of subject
@@ -2476,7 +2551,9 @@ This makes the fragment independent of the parentheses in the larger pattern.
 .sp
 Perl uses the syntax (?(<name>)...) or (?('name')...) to test for a used
 subpattern by name. For compatibility with earlier versions of PCRE1, which had
-this facility before Perl, the syntax (?(name)...) is also recognized.
+this facility before Perl, the syntax (?(name)...) is also recognized. Note,
+however, that undelimited names consisting of the letter R followed by digits
+are ambiguous (see the following section).
 .P
 Rewriting the above example to use a named subpattern gives this:
 .sp
@@ -2490,33 +2567,55 @@ matched.
 .SS "Checking for pattern recursion"
 .rs
 .sp
-If the condition is the string (R), and there is no subpattern with the name R,
-the condition is true if a recursive call to the whole pattern or any
-subpattern has been made. If digits or a name preceded by ampersand follow the
-letter R, for example:
+"Recursion" in this sense refers to any subroutine-like call from one part of
+the pattern to another, whether or not it is actually recursive. See the
+sections entitled
+.\" HTML <a href="#recursion">
+.\" </a>
+"Recursive patterns"
+.\"
+and
+.\" HTML <a href="#subpatternsassubroutines">
+.\" </a>
+"Subpatterns as subroutines"
+.\"
+below for details of recursion and subpattern calls.
+.P
+If a condition is the string (R), and there is no subpattern with the name R,
+the condition is true if matching is currently in a recursion or subroutine
+call to the whole pattern or any subpattern. If digits follow the letter R, and
+there is no subpattern with that name, the condition is true if the most recent
+call is into a subpattern with the given number, which must exist somewhere in
+the overall pattern. This is a contrived example that is equivalent to a+b:
+.sp
+  ((?(R1)a+|(?1)b))
+.sp
+However, in both cases, if there is a subpattern with a matching name, the
+condition tests for its being set, as described in the section above, instead
+of testing for recursion. For example, creating a group with the name R1 by
+adding (?<R1>) to the above pattern completely changes its meaning.
+.P
+If a name preceded by ampersand follows the letter R, for example:
 .sp
-  (?(R3)...) or (?(R&name)...)
+  (?(R&name)...)
 .sp
-the condition is true if the most recent recursion is into a subpattern whose
-number or name is given. This condition does not check the entire recursion
-stack. If the name used in a condition of this kind is a duplicate, the test is
-applied to all subpatterns of the same name, and is true if any one of them is
-the most recent recursion.
+the condition is true if the most recent recursion is into a subpattern of that
+name (which must exist within the pattern).
+.P
+This condition does not check the entire recursion stack. It tests only the
+current level. If the name used in a condition of this kind is a duplicate, the
+test is applied to all subpatterns of the same name, and is true if any one of
+them is the most recent recursion.
 .P
 At "top level", all these recursion test conditions are false.
-.\" HTML <a href="#recursion">
-.\" </a>
-The syntax for recursive patterns
-.\"
-is described below.
 .
 .
 .\" HTML <a name="subdefine"></a>
 .SS "Defining subpatterns for use by reference only"
 .rs
 .sp
-If the condition is the string (DEFINE), and there is no subpattern with the
-name DEFINE, the condition is always false. In this case, there may be only one
+If the condition is the string (DEFINE), the condition is always false, even if
+there is a group with the name DEFINE. In this case, there may be only one
 alternative in the subpattern. It is always skipped if control reaches this
 point in the pattern; the idea of DEFINE is that it can be used to define
 subroutines that can be referenced from elsewhere. (The use of
@@ -2574,6 +2673,12 @@ presence of at least one letter in the subject. If a letter is found, the
 subject is matched against the first alternative; otherwise it is matched
 against the second. This pattern matches strings in one of the two forms
 dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
+.P
+When an assertion that is a condition contains capturing subpatterns, any
+capturing that occurs in a matching branch is retained afterwards, for both
+positive and negative assertions, because matching always continues after the
+assertion, whether it succeeds or fails. (Compare non-conditional assertions,
+when captures are retained only for positive assertions that succeed.)
 .
 .
 .\" HTML <a name="comments"></a>
@@ -2753,88 +2858,53 @@ is the actual recursive call.
 .SS "Differences in recursion processing between PCRE2 and Perl"
 .rs
 .sp
-Recursion processing in PCRE2 differs from Perl in two important ways. In PCRE2
-(like Python, but unlike Perl), a recursive subpattern call is always treated
-as an atomic group. That is, once it has matched some of the subject string, it
-is never re-entered, even if it contains untried alternatives and there is a
-subsequent matching failure. This can be illustrated by the following pattern,
-which purports to match a palindromic string that contains an odd number of
-characters (for example, "a", "aba", "abcba", "abcdcba"):
-.sp
-  ^(.|(.)(?1)\e2)$
-.sp
-The idea is that it either matches a single character, or two identical
-characters surrounding a sub-palindrome. In Perl, this pattern works; in PCRE2
-it does not if the pattern is longer than three characters. Consider the
-subject string "abcba":
+Some former differences between PCRE2 and Perl no longer exist.
 .P
-At the top level, the first character is matched, but as it is not at the end
-of the string, the first alternative fails; the second alternative is taken
-and the recursion kicks in. The recursive call to subpattern 1 successfully
-matches the next character ("b"). (Note that the beginning and end of line
-tests are not part of the recursion).
+Before release 10.30, recursion processing in PCRE2 differed from Perl in that
+a recursive subpattern call was always treated as an atomic group. That is,
+once it had matched some of the subject string, it was never re-entered, even
+if it contained untried alternatives and there was a subsequent matching
+failure. (Historical note: PCRE implemented recursion before Perl did.)
 .P
-Back at the top level, the next character ("c") is compared with what
-subpattern 2 matched, which was "a". This fails. Because the recursion is
-treated as an atomic group, there are now no backtracking points, and so the
-entire match fails. (Perl is able, at this point, to re-enter the recursion and
-try the second alternative.) However, if the pattern is written with the
-alternatives in the other order, things are different:
-.sp
-  ^((.)(?1)\e2|.)$
-.sp
-This time, the recursing alternative is tried first, and continues to recurse
-until it runs out of characters, at which point the recursion fails. But this
-time we do have another alternative to try at the higher level. That is the big
-difference: in the previous case the remaining alternative is at a deeper
-recursion level, which PCRE2 cannot use.
+Starting with release 10.30, recursive subroutine calls are no longer treated
+as atomic. That is, they can be re-entered to try unused alternatives if there
+is a matching failure later in the pattern. This is now compatible with the way
+Perl works. If you want a subroutine call to be atomic, you must explicitly
+enclose it in an atomic group.
 .P
-To change the pattern so that it matches all palindromic strings, not just
-those with an odd number of characters, it is tempting to change the pattern to
-this:
+Supporting backtracking into recursions simplifies certain types of recursive
+pattern. For example, this pattern matches palindromic strings:
 .sp
   ^((.)(?1)\e2|.?)$
 .sp
-Again, this works in Perl, but not in PCRE2, and for the same reason. When a
-deeper recursion has matched a single character, it cannot be entered again in
-order to match an empty string. The solution is to separate the two cases, and
-write out the odd and even cases as alternatives at the higher level:
-.sp
-  ^(?:((.)(?1)\e2|)|((.)(?3)\e4|.))
+The second branch in the group matches a single central character in the
+palindrome when there are an odd number of characters, or nothing when there
+are an even number of characters, but in order to work it has to be able to try
+the second case when the rest of the pattern match fails. If you want to match
+typical palindromic phrases, the pattern has to ignore all non-word characters,
+which can be done like this:
 .sp
-If you want to match typical palindromic phrases, the pattern has to ignore all
-non-word characters, which can be done like this:
-.sp
-  ^\eW*+(?:((.)\eW*+(?1)\eW*+\e2|)|((.)\eW*+(?3)\eW*+\e4|\eW*+.\eW*+))\eW*+$
+  ^\eW*+((.)\eW*+(?1)\eW*+\e2|\eW*+.?)\eW*+$
 .sp
 If run with the PCRE2_CASELESS option, this pattern matches phrases such as "A
-man, a plan, a canal: Panama!" and it works in both PCRE2 and Perl. Note the
-use of the possessive quantifier *+ to avoid backtracking into sequences of
-non-word characters. Without this, PCRE2 takes a great deal longer (ten times
-or more) to match typical phrases, and Perl takes so long that you think it has
-gone into a loop.
-.P
-\fBWARNING\fP: The palindrome-matching patterns above work only if the subject
-string does not start with a palindrome that is shorter than the entire string.
-For example, although "abcba" is correctly matched, if the subject is "ababa",
-PCRE2 finds the palindrome "aba" at the start, then fails at top level because
-the end of the string does not follow. Once again, it cannot jump back into the
-recursion to try other alternatives, so the entire match fails.
-.P
-The second way in which PCRE2 and Perl differ in their recursion processing is
-in the handling of captured values. In Perl, when a subpattern is called
-recursively or as a subpattern (see the next section), it has no access to any
-values that were captured outside the recursion, whereas in PCRE2 these values
-can be referenced. Consider this pattern:
+man, a plan, a canal: Panama!". Note the use of the possessive quantifier *+ to
+avoid backtracking into sequences of non-word characters. Without this, PCRE2
+takes a great deal longer (ten times or more) to match typical phrases, and
+Perl takes so long that you think it has gone into a loop.
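Reviewer's illustration (not part of the patch): Python's re module has no recursion syntax, so the sketch below hand-codes the pattern ^((.)(?1)\2|.?)$ as a generator to show why backtracking into the recursion matters. The function names are hypothetical, and the subject is assumed to contain no newlines (which "." would not match by default). Each call yields every end offset the group could match, so an outer level can ask an inner level for its next alternative, which is what non-atomic recursion permits.

```python
def group1(subject, start):
    """Yield each end offset at which group 1 can match from offset start."""
    # First branch: (.)(?1)\2
    if start < len(subject):
        first = subject[start]
        for end in group1(subject, start + 1):       # re-enterable recursion
            if end < len(subject) and subject[end] == first:
                yield end + 1
    # Second branch: .? (greedy, so try one character before none)
    if start < len(subject):
        yield start + 1
    yield start


def matches_palindrome(subject):
    """True if some way of matching group 1 ends exactly at the subject end."""
    return any(end == len(subject) for end in group1(subject, 0))
```

With "ababa", the innermost calls first consume too much and the outer comparison fails; the match succeeds only because the loop pulls further end offsets from the inner generator. An atomic recursion would correspond to taking only the first value each inner call yields.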
+.P
+Another way in which PCRE2 and Perl used to differ in their recursion
+processing is in the handling of captured values. Formerly in Perl, when a
+subpattern was called recursively or as a subroutine (see the next section), it
+had no access to any values that were captured outside the recursion, whereas
+in PCRE2 these values can be referenced. Consider this pattern:
 .sp
   ^(.)(\e1|a(?2))
 .sp
-In PCRE2, this pattern matches "bab". The first capturing parentheses match "b",
-then in the second group, when the back reference \e1 fails to match "b", the
-second alternative matches "a" and then recurses. In the recursion, \e1 does
-now match "b" and so the whole match succeeds. In Perl, the pattern fails to
-match because inside the recursive call \e1 cannot access the externally set
-value.
+This pattern matches "bab". The first capturing parentheses match "b", then in
+the second group, when the back reference \e1 fails to match "b", the second
+alternative matches "a" and then recurses. In the recursion, \e1 does now match
+"b" and so the whole match succeeds. This match used to fail in Perl, but in
+later versions (I tried 5.024) it now works.
 .
 .
 .\" HTML <a name="subpatternsassubroutines"></a>
@@ -2863,11 +2933,10 @@ matches "sense and sensibility" and "response and responsibility", but not
 is used, it does match "sense and responsibility" as well as the other two
 strings. Another example is given in the discussion of DEFINE above.
 .P
-All subroutine calls, whether recursive or not, are always treated as atomic
-groups. That is, once a subroutine has matched some of the subject string, it
-is never re-entered, even if it contains untried alternatives and there is a
-subsequent matching failure. Any capturing parentheses that are set during the
-subroutine call revert to their previous values afterwards.
+Like recursions, subroutine calls used to be treated as atomic, but this
+changed at PCRE2 release 10.30, so backtracking into subroutine calls can now
+occur. However, any capturing parentheses that are set during the subroutine
+call revert to their previous values afterwards.
 .P
 Processing options such as case-independence are fixed when a subpattern is
 defined, so if it is used as a subroutine, such options cannot be changed for
@@ -2980,26 +3049,28 @@ The doubling is removed before the string is passed to the callout function.
 .SH "BACKTRACKING CONTROL"
 .rs
 .sp
-Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which
-are still described in the Perl documentation as "experimental and subject to
-change or removal in a future version of Perl". It goes on to say: "Their usage
-in production code should be noted to avoid problems during upgrades." The same
-remarks apply to the PCRE2 features described in this section.
-.P
-The new verbs make use of what was previously invalid syntax: an opening
-parenthesis followed by an asterisk. They are generally of the form (*VERB) or
-(*VERB:NAME). Some verbs take either form, possibly behaving differently
-depending on whether or not a name is present.
+There are a number of special "Backtracking Control Verbs" (to use Perl's
+terminology) that modify the behaviour of backtracking during matching. They
+are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
+possibly behaving differently depending on whether or not a name is present.
 .P
 By default, for compatibility with Perl, a name is any sequence of characters
 that does not include a closing parenthesis. The name is not processed in
 any way, and it is not possible to include a closing parenthesis in the name.
-However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing
-is applied to verb names and only an unescaped closing parenthesis terminates
-the name. A closing parenthesis can be included in a name either as \e) or
-between \eQ and \eE. If the PCRE2_EXTENDED option is set, unescaped whitespace
-in verb names is skipped and #-comments are recognized, exactly as in the rest
-of the pattern.
+This can be changed by setting the PCRE2_ALT_VERBNAMES option, but the result
+is no longer Perl-compatible.
+.P
+When PCRE2_ALT_VERBNAMES is set, backslash processing is applied to verb names
+and only an unescaped closing parenthesis terminates the name. However, the
+only backslash items that are permitted are \eQ, \eE, and sequences such as
+\ex{100} that define character code points. Character type escapes such as \ed
+are faulted.
+.P
+A closing parenthesis can be included in a name either as \e) or between \eQ
+and \eE. In addition to backslash processing, if the PCRE2_EXTENDED option is
+also set, unescaped whitespace in verb names is skipped, and #-comments are
+recognized, exactly as in the rest of the pattern. PCRE2_EXTENDED does not
+affect verb names unless PCRE2_ALT_VERBNAMES is also set.
 .P
 The maximum length of a name is 255 in the 8-bit library and 65535 in the
 16-bit and 32-bit libraries. If the name is empty, that is, if the closing
@@ -3008,7 +3079,7 @@ not there. Any number of these verbs may occur in a pattern.
 .P
 Since these verbs are specifically related to backtracking, most of them can be
 used only when the pattern is to be matched using the traditional matching
-function, because these use a backtracking algorithm. With the exception of
+function, because that uses a backtracking algorithm. With the exception of
 (*FAIL), which behaves like a failing negative assertion, the backtracking
 control verbs cause an error if encountered by the DFA matching function.
 .P
@@ -3162,11 +3233,11 @@ to ensure that the match is always attempted.
 The following verbs do nothing when they are encountered. Matching continues
 with what follows, but if there is no subsequent match, causing a backtrack to
 the verb, a failure is forced. That is, backtracking cannot pass to the left of
-the verb. However, when one of these verbs appears inside an atomic group
-(which includes any group that is called as a subroutine) or in an assertion
-that is true, its effect is confined to that group, because once the group has
-been matched, there is never any backtracking into it. In this situation,
-backtracking has to jump to the left of the entire atomic group or assertion.
+the verb. However, when one of these verbs appears inside an atomic group or in
+an assertion that is true, its effect is confined to that group, because once
+the group has been matched, there is never any backtracking into it. In this
+situation, backtracking has to jump to the left of the entire atomic group or
+assertion.
 .P
 These verbs differ in exactly what kind of failure occurs when backtracking
 reaches them. The behaviour described below is what happens when the verb is
@@ -3226,8 +3297,8 @@ possessive quantifier, but there are some uses of (*PRUNE) that cannot be
 expressed in any other way. In an anchored pattern (*PRUNE) has the same effect
 as (*COMMIT).
 .P
-The behaviour of (*PRUNE:NAME) is the not the same as (*MARK:NAME)(*PRUNE).
-It is like (*MARK:NAME) in that the name is remembered for passing back to the
+The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
+like (*MARK:NAME) in that the name is remembered for passing back to the
 caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
 ignoring those set by (*PRUNE) or (*THEN).
 .sp
@@ -3365,25 +3436,30 @@ in the second repeat of the group acts.
 .SS "Backtracking verbs in assertions"
 .rs
 .sp
-(*FAIL) in an assertion has its normal effect: it forces an immediate
-backtrack.
+(*FAIL) in any assertion has its normal effect: it forces an immediate
+backtrack. The behaviour of the other backtracking verbs depends on whether
+the assertion is standalone or acting as the condition in a conditional
+subpattern.
 .P
-(*ACCEPT) in a positive assertion causes the assertion to succeed without any
-further processing. In a negative assertion, (*ACCEPT) causes the assertion to
-fail without any further processing.
+(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
+without any further processing; captured strings are retained. In a standalone
+negative assertion, (*ACCEPT) causes the assertion to fail without any further
+processing; captured substrings are discarded.
 .P
-The other backtracking verbs are not treated specially if they appear in a
-positive assertion. In particular, (*THEN) skips to the next alternative in the
-innermost enclosing group that has alternations, whether or not this is within
-the assertion.
+If the assertion is a condition, (*ACCEPT) causes the condition to be true for
+a positive assertion and false for a negative one; captured substrings are
+retained in both cases.
 .P
-Negative assertions are, however, different, in order to ensure that changing a
-positive assertion into a negative assertion changes its result. Backtracking
-into (*COMMIT), (*SKIP), or (*PRUNE) causes a negative assertion to be true,
-without considering any further alternative branches in the assertion.
-Backtracking into (*THEN) causes it to skip to the next enclosing alternative
-within the assertion (the normal behaviour), but if the assertion does not have
-such an alternative, (*THEN) behaves like (*PRUNE).
+The effect of (*THEN) is not allowed to escape beyond an assertion. If there
+are no more branches to try, (*THEN) causes a positive assertion to be false,
+and a negative assertion to be true.
+.P
+The other backtracking verbs are not treated specially if they appear in a
+standalone positive assertion. In a conditional positive assertion,
+backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the condition to be
+false. However, for both standalone and conditional negative assertions,
+backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the assertion to be
+true, without considering any further alternative branches.
 .
 .
 .\" HTML <a name="btsub"></a>
@@ -3429,6 +3505,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 20 June 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 12 September 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
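Note on the verb-name hunk above: the termination rule for (*VERB:NAME) names
can be modelled in a few lines of C. This is only a sketch of the rule as the
text states it, not PCRE2's parser. The function name verb_name_end and its
"alt" flag are hypothetical, and the sketch ignores PCRE2_EXTENDED and the
check that faults escapes such as \d.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the ")" that terminates a verb
   name starting at name[0], or -1 if there is none. A non-zero alt models
   PCRE2_ALT_VERBNAMES; alt == 0 models the default, Perl-compatible rule. */
long verb_name_end(const char *name, int alt)
{
    int quoted = 0;                       /* non-zero inside \Q...\E */
    for (size_t i = 0; name[i] != '\0'; i++) {
        if (alt && name[i] == '\\' && name[i + 1] != '\0') {
            if (quoted) {
                /* Inside \Q...\E only \E is special. */
                if (name[i + 1] == 'E') { quoted = 0; i++; }
            } else if (name[i + 1] == 'Q') {
                quoted = 1; i++;
            } else {
                i++;                      /* escaped character, such as \) */
            }
            continue;
        }
        if (name[i] == ')' && !quoted)
            return (long)i;               /* unescaped, unquoted ")" */
    }
    return -1;
}
```

With the default rule the name in "A\)B)" ends at the first ")", whereas with
alt set the escaped parenthesis is part of the name and the name ends at the
final ")".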
diff --git a/doc/pcre2perform.3 b/doc/pcre2perform.3
index ec86fe7..8b49a2a 100644
--- a/doc/pcre2perform.3
+++ b/doc/pcre2perform.3
@@ -1,4 +1,4 @@
-.TH PCRE2PERFORM 3 "02 January 2015" "PCRE2 10.00"
+.TH PCRE2PERFORM 3 "08 April 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 PERFORMANCE"
@@ -12,11 +12,11 @@ of them.
 .rs
 .sp
 Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
-so that most simple patterns do not use much memory. However, there is one case
-where the memory usage of a compiled pattern can be unexpectedly large. If a
-parenthesized subpattern has a quantifier with a minimum greater than 1 and/or
-a limited maximum, the whole subpattern is repeated in the compiled code. For
-example, the pattern
+so that most simple patterns do not use much memory for storing the compiled
+version. However, there is one case where the memory usage of a compiled
+pattern can be unexpectedly large. If a parenthesized subpattern has a
+quantifier with a minimum greater than 1 and/or a limited maximum, the whole
+subpattern is repeated in the compiled code. For example, the pattern
 .sp
   (abc|def){2,4}
 .sp
@@ -34,13 +34,13 @@ example, the very simple pattern
 .sp
   ((ab){1,1000}c){1,3}
 .sp
-uses 51K bytes when compiled using the 8-bit library. When PCRE2 is compiled
-with its default internal pointer size of two bytes, the size limit on a
-compiled pattern is 64K code units in the 8-bit and 16-bit libraries, and this
-is reached with the above pattern if the outer repetition is increased from 3
-to 4. PCRE2 can be compiled to use larger internal pointers and thus handle
-larger compiled patterns, but it is better to try to rewrite your pattern to
-use less memory if you can.
+uses over 50K bytes when compiled using the 8-bit library. When PCRE2 is
+compiled with its default internal pointer size of two bytes, the size limit on
+a compiled pattern is 64K code units in the 8-bit and 16-bit libraries, and
+this is reached with the above pattern if the outer repetition is increased
+from 3 to 4. PCRE2 can be compiled to use larger internal pointers and thus
+handle larger compiled patterns, but it is better to try to rewrite your
+pattern to use less memory if you can.
 .P
 One way of reducing the memory usage for such patterns is to make use of
 PCRE2's
@@ -52,32 +52,35 @@ facility. Re-writing the above pattern as
 .sp
   ((ab)(?2){0,999}c)(?1){0,2}
 .sp
-reduces the memory requirements to 18K, and indeed it remains under 20K even
-with the outer repetition increased to 100. However, this pattern is not
-exactly equivalent, because the "subroutine" calls are treated as
-.\" HTML <a href="pcre2pattern.html#atomicgroup">
-.\" </a>
-atomic groups
-.\"
-into which there can be no backtracking if there is a subsequent matching
-failure. Therefore, PCRE2 cannot do this kind of rewriting automatically.
-Furthermore, there is a noticeable loss of speed when executing the modified
-pattern. Nevertheless, if the atomic grouping is not a problem and the loss of
-speed is acceptable, this kind of rewriting will allow you to process patterns
-that PCRE2 cannot otherwise handle.
+reduces the memory requirements to around 16K, and indeed it remains under 20K
+even with the outer repetition increased to 100. However, this kind of pattern
+is not always exactly equivalent, because any captures within subroutine calls
+are lost when the subroutine completes. If this is not a problem, this kind of
+rewriting will allow you to process patterns that PCRE2 cannot otherwise
+handle. The matching performance of the two different versions of the pattern
+is roughly the same. (This applies from release 10.30 - things were different
+in earlier releases.)
 .
 .
-.SH "STACK USAGE AT RUN TIME"
+.SH "STACK AND HEAP USAGE AT RUN TIME"
 .rs
 .sp
-When \fBpcre2_match()\fP is used for matching, certain kinds of pattern can
-cause it to use large amounts of the process stack. In some environments the
-default process stack is quite small, and if it runs out the result is often
-SIGSEGV. Rewriting your pattern can often help. The
-.\" HREF
-\fBpcre2stack\fP
-.\"
-documentation discusses this issue in detail.
+From release 10.30, the interpretive (non-JIT) version of \fBpcre2_match()\fP
+uses very little system stack at run time. In earlier releases recursive
+function calls could use a great deal of stack, and this could cause problems,
+but this usage has been eliminated. Backtracking positions are now explicitly
+remembered in memory frames controlled by the code. An initial 20K vector of
+frames is allocated on the system stack (enough for about 100 frames for small
+patterns), but if this is insufficient, heap memory is used. The amount of heap
+memory can be limited; if the limit is set to zero, only the initial stack
+vector is used. Rewriting patterns to be time-efficient, as described below,
+may also reduce the memory requirements.
+.P
+In contrast to \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP does use recursive
+function calls, but only for processing atomic groups, lookaround assertions,
+and recursion within the pattern. Too much nested recursion may cause stack
+issues. The "match depth" parameter can be used to limit the depth of function
+recursion in \fBpcre2_dfa_match()\fP.
 .
 .
 .SH "PROCESSING TIME"
@@ -160,7 +163,59 @@ applied to a whole line of "a" characters, whereas the latter takes an
 appreciable time with strings longer than about 20 characters.
 .P
 In many cases, the solution to this kind of performance issue is to use an
-atomic group or a possessive quantifier.
+atomic group or a possessive quantifier. This can often reduce memory
+requirements as well. As another example, consider this pattern:
+.sp
+  ([^<]|<(?!inet))+
+.sp
+It matches from wherever it starts until it encounters "<inet" or the end of
+the data, and is the kind of pattern that might be used when processing an XML
+file. Each iteration of the outer parentheses matches either one character that
+is not "<" or a "<" that is not followed by "inet". However, each time a
+parenthesis is processed, a backtracking position is passed, so this
+formulation uses a memory frame for each matched character. For a long string,
+a lot of memory is required. Consider now this rewritten pattern, which matches
+exactly the same strings:
+.sp
+  ([^<]++|<(?!inet))+
+.sp
+This runs much faster, because sequences of characters that do not contain "<"
+are "swallowed" in one item inside the parentheses, and a possessive quantifier
+is used to stop any backtracking into the runs of non-"<" characters. This
+version also uses a lot less memory because entry to a new set of parentheses
+happens only when a "<" character that is not followed by "inet" is encountered
+(and we assume this is relatively rare).
+.P
+This example shows that one way of optimizing performance when matching long
+subject strings is to write repeated parenthesized subpatterns to match more
+than one character whenever possible.
+.
+.
+.SS "SETTING RESOURCE LIMITS"
+.rs
+.sp
+You can set limits on the amount of processing that takes place when matching,
+and on the amount of heap memory that is used. The default values of the limits
+are very large, and unlikely ever to operate. They can be changed when PCRE2 is
+built, and they can also be set when \fBpcre2_match()\fP or
+\fBpcre2_dfa_match()\fP is called. For details of these interfaces, see the
+.\" HREF
+\fBpcre2build\fP
+.\"
+documentation and the section entitled
+.\" HTML <a href="pcre2api.html#matchcontext">
+.\" </a>
+"The match context"
+.\"
+in the
+.\" HREF
+\fBpcre2api\fP
+.\"
+documentation.
+.P
+The \fBpcre2test\fP test program has a modifier called "find_limits" which, if
+applied to a subject line, causes it to find the smallest limits that allow a
+pattern to match. This is done by repeatedly matching with different limits.
 .
 .
 .SH AUTHOR
@@ -177,6 +232,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 02 January 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 08 April 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
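Note on the ([^<]|<(?!inet))+ example above: the claim that the first form
needs one memory frame per matched character, while the possessive form needs
one per run, can be checked with a small counting model. This is a sketch that
assumes one frame per entry to the outer group, as the text describes. It does
not call PCRE2, and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Models ([^<]|<(?!inet))+ : every iteration of the group consumes exactly
   one character, stopping at "<inet" or at the end of the subject. */
size_t iterations_single(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        if (*s == '<' && strncmp(s + 1, "inet", 4) == 0)
            break;
        s++;
        n++;
    }
    return n;
}

/* Models ([^<]++|<(?!inet))+ : one iteration swallows a whole run of
   non-"<" characters, or a single "<" that is not followed by "inet". */
size_t iterations_possessive(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        if (*s == '<') {
            if (strncmp(s + 1, "inet", 4) == 0)
                break;
            s++;                          /* lone "<" */
        } else {
            s += strcspn(s, "<");         /* whole run of non-"<" */
        }
        n++;
    }
    return n;
}
```

For the subject "abc<b>def<inet>" the first model counts 9 iterations and the
second counts 3, which is the difference in frame usage the text refers to.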
diff --git a/doc/pcre2posix.3 b/doc/pcre2posix.3
index 70a86d8..399e2a8 100644
--- a/doc/pcre2posix.3
+++ b/doc/pcre2posix.3
@@ -1,4 +1,4 @@
-.TH PCRE2POSIX 3 "31 January 2016" "PCRE2 10.22"
+.TH PCRE2POSIX 3 "15 June 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "SYNOPSIS"
@@ -46,7 +46,7 @@ replacement library. Other POSIX options are not even defined.
 .P
 There are also some options that are not defined by POSIX. These have been
 added at the request of users who want to make use of certain PCRE2-specific
-features via the POSIX calling interface.
+features via the POSIX calling interface or to add BSD or GNU functionality.
 .P
 When PCRE2 is called via these functions, it is only the API that is POSIX-like
 in style. The syntax and semantics of the regular expressions themselves are
@@ -68,10 +68,11 @@ identifying error codes.
 .rs
 .sp
 The function \fBregcomp()\fP is called to compile a pattern into an
-internal form. The pattern is a C string terminated by a binary zero, and
-is passed in the argument \fIpattern\fP. The \fIpreg\fP argument is a pointer
-to a \fBregex_t\fP structure that is used as a base for storing information
-about the compiled regular expression.
+internal form. By default, the pattern is a C string terminated by a binary
+zero (but see REG_PEND below). The \fIpreg\fP argument is a pointer to a
+\fBregex_t\fP structure that is used as a base for storing information about
+the compiled regular expression. (It is also used for input when REG_PEND is
+set.)
 .P
 The argument \fIcflags\fP is either zero, or contains one or more of the bits
 defined by the following macros:
@@ -93,6 +94,14 @@ The PCRE2_MULTILINE option is set when the regular expression is passed for
 compilation to the native function. Note that this does \fInot\fP mimic the
 defined POSIX behaviour for REG_NEWLINE (see the following section).
 .sp
+  REG_NOSPEC
+.sp
+The PCRE2_LITERAL option is set when the regular expression is passed for
+compilation to the native function. This disables all meta characters in the
+pattern, causing it to be treated as a literal string. The only other options
+that are allowed with REG_NOSPEC are REG_ICASE, REG_NOSUB, REG_PEND, and
+REG_UTF. Note that REG_NOSPEC is not part of the POSIX standard.
+.sp
   REG_NOSUB
 .sp
 When a pattern that is compiled with this flag is passed to \fBregexec()\fP for
@@ -101,6 +110,16 @@ captured strings are returned. Versions of the PCRE library prior to 10.22 used
 to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
 because it disables the use of back references.
 .sp
+  REG_PEND
+.sp
+If this option is set, the \fBre_endp\fP field in the \fIpreg\fP structure
+(which has the type const char *) must be set to point to the character beyond
+the end of the pattern before calling \fBregcomp()\fP. The pattern itself may
+now contain binary zeroes, which are treated as data characters. Without
+REG_PEND, a binary zero terminates the pattern and the \fBre_endp\fP field is
+ignored. This is a GNU extension to the POSIX standard and should be used with
+caution in software intended to be portable to other systems.
+.sp
   REG_UCP
 .sp
 The PCRE2_UCP option is set when the regular expression is passed for
@@ -130,9 +149,10 @@ newlines are matched by the dot metacharacter (they are not) or by a negative
 class such as [^a] (they are).
 .P
 The yield of \fBregcomp()\fP is zero on success, and non-zero otherwise. The
-\fIpreg\fP structure is filled in on success, and one member of the structure
-is public: \fIre_nsub\fP contains the number of capturing subpatterns in
-the regular expression. Various error codes are defined in the header file.
+\fIpreg\fP structure is filled in on success, and one other member of the
+structure (as well as \fIre_endp\fP) is public: \fIre_nsub\fP contains the
+number of capturing subpatterns in the regular expression. Various error codes
+are defined in the header file.
 .P
 NOTE: If the yield of \fBregcomp()\fP is non-zero, you must not attempt to
 use the contents of the \fIpreg\fP structure. If, for example, you pass it to
@@ -204,15 +224,24 @@ function.
 .sp
   REG_STARTEND
 .sp
-The string is considered to start at \fIstring\fP + \fIpmatch[0].rm_so\fP and
-to have a terminating NUL located at \fIstring\fP + \fIpmatch[0].rm_eo\fP
-(there need not actually be a NUL at that location), regardless of the value of
-\fInmatch\fP. This is a BSD extension, compatible with but not specified by
-IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
-intended to be portable to other systems. Note that a non-zero \fIrm_so\fP does
-not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
-how it is matched. Setting REG_STARTEND and passing \fIpmatch\fP as NULL are
-mutually exclusive; the error REG_INVARG is returned.
+When this option is set, the subject string starts at \fIstring\fP +
+\fIpmatch[0].rm_so\fP and ends at \fIstring\fP + \fIpmatch[0].rm_eo\fP, which
+should point to the first character beyond the string. There may be binary
+zeroes within the subject string, and indeed, using REG_STARTEND is the only
+way to pass a subject string that contains a binary zero.
+.P
+Whatever the value of \fIpmatch[0].rm_so\fP, the offsets of the matched string
+and any captured substrings are still given relative to the start of
+\fIstring\fP itself. (Before PCRE2 release 10.30 these were given relative to
+\fIstring\fP + \fIpmatch[0].rm_so\fP, but this differs from other
+implementations.)
+.P
+This is a BSD extension, compatible with but not specified by IEEE Standard
+1003.2 (POSIX.2), and should be used with caution in software intended to be
+portable to other systems. Note that a non-zero \fIrm_so\fP does not imply
+REG_NOTBOL; REG_STARTEND affects only the location and length of the string,
+not how it is matched. Setting REG_STARTEND and passing \fIpmatch\fP as NULL
+are mutually exclusive; the error REG_INVARG is returned.
 .P
 If the pattern was compiled with the REG_NOSUB flag, no data about any matched
 strings is returned. The \fInmatch\fP and \fIpmatch\fP arguments of
@@ -271,6 +300,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 31 January 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 15 June 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
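Note on the REG_STARTEND hunk above: a minimal sketch of the calling
convention. It is written against the system <regex.h> and assumes a C library
that supports REG_STARTEND (glibc and the BSDs do; some others do not).
Building against PCRE2's wrapper should only need the include changed to
pcre2posix.h, but that is an assumption and has not been verified here. The
helper name match_range is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: matches pattern against the bytes
   subject[start..end) using REG_STARTEND. Returns 0 on a match and stores
   the match offsets, which are relative to subject itself, not to
   subject + start. */
int match_range(const char *pattern, const char *subject,
                size_t start, size_t end, regoff_t *so, regoff_t *eo)
{
    regex_t re;
    regmatch_t pm[1];
    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;                /* preg must not be used after a failure */

    pm[0].rm_so = (regoff_t)start;    /* first byte of the subject */
    pm[0].rm_eo = (regoff_t)end;      /* first byte beyond the subject */
    rc = regexec(&re, subject, 1, pm, REG_STARTEND);
    if (rc == 0) {
        *so = pm[0].rm_so;
        *eo = pm[0].rm_eo;
    }
    regfree(&re);
    return rc;
}
```

Matching "b+" against "aabbbcc" with start 3 and end 7 should report the match
at offsets 3 to 5, counted from the start of the string. This is the behaviour
the text describes for PCRE2 from release 10.30.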
diff --git a/doc/pcre2serialize.3 b/doc/pcre2serialize.3
index 664c1db..5a87cec 100644
--- a/doc/pcre2serialize.3
+++ b/doc/pcre2serialize.3
@@ -1,4 +1,4 @@
-.TH PCRE2SERIALIZE 3 "24 May 2016" "PCRE2 10.22"
+.TH PCRE2SERIALIZE 3 "21 March 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS"
@@ -37,7 +37,10 @@ The facility for saving and restoring compiled patterns is intended for use
 within individual applications. As such, the data supplied to
 \fBpcre2_serialize_decode()\fP is expected to be trusted data, not data from
 arbitrary external sources. There is only some simple consistency checking, not
-complete validation of what is being re-loaded.
+complete validation of what is being re-loaded. Corrupted data may cause
+undefined results. For example, if the length field of a pattern in the
+serialized data is corrupted, the deserializing code may read beyond the end of
+the byte stream that is passed to it.
 .
 .
 .SH "SAVING COMPILED PATTERNS"
@@ -181,6 +184,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 24 May 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 21 March 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2stack.3 b/doc/pcre2stack.3
deleted file mode 100644
index 8711263..0000000
--- a/doc/pcre2stack.3
+++ /dev/null
@@ -1,202 +0,0 @@
-.TH PCRE2STACK 3 "21 November 2014" "PCRE2 10.00"
-.SH NAME
-PCRE2 - Perl-compatible regular expressions (revised API)
-.SH "PCRE2 DISCUSSION OF STACK USAGE"
-.rs
-.sp
-When you call \fBpcre2_match()\fP, it makes use of an internal function called
-\fBmatch()\fP. This calls itself recursively at branch points in the pattern,
-in order to remember the state of the match so that it can back up and try a
-different alternative after a failure. As matching proceeds deeper and deeper
-into the tree of possibilities, the recursion depth increases. The
-\fBmatch()\fP function is also called in other circumstances, for example,
-whenever a parenthesized sub-pattern is entered, and in certain cases of
-repetition.
-.P
-Not all calls of \fBmatch()\fP increase the recursion depth; for an item such
-as a* it may be called several times at the same level, after matching
-different numbers of a's. Furthermore, in a number of cases where the result of
-the recursive call would immediately be passed back as the result of the
-current call (a "tail recursion"), the function is just restarted instead.
-.P
-Each time the internal \fBmatch()\fP function is called recursively, it uses
-memory from the process stack. For certain kinds of pattern and data, very
-large amounts of stack may be needed, despite the recognition of "tail
-recursion". Note that if PCRE2 is compiled with the -fsanitize=address option
-of the GCC compiler, the stack requirements are greatly increased.
-.P
-The above comments apply when \fBpcre2_match()\fP is run in its normal
-interpretive manner. If the compiled pattern was processed by
-\fBpcre2_jit_compile()\fP, and just-in-time compiling was successful, and the
-options passed to \fBpcre2_match()\fP were not incompatible, the matching
-process uses the JIT-compiled code instead of the \fBmatch()\fP function. In
-this case, the memory requirements are handled entirely differently. See the
-.\" HREF
-\fBpcre2jit\fP
-.\"
-documentation for details.
-.P
-The \fBpcre2_dfa_match()\fP function operates in a different way to
-\fBpcre2_match()\fP, and uses recursion only when there is a regular expression
-recursion or subroutine call in the pattern. This includes the processing of
-assertion and "once-only" subpatterns, which are handled like subroutine calls.
-Normally, these are never very deep, and the limit on the complexity of
-\fBpcre2_dfa_match()\fP is controlled by the amount of workspace it is given.
-However, it is possible to write patterns with runaway infinite recursions;
-such patterns will cause \fBpcre2_dfa_match()\fP to run out of stack. At
-present, there is no protection against this.
-.P
-The comments that follow do NOT apply to \fBpcre2_dfa_match()\fP; they are
-relevant only for \fBpcre2_match()\fP without the JIT optimization.
-.
-.
-.SS "Reducing \fBpcre2_match()\fP's stack usage"
-.rs
-.sp
-You can often reduce the amount of recursion, and therefore the
-amount of stack used, by modifying the pattern that is being matched. Consider,
-for example, this pattern:
-.sp
-  ([^<]|<(?!inet))+
-.sp
-It matches from wherever it starts until it encounters "<inet" or the end of
-the data, and is the kind of pattern that might be used when processing an XML
-file. Each iteration of the outer parentheses matches either one character that
-is not "<" or a "<" that is not followed by "inet". However, each time a
-parenthesis is processed, a recursion occurs, so this formulation uses a stack
-frame for each matched character. For a long string, a lot of stack is
-required. Consider now this rewritten pattern, which matches exactly the same
-strings:
-.sp
-  ([^<]++|<(?!inet))+
-.sp
-This uses very much less stack, because runs of characters that do not contain
-"<" are "swallowed" in one item inside the parentheses. Recursion happens only
-when a "<" character that is not followed by "inet" is encountered (and we
-assume this is relatively rare). A possessive quantifier is used to stop any
-backtracking into the runs of non-"<" characters, but that is not related to
-stack usage.
-.P
-This example shows that one way of avoiding stack problems when matching long
-subject strings is to write repeated parenthesized subpatterns to match more
-than one character whenever possible.
-.
-.
-.SS "Compiling PCRE2 to use heap instead of stack for \fBpcre2_match()\fP"
-.rs
-.sp
-In environments where stack memory is constrained, you might want to compile
-PCRE2 to use heap memory instead of stack for remembering back-up points when
-\fBpcre2_match()\fP is running. This makes it run more slowly, however. Details
-of how to do this are given in the
-.\" HREF
-\fBpcre2build\fP
-.\"
-documentation. When built in this way, instead of using the stack, PCRE2
-gets memory for remembering backup points from the heap. By default, the memory
-is obtained by calling the system \fBmalloc()\fP function, but you can arrange
-to supply your own memory management function. For details, see the section
-entitled
-.\" HTML <a href="pcre2api.html#matchcontext">
-.\" </a>
-"The match context"
-.\"
-in the
-.\" HREF
-\fBpcre2api\fP
-.\"
-documentation. Since the block sizes are always the same, it may be possible to
-implement customized a memory handler that is more efficient than the standard
-function. The memory blocks obtained for this purpose are retained and re-used
-if possible while \fBpcre2_match()\fP is running. They are all freed just
-before it exits.
-.
-.
-.SS "Limiting \fBpcre2_match()\fP's stack usage"
-.rs
-.sp
-You can set limits on the number of times the internal \fBmatch()\fP function
-is called, both in total and recursively. If a limit is exceeded,
-\fBpcre2_match()\fP returns an error code. Setting suitable limits should
-prevent it from running out of stack. The default values of the limits are very
-large, and unlikely ever to operate. They can be changed when PCRE2 is built,
-and they can also be set when \fBpcre2_match()\fP is called. For details of
-these interfaces, see the
-.\" HREF
-\fBpcre2build\fP
-.\"
-documentation and the section entitled
-.\" HTML <a href="pcre2api.html#matchcontext">
-.\" </a>
-"The match context"
-.\"
-in the
-.\" HREF
-\fBpcre2api\fP
-.\"
-documentation.
-.P
-As a very rough rule of thumb, you should reckon on about 500 bytes per
-recursion. Thus, if you want to limit your stack usage to 8Mb, you should set
-the limit at 16000 recursions. A 64Mb stack, on the other hand, can support
-around 128000 recursions.
-.P
-The \fBpcre2test\fP test program has a modifier called "find_limits" which, if
-applied to a subject line, causes it to find the smallest limits that allow a a
-pattern to match. This is done by calling \fBpcre2_match()\fP repeatedly with
-different limits.
-.
-.
-.SS "Changing stack size in Unix-like systems"
-.rs
-.sp
-In Unix-like environments, there is not often a problem with the stack unless
-very long strings are involved, though the default limit on stack size varies
-from system to system. Values from 8Mb to 64Mb are common. You can find your
-default limit by running the command:
-.sp
-  ulimit -s
-.sp
-Unfortunately, the effect of running out of stack is often SIGSEGV, though
-sometimes a more explicit error message is given. You can normally increase the
-limit on stack size by code such as this:
-.sp
-  struct rlimit rlim;
-  getrlimit(RLIMIT_STACK, &rlim);
-  rlim.rlim_cur = 100*1024*1024;
-  setrlimit(RLIMIT_STACK, &rlim);
-.sp
-This reads the current limits (soft and hard) using \fBgetrlimit()\fP, then
-attempts to increase the soft limit to 100Mb using \fBsetrlimit()\fP. You must
-do this before calling \fBpcre2_match()\fP.
-.
-.
-.SS "Changing stack size in Mac OS X"
-.rs
-.sp
-Using \fBsetrlimit()\fP, as described above, should also work on Mac OS X. It
-is also possible to set a stack size when linking a program. There is a
-discussion about stack sizes in Mac OS X at this web site:
-.\" HTML <a href="http://developer.apple.com/qa/qa2005/qa1419.html">
-.\" </a>
-http://developer.apple.com/qa/qa2005/qa1419.html.
-.\"
-.
-.
-.SH AUTHOR
-.rs
-.sp
-.nf
-Philip Hazel
-University Computing Service
-Cambridge, England.
-.fi
-.
-.
-.SH REVISION
-.rs
-.sp
-.nf
-Last updated: 21 November 2014
-Copyright (c) 1997-2014 University of Cambridge.
-.fi
diff --git a/doc/pcre2syntax.3 b/doc/pcre2syntax.3
index 8be8b92..6eb0235 100644
--- a/doc/pcre2syntax.3
+++ b/doc/pcre2syntax.3
@@ -1,4 +1,4 @@
-.TH PCRE2SYNTAX 3 "16 October 2015" "PCRE2 10.21"
+.TH PCRE2SYNTAX 3 "17 June 2017" "PCRE2 10.30"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
@@ -407,18 +407,21 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
   (?i)            caseless
   (?J)            allow duplicate names
   (?m)            multiline
+  (?n)            no auto capture
   (?s)            single line (dotall)
   (?U)            default ungreedy (lazy)
-  (?x)            extended (ignore white space)
+  (?x)            extended: ignore white space except in classes
+  (?xx)           as (?x) but also ignore space and tab in classes
   (?-...)         unset option(s)
 .sp
 The following are recognized only at the very start of a pattern or after one
 of the newline or \eR options with similar syntax. More than one of them may
-appear.
+appear. For the first three, d is a decimal number.
 .sp
-  (*LIMIT_MATCH=d) set the match limit to d (decimal number)
-  (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
-  (*NOTEMPTY)     set PCRE2_NOTEMPTY when matching
+  (*LIMIT_DEPTH=d) set the backtracking limit to d
+  (*LIMIT_HEAP=d)  set the heap size limit to d kilobytes
+  (*LIMIT_MATCH=d) set the match limit to d
+  (*NOTEMPTY)      set PCRE2_NOTEMPTY when matching
   (*NOTEMPTY_ATSTART) set PCRE2_NOTEMPTY_ATSTART when matching
   (*NO_AUTO_POSSESS) no auto-possessification (PCRE2_NO_AUTO_POSSESS)
   (*NO_DOTSTAR_ANCHOR) no .* anchoring (PCRE2_NO_DOTSTAR_ANCHOR)
@@ -427,10 +430,11 @@ appear.
   (*UTF)          set appropriate UTF mode for the library in use
   (*UCP)          set PCRE2_UCP (use Unicode properties for \ed etc)
 .sp
-Note that LIMIT_MATCH and LIMIT_RECURSION can only reduce the value of the
-limits set by the caller of pcre2_match(), not increase them. The application
-can lock out the use of (*UTF) and (*UCP) by setting the PCRE2_NEVER_UTF or
-PCRE2_NEVER_UCP options, respectively, at compile time.
+Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce the value of
+the limits set by the caller of \fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP,
+not increase them. LIMIT_RECURSION is an obsolete synonym for LIMIT_DEPTH. The
+application can lock out the use of (*UTF) and (*UCP) by setting the
+PCRE2_NEVER_UTF or PCRE2_NEVER_UCP options, respectively, at compile time.
 .
 .
 .SH "NEWLINE CONVENTION"
@@ -444,6 +448,7 @@ settings with a similar syntax.
   (*CRLF)         carriage return followed by linefeed
   (*ANYCRLF)      all three of the above
   (*ANY)          any Unicode newline sequence
+  (*NUL)          the NUL character (binary zero)
 .
 .
 .SH "WHAT \eR MATCHES"
@@ -473,6 +478,9 @@ Each top-level branch of a look behind must be of a fixed length.
   \en              reference by number (can be ambiguous)
   \egn             reference by number
   \eg{n}           reference by number
+  \eg+n            relative reference by number (PCRE2 extension)
+  \eg-n            relative reference by number
+  \eg{+n}          relative reference by number (PCRE2 extension)
   \eg{-n}          relative reference by number
   \ek<name>        reference by name (Perl)
   \ek'name'        reference by name (Perl)
@@ -511,13 +519,17 @@ Each top-level branch of a look behind must be of a fixed length.
   (?(-n)              relative reference condition
   (?(<name>)          named reference condition (Perl)
   (?('name')          named reference condition (Perl)
-  (?(name)            named reference condition (PCRE2)
+  (?(name)            named reference condition (PCRE2, deprecated)
   (?(R)               overall recursion condition
-  (?(Rn)              specific group recursion condition
-  (?(R&name)          specific recursion condition
+  (?(Rn)              specific numbered group recursion condition
+  (?(R&name)          specific named group recursion condition
   (?(DEFINE)          define subpattern for reference
   (?(VERSION[>]=n.m)  test PCRE2 version
   (?(assert)          assertion condition
+.sp
+Note the ambiguity of (?(R) and (?(Rn) which might be named reference
+conditions or recursion tests. Such a condition is interpreted as a reference
+condition if the relevant named group exists.
 .
 .
 .SH "BACKTRACKING CONTROL"
@@ -577,6 +589,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 16 October 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 17 June 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2test.1 b/doc/pcre2test.1
index 2fbf794..ee78792 100644
--- a/doc/pcre2test.1
+++ b/doc/pcre2test.1
@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "06 July 2016" "PCRE 10.22"
+.TH PCRE2TEST 1 "21 December 2017" "PCRE 10.31"
 .SH NAME
 pcre2test - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -29,7 +29,7 @@ subject is processed, and what output is produced.
 .P
 As the original fairly simple PCRE library evolved, it acquired many different
 features, and as a result, the original \fBpcretest\fP program ended up with a
-lot of options in a messy, arcane syntax, for testing all the features. The
+lot of options in a messy, arcane syntax for testing all the features. The
 move to the new PCRE2 API provided an opportunity to re-implement the test
 program as \fBpcre2test\fP, with a cleaner modifier syntax. Nevertheless, there
 are still many obscure modifiers, some of which are specifically designed for
@@ -47,32 +47,64 @@ strings that are encoded in 8-bit, 16-bit, or 32-bit code units. One, two, or
 all three of these libraries may be simultaneously installed. The
 \fBpcre2test\fP program can be used to test all the libraries. However, its own
 input and output are always in 8-bit format. When testing the 16-bit or 32-bit
-libraries, patterns and subject strings are converted to 16- or 32-bit format
-before being passed to the library functions. Results are converted back to
-8-bit code units for output.
+libraries, patterns and subject strings are converted to 16-bit or 32-bit
+format before being passed to the library functions. Results are converted back
+to 8-bit code units for output.
 .P
 In the rest of this document, the names of library functions and structures
 are given in generic form, for example, \fBpcre_compile()\fP. The actual
 names used in the libraries have a suffix _8, _16, or _32, as appropriate.
 .
 .
+.\" HTML <a name="inputencoding"></a>
 .SH "INPUT ENCODING"
 .rs
 .sp
 Input to \fBpcre2test\fP is processed line by line, either by calling the C
-library's \fBfgets()\fP function, or via the \fBlibreadline\fP library (see
-below). The input is processed using using C's string functions, so must not
-contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
-treats any bytes other than newline as data characters. In some Windows
-environments character 26 (hex 1A) causes an immediate end of file, and no
-further data is read.
+library's \fBfgets()\fP function, or via the \fBlibreadline\fP library. In some
+Windows environments character 26 (hex 1A) causes an immediate end of file, and
+no further data is read, so this character should be avoided unless you really
+want that action.
 .P
-For maximum portability, therefore, it is safest to avoid non-printing
-characters in \fBpcre2test\fP input files. There is a facility for specifying
-some or all of a pattern's characters as hexadecimal pairs, thus making it
-possible to include binary zeroes in a pattern for testing purposes. Subject
-lines are processed for backslash escapes, which makes it possible to include
-any data value.
+The input is processed using C's string functions, so must not
+contain binary zeros, even though in Unix-like environments, \fBfgets()\fP
+treats any bytes other than newline as data characters. An error is generated
+if a binary zero is encountered. By default subject lines are processed for
+backslash escapes, which makes it possible to include any data value in strings
+that are passed to the library for matching. For patterns, there is a facility
+for specifying some or all of the 8-bit input characters as hexadecimal pairs,
+which makes it possible to include binary zeros.
+.
+.
+.SS "Input for the 16-bit and 32-bit libraries"
+.rs
+.sp
+When testing the 16-bit or 32-bit libraries, there is a need to be able to
+generate character code points greater than 255 in the strings that are passed
+to the library. For subject lines, backslash escapes can be used. In addition,
+when the \fButf\fP modifier (see
+.\" HTML <a href="#optionmodifiers">
+.\" </a>
+"Setting compilation options"
+.\"
+below) is set, the pattern and any following subject lines are interpreted as
+UTF-8 strings and translated to UTF-16 or UTF-32 as appropriate.
+.P
+For non-UTF testing of wide characters, the \fButf8_input\fP modifier can be
+used. This is mutually exclusive with \fButf\fP, and is allowed only in 16-bit
+or 32-bit mode. It causes the pattern and following subject lines to be treated
+as UTF-8 according to the original definition (RFC 2279), which allows for
+character values up to 0x7fffffff. Each character is placed in one 16-bit or
+32-bit code unit (in the 16-bit case, values greater than 0xffff cause an error
+to occur).
+.P
+UTF-8 (in its original definition) is not capable of encoding values greater
+than 0x7fffffff, but such values can be handled by the 32-bit library. When
+testing this library in non-UTF mode with \fButf8_input\fP set, if any
+character is preceded by the byte 0xff (which is an illegal byte in UTF-8)
+0x80000000 is added to the character's value. This is the only way of passing
+such code points in a pattern string. For subject strings, using an escape
+sequence is preferable.
 .
 .
 .SH "COMMAND LINE OPTIONS"
@@ -93,14 +125,24 @@ If the 32-bit library has been built, this option causes it to be used. If only
 the 32-bit library has been built, this is the default. If the 32-bit library
 has not been built, this option causes an error.
 .TP 10
+\fB-ac\fP
+Behave as if each pattern has the \fBauto_callout\fP modifier, that is, insert
+automatic callouts into every pattern that is compiled.
+.TP 10
+\fB-AC\fP
+As for \fB-ac\fP, but in addition behave as if each subject line has the
+\fBcallout_extra\fP modifier, that is, show additional information from
+callouts.
+.TP 10
 \fB-b\fP
-Behave as if each pattern has the \fB/fullbincode\fP modifier; the full
+Behave as if each pattern has the \fBfullbincode\fP modifier; the full
 internal binary form of the pattern is output after compilation.
 .TP 10
 \fB-C\fP
 Output the version number of the PCRE2 library, and all available information
 about the optional features that are included, and then exit with zero exit
-code. All other options are ignored.
+code. All other options are ignored. If both -C and -LM are present, whichever
+is first is recognized.
 .TP 10
 \fB-C\fP \fIoption\fP
 Output information about a specific build-time option, then exit. This
@@ -114,7 +156,7 @@ following options output the value and set the exit code as indicated:
   linksize   the configured internal link size (2, 3, or 4)
                exit code is set to the link size
   newline    the default newline setting:
-               CR, LF, CRLF, ANYCRLF, or ANY
+               CR, LF, CRLF, ANYCRLF, ANY, or NUL
                exit code is always 0
   bsr        the default setting for what \eR matches:
                ANYCRLF or ANY
@@ -153,13 +195,23 @@ a convenience facility for PCRE2 maintainers.
 Output a brief summary these options and then exit.
 .TP 10
 \fB-i\fP
-Behave as if each pattern has the \fB/info\fP modifier; information about the
+Behave as if each pattern has the \fBinfo\fP modifier; information about the
 compiled pattern is given after compilation.
 .TP 10
 \fB-jit\fP
 Behave as if each pattern line has the \fBjit\fP modifier; after successful
 compilation, each pattern is passed to the just-in-time compiler, if available.
 .TP 10
+\fB-jitverify\fP
+Behave as if each pattern line has the \fBjitverify\fP modifier; after
+successful compilation, each pattern is passed to the just-in-time compiler, if
+available, and the use of JIT is verified.
+.TP 10
+\fB-LM\fP
+List modifiers: write a list of available pattern and subject modifiers to the
+standard output, then exit with zero exit code. All other options are ignored.
+If both -C and -LM are present, whichever is first is recognized.
+.TP 10
 \fB-pattern\fB \fImodifier-list\fP
 Behave as if each pattern line contains the given modifiers.
 .TP 10
@@ -279,8 +331,8 @@ recognized as a newline by default. Without special action the tests would fail
 when PCRE2 is compiled with either CR or CRLF as the default newline.
 .P
 The #newline_default command specifies a list of newline types that are
-acceptable as the default. The types must be one of CR, LF, CRLF, ANYCRLF, or
-ANY (in upper or lower case), for example:
+acceptable as the default. The types must be one of CR, LF, CRLF, ANYCRLF,
+ANY, or NUL (in upper or lower case), for example:
 .sp
   #newline_default LF Any anyCRLF
 .sp
@@ -293,8 +345,9 @@ of the standard test input files.
 .P
 When the POSIX API is being tested there is no way to override the default
 newline convention, though it is possible to set the newline convention from
-within the pattern. A warning is given if the \fBposix\fP modifier is used when
-\fB#newline_default\fP would set a default for the non-POSIX API.
+within the pattern. A warning is given if the \fBposix\fP or \fBposix_nosub\fP
+modifier is used when \fB#newline_default\fP would set a default for the
+non-POSIX API.
 .sp
   #pattern <modifier-list>
 .sp
@@ -400,8 +453,9 @@ A pattern can be followed by a modifier list (details below).
 .sp
 Before each subject line is passed to \fBpcre2_match()\fP or
 \fBpcre2_dfa_match()\fP, leading and trailing white space is removed, and the
-line is scanned for backslash escapes. The following provide a means of
-encoding non-printing characters in a visible way:
+line is scanned for backslash escapes, unless the \fBsubject_literal\fP
+modifier was set for the pattern. The following provide a means of encoding
+non-printing characters in a visible way:
 .sp
   \ea         alarm (BEL, \ex07)
   \eb         backspace (\ex08)
@@ -463,6 +517,11 @@ character. A backslash followed by anything else causes an error. However, if
 the very last character in the line is a backslash (and there is no modifier
 list), it is ignored. This gives a way of passing an empty line as data, since
 a real empty line terminates the data input.
+.P
+If the \fBsubject_literal\fP modifier is set for a pattern, all subject lines
+that follow are treated as literals, with no special treatment of backslashes.
+No replication is possible, and any subject modifiers must be set as defaults
+by a \fB#subject\fP command.
 .
 .
 .SH "PATTERN MODIFIERS"
@@ -478,31 +537,44 @@ by a previous \fB#pattern\fP command.
 .SS "Setting compilation options"
 .rs
 .sp
-The following modifiers set options for \fBpcre2_compile()\fP. The most common
-ones have single-letter abbreviations. See
+The following modifiers set options for \fBpcre2_compile()\fP. Most of them set
+bits in the options argument of that function, but those whose names start with
+PCRE2_EXTRA are additional options that are set in the compile context. For the
+main options, there are some single-letter abbreviations that are the same as
+Perl options. There is special handling for /x: if a second x is present,
+PCRE2_EXTENDED is converted into PCRE2_EXTENDED_MORE as in Perl. A third
+appearance adds PCRE2_EXTENDED as well, though this makes no difference to the
+way \fBpcre2_compile()\fP behaves. See
 .\" HREF
 \fBpcre2api\fP
 .\"
-for a description of their effects.
+for a description of the effects of these options.
 .sp
       allow_empty_class         set PCRE2_ALLOW_EMPTY_CLASS
+      allow_surrogate_escapes   set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
       alt_bsux                  set PCRE2_ALT_BSUX
       alt_circumflex            set PCRE2_ALT_CIRCUMFLEX
       alt_verbnames             set PCRE2_ALT_VERBNAMES
       anchored                  set PCRE2_ANCHORED
       auto_callout              set PCRE2_AUTO_CALLOUT
+      bad_escape_is_literal     set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
   /i  caseless                  set PCRE2_CASELESS
       dollar_endonly            set PCRE2_DOLLAR_ENDONLY
   /s  dotall                    set PCRE2_DOTALL
       dupnames                  set PCRE2_DUPNAMES
+      endanchored               set PCRE2_ENDANCHORED
   /x  extended                  set PCRE2_EXTENDED
+  /xx extended_more             set PCRE2_EXTENDED_MORE
       firstline                 set PCRE2_FIRSTLINE
+      literal                   set PCRE2_LITERAL
+      match_line                set PCRE2_EXTRA_MATCH_LINE
       match_unset_backref       set PCRE2_MATCH_UNSET_BACKREF
+      match_word                set PCRE2_EXTRA_MATCH_WORD
   /m  multiline                 set PCRE2_MULTILINE
       never_backslash_c         set PCRE2_NEVER_BACKSLASH_C
       never_ucp                 set PCRE2_NEVER_UCP
       never_utf                 set PCRE2_NEVER_UTF
-      no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
+  /n  no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
       no_auto_possess           set PCRE2_NO_AUTO_POSSESS
       no_dotstar_anchor         set PCRE2_NO_DOTSTAR_ANCHOR
       no_start_optimize         set PCRE2_NO_START_OPTIMIZE
@@ -515,7 +587,9 @@ for a description of their effects.
 As well as turning on the PCRE2_UTF option, the \fButf\fP modifier causes all
 non-printing characters in output strings to be printed using the \ex{hh...}
 notation. Otherwise, those less than 0x100 are output in hex without the curly
-brackets.
+brackets. Setting \fButf\fP in 16-bit or 32-bit mode also causes pattern and
+subject strings to be translated to UTF-16 or UTF-32, respectively, before
+being passed to library functions.
 .
 .
 .\" HTML <a name="controlmodifiers"></a>
@@ -523,12 +597,18 @@ brackets.
 .rs
 .sp
 The following modifiers affect the compilation process or request information
-about the pattern:
+about the pattern. There are single-letter abbreviations for some that are
+heavily used in the test files.
 .sp
       bsr=[anycrlf|unicode]     specify \eR handling
   /B  bincode                   show binary code without lengths
       callout_info              show callout information
+      convert=<options>         request foreign pattern conversion
+      convert_glob_escape=c     set glob escape character
+      convert_glob_separator=c  set glob separator character
+      convert_length            set convert buffer length
       debug                     same as info,fullbincode
+      framesize                 show matching frame size
       fullbincode               show binary code with lengths
   /I  info                      show info about compiled pattern
       hex                       unquoted characters are hexadecimal
@@ -546,7 +626,10 @@ about the pattern:
       push                      push compiled pattern onto the stack
       pushcopy                  push a copy onto the stack
       stackguard=<number>       test the stackguard feature
+      subject_literal           treat all subject lines as literal
       tables=[0|1|2]            select internal tables
+      use_length                do not zero-terminate the pattern
+      utf8_input                treat input as UTF-8
 .sp
 The effects of these modifiers are described in the following sections.
 .
@@ -561,7 +644,7 @@ is built, with the default default being Unicode.
 .P
 The \fBnewline\fP modifier specifies which characters are to be interpreted as
 newlines, both in the pattern and in subject lines. The type must be one of CR,
-LF, CRLF, ANYCRLF, or ANY (in upper or lower case).
+LF, CRLF, ANYCRLF, ANY, or NUL (in upper or lower case).
 .
 .
 .SS "Information about a pattern"
@@ -609,6 +692,10 @@ unit" is the last literal code unit that must be present in any match. This is
 not necessarily the last character. These lines are omitted if no starting or
 ending code units are recorded.
 .P
+The \fBframesize\fP modifier shows the size, in bytes, of the storage frames
+used by \fBpcre2_match()\fP for handling backtracking. The size depends on the
+number of capturing parentheses in the pattern.
+.P
 The \fBcallout_info\fP modifier requests information about all the callouts in
 the pattern. A list of them is output at the end of any other information that
 is requested. For each callout, either its number or string is given, followed
@@ -642,12 +729,41 @@ nine characters, only two of which are specified in hexadecimal:
   /ab "literal" 32/hex
 .sp
 Either single or double quotes may be used. There is no way of including
-the delimiter within a substring.
+the delimiter within a substring. The \fBhex\fP and \fBexpand\fP modifiers are
+mutually exclusive.
+.
+.
+.SS "Specifying the pattern's length"
+.rs
+.sp
+By default, patterns are passed to the compiling functions as zero-terminated
+strings but can be passed by length instead of being zero-terminated. The
+\fBuse_length\fP modifier causes this to happen. Using a length happens
+automatically (whether or not \fBuse_length\fP is set) when \fBhex\fP is set,
+because patterns specified in hexadecimal may contain binary zeros.
 .P
-By default, \fBpcre2test\fP passes patterns as zero-terminated strings to
-\fBpcre2_compile()\fP, giving the length as PCRE2_ZERO_TERMINATED. However, for
-patterns specified with the \fBhex\fP modifier, the actual length of the
-pattern is passed.
+If \fBhex\fP or \fBuse_length\fP is used with the POSIX wrapper API (see
+.\" HTML <a href="#posixwrapper">
+.\" </a>
+"Using the POSIX wrapper API"
+.\"
+below), the REG_PEND extension is used to pass the pattern's length.
+.
+.
+.SS "Specifying wide characters in 16-bit and 32-bit modes"
+.rs
+.sp
+In 16-bit and 32-bit modes, all input is automatically treated as UTF-8 and
+translated to UTF-16 or UTF-32 when the \fButf\fP modifier is set. For testing
+the 16-bit and 32-bit libraries in non-UTF mode, the \fButf8_input\fP modifier
+can be used. It is mutually exclusive with \fButf\fP. Input lines are
+interpreted as UTF-8 as a means of specifying wide characters. More details are
+given in
+.\" HTML <a href="#inputencoding">
+.\" </a>
+"Input encoding"
+.\"
+above.
 .
 .
 .SS "Generating long repetitive patterns"
@@ -665,7 +781,8 @@ are expanded before the pattern is passed to \fBpcre2_compile()\fP. For
 example, \e[AB]{6000} is expanded to "ABAB..." 6000 times. This construction
 cannot be nested. An initial "\e[" sequence is recognized only if "]{" followed
 by decimal digits and "}" is found later in the pattern. If not, the characters
-remain in the pattern unaltered.
+remain in the pattern unaltered. The \fBexpand\fP and \fBhex\fP modifiers are
+mutually exclusive.
 .P
 If part of an expanded pattern looks like an expansion, but is really part of
 the actual pattern, unwanted expansion can be avoided by giving two values in
@@ -696,7 +813,7 @@ below
 .\"
 for details of how these options are specified for each match attempt.
 .P
-JIT compilation is requested by the \fB/jit\fP pattern modifier, which may
+JIT compilation is requested by the \fBjit\fP pattern modifier, which may
 optionally be followed by an equals sign and a number in the range 0 to 7.
 The three bits that make up the number specify which of the three JIT operating
 modes are to be compiled:
@@ -705,7 +822,7 @@ modes are to be compiled:
   2  compile JIT code for soft partial matching
   4  compile JIT code for hard partial matching
 .sp
-The possible values for the \fB/jit\fP modifier are therefore:
+The possible values for the \fBjit\fP modifier are therefore:
 .sp
   0  disable JIT
   1  normal matching only
@@ -720,7 +837,7 @@ to \fBpcre2_match()\fP with either the PCRE2_PARTIAL_SOFT or the
 PCRE2_PARTIAL_HARD option set. Note that such a call may return a complete
 match; the options enable the possibility of a partial match, but do not
 require it. Note also that if you request JIT compilation only for partial
-matching (for example, /jit=2) but do not set the \fBpartial\fP modifier on a
+matching (for example, jit=2) but do not set the \fBpartial\fP modifier on a
 subject line, that match will not use JIT code because none was compiled for
 non-partial matching.
 .P
@@ -750,14 +867,14 @@ code was actually used in the match.
 .SS "Setting a locale"
 .rs
 .sp
-The \fB/locale\fP modifier must specify the name of a locale, for example:
+The \fBlocale\fP modifier must specify the name of a locale, for example:
 .sp
   /pattern/locale=fr_FR
 .sp
 The given locale is set, \fBpcre2_maketables()\fP is called to build a set of
 character tables for the locale, and this is then passed to
 \fBpcre2_compile()\fP when compiling the regular expression. The same tables
-are used when matching the following subject lines. The \fB/locale\fP modifier
+are used when matching the following subject lines. The \fBlocale\fP modifier
 applies only to the pattern on which it appears, but can be given in a
 \fB#pattern\fP command if a default is needed. Setting a locale and alternate
 character tables are mutually exclusive.
@@ -766,7 +883,7 @@ character tables are mutually exclusive.
 .SS "Showing pattern memory"
 .rs
 .sp
-The \fB/memory\fP modifier causes the size in bytes of the memory used to hold
+The \fBmemory\fP modifier causes the size in bytes of the memory used to hold
 the compiled pattern to be output. This does not include the size of the
 \fBpcre2_code\fP block; it is just the actual compiled data. If the pattern is
 subsequently passed to the JIT compiler, the size of the JIT compiled code is
@@ -797,10 +914,11 @@ causes a compilation error. The default is the largest number a PCRE2_SIZE
 variable can hold (essentially unlimited).
 .
 .
+.\" HTML <a name="posixwrapper"></a>
 .SS "Using the POSIX wrapper API"
 .rs
 .sp
-The \fB/posix\fP and \fBposix_nosub\fP modifiers cause \fBpcre2test\fP to call
+The \fBposix\fP and \fBposix_nosub\fP modifiers cause \fBpcre2test\fP to call
 PCRE2 via the POSIX wrapper API rather than its native API. When
 \fBposix_nosub\fP is used, the POSIX option REG_NOSUB is passed to
 \fBregcomp()\fP. The POSIX wrapper supports only the 8-bit library. Note that
@@ -830,12 +948,16 @@ large buffer is used.
 The \fBaftertext\fP and \fBallaftertext\fP subject modifiers work as described
 below. All other modifiers are either ignored, with a warning message, or cause
 an error.
+.P
+The pattern is passed to \fBregcomp()\fP as a zero-terminated string by
+default, but if the \fBuse_length\fP or \fBhex\fP modifiers are set, the
+REG_PEND extension is used to pass it by length.
 .
 .
 .SS "Testing the stack guard feature"
 .rs
 .sp
-The \fB/stackguard\fP modifier is used to test the use of
+The \fBstackguard\fP modifier is used to test the use of
 \fBpcre2_set_compile_recursion_guard()\fP, a function that is provided to
 enable stack availability to be checked during compilation (see the
 .\" HREF
@@ -852,7 +974,7 @@ be aborted.
 .SS "Using alternative character tables"
 .rs
 .sp
-The value specified for the \fB/tables\fP modifier must be one of the digits 0,
+The value specified for the \fBtables\fP modifier must be one of the digits 0,
 1, or 2. It causes a specific set of built-in character tables to be passed to
 \fBpcre2_compile()\fP. This is used in the PCRE2 tests to check behaviour with
 different character tables. The digit specifies the tables as follows:
@@ -870,17 +992,19 @@ are mutually exclusive.
 .SS "Setting certain match controls"
 .rs
 .sp
-The following modifiers are really subject modifiers, and are described below.
-However, they may be included in a pattern's modifier list, in which case they
-are applied to every subject line that is processed with that pattern. They may
-not appear in \fB#pattern\fP commands. These modifiers do not affect the
-compilation process.
+The following modifiers are really subject modifiers, and are described under
+"Subject Modifiers" below. However, they may be included in a pattern's
+modifier list, in which case they are applied to every subject line that is
+processed with that pattern. These modifiers do not affect the compilation
+process.
 .sp
       aftertext                  show text after match
       allaftertext               show text after captures
       allcaptures                show all captures
       allusedtext                show all consulted text
+      altglobal                  alternative global matching
   /g  global                     global matching
+      jitstack=<n>               set size of JIT stack
       mark                       show mark values
       replace=<string>           specify a replacement string
       startchar                  show starting character when relevant
@@ -893,6 +1017,15 @@ These modifiers may not appear in a \fB#pattern\fP command. If you want them as
 defaults, set them in a \fB#subject\fP command.
 .
 .
+.SS "Specifying literal subject lines"
+.rs
+.sp
+If the \fBsubject_literal\fP modifier is present on a pattern, all the subject
+lines that it matches are taken as literal strings, with no interpretation of
+backslashes. It is not possible to set subject modifiers on such lines, but any
+that are set as defaults by a \fB#subject\fP command are recognized.
+.
+.
 .SS "Saving a compiled pattern"
 .rs
 .sp
@@ -903,7 +1036,9 @@ facility is used when saving compiled patterns to a file, as described in the
 section entitled "Saving and restoring compiled patterns"
 .\" HTML <a href="#saverestore">
 .\" </a>
-below. If \fBpushcopy\fP is used instead of \fBpush\fP, a copy of the compiled
+below.
+.\"
+If \fBpushcopy\fP is used instead of \fBpush\fP, a copy of the compiled
 pattern is stacked, leaving the original as current, ready to match the
 following input lines. This provides a way of testing the
 \fBpcre2_code_copy()\fP function.
@@ -916,6 +1051,39 @@ allowed, does not carry through to any subsequent matching that uses a stacked
 pattern.
 .
 .
+.SS "Testing foreign pattern conversion"
+.rs
+.sp
+The experimental foreign pattern conversion functions in PCRE2 can be tested by
+setting the \fBconvert\fP modifier. Its argument is a colon-separated list of
+options, which set the equivalent option for the \fBpcre2_pattern_convert()\fP
+function:
+.sp
+  glob                    PCRE2_CONVERT_GLOB
+  glob_no_starstar        PCRE2_CONVERT_GLOB_NO_STARSTAR
+  glob_no_wild_separator  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
+  posix_basic             PCRE2_CONVERT_POSIX_BASIC
+  posix_extended          PCRE2_CONVERT_POSIX_EXTENDED
+  unset                   Unset all options
+.sp
+The "unset" value is useful for turning off a default that has been set by a
+\fB#pattern\fP command. When one of these options is set, the input pattern is
+passed to \fBpcre2_pattern_convert()\fP. If the conversion is successful, the
+result is reflected in the output and then passed to \fBpcre2_compile()\fP. The
+normal \fButf\fP and \fBno_utf_check\fP options, if set, cause the
+PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options to be passed to
+\fBpcre2_pattern_convert()\fP.
+.P
+By default, the conversion function is allowed to allocate a buffer for its
+output. However, if the \fBconvert_length\fP modifier is set to a value greater
+than zero, \fBpcre2test\fP passes a buffer of the given length. This makes it
+possible to test the length check.
+.P
+The \fBconvert_glob_escape\fP and \fBconvert_glob_separator\fP modifiers can be
+used to specify the escape and separator characters for glob processing,
+overriding the defaults, which are operating-system dependent.
+.
+.
 .\" HTML <a name="subjectmodifiers"></a>
 .SH "SUBJECT MODIFIERS"
 .rs
@@ -935,6 +1103,7 @@ The following modifiers set options for \fBpcre2_match()\fP or
 for a description of their effects.
 .sp
       anchored                  set PCRE2_ANCHORED
+      endanchored               set PCRE2_ENDANCHORED
       dfa_restart               set PCRE2_DFA_RESTART
       dfa_shortest              set PCRE2_DFA_SHORTEST
       no_jit                    set PCRE2_NO_JIT
@@ -949,11 +1118,27 @@ for a description of their effects.
 The partial matching modifiers are provided with abbreviations because they
 appear frequently in tests.
 .P
-If the \fB/posix\fP modifier was present on the pattern, causing the POSIX
-wrapper API to be used, the only option-setting modifiers that have any effect
-are \fBnotbol\fP, \fBnotempty\fP, and \fBnoteol\fP, causing REG_NOTBOL,
-REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to \fBregexec()\fP.
-The other modifiers are ignored, with a warning message.
+If the \fBposix\fP or \fBposix_nosub\fP modifier was present on the pattern,
+causing the POSIX wrapper API to be used, the only option-setting modifiers
+that have any effect are \fBnotbol\fP, \fBnotempty\fP, and \fBnoteol\fP,
+causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to
+\fBregexec()\fP. The other modifiers are ignored, with a warning message.
+.P
+There is one additional modifier that can be used with the POSIX wrapper. It is
+ignored (with a warning) if used for non-POSIX matching.
+.sp
+      posix_startend=<n>[:<m>]
+.sp
+This causes the subject string to be passed to \fBregexec()\fP using the
+REG_STARTEND option, which uses offsets to specify which part of the string is
+searched. If only one number is given, the end offset is passed as the end of
+the subject string. For more details of REG_STARTEND, see the
+.\" HREF
+\fBpcre2posix\fP
+.\"
+documentation. If the subject string contains binary zeros (coded as escapes
+such as \ex{00} because \fBpcre2test\fP does not support actual binary zeros in
+its input), you must use \fBposix_startend\fP to specify its length.
 .
 .
 .SS "Setting match controls"
@@ -971,23 +1156,28 @@ pattern.
       altglobal                  alternative global matching
       callout_capture            show captures at callout time
       callout_data=<n>           set a value to pass via callouts
+      callout_error=<n>[:<m>]    control callout error
+      callout_extra              show extra callout information
       callout_fail=<n>[:<m>]     control callout failure
+      callout_no_where           do not show position of a callout
       callout_none               do not supply a callout function
       copy=<number or name>      copy captured substring
+      depth_limit=<n>            set a depth limit
       dfa                        use \fBpcre2_dfa_match()\fP
-      find_limits                find match and recursion limits
+      find_limits                find match and depth limits
       get=<number or name>       extract captured substring
       getall                     extract all captured substrings
   /g  global                     global matching
+      heap_limit=<n>             set a limit on heap memory
       jitstack=<n>               set size of JIT stack
       mark                       show mark values
       match_limit=<n>            set a match limit
-      memory                     show memory usage
+      memory                     show heap memory usage
       null_context               match with a NULL context
       offset=<n>                 set starting offset
       offset_limit=<n>           set offset limit
       ovector=<n>                set size of output vector
-      recursion_limit=<n>        set a recursion limit
+      recursion_limit=<n>        obsolete synonym for depth_limit
       replace=<string>           specify a replacement string
       startchar                  show startchar when relevant
       startoffset=<n>            same as offset=<n>
@@ -1063,27 +1253,20 @@ does no capturing); it is ignored, with a warning message, if present.
 .rs
 .sp
 A callout function is supplied when \fBpcre2test\fP calls the library matching
-functions, unless \fBcallout_none\fP is specified. If \fBcallout_capture\fP is
-set, the current captured groups are output when a callout occurs.
-.P
-The \fBcallout_fail\fP modifier can be given one or two numbers. If there is
-only one number, 1 is returned instead of 0 when a callout of that number is
-reached. If two numbers are given, 1 is returned when callout <n> is reached
-for the <m>th time. Note that callouts with string arguments are always given
-the number zero. See "Callouts" below for a description of the output when a
-callout it taken.
-.P
-The \fBcallout_data\fP modifier can be given an unsigned or a negative number.
-This is set as the "user data" that is passed to the matching function, and
-passed back when the callout function is invoked. Any value other than zero is
-used as a return from \fBpcre2test\fP's callout function.
+functions, unless \fBcallout_none\fP is specified. Its behaviour can be
+controlled by various modifiers listed above whose names begin with
+\fBcallout_\fP. Details are given in the section entitled "Callouts"
+.\" HTML <a href="#callouts">
+.\" </a>
+below.
+.\"
 .
 .
 .SS "Finding all matches in a string"
 .rs
 .sp
 Searching for all possible matches within a subject can be requested by the
-\fBglobal\fP or \fB/altglobal\fP modifier. After finding a match, the matching
+\fBglobal\fP or \fBaltglobal\fP modifier. After finding a match, the matching
 function is called again to search the remainder of the subject. The difference
 between \fBglobal\fP and \fBaltglobal\fP is that the former uses the
 \fIstart_offset\fP argument to \fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP
@@ -1198,39 +1381,44 @@ matching provokes an error return ("bad option value") from
 .sp
 The \fBjitstack\fP modifier provides a way of setting the maximum stack size
 that is used by the just-in-time optimization code. It is ignored if JIT
-optimization is not being used. The value is a number of kilobytes. Providing a
-stack that is larger than the default 32K is necessary only for very
-complicated patterns.
+optimization is not being used. The value is a number of kilobytes. Setting
+zero reverts to the default of 32K. Providing a stack that is larger than the
+default is necessary only for very complicated patterns. If \fBjitstack\fP is
+set non-zero on a subject line it overrides any value that was set on the
+pattern.
 .
 .
-.SS "Setting match and recursion limits"
+.SS "Setting heap, match, and depth limits"
 .rs
 .sp
-The \fBmatch_limit\fP and \fBrecursion_limit\fP modifiers set the appropriate
-limits in the match context. These values are ignored when the
+The \fBheap_limit\fP, \fBmatch_limit\fP, and \fBdepth_limit\fP modifiers set
+the appropriate limits in the match context. These values are ignored when the
 \fBfind_limits\fP modifier is specified.
 .
 .
 .SS "Finding minimum limits"
 .rs
 .sp
-If the \fBfind_limits\fP modifier is present, \fBpcre2test\fP calls
-\fBpcre2_match()\fP several times, setting different values in the match
-context via \fBpcre2_set_match_limit()\fP and \fBpcre2_set_recursion_limit()\fP
-until it finds the minimum values for each parameter that allow
-\fBpcre2_match()\fP to complete without error.
+If the \fBfind_limits\fP modifier is present on a subject line, \fBpcre2test\fP
+calls the relevant matching function several times, setting different values in
+the match context via \fBpcre2_set_heap_limit()\fP, \fBpcre2_set_match_limit()\fP,
+or \fBpcre2_set_depth_limit()\fP until it finds the minimum values for each
+parameter that allow the match to complete without error.
 .P
 If JIT is being used, only the match limit is relevant. If DFA matching is
-being used, neither limit is relevant, and this modifier is ignored (with a
-warning message).
+being used, only the depth limit is relevant.
 .P
 The \fImatch_limit\fP number is a measure of the amount of backtracking
 that takes place, and learning the minimum value can be instructive. For most
 simple matches, the number is quite small, but for patterns with very large
 numbers of matching possibilities, it can become large very quickly with
-increasing length of subject string. The \fImatch_limit_recursion\fP number is
-a measure of how much stack (or, if PCRE2 is compiled with NO_RECURSE, how much
-heap) memory is needed to complete the match attempt.
+increasing length of subject string.
+.P
+For non-DFA matching, the minimum \fIdepth_limit\fP number is a measure of how
+much nested backtracking happens (that is, how deeply the pattern's tree is
+searched). In the case of DFA matching, \fIdepth_limit\fP controls the depth of
+recursive calls of the internal function that is used for handling pattern
+recursion, lookaround assertions, and atomic groups.
 .
 .
 .SS "Showing MARK names"
@@ -1247,8 +1435,15 @@ is added to the non-match message.
 .SS "Showing memory usage"
 .rs
 .sp
-The \fBmemory\fP modifier causes \fBpcre2test\fP to log all memory allocation
-and freeing calls that occur during a match operation.
+The \fBmemory\fP modifier causes \fBpcre2test\fP to log the sizes of all heap
+memory allocation and freeing calls that occur during a call to
+\fBpcre2_match()\fP. These occur only when a match requires a bigger vector
+than the default for remembering backtracking points. In many cases there will
+be no heap memory used and therefore no additional output. No heap memory is
+allocated during matching with \fBpcre2_dfa_match()\fP or with JIT, so in those
+cases the \fBmemory\fP modifier never has any effect. For this modifier to
+work, the \fBnull_context\fP modifier must not be set on both the pattern and
+the subject, though it can be set on one or the other.
 .
 .
 .SS "Setting a starting offset"
@@ -1291,8 +1486,8 @@ pair of offsets.)
 By default, the subject string is passed to a native API matching function with
 its correct length. In order to test the facility for passing a zero-terminated
 string, the \fBzero_terminate\fP modifier is provided. It causes the length to
-be passed as PCRE2_ZERO_TERMINATED. (When matching via the POSIX interface,
-this modifier has no effect, as there is no facility for passing a length.)
+be passed as PCRE2_ZERO_TERMINATED. When matching via the POSIX interface,
+this modifier is ignored, with a warning.
 .P
 When testing \fBpcre2_substitute()\fP, this modifier also has the effect of
 passing the replacement string as zero-terminated.
@@ -1349,7 +1544,7 @@ code unit offset of the start of the failing character is also output. Here is
 an example of an interactive \fBpcre2test\fP run.
 .sp
   $ pcre2test
-  PCRE2 version 9.00 2014-05-10
+  PCRE2 version 10.22 2016-07-29
 .sp
     re> /^abc(\ed+)/
   data> abc123
@@ -1376,7 +1571,7 @@ unset substring is shown as "<unset>", as for the second data line.
 If the strings contain any non-printing characters, they are output as \exhh
 escapes if the value is less than 256 and UTF mode is not set. Otherwise they
 are output as \ex{hh...} escapes. See below for the definition of non-printing
-characters. If the \fB/aftertext\fP modifier is set, the output for substring
+characters. If the \fBaftertext\fP modifier is set, the output for substring
 0 is followed by the the rest of the subject string, identified by "0+" like
 this:
 .sp
@@ -1470,27 +1665,15 @@ For further information about partial matching, see the
 documentation.
 .
 .
+.\" HTML <a name="callouts"></a>
 .SH CALLOUTS
 .rs
 .sp
 If the pattern contains any callout requests, \fBpcre2test\fP's callout
-function is called during matching unless \fBcallout_none\fP is specified.
-This works with both matching functions.
-.P
-The callout function in \fBpcre2test\fP returns zero (carry on matching) by
-default, but you can use a \fBcallout_fail\fP modifier in a subject line (as
-described above) to change this and other parameters of the callout.
-.P
-Inserting callouts can be helpful when using \fBpcre2test\fP to check
-complicated regular expressions. For further information about callouts, see
-the
-.\" HREF
-\fBpcre2callout\fP
-.\"
-documentation.
-.P
-The output for callouts with numerical arguments and those with string
-arguments is slightly different.
+function is called during matching unless \fBcallout_none\fP is specified. This
+works with both matching functions, and with JIT, though there are some
+differences in behaviour. The output for callouts with numerical arguments and
+those with string arguments is slightly different.
 .
 .
 .SS "Callouts with numerical arguments"
@@ -1511,7 +1694,7 @@ the current position precedes the start position, which can happen if the
 callout is in a lookbehind assertion.
 .P
 Callouts numbered 255 are assumed to be automatic callouts, inserted as a
-result of the \fB/auto_callout\fP pattern modifier. In this case, instead of
+result of the \fBauto_callout\fP pattern modifier. In this case, instead of
 showing the callout number, the offset in the pattern, preceded by a plus, is
 output. For example:
 .sp
@@ -1564,6 +1747,103 @@ example:
 .sp
 .
 .
+.SS "Callout modifiers"
+.rs
+.sp
+The callout function in \fBpcre2test\fP returns zero (carry on matching) by
+default, but you can use a \fBcallout_fail\fP modifier in a subject line to
+change this and other parameters of the callout (see below).
+.P
+If the \fBcallout_capture\fP modifier is set, the current captured groups are
+output when a callout occurs. This is useful only for non-DFA matching, as
+\fBpcre2_dfa_match()\fP does not support capturing, so no captures are ever
+shown.
+.P
+The normal callout output, showing the callout number or pattern offset (as
+described above), is suppressed if the \fBcallout_no_where\fP modifier is set.
+.P
+When using the interpretive matching function \fBpcre2_match()\fP without JIT,
+setting the \fBcallout_extra\fP modifier causes additional output from
+\fBpcre2test\fP's callout function to be generated. For the first callout in a
+match attempt at a new starting position in the subject, "New match attempt" is
+output. If there has been a backtrack since the last callout (or start of
+matching if this is the first callout), "Backtrack" is output, followed by "No
+other matching paths" if the backtrack ended the previous match attempt. For
+example:
+.sp
+   re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
+  data> aac\e=callout_extra
+  New match attempt
+  --->aac
+   +0 ^       (
+   +1 ^       a+
+   +3 ^ ^     )
+   +4 ^ ^     b
+  Backtrack
+  --->aac
+   +3 ^^      )
+   +4 ^^      b
+  Backtrack
+  No other matching paths
+  New match attempt
+  --->aac
+   +0  ^      (
+   +1  ^      a+
+   +3  ^^     )
+   +4  ^^     b
+  Backtrack
+  No other matching paths
+  New match attempt
+  --->aac
+   +0   ^     (
+   +1   ^     a+
+  Backtrack
+  No other matching paths
+  New match attempt
+  --->aac
+   +0    ^    (
+   +1    ^    a+
+  No match
+.sp
+Notice that various optimizations must be turned off if you want all possible
+matching paths to be scanned. If \fBno_start_optimize\fP is not used, there is
+an immediate "no match", without any callouts, because the starting
+optimization fails to find "b" in the subject, which it knows must be present
+for any match. If \fBno_auto_possess\fP is not used, the "a+" item is turned
+into "a++", which reduces the number of backtracks.
+.P
+The \fBcallout_extra\fP modifier has no effect if used with the DFA matching
+function, or with JIT.
+.
+.
+.SS "Return values from callouts"
+.rs
+.sp
+The default return from the callout function is zero, which allows matching to
+continue. The \fBcallout_fail\fP modifier can be given one or two numbers. If
+there is only one number, 1 is returned instead of 0 (causing matching to
+backtrack) when a callout of that number is reached. If two numbers (<n>:<m>)
+are given, 1 is returned when callout <n> is reached and there have been at
+least <m> callouts. The \fBcallout_error\fP modifier is similar, except that
+PCRE2_ERROR_CALLOUT is returned, causing the entire matching process to be
+aborted. If both these modifiers are set for the same callout number,
+\fBcallout_error\fP takes precedence. Note that callouts with string arguments
+are always given the number zero.
+.P
+The \fBcallout_data\fP modifier can be given an unsigned or a negative number.
+This is set as the "user data" that is passed to the matching function, and
+passed back when the callout function is invoked. Any value other than zero is
+used as a return from \fBpcre2test\fP's callout function.
+.P
+Inserting callouts can be helpful when using \fBpcre2test\fP to check
+complicated regular expressions. For further information about callouts, see
+the
+.\" HREF
+\fBpcre2callout\fP
+.\"
+documentation.
+.
+.
 .
 .SH "NON-PRINTING CHARACTERS"
 .rs
@@ -1574,7 +1854,7 @@ therefore shown as hex escapes.
 .P
 When \fBpcre2test\fP is outputting text that is a matched part of a subject
 string, it behaves in the same way, unless a different locale has been set for
-the pattern (using the \fB/locale\fP modifier). In this case, the
+the pattern (using the \fBlocale\fP modifier). In this case, the
 \fBisprint()\fP function is used to distinguish printing and non-printing
 characters.
 .
@@ -1682,6 +1962,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 06 July 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 21 December 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
diff --git a/doc/pcre2test.txt b/doc/pcre2test.txt
index cfa0baa..93efd24 100644
--- a/doc/pcre2test.txt
+++ b/doc/pcre2test.txt
@@ -26,7 +26,7 @@ SYNOPSIS
 
        As the original fairly simple PCRE library evolved,  it  acquired  many
        different  features,  and  as  a  result, the original pcretest program
-       ended up with a lot of options in a messy, arcane syntax,  for  testing
+       ended up with a lot of options in a messy, arcane  syntax  for  testing
        all the features. The move to the new PCRE2 API provided an opportunity
        to re-implement the test program as pcre2test, with a cleaner  modifier
        syntax.  Nevertheless,  there are still many obscure modifiers, some of
@@ -45,7 +45,7 @@ PCRE2's 8-BIT, 16-BIT AND 32-BIT LIBRARIES
        installed. The pcre2test program can be used to test all the libraries.
        However, its own input and output are  always  in  8-bit  format.  When
        testing  the  16-bit  or 32-bit libraries, patterns and subject strings
-       are converted to 16- or  32-bit  format  before  being  passed  to  the
+       are converted to 16-bit or 32-bit format before  being  passed  to  the
        library  functions.  Results are converted back to 8-bit code units for
        output.
 
@@ -58,45 +58,81 @@ PCRE2's 8-BIT, 16-BIT AND 32-BIT LIBRARIES
 INPUT ENCODING
 
        Input  to  pcre2test is processed line by line, either by calling the C
-       library's fgets() function, or via the libreadline library (see below).
+       library's fgets() function, or via the  libreadline  library.  In  some
+       Windows  environments  character 26 (hex 1A) causes an immediate end of
+       file, and no further data is read, so this character should be  avoided
+       unless you really want that action.
+
        The  input  is  processed using using C's string functions, so must not
-       contain binary zeroes, even though in Unix-like  environments,  fgets()
-       treats any bytes other than newline as data characters. In some Windows
-       environments character 26 (hex 1A) causes an immediate end of file, and
-       no further data is read.
+       contain binary zeros, even though in  Unix-like  environments,  fgets()
+       treats  any  bytes  other  than newline as data characters. An error is
+       generated if a binary zero is encountered. By default subject lines are
+       processed for backslash escapes, which makes it possible to include any
+       data value in strings that are passed to the library for matching.  For
+       patterns,  there  is a facility for specifying some or all of the 8-bit
+       input characters as hexadecimal  pairs,  which  makes  it  possible  to
+       include binary zeros.
+
+   Input for the 16-bit and 32-bit libraries
+
+       When testing the 16-bit or 32-bit libraries, there is a need to be able
+       to generate character code points greater than 255 in the strings  that
+       are  passed to the library. For subject lines, backslash escapes can be
+       used. In addition, when the  utf  modifier  (see  "Setting  compilation
+       options" below) is set, the pattern and any following subject lines are
+       interpreted as UTF-8 strings and translated  to  UTF-16  or  UTF-32  as
+       appropriate.
 
-       For  maximum portability, therefore, it is safest to avoid non-printing
-       characters in pcre2test input files. There is a facility for specifying
-       some or all of a pattern's characters as hexadecimal pairs, thus making
-       it possible to include binary zeroes in a pattern for testing purposes.
-       Subject  lines are processed for backslash escapes, which makes it pos-
-       sible to include any data value.
+       For  non-UTF testing of wide characters, the utf8_input modifier can be
+       used. This is mutually exclusive with  utf,  and  is  allowed  only  in
+       16-bit  or  32-bit  mode.  It  causes the pattern and following subject
+       lines to be treated as UTF-8 according to the original definition  (RFC
+       2279), which allows for character values up to 0x7fffffff. Each charac-
+       ter is placed in one 16-bit or 32-bit code unit (in  the  16-bit  case,
+       values greater than 0xffff cause an error to occur).
+
+       UTF-8  (in  its  original definition) is not capable of encoding values
+       greater than 0x7fffffff, but such values can be handled by  the  32-bit
+       library. When testing this library in non-UTF mode with utf8_input set,
+       if any character is preceded by the byte 0xff (which is an illegal byte
+       in  UTF-8)  0x80000000  is  added to the character's value. This is the
+       only way of passing such code points in a pattern string.  For  subject
+       strings, using an escape sequence is preferable.
 
 
 COMMAND LINE OPTIONS
 
        -8        If the 8-bit library has been built, this option causes it to
-                 be  used  (this is the default). If the 8-bit library has not
+                 be used (this is the default). If the 8-bit library  has  not
                  been built, this option causes an error.
 
-       -16       If the 16-bit library has been built, this option  causes  it
-                 to  be  used. If only the 16-bit library has been built, this
-                 is the default. If the 16-bit library  has  not  been  built,
+       -16       If  the  16-bit library has been built, this option causes it
+                 to be used. If only the 16-bit library has been  built,  this
+                 is  the  default.  If  the 16-bit library has not been built,
                  this option causes an error.
 
-       -32       If  the  32-bit library has been built, this option causes it
-                 to be used. If only the 32-bit library has been  built,  this
-                 is  the  default.  If  the 32-bit library has not been built,
+       -32       If the 32-bit library has been built, this option  causes  it
+                 to  be  used. If only the 32-bit library has been built, this
+                 is the default. If the 32-bit library  has  not  been  built,
                  this option causes an error.
 
-       -b        Behave as if each pattern has the /fullbincode modifier;  the
+       -ac       Behave as if each pattern has the auto_callout modifier, that
+                 is, insert automatic callouts into every pattern that is com-
+                 piled.
+
+       -AC       As  for  -ac,  but in addition behave as if each subject line
+                 has the callout_extra  modifier,  that  is,  show  additional
+                 information from callouts.
+
+       -b        Behave  as  if each pattern has the fullbincode modifier; the
                  full internal binary form of the pattern is output after com-
                  pilation.
 
-       -C        Output the version number  of  the  PCRE2  library,  and  all
-                 available  information  about  the optional features that are
-                 included, and then  exit  with  zero  exit  code.  All  other
-                 options are ignored.
+       -C        Output  the  version  number  of  the  PCRE2 library, and all
+                 available information about the optional  features  that  are
+                 included,  and  then  exit  with  zero  exit  code. All other
+                 options are ignored. If both -C and -LM are  present,  which-
+                 ever is first is recognized.
 
        -C option Output  information  about a specific build-time option, then
                  exit. This functionality is intended for use in scripts  such
@@ -110,7 +146,7 @@ COMMAND LINE OPTIONS
                    linksize   the configured internal link size (2, 3, or 4)
                                 exit code is set to the link size
                    newline    the default newline setting:
-                                CR, LF, CRLF, ANYCRLF, or ANY
+                                CR, LF, CRLF, ANYCRLF, ANY, or NUL
                                 exit code is always 0
                    bsr        the default setting for what \R matches:
                                 ANYCRLF or ANY
@@ -147,13 +183,24 @@ COMMAND LINE OPTIONS
 
        -help     Output a brief summary these options and then exit.
 
-       -i        Behave as if each pattern has the /info modifier; information
+       -i        Behave as if each pattern has the info modifier;  information
                  about the compiled pattern is given after compilation.
 
        -jit      Behave  as  if  each pattern line has the jit modifier; after
                  successful compilation, each pattern is passed to  the  just-
                  in-time compiler, if available.
 
+       -jitverify
+                 Behave  as  if  each pattern line has the jitverify modifier;
+                 after successful compilation, each pattern is passed  to  the
+                 just-in-time  compiler,  if  available, and the use of JIT is
+                 verified.
+
+       -LM       List modifiers: write a list of available pattern and subject
+                 modifiers  to  the  standard output, then exit with zero exit
+                 code. All other options are ignored.  If both -C and -LM  are
+                 present, whichever is first is recognized.
+
        -pattern modifier-list
                  Behave as if each pattern line contains the given modifiers.
 
@@ -269,7 +316,7 @@ COMMAND LINES
 
        The #newline_default command specifies a list of newline types that are
        acceptable  as the default. The types must be one of CR, LF, CRLF, ANY-
-       CRLF, or ANY (in upper or lower case), for example:
+       CRLF, ANY, or NUL (in upper or lower case), for example:
 
          #newline_default LF Any anyCRLF
 
@@ -282,9 +329,9 @@ COMMAND LINES
 
        When  the  POSIX  API  is  being tested there is no way to override the
        default newline convention, though it is possible to  set  the  newline
-       convention  from  within  the  pattern. A warning is given if the posix
-       modifier is used when #newline_default would set a default for the non-
-       POSIX API.
+       convention  from within the pattern. A warning is given if the posix or
+       posix_nosub modifier is used when #newline_default would set a  default
+       for the non-POSIX API.
 
          #pattern <modifier-list>
 
@@ -387,8 +434,9 @@ SUBJECT LINE SYNTAX
 
        Before   each   subject   line   is   passed   to   pcre2_match()    or
        pcre2_dfa_match(), leading and trailing white space is removed, and the
-       line is scanned for backslash escapes. The following provide a means of
-       encoding non-printing characters in a visible way:
+       line is scanned for backslash escapes, unless the subject_literal modi-
+       fier was set for the pattern. The following provide a means of encoding
+       non-printing characters in a visible way:
 
          \a         alarm (BEL, \x07)
          \b         backspace (\x08)
@@ -405,23 +453,23 @@ SUBJECT LINE SYNTAX
          \x{hh...}  hexadecimal character (any number of hex digits)
 
        The use of \x{hh...} is not dependent on the use of the utf modifier on
-       the pattern. It is recognized always. There may be any number of  hexa-
-       decimal  digits  inside  the  braces; invalid values provoke error mes-
+       the  pattern. It is recognized always. There may be any number of hexa-
+       decimal digits inside the braces; invalid  values  provoke  error  mes-
        sages.
 
-       Note that \xhh specifies one byte rather than one  character  in  UTF-8
-       mode;  this  makes it possible to construct invalid UTF-8 sequences for
-       testing purposes. On the other hand, \x{hh} is interpreted as  a  UTF-8
-       character  in UTF-8 mode, generating more than one byte if the value is
-       greater than 127.  When testing the 8-bit library not  in  UTF-8  mode,
+       Note  that  \xhh  specifies one byte rather than one character in UTF-8
+       mode; this makes it possible to construct invalid UTF-8  sequences  for
+       testing  purposes.  On the other hand, \x{hh} is interpreted as a UTF-8
+       character in UTF-8 mode, generating more than one byte if the value  is
+       greater  than  127.   When testing the 8-bit library not in UTF-8 mode,
        \x{hh} generates one byte for values less than 256, and causes an error
        for greater values.
 
        In UTF-16 mode, all 4-digit \x{hhhh} values are accepted. This makes it
        possible to construct invalid UTF-16 sequences for testing purposes.
 
-       In  UTF-32  mode,  all  4- to 8-digit \x{...} values are accepted. This
-       makes it possible to construct invalid  UTF-32  sequences  for  testing
+       In UTF-32 mode, all 4- to 8-digit \x{...}  values  are  accepted.  This
+       makes  it  possible  to  construct invalid UTF-32 sequences for testing
        purposes.
 
        There is a special backslash sequence that specifies replication of one
@@ -429,33 +477,38 @@ SUBJECT LINE SYNTAX
 
          \[<characters>]{<count>}
 
-       This makes it possible to test long strings without having  to  provide
+       This  makes  it possible to test long strings without having to provide
        them as part of the file. For example:
 
          \[abc]{4}
 
-       is  converted to "abcabcabcabc". This feature does not support nesting.
+       is converted to "abcabcabcabc". This feature does not support  nesting.
        To include a closing square bracket in the characters, code it as \x5D.
 
-       A backslash followed by an equals sign marks the  end  of  the  subject
+       A  backslash  followed  by  an equals sign marks the end of the subject
        string and the start of a modifier list. For example:
 
          abc\=notbol,notempty
 
-       If  the  subject  string is empty and \= is followed by whitespace, the
-       line is treated as a comment line, and is not used  for  matching.  For
+       If the subject string is empty and \= is followed  by  whitespace,  the
+       line  is  treated  as a comment line, and is not used for matching. For
        example:
 
          \= This is a comment.
          abc\= This is an invalid modifier list.
 
-       A  backslash  followed  by  any  other  non-alphanumeric character just
+       A backslash followed  by  any  other  non-alphanumeric  character  just
        escapes that character. A backslash followed by anything else causes an
-       error.  However,  if the very last character in the line is a backslash
-       (and there is no modifier list), it is ignored. This  gives  a  way  of
-       passing  an  empty line as data, since a real empty line terminates the
+       error. However, if the very last character in the line is  a  backslash
+       (and  there  is  no  modifier list), it is ignored. This gives a way of
+       passing an empty line as data, since a real empty line  terminates  the
        data input.
 
+       If the subject_literal modifier is set for a pattern, all subject lines
+       that follow are treated as literals, with no special treatment of back-
+       slashes.  No replication is possible, and any subject modifiers must be
+       set as defaults by a #subject command.
+
 
 PATTERN MODIFIERS
 
@@ -466,28 +519,42 @@ PATTERN MODIFIERS
 
    Setting compilation options
 
-       The  following modifiers set options for pcre2_compile(). The most com-
-       mon ones have single-letter abbreviations. See pcre2api for a  descrip-
-       tion of their effects.
+       The  following  modifiers set options for pcre2_compile(). Most of them
+       set bits in the options argument of  that  function,  but  those  whose
+       names start with PCRE2_EXTRA are additional options that are set in the
+       compile context. For the main options,  there  are  some  single-letter
+       abbreviations  that are the same as Perl options. There is special han-
+       dling for /x: if a second x is  present,  PCRE2_EXTENDED  is  converted
+       into   PCRE2_EXTENDED_MORE   as   in  Perl.  A  third  appearance  adds
+       PCRE2_EXTENDED as well, though this makes  no  difference  to  the  way
+       pcre2_compile()  behaves. See pcre2api for a description of the effects
+       of these options.
 
              allow_empty_class         set PCRE2_ALLOW_EMPTY_CLASS
+             allow_surrogate_escapes   set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
              alt_bsux                  set PCRE2_ALT_BSUX
              alt_circumflex            set PCRE2_ALT_CIRCUMFLEX
              alt_verbnames             set PCRE2_ALT_VERBNAMES
              anchored                  set PCRE2_ANCHORED
              auto_callout              set PCRE2_AUTO_CALLOUT
+             bad_escape_is_literal     set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
          /i  caseless                  set PCRE2_CASELESS
              dollar_endonly            set PCRE2_DOLLAR_ENDONLY
          /s  dotall                    set PCRE2_DOTALL
              dupnames                  set PCRE2_DUPNAMES
+             endanchored               set PCRE2_ENDANCHORED
          /x  extended                  set PCRE2_EXTENDED
+         /xx extended_more             set PCRE2_EXTENDED_MORE
              firstline                 set PCRE2_FIRSTLINE
+             literal                   set PCRE2_LITERAL
+             match_line                set PCRE2_EXTRA_MATCH_LINE
              match_unset_backref       set PCRE2_MATCH_UNSET_BACKREF
+             match_word                set PCRE2_EXTRA_MATCH_WORD
          /m  multiline                 set PCRE2_MULTILINE
              never_backslash_c         set PCRE2_NEVER_BACKSLASH_C
              never_ucp                 set PCRE2_NEVER_UCP
              never_utf                 set PCRE2_NEVER_UTF
-             no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
+         /n  no_auto_capture           set PCRE2_NO_AUTO_CAPTURE
              no_auto_possess           set PCRE2_NO_AUTO_POSSESS
              no_dotstar_anchor         set PCRE2_NO_DOTSTAR_ANCHOR
              no_start_optimize         set PCRE2_NO_START_OPTIMIZE
@@ -498,19 +565,27 @@ PATTERN MODIFIERS
              utf                       set PCRE2_UTF
 
        As well as turning on the PCRE2_UTF option, the utf modifier causes all
-       non-printing characters in output  strings  to  be  printed  using  the
-       \x{hh...}  notation. Otherwise, those less than 0x100 are output in hex
-       without the curly brackets.
+       non-printing  characters  in  output  strings  to  be printed using the
+       \x{hh...} notation. Otherwise, those less than 0x100 are output in  hex
+       without  the  curly brackets. Setting utf in 16-bit or 32-bit mode also
+       causes pattern and subject  strings  to  be  translated  to  UTF-16  or
+       UTF-32, respectively, before being passed to library functions.
 
    Setting compilation controls
 
-       The following modifiers  affect  the  compilation  process  or  request
-       information about the pattern:
+       The  following  modifiers  affect  the  compilation  process or request
+       information about the pattern. There  are  single-letter  abbreviations
+       for some that are heavily used in the test files.
 
              bsr=[anycrlf|unicode]     specify \R handling
          /B  bincode                   show binary code without lengths
              callout_info              show callout information
+             convert=<options>         request foreign pattern conversion
+             convert_glob_escape=c     set glob escape character
+             convert_glob_separator=c  set glob separator character
+             convert_length            set convert buffer length
              debug                     same as info,fullbincode
+             framesize                 show matching frame size
              fullbincode               show binary code with lengths
          /I  info                      show info about compiled pattern
              hex                       unquoted characters are hexadecimal
@@ -528,7 +603,10 @@ PATTERN MODIFIERS
              push                      push compiled pattern onto the stack
              pushcopy                  push a copy onto the stack
              stackguard=<number>       test the stackguard feature
+             subject_literal           treat all subject lines as literal
              tables=[0|1|2]            select internal tables
+             use_length                do not zero-terminate the pattern
+             utf8_input                treat input as UTF-8
 
        The effects of these modifiers are described in the following sections.
 
@@ -541,7 +619,7 @@ PATTERN MODIFIERS
 
        The newline modifier specifies which characters are to  be  interpreted
        as newlines, both in the pattern and in subject lines. The type must be
-       one of CR, LF, CRLF, ANYCRLF, or ANY (in upper or lower case).
+       one of CR, LF, CRLF, ANYCRLF, ANY, or NUL (in upper or lower case).
 
    Information about a pattern
 
@@ -589,6 +667,10 @@ PATTERN MODIFIERS
        last character. These lines are omitted if no starting or  ending  code
        units are recorded.
 
+       The  framesize modifier shows the size, in bytes, of the storage frames
+       used by pcre2_match() for handling backtracking. The  size  depends  on
+       the number of capturing parentheses in the pattern.
+
        The  callout_info  modifier requests information about all the callouts
        in the pattern. A list of them is output at the end of any other infor-
        mation that is requested. For each callout, either its number or string
@@ -619,12 +701,30 @@ PATTERN MODIFIERS
          /ab "literal" 32/hex
 
        Either single or double quotes may be used. There is no way of  includ-
-       ing the delimiter within a substring.
+       ing  the delimiter within a substring. The hex and expand modifiers are
+       mutually exclusive.
+
+   Specifying the pattern's length
+
+       By default, patterns are passed to the compiling functions as zero-ter-
+       minated strings. The use_length modifier causes a pattern to be passed
+       by length instead. Passing by length also happens automatically
+       (whether or not use_length is set) when hex is set, because patterns
+       specified in hexadecimal may contain binary zeros, which would
+       otherwise be taken as the end of the pattern.
+
+       If hex or use_length is used with the POSIX wrapper API (see "Using the
+       POSIX wrapper API" below), the REG_PEND extension is used to  pass  the
+       pattern's length.
+
+   Specifying wide characters in 16-bit and 32-bit modes
 
-       By  default,  pcre2test  passes  patterns as zero-terminated strings to
-       pcre2_compile(), giving the length as  PCRE2_ZERO_TERMINATED.  However,
-       for  patterns specified with the hex modifier, the actual length of the
-       pattern is passed.
+       In 16-bit and 32-bit modes, all input is automatically treated as UTF-8
+       and translated to UTF-16 or UTF-32 when the utf modifier is set. For
+       testing the 16-bit and 32-bit libraries in non-UTF mode, the utf8_input
+       modifier can be used instead. It is mutually exclusive with utf. When
+       utf8_input is set, input lines are interpreted as UTF-8 as a means of
+       specifying wide characters. See "Input encoding" above for details.
 
    Generating long repetitive patterns
 
@@ -640,38 +740,39 @@ PATTERN MODIFIERS
        ple, \[AB]{6000} is expanded to "ABAB..." 6000 times. This construction
        cannot be nested. An initial "\[" sequence is recognized only  if  "]{"
        followed  by  decimal  digits and "}" is found later in the pattern. If
-       not, the characters remain in the pattern unaltered.
+       not, the characters remain in the pattern unaltered. The expand and hex
+       modifiers are mutually exclusive.
 
-       If part of an expanded pattern looks like an expansion, but  is  really
+       If  part  of an expanded pattern looks like an expansion, but is really
        part of the actual pattern, unwanted expansion can be avoided by giving
        two values in the quantifier. For example, \[AB]{6000,6000} is not rec-
        ognized as an expansion item.
 
-       If  the  info modifier is set on an expanded pattern, the result of the
+       If the info modifier is set on an expanded pattern, the result  of  the
        expansion is included in the information that is output.
 
    JIT compilation
 
-       Just-in-time (JIT) compiling is a  heavyweight  optimization  that  can
-       greatly  speed  up pattern matching. See the pcre2jit documentation for
-       details. JIT compiling happens, optionally, after a  pattern  has  been
-       successfully  compiled into an internal form. The JIT compiler converts
+       Just-in-time  (JIT)  compiling  is  a heavyweight optimization that can
+       greatly speed up pattern matching. See the pcre2jit  documentation  for
+       details.  JIT  compiling  happens, optionally, after a pattern has been
+       successfully compiled into an internal form. The JIT compiler  converts
        this to optimized machine code. It needs to know whether the match-time
        options PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT are going to be used,
-       because different code is generated for the different  cases.  See  the
-       partial  modifier in "Subject Modifiers" below for details of how these
+       because  different  code  is generated for the different cases. See the
+       partial modifier in "Subject Modifiers" below for details of how  these
        options are specified for each match attempt.
 
-       JIT compilation is requested by the /jit pattern  modifier,  which  may
+       JIT  compilation  is  requested  by the jit pattern modifier, which may
        optionally be followed by an equals sign and a number in the range 0 to
-       7.  The three bits that make up the number specify which of  the  three
+       7.   The  three bits that make up the number specify which of the three
        JIT operating modes are to be compiled:
 
          1  compile JIT code for non-partial matching
          2  compile JIT code for soft partial matching
          4  compile JIT code for hard partial matching
 
-       The possible values for the /jit modifier are therefore:
+       The possible values for the jit modifier are therefore:
 
          0  disable JIT
          1  normal matching only
@@ -681,54 +782,54 @@ PATTERN MODIFIERS
          6  soft and hard partial matching only
          7  all three modes
 
-       If  no  number  is  given,  7 is assumed. The phrase "partial matching"
+       If no number is given, 7 is  assumed.  The  phrase  "partial  matching"
        means a call to pcre2_match() with either the PCRE2_PARTIAL_SOFT or the
-       PCRE2_PARTIAL_HARD  option set. Note that such a call may return a com-
+       PCRE2_PARTIAL_HARD option set. Note that such a call may return a  com-
        plete match; the options enable the possibility of a partial match, but
-       do  not  require it. Note also that if you request JIT compilation only
-       for partial matching (for example, /jit=2) but do not set  the  partial
-       modifier  on  a  subject line, that match will not use JIT code because
+       do not require it. Note also that if you request JIT  compilation  only
+       for  partial  matching  (for example, jit=2) but do not set the partial
+       modifier on a subject line, that match will not use  JIT  code  because
        none was compiled for non-partial matching.
 
-       If JIT compilation is successful, the compiled JIT code will  automati-
-       cally  be  used  when  an appropriate type of match is run, except when
-       incompatible run-time options are specified. For more details, see  the
-       pcre2jit  documentation. See also the jitstack modifier below for a way
+       If  JIT compilation is successful, the compiled JIT code will automati-
+       cally be used when an appropriate type of match  is  run,  except  when
+       incompatible  run-time options are specified. For more details, see the
+       pcre2jit documentation. See also the jitstack modifier below for a  way
        of setting the size of the JIT stack.
 
-       If the jitfast modifier is specified, matching is done  using  the  JIT
-       "fast  path" interface, pcre2_jit_match(), which skips some of the san-
-       ity checks that are done by pcre2_match(), and of course does not  work
-       when  JIT  is not supported. If jitfast is specified without jit, jit=7
+       If  the  jitfast  modifier is specified, matching is done using the JIT
+       "fast path" interface, pcre2_jit_match(), which skips some of the  san-
+       ity  checks that are done by pcre2_match(), and of course does not work
+       when JIT is not supported. If jitfast is specified without  jit,  jit=7
        is assumed.
 
-       If the jitverify modifier is specified, information about the  compiled
-       pattern  shows  whether  JIT  compilation was or was not successful. If
-       jitverify is specified without jit, jit=7 is assumed. If  JIT  compila-
-       tion  is successful when jitverify is set, the text "(JIT)" is added to
+       If  the jitverify modifier is specified, information about the compiled
+       pattern shows whether JIT compilation was or  was  not  successful.  If
+       jitverify  is  specified without jit, jit=7 is assumed. If JIT compila-
+       tion is successful when jitverify is set, the text "(JIT)" is added  to
        the first output line after a match or non match when JIT-compiled code
        was actually used in the match.
 
    Setting a locale
 
-       The /locale modifier must specify the name of a locale, for example:
+       The locale modifier must specify the name of a locale, for example:
 
          /pattern/locale=fr_FR
 
        The given locale is set, pcre2_maketables() is called to build a set of
-       character tables for the locale, and this is then passed to  pcre2_com-
-       pile()  when compiling the regular expression. The same tables are used
-       when matching the following subject lines. The /locale modifier applies
+       character  tables for the locale, and this is then passed to pcre2_com-
+       pile() when compiling the regular expression. The same tables are  used
+       when  matching the following subject lines. The locale modifier applies
        only to the pattern on which it appears, but can be given in a #pattern
-       command if a default is needed. Setting a locale and alternate  charac-
+       command  if a default is needed. Setting a locale and alternate charac-
        ter tables are mutually exclusive.
 
    Showing pattern memory
 
-       The  /memory  modifier  causes  the size in bytes of the memory used to
-       hold the compiled pattern to be output. This does not include the  size
-       of  the  pcre2_code  block; it is just the actual compiled data. If the
-       pattern is subsequently passed to the JIT compiler, the size of the JIT
+       The memory modifier causes the size in bytes of the memory used to hold
+       the  compiled  pattern  to be output. This does not include the size of
+       the pcre2_code block; it is just the actual compiled data. If the  pat-
+       tern  is  subsequently  passed to the JIT compiler, the size of the JIT
        compiled code is also output. Here is an example:
 
            re> /a(b)c/jit,memory
@@ -738,27 +839,27 @@ PATTERN MODIFIERS
 
    Limiting nested parentheses
 
-       The  parens_nest_limit  modifier  sets  a  limit on the depth of nested
-       parentheses in a pattern. Breaching  the  limit  causes  a  compilation
-       error.   The  default  for  the library is set when PCRE2 is built, but
-       pcre2test sets its own default of 220, which is  required  for  running
+       The parens_nest_limit modifier sets a limit  on  the  depth  of  nested
+       parentheses  in  a  pattern.  Breaching  the limit causes a compilation
+       error.  The default for the library is set when  PCRE2  is  built,  but
+       pcre2test  sets  its  own default of 220, which is required for running
        the standard test suite.
 
    Limiting the pattern length
 
-       The  max_pattern_length  modifier  sets  a limit, in code units, to the
+       The max_pattern_length modifier sets a limit, in  code  units,  to  the
        length of pattern that pcre2_compile() will accept. Breaching the limit
-       causes  a  compilation  error.  The  default  is  the  largest number a
+       causes a compilation  error.  The  default  is  the  largest  number  a
        PCRE2_SIZE variable can hold (essentially unlimited).
 
    Using the POSIX wrapper API
 
-       The /posix and posix_nosub modifiers cause pcre2test to call PCRE2  via
-       the  POSIX  wrapper API rather than its native API. When posix_nosub is
-       used, the POSIX option REG_NOSUB is  passed  to  regcomp().  The  POSIX
-       wrapper  supports  only  the 8-bit library. Note that it does not imply
+       The  posix  and posix_nosub modifiers cause pcre2test to call PCRE2 via
+       the POSIX wrapper API rather than its native API. When  posix_nosub  is
+       used,  the  POSIX  option  REG_NOSUB  is passed to regcomp(). The POSIX
+       wrapper supports only the 8-bit library. Note that it  does  not  imply
        POSIX matching semantics; for more detail see the pcre2posix documenta-
-       tion.  The  following  pattern  modifiers set options for the regcomp()
+       tion. The following pattern modifiers set  options  for  the  regcomp()
        function:
 
          caseless           REG_ICASE
@@ -768,35 +869,39 @@ PATTERN MODIFIERS
          ucp                REG_UCP        )   the POSIX standard
          utf                REG_UTF8       )
 
-       The regerror_buffsize modifier specifies a size for  the  error  buffer
-       that  is  passed to regerror() in the event of a compilation error. For
+       The  regerror_buffsize  modifier  specifies a size for the error buffer
+       that is passed to regerror() in the event of a compilation  error.  For
        example:
 
          /abc/posix,regerror_buffsize=20
 
-       This provides a means of testing the behaviour of regerror()  when  the
-       buffer  is  too  small  for the error message. If this modifier has not
+       This  provides  a means of testing the behaviour of regerror() when the
+       buffer is too small for the error message. If  this  modifier  has  not
        been set, a large buffer is used.
 
-       The aftertext and allaftertext  subject  modifiers  work  as  described
-       below.  All other modifiers are either ignored, with a warning message,
+       The  aftertext  and  allaftertext  subject  modifiers work as described
+       below. All other modifiers are either ignored, with a warning  message,
        or cause an error.
 
+       The  pattern  is  passed  to  regcomp()  as a zero-terminated string by
+       default, but if the use_length or hex modifiers are set,  the  REG_PEND
+       extension is used to pass it by length.
+
    Testing the stack guard feature
 
-       The /stackguard modifier is used to  test  the  use  of  pcre2_set_com-
-       pile_recursion_guard(),  a  function  that  is provided to enable stack
-       availability to be checked during compilation (see the  pcre2api  docu-
-       mentation  for  details).  If  the  number specified by the modifier is
+       The  stackguard  modifier  is  used  to  test the use of pcre2_set_com-
+       pile_recursion_guard(), a function that is  provided  to  enable  stack
+       availability  to  be checked during compilation (see the pcre2api docu-
+       mentation for details). If the number  specified  by  the  modifier  is
        greater than zero, pcre2_set_compile_recursion_guard() is called to set
-       up  callback  from pcre2_compile() to a local function. The argument it
-       receives is the current nesting parenthesis depth; if this  is  greater
+       up a callback from pcre2_compile() to a local function. The argument it
+       receives  is  the current nesting parenthesis depth; if this is greater
        than the value given by the modifier, non-zero is returned, causing the
        compilation to be aborted.
 
    Using alternative character tables
 
-       The value specified for the /tables modifier must be one of the  digits
+       The  value  specified for the tables modifier must be one of the digits
        0, 1, or 2. It causes a specific set of built-in character tables to be
        passed to pcre2_compile(). This is used in the PCRE2 tests to check be-
        haviour with different character tables. The digit specifies the tables
@@ -807,23 +912,25 @@ PATTERN MODIFIERS
                pcre2_chartables.c.dist
          2   a set of tables defining ISO 8859 characters
 
-       In table 2, some characters whose codes are greater than 128 are  iden-
-       tified  as  letters,  digits,  spaces, etc. Setting alternate character
+       In  table 2, some characters whose codes are greater than 128 are iden-
+       tified as letters, digits, spaces,  etc.  Setting  alternate  character
        tables and a locale are mutually exclusive.
 
    Setting certain match controls
 
        The following modifiers are really subject modifiers, and are described
-       below.   However, they may be included in a pattern's modifier list, in
-       which case they are applied to every subject  line  that  is  processed
-       with that pattern. They may not appear in #pattern commands. These mod-
-       ifiers do not affect the compilation process.
+       under "Subject Modifiers" below. However, they may  be  included  in  a
+       pattern's  modifier  list, in which case they are applied to every sub-
+       ject line that is processed with that pattern. These modifiers  do  not
+       affect the compilation process.
 
              aftertext                  show text after match
              allaftertext               show text after captures
              allcaptures                show all captures
              allusedtext                show all consulted text
+             altglobal                  alternative global matching
          /g  global                     global matching
+             jitstack=<n>               set size of JIT stack
              mark                       show mark values
              replace=<string>           specify a replacement string
              startchar                  show starting character when relevant
@@ -832,26 +939,65 @@ PATTERN MODIFIERS
              substitute_unknown_unset   use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
              substitute_unset_empty     use PCRE2_SUBSTITUTE_UNSET_EMPTY
 
-       These modifiers may not appear in a #pattern command. If you want  them
+       These  modifiers may not appear in a #pattern command. If you want them
        as defaults, set them in a #subject command.
 
+   Specifying literal subject lines
+
+       If the subject_literal modifier is present on a pattern, all  the  sub-
+       ject lines that it matches are taken as literal strings, with no inter-
+       pretation of backslashes. It is not possible to set  subject  modifiers
+       on  such  lines, but any that are set as defaults by a #subject command
+       are recognized.
+
    Saving a compiled pattern
 
-       When  a  pattern with the push modifier is successfully compiled, it is
-       pushed onto a stack of compiled patterns,  and  pcre2test  expects  the
-       next  line to contain a new pattern (or a command) instead of a subject
+       When a pattern with the push modifier is successfully compiled,  it  is
+       pushed  onto  a  stack  of compiled patterns, and pcre2test expects the
+       next line to contain a new pattern (or a command) instead of a  subject
        line. This facility is used when saving compiled patterns to a file, as
-       described  in  the section entitled "Saving and restoring compiled pat-
-       terns" below. If pushcopy is used instead of push, a copy of  the  com-
-       piled  pattern  is  stacked,  leaving the original as current, ready to
-       match the following input lines. This provides a  way  of  testing  the
-       pcre2_code_copy()  function.   The  push  and  pushcopy   modifiers are
-       incompatible with compilation modifiers such  as  global  that  act  at
-       match  time. Any that are specified are ignored (for the stacked copy),
+       described in the section entitled "Saving and restoring  compiled  pat-
+       terns"  below.  If pushcopy is used instead of push, a copy of the com-
+       piled pattern is stacked, leaving the original  as  current,  ready  to
+       match  the  following  input  lines. This provides a way of testing the
+       pcre2_code_copy() function.   The  push  and  pushcopy   modifiers  are
+       incompatible  with  compilation  modifiers  such  as global that act at
+       match time. Any that are specified are ignored (for the stacked  copy),
        with a warning message, except for replace, which causes an error. Note
-       that  jitverify, which is allowed, does not carry through to any subse-
+       that jitverify, which is allowed, does not carry through to any  subse-
        quent matching that uses a stacked pattern.
 
+   Testing foreign pattern conversion
+
+       The  experimental  foreign pattern conversion functions in PCRE2 can be
+       tested by setting the convert modifier. Its argument is  a  colon-sepa-
+       rated  list  of  options,  which  set  the  equivalent  option  for the
+       pcre2_pattern_convert() function:
+
+         glob                    PCRE2_CONVERT_GLOB
+         glob_no_starstar        PCRE2_CONVERT_GLOB_NO_STARSTAR
+         glob_no_wild_separator  PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
+         posix_basic             PCRE2_CONVERT_POSIX_BASIC
+         posix_extended          PCRE2_CONVERT_POSIX_EXTENDED
+         unset                   Unset all options
+
+       The "unset" value is useful for turning off a default that has been set
+       by a #pattern command. When one of these options is set, the input pat-
+       tern is passed to pcre2_pattern_convert(). If the  conversion  is  suc-
+       cessful,  the  result  is  reflected  in  the output and then passed to
+       pcre2_compile(). The normal utf and no_utf_check options, if set, cause
+       the  PCRE2_CONVERT_UTF  and  PCRE2_CONVERT_NO_UTF_CHECK  options  to be
+       passed to pcre2_pattern_convert().
+
+       By default, the conversion function is allowed to allocate a buffer for
+       its  output.  However, if the convert_length modifier is set to a value
+       greater than zero, pcre2test passes a buffer of the given length.  This
+       makes it possible to test the length check.
+
+       The  convert_glob_escape  and  convert_glob_separator  modifiers can be
+       used to specify the escape and separator characters for  glob  process-
+       ing, overriding the defaults, which are operating-system dependent.
+
 
 SUBJECT MODIFIERS
 
@@ -860,10 +1006,11 @@ SUBJECT MODIFIERS
 
    Setting match options
 
-       The    following   modifiers   set   options   for   pcre2_match()   or
+       The   following   modifiers   set   options   for   pcre2_match()    or
        pcre2_dfa_match(). See pcre2api for a description of their effects.
 
              anchored                  set PCRE2_ANCHORED
+             endanchored               set PCRE2_ENDANCHORED
              dfa_restart               set PCRE2_DFA_RESTART
              dfa_shortest              set PCRE2_DFA_SHORTEST
              no_jit                    set PCRE2_NO_JIT
@@ -875,20 +1022,34 @@ SUBJECT MODIFIERS
              partial_hard (or ph)      set PCRE2_PARTIAL_HARD
              partial_soft (or ps)      set PCRE2_PARTIAL_SOFT
 
-       The partial matching modifiers are provided with abbreviations  because
+       The  partial matching modifiers are provided with abbreviations because
        they appear frequently in tests.
 
-       If  the  /posix  modifier was present on the pattern, causing the POSIX
-       wrapper API to be used, the only option-setting modifiers that have any
-       effect   are   notbol,   notempty,   and  noteol,  causing  REG_NOTBOL,
-       REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to  regexec().
-       The other modifiers are ignored, with a warning message.
+       If the posix or posix_nosub modifier was present on the pattern,  caus-
+       ing the POSIX wrapper API to be used, the only option-setting modifiers
+       that have any effect are notbol, notempty, and noteol, causing REG_NOT-
+       BOL,  REG_NOTEMPTY,  and  REG_NOTEOL,  respectively,  to  be  passed to
+       regexec(). The other modifiers are ignored, with a warning message.
+
+       There is one additional modifier that can be used with the POSIX  wrap-
+       per. It is ignored (with a warning) if used for non-POSIX matching.
+
+             posix_startend=<n>[:<m>]
+
+       This  causes  the  subject  string  to be passed to regexec() using the
+       REG_STARTEND option, which uses offsets to specify which  part  of  the
+       string  is  searched.  If  only  one number is given, the end offset is
+       passed as the end of the subject string. For more detail  of  REG_STAR-
+       TEND,  see the pcre2posix documentation. If the subject string contains
+       binary zeros (coded as escapes such as \x{00}  because  pcre2test  does
+       not support actual binary zeros in its input), you must use posix_star-
+       tend to specify its length.
 
    Setting match controls
 
-       The  following  modifiers  affect the matching process or request addi-
-       tional information. Some of them may also be  specified  on  a  pattern
-       line  (see  above), in which case they apply to every subject line that
+       The following modifiers affect the matching process  or  request  addi-
+       tional  information.  Some  of  them may also be specified on a pattern
+       line (see above), in which case they apply to every subject  line  that
        is matched against that pattern.
 
              aftertext                  show text after match
@@ -898,23 +1059,28 @@ SUBJECT MODIFIERS
              altglobal                  alternative global matching
              callout_capture            show captures at callout time
              callout_data=<n>           set a value to pass via callouts
+             callout_error=<n>[:<m>]    control callout error
+             callout_extra              show extra callout information
              callout_fail=<n>[:<m>]     control callout failure
+             callout_no_where           do not show position of a callout
              callout_none               do not supply a callout function
              copy=<number or name>      copy captured substring
+             depth_limit=<n>            set a depth limit
              dfa                        use pcre2_dfa_match()
-             find_limits                find match and recursion limits
+             find_limits                find match and depth limits
              get=<number or name>       extract captured substring
              getall                     extract all captured substrings
          /g  global                     global matching
+             heap_limit=<n>             set a limit on heap memory
              jitstack=<n>               set size of JIT stack
              mark                       show mark values
              match_limit=<n>            set a match limit
-             memory                     show memory usage
+             memory                     show heap memory usage
              null_context               match with a NULL context
              offset=<n>                 set starting offset
              offset_limit=<n>           set offset limit
              ovector=<n>                set size of output vector
-             recursion_limit=<n>        set a recursion limit
+             recursion_limit=<n>        obsolete synonym for depth_limit
              replace=<string>           specify a replacement string
              startchar                  show startchar when relevant
              startoffset=<n>            same as offset=<n>
@@ -925,29 +1091,29 @@ SUBJECT MODIFIERS
              zero_terminate             pass the subject as zero-terminated
 
        The effects of these modifiers are described in the following sections.
-       When  matching  via the POSIX wrapper API, the aftertext, allaftertext,
-       and ovector subject modifiers work as described below. All other  modi-
+       When matching via the POSIX wrapper API, the  aftertext,  allaftertext,
+       and  ovector subject modifiers work as described below. All other modi-
        fiers are either ignored, with a warning message, or cause an error.
 
    Showing more text
 
-       The  aftertext modifier requests that as well as outputting the part of
+       The aftertext modifier requests that as well as outputting the part  of
        the subject string that matched the entire pattern, pcre2test should in
        addition output the remainder of the subject string. This is useful for
        tests where the subject contains multiple copies of the same substring.
-       The  allaftertext  modifier  requests the same action for captured sub-
+       The allaftertext modifier requests the same action  for  captured  sub-
        strings as well as the main matched substring. In each case the remain-
        der is output on the following line with a plus character following the
        capture number.
 
-       The allusedtext modifier requests that all the text that was  consulted
-       during  a  successful pattern match by the interpreter should be shown.
-       This feature is not supported for JIT matching, and if  requested  with
-       JIT  it  is  ignored  (with  a  warning message). Setting this modifier
+       The  allusedtext modifier requests that all the text that was consulted
+       during a successful pattern match by the interpreter should  be  shown.
+       This  feature  is not supported for JIT matching, and if requested with
+       JIT it is ignored (with  a  warning  message).  Setting  this  modifier
        affects the output if there is a lookbehind at the start of a match, or
-       a  lookahead  at  the  end, or if \K is used in the pattern. Characters
-       that precede or follow the start and end of the actual match are  indi-
-       cated  in  the output by '<' or '>' characters underneath them. Here is
+       a lookahead at the end, or if \K is used  in  the  pattern.  Characters
+       that  precede or follow the start and end of the actual match are indi-
+       cated in the output by '<' or '>' characters underneath them.  Here  is
        an example:
 
            re> /(?<=pqr)abc(?=xyz)/
@@ -955,16 +1121,16 @@ SUBJECT MODIFIERS
           0: pqrabcxyz
              <<<   >>>
 
-       This shows that the matched string is "abc",  with  the  preceding  and
-       following  strings  "pqr"  and  "xyz"  having been consulted during the
+       This  shows  that  the  matched string is "abc", with the preceding and
+       following strings "pqr" and "xyz"  having  been  consulted  during  the
        match (when processing the assertions).
 
-       The startchar modifier requests that the  starting  character  for  the
-       match  be  indicated,  if  it  is different to the start of the matched
+       The  startchar  modifier  requests  that the starting character for the
+       match be indicated, if it is different to  the  start  of  the  matched
        string. The only time when this occurs is when \K has been processed as
        part of the match. In this situation, the output for the matched string
-       is displayed from the starting character  instead  of  from  the  match
-       point,  with  circumflex  characters  under the earlier characters. For
+       is  displayed  from  the  starting  character instead of from the match
+       point, with circumflex characters under  the  earlier  characters.  For
        example:
 
            re> /abc\Kxyz/
@@ -972,7 +1138,7 @@ SUBJECT MODIFIERS
           0: abcxyz
              ^^^
 
-       Unlike allusedtext, the startchar modifier can be used with JIT.   How-
+       Unlike  allusedtext, the startchar modifier can be used with JIT.  How-
        ever, these two modifiers are mutually exclusive.
 
    Showing the value of all capture groups
@@ -980,90 +1146,78 @@ SUBJECT MODIFIERS
        The allcaptures modifier requests that the values of all potential cap-
        tured parentheses be output after a match. By default, only those up to
        the highest one actually used in the match are output (corresponding to
-       the return code from pcre2_match()). Groups that did not take  part  in
-       the  match  are  output as "<unset>". This modifier is not relevant for
-       DFA matching (which does no capturing); it is ignored, with  a  warning
+       the  return  code from pcre2_match()). Groups that did not take part in
+       the match are output as "<unset>". This modifier is  not  relevant  for
+       DFA  matching  (which does no capturing); it is ignored, with a warning
        message, if present.
 
    Testing callouts
 
-       A  callout function is supplied when pcre2test calls the library match-
-       ing functions, unless callout_none is specified. If callout_capture  is
-       set, the current captured groups are output when a callout occurs.
-
-       The  callout_fail modifier can be given one or two numbers. If there is
-       only one number, 1 is returned instead of 0 when a callout of that num-
-       ber  is  reached.  If two numbers are given, 1 is returned when callout
-       <n> is reached for the <m>th time. Note that callouts with string argu-
-       ments  are  always  given  the  number zero. See "Callouts" below for a
-       description of the output when a callout it taken.
-
-       The callout_data modifier can be given an unsigned or a  negative  num-
-       ber.   This  is  set  as the "user data" that is passed to the matching
-       function, and passed back when the callout  function  is  invoked.  Any
-       value  other  than  zero  is  used as a return from pcre2test's callout
-       function.
+       A callout function is supplied when pcre2test calls the library  match-
+       ing  functions,  unless callout_none is specified. Its behaviour can be
+       controlled by various modifiers listed above  whose  names  begin  with
+       callout_. Details are given in the section entitled "Callouts" below.
 
    Finding all matches in a string
 
        Searching for all possible matches within a subject can be requested by
-       the  global or /altglobal modifier. After finding a match, the matching
-       function is called again to search the remainder of  the  subject.  The
-       difference  between  global  and  altglobal is that the former uses the
-       start_offset argument to pcre2_match() or  pcre2_dfa_match()  to  start
-       searching  at  a new point within the entire string (which is what Perl
+       the global or altglobal modifier. After finding a match,  the  matching
+       function  is  called  again to search the remainder of the subject. The
+       difference between global and altglobal is that  the  former  uses  the
+       start_offset  argument  to  pcre2_match() or pcre2_dfa_match() to start
+       searching at a new point within the entire string (which is  what  Perl
        does), whereas the latter passes over a shortened subject. This makes a
        difference to the matching process if the pattern begins with a lookbe-
        hind assertion (including \b or \B).
 
-       If an empty string  is  matched,  the  next  match  is  done  with  the
+       If  an  empty  string  is  matched,  the  next  match  is done with the
        PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED flags set, in order to search
        for another, non-empty, match at the same point in the subject. If this
-       match  fails,  the  start  offset  is advanced, and the normal match is
-       retried. This imitates the way Perl handles such cases when  using  the
-       /g  modifier  or  the  split()  function. Normally, the start offset is
-       advanced by one character, but if  the  newline  convention  recognizes
-       CRLF  as  a newline, and the current character is CR followed by LF, an
+       match fails, the start offset is advanced,  and  the  normal  match  is
+       retried.  This  imitates the way Perl handles such cases when using the
+       /g modifier or the split() function.  Normally,  the  start  offset  is
+       advanced  by  one  character,  but if the newline convention recognizes
+       CRLF as a newline, and the current character is CR followed by  LF,  an
        advance of two characters occurs.
 
    Testing substring extraction functions
 
-       The copy  and  get  modifiers  can  be  used  to  test  the  pcre2_sub-
+       The  copy  and  get  modifiers  can  be  used  to  test  the pcre2_sub-
        string_copy_xxx() and pcre2_substring_get_xxx() functions.  They can be
-       given more than once, and each can specify a group name or number,  for
+       given  more than once, and each can specify a group name or number, for
        example:
 
           abcd\=copy=1,copy=3,get=G1
 
-       If  the  #subject command is used to set default copy and/or get lists,
-       these can be unset by specifying a negative number to cancel  all  num-
+       If the #subject command is used to set default copy and/or  get  lists,
+       these  can  be unset by specifying a negative number to cancel all num-
        bered groups and an empty name to cancel all named groups.
 
-       The  getall  modifier  tests pcre2_substring_list_get(), which extracts
+       The getall modifier tests  pcre2_substring_list_get(),  which  extracts
        all captured substrings.
 
-       If the subject line is successfully matched, the  substrings  extracted
-       by  the  convenience  functions  are  output  with C, G, or L after the
-       string number instead of a colon. This is in  addition  to  the  normal
-       full  list.  The string length (that is, the return from the extraction
+       If  the  subject line is successfully matched, the substrings extracted
+       by the convenience functions are output with  C,  G,  or  L  after  the
+       string  number  instead  of  a colon. This is in addition to the normal
+       full list. The string length (that is, the return from  the  extraction
        function) is given in parentheses after each substring, followed by the
        name when the extraction was by name.
 
    Testing the substitution function
 
-       If  the  replace  modifier  is  set, the pcre2_substitute() function is
-       called instead of one of the matching functions. Note that  replacement
-       strings  cannot  contain commas, because a comma signifies the end of a
+       If the replace modifier is  set,  the  pcre2_substitute()  function  is
+       called  instead of one of the matching functions. Note that replacement
+       strings cannot contain commas, because a comma signifies the end  of  a
        modifier. This is not thought to be an issue in a test program.
 
-       Unlike subject strings, pcre2test does not process replacement  strings
-       for  escape  sequences. In UTF mode, a replacement string is checked to
-       see if it is a valid UTF-8 string. If so, it is correctly converted  to
-       a  UTF  string of the appropriate code unit width. If it is not a valid
-       UTF-8 string, the individual code units are copied directly. This  pro-
+       Unlike  subject strings, pcre2test does not process replacement strings
+       for escape sequences. In UTF mode, a replacement string is  checked  to
+       see  if it is a valid UTF-8 string. If so, it is correctly converted to
+       a UTF string of the appropriate code unit width. If it is not  a  valid
+       UTF-8  string, the individual code units are copied directly. This pro-
        vides a means of passing an invalid UTF-8 string for testing purposes.
 
-       The  following modifiers set options (in additional to the normal match
+       The following modifiers set options (in addition to  the  normal  match
        options) for pcre2_substitute():
 
          global                      PCRE2_SUBSTITUTE_GLOBAL
@@ -1073,8 +1227,8 @@ SUBJECT MODIFIERS
          substitute_unset_empty      PCRE2_SUBSTITUTE_UNSET_EMPTY
 
 
-       After a successful substitution, the modified string  is  output,  pre-
-       ceded  by the number of replacements. This may be zero if there were no
+       After  a  successful  substitution, the modified string is output, pre-
+       ceded by the number of replacements. This may be zero if there were  no
        matches. Here is a simple example of a substitution test:
 
          /abc/replace=xxx
@@ -1083,12 +1237,12 @@ SUBJECT MODIFIERS
              =abc=abc=\=global
           2: =xxx=xxx=
 
-       Subject and replacement strings should be kept relatively short  (fewer
-       than  256 characters) for substitution tests, as fixed-size buffers are
-       used. To make it easy to test for buffer overflow, if  the  replacement
-       string  starts  with a number in square brackets, that number is passed
-       to pcre2_substitute() as the  size  of  the  output  buffer,  with  the
-       replacement  string  starting at the next character. Here is an example
+       Subject  and replacement strings should be kept relatively short (fewer
+       than 256 characters) for substitution tests, as fixed-size buffers  are
+       used.  To  make it easy to test for buffer overflow, if the replacement
+       string starts with a number in square brackets, that number  is  passed
+       to  pcre2_substitute()  as  the  size  of  the  output buffer, with the
+       replacement string starting at the next character. Here is  an  example
        that tests the edge case:
 
          /abc/
@@ -1097,11 +1251,11 @@ SUBJECT MODIFIERS
              123abc123\=replace=[9]XYZ
          Failed: error -47: no more memory
 
-       The   default   action   of    pcre2_substitute()    is    to    return
-       PCRE2_ERROR_NOMEMORY  when  the output buffer is too small. However, if
-       the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using  the  sub-
-       stitute_overflow_length  modifier),  pcre2_substitute() continues to go
-       through the motions of matching and substituting, in order  to  compute
+       The    default    action    of    pcre2_substitute()   is   to   return
+       PCRE2_ERROR_NOMEMORY when the output buffer is too small.  However,  if
+       the  PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  option is set (by using the sub-
+       stitute_overflow_length modifier), pcre2_substitute() continues  to  go
+       through  the  motions of matching and substituting, in order to compute
        the size of buffer that is required. When this happens, pcre2test shows
        the required buffer length (which includes space for the trailing zero)
        as part of the error message. For example:
@@ -1111,43 +1265,48 @@ SUBJECT MODIFIERS
          Failed: error -47: no more memory: 10 code units are needed
 
        A replacement string is ignored with POSIX and DFA matching. Specifying
-       partial matching provokes an error return  ("bad  option  value")  from
+       partial  matching  provokes  an  error return ("bad option value") from
        pcre2_substitute().
 
    Setting the JIT stack size
 
-       The  jitstack modifier provides a way of setting the maximum stack size
-       that is used by the just-in-time optimization code. It  is  ignored  if
+       The jitstack modifier provides a way of setting the maximum stack  size
+       that  is  used  by the just-in-time optimization code. It is ignored if
        JIT optimization is not being used. The value is a number of kilobytes.
-       Providing a stack that is larger than the default 32K is necessary only
-       for very complicated patterns.
+       Setting  zero  reverts to the default of 32K. Providing a stack that is
+       larger than the default is necessary only  for  very  complicated  pat-
+       terns.  If  jitstack is set non-zero on a subject line it overrides any
+       value that was set on the pattern.
 
-   Setting match and recursion limits
+   Setting heap, match, and depth limits
 
-       The  match_limit and recursion_limit modifiers set the appropriate lim-
-       its in the match context. These values are ignored when the find_limits
-       modifier is specified.
+       The heap_limit, match_limit, and depth_limit modifiers set  the  appro-
+       priate  limits  in the match context. These values are ignored when the
+       find_limits modifier is specified.
 
    Finding minimum limits
 
-       If  the  find_limits modifier is present, pcre2test calls pcre2_match()
-       several times, setting  different  values  in  the  match  context  via
-       pcre2_set_match_limit()  and pcre2_set_recursion_limit() until it finds
-       the minimum values for each parameter that allow pcre2_match() to  com-
-       plete without error.
+       If the find_limits modifier is present on  a  subject  line,  pcre2test
+       calls  the  relevant matching function several times, setting different
+       values   in   the    match    context    via    pcre2_set_heap_limit(),
+       pcre2_set_match_limit(),  or pcre2_set_depth_limit() until it finds the
+       minimum values for each parameter that  allow  the  match  to  complete
+       without error.
 
        If JIT is being used, only the match limit is relevant. If DFA matching
-       is being used, neither limit is relevant, and this modifier is  ignored
-       (with a warning message).
-
-       The  match_limit number is a measure of the amount of backtracking that
-       takes place, and learning the minimum value  can  be  instructive.  For
-       most  simple  matches, the number is quite small, but for patterns with
-       very large numbers of matching possibilities, it can become large  very
-       quickly    with    increasing    length    of   subject   string.   The
-       match_limit_recursion number is a measure of how  much  stack  (or,  if
-       PCRE2  is  compiled with NO_RECURSE, how much heap) memory is needed to
-       complete the match attempt.
+       is being used, only the depth limit is relevant.
+
+       The match_limit number is a measure of the amount of backtracking  that
+       takes  place,  and  learning  the minimum value can be instructive. For
+       most simple matches, the number is quite small, but for  patterns  with
+       very  large numbers of matching possibilities, it can become large very
+       quickly with increasing length of subject string.
+
+       For non-DFA matching, the minimum depth_limit number is  a  measure  of
+       how much nested backtracking happens (that is, how deeply the pattern's
+       tree is searched). In the case of DFA  matching,  depth_limit  controls
+       the  depth of recursive calls of the internal function that is used for
+       handling pattern recursion, lookaround assertions, and atomic groups.
 
    Showing MARK names
 
@@ -1160,8 +1319,16 @@ SUBJECT MODIFIERS
 
    Showing memory usage
 
-       The memory modifier causes pcre2test to log all memory  allocation  and
-       freeing calls that occur during a match operation.
+       The memory modifier causes pcre2test to log the sizes of all heap  mem-
+       ory   allocation  and  freeing  calls  that  occur  during  a  call  to
+       pcre2_match(). These occur only when a match requires a  bigger  vector
+       than  the  default  for  remembering backtracking points. In many cases
+       there will be no heap memory used and therefore no  additional  output.
+       No  heap  memory  is  allocated during matching with pcre2_dfa_match or
+       with JIT, so in those cases the memory modifier never has  any  effect.
+       For this modifier to work, the null_context modifier must not be set on
+       both the pattern and the subject, though it can be set on  one  or  the
+       other.
 
    Setting a starting offset
 
@@ -1196,59 +1363,58 @@ SUBJECT MODIFIERS
        By default, the subject string is passed to a native API matching func-
        tion with its correct length. In order to test the facility for passing
        a  zero-terminated  string, the zero_terminate modifier is provided. It
-       causes the length to be passed as PCRE2_ZERO_TERMINATED. (When matching
-       via  the  POSIX  interface, this modifier has no effect, as there is no
-       facility for passing a length.)
+       causes the length to be passed as PCRE2_ZERO_TERMINATED. When  matching
+       via the POSIX interface, this modifier is ignored, with a warning.
 
-       When testing pcre2_substitute(), this modifier also has the  effect  of
+       When  testing  pcre2_substitute(), this modifier also has the effect of
        passing the replacement string as zero-terminated.
 
    Passing a NULL context
 
-       Normally,   pcre2test   passes   a   context  block  to  pcre2_match(),
+       Normally,  pcre2test  passes  a   context   block   to   pcre2_match(),
        pcre2_dfa_match() or pcre2_jit_match(). If the null_context modifier is
-       set,  however,  NULL  is  passed. This is for testing that the matching
+       set, however, NULL is passed. This is for  testing  that  the  matching
        functions behave correctly in this case (they use default values). This
-       modifier  cannot  be used with the find_limits modifier or when testing
+       modifier cannot be used with the find_limits modifier or  when  testing
        the substitution function.
 
 
 THE ALTERNATIVE MATCHING FUNCTION
 
-       By default,  pcre2test  uses  the  standard  PCRE2  matching  function,
+       By  default,  pcre2test  uses  the  standard  PCRE2  matching function,
        pcre2_match() to match each subject line. PCRE2 also supports an alter-
-       native matching function, pcre2_dfa_match(), which operates in  a  dif-
-       ferent  way, and has some restrictions. The differences between the two
+       native  matching  function, pcre2_dfa_match(), which operates in a dif-
+       ferent way, and has some restrictions. The differences between the  two
        functions are described in the pcre2matching documentation.
 
-       If the dfa modifier is set, the alternative matching function is  used.
-       This  function  finds all possible matches at a given point in the sub-
-       ject. If, however, the dfa_shortest modifier is set,  processing  stops
-       after  the  first  match is found. This is always the shortest possible
+       If  the dfa modifier is set, the alternative matching function is used.
+       This function finds all possible matches at a given point in  the  sub-
+       ject.  If,  however, the dfa_shortest modifier is set, processing stops
+       after the first match is found. This is always  the  shortest  possible
        match.
 
 
 DEFAULT OUTPUT FROM pcre2test
 
-       This section describes the output when the  normal  matching  function,
+       This  section  describes  the output when the normal matching function,
        pcre2_match(), is being used.
 
-       When  a  match  succeeds,  pcre2test  outputs the list of captured sub-
-       strings, starting with number 0 for the string that matched  the  whole
-       pattern.    Otherwise,  it  outputs  "No  match"  when  the  return  is
-       PCRE2_ERROR_NOMATCH, or "Partial  match:"  followed  by  the  partially
-       matching  substring  when the return is PCRE2_ERROR_PARTIAL. (Note that
-       this is the entire substring that  was  inspected  during  the  partial
-       match;  it  may  include  characters before the actual match start if a
+       When a match succeeds, pcre2test outputs  the  list  of  captured  sub-
+       strings,  starting  with number 0 for the string that matched the whole
+       pattern.   Otherwise,  it  outputs  "No  match"  when  the  return   is
+       PCRE2_ERROR_NOMATCH,  or  "Partial  match:"  followed  by the partially
+       matching substring when the return is PCRE2_ERROR_PARTIAL.  (Note  that
+       this  is  the  entire  substring  that was inspected during the partial
+       match; it may include characters before the actual  match  start  if  a
        lookbehind assertion, \K, \b, or \B was involved.)
 
        For any other return, pcre2test outputs the PCRE2 negative error number
-       and  a  short  descriptive  phrase. If the error is a failed UTF string
-       check, the code unit offset of the start of the  failing  character  is
+       and a short descriptive phrase. If the error is  a  failed  UTF  string
+       check,  the  code  unit offset of the start of the failing character is
        also output. Here is an example of an interactive pcre2test run.
 
          $ pcre2test
-         PCRE2 version 9.00 2014-05-10
+         PCRE2 version 10.22 2016-07-29
 
            re> /^abc(\d+)/
          data> abc123
@@ -1260,8 +1426,8 @@ DEFAULT OUTPUT FROM pcre2test
        Unset capturing substrings that are not followed by one that is set are
        not shown by pcre2test unless the allcaptures modifier is specified. In
        the following example, there are two capturing substrings, but when the
-       first data line is matched, the second, unset substring is  not  shown.
-       An  "internal" unset substring is shown as "<unset>", as for the second
+       first  data  line is matched, the second, unset substring is not shown.
+       An "internal" unset substring is shown as "<unset>", as for the  second
        data line.
 
            re> /(a)|(b)/
@@ -1273,11 +1439,11 @@ DEFAULT OUTPUT FROM pcre2test
           1: <unset>
           2: b
 
-       If the strings contain any non-printing characters, they are output  as
-       \xhh  escapes  if  the  value is less than 256 and UTF mode is not set.
+       If  the strings contain any non-printing characters, they are output as
+       \xhh escapes if the value is less than 256 and UTF  mode  is  not  set.
        Otherwise they are output as \x{hh...} escapes. See below for the defi-
-       nition  of  non-printing characters. If the /aftertext modifier is set,
-       the output for substring 0 is followed by the the rest of  the  subject
+       nition of non-printing characters. If the aftertext  modifier  is  set,
+       the  output  for  substring  0  is  followed by the rest of the subject
        string, identified by "0+" like this:
 
            re> /cat/aftertext
@@ -1285,7 +1451,7 @@ DEFAULT OUTPUT FROM pcre2test
           0: cat
           0+ aract
 
-       If  global  matching  is  requested, the results of successive matching
+       If global matching is requested, the  results  of  successive  matching
        attempts are output in sequence, like this:
 
            re> /\Bi(\w\w)/g
@@ -1297,8 +1463,8 @@ DEFAULT OUTPUT FROM pcre2test
           0: ipp
           1: pp
 
-       "No match" is output only if the first match attempt fails. Here is  an
-       example  of  a  failure  message (the offset 4 that is specified by the
+       "No  match" is output only if the first match attempt fails. Here is an
+       example of a failure message (the offset 4 that  is  specified  by  the
        offset modifier is past the end of the subject string):
 
            re> /xyz/
@@ -1306,7 +1472,7 @@ DEFAULT OUTPUT FROM pcre2test
          Error -24 (bad offset value)
 
        Note that whereas patterns can be continued over several lines (a plain
-       ">"  prompt  is used for continuations), subject lines may not. However
+       ">" prompt is used for continuations), subject lines may  not.  However
        newlines can be included in a subject by means of the \n escape (or \r,
        \r\n, etc., depending on the newline sequence setting).
 
@@ -1314,7 +1480,7 @@ DEFAULT OUTPUT FROM pcre2test
 OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
 
        When the alternative matching function, pcre2_dfa_match(), is used, the
-       output consists of a list of all the matches that start  at  the  first
+       output  consists  of  a list of all the matches that start at the first
        point in the subject where there is at least one match. For example:
 
            re> /(tang|tangerine|tan)/
@@ -1323,11 +1489,11 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
           1: tang
           2: tan
 
-       Using  the normal matching function on this data finds only "tang". The
-       longest matching string is always  given  first  (and  numbered  zero).
-       After  a  PCRE2_ERROR_PARTIAL  return,  the output is "Partial match:",
-       followed by the partially matching substring. Note  that  this  is  the
-       entire  substring  that  was inspected during the partial match; it may
+       Using the normal matching function on this data finds only "tang".  The
+       longest  matching  string  is  always  given first (and numbered zero).
+       After a PCRE2_ERROR_PARTIAL return, the  output  is  "Partial  match:",
+       followed  by  the  partially  matching substring. Note that this is the
+       entire substring that was inspected during the partial  match;  it  may
        include characters before the actual match start if a lookbehind asser-
        tion, \b, or \B was involved. (\K is not supported for DFA matching.)
 
@@ -1343,16 +1509,16 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
           1: tan
           0: tan
 
-       The alternative matching function does not support  substring  capture,
-       so  the  modifiers  that are concerned with captured substrings are not
+       The  alternative  matching function does not support substring capture,
+       so the modifiers that are concerned with captured  substrings  are  not
        relevant.
 
 
 RESTARTING AFTER A PARTIAL MATCH
 
-       When the alternative matching function has given  the  PCRE2_ERROR_PAR-
+       When  the  alternative matching function has given the PCRE2_ERROR_PAR-
        TIAL return, indicating that the subject partially matched the pattern,
-       you can restart the match with additional subject data by means of  the
+       you  can restart the match with additional subject data by means of the
        dfa_restart modifier. For example:
 
            re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
@@ -1361,26 +1527,17 @@ RESTARTING AFTER A PARTIAL MATCH
          data> n05\=dfa,dfa_restart
           0: n05
 
-       For  further  information  about partial matching, see the pcre2partial
+       For further information about partial matching,  see  the  pcre2partial
        documentation.
 
 
 CALLOUTS
 
        If the pattern contains any callout requests, pcre2test's callout func-
-       tion  is called during matching unless callout_none is specified.  This
-       works with both matching functions.
-
-       The callout function in pcre2test returns zero (carry on  matching)  by
-       default,  but you can use a callout_fail modifier in a subject line (as
-       described above) to change this and other parameters of the callout.
-
-       Inserting callouts can be helpful when using pcre2test to check compli-
-       cated  regular expressions. For further information about callouts, see
-       the pcre2callout documentation.
-
-       The output for callouts with numerical arguments and those with  string
-       arguments is slightly different.
+       tion is called during matching unless callout_none is  specified.  This
+       works with both matching functions, and with JIT, though there are some
+       differences in behaviour. The output for callouts with numerical  argu-
+       ments and those with string arguments is slightly different.
 
    Callouts with numerical arguments
 
@@ -1399,8 +1556,8 @@ CALLOUTS
        position, which can happen if the callout is in a lookbehind assertion.
 
        Callouts numbered 255 are assumed to be automatic callouts, inserted as
-       a  result  of the /auto_callout pattern modifier. In this case, instead
-       of showing the callout number, the offset in the pattern, preceded by a
+       a result of the auto_callout pattern modifier. In this case, instead of
+       showing the callout number, the offset in the pattern,  preceded  by  a
        plus, is output. For example:
 
            re> /\d?[A-E]\*/auto_callout
@@ -1451,46 +1608,140 @@ CALLOUTS
           0: abcdef
 
 
+   Callout modifiers
+
+       The callout function in pcre2test returns zero (carry on  matching)  by
+       default,  but  you can use a callout_fail modifier in a subject line to
+       change this and other parameters of the callout (see below).
+
+       If the callout_capture modifier is set, the current captured groups are
+       output when a callout occurs. This is useful only for non-DFA matching,
+       as pcre2_dfa_match() does not support capturing,  so  no  captures  are
+       ever shown.
+
+       The normal callout output, showing the callout number or pattern offset
+       (as described above) is suppressed if the callout_no_where modifier  is
+       set.
+
+       When  using  the  interpretive  matching function pcre2_match() without
+       JIT, setting the callout_extra modifier causes additional  output  from
+       pcre2test's  callout function to be generated. For the first callout in
+       a match attempt at a new starting position in the subject,  "New  match
+       attempt"  is output. If there has been a backtrack since the last call-
+       out (or start of matching if this is the first callout), "Backtrack" is
+       output,  followed  by  "No other matching paths" if the backtrack ended
+       the previous match attempt. For example:
+
+          re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
+         data> aac\=callout_extra
+         New match attempt
+         --->aac
+          +0 ^       (
+          +1 ^       a+
+          +3 ^ ^     )
+          +4 ^ ^     b
+         Backtrack
+         --->aac
+          +3 ^^      )
+          +4 ^^      b
+         Backtrack
+         No other matching paths
+         New match attempt
+         --->aac
+          +0  ^      (
+          +1  ^      a+
+          +3  ^^     )
+          +4  ^^     b
+         Backtrack
+         No other matching paths
+         New match attempt
+         --->aac
+          +0   ^     (
+          +1   ^     a+
+         Backtrack
+         No other matching paths
+         New match attempt
+         --->aac
+          +0    ^    (
+          +1    ^    a+
+         No match
+
+       Notice that various optimizations must be turned off if  you  want  all
+       possible  matching  paths  to  be  scanned. If no_start_optimize is not
+       used, there is an immediate "no match", without any  callouts,  because
+       the  starting  optimization  fails to find "b" in the subject, which it
+       knows must be present for any match. If no_auto_possess  is  not  used,
+       the  "a+"  item is turned into "a++", which reduces the number of back-
+       tracks.
+
+       The callout_extra modifier has no effect if used with the DFA  matching
+       function, or with JIT.
+
+   Return values from callouts
+
+       The  default  return  from  the  callout function is zero, which allows
+       matching to continue. The callout_fail modifier can be given one or two
+       numbers. If there is only one number, 1 is returned instead of 0 (caus-
+       ing matching to backtrack) when a callout of that number is reached. If
+       two  numbers  (<n>:<m>)  are  given,  1 is returned when callout <n> is
+       reached and there have been at least <m>  callouts.  The  callout_error
+       modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, caus-
+       ing the entire matching process to be aborted. If both these  modifiers
+       are  set  for  the same callout number, callout_error takes precedence.
+       Note that callouts with string arguments are always  given  the  number
+       zero.
+
+       The  callout_data  modifier can be given an unsigned or a negative num-
+       ber.  This is set as the "user data" that is  passed  to  the  matching
+       function,  and  passed  back  when the callout function is invoked. Any
+       value other than zero is used as  a  return  from  pcre2test's  callout
+       function.
+
+       Inserting callouts can be helpful when using pcre2test to check compli-
+       cated regular expressions. For further information about callouts,  see
+       the pcre2callout documentation.
+
+
 NON-PRINTING CHARACTERS
 
        When pcre2test is outputting text in the compiled version of a pattern,
-       bytes  other  than 32-126 are always treated as non-printing characters
+       bytes other than 32-126 are always treated as  non-printing  characters
        and are therefore shown as hex escapes.
 
-       When pcre2test is outputting text that is a matched part of  a  subject
-       string,  it behaves in the same way, unless a different locale has been
-       set for the pattern (using the /locale modifier).  In  this  case,  the
-       isprint()  function  is  used  to distinguish printing and non-printing
+       When  pcre2test  is outputting text that is a matched part of a subject
+       string, it behaves in the same way, unless a different locale has  been
+       set  for  the  pattern  (using  the locale modifier). In this case, the
+       isprint() function is used to  distinguish  printing  and  non-printing
        characters.
 
 
 SAVING AND RESTORING COMPILED PATTERNS
 
-       It is possible to save compiled patterns  on  disc  or  elsewhere,  and
+       It  is  possible  to  save  compiled patterns on disc or elsewhere, and
        reload them later, subject to a number of restrictions. JIT data cannot
-       be saved. The host on which the patterns are reloaded must  be  running
+       be  saved.  The host on which the patterns are reloaded must be running
        the same version of PCRE2, with the same code unit width, and must also
-       have the same endianness, pointer width  and  PCRE2_SIZE  type.  Before
-       compiled  patterns  can be saved they must be serialized, that is, con-
-       verted to a stream of bytes. A single byte stream may contain any  num-
-       ber  of  compiled  patterns,  but  they must all use the same character
+       have  the  same  endianness,  pointer width and PCRE2_SIZE type. Before
+       compiled patterns can be saved they must be serialized, that  is,  con-
+       verted  to a stream of bytes. A single byte stream may contain any num-
+       ber of compiled patterns, but they must  all  use  the  same  character
        tables. A single copy of the tables is included in the byte stream (its
        size is 1088 bytes).
 
-       The  functions  whose  names  begin  with pcre2_serialize_ are used for
-       serializing and de-serializing. They are described in the  pcre2serial-
+       The functions whose names begin  with  pcre2_serialize_  are  used  for
+       serializing  and de-serializing. They are described in the pcre2serial-
        ize  documentation.  In  this  section  we  describe  the  features  of
        pcre2test that can be used to test these functions.
 
-       When a pattern with push  modifier  is  successfully  compiled,  it  is
-       pushed  onto  a  stack  of compiled patterns, and pcre2test expects the
-       next line to contain a new pattern (or command) instead  of  a  subject
-       line.  By contrast, the pushcopy modifier causes a copy of the compiled
-       pattern to be stacked, leaving the  original  available  for  immediate
-       matching.  By  using  push and/or pushcopy, a number of patterns can be
+       When  a  pattern  with  push  modifier  is successfully compiled, it is
+       pushed onto a stack of compiled patterns,  and  pcre2test  expects  the
+       next  line  to  contain a new pattern (or command) instead of a subject
+       line. By contrast, the pushcopy modifier causes a copy of the  compiled
+       pattern  to  be  stacked,  leaving the original available for immediate
+       matching. By using push and/or pushcopy, a number of  patterns  can  be
        compiled and retained. These modifiers are incompatible with posix, and
-       control  modifiers  that act at match time are ignored (with a message)
-       for the stacked patterns. The jitverify modifier applies only  at  com-
+       control modifiers that act at match time are ignored (with  a  message)
+       for  the  stacked patterns. The jitverify modifier applies only at com-
        pile time.
 
        The command
@@ -1498,21 +1749,21 @@ SAVING AND RESTORING COMPILED PATTERNS
          #save <filename>
 
        causes all the stacked patterns to be serialized and the result written
-       to the named file. Afterwards, all the stacked patterns are freed.  The
+       to  the named file. Afterwards, all the stacked patterns are freed. The
        command
 
          #load <filename>
 
-       reads  the  data in the file, and then arranges for it to be de-serial-
-       ized, with the resulting compiled patterns added to the pattern  stack.
-       The  pattern  on the top of the stack can be retrieved by the #pop com-
-       mand, which must be followed by  lines  of  subjects  that  are  to  be
-       matched  with  the pattern, terminated as usual by an empty line or end
-       of file. This command may be followed by  a  modifier  list  containing
-       only  control  modifiers that act after a pattern has been compiled. In
+       reads the data in the file, and then arranges for it to  be  de-serial-
+       ized,  with the resulting compiled patterns added to the pattern stack.
+       The pattern on the top of the stack can be retrieved by the  #pop  com-
+       mand,  which  must  be  followed  by  lines  of subjects that are to be
+       matched with the pattern, terminated as usual by an empty line  or  end
+       of  file.  This  command  may be followed by a modifier list containing
+       only control modifiers that act after a pattern has been  compiled.  In
        particular,  hex,  posix,  posix_nosub,  push,  and  pushcopy  are  not
-       allowed,  nor are any option-setting modifiers.  The JIT modifiers are,
-       however permitted. Here is an example that saves and reloads  two  pat-
+       allowed, nor are any option-setting modifiers.  The JIT modifiers  are,
+       however  permitted.  Here is an example that saves and reloads two pat-
        terns.
 
          /abc/push
@@ -1525,10 +1776,10 @@ SAVING AND RESTORING COMPILED PATTERNS
          #pop jit,bincode
          abc
 
-       If  jitverify  is  used with #pop, it does not automatically imply jit,
+       If jitverify is used with #pop, it does not  automatically  imply  jit,
        which is different behaviour from when it is used on a pattern.
 
-       The #popcopy command is analagous to the pushcopy modifier in  that  it
+       The  #popcopy  command is analagous to the pushcopy modifier in that it
        makes current a copy of the topmost stack pattern, leaving the original
        still on the stack.
 
@@ -1548,5 +1799,5 @@ AUTHOR
 
 REVISION
 
-       Last updated: 06 July 2016
-       Copyright (c) 1997-2016 University of Cambridge.
+       Last updated: 21 December 2017
+       Copyright (c) 1997-2017 University of Cambridge.
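
Reviewer's sketch (not part of the patch): the "Return values from callouts" text added above describes how callout_fail and callout_error choose the callout function's return value. The C fragment below models only that decision rule. It is not pcre2test's code: `callout_rule`, `sketch_callout_return` and `SKETCH_ERROR_CALLOUT` are hypothetical names introduced for illustration, the constant is a stand-in for PCRE2_ERROR_CALLOUT from pcre2.h, and counting the current callout in `callouts_so_far` is an assumption.

```c
#include <assert.h>

/* Stand-in for PCRE2_ERROR_CALLOUT; real code should take it from pcre2.h. */
#define SKETCH_ERROR_CALLOUT (-37)

/* Hypothetical holder for one "<n>" or "<n>:<m>" modifier setting. */
typedef struct {
    int set;        /* non-zero if the modifier was given */
    int number;     /* <n>: the callout number that triggers the rule */
    int min_count;  /* <m>: callouts required so far; 0 if only <n> was given */
} callout_rule;

/* A rule fires when callout <n> is reached and at least <m> callouts
   have happened (assumed here to include the current one). */
static int rule_fires(const callout_rule *r, int callout_number,
                      int callouts_so_far)
{
    return r->set && callout_number == r->number &&
           callouts_so_far >= r->min_count;
}

/* Returns 0 (carry on matching), 1 (backtrack), or the error stand-in
   (abort the whole match). callout_error is checked first because the
   documentation says it takes precedence over callout_fail. */
int sketch_callout_return(const callout_rule *fail, const callout_rule *error,
                          int callout_number, int callouts_so_far)
{
    assert(callouts_so_far >= 0);
    if (rule_fires(error, callout_number, callouts_so_far))
        return SKETCH_ERROR_CALLOUT;
    if (rule_fires(fail, callout_number, callouts_so_far))
        return 1;
    return 0;
}
```

Under these assumptions, a rule of {1, 3, 2} (that is, callout_fail=3:2) leaves the first callout numbered 3 alone and forces a backtrack from the second callout onwards. Callouts with string arguments would be passed in with number zero, as the text notes.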
diff --git a/doc/pcre2unicode.3 b/doc/pcre2unicode.3
index 253d4b6..813fadf 100644
--- a/doc/pcre2unicode.3
+++ b/doc/pcre2unicode.3
@@ -1,4 +1,4 @@
-.TH PCRE2UNICODE 3 "03 July 2016" "PCRE2 10.22"
+.TH PCRE2UNICODE 3 "17 May 2017" "PCRE2 10.30"
 .SH NAME
 PCRE - Perl-compatible regular expressions (revised API)
 .SH "UNICODE AND UTF SUPPORT"
@@ -40,7 +40,7 @@ and
 documentation. Only the short names for properties are supported. For example,
 \ep{L} matches a letter. Its Perl synonym, \ep{Letter}, is not supported.
 Furthermore, in Perl, many properties may optionally be prefixed by "Is", for
-compatibility with Perl 5.6. PCRE does not support this.
+compatibility with Perl 5.6. PCRE2 does not support this.
 .
 .
 .SH "WIDE CHARACTERS AND UTF MODES"
@@ -101,10 +101,16 @@ low-valued characters, unless the PCRE2_UCP option is set.
 However, the special horizontal and vertical white space matching escapes (\eh,
 \eH, \ev, and \eV) do match all the appropriate Unicode characters, whether or
 not PCRE2_UCP is set.
-.P
-Case-insensitive matching in UTF mode makes use of Unicode properties. A few
-Unicode characters such as Greek sigma have more than two codepoints that are
-case-equivalent, and these are treated as such.
+.
+.
+.SH "CASE-EQUIVALENCE IN UTF MODES"
+.rs
+.sp
+Case-insensitive matching in a UTF mode makes use of Unicode properties except
+for characters whose code points are less than 128 and that have at most two
+case-equivalent values. For these, a direct table lookup is used for speed. A
+few Unicode characters such as Greek sigma have more than two codepoints that
+are case-equivalent, and these are treated as such.
 .
 .
 .SH "VALIDITY OF UTF STRINGS"
@@ -158,6 +164,14 @@ or \fBpcre2_dfa_match()\fP.
 .P
 If you pass an invalid UTF string when PCRE2_NO_UTF_CHECK is set, the result
 is undefined and your program may crash or loop indefinitely.
+.P
+Note that setting PCRE2_NO_UTF_CHECK at compile time does not disable the error
+that is given if an escape sequence for an invalid Unicode code point is
+encountered in the pattern. If you want to allow escape sequences such as
+\ex{d800} (a surrogate code point) you can set the
+PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES extra option. However, this is possible
+only in UTF-8 and UTF-32 modes, because these values are not representable in
+UTF-16.
 .
 .
 .\" HTML <a name="utf8strings"></a>
@@ -266,6 +280,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 03 July 2016
-Copyright (c) 1997-2016 University of Cambridge.
+Last updated: 17 May 2017
+Copyright (c) 1997-2017 University of Cambridge.
 .fi
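
Reviewer's sketch (not part of the patch): the paragraph added to "VALIDITY OF UTF STRINGS" says that escapes such as \x{d800} can be allowed only in UTF-8 and UTF-32 modes, because surrogate code points are not representable in UTF-16. The fragment below illustrates why, using the standard UTF-16 encoding rule. The function names are hypothetical and this is not PCRE2 code.

```c
#include <stddef.h>
#include <stdint.h>

/* Code points U+D800..U+DFFF are reserved for UTF-16 surrogate pairs. */
int sketch_is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* Encodes cp as UTF-16 into out[]. Returns the number of 16-bit code
   units written (1 or 2), or 0 if cp cannot be encoded: either it is
   above U+10FFFF, or it is a surrogate, whose bit pattern would be
   indistinguishable from half of a surrogate pair. */
size_t sketch_utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || sketch_is_surrogate(cp))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* BMP character: one unit */
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

A lone 0xD800 unit in a UTF-16 subject can only be read as the first half of a pair, so there is no way to write the code point U+D800 itself. UTF-8 and UTF-32 have bit patterns that can carry the value, even though it is not a valid Unicode scalar value.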