Red Hat Bugzilla – Bug 1402288
PCRE 8.32 mismatches Unicode ranges in JIT mode
Last modified: 2017-08-01 08:20:57 EDT
+++ This bug was initially created as a clone of Bug #1400267 +++

Name : pcre-devel
Arch : x86_64
Version : 8.32
Release : 15.el7_2.1
[...]

Steps to Reproduce:
1. php -r "var_dump(preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"

Actual results:
string(0) ""

Expected results:
string(4) "test"

[...]

> I can't replicate using pcretest either, but that's only half the picture as
> pcretest does not perform any replacement. The issue in my snippet
> definitely exists under PHP 7.0.13 and the system PCRE implementation (8.32).
>

I found it. It's because of JIT. If I request pcretest to use JIT, it matches:

$ printf '%s\n%s\n' '/[\x{0000}\x{200B}-\x{200D}\x{FEFF}]|\r?\n|\r/8W' 'test' | pcretest -s++
PCRE version 8.32 2012-11-30
re> data> 0: t (JIT)
data>
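As a cross-check of the expected result, the same alternatives can be run through an independent regex engine. The sketch below uses Python's `re` module rather than PCRE, so it says nothing about the JIT bug itself. It only illustrates that none of the alternatives should match any character of 'test'. The names `PATTERN` and `strip_invisible` are made up for this illustration.

```python
import re

# Same alternatives as the PHP pattern: NUL, U+200B..U+200D, U+FEFF,
# or a line break (\r\n, \n, or a lone \r).
# Python's re is used only as an independent reference engine; it is not PCRE.
PATTERN = re.compile(r"[\x00\u200B-\u200D\uFEFF]|\r?\n|\r")


def strip_invisible(text):
    """Remove NUL, zero-width characters, BOM and line breaks from text."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    # Plain ASCII input contains nothing the pattern should match.
    print(repr(strip_invisible("test")))
    # Input with a zero-width space and a CRLF, both of which should be removed.
    print(repr(strip_invisible("te\u200Bst\r\n")))
```

A correct engine leaves 'test' unchanged, which is the "Expected results" value from the report.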
On the PHP side, JIT also needs to be enabled:

# php -n -d pcre.jit=1 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(0) ""

# php -n -d pcre.jit=0 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"

BTW, our build (rh-php70) has pcre.jit=0 by default (the original bug report was about a non-RH package).
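The comparison above can be scripted so that both settings are run back to back. This is only a minimal sketch: it assumes a `php` binary on the PATH, and the names `PHP_SNIPPET`, `php_command` and `run` are hypothetical helpers, not part of any package mentioned in this bug. Each setting runs in its own PHP process, as in the manual commands.

```python
import shutil
import subprocess

# PHP code passed to `php -r`. No shell is involved, so the backslashes are
# single here (the manual commands double them only because of shell quoting).
PHP_SNIPPET = (
    r"var_dump(preg_replace("
    r"'/[\x{0000}\x{200B}-\x{200D}\x{FEFF}]|\r?\n|\r/u', '', 'test'));"
)


def php_command(jit):
    """Build the argv for one run, mirroring `php -n -d pcre.jit=N -r ...`."""
    return ["php", "-n", "-d", "pcre.jit=%d" % int(jit), "-r", PHP_SNIPPET]


def run(jit):
    """Run PHP once with the given JIT setting and return its trimmed stdout."""
    result = subprocess.run(php_command(jit), capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    if shutil.which("php") is None:
        print("php not found on PATH; nothing to compare")
    else:
        for jit in (False, True):
            print("pcre.jit=%d: %s" % (int(jit), run(jit)))
```

On an affected build the two lines of output differ, as in the transcript above. On a fixed build both should report the unchanged string.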
This was fixed upstream between versions 8.34 and 8.35 by this commit:

commit f928c7adccd8daa61e76c22130d79689ec41f21c
Author: zherczeg <zherczeg@2f5784b3-3f2a-0410-8824-cb99058d5e15>
Date:   Sun Dec 22 16:27:35 2013 +0000

    A new flag is set, when property checks are present in an XCLASS.

    git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1414 2f5784b3-3f2a-0410-8824-cb99058d5e15

The commit introduced some optimizations, and the issue was probably fixed by accident.
Created attachment 1229055
Fix ported to 8.32

The upstream optimization ported to 8.32. It fixes the reported bug.

Please note it changes internal representation of a studied pattern, so it's not possible to load patterns serialized by previous PCRE builds. But this limitation is documented in pcreprecompile(3):

    Compiling regular expressions with one version of PCRE for use with a
    different version is not guaranteed to work and may cause crashes, and
    saving and restoring a compiled pattern loses any JIT optimization data.
An unsupported testing package with this fix is available on <http://people.redhat.com/~ppisar/pcre-8.32-17.el7/>.
# rpm -q pcre
pcre-8.32-17.el7.x86_64

# rpm -qf $(which php)
rh-php70-php-cli-7.0.10-2.el7.x86_64

# php -n -d pcre.jit=0 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"

# php -n -d pcre.jit=1 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"

I confirm the fix, thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1909