Bug 1402288

Summary: PCRE 8.32 mismatches Unicode ranges in JIT mode
Product: Red Hat Enterprise Linux 7
Component: pcre
Version: 7.2
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Petr Pisar <ppisar>
Assignee: Petr Pisar <ppisar>
QA Contact: Martin Kyral <mkyral>
CC: fedora, isenfeld, kieran, mkyral, ovasik, qe-baseos-apps, rcollet
Keywords: Patch
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: pcre-8.32-17.el7
Clone Of: 1400267
: 1434487 (view as bug list) Environment:
Last Closed: 2017-08-01 12:20:57 UTC
Type: Bug
Bug Blocks: 1393865
Attachments: Fix ported to 8.32 (flags: none)

Description Petr Pisar 2016-12-07 07:56:44 UTC
+++ This bug was initially created as a clone of Bug #1400267 +++

Name        : pcre-devel
Arch        : x86_64
Version     : 8.32
Release     : 15.el7_2.1
[...]
Steps to Reproduce:
1. php -r "var_dump(preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"

Actual results:
string(0) ""

Expected results:
string(4) "test"
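
For reference, the expected result can be cross-checked outside PCRE. This is a minimal sketch, assuming Python's `re` module as a stand-in engine (it is not PCRE and has no JIT); the name `strip_invisible` is made up for illustration. None of the characters in 'test' fall inside the class or the line-break alternatives, so the subject should come back unchanged:

```python
import re

# The reproducer's pattern rewritten in Python `re` syntax (assumption:
# \x{200B} in PCRE corresponds to \u200B here). Not the PCRE engine.
PATTERN = re.compile(r"[\x00\u200B-\u200D\uFEFF]|\r?\n|\r")


def strip_invisible(text):
    """Remove NUL, zero-width characters, the BOM, and line breaks."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    # Plain ASCII input: nothing should be removed.
    print(repr(strip_invisible("test")))
    # Input with a zero-width space and CRLF: only those are removed.
    print(repr(strip_invisible("te\u200bst\r\n")))
```

Any engine that returns an empty string for the first call is matching characters that are not in the class.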

[...]
> I can't replicate using pcretest either, but that's only half the picture as
> pcretest does not perform any replacement. The issue in my snippet
> definitely exists under PHP 7.0.13 and the system PCRE implementation (8.32).
> 
I found it. It's because of JIT. If I request pcretest to use JIT, it matches:

$ printf '%s\n%s\n' '/[\x{0000}\x{200B}-\x{200D}\x{FEFF}]|\r?\n|\r/8W' 'test' | pcretest -s++
PCRE version 8.32 2012-11-30

  re> data>  0: t (JIT)
data>

Comment 1 Remi Collet 2016-12-07 09:09:26 UTC
On the PHP side, JIT also needs to be enabled to trigger the bug:

# php -n -d pcre.jit=1 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(0) ""

# php -n -d pcre.jit=0 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"


BTW, our build (rh-php70) has pcre.jit=0 by default (the original bug report was about a non-RH package).
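
The two runs above can be scripted as a quick regression check. This is a rough sketch, assuming a `php` binary on PATH; the helper names `php_command` and `run_both` are hypothetical, and only the `-n`, `-d pcre.jit=...` and `-r` options already shown in the commands above are used:

```python
import shutil
import subprocess

# PHP snippet from the reproducer. No shell is involved here, so the
# backslashes are single (the shell commands double them for quoting).
PHP_CODE = (
    r"var_dump(preg_replace("
    r"'/[\x{0000}\x{200B}-\x{200D}\x{FEFF}]|\r?\n|\r/u', '', 'test'));"
)


def php_command(jit_enabled):
    """Build the argv for one run with pcre.jit switched on or off."""
    flag = "pcre.jit=1" if jit_enabled else "pcre.jit=0"
    return ["php", "-n", "-d", flag, "-r", PHP_CODE]


def run_both():
    """Return {jit_enabled: stdout} for both settings, or None without php."""
    if shutil.which("php") is None:
        return None
    results = {}
    for jit_enabled in (False, True):
        proc = subprocess.run(
            php_command(jit_enabled),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        results[jit_enabled] = proc.stdout
    return results


if __name__ == "__main__":
    results = run_both()
    if results is None:
        print("php not found on PATH; nothing to compare")
    else:
        for jit_enabled, out in results.items():
            print("pcre.jit=%d: %s" % (int(jit_enabled), out.strip()))
        same = results[False] == results[True]
        print("outputs agree" if same else "outputs differ between JIT modes")
```

On an affected pcre build the two outputs would be expected to differ; on a fixed build they should agree.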

Comment 2 Petr Pisar 2016-12-07 09:46:34 UTC
This was fixed upstream between versions 8.34 and 8.35 by this commit:

commit f928c7adccd8daa61e76c22130d79689ec41f21c
Author: zherczeg <zherczeg@2f5784b3-3f2a-0410-8824-cb99058d5e15>
Date:   Sun Dec 22 16:27:35 2013 +0000

    A new flag is set, when property checks are present in an XCLASS.
    
    git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1414 2f5784b3-3f2a-0410-8824-cb99058d5e15

The commit introduced some optimizations, and the issue was probably fixed by accident.

Comment 4 Petr Pisar 2016-12-07 13:24:42 UTC
Created attachment 1229055
Fix ported to 8.32

This is the upstream optimization ported to 8.32. It fixes the reported bug.

Please note that it changes the internal representation of a studied pattern, so patterns serialized by previous PCRE builds cannot be loaded. This limitation is already documented in pcreprecompile(3):

       Compiling  regular  expressions with one version of PCRE for use with a
       different version is not guaranteed to work and may cause crashes,  and
       saving  and  restoring  a  compiled  pattern loses any JIT optimization
       data.
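
For applications that do cache compiled patterns on disk, one way to stay safe across such updates is to store a build tag next to each blob and recompile on mismatch. This is only a sketch under assumptions: `CachedPattern`, `load_or_recompile` and the `compile_fn` callback are hypothetical names, and using the package release string as the tag is an illustration, not something PCRE provides:

```python
from collections import namedtuple

# Hypothetical cache record: pattern source, a tag identifying the PCRE
# build that produced the blob, and the opaque serialized pattern.
CachedPattern = namedtuple("CachedPattern", ["source", "build_tag", "blob"])


def load_or_recompile(entry, current_tag, compile_fn):
    """Reuse the cached blob only if it came from the same build.

    compile_fn is a caller-supplied function (assumption) that takes the
    pattern source and returns a freshly serialized pattern as bytes.
    """
    if entry.build_tag == current_tag:
        return entry
    # Build changed: the old blob may use a different internal layout,
    # so discard it and recompile from the source text.
    return CachedPattern(entry.source, current_tag, compile_fn(entry.source))


if __name__ == "__main__":
    # Stand-in compiler for demonstration only.
    def fake_compile(source):
        return ("compiled:" + source).encode("utf-8")

    old = CachedPattern("a+b", "8.32-15.el7_2.1", b"stale")
    fresh = load_or_recompile(old, "8.32-17.el7", fake_compile)
    print(fresh.build_tag, fresh.blob)
```

The cost is one recompilation per pattern after each library update, which avoids feeding a blob with a stale layout to the new build.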

Comment 5 Petr Pisar 2016-12-07 13:31:12 UTC
An unsupported testing package with this fix is available on <http://people.redhat.com/~ppisar/pcre-8.32-17.el7/>.

Comment 6 Remi Collet 2016-12-07 13:41:21 UTC
# rpm -q pcre
pcre-8.32-17.el7.x86_64

# rpm -qf $(which php)
rh-php70-php-cli-7.0.10-2.el7.x86_64

# php -n -d pcre.jit=0 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"

# php -n -d pcre.jit=1 -r "var_dump(PHP_VERSION, preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"
string(6) "7.0.10"
string(4) "test"


I confirm the fix, thanks.

Comment 11 errata-xmlrpc 2017-08-01 12:20:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1909