Bug 1434486

Summary: PCRE 7.8 fails to recognize non-ASCII printable characters
Product: Red Hat Enterprise Linux 6 Reporter: Martin Kyral <mkyral>
Component: pcre Assignee: Petr Pisar <ppisar>
Status: CLOSED NOTABUG QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.9   
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1400267 Environment:
Last Closed: 2017-03-22 09:37:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Martin Kyral 2017-03-21 15:32:17 UTC
+++ This bug was initially created as a clone of Bug #1400267 +++
Happens on RHEL 6 with pcre-7.8-7.el6.x86_64 as well.

Description of problem:
preg_replace with the Unicode modifier (/u) in PHP 7.0.13 with PCRE 8.32 (from the CentOS 7 updates repo) does not work as expected in the use case below.

Version-Release number of selected component (if applicable):
PHP 7.0.13 has been installed from cPanel (http://www.cpanel.com/) EasyApache 4 (ea-php70-7.0.13-1.1.1.cpanel.x86_64)

# php --version
PHP 7.0.13 (cli) (built: Nov 14 2016 15:24:31) ( NTS )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies
    with the ionCube PHP Loader (enabled) + Intrusion Protection from ioncube24.com (unconfigured) v6.0.4, Copyright (c) 2002-2016, by ionCube Ltd.

# yum info pcre-devel
Loaded plugins: fastestmirror, universal-hooks
Loading mirror speeds from cached hostfile
 * EA4: 85.13.201.2
 * base: mirror.vorboss.net
 * extras: mirror.vorboss.net
 * updates: mirror.vorboss.net
Installed Packages
Name        : pcre-devel
Arch        : x86_64
Version     : 8.32
Release     : 15.el7_2.1
Size        : 1.4 M
Repo        : installed
From repo   : updates
Summary     : Development files for pcre
URL         : http://www.pcre.org/
Licence     : BSD
Description : Development files (Headers, libraries for dynamic linking, etc)
            : for pcre.

How reproducible:
Always - 100%

Steps to Reproduce:
1. php -r "var_dump(preg_replace('/[\\x{0000}\\x{200B}-\\x{200D}\\x{FEFF}]|\\r?\\n|\\r/u', '', 'test'));"

Actual results:
string(0) ""

Expected results:
string(4) "test"

Additional info:

--- Additional comment from Remi Collet on 2016-11-30 13:53:47 EST ---

Note: rh-php70 packages in RHSCL 2.3 are also affected.

Another example, run using RHEL / RHSCL official packages:

$ php -r "var_dump(PHP_VERSION, preg_replace('/[^[:print:]]/u', '', 'ČEZ'));"
string(6) "5.4.16"
string(2) "EZ"

$ scl enable rh-php56 bash
$ php -r "var_dump(PHP_VERSION, preg_replace('/[^[:print:]]/u', '', 'ČEZ'));"
string(6) "5.6.25"
string(2) "EZ"

$ scl enable rh-php70 bash
$ php -r "var_dump(PHP_VERSION, preg_replace('/[^[:print:]]/u', '', 'ČEZ'));"
string(6) "7.0.10"
string(2) "EZ"

With the Fedora package (pcre 8.39), by contrast:

$ php -r "var_dump(PHP_VERSION, preg_replace('/[^[:print:]]/u', '', 'ČEZ'));"
string(6) "7.0.13"
string(4) "ČEZ"

Comment 1 Petr Pisar 2017-03-22 09:37:27 UTC
pcre-7.8-7.el6 cannot match it because UCP matching mode was only added in PCRE 8.10. (The [:print:] class was later adjusted, and only in UCP mode.) pcretest reports the UCP mode switch (/W) as unknown:

$ printf '/[[:print:]]/8W\nČ\n' | pcretest
PCRE version 7.8 2008-09-05

  re> ** Unknown option 'W'
  re>     > ** Unexpected EOF

Without the /W option, the RHEL-6 pcre package behaves consistently with RHEL-7 and with upstream code.

And even if that were not the case, I don't think changing the behavior this late in the RHEL-6 life cycle would be appropriate.

I'm not going to "fix" it.
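
For illustration only, the difference described in comment 1 can be approximated outside PCRE. The sketch below is Python, not PCRE or PHP. It assumes that without UCP mode [:print:] covers only the ASCII range 0x20-0x7E for a character such as Č (U+010C), and it uses Python's str.isprintable() as a stand-in for a Unicode-aware printable test. Both function names are hypothetical.

```python
import re


def strip_nonprint_ascii(text):
    # Assumption: emulates [^[:print:]] without UCP mode by treating
    # only ASCII 0x20-0x7E as printable. This is not a PCRE call.
    return re.sub(r'[^\x20-\x7e]', '', text)


def strip_nonprint_unicode(text):
    # Assumption: str.isprintable() stands in for a Unicode-aware
    # [:print:]; it is similar in spirit, not identical to PCRE's UCP rules.
    return ''.join(ch for ch in text if ch.isprintable())


if __name__ == '__main__':
    print(strip_nonprint_ascii('ČEZ'))    # Č is outside ASCII, so it is dropped
    print(strip_nonprint_unicode('ČEZ'))  # Č is an uppercase letter, so it is kept
```

The first function reproduces the "EZ" result seen with the RHEL packages, and the second reproduces the "ČEZ" result seen with the Fedora package.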