Bug 457064 - pcre is configured with no support for Unicode properties
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: pcre
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Petr Pisar
QA Contact: Ondrej Moriš
Keywords: FutureFeature, Triaged
Duplicates: 461712
Depends On:
Blocks: 502912 554476 GSS_RHEL5.6_RFE
Reported: 2008-07-29 10:37 EDT by orensol
Modified: 2013-04-15 04:58 EDT (History)
CC List: 21 users
Fixed In Version: pcre-6.6-6.el5
Doc Type: Enhancement
Doc Text:
Unicode properties have been enabled to support \p{..}, \P{..}, and \X escape sequences.
Last Closed: 2011-01-13 17:09:20 EST


Attachments
Fix for looping on pattern compilation in non-UTF-8 (3.18 KB, patch)
2011-01-13 10:33 EST, Petr Pisar
Test case for the loop problem (950 bytes, text/plain)
2011-01-13 10:36 EST, Petr Pisar


External Trackers
Tracker  ID    Priority  Status  Summary  Last Updated
CentOS   3252  None      None    None     Never

Description orensol 2008-07-29 10:37:04 EDT
Description of problem:
The pcre package is built with UTF-8 support, but without Unicode properties
support.

Version-Release number of selected component (if applicable):
6.6-2

How reproducible:
always

Steps to Reproduce:
1. Install the pcre package from the repository.
2. Run pcretest -C.

Actual results:
> pcretest -C
PCRE version 6.6 06-Feb-2006
Compiled with
  UTF-8 support
  No Unicode properties support
  Newline character is LF
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 10000000
  Default recursion depth limit = 10000000
  Match recursion uses stack


Expected results:
PCRE version 6.6 06-Feb-2006
Compiled with
  UTF-8 support
  Unicode properties support
  Newline character is LF
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 10000000
  Default recursion depth limit = 10000000
  Match recursion uses stack

Additional info:
Add the configure option --enable-unicode-properties to the package build.
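
The check described in this report can be scripted. The following is a minimal sketch, assuming a POSIX shell and that `pcretest -C` prints its feature lines in the form shown under "Actual results" and "Expected results". The function name has_unicode_properties is hypothetical and is not part of pcre.

```shell
#!/bin/sh
# Hypothetical helper: reads `pcretest -C` output on stdin and succeeds
# only if Unicode properties support is reported as compiled in.
# Assumption: the feature line reads exactly "Unicode properties support",
# and the negative form is prefixed with "No", as in the report.
has_unicode_properties() {
    # Anchor the match so that "No Unicode properties support" is rejected.
    grep -q '^[[:space:]]*Unicode properties support[[:space:]]*$'
}

# Usage (assumes pcretest is on PATH):
#   pcretest -C | has_unicode_properties && echo "Unicode properties enabled"
#
# A build with the feature would typically be configured along the lines of:
#   ./configure --enable-utf8 --enable-unicode-properties
```

Reading from standard input keeps the helper independent of where pcretest is installed, so it can also be run against saved output.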
Comment 1 Gerwin Krist 2008-08-29 04:30:02 EDT
Same problem here. We badly need it for the Zend Framework on sites with non-Latin character sets.
Comment 2 Thomas Heil 2008-11-23 12:06:24 EST
Is this about to be fixed, so that an overlay package repository would no longer be needed?
Comment 3 Gerwin Krist 2008-11-23 14:38:46 EST
No, still not fixed. RH has marked it as a feature request (is it?!?).
Comment 4 Robert Scheck 2009-01-08 04:17:24 EST
We have exactly the same problem here: in order to use the search engine
delivered with TYPOlight webCMS (PHP), we really need that feature enabled.
I also wonder why this hasn't been enabled for a long time already - or is Red
Hat less interested in Unicode than Fedora has been for ages now?

I'm adding RHEL Product and Program Management on Cc to get this maybe
solved for RHEL 5.3 as it seems to be a tiny and minor change to me (and I
also hope, that the e-mail address isn't just a dummy one). I'm adding Joe
as he's the PHP downstream maintainer and also should know the issue/the
missing feature in PHP.
Comment 5 Stepan Kasal 2009-01-08 08:46:59 EST
*** Bug 461712 has been marked as a duplicate of this bug. ***
Comment 6 Gerwin Krist 2009-01-24 04:49:40 EST
May I hope that this issue is fixed in the 5.3 release?
Comment 7 Robert Scheck 2009-01-24 06:21:08 EST
Not that I could see. Gerwin, you may want to ask your sales guy as well
to get this prioritized.
Comment 8 Gerwin Krist 2009-01-28 13:02:54 EST
For your info: I escalated the support ticket about this to high.
Comment 10 RHEL Product and Program Management 2009-03-26 12:48:19 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 11 intel352 2009-04-14 10:33:03 EDT
This page details how to build a fixed PCRE with Unicode support, or you can just download the RPM that has already been compiled for CentOS 5.2 (=RHEL 5.2):

http://gaarai.com/2009/01/31/unicode-support-on-centos-52-with-php-and-pcre/
Comment 12 RHEL Product and Program Management 2009-04-16 13:07:13 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 16 RHEL Product and Program Management 2009-05-05 09:07:26 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 22 Alain Portal 2010-01-25 13:12:45 EST
Can "Red Hat Product Management" explain why it doesn't want to enable the Unicode properties?

Doesn't it think that enabling UTF-8 support without Unicode properties is contradictory?
Comment 24 intel352 2010-06-29 10:26:07 EDT
wow, it's been two years now... this is an easily fixed issue, that still hasn't been rectified in two years?

I'm glad I moved on to Ubuntu.
Comment 26 Robert Scheck 2010-07-18 07:01:18 EDT
I've cross-filed this issue as Service Request 2041330.
Comment 31 Martin Prpic 2010-11-15 10:00:44 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Unicode properties have been enabled to support \p{..}, \P{..}, and \X escape sequences.
Comment 33 Tuomo Soini 2011-01-12 11:20:52 EST
Please test for PHP bug 38600. Our internal build with Unicode properties caused this bug to hit again. We needed an extra patch to address this infinite loop in pcre.

Testing code for the issue is in http://bugs.php.net/bug.php?id=38600

Please test before releasing new pcre.
Comment 34 Petr Pisar 2011-01-13 07:49:10 EST
(In reply to comment #33)
> Please test for php bug38600. Our internal build with unicode properties caused
> this bug to hit again. We needed extra patch to address this infinite loop in
> pcre.
> 
> Testing code for the issue is in http://bugs.php.net/bug.php?id=38600
> 
> Please test before releasing new pcre.

I can confirm that pcre-6.6-6.el5 does not terminate when compiling this pattern:

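#!/bin/sh
# Feed the pattern and one subject line to pcretest on standard input.
PATTERN='/(?<!\w)(0x[\p{N}]+[lL]?|[\p{Nd}]+(e[\p{Nd}]*)?[lLdDfF]?)(?!\w)/'
TEXT='bla bla bla'
# Pass the data as printf arguments so that backslashes in the pattern are
# not interpreted as format escapes.
printf '%s\n%s\n' "$PATTERN" "$TEXT" | pcretest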
Comment 37 Petr Pisar 2011-01-13 10:33:32 EST
Created attachment 473346 [details]
Fix for looping on pattern compilation in non-UTF-8

I believe I found a fix for the loop problem. This attachment should fix it.
Comment 38 Petr Pisar 2011-01-13 10:36:39 EST
Created attachment 473347 [details]
Test case for the loop problem

This C program tests the loop problem. If the problem is fixed, the program exits with a success code in finite time. Otherwise it never halts.
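
Because a failing run never halts, an automated check needs a time limit. The following is a minimal sketch, assuming GNU coreutils `timeout` is available (it exits with status 124 when the limit expires). The function name check_terminates and the binary name ./loop-test are hypothetical placeholders, not names taken from the attachment.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under a time limit and classify the result.
# Assumption: GNU coreutils `timeout`, which exits with status 124 on expiry.
check_terminates() {
    limit=$1
    shift
    status=0
    # Capture the exit status without tripping `set -e` in the caller.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: no exit within ${limit}s"
        return 1
    elif [ "$status" -ne 0 ]; then
        echo "FAIL: exit status $status"
        return 1
    fi
    echo "OK"
}

# Usage (the binary name is a placeholder for the compiled test case):
#   check_terminates 10 ./loop-test
```

A generous limit is the safer choice here, since a slow but terminating compile should not be reported as a hang.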
Comment 39 Petr Pisar 2011-01-13 11:49:46 EST
The infinite loop problem is addressed in the new bug #669413. Thanks for the careful testing.
Comment 40 errata-xmlrpc 2011-01-13 17:09:20 EST
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0022.html
