Bug 172792 - use of study() with utf8 support enabled breaks regexps
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: perl
Version: rawhide
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Marcela Mašláňová
QA Contact: David Lawrence
Blocks: 135975
Reported: 2005-11-09 15:32 EST by Jason Vas Dias
Modified: 2008-03-06 10:45 EST
Doc Type: Bug Fix
Last Closed: 2008-03-06 10:45:26 EST


Description Jason Vas Dias 2005-11-09 15:32:14 EST
Description of problem:

Calling study() with UTF-8 support enabled breaks regular expression
matching in perl-5.8.7:

OK without UTF:
$ echo 'ABDCEFGHIJK' |
  perl -pe 'study; s/HIJK/1234/;'
ABDCEFG1234

$ echo 'ABCDEFGHIJK' |
  perl -e '$_=<>; study; print /HIJK/,"\n";'
1

FAILS with UTF:
$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=31 perl -pe 'study; s/HIJK/1234/;'
ABDCEFGHIJK

$ echo 'ABCDEFGHIJK' | 
  PERL_UNICODE=31 perl -e '$_=<>; study; print /HIJK/,"\n";'

(no output: the regexp did not match)

study() appears to be the culprit, since the substitution works without it:
$ echo 'ABDCEFGHIJK' | 
  PERL_UNICODE=31 perl -pe 's/HIJK/1234/;'
ABDCEFG1234

The failure occurs because $_ is marked as UTF-8 when it is read from STDIN;
a string assigned in the script is not affected:

$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=63 perl -e '$_=<>; study; print /HIJK/ ? "OK" : "FAIL","\n";'
FAIL

$ PERL_UNICODE=63 perl -e '$_="ABDCEFGHIJK"; study; print /HIJK/ ? "OK" : "FAIL","\n";'
OK

The runs above were in the 'en_US.UTF-8' locale. If UTF-8 support is made
conditional on the locale (PERL_UNICODE=127 adds the L flag), the problem
goes away in the C locale:

$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=127 LC_ALL=C perl -e '$_=<>; study; print /HIJK/ ? "OK" : "FAIL","\n";'
OK
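
The PERL_UNICODE values used in this report (31, 63, 127) are bitmasks of the
-C option flags documented in perlrun. The sketch below decodes a value into
its flag letters; the function name decode_perl_unicode is hypothetical, and
the bit values (I=1, O=2, E=4, i=8, o=16, A=32, L=64, a=256) are taken from
perlrun.

```shell
# Decode a numeric PERL_UNICODE / -C value into its flag letters.
decode_perl_unicode() {
    value=$1
    flags=""
    for pair in I:1 O:2 E:4 i:8 o:16 A:32 L:64 a:256; do
        letter=${pair%%:*}   # part before the colon
        bit=${pair##*:}      # part after the colon
        if [ $(( value & bit )) -ne 0 ]; then
            flags="$flags$letter"
        fi
    done
    printf '%s\n' "$flags"
}

decode_perl_unicode 31    # IOEio
decode_perl_unicode 63    # IOEioA
decode_perl_unicode 127   # IOEioAL
```

Only 127 includes L, the flag that makes the I/O layers depend on the locale
environment variables, which is why the LC_ALL=C run behaves differently.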

Version-Release number of selected component (if applicable):
All perl versions

How reproducible:
100%

Additional Information:

This is upstream perl bug 37646 (http://rt.perl.org/rt3/index.html?q=37646).
Comment 1 Robin Norwood 2006-10-01 19:34:26 EDT
Assigning to rnorwood@redhat.com.
Comment 2 Marcela Mašláňová 2008-03-06 10:45:26 EST
The upstream perl bug has been closed. The fix should arrive in perl-5.10 soon.
