Bug 157071 - Perl doesn't lowercase accented characters in UTF-8
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: perl
Version: 4
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Jason Vas Dias
QA Contact: David Lawrence
Reported: 2005-05-06 12:27 EDT by Horst H. von Brand
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2005-11-08 17:32:20 EST
Description Horst H. von Brand 2005-05-06 12:27:46 EDT
Description of problem:
Perl doesn't lowercase accented characters. I'm running with LANG=en_US.UTF-8.
The steps given below are copy-pasted from a Gnome terminal.

Version-Release number of selected component (if applicable):
perl-5.8.6-10

How reproducible:
Always

Steps to Reproduce:
1. perl -e 'print lc "ÁÉÜÒÊ\n"'
2. perl -e 'use locale; print lc "ÁÉÜÒÊ\n"'
Actual results:
ÁÉÜÒÊ

Expected results:
áéüòê

Additional info:
With:

   perl -e 'use utf8; print lc "ÁÉÜÒÊ\n"'

there is no visible output, od(1) shows junk:

   perl -e 'use locale; print lc "ÁÉÜÒÊ\n"' | od -c
   0000000 303 201 303 211 303 234 303 222 303 212  \n
   0000013
Comment 1 Jason Vas Dias 2005-11-08 17:32:20 EST
Yes, I know the Perl Unicode implementation is far from user-friendly
or intuitive - this is an upstream issue that is being addressed - but
it does work (just) if used correctly.

Perl's lc / uc DO work for UTF-8, IF the UTF-8 is properly encoded, AND perl is
running in wide-character mode, AND the characters have defined upper/lower
case counterparts in your current locale.

These examples should expose the issues - I suggest you also read the
perlunicode and perllocale man pages.

$ perl -C -e 'use locale; use utf8; use Encode qw(decode); 
$s=decode(utf8,"\xc5\x99\xc4\x9b"); print uc $s,"\n";'
ŘĚ

$ perl -C -e 'use locale; use utf8; use Encode qw(decode); 
$s=decode(utf8,"\xc5\x99\xc4\x9b"); print  $s,"\n";'
řě

$ perl -e 'use Encode qw(decode);  $s=decode(utf8,"\xc5\x99\xc4\x9b"); print 
$s,"\n";'
Wide character in print at -e line 1.
řě

$ perl -C -e 'use Encode qw(decode);  $s=decode(utf8,"\xc5\x99\xc4\x9b"); print
 $s,"\n";'
řě

$ PERL_UNICODE=31 perl -e 'use Encode qw(decode);
$s=decode(utf8,"\xc5\x99\xc4\x9b"); print uc $s,"\n";'
ŘĚ

$ PERL_UNICODE=31 perl -e 'use Encode qw(decode);  $s=decode(utf8,"ŘĚ"); print
lc $s,"\n";'
řě


