From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; Linux i686; U;) Gecko/20020827

Description of problem:
Including Locale::Language in a Perl program while LANG is set to en_US.UTF-8 produces warnings.

Version-Release number of selected component (if applicable):
5.8.0-55

How reproducible:
Always

Steps to Reproduce:
1. LANG=en_US.UTF-8 perl -we 'use Locale::Language'

Actual Results:
Malformed UTF-8 character (unexpected end of string) at /usr/lib/perl5/5.8.0/Locale/Language.pm line 115, <DATA> line 109.
Malformed UTF-8 character (unexpected end of string) at /usr/lib/perl5/5.8.0/Locale/Language.pm line 117, <DATA> line 109.
Malformed UTF-8 character (unexpected non-continuation byte 0x6c, immediately after start byte 0xe5) in lc at /usr/lib/perl5/5.8.0/Locale/Language.pm line 117, <DATA> line 109.
Malformed UTF-8 character (unexpected end of string) at /usr/lib/perl5/5.8.0/Locale/Language.pm line 115, <DATA> line 178.
Malformed UTF-8 character (unexpected end of string) at /usr/lib/perl5/5.8.0/Locale/Language.pm line 117, <DATA> line 178.
Malformed UTF-8 character (unexpected non-continuation byte 0x6b, immediately after start byte 0xfc) in lc at /usr/lib/perl5/5.8.0/Locale/Language.pm line 117, <DATA> line 178.

Expected Results:
You shouldn't see any warnings (unset LANG and re-run the same command to compare).

Additional info:
The good news is:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=rt-17439-38139.10.5873486677133%40bugs6.perl.org&rnum=3&prev=/groups%3Fq%3DMalformed%2BUTF-8%2Bcharacter%2B(unexpected%2Bend%2Bof%2Bstring)%26meta%3Dsite%253Dgroups

The patch included there has been applied to perl, but I can't verify that because I don't have a login to perl.org.

Actually, I found this bug while dealing with another one. More on that later, because I haven't found the culprit yet and can't submit an incomplete bug.
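For readers wondering what the warnings mean: the two byte pairs they name (0xe5 followed by 0x6c, and 0xfc followed by 0x6b) are not valid UTF-8, but they would read as "ål" and "ük" in Latin-1. That suggests, as an assumption not confirmed in this report, that the module's DATA section contains Latin-1 text that is being interpreted as UTF-8 under the UTF-8 locale. The sketch below is in Python rather than Perl and only illustrates the byte-level problem; the helper name `decode_report` is hypothetical.

```python
# Byte pairs taken from the warnings above.
# Reading them as Latin-1 is an assumption made for illustration.
SAMPLES = [b"\xe5l", b"\xfck"]


def decode_report(raw: bytes) -> dict:
    """Decode raw as Latin-1 and attempt UTF-8, returning both outcomes."""
    report = {
        "utf8": None,
        "utf8_error": None,
        "latin1": raw.decode("latin-1"),  # Latin-1 maps every byte, never fails
    }
    try:
        report["utf8"] = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Python's wording differs from Perl's, but the cause is the same:
        # the byte sequence is not well-formed UTF-8.
        report["utf8_error"] = exc.reason
    return report


if __name__ == "__main__":
    for raw in SAMPLES:
        print(raw, decode_report(raw))
```

Both samples fail to decode as UTF-8 and decode cleanly as Latin-1, which is consistent with the warnings appearing only when LANG selects a UTF-8 locale.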
This is a very urgent issue. There is already a working patch, so I suggest Red Hat publish an update quite soon. I have a default setup of RH8.0 and ran into this problem when trying my web Perl script on this box: it prints these errors.
I have integrated the patch from upstream (perl change 17927). There are other issues preventing an immediate errata of Perl itself, however. If you would like to test a candidate package, it can be arranged, but please be aware it would be unsupported.
A package fixing this and other UTF-8 issues should be in Rawhide soon (and should recompile on stock 8.0 with no trouble).
Now with recent RawHide (I tested -79, -81, -82), regexps with UTF-8 seem to fail:

[mirror@entropy home]$ LANG=en_US mirror packages/rawhide.srpm
package=RawHide alviss.et.tudelft.nl:/pub/redhat/rawhide/SRPMS/SRPMS -> /usr/local/mirror/redhat/rawhide/SRPMS/SRPMS
No files to transfer
[mirror@entropy home]$ LANG=en_US.UTF-8 mirror packages/rawhide.srpm
unknown input in "/etc/mirror.defaults" line 10 of: package=defaults
unknown keyword in "/etc/mirror.defaults" line 10 of:
[mirror@entropy home]$ rpm -q mirror
mirror-2.9-11
[mirror@entropy home]$ rpm -q perl
perl-5.8.0-82

The regexp in question is:

/^\s*([^\s=+]+)\s*([=+])(.*)?$/

If I go back to perl-5.8.0-73, all is fine again.
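To make the expected behaviour concrete, here is a minimal sketch of what that regexp should capture from the line `package=defaults` quoted in the error message. It uses Python's `re` module, not Perl's regex engine, so it shows only the intended parse and does not reproduce the regression; the function name `parse_setting` is hypothetical and is not taken from mirror.

```python
import re

# Same pattern as in the comment above, written as a Python raw string.
LINE_RE = re.compile(r"^\s*([^\s=+]+)\s*([=+])(.*)?$")


def parse_setting(line: str):
    """Return (keyword, operator, value) if the line matches, else None."""
    match = LINE_RE.match(line)
    return match.groups() if match else None


if __name__ == "__main__":
    # The line that the UTF-8 locale run reports as "unknown input".
    print(parse_setting("package=defaults"))
    # A line with no '=' or '+' separator is expected not to match.
    print(parse_setting("no separator"))
```

A correct engine yields the keyword `package`, the operator `=`, and the value `defaults`. The "unknown input" message in the UTF-8 run suggests the match failed on that same line.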
See bug #82652 for a patch.
Fix confirmed in perl-5.8.3-10