Bug 80113

Summary: mknmz finds malformed characters in templates and crashes
Product: [Retired] Red Hat Linux
Reporter: Robert Myers <rmyers1400>
Component: namazu
Assignee: Akira TAGOH <tagoh>
Status: CLOSED RAWHIDE
QA Contact: Bill Huang <bhuang>
Severity: medium
Priority: medium
Version: 8.0
CC: rmyers1400
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2003-01-24 08:18:41 UTC

Description Robert Myers 2002-12-20 05:08:26 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
[root@warp rmyers]# mknmz -V -a -L=en

Malformed UTF-8 character (unexpected continuation byte 0xbb, with no preceding
start byte) in require at /usr/share/namazu/pl/wakati.pl line 56.

Malformed UTF-8 character (1 byte, need 3, after start byte 0xec) in require at
/usr/share/namazu/pl/wakati.pl line 56.

Malformed UTF-8 character (unexpected continuation byte 0xa1, with no preceding
start byte) at /usr/share/namazu/pl/gfilter.pl line 97.

Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) at /usr/share/namazu/pl/gfilter.pl line 97.

Malformed UTF-8 character (unexpected continuation byte 0xa1, with no preceding
start byte) at /usr/share/namazu/pl/gfilter.pl line 97.

Malformed UTF-8 character (unexpected continuation byte 0xa2, with no preceding
start byte) at /usr/share/namazu/pl/gfilter.pl line 97.

Malformed UTF-8 character (unexpected continuation byte 0xa2, with no preceding
start byte) at /usr/share/namazu/filter/hnf.pl line 72.

AND SO ON AND SO FORTH, ENDING WITH THE ERROR MESSAGE:

Unmatched ( in regex; marked by <-- HERE in m/(^\s*( <-- HERE
Date:|Subject:|Message-IDUnmatched:|From:|/ at
/usr/share/namazu/filter/mailnews.pl line 212.

Compilation failed in require at /usr/bin/mknmz line 399.



Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. mknmz -V -a -L=en

Actual Results:  See description of problem

Expected Results:  Should create an index file.  Works fine under RH 7.3
distribution.

Additional info:

Platform is a P4.  Working platform (RH 7.3) is a Tualatin P3 Celeron.

Comment 1 Robert Myers 2002-12-20 05:14:16 UTC
System is completely up-to-date using RHN as of Dec 20, 2002.  All available
components of RH 8.0 professional were installed and have been updated using RHN.

Comment 2 Akira TAGOH 2002-12-24 07:31:04 UTC
Could you attach a sample file to reproduce this problem?

Comment 3 Robert Myers 2003-01-09 00:04:27 UTC
I don't know what file I would send, since namazu is apparently finding errors
in the template files, which are exactly what anaconda installed from the
distribution disk.

I think the problem is locale-related.  mknmz works fine on the RH 7.2 machine
with en_US.iso885915.  It crashes on RH 8.0 with en_US.UTF-8, reporting malformed
UTF-8 characters (apparently in the namazu templates). 

In other words, I think there is something going wrong in the multi-language
support, and/or the namazu templates were created assuming a different locale.

RH took away the locale chooser GUI, so I am currently fumbling around to see if
I can change the locale to en_US.iso885915.
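
One way to test the locale hypothesis is to check whether the files named in the warnings decode as UTF-8 at all. The following is only a sketch, assuming an iconv command is available on the system; the helper name is_valid_utf8 is hypothetical and not part of namazu, and the two paths are simply the ones that appear in the warnings above.

```shell
# Hypothetical helper (not part of namazu): succeeds only if the file
# decodes cleanly as UTF-8. Assumes an iconv command is installed.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Show the locale variables that Perl will inherit from this shell.
printf 'LANG=%s LC_ALL=%s\n' "${LANG:-}" "${LC_ALL:-}"

# Paths taken from the warnings in the description; a missing file is
# also reported as "not valid UTF-8" because iconv fails on it.
for f in /usr/share/namazu/pl/wakati.pl /usr/share/namazu/pl/gfilter.pl; do
    if is_valid_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8"
    fi
done
```

If the files fail this check, that would be consistent with them having been written in some other encoding and being misread under a UTF-8 locale, though it does not by itself prove where the fault lies.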

Comment 4 Robert Myers 2003-01-09 01:45:35 UTC
The problem is with Perl, not with Namazu.

LC_ALL=en_US.ISO8859-1
export LC_ALL

as suggested in man perllocale allows namazu to run (apparently correctly). 
It's still grinding away, and it will take a while to finish.  When it has run
to completion and is apparently producing correct results, I'll close out the
report.
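
The same override can probably be limited to a single command instead of being exported for the whole shell session. A minimal sketch, assuming the en_US.ISO8859-1 locale is installed; the wrapper name run_with_latin1 is hypothetical and only for illustration.

```shell
# Hypothetical wrapper: run one command with LC_ALL overridden via env,
# leaving the calling shell's own locale settings untouched.
run_with_latin1() {
    env LC_ALL=en_US.ISO8859-1 "$@"
}

# Intended use with the command from this report (left commented out
# because mknmz may not be installed where this is tried):
#   run_with_latin1 mknmz -V -a -L=en
```

Using env keeps the assignment scoped to the child process, so other programs started from the same shell still see the system default locale.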

Comment 5 Akira TAGOH 2003-01-24 08:18:41 UTC
Should be fixed in 2.0.12-5.