Bug 88148

Summary: strtok.3 man page contains illegal UTF-8 encoding; iconv: illegal input sequence at position 1292
Product: [Retired] Red Hat Linux
Component: man-pages
Version: 9
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Target Milestone: ---
Target Release: ---
Reporter: Charles R. Anderson <cra>
Assignee: Eido Inoue <havill>
QA Contact: Ben Levenson <benl>
CC: emmanuel.thome, mitr
Doc Type: Bug Fix
Last Closed: 2006-02-21 18:52:30 UTC
Attachments:
Description                                       Flags
Proposed patch to fix strtok.3 man page           none
Script to find problematic manual pages           none
Sample output of badman.out, from RHL 9 system    none

Description Charles R. Anderson 2003-04-06 21:25:17 UTC
Description of problem:

"man strtok" returns this error:

illegal input sequence at position 1292

Version-Release number of selected component (if applicable):

1.53-3

How reproducible:

Always

Steps to Reproduce:
1. Open xterm
2. Type "man strtok"
    
Actual results:

>man strtok
iconv: illegal input sequence at position 1292

Expected results:

The man page should be displayed.

Additional info:

I believe I have identified the source of this problem.  After ungzipping the
man page strtok.3.gz and running nroff on it manually, I get this:

>nroff strtok.3
iconv: illegal input sequence at position 1190

The byte at this position in the file is hex E1, which is ISO-8859-1 for an 'a'
with an acute accent (á).  I believe the correct UTF-8 sequence for that
character is C3 A1.
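
The byte-level claim in the description can be checked with a short Python
sketch. It is only an illustration and does not touch the man page itself; the
variable names are made up for this example.

```python
# Byte 0xE1 is "á" (a with acute accent) in ISO-8859-1 (Latin-1).
latin1_byte = b"\xe1"
char = latin1_byte.decode("iso-8859-1")

# The same character encoded as UTF-8 is the two-byte sequence C3 A1.
utf8_bytes = char.encode("utf-8")
print(char, utf8_bytes.hex())

# A lone 0xE1 is not valid UTF-8: it is the lead byte of a three-byte
# sequence, so a strict UTF-8 decoder rejects it when the expected
# continuation bytes do not follow.
try:
    latin1_byte.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False
print(lone_byte_is_valid_utf8)
```

This matches the symptom reported above: a tool that expects UTF-8 input stops
at the first Latin-1 byte it cannot interpret.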

Comment 1 Charles R. Anderson 2003-04-06 21:26:56 UTC
Created attachment 90939
Proposed patch to fix strtok.3 man page

Comment 2 Emmanuel Thomé 2003-04-09 16:17:02 UTC
This is not specific to strtok.

Actually, *many* man pages are unreadable with RH9. fontconfig is one of them
(illegal input char at position 205). cdrecord is another one. test is yet
another one...

It's wrong in the first place to assume that man pages contain UTF-8 data.
99.95% of the pages that are not plain 7-bit ASCII contain ISO Latin-1
(ISO-8859-1) characters (like the copyright sign, for instance).

Therefore, the culprit is the /usr/bin/nroff script. The UTF-8 on line 112
should not be there, IMHO. Otherwise, RH has to take on the work of converting
all the man pages to UTF-8.

Cheers,

E.
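
For reference, the conversion mentioned in Comment 2 could look roughly like
the Python sketch below. It assumes that any page which is not already valid
UTF-8 is ISO-8859-1, which is the commenter's estimate and not a verified
property of every page. The function name to_utf8 is hypothetical, and this is
not what the nroff script does.

```python
def to_utf8(data: bytes) -> bytes:
    """Return data as UTF-8, assuming any non-UTF-8 input is ISO-8859-1."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (this includes plain 7-bit ASCII)
    except UnicodeDecodeError:
        # Assumption: legacy pages are Latin-1. Every byte value maps to a
        # code point in that encoding, so this decode cannot fail.
        return data.decode("iso-8859-1").encode("utf-8")


# The Latin-1 copyright sign (byte A9) becomes the UTF-8 sequence C2 A9.
converted = to_utf8(b"Copyright \xa9 2003")
print(converted)
```

Pages that are already UTF-8 or pure ASCII pass through unchanged, so running
the function over a mixed collection would not double-encode the good pages.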

Comment 3 Steve Coile 2003-05-22 14:18:36 UTC
Created attachment 91893
Script to find problematic manual pages

Having encountered the "iconv" error with "man", and seeing that others are
having the same problem, I thought it would be useful to find out which other
pages suffer from it.  So this script was born.
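
The attached script is not reproduced in this report. As a rough idea of how
such a scan could work, here is a minimal Python sketch. It is not the attached
badman.sh, and the function names and the choice to test for valid UTF-8 are
assumptions made for this illustration. It reports a byte offset into the
uncompressed page source, which need not equal the position iconv prints.

```python
import gzip
import os


def first_bad_offset(data: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return err.start


def find_bad_pages(root: str):
    """Yield (path, offset) for each file under root that is not valid UTF-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Man pages are usually gzip-compressed; read others as-is.
            opener = gzip.open if name.endswith(".gz") else open
            try:
                with opener(path, "rb") as fh:
                    data = fh.read()
            except (OSError, EOFError):
                continue  # unreadable or corrupt file; skip it
            offset = first_bad_offset(data)
            if offset is not None:
                yield path, offset
```

Calling find_bad_pages on the man page directory (for example /usr/share/man,
an assumed location) and printing each result gives a list of candidate pages
to inspect.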

Comment 4 Steve Coile 2003-05-22 14:20:32 UTC
Created attachment 91894
Sample output of badman.out, from RHL 9 system

Sample of the output generated by the attached "badman.sh" script on a Red Hat
Linux 9 system.  The RHL 9 system is not a full install, so a great many man
pages included with 9 are not represented.

Comment 5 Eido Inoue 2003-07-11 18:11:39 UTC

*** This bug has been marked as a duplicate of 98969 ***

Comment 6 Red Hat Bugzilla 2006-02-21 18:52:30 UTC
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.