Bug 88148 - strtok.3 man page contains illegal UTF-8 encoding; iconv: illegal input sequence at position 1292
Keywords:
Status: CLOSED DUPLICATE of bug 98969
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: man-pages
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Eido Inoue
QA Contact: Ben Levenson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-04-06 21:25 UTC by Charles R. Anderson
Modified: 2007-04-18 16:52 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2006-02-21 18:52:30 UTC
Embargoed:


Attachments
Proposed patch to fix strtok.3 man page (520 bytes, patch)
2003-04-06 21:26 UTC, Charles R. Anderson
Script to find problematic manual pages (500 bytes, text/plain)
2003-05-22 14:18 UTC, Steve Coile
Sample output of badman.out, from RHL 9 system (132.27 KB, text/plain)
2003-05-22 14:20 UTC, Steve Coile

Description Charles R. Anderson 2003-04-06 21:25:17 UTC
Description of problem:

"man strtok" returns this error:

illegal input sequence at position 1292

Version-Release number of selected component (if applicable):

1.53-3

How reproducible:

Always

Steps to Reproduce:
1. Open xterm
2. Type "man strtok"
    
Actual results:

>man strtok
iconv: illegal input sequence at position 1292

Expected results:

The man page should be displayed.

Additional info:

I believe I have identified the source of this problem.  After ungzipping the man
page strtok.3.gz and running nroff on it manually, I get this:

>nroff strtok.3
iconv: illegal input sequence at position 1190

The byte at this position in the file is hex E1, which should display as an 'a'
with an acute accent (á).  I believe the correct UTF-8 sequence for that
character is C3 A1.
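
The byte values above can be checked with a short Python sketch. It assumes the stray byte is ISO-8859-1 (Latin-1) encoded, which the report suggests but does not confirm; the variable names are illustrative only.

```python
# Assumption: the offending byte in strtok.3 is ISO-8859-1 encoded.
raw = b"\xe1"

# On its own, 0xE1 is the lead byte of a three-byte UTF-8 sequence,
# so decoding it as UTF-8 fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Interpreted as ISO-8859-1, the same byte is U+00E1 (a with acute accent).
char = raw.decode("iso-8859-1")

# Re-encoding that character as UTF-8 gives the two bytes C3 A1.
utf8_bytes = char.encode("utf-8")

print(valid_utf8)        # False
print(utf8_bytes.hex())  # c3a1
```

This matches the C3 A1 sequence proposed in the description.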

Comment 1 Charles R. Anderson 2003-04-06 21:26:56 UTC
Created attachment 90939
Proposed patch to fix strtok.3 man page

Comment 2 Emmanuel Thomé 2003-04-09 16:17:02 UTC
This is not specific to strtok.

Actually, *many* man pages are unreadable with RH9. fontconfig is one of them
(illegal input char at position 205). cdrecord is another one. test is yet
another one...

It's wrong in the first place to assume that man pages contain UTF-8 data.
99.95% of the pages not in plain 7-bit ASCII contain ISO Latin-1 chars (like the
copyright sign, for instance).

Therefore, the culprit is the /usr/bin/nroff script. The UTF-8 on line 112
should not be there, IMHO. Otherwise, RH needs to take on the work of converting
all the man pages to UTF-8.

Cheers,

E.
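
One way to tolerate both encodings, rather than assuming either, is to try UTF-8 first and fall back to ISO-8859-1. The sketch below only illustrates that idea in Python; it is not what the /usr/bin/nroff script does, and decode_man_source is a hypothetical helper name.

```python
def decode_man_source(data: bytes) -> str:
    """Decode man page source, preferring UTF-8 and falling back to ISO-8859-1.

    Hypothetical helper for illustration; not part of man, nroff, or groff.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value to a code point, so this cannot fail.
        return data.decode("iso-8859-1")
```

The fallback is a heuristic: a Latin-1 page whose high bytes happen to form valid UTF-8 would be misread, although that is rare in ordinary text.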

Comment 3 Steve Coile 2003-05-22 14:18:36 UTC
Created attachment 91893
Script to find problematic manual pages

Having encountered the "iconv" error with "man", and seeing that others are
having the same problem, I thought it might be useful to see what other pages
suffered from the problem.  So this script was born.
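
The attachment itself is not reproduced in this report. As a rough illustration of the same idea, and not the attached script, the Python sketch below walks a directory tree and reports files whose contents are not valid UTF-8. All function names are hypothetical, and the reported offsets are raw byte offsets in the file, which need not match the positions iconv prints when run from the nroff script.

```python
import gzip
import os
import zlib
from typing import Iterator, Optional, Tuple


def first_invalid_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def read_page(path: str) -> bytes:
    """Read a man page file, decompressing it if the name ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        return handle.read()


def find_bad_pages(root: str) -> Iterator[Tuple[str, int]]:
    """Yield (path, offset) for each file under root that is not valid UTF-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                offset = first_invalid_offset(read_page(path))
            except (OSError, EOFError, zlib.error):
                continue  # unreadable file or corrupt gzip data: skip it
            if offset is not None:
                yield path, offset
```

Iterating over find_bad_pages("/usr/share/man") would list the affected pages on a given system; that path is an assumption about where the pages are installed.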

Comment 4 Steve Coile 2003-05-22 14:20:32 UTC
Created attachment 91894
Sample output of badman.out, from RHL 9 system

Sample output generated by the attached "badman.sh" script on a Red Hat Linux 9
system.  The system is not a full install, so a great many man pages shipped
with RHL 9 are not represented.

Comment 5 Eido Inoue 2003-07-11 18:11:39 UTC

*** This bug has been marked as a duplicate of 98969 ***

Comment 6 Red Hat Bugzilla 2006-02-21 18:52:30 UTC
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.

