Bug 88148 - strtok.3 man page contains illegal UTF-8 encoding; iconv: illegal input sequence at position 1292
Status: CLOSED DUPLICATE of bug 98969
Product: Red Hat Linux
Classification: Retired
Component: man-pages
Version: 9
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Eido Inoue
QA Contact: Ben Levenson
Reported: 2003-04-06 17:25 EDT by Charles R. Anderson
Modified: 2007-04-18 12:52 EDT (History)
Doc Type: Bug Fix
Last Closed: 2006-02-21 13:52:30 EST

Attachments
Proposed patch to fix strtok.3 man page (520 bytes, patch)
2003-04-06 17:26 EDT, Charles R. Anderson
Script to find problematic manual pages (500 bytes, text/plain)
2003-05-22 10:18 EDT, Steve Coile
Sample output of badman.out, from RHL 9 system (132.27 KB, text/plain)
2003-05-22 10:20 EDT, Steve Coile

Description Charles R. Anderson 2003-04-06 17:25:17 EDT
Description of problem:

"man strtok" returns this error:

illegal input sequence at position 1292

Version-Release number of selected component (if applicable):

1.53-3

How reproducible:

Always

Steps to Reproduce:
1. Open xterm
2. Type "man strtok"
    
Actual results:

>man strtok
iconv: illegal input sequence at position 1292

Expected results:

The man page should be displayed.

Additional info:

I believe I have identified the source of this problem.  After ungzipping the
man page strtok.3.gz and running nroff on it manually, I get this:

>nroff strtok.3
iconv: illegal input sequence at position 1190

The byte at this position in the file is hex E1, which is the ISO-8859-1
encoding of 'á' (an 'a' with an acute accent).  I believe the correct UTF-8
sequence for that character is C3 A1.
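
The byte values above can be checked with a few lines of Python. This is only a sketch for illustration and is not part of the proposed patch; the helper name is_valid_utf8 is made up for this example.

```python
# Byte 0xE1 is "á" in ISO-8859-1, but on its own it is not valid UTF-8.
raw = b"\xe1"

# Decode as ISO-8859-1 to get the character, then re-encode as UTF-8.
text = raw.decode("iso-8859-1")
utf8 = text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(utf8.hex())            # hex digits of the UTF-8 encoding of U+00E1
print(is_valid_utf8(raw))    # the lone ISO-8859-1 byte
print(is_valid_utf8(utf8))   # the re-encoded sequence
```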
Comment 1 Charles R. Anderson 2003-04-06 17:26:56 EDT
Created attachment 90939
Proposed patch to fix strtok.3 man page
Comment 2 Emmanuel Thomé 2003-04-09 12:17:02 EDT
This problem is not specific to strtok.

Actually, *many* man pages are unreadable with RH9. fontconfig is one of them
(illegal input char at position 205). cdrecord is another one. test is yet
another one...

It's wrong in the first place to assume that man pages contain UTF-8 data.
99.95% of the pages that are not plain 7-bit ASCII contain ISO-8859-1 (Latin-1)
characters (like the copyright sign, for instance).

Therefore, the culprit is the /usr/bin/nroff script. The UTF-8 on line 112
should not be there, IMHO. Otherwise, RH has to take on the work of converting
all the man pages to UTF-8.
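
For what it's worth, the second option (converting the pages) could look roughly like the sketch below. It assumes, as argued above, that any page which is not valid UTF-8 is ISO-8859-1; the function name to_utf8 and its fallback parameter are invented for this example and are not taken from the nroff script or any Red Hat tool.

```python
def to_utf8(data: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return data as UTF-8 bytes.

    Input that already decodes as UTF-8 (including plain ASCII) is returned
    unchanged. Anything else is ASSUMED to be in the fallback encoding and
    is re-encoded; that assumption will be wrong for pages in other charsets.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode(fallback).encode("utf-8")


# A Latin-1 copyright sign (0xA9) becomes the two-byte UTF-8 sequence.
print(to_utf8(b"\xa9 2003").hex())
```

Checking for valid UTF-8 first keeps the conversion safe to run twice on the same page, since an already converted page passes through untouched.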

Cheers,

E.
Comment 3 Steve Coile 2003-05-22 10:18:36 EDT
Created attachment 91893
Script to find problematic manual pages

Having encountered the "iconv" error with "man", and seeing that others are
having the same problem, I thought it might be useful to see what other pages
suffered from the problem.  So this script was born.
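
The attached script itself is not reproduced here. The same idea can be sketched in Python as below; this is not the attachment, and the function names, the "*.gz" pattern, and the choice to report a byte offset are all assumptions made for this example.

```python
import gzip
from pathlib import Path


def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def find_bad_pages(root):
    """Yield (path, offset) for each gzipped file under root that is not valid UTF-8."""
    for path in sorted(Path(root).rglob("*.gz")):
        try:
            with gzip.open(path, "rb") as fh:
                data = fh.read()
        except (OSError, EOFError):
            continue  # unreadable or not really gzip; skipped in this sketch
        offset = first_invalid_utf8_offset(data)
        if offset is not None:
            yield path, offset
```

Calling list(find_bad_pages("/usr/share/man")) would list the affected pages, assuming the pages live under that directory. The offset is counted in the decompressed file and need not match the position that man prints; the description above shows 1292 via man but 1190 via nroff for the same page.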
Comment 4 Steve Coile 2003-05-22 10:20:32 EDT
Created attachment 91894
Sample output of badman.out, from RHL 9 system

Sample of output generated by the attached "badman.sh" script, run on a Red Hat
Linux 9 system.  The RHL 9 system is not a full install, so a great many man
pages included with 9 are not represented.
Comment 5 Eido Inoue 2003-07-11 14:11:39 EDT

*** This bug has been marked as a duplicate of 98969 ***
Comment 6 Red Hat Bugzilla 2006-02-21 13:52:30 EST
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.
