Bug 108991 - non UTF-8 encoding of man-pages
Product: Fedora
Classification: Fedora
Component: man-pages
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Eido Inoue
Ben Levenson
Reported: 2003-11-03 21:42 EST by Maciej Żenczykowski
Modified: 2007-11-30 17:10 EST
Fixed In Version: 1.64-1
Doc Type: Bug Fix
Last Closed: 2003-12-15 17:12:56 EST

Description Maciej Żenczykowski 2003-11-03 21:42:42 EST
Version-Release number of selected component (if applicable):
man-pages-1.53-3 (RedHat 9, fully up2date system)

How reproducible: Always

Description of problem:

The files reported by the following command:
rpm -q --filesbypkg man-pages | sed "s@.*/usr@/usr@" | while read i;
do zcat "$i" | iconv -f UTF-8 -t ISO-8859-1 2>/dev/null >/dev/null ||
echo "$i"; done


fail conversion from UTF-8 to <anything> because they are not UTF-8
encoded.  This matters because man performs this conversion itself,
so e.g. "man 2 close" and others print "iconv: illegal input sequence
at position ####" error messages.  I've also seen this error in
Polish-language manual pages.  Furthermore, the above list may be
incomplete: it only catches man pages containing byte sequences that
are invalid in UTF-8 (and thus obviously wrong), not pages in another
encoding whose bytes happen to form valid UTF-8.
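
A minimal sketch of a stricter check, assuming an iconv that rejects malformed input with a non-zero exit status (as glibc's does). The helper name is_valid_utf8 is hypothetical. Converting UTF-8 to UTF-8 only validates the bytes, so it does not also flag valid UTF-8 characters that simply have no ISO-8859-1 equivalent, which the -t ISO-8859-1 variant above would.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if stdin is well-formed UTF-8.
# UTF-8 -> UTF-8 makes iconv validate the input without depending on
# what the target character set can represent.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# A lone 0xE9 byte (ISO-8859-1 "e acute") is not a valid UTF-8 sequence.
if printf 'caf\351\n' | is_valid_utf8; then
    echo "valid UTF-8"
else
    echo "not UTF-8"
fi
```

The limitation described above still applies: a page in another encoding whose bytes happen to form valid UTF-8 sequences passes this check.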

Either /usr/bin/nroff should be changed to not use iconv -f UTF-8, or
all man pages should be converted to UTF-8 (or some auto-detection
should be added).

This is very annoying as it makes many man pages useless.  LC_ALL and
similar settings change nothing, because the problem lies in the input
encoding (which isn't UTF-8 as expected), not in the output encoding,
which can be changed via locale settings.
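
One way the auto-detection idea could look, as a rough sketch only: try the input as UTF-8 first and fall back to ISO-8859-1 when validation fails. The function name to_utf8_guess is hypothetical, and the ISO-8859-1 fallback is an assumption that would be wrong for pages in other legacy encodings (e.g. the Polish pages, which are more likely ISO-8859-2). This is not how /usr/bin/nroff is actually implemented.

```shell
#!/bin/sh
# Hypothetical filter: emit stdin as UTF-8, guessing the input encoding.
# Assumption: anything that is not well-formed UTF-8 is ISO-8859-1.
to_utf8_guess() {
    tmp=$(mktemp) || return 1
    cat >"$tmp"
    if iconv -f UTF-8 -t UTF-8 "$tmp" >/dev/null 2>&1; then
        cat "$tmp"                            # already UTF-8, pass through
    else
        iconv -f ISO-8859-1 -t UTF-8 "$tmp"   # legacy page, convert
    fi
    status=$?
    rm -f "$tmp"
    return "$status"
}

printf 'caf\351\n' | to_utf8_guess       # ISO-8859-1 input
printf 'caf\303\251\n' | to_utf8_guess   # UTF-8 input
```

Both calls should produce the same UTF-8 bytes. The input is buffered in a temporary file because it has to be read twice, once to validate and once to emit.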
Comment 1 Maciej Żenczykowski 2003-11-06 10:02:22 EST
Still present in Fedora Core 1
Comment 2 Roozbeh Pournader 2003-11-26 13:54:43 EST
To reproduce the bug, one can run "man iso_8859_1" in a
gnome-terminal. The character table is just full of question marks
instead of different characters.
Comment 3 Eido Inoue 2003-11-26 21:11:49 EST
The way encoding is handled has changed since RHL 9, which expected the
source character set of the man pages for Western European language
locales to be ISO-8859-1 (which was then converted to UTF-8).

They do need to be re-encoded.
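
For illustration only, re-encoding a single compressed page might look like the sketch below. The helper name reencode_page and the demo file are hypothetical, and it assumes the page really is ISO-8859-1; this is not the change that went into the package.

```shell
#!/bin/sh
# Hypothetical helper: rewrite one gzip-compressed man page as UTF-8.
# Assumption: the page is currently ISO-8859-1. Running this on a page
# that is already UTF-8 would double-encode it.
reencode_page() {
    page=$1
    gzip -dc "$page" | iconv -f ISO-8859-1 -t UTF-8 | gzip -9 >"$page.new" &&
        mv "$page.new" "$page"
}

# Demo on a throwaway file rather than a real page under /usr.
dir=$(mktemp -d)
printf 'caf\351\n' | gzip >"$dir/demo.1.gz"
reencode_page "$dir/demo.1.gz"
gzip -dc "$dir/demo.1.gz"
rm -rf "$dir"
```

Because every byte is a valid ISO-8859-1 character, iconv cannot detect a wrong source encoding here, so pages should be checked for valid UTF-8 before conversion.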
