Bug 117053 - utf8 man pages on non-utf8 terminal
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: man
Version: rawhide
Hardware: i386 Linux
Priority: medium, Severity: medium
Assigned To: Eido Inoue
Ben Levenson
Reported: 2004-02-27 14:35 EST by Ville Herva
Modified: 2007-11-30 17:10 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-02-27 15:08:43 EST

Description Ville Herva 2004-02-27 14:35:57 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; X11; Linux i686) Opera 7.23  [en]

Description of problem:
> * Tue Feb 10 2004 Adrian Havill <havill@redhat.com> 1.5m2-2
> 
> - add all locale man pages
> - convert all msgs and manpages to utf-8
> - iconv patch no longer needed now that utf-8-to-legacy conversion is not
>   needed

I use man on non-UTF-8 terminals quite a lot (local rxvt, and over ssh connections from
Windows). After this change, the special characters in all man pages are garbled on
non-UTF-8 terminals.

Are they supposed to work any more? I don't quite follow the logic of dropping the iconv
patch (if I understand what it was doing): not all terminals are, or will be, UTF-8.
Think of a shell server: most people log on from non-UTF-8 terminals (PuTTY on Windows, etc.).

One of the failing configurations:
man-1.5m2-3
less-382-1
rxvt-2.7.8-4

COLORTERM=rxvt-xpm
LC_ALL=en_US
LESSCHARSET=latin1
TERM=rxvt

Version-Release number of selected component (if applicable):
1.5m2

How reproducible:
Always

Steps to Reproduce:
1. open up any man page on non-utf8 terminal

Actual Results:  
------------------------------------------------------------------------------------------------
NAME
       man - format and display the on-line manual pages
       manpath - determine user�<80><99>s search path for man pages
------------------------------------------------------------------------------------------------



Expected Results:
------------------------------------------------------------------------------------------------
NAME
       man - format and display the on-line manual pages
       manpath - determine user's search path for man pages
------------------------------------------------------------------------------------------------
Comment 1 Eido Inoue 2004-02-27 15:08:43 EST
The "iconv patch" referred to in the changelog was a special corner
case that only applied to Cyrillic character sets. It was a bad hack
to deal with the transition period (if there was one) where Western
European locales were in UTF-8 but everything else (especially
Cyrillic and CJK) was in local character sets, yet Cyrillic man pages
were a mix of UTF-8 and KOI8-R.

The problem with supporting more than UTF-8 is that man itself pipes
through nroff, nroff pipes through groff, and groff pipes through the
pager. Through all those levels of indirection, it is impossible for man
to know what the target terminal's character set is, which is why
everything is output as UTF-8. (It is still possible to have man pages
in non-UTF-8 encodings for backwards compatibility, but they will
get normalized to UTF-8. We can do this because we CAN determine
whether a man page is UTF-8 or not; but we cannot determine whether
the terminal is UTF-8 or not.)
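
As a rough illustration of why the encoding of a page can be detected, the sketch below checks whether a file decodes cleanly as UTF-8. It is only one possible approach and is not claimed to be what man does internally; the function name is_utf8 and the example path are assumptions made for this sketch.

```shell
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
# Converting UTF-8 to UTF-8 changes nothing, but iconv exits non-zero
# when it meets a byte sequence that is not valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical usage; the path is only an example and assumes an
# uncompressed page:
# if is_utf8 /usr/share/man/man1/man.1; then
#     echo "UTF-8"
# else
#     echo "legacy encoding"
# fi
```

Note that a pure ASCII page also passes this check, since ASCII is a subset of UTF-8. No comparable test exists for the terminal, because the terminal is not a file that can be read back.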

There is a simple workaround to your problem; simply redefine the
environment variable for PAGER (or better, MANPAGER) so that it
includes an 'iconv -f utf-8 -t your-favorite-charset |'.

Or you can create a shell script wrapper for the pager that performs
the iconv conversion and pipes it to the pager.
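
A minimal sketch of such a wrapper is shown below, written as a shell function so it can be tried out in isolation. The names man_pager, MAN_TARGET_CHARSET and MAN_REAL_PAGER are made up for this example; man knows nothing about them.

```shell
# man_pager: read UTF-8 text on stdin, convert it to a legacy charset,
# and hand the result to the real pager.
#
# MAN_TARGET_CHARSET is passed verbatim to "iconv -t". The default adds
# //TRANSLIT, a glibc iconv suffix that asks for approximations of
# characters the target charset lacks instead of stopping with an error.
# MAN_REAL_PAGER must be a single command name (no arguments).
man_pager() {
    iconv -f utf-8 -t "${MAN_TARGET_CHARSET:-latin1//TRANSLIT}" |
        "${MAN_REAL_PAGER:-less}"
}
```

Saved in an executable script whose last line calls man_pager, this could then be selected by pointing MANPAGER at that script.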

Hope these suggestions help.
Comment 2 Ville Herva 2004-02-27 15:15:40 EST
Yes it helps. Thank you very much.

I think I now understand the limitations.

I can live with the workaround.

(Perhaps it was good to bring this up, since I fear it will be asked a lot, once fc2 ships.)
Comment 3 Ville Herva 2004-02-27 15:27:04 EST
Umm, there still seems to be an issue. If I do

MANPAGER='iconv -f utf-8 -t latin1 | less' man man

I get

--------------------------------------------------------------------------------------
man(1)                                                                  man(1)


NAME
       man - format and display the on-line manual pages
       manpath - determine user
iconv: illegal input sequence at position 183
--------------------------------------------------------------------------------------

I.e. it halts when encountering the first non-plain-ASCII character.

rpm -qf =iconv
glibc-common-2.3.3-10

Comment 4 Eido Inoue 2004-02-27 15:49:52 EST
That's because the apostrophe that man is using for "user's" is a
"pretty apostrophe" (woohoo! the benefits of Unicode! pretty
apostrophes and quotes! ;) ) that does not exist in Latin-1, so iconv
refuses to convert it.

Force iconv to "dumb down" the source by adding "//translit" (for
"transliterate into approximate equivalent characters") to the end of
"latin1".
Comment 5 Ville Herva 2004-02-27 16:28:22 EST
Oh, ok. The //translit trick works. 
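
For reference, the working form of the workaround might look like the sketch below. It assumes glibc's iconv, which understands the //TRANSLIT suffix; the function name to_latin1 and the sample string are illustrative only.

```shell
# Convert UTF-8 on stdin to Latin-1, approximating characters that
# Latin-1 lacks instead of aborting at the first one.
to_latin1() {
    iconv -f utf-8 -t latin1//TRANSLIT
}

# The typographic apostrophe U+2019 is the UTF-8 byte sequence E2 80 99.
# Without //TRANSLIT, iconv stops at it; with it, glibc substitutes a
# plain ASCII approximation.
printf 'user\342\200\231s search path\n' | to_latin1

# The same idea applied to the pager setting:
# MANPAGER='iconv -f utf-8 -t latin1//TRANSLIT | less' man man
```

Other iconv implementations may handle the suffix differently, so the exact substitute character is not guaranteed.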

Now I'll just have to ponder whether or not a man can live without *pretty* 
apostrophes.

Thanks again. 

Ps: after some serious pondering I took the manly stance that I like my apostrophes 
rough. If the command was called woman, then everything would have to be pretty.
