Bug 105764

Summary: Japanese man pages are unreadable in non-UTF-8 locales
Product: [Fedora] Fedora
Reporter: Kazutoshi Morioka <morioka>
Component: man
Assignee: Eido Inoue <havill>
Status: CLOSED RAWHIDE
QA Contact: Ben Levenson <benl>
Severity: high
Priority: medium    
Version: rawhide   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 1.5k-12
Doc Type: Bug Fix
Last Closed: 2003-10-09 20:33:27 UTC

Description Kazutoshi Morioka 2003-09-27 05:03:14 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030702
Galeon/1.3.9

Description of problem:
Japanese man pages are displayed as corrupted characters in non-UTF-8 locales,
e.g. ja_JP.eucJP or ja_JP.SJIS.
ja_JP.UTF-8 is OK.

Version-Release number of selected component (if applicable):
man-1.5k-8

How reproducible:
Always

Steps to Reproduce:
1. Install the man-pages-ja package.
2. Set the language to Japanese.
3. Open gnome-terminal.
4. Change the terminal character encoding to EUC-JP.
5. Type this command:
LANG=ja_JP.eucJP man cat

Actual Results:  The man page is shown as corrupted characters and is not readable.

Expected Results:  The man page is shown in Japanese.

Additional info:

When processing a Japanese man page, groff expects the input character encoding
to match the locale.
But the Japanese locale has four encoding choices: EUC-JP (the default in Red
Hat Linux), SHIFT-JIS, ISO-2022-JP, and UTF-8 (the default in Fedora Core).
The man pages in the man-pages-ja package are encoded in UTF-8.
If a user chooses the ja_JP.eucJP or ja_JP.SJIS locale,
groff expects EUC-JP or SHIFT-JIS encoded Japanese man pages,
but the Japanese man pages are always encoded in UTF-8.
So groff cannot handle them correctly and shows corrupted man pages.
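
The mismatch can be illustrated outside of man and groff. The following is a
minimal sketch, assuming an iconv that knows EUC-JP; the sample character and
the variable names are only illustrative. The same bytes are valid when read as
UTF-8 but not when read as EUC-JP. Other UTF-8 byte sequences may instead
decode without error to the wrong characters.

```shell
# UTF-8 bytes for the hiragana letter "a" (U+3042), written in octal
# so that a plain POSIX printf can emit them.
utf8_sample='\343\201\202'

# Reading the bytes as UTF-8 succeeds.
if printf "$utf8_sample" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    as_utf8=ok
else
    as_utf8=invalid
fi

# Reading the same bytes as EUC-JP is rejected as an invalid sequence.
if printf "$utf8_sample" | iconv -f EUC-JP -t UTF-8 >/dev/null 2>&1; then
    as_eucjp=ok
else
    as_eucjp=invalid
fi

echo "read as UTF-8: $as_utf8, read as EUC-JP: $as_eucjp"
```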

This patch is not a complete fix, but it works around the problem.
--- /etc/man.config.orig     2003-09-27 03:50:19.000000000 +0900
+++ /etc/man.config  2003-09-27 03:48:34.000000000 +0900
@@ -97,7 +97,7 @@
 #
 TROFF          /usr/bin/groff -Tps -mandoc
 NROFF          /usr/bin/nroff -c -mandoc
-JNROFF         /usr/bin/groff -Tnippon -mandocj
+JNROFF         LANG=ja_JP.UTF-8 /usr/bin/groff -Tnippon -mandocj
 KNROFF         /usr/bin/groff -Tkorean -mandoc
 EQN            /usr/bin/geqn -Tps
 NEQN           /usr/bin/geqn -Tlatin1
@@ -108,7 +108,8 @@
 PIC            /usr/bin/gpic
 VGRIND
 GRAP
-PAGER          /usr/bin/less -isr
+#PAGER         /usr/bin/less -isr
+PAGER          /usr/bin/lv -i
 CAT            /bin/cat
 #
 # The command "man -a xyzzy" will show all man pages for xyzzy.

I think converting the man pages to the encoding of the current locale before
running groff would be a better solution, but /usr/bin/iconv will not detect
the current locale.
So another hack would be needed here.
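
One possible shape for such a hack is sketched below. This is only an
assumption about how it could look, not something implemented in man: the
helper name utf8_to_locale is hypothetical, and it relies on `locale charmap`
printing the character set of the current locale.

```shell
# Hypothetical helper (not part of man): convert UTF-8 text on stdin to a
# target charset. With no argument, the charset of the current locale is
# taken from `locale charmap`.
utf8_to_locale() {
    target=${1:-$(locale charmap)}
    iconv -f UTF-8 -t "$target"
}

# Possible use, with page.1 as a placeholder for a UTF-8 man page source:
#   utf8_to_locale < page.1 | /usr/bin/groff -Tnippon -mandocj
```

Passing the charset explicitly, e.g. `utf8_to_locale EUC-JP`, avoids depending
on whether the ja_JP.eucJP locale is actually installed on the system.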

Because of compatibility issues, Japanese people may want to set their
locale back to the good old ja_JP.eucJP.
This may also be important for many Asian Fedora users.

Comment 1 Eido Inoue 2003-10-09 21:06:44 UTC
Change added. However, changing from less to lv is not really desired, as many
systems do not have (or want to install) lv. What i18n functionality from lv is
needed in less? Please file a bug against less.

Comment 2 Kazutoshi Morioka 2003-10-10 01:17:16 UTC
OK, I will file a bug against less: it shows a strange error message,
"less: multi.c:1521: convert_to_ujis: Assertion `cvindex == 1' failed."
when it converts UTF-8 to EUC-JP.

Comment 3 Kazutoshi Morioka 2003-10-10 04:13:03 UTC
Oh, I misunderstood. Unfortunately, according to the less man page,
less does not seem to have any feature to detect UTF-8 encoding automatically
and convert it to EUC-JP.
I think that we Japanese users may have to set the PAGER environment variable
to lv for now.
I don't think this is a bug in less.
I will try to explore the source code of less.
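
For reference, the per-user setting mentioned above could look like the
following sketch, e.g. in a shell startup file. It assumes lv may or may not be
installed, and reuses the "lv -i" invocation from the man.config patch in the
description.

```shell
# Use lv as the pager only if it is installed; otherwise leave PAGER as it is.
if command -v lv >/dev/null 2>&1; then
    PAGER="lv -i"
    export PAGER
fi
```

With this in place, "LANG=ja_JP.eucJP man cat" would page through lv without
any change to the system-wide PAGER line in /etc/man.config.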