Red Hat Bugzilla – Bug 143787
Lynx won't display Japanese Unicode when in Japanese locales
Last modified: 2007-11-30 17:10:57 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020
Description of problem:
lynx currently does not have support for translating Japanese from legacy character sets (Shift-JIS, EUC-JP, etc.) into UTF-8. Therefore, we include a hack that causes lynx to use EUC-JP instead when run in Japanese locales, since most Japanese web content is still in one of the legacy character sets.
The version of lynx included does not support translating Japanese UTF-8 into EUC-JP, so UTF-8 webpages are displayed as gibberish. However, as of version 2.8.6dev.4, this functionality is available, so long as lynx is compiled with --enable-japanese-utf8. Unfortunately, lynx still does not translate the legacy charsets INTO UTF-8, but at least updating would allow Japanese UTF-8 webpages to be viewed by Japanese users.
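To illustrate why the mismatch produces gibberish, here is a minimal sketch (not part of lynx) that uses the standard iconv and od tools to dump the bytes of one sample string in both encodings. The sample string and the variable names are assumptions chosen for illustration, and the script is assumed to be saved as UTF-8.

```shell
# Sample text (an assumption for illustration); this script must be saved as UTF-8.
text='日本語'

# Hex dump of the bytes as a UTF-8 page would send them.
utf8_hex=$(printf '%s' "$text" | od -An -tx1 | tr -d ' \n')

# Hex dump of the same text after transcoding to EUC-JP.
eucjp_hex=$(printf '%s' "$text" | iconv -f UTF-8 -t EUC-JP | od -An -tx1 | tr -d ' \n')

echo "UTF-8 : $utf8_hex"
echo "EUC-JP: $eucjp_hex"
```

The two byte sequences differ, so a terminal expecting EUC-JP cannot render UTF-8 bytes that are passed through untranslated.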
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. rm ~/.lynxrc, if it exists
2. export LANG=ja_JP.EUC-JP
3. lynx http://bythebay.web.infoseek.co.jp/pc/mojibake61.html
Actual Results: Gibberish, mojibake, corrupted characters, whatever you want to call it
Expected Results: Legible Japanese should have been displayed. In fact, leaving LANG=en_US.UTF-8 and running lynx http://bythebay.web.infoseek.co.jp/pc/mojibake61.html should give the proper result, assuming Japanese fonts are installed.
The fix is pretty easy. Just get a recent version of lynx, at least 2.8.6dev.4, though 2.8.6dev.7 is available, stable, and has other fixes, and add --enable-japanese-utf8 to the %configure line in lynx.spec.
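For a build outside of RPM, the same switch would be passed to the configure script directly. A rough sketch, assuming an unpacked lynx source tree; the function name build_lynx_ja and its directory argument are hypothetical and not part of lynx or the package:

```shell
# Hypothetical helper: configure and build lynx with the Japanese UTF-8 switch.
# $1 is the path to an unpacked lynx source tree (2.8.6dev.4 or later).
build_lynx_ja() {
    src_dir=$1
    # Run in a subshell so the caller's working directory is left unchanged.
    ( cd "$src_dir" && ./configure --enable-japanese-utf8 && make )
}
```

It would be called with the path to the source tree as its only argument. In the RPM case the only change needed is the extra switch on the existing %configure line.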
2.8.6dev.8 is current.
Ah yes, you're right of course. Misthought that. In any case, it would be a
I think that Red Hat/Fedora policy is in general to avoid "development" versions,
but 2.8.6dev.8 is extremely stable IMO and worth updating to. Do you agree?
My patches tend to be pretty stable. But I am concerned
about the newer code for multibyte support. Briefly: I
set out to reimplement the UTF-8 logic in lynx using the
wide-character support in ncurses. The options-menu and info-page
are parts that use this (a good built-in self-test). Some
of the Debian users say the options menu isn't working well
in that configuration (and I've not found where our configs
differ). Here's a screen shot I made (they say the links
are being offset by one cell):
Well, I built 2.8.6dev.8 (with --enable-japanese-utf8), and the options menu
looks perfectly fine for me with LANG=ja_JP.utf8. It doesn't look quite like
your screenshot, though, since your screenshot to me looks like the links are
offset. (Notice how the ãï¼ is duplicated at the end in yours.)
I do have a problem if I resize my terminal. I have to quit lynx and restart it
for it to work properly; reloading doesn't help, nor does loading a new webpage.
But that's a different problem, happens in all charsets and locales, and is of
long standing in lynx.
I didn't notice the duplication (will look into that).
The only problem I had noticed was with the file
near the end: the line-length for wrapping is off by one.
I suspect this is a different case. The reports I had about
the offset were that all of the highlighted links were one
column to the left (something like that).
Resizing is a different problem - the configure script looks
for resizeterm() and ncurses. If there's something wrong
with the ifdef's it won't compile-in the code for that.
I'll hold off upgrading the RPM until it's settled down a bit then. I'd much
prefer to upgrade to a stable release than a development one.
That sounds ok. I just marked 2.8.6dev.9, which has a
number of fixes other than this area (and will be working
on xterm). My current plan for 2.8.6dev.10 is to work on
this area (to fix the two items mentioned above, at least).
I'm thinking about releasing 2.8.6 around late February,
provided that I can iron out this area.
As a slight update to this issue, 2.8.6dev.14 was released recently, and
"extend[s] experimental option --enable-japanese-utf8, allowing lynx to convert
EUC-JP and Shift_JIS strings to UTF-8"
It's a rather nice update that means that, with that option enabled, Japanese
users can browse in a Unicode Japanese locale as well and still view basically
all webpages. In addition, non-Japanese users can use lynx when in a
non-Japanese UTF-8 based locale and still read basically all Japanese webpages.
As it is currently, most Japanese webpages seem to still be using EUC-JP or
However, I don't know whether there are any big blockers left in 2.8.6dev.14. I
know upgrades to development releases are discouraged, but this is a pretty huge
fix for reading Japanese webpages.
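The kind of conversion described above can be imitated outside lynx with the iconv tool. This is only an illustrative sketch, not how lynx implements it; the helper name to_utf8 and the sample bytes are assumptions.

```shell
# Hypothetical helper: read bytes in the given charset on stdin, write UTF-8.
to_utf8() {
    iconv -f "$1" -t UTF-8
}

# The text "日本語" encoded as EUC-JP, written as octal escapes
# so that this script itself stays valid UTF-8.
printf '\306\374\313\334\270\354' | to_utf8 EUC-JP
echo

# The same text encoded as Shift_JIS.
printf '\223\372\226\173\214\352' | to_utf8 SHIFT_JIS
echo
```

Both pipelines should yield the same UTF-8 text, which is what lets a single UTF-8 locale display pages written in either legacy charset.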
There are no big blockers - my available time's been reduced,
so it's taking longer to do things. There are a few new
(since 2.8.5) display problems that I've been working to resolve.
Those get done concurrently with new bug reports, but basically
it's the new display problems that are why 2.8.6's not yet released.
Fedora Core 3 is now maintained by the Fedora Legacy project for security
updates only. If this problem is a security issue, please reopen and
reassign to the Fedora Legacy product. If it is not a security issue and
hasn't been resolved in the current FC5 updates or in the FC6 test
release, reopen and change the version to match.
Still a problem with FC5 and FC6 test1.
It's an issue with lynx 2.8.5. It's essentially fixed in 2.8.6 (by passing
--enable-japanese-utf8 when building), but that's been in .dev releases for
almost two and a half years. The 2.8.6.dev releases, such as dev.18, are very
stable and have long reached the point in my mind where the bugfixes and added
features outweigh any possible regressions. But 2.8.6 isn't officially out,
though I'm not sure what important bugs are blocking it.
The bug also hangs around as a reminder to add the build option. This would be
a highly nice thing to fix, since we've gone to UTF-8 for the default Japanese
locale for some time.
There's one annoying regression (returning to a page after
following a link with a "#" doesn't always return to the
expected point). I think I've dealt with the other issues
blocking a release, and intend working on that once I've got
past a current set of changes to xterm...
I released lynx 2.8.6 last week - updating the package should
resolve this bug report.
Fixed in lynx-2.8.6-1.
Not quite, actually. The "--enable-japanese-utf8" option still needs to be
added to configure in order to solve the original bug report, and I see that
according to CVS, it hasn't been yet. Thanks!
Thank you for your fast notice, you are right, lynx-2.8.6-1 does not fix your
lynx-2.8.6-2 uses --enable-japanese-utf8 option and should fix the bug.