Bug 208511 - "info" cannot deal with CJK characters
Summary: "info" cannot deal with CJK characters
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: texinfo
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Vitezslav Crhonek
QA Contact: Ben Levenson
URL:
Whiteboard: bzcl34nup
Depends On:
Blocks:
Reported: 2006-09-28 22:39 UTC by lynnboy
Modified: 2008-11-20 10:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-11-20 10:45:24 UTC
Type: ---
Embargoed:


Attachments
texinfo test file (189 bytes, application/octet-stream)
2006-11-14 21:56 UTC, Chip Coldwell
Make makeinfo multibyte-aware (18.68 KB, patch)
2006-12-04 21:14 UTC, Miloslav Trmač
Make makeinfo multibyte-aware (19.52 KB, patch)
2006-12-05 16:29 UTC, Miloslav Trmač
Clean up of the above patch: use the mbswidth module from gnulib (7.02 KB, patch)
2007-01-24 19:41 UTC, Miloslav Trmač
Add !HAVE_MBRTOWC fallbacks to gnulib's multibyte handling interfaces (11.31 KB, patch)
2007-02-14 17:36 UTC, Miloslav Trmač
Make info multibyte-aware (41.38 KB, patch)
2007-02-14 17:37 UTC, Miloslav Trmač

Description lynnboy 2006-09-28 22:39:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.8.0.7) Gecko/20060913 Fedora/1.5.0.7-1.fc5 Firefox/1.5.0.7 pango-text

Description of problem:
The standalone info does not recognize that a CJK character is twice as wide as an ASCII character and is indivisible. Info counts bytes rather than actual characters, and does not display them correctly.
For example, I wrote a Chinese info page, but I cannot fit 40 Chinese characters on a line; there were only 26, and when I resize the terminal many characters are broken.
By the way, "makeinfo" cannot handle my name either! It turns into a strange word.
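
The numbers in the description are consistent with bytes being counted instead of screen columns. The sketch below is not taken from info's source. It assumes an 80-column terminal (the report does not state the width) and uses Python's unicodedata.east_asian_width as a rough stand-in for a real wcwidth() lookup; the helper name display_width is hypothetical.

```python
import unicodedata

def display_width(text):
    """Approximate terminal columns: 2 for East Asian Wide/Fullwidth, else 1.

    This is a simplification; combining and control characters are not handled.
    """
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)

ch = "楠"                  # U+6960, UTF-8 bytes E6 A5 A0
TERMINAL_COLUMNS = 80      # assumed width, not stated in the report

byte_len = len(ch.encode("utf-8"))   # bytes used to store the character
columns = display_width(ch)          # columns used to display it

# Treating every byte as one column fits fewer characters on a line
# than counting the columns each character really occupies.
chars_if_counting_bytes = TERMINAL_COLUMNS // byte_len
chars_if_counting_columns = TERMINAL_COLUMNS // columns
print(chars_if_counting_bytes, chars_if_counting_columns)
```

Under that assumption, counting three bytes per character gives 26 characters per line, while counting two columns per character gives the 40 the reporter expected.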



Version-Release number of selected component (if applicable):
info

How reproducible:
Always


Steps to Reproduce:
1. Use "makeinfo" to process a character like "样"(UTF-8:E4 BD 95) or "楠"(UTF-8:E6 A5 A0) (there are many others); each one loses a byte.
2. Build a CJK .info page and view it.
3. Resize the terminal and count the characters.
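
As an illustration of what losing a byte means in step 1 (this is not makeinfo's code, only a sketch), dropping the last byte of the three-byte sequence for "楠" leaves an incomplete UTF-8 sequence that can no longer be decoded as the original character:

```python
full = "楠".encode("utf-8")      # b'\xe6\xa5\xa0'
truncated = full[:-1]            # simulate the lost byte

# A strict decoder rejects the incomplete sequence.
try:
    truncated.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# A lenient decoder substitutes the replacement character U+FFFD,
# which is roughly what shows up as a "strange" glyph on screen.
shown = truncated.decode("utf-8", errors="replace")
print(decoded_ok, repr(shown))
```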


Actual Results:
Characters are broken, they do not fill the line, and individual characters are split at the end of a line.


Expected Results:
A character uses three or more bytes but takes two columns, and it must not be split.
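
The expected behaviour can be sketched as a wrap routine that measures columns and only breaks between characters. This is a minimal illustration, not the approach used by the patches attached to this bug; char_width and wrap_columns are hypothetical names, and the width rule (2 columns for East Asian Wide/Fullwidth, otherwise 1) is a simplification.

```python
import unicodedata

def char_width(ch):
    """Approximate column width of one character."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def wrap_columns(text, max_columns):
    """Split text into lines of at most max_columns columns.

    Breaks happen only between characters, never inside one. A single
    character wider than max_columns is still placed on its own line.
    """
    lines, current, used = [], [], 0
    for ch in text:
        w = char_width(ch)
        if current and used + w > max_columns:
            lines.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += w
    if current:
        lines.append("".join(current))
    return lines

# With 5 columns available, only two wide characters fit per line;
# the odd column stays empty instead of holding half a character.
print(wrap_columns("楠" * 5, 5))
```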



Additional info:

Comment 1 Chip Coldwell 2006-11-14 21:56:25 UTC
Created attachment 141191 [details]
texinfo test file

Comment 2 Chip Coldwell 2006-11-14 22:18:15 UTC
I think I have reproduced this.  Essentially, what is happening is the
standalone info is splitting up multibyte characters when the terminal window is
resized.

Chip

Comment 3 Chip Coldwell 2006-11-14 22:28:22 UTC
(In reply to comment #0)
> 
> Description of problem:
> The standalone info does not recognize that a CJK character is twice as wide as
> an ASCII character and is indivisible.

I should correct you here: the CJK character takes three times the width of an
ASCII character.  But anyway, your point is still true: I have reproduced this
problem.  However, the bug is reported against the wrong component: the emacs
packages do not supply the standalone info program, so I am going to reassign
this bug to the texinfo maintainer.

Chip


Comment 4 Patrice Dumas 2006-11-15 13:29:05 UTC
If I'm not mistaken, there is no UTF-8 support in info. I don't
even understand why some multibyte characters are displayed correctly.

Comment 5 Miloslav Trmač 2006-12-04 21:14:49 UTC
Created attachment 142775 [details]
Make makeinfo multibyte-aware

I have just sent this patch upstream.

Comment 6 Miloslav Trmač 2006-12-05 16:29:35 UTC
Created attachment 142865 [details]
Make makeinfo multibyte-aware

Comment 7 Miloslav Trmač 2007-01-24 19:41:39 UTC
Created attachment 146442 [details]
Clean up of the above patch: use the mbswidth module from gnulib

Comment 8 Miloslav Trmač 2007-02-14 17:36:23 UTC
Created attachment 148073 [details]
Add !HAVE_MBRTOWC fallbacks to gnulib's multibyte handling interfaces

Comment 9 Miloslav Trmač 2007-02-14 17:37:32 UTC
Created attachment 148074 [details]
Make info multibyte-aware

The above patches have all been submitted upstream, and handle everything but
input of multibyte characters in the info modeline (e.g. searching for
multibyte characters).

Comment 10 Bug Zapper 2008-04-04 03:52:40 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 11 Bug Zapper 2008-05-06 16:25:42 UTC
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 12 Patrice Dumas 2008-05-11 22:35:30 UTC
Is it fixed, or not?

Comment 13 Vitezslav Crhonek 2008-05-12 13:31:46 UTC
No, it is not. Miloslav sent the patches upstream, but they are still not
included in 4.11. It should be fixed in 4.12 (see Changelog), so I will check it
after the update to 4.12 in Rawhide.

Comment 14 Bug Zapper 2008-05-14 02:23:05 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Vitezslav Crhonek 2008-05-14 11:37:31 UTC
I tested version 4.12... it still doesn't work :(

Comment 16 Vitezslav Crhonek 2008-11-20 10:45:24 UTC
Fixed in texinfo-4.13-1.

