Bug 108994
| Summary: | text mangled by man and less | ||
|---|---|---|---|
| Product: | [Retired] Red Hat Linux | Reporter: | Joe R. Doupnik <jrd> |
| Component: | less | Assignee: | Eido Inoue <havill> |
| Status: | CLOSED RAWHIDE | QA Contact: | Ben Levenson <benl> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 9 | ||
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | i686 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | 1.5m2 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2004-02-19 16:49:19 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Fedora Core 1 default packages: less ignores the LESSCHARSET variable and displays incorrect characters for iso8859-2 and utf-8 encoded text files (other encodings are probably misinterpreted too). Other locale settings have no effect on it. If less is built from less-378-11.1.src.rpm without patch#1, patch#2 and patch#3, everything works OK.

This is fixed with the nroff/man UTF-8 combination in rawhide.
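As a rough illustration of the misinterpretation described in the LESSCHARSET comment above, the sketch below decodes the same bytes under two charsets. It is not a model of how less works internally; the sample string and the `render` helper are made up for this example.

```python
# Hypothetical sample text containing characters covered by ISO 8859-2.
text = "žluťoučký kůň"

raw_latin2 = text.encode("iso8859-2")  # bytes as stored in a Latin-2 file
raw_utf8 = text.encode("utf-8")        # bytes as stored in a UTF-8 file


def render(raw: bytes, charset: str) -> str:
    """Decode raw bytes the way a viewer would if it assumed `charset`.

    Undecodable bytes become U+FFFD, the Unicode replacement character.
    """
    return raw.decode(charset, errors="replace")


# Correct charset: the text round-trips unchanged.
print(render(raw_latin2, "iso8859-2"))
# Wrong charset: Latin-2 bytes are not valid UTF-8, so replacements appear.
print(render(raw_latin2, "utf-8"))
# Wrong charset the other way: every byte decodes, but to the wrong characters.
print(render(raw_utf8, "iso8859-2"))
```

A viewer that ignores the configured charset is effectively stuck in one of the two "wrong charset" cases regardless of what the user sets.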
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

Description of problem:
Text processing bug: RH 8 and 9, and WS v3 all share a common problem: the *roff processing of man pages, and "less" displays, insert a three-byte binary code into the output shown to the user. That code occurs when parsing quotation marks in the source. The code is most likely that of *roff internal work being passed between processing stages, but alas it is not recognized and instead is passed to the user. One barely notices this on a console display, but it's still wrong. One heavily notices this when telnetting in with a good terminal emulator.

A handy test item I use is man iptables. If we look at the second paragraph of running text and do a debug display of a terminal emulator we see something like this:

    Each chain is a list of rules which can match a set of packets. EachMJ
    [[24;1H[[K:[[24;1H[[24;1H[[K rule specifies what to do with a packet that matches. This is called aMJ
    �@Xtarget�@Y, which may be a jump to a user-defined chain in the same t

Notice the word "target" above and the binary gibberish surrounding it. The "@" item turns out to be hex 80, a non-printable control code. And so on. Doing a more filename with similar quoting can yield the same difficulty.

Here is the same material viewed through a binary editor (I did man iptables >/tmp/x.x and then viewed x.x through editor bvi):

    000005B0  63 6B 65 74 73 2E 20 20 20 45 61 63 68 0A 20 20  ckets.   Each.  
    000005C0  20 20 20 20 20 72 75 6C 65 20 73 70 65 63 69 66       rule specif
    000005D0  69 65 73 20 77 68 61 74 20 74 6F 20 64 6F 20 77  ies what to do w
    000005E0  69 74 68 20 61 20 70 61 63 6B 65 74 20 74 68 61  ith a packet tha
    000005F0  74 20 6D 61 74 63 68 65 73 2E 20 20 54 68 69 73  t matches.  This
    00000600  20 69 73 20 63 61 6C 6C 65 64 20 61 0A 20 20 20   is called a.   
    00000610  20 20 20 20 E2 80 98 74 61 72 67 65 74 E2 80 99      ...target...
    ---- the line above has the details around word target ----
    00000620  2C 20 77 68 69 63 68 20 6D 61 79 20 62 65 20 61  , which may be a
    00000630  20 6A 75 6D 70 20 74 6F 20 61 20 75 73 65 72 2D   jump to a user-
    00000640  64 65 66 69 6E 65 64 20 63 68 61 69 6E 20 69 6E  defined chain in
    00000650  20 74 68 65 20 20 73 61 6D 65 20 20 74 61 2D 0A   the  same  ta-.
    00000660  20 20 20 20 20 20 20 62 6C 65 2E 0A 0A 0A 54 08         ble....T.
    00000670  54 41 08 41 52 08 52 47 08 47 45 08 45 54 08 54  TA.AR.RG.GE.ET.T
    00000680  53 08 53 0A 20 20 20 20 20 20 20 41 20 20 66 69  S.S.       A  fi
    00000690  72 65 77 61 6C 6C 20 72 75 6C 65 20 73 70 65 63  rewall rule spec

Or, viewing more x.x on FreeBSD to see clearly what's happening:

    Each chain is a list of rules which can match a set of packets. Each
    rule specifies what to do with a packet that matches. This is called a
    <E2><80><98>target<E2><80><99>, which may be a jump to a user-defined cha
    in in the same ta-
    ble.

For what it's worth dept: SuSE does not have these problems, using the same source material. Nor do *BSD systems. So it's a bug in the RH way of constructing these utilities, deep within something roff-like.

Thanks,
Joe Doupnik
jrd.edu

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. man iptables and similar, as illustrated in the report above
2.
3.

Actual Results:
Please refer to the report above for explicit information.

Expected Results:
Quote marks, rather than binary gibberish

Additional info:
Two test systems, uname -a on each:

    Linux netlab4.usu.edu 2.4.20-20.9 #1 Mon Aug 18 11:28:34 EDT 2003 i586 i586 i386 GNU/Linux
    Linux netlab6.usu.edu 2.4.21-4.0.1.EL #1 Thu Oct 23 01:42:27 EDT 2003 i686 athlon i386 GNU/Linux
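
The three-byte groups around "target" in the dump, E2 80 98 and E2 80 99, are valid UTF-8 encodings of U+2018 and U+2019, the left and right single quotation marks. The Python sketch below decodes the twelve bytes copied from offset 0x614 of the dump to show this. The `high_bit_stripped` helper is an assumption, not the behavior of any particular emulator: it guesses that the debug display shows a byte in the range 0x80 to 0x9F as the letter 0x40 above its low five bits, which would account for the "@X" and "@Y" seen in the terminal capture.

```python
import unicodedata

# Bytes copied from the hex dump: E2 80 98, "target", E2 80 99.
raw = bytes.fromhex("E2 80 98 74 61 72 67 65 74 E2 80 99")

# Decoding as UTF-8 succeeds, so the bytes are well-formed UTF-8.
decoded = raw.decode("utf-8")


def describe_non_ascii(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ord(ch) > 127
    ]


def high_bit_stripped(data: bytes) -> str:
    """ASSUMPTION: imitate a debug display that is unaware of UTF-8.

    ASCII bytes are shown as-is, bytes 0x80-0x9F are shown as the letter
    0x40 above their low five bits, and other high bytes are shown as '?'.
    """
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        elif b <= 0x9F:
            out.append(chr((b & 0x1F) + 0x40))
        else:
            out.append("?")
    return "".join(out)


print(describe_non_ascii(decoded))
print(high_bit_stripped(raw))
```

If this reading is right, the output of man already contains UTF-8 quotation marks, and the gibberish comes from a display path that does not interpret them as UTF-8, which is consistent with the rawhide fix noted above.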