Bug 173599 - xterm outputs codes as "wide" characters even when selected font is "narrow"
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: xterm
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jason Vas Dias
Reported: 2005-11-18 15:50 UTC by Nick Simicich
Modified: 2007-11-30 22:11 UTC (History)
Doc Type: Bug Fix
Last Closed: 2005-11-30 22:37:46 UTC


Attachments
The fonts that Sharp APL uses and their installation procedure. (23.31 KB, application/octet-stream)
2005-11-18 16:04 UTC, Nick Simicich
installed with xrdb -nocpp -merge before apl is run in an xterm. (7.99 KB, text/plain)
2005-11-25 21:36 UTC, Nick Simicich

Description Nick Simicich 2005-11-18 15:50:35 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
I installed Sharp APL (which has run on older versions of Red Hat). APL uses a proprietary font, which uses all code positions.  The fonts are attached.  When the APL program echoes characters which are "high" (that is, they seem to have the high bit set) the xterm program tries to output the data as wide characters, and this completely garbles the output.  If you type "blind" the program works as expected except for the odd characters.

I have also attached a very simple program that outputs "high" characters.

Version-Release number of selected component (if applicable):
xterm-205.1.FC4

How reproducible:
Always

Steps to Reproduce:
1. xterm -fn saxlarge
2. testout | less
3. Observe the characters output at about decimal 160-253 - they should all be APL special characters.
  

Actual Results:  xterm interpreted the characters as the first byte of multi-byte characters, even though the selected font was a single byte font.

Expected Results:  The font should have been examined. If the selected font was a single byte font, the output should have been done using a narrow output method. Alternatively, there should be a flag that allows the end user to bypass the wide output.

Additional info:

I grabbed the source rpm, and modified the specfile to use the options --disable-luit and --disable-wide-chars.  The version of xterm that I then built was able to support Sharp APL.  My problem is fixed - until xterm is patched again.  Ideally, I believe that if I specify a font that only has 256 code points, doing wide character processing is broken, but I would be happy with a documented command line option, or a resource (that could be specified from the command line) or even a separate version of xterm that is separately configured with the ifdefs to turn off the wide processing.  

This is my "test" program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
        int m;
        int i;
        char p;
        for(i=0;i<513;i++) {
                //scanf("%d\n",&m);
                m = i;
                p = (char)m;
                printf("|< -%c- -%d- -%c- >|", m,m,p);
                if(i && i % 5 == 0) printf("\n");
        }
        printf("\n");
        exit(0);
}

Comment 1 Nick Simicich 2005-11-18 16:04:35 UTC
Created attachment 121233
The fonts that Sharp APL uses and their installation procedure.

These are the X11 font files and the script that Sharp APL uses to install
them. You also need to reload the font server if you use it, and you can use
xlsfonts to verify installation.  I did all this - the proper font was being
used, the output was simply wrong until I recompiled xterm - and now I can't
use xterm if I have any actual multi-byte text files.  If you want to see the
whole Sharp APL system, there is no charge for the code under Linux.  Get it
from here: ftp://ftp.tor.soliton.com/pub/SAXreleases/CurrentRelease/sax611/

Comment 2 Thomas E. Dickey 2005-11-24 01:10:13 UTC
Perhaps Redhat has setup xterm to run in UTF-8 mode.
That would produce this effect.  (I do note that the
locale cited in the Bugzilla line is not UTF-8).

Comment 3 Jason Vas Dias 2005-11-24 04:18:01 UTC
Please try the latest xterm version for FC-4, xterm-207-1, now in 
FC-4 Updates/Testing .
I was unable to find any problems with this version and your 'APL Medium' fonts -
I did install them:
$ xlsfonts | grep sax
-soliton-apl-medium-r-normal-saxlarge-0-0-72-72-c-0-sax8859-1
-soliton-apl-medium-r-normal-saxlarge-20-200-72-72-c-120-sax8859-1
-soliton-apl-medium-r-normal-saxlarge-20-200-72-72-c-120-sax8859-1
-soliton-apl-medium-r-normal-saxmedium-0-0-72-72-c-0-sax8859-1
-soliton-apl-medium-r-normal-saxmedium-16-160-72-72-c-80-sax8859-1
-soliton-apl-medium-r-normal-saxsmall-0-0-72-72-c-0-sax8859-1
-soliton-apl-medium-r-normal-saxsmall-11-110-72-72-c-80-sax8859-1

but I could not see any 'apl' (Array Programming Language?) chars between 
chars 160 - 255 , either from your test program output, or from a perl
command such as:
# perl -e 'for( $i=128; $i < 255; $i++)
           { print "[ ", sprintf("%3d",$i)," ", chr($i)," ]"; 
if ( ($i * 8) % 72 == 0 ){ print "\n"; };};'





Comment 4 Jason Vas Dias 2005-11-24 05:02:25 UTC
sorry, I pressed <tab> <return> by mistake.

Anyway, I ran your test program and the above perl code under both xterm-207-1.FC4
and xterm-200-6, when run with either:
  # xterm -fa 'APL Medium'
or with 
  # xterm -fn '-soliton-apl-medium-r-normal-saxmedium-0-0-72-72-c-0-sax8859-1' 
but saw no difference in output .

I then compiled xterm with --disable-luit and --disable-wide-chars , and the 
APL symbols were displayed, eg by :
perl -e 'for( $i=160; $i < 255; $i++){ print "[ ", sprintf("%3d",$i)," ",
chr($i)," ]"; if ( ($i * 8) % 72 == 0 ){ print "\n"; };};'

Since the --disable-luit and --disable-wide-chars configuration options
would seem to disable UTF-8 support, they are unlikely to be the default.

We do not "setup xterm to run in UTF-8 mode" . 

It seems that if wide-char mode is enabled, then output of chars > 127 is 
disabled .

I will investigate to see if we could make xterm provide a dynamic
'disable wide char' and 'disable-luit' mode.

One way around this might be to provide a charset for the font that could be
used with the +lc -en xterm options.


Comment 5 Nick Simicich 2005-11-25 21:32:49 UTC
OK, thanks. Your comments led me partway down the road for this:

Adding "-en c" to the xterm command fixes the OUTPUT problem but breaks
INPUT.  I tested this under FC1 and FC3 after rebuilding the xterm with the
original specfile and force installed it over the "newer" one I had installed. 
Running the perl test program (better than my C one) from the command line with
-fn saxlarge got me none of the APL characters.  Running it with "-en c" (which,
so far as I can tell, just makes the locale stuff more or less go away for
output) allows those high characters through.

BUT.....apl also requires a lot of special input.  Without "-en c", I can type
blind:  typing the keystrokes 

[alt] [l] [release alt] [a] [v] [enter] 

(QUADav) causes APL to output all of the special characters.  This "works" for
both input and output with the "hacked" xterm.  With the xterm built-as-shipped,
and without "-en c", if I close my eyes and type, I get the output I expect,
except, of course, that I don't see the characters I expect in the output
vector.  If I add "-en c" to the xterm command, the output characters are
correct, but I can't type the "quad" symbol, among many others.

(There is an apl command - ")keys" - that gives me a keyboard map that produces
all of the keyboard output possibilities. I can see the characters even if I
can't type a quad.)

I kept changing things - and finally got some of the input to work with:

-en C +u8

Or:  encode C, turn off utf8 

Now, according to the man page, this option is overridden by -en. This could
have been why I never tried it - but the reality is that I simply never tried
xterm options because I looked at the man page at 3 AM and I did not see the
locale or these options in among the jump scrolling and so forth options. My fault.

But it does seem that the documentation is defective.  Specifically:

               This  option  and  the utf8 resource are overridden by the -lc
               and -en options and locale resource.  That is,  if  xterm  has
               been  compiled to support luit, and the locale resource is not
               "false" this option is ignored. 

According to appres, I have no XTerm resource that matches (case independently)
loca. So this option should be ignored - none the less, it still has effect on
input mapping.  

BTW:  I installed the new version of xterm from the updates-testing repository,
xterm-207-1.fc4. I had to use the command "yum update xterm.i386" to do that,
not sure if that is intentional or not.  The same effects seem to happen - I can
get the output to work if I use xterm -en C (output does not work if I add +u8 -
although the apl program works with those options). In other words, there is no
change to my problem with the latest version of xterm.

My problem is not solved.  95% of what I do works - provided that I never make a
mistake.  The xterm command for apl contains 

 -tm "intr ^c erase ^h" 

but the backspace key quits working none the less.  And, well, if I select
something on the screen and then paste it back, that does not work either. It
looks like going through paste interprets high characters badly.

I recompiled xterm without those new functions - it still works that way. Cut
and paste, backspace, etc., all work.  I'm willing to run any tests anyone wants
to try and help isolate this.

So I'm back to using the xterm without the new functions.  I can't find a
combination of options that makes APL useful with just command line options
(unless, of course, I can type without making a mistake and I never want to do
things I can normally do on a window system).

I am unsure that this violates the principle of "least surprise" - that is, I
had a package that worked on an older release of Linux - it stopped working when
I installed it in the new release, and the method of getting it to work again is
not obvious.  I understand the desire to make UTF-8 fonts work for
internationalization, but I feel that some priority should be put on having old
apps keep working.

A quick way of doing that would be to generate a second xterm command - call it
xterm8, say, that was simply 8 bit clean and did not have the new features -
that and a man page would allow people to get stuff working.

My point is that, if I have specified a font that does not have wide characters,
it makes zero sense to cause xterm to output wide characters.  I traditionally
would expect apl to work with 8 bit input transparently, and I'm still unsure
what effect +u8 has, why specifying +u8 makes apl input work, and why specifying
+u8 makes the test program fail, and why cut and paste does not work.

This command is run before xterm, BTW:

xrdb -nocpp -merge $SAXDIR/lib/term/saxkey.map

It is conditional on whether the entries are already in xrdb.

I will attach that file.  There is also a 

xmodmap $SAXDIR/lib/term/saxmod.map

which contains this single line:

add mod1 = Alt_L


-=-=-=-
non-essential info below this line:

BTW, APL stands for "A Programming Language" although it might easily stand for
Array Programming Language, since it allows for parallel operations on arrays to
a much greater extent than even a language like R does.  I first used it in
1970, on an IBM 360/65 mainframe.  The language was very similar to what we see
today - except that arrays had to be uniform - now they can be "nested" and
nonuniform. The smallest version of the interpreter I ever saw was implemented
on an IBM 1130 - with 8k 16 bit words of memory. It used a special selectric
typeball to generate the special characters.

Many of the symbols are used to manipulate, reshape, and otherwise work with
vectors, and multidimensional arrays in a uniform manner.  For example, the
circle (alt j) overstruck with the / swaps the array indices and is called a
transpose - if you want to produce a multiplication table, the program can be
something like:

( 10 10 alt-r alt-I 10) alt-- transpose 10 10 alt-r alt-i 10

or:

(10 10 R I 10 )x o/ 10 10 R I 10

Evaluation is strictly right to left unless it is altered by parentheses.

alt-r is Rho and is used to reshape arrays to a new size.
alt-i is iota and is used to generate a vector between origin and the right
argument.
alt-dash (alt--) is the multiply sign - * is used for exponentiation.

To place the mines randomly in an advanced minesweeper game, 

18 30 alt-r ((100 alt-r 1),((18 alt-- 30)-100)alt-r 0)[(18 alt-- 30) ? 18 alt-- 30]
or:

18 30 R ((100 R 1),((18x30)-100) R 0)[(18x30)?18x30]

In other words, make a vector of the proper number of zeros and 1's, then index
it randomly and reshape it into the 18 by 30 matrix.

With single characters as opposed to the alt sequences, the code is very
concise. At one point the language was considered the leading language for fast
prototyping and perhaps ad-hoc analysis in the financial community. 

Comment 6 Nick Simicich 2005-11-25 21:36:29 UTC
Created attachment 121496
installed with xrdb -nocpp -merge before apl is run in an xterm.

Comment 7 Jason Vas Dias 2005-11-30 22:37:46 UTC
I had another look at this.

The xterm-207-1 package should be able to be made to work for you unchanged -
you'll just need to use the '-en c +u8' options and either uncheck the 
'Alt Sends Escape' xterm option (this option is on the <CTRL>+<LEFT MOUSE BUTTON>
xterm menu) or set the VT100*eightBitInput xresource to 1 (true).

By default, since the default locale is UTF-8 enabled, and for the many apps
that require that the ALT modifier sends Escape, we ship xterm with the 
VT100*eightBitInput resource set to 0 , which enforces conversion of 
input codes > 127 to wide chars (UTF-8), and makes the ALT key send Escape .

For instance, to make the '$' key emit the pound and euro sign, I do :
  $ xmodmap -e 'keycode 13 = 0x34 0x24 0xa3 0xa4' 
so      key <4>  = 4
   <SHIFT> +<4>  = $
   <ALT>   +<4>  = £(pound sign)
<SHIFT>+<ALT>+<4>= ¤(euro sign)

This should work equally well for APL - when run in an xterm with 
option '-fn saxmedium -en c +u8', and 'Alt Sends Escape' unchecked,
the <ALT>+<4> key produced the Yen ( ´ ) symbol .
You could set up keyboard maps so that you can edit apl code from an xterm
with vi.

The backspace key also worked correctly - the '-tm "intr ^c erase ^h"'
option you used probably caused it to work incorrectly - erase should
be ^? for the xterm terminfo .

So, to summarise:
- Use the xterm options '-en c +u8'
- Uncheck the 'Alt Sends Escape' option or set the xresource 
'VT100*eightBitInput: 1'.
The xterm defaults should not be changed, so this is NOTABUG .

