Bug 82610

Summary: tcsh has new command line length limit of ~245 characters
Product: [Retired] Red Hat Linux
Reporter: Jamie Zawinski <jwz>
Component: tcsh
Assignee: Miloslav Trmač <mitr>
Status: CLOSED WORKSFORME
QA Contact: Bill Huang <bhuang>
Severity: high
Priority: medium
Version: 8.0
CC: mitr
Hardware: i686   
OS: Linux   
Last Closed: 2004-08-20 07:01:12 UTC

Description Jamie Zawinski 2003-01-23 23:44:50 UTC
Since upgrading from RH 7.2 to RH 8.0, I can no longer compose long command
lines in tcsh when running in an XEmacs *shell* buffer.  I don't know whether
this is a bug in tcsh, or in readline, or what, but it does not happen with bash.

It's not merely truncating the command line -- it's also apparently leaving the
un-read characters on the input buffer, and then interpreting them as commands
afterward!  This is potentially disastrous, and could easily lead to loss of
files, if there were redirections or something on the command line.

This still happens after setting $LANG to C, so it's apparently not some Unicode
locale BS, as so many recent problems are.

Watch what happens (line wrapping and blank lines inserted for readability):

  setenv LANG C
  xemacs
  M-x shell

  <jwz@gronk:/home/jwz/> echo a23456789 b23456789 c23456789 d2345678
  9 e23456789 f23456789 g23456789 h23456789 i23456789 j23456789 k234
  56789 l23456789 m23456789 n23456789 o23456789 p23456789 q23456789 
  r23456789 s23456789 t23456789 u23456789 v23456789 w23456789 x23456
  789 y23456789

  [ ... apparent ls of current directory, maybe from a ^D ? ... ]


  <jwz@gronk:/home/jwz/> y23456789: Command not found.
  Exit 1


  <jwz@gronk:/home/jwz/>     [  ... then I hit RET again, and  ... ]

  a23456789 b23456789 c23456789 d23456789 e23456789 f23456789 g23456
  789 h23456789 i23456789 j23456789 k23456789 l23456789 m23456789 n2
  3456789 o23456789 p23456789 q23456789 r23456789 s23456789 t2345678
  9 u23456789 v23456789 w23456789 x23456789

  <jwz@gronk:/home/jwz/> sh

  sh-2.05b$ echo a23456789 b23456789 c23456789 d23456789 e23456789 f
  23456789 g23456789 h23456789 i23456789 j23456789 k23456789 l234567
  89 m23456789 n23456789 o23456789 p23456789 q23456789 r23456789 s23
  456789 t23456789 u23456789 v23456789 w23456789 x23456789 y23456789

  a23456789 b23456789 c23456789 d23456789 e23456789 f23456789 g23456
  789 h23456789 i23456789 j23456789 k23456789 l23456789 m23456789 n2
  3456789 o23456789 p23456789 q23456789 r23456789 s23456789 t2345678
  9 u23456789 v23456789 w23456789 x23456789 y23456789
  sh-2.05b$ 


Versions:

    Red Hat Linux release 8.0 (Psyche)
    Linux 2.4.18-14 #1 Wed Sep 4 12:13:11 EDT 2002 i686
       athlon i386 GNU/Linux
    tcsh-6.12-2
    bash-2.05b-5
    readline-devel-4.3-3
    readline-4.3-3
    xemacs-21.4.8-16

Comment 1 Miloslav Trmač 2004-08-20 07:01:12 UTC
I can't reproduce this with tcsh-6.13-1 and xemacs-21.4.15-5;
from looking at the code, the limit is roughly 4096 characters.

Please reopen if you still experience the problem with a recent
distribution.

Comment 2 Jamie Zawinski 2004-08-20 07:54:14 UTC
The problem still exists, but it turns out that "unset filec" is the fix.

I *think* the verdict is that it's xemacs's fault, but nobody really
knows how to fix it portably.

FYI, here's what Martin had to say about this in Jan 2003:

Martin Buchholz <martin> wrote:
> 
> Historically, all Unix systems had a limit on the number of characters
> you can input on the command line - generally 255.  This is one of the
> most stupid system limits imaginable.
[...]
> It depends on your shell, and your system.
> 
> On my Linux system, the limit seems to have been raised to about 40k.
> To reproduce this in an xterm, you have to get your shell into
> canonical mode, e.g. `bash --noediting' or `unset edit' in tcsh.
> Try building up a huge `echo' command - you probably won't be able to
> input more than 40k.
> 
> My guess is that the Linux folks have raised the MAX_CANON limit, but
> without changing the system header files.  Perhaps because Linus and
> Uli don't really talk much.
> 
> xemacs tries to circumvent the system limit.  If the command line is
> longer than 200 or so, it sends the command line in chunks of 200,
> with a '^D' character in between (!).  This insanity has mostly worked
> for the past decade.  "Usually" the '^D's are discarded.
> 
> csh has always been a little special, because '^D' also does file
> completion.  I remember that ten years ago,
> 
> unset filec
> 
> seemed to fix things when some user had a problem just like yours.
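
For illustration, the chunking behaviour Martin describes could be sketched
roughly as follows. This is a Python sketch, not XEmacs's actual code (which
is C). The function name chunk_with_eof and the exact chunk size of 200 are
assumptions taken from the "200 or so" figure above.

```python
EOF_CHAR = b"\x04"  # ASCII EOT, i.e. ^D


def chunk_with_eof(line, chunk_size=200):
    """Split a long input line into pieces, each followed by ^D.

    Hypothetical helper: mimics the described behaviour of sending a
    long command line in chunks of about 200 bytes with a ^D between
    consecutive chunks. Short lines are returned unchanged.
    """
    if len(line) <= chunk_size:
        return [line]
    chunks = [line[i:i + chunk_size] for i in range(0, len(line), chunk_size)]
    # Append ^D to every chunk except the last one.
    return [c + EOF_CHAR for c in chunks[:-1]] + [chunks[-1]]
```

In canonical mode the tty driver consumes each ^D and hands the pending bytes
to the reader, so the shell sees the original line. A shell that reads raw
characters itself (tcsh with filec set, for example) receives the ^D bytes
and interprets them, which would be consistent with the listing and the
split command line shown in the description.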

Martin Buchholz <martin> wrote:
> 
>> Bingo.  "unset filec" fixes it.
>>
>> So if xemacs were either:
>>
>>  - just sending all the bytes at once; or
>>  - sending them in chunks, but without the extra ^D
> 
> If XEmacs sends more than MAX_CANON (usually 256) bytes all at once to
> the tty, it typically gets wedged in a most horrible way.
> 
> Unfortunately, sending them in chunks without ^D has the same effect.
> 
>      EOF   (Control-d or ASCII EOT) may be used  to  generate  an
>            end-of-file   from  a terminal. When received, all the
>            characters waiting to be read are  immediately  passed
>            to the program, without waiting for a newline, and the
>            EOF is discarded.  Thus, if no characters are  waiting
>            (that is, the EOF occurred at the beginning of a line)
>            zero characters are passed back, which is the standard
>            end-of-file  indication. Unless escaped, the EOF char-
>            acter is not echoed. Because EOT is  the  default  EOF
>            character, this prevents terminals that respond to EOT
>            from hanging up.
> 
> You can see the effect even today by doing:
> xterm
> bash -noediting (or Solaris /bin/sh)
> hold down the 'x' key. (or paste some text without newlines repeatedly)
> 
> Eventually (1024 chars on Linux; 256 on Solaris) the xterm will refuse
> to allow any more input and will just beep at you.
> 
> jwz> then it would work fine without "unset filec".
> 
> The above description is only correct if icanon mode is turned on in
> the tty.  That used to be generally true, but these days shells like
> to process their input characters themselves by default.
> 
> The correct way to fix this bug is for XEmacs to check whether the tty
> is in icanon mode and the EOF character is defined.  Then, and ONLY
> then, should the ^D trick be used to avoid MAX_CANON.
> 
> It is probably also a bug that ICANON is set by XEmacs when it
> initializes a pty.  If a user explicitly inserts a literal ^W into the
> shell input buffer, it is more likely that they want this passed to
> the shell than that they want to have the previous word be discarded
> by the tty.  Emacs is a better editor than the tty -- the user already
> has backward-kill-word.
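
The check Martin proposes (use the ^D trick only when the tty is in icanon
mode and an EOF character is defined) might look something like this. It is
a Python sketch using the standard termios module, not a patch for XEmacs.
The function names are hypothetical, and the fallback "disabled" value of 0
is an assumption that holds on Linux but not on every system.

```python
import os
import termios


def eof_trick_applies(lflag, cc, vdisable=0):
    """Return True only if ICANON is set and the EOF character is defined.

    lflag and cc are the local-mode flags and control-character list as
    returned by termios.tcgetattr(). vdisable is the value that marks a
    control character as disabled (assumed to be 0 here).
    """
    if not lflag & termios.ICANON:
        return False
    eof = cc[termios.VEOF]
    # tcgetattr() yields length-1 bytes objects for control characters;
    # accept plain ints as well.
    eof_code = eof if isinstance(eof, int) else ord(eof)
    return eof_code != vdisable


def eof_trick_applies_to_fd(fd):
    """Query the tty behind fd and apply the check above."""
    attrs = termios.tcgetattr(fd)
    lflag, cc = attrs[3], attrs[6]
    try:
        vdisable = os.fpathconf(fd, "PC_VDISABLE")
    except (OSError, ValueError):
        vdisable = 0  # assumption: fall back to the Linux value
    return eof_trick_applies(lflag, cc, vdisable)
```

A caller would run eof_trick_applies_to_fd() on the pty just before sending
a long line, since the shell can switch the tty in and out of canonical mode
at any time.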