Created attachment 795983
/tmp/missingcharacter.png

I have an 80-column-wide xterm window with a fixed font; my TERM is xterm, the locale is cs_CZ.UTF-8, and the colors are black on white. I observe this:

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | grep 1234 --color=always
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123xxxxxxxxx

As you can see, the character `4' is missing from the output. This is the raw output:

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | grep 1234 --color=always | hexdump -C
00000000 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 |xxxxxxxxxxxxxxxx|
*
00000040 78 78 78 78 78 78 78 78 78 78 78 78 1b 5b 30 31 |xxxxxxxxxxxx.[01|
00000050 3b 33 31 6d 1b 5b 4b 31 32 33 34 1b 5b 6d 1b 5b |;31m.[K1234.[m.[|
00000060 4b 78 78 78 78 78 78 78 78 78 0a |Kxxxxxxxxx.|
0000006b

I guess this is a bug in xterm. It happens only when colored grep output is enabled. That means the `1234' should be printed in red and the following x's in white, and the `4' should occupy the last column of the xterm window. But for an unknown reason, the `4' is missing.

I have xterm-293-1.fc19.x86_64.
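For reading the raw output without a hex dump, a small filter can make the escape character visible. This is only a sketch: show_escapes is a hypothetical helper name, and it simply rewrites each ESC byte (0x1b) as the two characters \E, the notation used later in this thread.

```shell
#!/bin/sh
# show_escapes (hypothetical helper): copy stdin to stdout, replacing every
# ESC byte (0x1b) with the visible two-character text "\E".
show_escapes() {
  esc=$(printf '\033')
  sed "s/${esc}/\\\\E/g"
}

# The bytes from offset 0x4c of the hex dump above, fed through the filter.
# Prints: \E[01;31m\E[K1234\E[m\E[Kxxxxxxxxx
printf '\033[01;31m\033[K1234\033[m\033[Kxxxxxxxxx\n' | show_escapes
```

Piping the output of grep --color=always through show_escapes should show the same sequences around the match.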
Do you see the problem also with xterm-295 from rawhide? The changelog doesn't seem to mention any problems like this, but it would be good to be sure. http://koji.fedoraproject.org/koji/buildinfo?buildID=451963
Yes. I can see the same with xterm-295-2.fc20.x86_64.rpm installed into F19.
CCing upstream maintainer.
sounds like http://invisible-island.net/xterm/xterm.faq.html#grep_colors
By the way, a user last month pointed out that he was able to fix grep by setting GREP_COLORS="ne"
> sounds like http://invisible-island.net/xterm/xterm.faq.html#grep_colors

Yes, it matches. So can it be considered a bug in the VT100 specification? If so, would it be possible to implement an xterm (run-time) option that diverges from the specification at this point to provide more verbatim output?

> By the way, a user last month pointed out that he was able to fix grep
> by setting GREP_COLORS="ne"

That's equivalent to grep --color=never, i.e. it disables colors altogether.
Actually it's a bug (or limitation if you prefer) in grep. Changing xterm to accommodate grep would introduce a bug in xterm. In a quick check here, GREP_COLORS=ne grep --color=always perl gives me colored filenames (magenta) and "perl" (red) on the matching lines. Perhaps you are testing it in a different way. If there is some scenario where grep's configurability won't work, then we can discuss that.
(I left out the "*" on the example line -)
You are right that GREP_COLORS=ne still uses colors. I misread the grep(1) manual earlier. Omitting the \E[K really helps:

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | GREP_COLORS=ne grep 1234 --color=always | hexdump -C
00000000 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 |xxxxxxxxxxxxxxxx|
*
00000040 78 78 78 78 78 78 78 78 78 78 78 78 1b 5b 30 31 |xxxxxxxxxxxx.[01|
00000050 3b 33 31 6d 31 32 33 34 1b 5b 6d 78 78 78 78 78 |;31m1234.[mxxxxx|
00000060 78 78 78 78 0a |xxxx.|
00000065

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | GREP_COLORS=ne grep 1234 --color=always
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx

I don't understand why grep uses \E[K. However, if it is not compatible with xterm and the VT100, it should be fixed in grep or terminfo.
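To apply the workaround without discarding color settings a user may already have, "ne" can be appended to the existing GREP_COLORS value instead of replacing it. A minimal sketch, assuming a POSIX shell; add_ne is a hypothetical helper name, not something grep provides.

```shell
#!/bin/sh
# add_ne (hypothetical helper): print the GREP_COLORS value given as $1 with
# the boolean "ne" capability appended, unless "ne" is already one of the
# colon-separated entries.
add_ne() {
  case ":$1:" in
    *:ne:*) printf '%s\n' "$1" ;;      # already present, leave unchanged
    ::)     printf 'ne\n' ;;           # empty or unset value
    *)      printf '%s:ne\n' "$1" ;;   # keep existing entries, add "ne"
  esac
}

# Typical use, e.g. in a shell startup file:
GREP_COLORS=$(add_ne "${GREP_COLORS-}")
export GREP_COLORS
```

With this in place, grep --color=always still colors matches but no longer emits \E[K after each colored item.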
Created attachment 796269
Window shot

Image this time.
grep does not use terminfo (it ignores the TERM variable other than to check if it is "dumb"). Call it hard-coded... A comment in main.c says that it would be "impractical" to use terminfo.

By the way, the changelog says that the original code for this feature was from Ulrich Drepper, in 2001. Very little work has been done on it since then. As I read the comments, it appears to have been based mostly on the Linux console, with a sprinkling of quirks for some very old (and less used) terminals such as Telray.
I understand the use of \E[K, of course (expanding on that would be a distraction). It might be possible to devise an improvement to grep which would address this problem without introducing new ones. However, it's unlikely that any patch that I made would make its way into the upstream code. If there's sufficient interest in a patch for Fedora, I can be motivated to investigate this further, but otherwise the suggested workaround should be enough.
(In reply to Petr Pisar from comment #9)
> I don't understand why the \E[K is used by grep.
>
To correctly clean the background, try this:

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36:ne' grep 1234

And this (without the 'ne'):

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234xxxxxxxxx\n' | GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36' grep 1234

> However if it is not
> compatible with xterm and VT100, it should be fixed in grep or terminfo.
>
Apparently it is not compatible with xterm and some others (e.g. urxvt), but there are also terminals where it works correctly (e.g. xfce4-terminal, konsole, ...). Even urxvt renders it differently from xterm (I added $ below to mark the line break):

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123 $
xxxxxxxxx

Note the additional space after the '3'.

I think the problem lies in weird (undocumented?) DEC VT100 behaviour, which I cannot prove because I do not have an original DEC VT100 terminal: namely the DECAWM (auto wrap) feature, which is a DEC private feature. It is not documented in ECMA-48 and IMHO it is probably not documented anywhere (please point me to the documentation if you know of any). It seems that autowrap (if enabled) wraps the line when character 80 + 1 is received (on an 80-column terminal), not immediately after the 80th character as one would expect (control characters are not counted). Thus the ERASE IN LINE (EL) command erases the last character (the '4' in this case), not the next line. To me this seems to be a bug in the DECAWM implementation in xterm (and also in the original terminal?). The current behaviour seems illogical to me and very hard to work around. I have no idea how to work around this correctly in grep without breaking anything else.
If you have any idea how to do it, please let me know. As there are terminals where autowrap works correctly (e.g. xfce4-terminal, konsole) and nothing else seems to break, I think this could be fixed or worked around in the xterm code.

Reproducer:
$ stty cols 80
$ echo -e 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234\e[Kxxxxxxxxx'

Expected result:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1234
xxxxxxxxx

Current result:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123x
xxxxxxxx
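The difference between the two results in the reproducer can be illustrated with a toy model of a single terminal row. This is a sketch only, not xterm or VTE code. The helper name render and both policy names are hypothetical. It assumes single-width ASCII input and understands only the \E[K sequence. In "deferred" mode the cursor stays on the last column with a wrap pending until the next printable character arrives, and EL is assumed to cancel the pending wrap, which is what the reported result suggests. In "immediate" mode the cursor moves to the next row as soon as the last column is filled.

```shell
#!/bin/sh
# render COLS MODE (hypothetical helper): read one line from stdin and print
# the rows a toy terminal of width COLS would show.
# MODE is "deferred" (wrap happens when the next printable character arrives)
# or "immediate" (wrap happens right after the last column is written).
render() {
  awk -v cols="$1" -v mode="$2" '
  BEGIN { el = sprintf("%c[K", 27) }          # ESC [ K, erase to end of line
  {
    line = $0; row = ""; col = 0; pending = 0
    while (length(line) > 0) {
      if (substr(line, 1, 3) == el) {
        row = substr(row, 1, col)             # erase from cursor to end of row
        pending = 0                           # assumption: EL cancels the wrap
        line = substr(line, 4)
        continue
      }
      ch = substr(line, 1, 1)
      line = substr(line, 2)
      if (pending) { print row; row = ""; col = 0; pending = 0 }
      row = substr(row, 1, col) ch substr(row, col + 2)   # write at cursor
      if (col < cols - 1) col++
      else if (mode == "deferred") pending = 1
      else { print row; row = ""; col = 0 }
    }
    print row
  }'
}

# 76 x, then 1234, then EL, then 9 x: the same bytes as the reproducer.
printf '%76s1234\033[Kxxxxxxxxx\n' '' | tr ' ' x | render 80 deferred
printf '%76s1234\033[Kxxxxxxxxx\n' '' | tr ' ' x | render 80 immediate
```

Under these assumptions the deferred run ends its first row in 123x and leaves eight x's on the second row, matching the current result, while the immediate run ends the first row in 1234 and leaves nine x's, matching the expected result.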
(In reply to Jaroslav Škarvada from comment #13)
> (In reply to Petr Pisar from comment #9)
> > I don't understand why the \E[K is used by grep.
> >
> To correctly clean the background, try this:
> $ printf
> 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 1234xxxxxxxxx\n' |
> GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36:ne' grep
> 1234
>
> And this (without the 'ne'):
> $ printf
> 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 1234xxxxxxxxx\n' |
> GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36' grep
> 1234
>
Of course, for xterm (due to the weird autowrapping behaviour), e.g. this:

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx12345xxxxxxxxx\n' | GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36:ne' grep 12345

and this (without 'ne'):

$ printf 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx12345xxxxxxxxx\n' | GREP_COLORS='ms=01;31:mc=01;31:sl=01;41:cx=:fn=35:ln=32:bn=32:se=36' grep 12345
FYI, xfce4-terminal and xterm both use TERM=xterm-256color, so both should behave the same. I am not aware of any terminfo capability we could query for this in grep.
xfce4-terminal is a VTE wrapper; VTE's developers have chosen to pretend that it works with TERM=xterm, etc., so there's no help from that direction. Besides, the most-applicable capability (xenl) partly-works with VTE. Color isn't really the problem - it's one detail in a larger bug that's been there quite a while. I wrote a test-screen for vttest 8 years ago. Here are some screenshots which I made (from my lengthy to-do list...): http://invisible-island.net/vttest/vttest-wrap.html
Offhand, grep's a smaller and more manageable program. You're more likely to be able to repair it than VTE.
(In reply to Thomas E. Dickey from comment #16)
> xfce4-terminal is a VTE wrapper; VTE's developers have chosen to pretend that
> it works with TERM=xterm, etc., so there's no help from that direction.
> Besides, the most-applicable capability (xenl) partly-works with VTE.
>
xenl:
The xenl capability says that lines of exactly 80 printable characters will cause only one line feed -- that is, the automatic line feed will not be generated after the 80th printable character, but before the 81st, if there is one. Thus when the line consists of exactly 80 printable characters plus a line feed, the terminal does not insert an automatic line feed.

It doesn't say that the EL will not wrap, so it is only part of what we need. Is there any specification for such EL behaviour? If yes, I think the xfce4 developers have it wrong.

> Color isn't really the problem - it's one detail in a larger bug that's
> been there quite a while.

Sure, it's the EL command, see the reproducer in comment 13.

> I wrote a test-screen for vttest 8 years ago.
> Here are some screenshots which I made (from my lengthy to-do list...):
>
> http://invisible-island.net/vttest/vttest-wrap.html

Great, thanks, will check.

I don't know how to work around the problem, i.e. which sequence to send that would work around it for xterm without breaking others, e.g. xfce4-terminal. So in the worst case the client (grep) would need to check whether autowrap is on and what the column limit is, count the characters, and send workaround characters. That seems really weird to me. I doubt such a patch will ever get into grep. Aside from the above-mentioned hack, it would also need to query terminfo, and according to the following upstream statement it's unlikely they would accept such a change:

> It would be impractical for GNU grep to become a full-fledged
> terminal program linked against ncurses or the like, so it will
> not detect terminfo(5) capabilities.
I did say that xenl was the "most-applicable". Bear in mind that the terminfo manpage is descriptive, not prescriptive, and that it is describing features which are well-known. DEC's documentation comments that on the right margin, all controls are ignored - which testing confirms means more than just newline. I also noted that grep is unlikely to query terminfo (for more reasons than one - chiefly nontechnical ones). But I suggested that it might be possible to devise a better compromise than turning EL off completely. (Suppressing EL on the right margin wouldn't help for those who think that it should behave differently in a pipe).
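As an illustration of what "suppressing EL on the right margin" could look like outside grep, the filter below drops an \E[K only when the count of printable characters written so far is a non-zero multiple of the terminal width. It is a rough sketch and not a proposed grep patch. The helper name suppress_el_at_margin is hypothetical, the width must be passed in by the caller, and the filter assumes single-width ASCII text with no tabs and only SGR (\E[...m) and EL (\E[K) sequences in the stream. As noted above, it does nothing for those who expect different behaviour in a pipe.

```shell
#!/bin/sh
# suppress_el_at_margin COLS (hypothetical helper): copy stdin to stdout,
# removing each ESC [ K that would be sent while the cursor sits on the right
# margin of a COLS-wide terminal. All other bytes pass through unchanged.
suppress_el_at_margin() {
  awk -v cols="$1" '
  BEGIN {
    esc = sprintf("%c", 27)
    el  = esc "[K"
    csi = "^" esc "\\[[0-9;]*[mK]"     # only SGR and EL are recognised
  }
  {
    line = $0; out = ""; col = 0       # col counts printable characters
    while (length(line) > 0) {
      if (match(line, csi)) {
        seq  = substr(line, 1, RLENGTH)
        line = substr(line, RLENGTH + 1)
        # Drop EL only when a row has just been filled exactly.
        if (seq == el && col > 0 && col % cols == 0) continue
        out = out seq
        continue
      }
      out = out substr(line, 1, 1)
      line = substr(line, 2)
      col++
    }
    print out
  }'
}
```

A possible use would be grep --color=always 1234 | suppress_el_at_margin 80, which keeps the \E[K after the opening color sequence in the original report but removes the one that follows the `4' in column 80.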
(In reply to Thomas E. Dickey from comment #19) Ok, thanks for info. I will point this upstream. If you think this BZ belongs to grep, I will probably close it as wontfix or upstream.
It definitely belongs to grep.
Filed as upstream bug #15444:
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15444

Closing with resolution upstream.