Bug 502572 - cat stop responding after 1st cat and CTRL+C interrupt.
Summary: cat stop responding after 1st cat and CTRL+C interrupt.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Danny Feng
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 526775 533192
 
Reported: 2009-05-26 08:43 UTC by masanari iida
Modified: 2018-10-20 00:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:41:16 UTC
Target Upstream Version:
Embargoed:


Attachments
posted patch (1.05 KB, patch)
2009-07-07 02:08 UTC, Danny Feng


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2010:0178 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update | 2010-03-29 12:18:21 UTC

Description masanari iida 2009-05-26 08:43:18 UTC
Description of problem:
Log in to a RHEL5 system over telnet, run "cat <filename>", and press CTRL+C.
This stops the output on the display successfully.
But on the second attempt at "cat <filename>", the scrolling text stops responding
when it reaches the position where I pressed CTRL+C before.


Version-Release number of selected component (if applicable):
RHEL 5.2 (2.6.18-53)
telnet-server-0.17-39.el5
telnet-0.17-38.el5
krb5-libs-1.6.1-17.el5
pam_krb5-2.2.14-1
krb5-libs-1.6.1-17.el5
krb5-devel-1.6.1-17.el5
krb5-devel-1.6.1-17.el5
krb5-workstation-1.6.1-17.el5
pam_krb5-2.2.14-1
krb5-auth-dialog-0.7-1

How reproducible:
Always

Steps to Reproduce 1:

PC =>  RHEL4 or RHEL5 == telnet ==> RHEL5

Connection between PC and RHEL4/RHEL5 can be telnet or ssh.
Connection between RHEL4/RHEL5 and RHEL5 must be telnet.


1. Log in to RHEL5 using telnet
2. # cat large_text_file
3. Press CTRL+C before the cat output finishes.
4. Confirm that the shell returns a prompt.
5. # cat large_text_file    one more time.

Actual results:
cat suddenly stops scrolling.
The point where the scrolling stops is the same point
where I pressed CTRL+C in step 3.

Expected results:
cat large_text_file should finish without errors.

6. With "vi large_text_file" instead of "cat large_text_file",
the screen also stops responding.


Steps to Reproduce 2:

PC = Tru64 == telnet ==> RHEL5

Connection between PC and Tru64 can be telnet or ssh.
Connection between Tru64 and RHEL5 must be telnet.

1. Log in to RHEL5 using telnet
2. # cat large_text_file
3. Press CTRL+C before the cat output finishes.

Actual results:
The cat scrolling stops, but the screen is never cleared.
The bash shell prompt does NOT appear on the console.

4. Log in to the RHEL5 system from a different console.
5. Find the first bash shell, the one in which you ran cat.
6. strace -p <PID of the Bash>
7. Press the ENTER key in the first bash terminal, and watch the strace output.
8. Type "exit" to finish the bash session, so that you are disconnected from
the telnet session. (Which means the bash shell is working as expected.)

Actual results:
The strace output shows bash trying to write the shell prompt.
But the first terminal is never cleared and nothing appears on it.


Additional information:
(1) Using an ssh connection to the RHEL5 system never shows this symptom.
(2) Using RHEL4 instead of RHEL5 never shows this symptom.
(3) Using zsh or tcsh did not fix this problem.
(4) Using the latest coreutils (cat) from upstream did not fix the symptom.
(5) Using FreeBSD or IRIX instead of Tru64 did not fix the symptom.
(6) Using putty or teraterm on a Windows PC, the symptom is still the same.
(7) Enabling krb5-telnet instead of telnet did not fix the symptom.
(8) If you connect to the RHEL5 system from the PC without going through
RHEL4/5 or Tru64, this symptom is never reproduced.

Comment 1 Adam Tkac 2009-05-28 14:43:25 UTC
This looks like a kernel problem to me (handling of OOB data).

Could you please update to the latest kernel? Your kernel (2.6.18-53) is quite old.

If the problem still remains, could you please specify which network driver you use on the affected RHEL5 machine? (Or simply attach the "lspci -v" and "lsmod" outputs.)

Thanks.
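
For readers unfamiliar with the OOB data mentioned above: TCP carries out-of-band data as "urgent" data, and the telnet protocol uses it for its Synch signal around interrupts. The following is a minimal Python sketch, not taken from this report, that sends ordinary data followed by one urgent byte over a loopback connection. The function name `oob_roundtrip` and the payload values are hypothetical, and the sketch only shows what urgent data is; it does not reproduce the bug.

```python
import select
import socket


def oob_roundtrip(payload=b"normal data", urgent=b"!"):
    """Send in-band data, then one urgent byte, over loopback TCP.

    Returns (in_band_bytes, urgent_byte) as seen by the receiver.
    Only a single byte can be urgent at a time, so `urgent` should be 1 byte.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    receiver = None
    try:
        listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a port
        listener.listen(1)
        sender.connect(listener.getsockname())
        receiver, _ = listener.accept()
        receiver.settimeout(5.0)             # avoid hanging if something goes wrong

        sender.sendall(payload)              # ordinary (in-band) stream data
        sender.send(urgent, socket.MSG_OOB)  # marks this byte as TCP urgent data

        # Pending urgent data is reported as an "exceptional condition".
        _, _, exceptional = select.select([], [], [receiver], 5.0)
        if not exceptional:
            raise TimeoutError("no urgent data arrived")

        oob = receiver.recv(1, socket.MSG_OOB)

        in_band = b""
        while len(in_band) < len(payload):
            chunk = receiver.recv(4096)
            if not chunk:
                break
            in_band += chunk
        return in_band, oob
    finally:
        for s in (receiver, sender, listener):
            if s is not None:
                s.close()


if __name__ == "__main__":
    print(oob_roundtrip())
```

The urgent byte is read separately from the normal stream because SO_OOBINLINE is left at its default (off) in this sketch.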

Comment 2 masanari iida 2009-05-29 07:53:33 UTC
Updated the kernel to 2.6.18-128.1.10, x86_64,
and tested again.
The symptom still exists.

(Detail)
PC = ssh => RHEL4 = telnet => RHEL5(2.6.18-128.1.10)

(0) Telnet login to RHEL5 box.
(1) # cat /tmp/test_file 
(2) Press CTRL+C.
    Scrolling stops as expected.  The shell prompt comes back.

(3) # cat /tmp/test_file 
    Scrolling stops by itself, even though I did not press CTRL+C.
    The shell prompt comes back.

The network card on RHEL5 is Broadcom BCM5708.

           +-1c.0-[0000:02-03]----00.0-[0000:03]----00.0  Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
           +-1c.1-[0000:04-05]----00.0-[0000:05]----00.0  Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet

Driver version 
bnx2 v1.7.9-1 (July 18, 2008)
This driver came from kernel 2.6.18-128.1.10.


# more /etc/modprobe.conf
alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix

Comment 3 Adam Tkac 2009-05-29 12:25:00 UTC
As I expected, this is a kernel problem.

I created a long text file:
0
1
2
....
72802

and then tried to "cat" it on a machine with a NetXtreme II BCM5708 card (bnx2 driver). The first time, I terminated the cat command via ctrl+c, and when I then tried to "cat" the whole file again, it failed. Interestingly, the strace utility says that the whole file was written:
...
[pid  3924] write(1, "70801\n70802\n70803\n70804\n70805\n70"..., 4096) = 4096
[pid  3924] read(3, "3\n71484\n71485\n71486\n71487\n71488\n"..., 4096) = 4096
[pid  3924] write(1, "3\n71484\n71485\n71486\n71487\n71488\n"..., 4096) = 4096
[pid  3924] read(3, "166\n72167\n72168\n72169\n72170\n7217"..., 4096) = 3820
[pid  3924] write(1, "166\n72167\n72168\n72169\n72170\n7217"..., 3820) = 3820
[pid  3924] read(3, "", 4096)           = 0
[pid  3924] close(3)                    = 0
[pid  3924] close(1)                    = 0
[pid  3924] exit_group(0)               = ?
Process 3810 resumed
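
One way to check a trace like the one above is to add up the return values of the write(1, ...) calls and compare the total with the file size. This is a minimal Python sketch under that assumption; the function name `bytes_written_to_stdout` is hypothetical, and the regular expression only targets the line format shown in this comment.

```python
import re

# Matches successful writes to fd 1, e.g.:
#   [pid  3924] write(1, "..."..., 4096) = 4096
# Failed writes ("= -1 EAGAIN ...") do not match, because the result
# must start with a digit.
WRITE_RE = re.compile(r"write\(1, .*\)\s*=\s*(\d+)\s*$")


def bytes_written_to_stdout(strace_lines):
    """Sum the byte counts returned by write(1, ...) in strace output lines."""
    total = 0
    for line in strace_lines:
        match = WRITE_RE.search(line)
        if match:
            total += int(match.group(1))
    return total


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python sum_writes.py < cat.strace
    print(bytes_written_to_stdout(sys.stdin))
```

If the total equals the size of the file, the server-side process wrote everything, which points at the path between the write and the client rather than at cat itself.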

But on the client side the output ends around number 1000 (sometimes more, sometimes less).

When I tried to reproduce the same problem in a virtual machine (Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ network card, 8139too driver), everything worked fine.

Reassigning to kernel for further inspection.
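
The numbered test file used in this comment makes truncation easy to spot, because the last number on the client screen shows how far the output got. A minimal Python sketch for regenerating such a file and for finding the last complete number in captured client output follows. The helper names `write_numbered_file` and `last_number_seen`, and the idea of capturing the client output as text, are assumptions and not part of the original test.

```python
def write_numbered_file(path, last=72802):
    """Write the numbers 0..last, one per line, as in the test file above."""
    with open(path, "w") as f:
        for n in range(last + 1):
            f.write(f"{n}\n")


def last_number_seen(captured):
    """Return the last complete number in captured output, or None.

    Text after the final newline is treated as a partial line and ignored.
    Lines that are not plain numbers (for example a shell prompt) are skipped.
    """
    complete_lines = captured.split("\n")[:-1]
    numbers = [int(s) for s in complete_lines if s.strip().isdigit()]
    return numbers[-1] if numbers else None


if __name__ == "__main__":
    # /tmp/test_file is the path used elsewhere in this report.
    write_numbered_file("/tmp/test_file")
```

A result well below the last number in the file would indicate that the client output was cut short.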

Comment 4 masanari iida 2009-05-29 13:50:32 UTC
I downloaded the kernel 2.6.29 source from kernel.org for testing purposes.
I booted the RHEL5 box with the 2.6.29 kernel and tried the same test.
It works as expected, in both the RHEL4 case and the Tru64 case.

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.9.3 (March 17, 2009)

====
FYI, my RHEL5 box is not a virtual machine.

I have tested another RHEL5 system, which runs on Xen; it did not show this symptom.
I have tested yet another RHEL5 system, which runs on VMware; it shows this symptom.

Thanks.

Comment 5 masanari iida 2009-06-08 15:58:54 UTC
Tested the following kernels.
2.6.24 - 2.6.27   Symptom reproduced.
2.6.28, 2.6.29    Symptom fixed.

Comment 6 masanari iida 2009-06-09 02:37:56 UTC
Confirmed that 2.6.28-rc1 also fixes the symptom.

Comment 7 masanari iida 2009-06-09 06:57:12 UTC
Sorry, Comment #6 contained wrong information.

I have double-checked the 2.6.28-rc series:
2.6.28-rc1, rc5 and rc7 were bad kernels.
2.6.28-rc8 was the oldest fixed kernel in the 2.6.28-rc series.
2.6.28 and 2.6.29 were fixed, too.
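
The search for the oldest fixed kernel can be sketched as a binary search over an ordered version list. The code below is a minimal Python illustration, not the procedure actually used here; the function name `first_fixed` is hypothetical, and it assumes that every version after the first fixed one is also fixed. The `observed` results are the ones reported in this comment.

```python
def first_fixed(versions, is_fixed):
    """Return the earliest version for which is_fixed(version) is true.

    `versions` must be ordered oldest to newest, and the result is only
    meaningful if all versions after the first fixed one are fixed as well.
    Returns None when no version is fixed.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(versions[mid]):
            hi = mid          # the answer is at mid or earlier
        else:
            lo = mid + 1      # the answer is after mid
    return versions[lo] if lo < len(versions) else None


# Results reported in this comment (True means the symptom is fixed).
observed = {
    "2.6.28-rc1": False,
    "2.6.28-rc5": False,
    "2.6.28-rc7": False,
    "2.6.28-rc8": True,
    "2.6.28": True,
    "2.6.29": True,
}

if __name__ == "__main__":
    print(first_fixed(list(observed), observed.get))
```

In practice `is_fixed` would mean booting the kernel and rerunning the cat test, so the point of a binary search is to keep the number of reboots small.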

Comment 8 masanari iida 2009-06-09 07:42:37 UTC
> This looks like kernel problem for me (handling of OOB data).

Thanks for the hint.
I have identified the fix in the 2.6.28-rc8 changelog.

commit 33cf71cee14743185305c61625c4544885055733
Author: Petr Tesarik <ptesarik>
Date:   Fri Nov 21 16:42:58 2008 -0800

    tcp: Do not use TSO/GSO when there is urgent data

    This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=12014

The patch is
http://mirror.celinuxforum.org/gitstat//commit-detail.php?commit=33cf71cee14743185305c61625c4544885055733

I applied this patch to kernel 2.6.28-rc7
and confirmed that the symptom was fixed.

I ask Red Hat to backport this patch to the RHEL5 kernel.

Comment 10 Danny Feng 2009-07-07 02:08:58 UTC
Created attachment 350711 [details]
posted patch

Comment 12 RHEL Program Management 2009-09-25 17:37:22 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 13 Don Zickus 2009-10-06 19:37:42 UTC
in kernel-2.6.18-168.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 15 Chris Ward 2010-02-11 10:31:09 UTC
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this 
release that addresses your request. Please test and report back results 
here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update 
the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set 
this bug into NEED_INFO. If you encounter new defects or have additional 
patch(es) to request for inclusion, please clone this bug per each request
and escalate through your support representative.

Comment 16 masanari iida 2010-02-12 07:35:19 UTC
Confirmed that kernel-2.6.18-186 (x86_64) fixes this symptom.
Thank you for your support.

Comment 17 Mike Gahagan 2010-03-12 21:44:50 UTC
I can confirm cat is still interruptible with the -192 kernel.

Comment 19 errata-xmlrpc 2010-03-30 07:41:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

