Bug 502572
Summary: cat stop responding after 1st cat and CTRL+C interrupt.

Field | Value
---|---
Product: | Red Hat Enterprise Linux 5
Component: | kernel
Version: | 5.2
Hardware: | All
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | low
Target Milestone: | rc
Reporter: | masanari iida <masanari_iida>
Assignee: | Danny Feng <dfeng>
QA Contact: | Red Hat Kernel QE team <kernel-qe>
CC: | atkac, cward, dzickus, mgahagan, tao
Doc Type: | Bug Fix
Last Closed: | 2010-03-30 07:41:16 UTC
Bug Blocks: | 526775, 533192

Description
masanari iida
2009-05-26 08:43:18 UTC
This looks like a kernel problem to me (handling of OOB data). Could you please update to the latest kernel? Your kernel (2.6.18-53) is quite old. If the problem still remains, could you specify which network driver you use on the affected RHEL5 machine (or simply attach the "lspci -v" and "lsmod" outputs)? Thanks.

Updated the kernel to 2.6.18-128.1.10, x86_64, and tested again. The symptom still exists.

(Detail)
PC = ssh => RHEL4 = telnet => RHEL5(2.6.18-128.1.10)

(0) Telnet login to RHEL5 box.
(1) # cat /tmp/test_file
(2) Press CTRL+C. Scrolling stops as expected. Shell prompt comes back.
(3) # cat /tmp/test_file
    Scrolling stops automatically, even though I didn't press CTRL+C. Shell prompt comes back.

The network card on RHEL5 is Broadcom BCM5708.
+-1c.0-[0000:02-03]----00.0-[0000:03]----00.0 Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet
+-1c.1-[0000:04-05]----00.0-[0000:05]----00.0 Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet

Driver version
bnx2 v1.7.9-1 (July 18, 2008)
This driver came from kernel 2.6.18-128.1.10.

# more /etc/modprobe.conf
alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter cciss
alias scsi_hostadapter1 ata_piix

As I expected, this is a kernel problem. I created a long text file:
0 1 2 .... 72802
and then tried to "cat" it on a machine with a NetXtreme II BCM5708 card (bnx2 driver). The first time, I terminated the cat command via ctrl+c, and when I then tried to "cat" the whole file again it failed. Interestingly, the strace utility shows that the whole file was written:
...
[pid 3924] write(1, "70801\n70802\n70803\n70804\n70805\n70"..., 4096) = 4096
[pid 3924] read(3, "3\n71484\n71485\n71486\n71487\n71488\n"..., 4096) = 4096
[pid 3924] write(1, "3\n71484\n71485\n71486\n71487\n71488\n"..., 4096) = 4096
[pid 3924] read(3, "166\n72167\n72168\n72169\n72170\n7217"..., 4096) = 3820
[pid 3924] write(1, "166\n72167\n72168\n72169\n72170\n7217"..., 3820) = 3820
[pid 3924] read(3, "", 4096) = 0
[pid 3924] close(3) = 0
[pid 3924] close(1) = 0
[pid 3924] exit_group(0) = ?
Process 3810 resumed

But on the client side the output ends around number 1000 (sometimes more, sometimes less). When I tried to reproduce the same problem in a virtual machine (Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ network card, 8139too driver), everything worked fine. Reassigning to kernel for further inspection.

I downloaded the 2.6.29 kernel source from kernel.org for testing. I booted the RHEL5 box with the 2.6.29 kernel and ran the same test. It works as expected, both in the RHEL4 case and in the Tru64 case.

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.9.3 (March 17, 2009)
====
FYI, my RHEL5 box is not a virtual machine. I tested another RHEL5 system running on Xen; it did not show this symptom. I tested yet another RHEL5 system running on VMware; it shows this symptom. Thanks.

Tested the following kernels.
2.6.24 - 2.6.27 Symptom reproduced.
2.6.28, 2.6.29 Symptom fixed.

Confirmed that 2.6.28-rc1 also fixes the symptom.

Sorry, Comment #6 was wrong information. I have double-checked the 2.6.28-rc series: 2.6.28-rc1, rc5 and rc7 were bad kernels. 2.6.28-rc8 was the oldest fixed kernel in the 2.6.28-rc series. 2.6.28 and 2.6.29 were fixed, too.

> This looks like kernel problem for me (handling of OOB data).
Thanks for the hint. I have identified the fix in the 2.6.28-rc8 changelog.
commit 33cf71cee14743185305c61625c4544885055733
Author: Petr Tesarik <ptesarik>
Date: Fri Nov 21 16:42:58 2008 -0800

tcp: Do not use TSO/GSO when there is urgent data

This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=12014

The patch is
http://mirror.celinuxforum.org/gitstat//commit-detail.php?commit=33cf71cee14743185305c61625c4544885055733

I have applied this patch to kernel 2.6.28-rc7 and confirmed that the symptom was fixed. I ask Red Hat to backport this patch into the RHEL5 kernel.

Created attachment 350711: posted patch
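
The fix referenced above concerns TCP urgent (out-of-band) data, which telnet uses to signal an interrupt such as CTRL+C. As a minimal sketch of what urgent data looks like at the socket level, the following Python snippet sends ordinary data, one urgent byte, and more ordinary data over a loopback connection and collects everything on the receiving side. It is not a reproducer for this bug (that needs an affected kernel and a TSO/GSO-capable path); the function name roundtrip_with_urgent is hypothetical, and reading the urgent byte inline via SO_OOBINLINE is an assumption made to keep the example short.

```python
import socket
import threading


def roundtrip_with_urgent(before: bytes, urgent: bytes, after: bytes) -> bytes:
    """Send before + one urgent byte + after over loopback; return what arrives.

    Hypothetical helper for illustration only. `urgent` should be one byte,
    because TCP marks only a single byte as urgent.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    listener.close()

    # Assumption: deliver the urgent byte inline with the normal stream,
    # so a plain recv() loop sees every byte in order.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_OOBINLINE, 1)
    conn.settimeout(5)  # avoid hanging forever if something goes wrong

    received = bytearray()

    def drain() -> None:
        # Read until the sender closes the connection.
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()

    sender.sendall(before)
    sender.send(urgent, socket.MSG_OOB)  # sets the TCP urgent pointer
    sender.sendall(after)
    sender.close()

    reader.join()
    conn.close()
    return bytes(received)
```

On a healthy stack, a call such as roundtrip_with_urgent(b"abc", b"!", b"def") should hand back all seven bytes in order. The symptom in this report showed up as output going missing on the client side once urgent data had been sent on the connection.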
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

in kernel-2.6.18-168.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.

~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this release that addresses your request. Please test and report back results here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set this bug into NEED_INFO. If you encounter new defects or have additional patch(es) to request for inclusion, please clone this bug per each request and escalate through your support representative.

Confirmed that kernel-2.6.18-186 (x86_64) fixes this symptom. Thank you for the support.

I can confirm cat is still interruptible with the -192 kernel.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html
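
For anyone re-verifying a kernel, the test described in the report (a file of consecutive integers from 0 to 72802, sent with cat over telnet) can be scripted. This is a minimal sketch, assuming one number per line as the strace output suggests. The names write_test_file and first_missing are hypothetical helpers, and the captured client-side output is assumed to be available as lines of text.

```python
from typing import Iterable, Optional


def write_test_file(path: str, last: int = 72802) -> None:
    """Write the integers 0..last, one per line (assumed layout of the test file)."""
    with open(path, "w") as f:
        for n in range(last + 1):
            f.write(f"{n}\n")


def first_missing(lines: Iterable[str], last: int = 72802) -> Optional[int]:
    """Return the first expected number absent from captured output, or None.

    Lines that are not purely digits (shell prompts, blank lines) are skipped.
    """
    expected = 0
    for line in lines:
        text = line.strip()
        if not text.isdigit():
            continue
        if int(text) != expected:
            return expected  # gap or out-of-order number
        expected += 1
    # If the sequence stopped early, report where it stopped.
    return None if expected > last else expected
```

With output captured from the telnet client, first_missing returning None means the whole file arrived; a value near 1000 would match the truncation described in this report.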