Bug 440467

Summary: ethtool -S on r8169 version 2.2LK hangs when interface is down
Product: Red Hat Enterprise Linux 4
Reporter: Bryn M. Reeves <bmr>
Component: kernel
Assignee: Ivan Vecera <ivecera>
Status: CLOSED ERRATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: medium
Version: 4.6
CC: cward, dmair, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Clone Of:
: 490162
Last Closed: 2009-05-18 19:30:50 UTC

Description Bryn M. Reeves 2008-04-03 17:33:54 UTC
Description of problem:
Running ethtool -S on a Realtek R8169 device that is currently down causes
ethtool to hang:

$ ethtool -S eth0
<waits forever>

The tool is interruptible by signals during this hang.

Previous versions would exit with no stats as the old driver did not provide an
ethtool_stats method.

Version-Release number of selected component (if applicable):
2.6.9-42.32 onwards

How reproducible:
100%

Steps to Reproduce:
If eth0 is an 8169:
1. ifconfig eth0 down
2. ethtool -S eth0

Actual results:
Hang

Expected results:
Returns 0 stats like other NIC drivers.

Additional info:

Comment 1 Bryn M. Reeves 2008-04-03 17:43:33 UTC
Using strace shows we're hanging out in the ethtool ioctl:

24564 06:29:11 munmap(0x2a95557000, 138967) = 0
24564 06:29:11 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
24564 06:29:11 ioctl(3, SIOCETHTOOL, 0x7fbffff5c0) = 0
24564 06:29:11 brk(0)                   = 0x513000
24564 06:29:11 brk(0x534000)            = 0x534000
24564 06:29:11 ioctl(3, SIOCETHTOOL, 0x7fbffff5c0) = 0
24564 06:29:11 ioctl(3, SIOCETHTOOL, 0x7fbffff5c0) = 0
24564 06:42:41 --- SIGINT (Interrupt) @ 0 (0) ---
24564 06:42:41 +++ killed by SIGINT +++

I'm wondering if this is down to this code:

+       while (RTL_R32(CounterAddrLow) & CounterDump) {
+               if (msleep_interruptible(1))
+                       break;
+       }
+

This seems to be the only thing that would put us in interruptible sleep in the
ethtool_stats path. Is it plausible that the terminating condition would
never be reached if the NIC is down?

The code's unchanged upstream (introduced in commit
d4a3a0fc9c2d012093cf75a8d95034966c17e71e), but it seems like adding a
timeout/iteration limit on that loop would address this?
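
To illustrate the iteration-limit idea, here is a minimal standalone sketch. It is not the driver code and not the upstream fix: the register read is replaced by a hypothetical callback, struct fake_nic is an invented test double, and the COUNTER_DUMP bit value is assumed for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_DUMP 0x8u   /* assumed "dump in progress" bit, illustration only */

/* Hypothetical stand-in for a register read such as RTL_R32(CounterAddrLow). */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Invented test double: reports "busy" for busy_reads reads, or forever
 * if never_completes is set (modelling a NIC that is down). */
struct fake_nic {
    unsigned busy_reads;
    bool never_completes;
    unsigned reads;
};

uint32_t fake_read(void *ctx)
{
    struct fake_nic *nic = ctx;

    nic->reads++;
    if (nic->never_completes || nic->reads <= nic->busy_reads)
        return COUNTER_DUMP;
    return 0;
}

/* Poll until the busy bit clears or max_polls reads have been made.
 * Returns true if the dump completed, false if the limit was hit. */
bool poll_counter_dump(read_reg_fn read_reg, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (!(read_reg(ctx) & COUNTER_DUMP))
            return true;
        /* A driver would sleep or delay briefly here between reads. */
    }
    return false;
}
```

With a limit in place, a device that never clears the bit makes the call return false after max_polls reads instead of looping until a signal arrives, so the caller can report empty statistics.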



Comment 9 Ivan Vecera 2009-01-21 16:02:16 UTC
I found that this issue does not affect only RHEL 4. It also affects RHEL 5 and upstream. Not all NICs supported by the r8169 driver cause this problem. I have several r8169-based NICs and only one (RTL8169sb PCI) is unable to return ethtool statistics when it is down.

Comment 10 Ivan Vecera 2009-02-10 13:58:00 UTC
The patch resolving this issue was pushed upstream 3 days ago. I have tested the patch on my machine (kernel 2.6.27) without any problems. I'm going to prepare the kernel packages for RHEL-4 now and I will provide them soon.

URL with upstream commit:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commitdiff;h=355423d0849f4506bc71ab2738d38cb74429aaef

Comment 11 Ivan Vecera 2009-02-10 16:48:07 UTC
The kernel packages for testing are available at:
http://people.redhat.com/ivecera/rhel-4-ivtest/

Could you please test them?

Comment 12 RHEL Program Management 2009-02-10 16:53:40 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 18 Vivek Goyal 2009-03-11 14:09:33 UTC
Committed in 83.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 20 Chris Ward 2009-05-05 13:55:53 UTC
Any updates here? Has this issue been resolved in the RHEL 4.8 Beta or a later kernel?

Comment 24 errata-xmlrpc 2009-05-18 19:30:50 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html