Bug 859707

Summary: vlan tagging leads to spurious hw errors on sunhme
Product: Red Hat Enterprise Linux 6
Reporter: manuel wolfshant <manuel.wolfshant>
Component: kernel
Assignee: Andy Gospodarek <agospoda>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Priority: unspecified
Version: 6.3
CC: agospoda, peterm
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2013-12-09 21:23:15 UTC
Type: Bug
Attachments:
grep '?' /var/log/messages (flags: none)

Description manuel wolfshant 2012-09-23 12:37:14 UTC
Created attachment 616095
grep '?' /var/log/messages

Description of problem:
I have just installed CentOS 6.3 on an old machine that has a Sun Happy Meal quad card inside. Ever since it was first booted, the logs have been filling with messages similar to those in the attached file.

Version-Release number of selected component (if applicable):
Linux mail3 2.6.32-279.5.2.el6.i686 #1 SMP Thu Aug 23 22:16:48 UTC 2012 i686 i686 i386 GNU/Linux


How reproducible:
Always

Steps to Reproduce:
1. Install CentOS 6.3 on a machine with a Sun Happy Meal quad card.
2. Configure the network to use VLANs. In my setup I have:

6: eth1.5@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 08:00:20:bf:09:f1 brd ff:ff:ff:ff:ff:ff
    inet 86.a.b.c/27 brd 86.a.b.d scope global eth1.5
    inet 212.u.v.145/32 brd 212.u.v.145 scope global eth1.5:1
    inet 212.u.v.144/32 brd 212.u.v.144 scope global eth1.5:2
    inet 212.u.v.146/32 brd 212.u.v.146 scope global eth1.5:3
    inet 212.u.v.147/32 brd 212.u.v.147 scope global eth1.5:4
7: eth1.6@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 08:00:20:bf:09:f1 brd ff:ff:ff:ff:ff:ff
    inet 83.x.y.z/24 brd 83.166.206.255 scope global eth1.6

3. Start the network.
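
For reference, a VLAN sub-interface like the ones in step 2 is typically defined on RHEL/CentOS 6 through an initscripts file under /etc/sysconfig/network-scripts. The reporter's actual files are not shown in this report, so the following is only a minimal sketch, assuming static addressing. The file name is hypothetical, and the address is a documentation placeholder from 192.0.2.0/24, not the masked address from step 2.

```shell
# Hypothetical /etc/sysconfig/network-scripts/ifcfg-eth1.5
# (illustrative only; not taken from the reporter's machine)

# Name is <parent>.<VLAN ID>: VLAN 5 on top of eth1
DEVICE=eth1.5
# Ask the initscripts to create an 802.1Q VLAN interface
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
# Placeholder address (RFC 5737 documentation range), assumed for the example
IPADDR=192.0.2.10
# /27, the same prefix length as the first address on eth1.5 in step 2
NETMASK=255.255.255.224
```

A second file of the same shape with DEVICE=eth1.6 would cover the other tagged interface.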
  
Actual results:
lots and lots of
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:29 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:34 mail3 kernel: eth1.6: hw csum failure.
Sep 23 15:29:37 mail3 kernel: eth1.5: hw csum failure.
Sep 23 15:29:40 mail3 kernel: eth1.5: hw csum failure.
Each one is followed by a call trace. See the attached file for a longer log.

Expected results:
No such messages should appear. The card is functioning OK and is connected to an IBM BladeChassis, which is also OK.


Additional info:
[root@mail3 ~]# rpm -qa kernel\*
kernel-firmware-2.6.32-279.5.2.el6.noarch
kernel-2.6.32-279.el6.i686
kernel-2.6.32-279.5.2.el6.i686

[root@mail3 ~]# uptime ; grep "csum failure" /var/log/messages -c
 15:35:49 up  1:18,  2 users,  load average: 0.19, 0.07, 0.02
1213
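
The total above does not show how the failures are split between the two tagged interfaces. A minimal sketch of a per-interface tally follows, assuming the syslog line format shown under "Actual results". The function name count_csum_failures is made up for this example.

```shell
# Count "hw csum failure" kernel messages per interface.
# Reads syslog-formatted lines on stdin and prints one
# "<interface> <count>" pair per line, sorted by interface name.
count_csum_failures() {
    # Keep only matching lines and reduce each to the interface name
    sed -n 's/.*kernel: \([^:]*\): hw csum failure.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Usage (log path as in the report):
#   count_csum_failures < /var/log/messages
```

Fed the nine sample lines listed under "Actual results", this prints "eth1.5 8" and "eth1.6 1".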

Comment 1 manuel wolfshant 2012-09-23 12:37:38 UTC
[root@mail3 ~]# lspci
00:00.0 Host bridge: ALi Corporation M1647 Northbridge [MAGiK 1 / MobileMAGiK 1] (rev 04)
00:01.0 PCI bridge: ALi Corporation PCI to AGP Controller
00:02.0 USB controller: ALi Corporation USB 1.1 Controller (rev 03)
00:04.0 IDE interface: ALi Corporation M5229 IDE (rev c4)
00:06.0 USB controller: ALi Corporation USB 1.1 Controller (rev 03)
00:07.0 ISA bridge: ALi Corporation M1533/M1535/M1543 PCI to ISA Bridge [Aladdin IV/V/V+]
00:09.0 PCI bridge: Digital Equipment Corporation DECchip 21153 (rev 04)
00:11.0 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
01:00.0 VGA compatible controller: NVIDIA Corporation NV5M64 [RIVA TNT2 Model 64/Model 64 Pro] (rev 11)
02:00.0 Bridge: Oracle/SUN EBUS (rev 01)
02:00.1 Ethernet controller: Oracle/SUN Happy Meal 10/100 Ethernet [hme] (rev 01)
02:01.0 Bridge: Oracle/SUN EBUS (rev 01)
02:01.1 Ethernet controller: Oracle/SUN Happy Meal 10/100 Ethernet [hme] (rev 01)
02:02.0 Bridge: Oracle/SUN EBUS (rev 01)
02:02.1 Ethernet controller: Oracle/SUN Happy Meal 10/100 Ethernet [hme] (rev 01)
02:03.0 Bridge: Oracle/SUN EBUS (rev 01)
02:03.1 Ethernet controller: Oracle/SUN Happy Meal 10/100 Ethernet [hme] (rev 01)

Comment 4 RHEL Program Management 2012-12-14 08:52:49 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 6 Andy Gospodarek 2013-05-06 20:38:09 UTC
Is this still a problem with the latest RHEL 6.4? I suspect it is, but I do not have any of these devices at this point. Let me know and I'll see if I can find a device.

Comment 7 manuel wolfshant 2013-05-08 14:29:25 UTC
   I just tried to reproduce the issue using exactly the same hardware as before. The only difference is that in the original scenario the exit interface was running in VLAN mode, while now in the test lab it was set up without tagging.

   The good news is that I failed to reproduce the initial problem. Over the next few days I will try to verify whether the kernel from 6.3 and/or the use of VLANs makes a difference.

Comment 8 Andy Gospodarek 2013-05-08 14:37:46 UTC
Thanks!

Comment 9 Andy Gospodarek 2013-12-09 21:23:15 UTC
Without a response it is difficult to diagnose.

Please reopen if this is still a problem on your hardware.

Thanks!