857676 – directly connected NIC lost packets or cannot communicate to each other



Keywords:
Status:	CLOSED INSUFFICIENT_DATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.2
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Veaceslav Falico
QA Contact:	BaseOS QE - Apps
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-09-16 00:48 UTC by davidyangyi
Modified:	2014-09-30 23:45 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-04-23 15:34:59 UTC
Target Upstream Version:
Embargoed:


Description davidyangyi 2012-09-16 00:48:09 UTC

Description of problem:
Two servers are directly connected: eth6 to eth6 and eth7 to eth7. On each server, eth6 and eth7 are bonded together as bond3.

Most of the time they work fine, but sometimes ping loses packets, and sometimes the servers cannot ping each other at all.


Version-Release number of selected component (if applicable):


How reproducible:

server1: 
# cat /etc/modprobe.d/bonding.conf
alias bond3 bonding
options bond3 mode=1 miimon=100 primary=eth6

# cat ifcfg-bond3
DEVICE=bond3
BOOTPROTO=static
ONBOOT=yes
NM_CONTROLLED=no
IPADDR=12.12.200.12
NETMASK=255.255.255.128

# cat ifcfg-eth6
DEVICE=eth6
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond3
SLAVE=yes
USERCTL=no

# cat ifcfg-eth7
DEVICE=eth7
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond3
SLAVE=yes
USERCTL=no


server2: 
# cat /etc/modprobe.d/bonding.conf
alias bond3 bonding
options bond3 mode=1 miimon=100 primary=eth6

# cat ifcfg-bond3
DEVICE=bond3
BOOTPROTO=static
ONBOOT=yes
NM_CONTROLLED=no
IPADDR=12.12.200.20
NETMASK=255.255.255.128

# cat ifcfg-eth6
DEVICE=eth6
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond3
SLAVE=yes
USERCTL=no


# cat ifcfg-eth7
DEVICE=eth7
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond3
SLAVE=yes
USERCTL=no


Restart the system and ping:

# ping 12.12.200.12
PING 12.12.200.12 (12.12.200.12) 56(84) bytes of data.
64 bytes from 12.12.200.12: icmp_seq=2 ttl=64 time=0.184 ms
64 bytes from 12.12.200.12: icmp_seq=3 ttl=64 time=0.140 ms
64 bytes from 12.12.200.12: icmp_seq=6 ttl=64 time=0.153 ms
64 bytes from 12.12.200.12: icmp_seq=13 ttl=64 time=0.186 ms
64 bytes from 12.12.200.12: icmp_seq=17 ttl=64 time=0.131 ms
64 bytes from 12.12.200.12: icmp_seq=18 ttl=64 time=0.138 ms
64 bytes from 12.12.200.12: icmp_seq=19 ttl=64 time=0.138 ms
64 bytes from 12.12.200.12: icmp_seq=22 ttl=64 time=0.206 ms
^C
--- 12.12.200.12 ping statistics ---
26 packets transmitted, 8 received, 69% packet loss, time 25413ms
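
To see which packets were dropped rather than only the overall loss rate, the ping output above can be scanned for gaps in icmp_seq. The following is a minimal sketch, not part of the original report: the function name missing_sequences is hypothetical, and it assumes sequence numbers start at 1, as Linux ping normally numbers them.

```python
import re

def missing_sequences(ping_output, transmitted):
    """Return the icmp_seq numbers that never got a reply.

    Assumes (not stated in the report) that sequence numbers run
    from 1 up to the "packets transmitted" count.
    """
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)}
    return [seq for seq in range(1, transmitted + 1) if seq not in seen]

# Reply lines copied from the ping run in this report.
sample = """\
64 bytes from 12.12.200.12: icmp_seq=2 ttl=64 time=0.184 ms
64 bytes from 12.12.200.12: icmp_seq=3 ttl=64 time=0.140 ms
64 bytes from 12.12.200.12: icmp_seq=6 ttl=64 time=0.153 ms
64 bytes from 12.12.200.12: icmp_seq=13 ttl=64 time=0.186 ms
64 bytes from 12.12.200.12: icmp_seq=17 ttl=64 time=0.131 ms
64 bytes from 12.12.200.12: icmp_seq=18 ttl=64 time=0.138 ms
64 bytes from 12.12.200.12: icmp_seq=19 ttl=64 time=0.138 ms
64 bytes from 12.12.200.12: icmp_seq=22 ttl=64 time=0.206 ms
"""

missing = missing_sequences(sample, transmitted=26)
print(len(missing), "replies missing:", missing)
```

For this run the gaps come in bursts (for example 7 through 12) rather than as evenly spread single drops, which may help when correlating the loss with link or failover events.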


After restarting the system again or restarting the network, it works fine again.



# cat /proc/net/bonding/bond3
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth6
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth6
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:e0:ed:20:a1:62
Slave queue ID: 0

Slave Interface: eth7
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:e0:ed:20:a1:63
Slave queue ID: 0
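
When comparing both servers, it can help to reduce the /proc/net/bonding output above to a few fields. The sketch below is an illustration only, not something from the original report: parse_bond_status is a hypothetical helper, and it assumes the "Key: value" layout shown above. One detail it surfaces is that the bond reports "Primary Slave: None" although bonding.conf sets primary=eth6; whether that is related to the packet loss is not established here.

```python
def parse_bond_status(text):
    """Split /proc/net/bonding text into bond-level and per-slave fields.

    Assumes the "Key: value" layout shown in this report; other
    bonding driver versions may format the file differently.
    """
    bond, slaves = {}, []
    current = bond
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # Each "Slave Interface" line starts a new per-slave section.
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves

# Excerpt of the output posted in this report.
sample = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth6
MII Status: up

Slave Interface: eth6
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:e0:ed:20:a1:62

Slave Interface: eth7
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:e0:ed:20:a1:63
"""

bond, slaves = parse_bond_status(sample)
print("primary:", bond["Primary Slave"], "active:", bond["Currently Active Slave"])
for s in slaves:
    print(s["name"], s["MII Status"], "failures:", s["Link Failure Count"])
```

Running the same summary against the file on server1 and server2 would show whether both ends agree on the active slave at the moment the pings fail.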


# ethtool eth6
Settings for eth6:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        MDI-X: Unknown
        Supports Wake-on: pumbg
        Wake-on: d
        Current message level: 0x00000003 (3)
        Link detected: yes


# ethtool eth7
Settings for eth7:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000003 (3)
        Link detected: yes


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 davidyangyi 2012-09-16 00:50:18 UTC

the NIC type is :

04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

Comment 3 Jay Fenlason 2012-09-17 03:33:28 UTC

What ethtool command(s) failed?  Why do you think this is an ethtool problem rather than a Linux kernel problem?  Ethtool is a very simple program that merely tells the kernel what to do.

Comment 4 davidyangyi 2012-09-20 06:16:00 UTC

I didn't know which component caused the problem, so I chose ethtool more or less at random.
I have now changed the component to kernel. Please give me some ideas. Thank you very much.

Comment 5 RHEL Program Management 2012-12-14 08:00:29 UTC

This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 7 Veaceslav Falico 2013-04-23 15:34:59 UTC

Hi,

Sorry, I'm closing this bug for lack of information. A ping can fail in thousands of situations, and since there have been no comments for such a long time, it seems the problem got fixed.

If you really think it should be open and can provide additional information on the bug, feel free to reopen it.

Thank you!
