Bug 616352

Summary: Lost first 9 packets when ping host using more than 2 NICs
Product: Red Hat Enterprise Linux 6 Reporter: Amos Kong <akong>
Component: kernel Assignee: Justin M. Forbes <jforbes>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.0 CC: ailan, gcosta, herbert.xu, lcapitulino, llim, ndai, tburke, virt-maint
Target Milestone: rc Keywords: RHELNAK
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-08-05 14:20:15 UTC Type: ---

Description Amos Kong 2010-07-20 08:46:12 UTC
Description of problem:
I boot a guest with four virtio NICs and ping the host from the guest through each NIC in turn. The first 10 packets are always lost, even though the related ARP entries exist on both host and guest.

After waiting a while (about 1 min) and pinging again through the different NICs,
packets are still lost (not always, but less often).


Version-Release number of selected component (if applicable):
guest kernel: 2.6.32-44.1.el6.x86_64
host kernel: 2.6.32-44.1.el6.x86_64
# rpm -qa |grep qemu
qemu-img-0.12.1.2-2.96.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.96.el6.x86_64
qemu-kvm-0.12.1.2-2.96.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.96.el6.x86_64
gpxe-roms-qemu-0.9.7-6.3.el6.noarch


How reproducible:
always

Steps to Reproduce:
1. boot up guest with four nics
2. connect vnc
# vncviewer :0
3. ping the host from the guest through each NIC in turn (ping through eth1 first)
# ping $host_ip -I eth1
# ping $host_ip -I eth2
# ping $host_ip -I eth3
# ping $host_ip -I eth0
  
Actual results:
the first 10 packets are always lost

Expected results:
no packets lost

Additional info:
1. QEMU-command-line:
#qemu-kvm -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100720-150129-AxiM',server,nowait -serial unix:'/tmp/serial-201000-150129-AxiM',server,nowait -drive file='/home/devel/push/client/tests/kvm/images/RHEL-Server-6.0-64-virtio.qcow2',if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=on,boot=on,format=qcow2 -device virtio-blk-pci,drive=drive-virtio-disk1,id=virtio-disk1 -net nic,vlan=0,netdev=idynJSmX,model=virtio,macaddr='02:A9:7C:6C:17:14' -netdev tap,id=idynJSmX,ifname='virtio_0_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup',downscript='no',vhost=on -net nic,vlan=1,netdev=id9i4ifK,model=virtio,macaddr='02:A9:7C:6C:0a:5a' -netdev tap,id=id9i4ifK,ifname='virtio_1_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup',downscript='no',vhost=on -net nic,vlan=2,netdev=idBs5261,model=virtio,macaddr='02:A9:7C:6C:d8:6d' -netdev tap,id=idBs5261,ifname='virtio_2_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup',downscript='no',vhost=on -net nic,vlan=3,netdev=idoRwFAY,model=virtio,macaddr='02:A9:7C:6C:a6:52' -netdev tap,id=idoRwFAY,ifname='virtio_3_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup',downscript='no',vhost=on -m 2048 -smp 2 -vnc :0 -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host -M rhel6.0.0 -usbdevice tablet -cpu qemu64,+sse2 -no-kvm-pit-reinjection

2. arp entries on host (can ping successfully)
host) # arp -a
rhel (10.66.91.190) at 00:23:ae:8f:f3:8d [ether] on switch
dhcp-91-156.nay.redhat.com (10.66.91.156) at 02:a9:7c:6c:0a:5a [ether] on switch
dhcp-91-107.nay.redhat.com (10.66.91.107) at 02:a9:7c:6c:0a:5a [ether] on switch
dhcp-91-36.nay.redhat.com (10.66.91.36) at 02:a9:7c:6c:a6:52 [ether] on switch
dhcp-91-154.nay.redhat.com (10.66.91.154) at 02:a9:7c:6c:0a:5a [ether] on switch

3. arp entries on guest (can ping successfully)
guest) # arp -a
dhcp-91-173.nay.redhat.com (10.66.91.173) at 00:23:ae:a9:7c:6c [ether] on eth2
dhcp-91-173.nay.redhat.com (10.66.91.173) at 00:23:ae:a9:7c:6c [ether] on eth0
dhcp-91-173.nay.redhat.com (10.66.91.173) at 00:23:ae:a9:7c:6c [ether] on eth3
dhcp-91-173.nay.redhat.com (10.66.91.173) at 00:23:ae:a9:7c:6c [ether] on eth1

Comment 1 Amos Kong 2010-07-20 08:57:48 UTC
With the bridge forward delay set to 0, the bug still occurs.

host) # brctl setfd switch 0

Comment 3 RHEL Program Management 2010-07-20 09:17:39 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 4 Dor Laor 2010-07-22 20:36:00 UTC
Do you have spanning tree off for the bridge (you should)?
Can you also test with vhost disabled?

Comment 5 Amos Kong 2010-07-28 06:35:45 UTC
(In reply to comment #4)
> Do you have spanning tree off for the bridge (you should)?

Yes, stp was stopped.

> Can you also test with vhost disabled?    

The bug can also be reproduced with vhost disabled.

It could not be reproduced with a RHEL 5.5 guest (vhost=on/off). Could this be a bug in the RHEL 6 virtio driver?

Comment 6 Luiz Capitulino 2010-07-28 19:46:35 UTC
What do you mean by "previous 10 packets"? Can you show the ping output please? Also, could you try to reduce the command-line options and the number of NICs to the minimal set that reproduces the problem?

Comment 7 Amos Kong 2010-07-29 05:43:46 UTC
(In reply to comment #6)
> What do you mean by "previous 10 packets"? Can you show the ping output please?

The first 9 packets are lost.
BTW, the subject is changed to 'Lost first 9 packets when ping host using more than 2 NICs'

> Also, could you try to reduce the command-line options and the number of NICs
> to the minimal set that reproduces the problem?    

Cannot reproduce with 2 NICs.
Can reproduce with more than 2 NICs.

qemu command line:
# qemu-kvm RHEL-Server-6.0-64-virtio.qcow2 -net nic,vlan=0,netdev=idMk6AAs,model=virtio,macaddr='02:A9:7C:6C:9f:8f' -netdev tap,id=idMk6AAs,ifname='virtio_0_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no',vhost=on -net nic,vlan=1,netdev=idLMFA6P,model=virtio,macaddr='02:A9:7C:6C:f9:ce' -netdev tap,id=idLMFA6P,ifname='virtio_1_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no',vhost=on -net nic,vlan=2,netdev=idBs5261,model=e1000,macaddr='02:A9:7C:6C:d8:6d' -netdev tap,id=idBs5261,ifname='virtio_2_8000',script='/home/devel/push/client/tests/kvm/scripts/qemu-ifup',downscript='no',vhost=on -m 512 -vnc :0 -cpu qemu64,+sse2 -no-kvm-pit-reinjection


host ip: 10.66.91.173
guest) # ping 10.66.91.173 -I eth2 -c 15 
PING 10.66.91.173 (10.66.91.173) from 10.66.91.178 eth2: 56(84) bytes of data.
64 bytes from 10.66.91.173: icmp_seq=10 ttl=64 time=0.612 ms
64 bytes from 10.66.91.173: icmp_seq=11 ttl=64 time=0.127 ms
64 bytes from 10.66.91.173: icmp_seq=12 ttl=64 time=0.113 ms
64 bytes from 10.66.91.173: icmp_seq=13 ttl=64 time=0.109 ms
64 bytes from 10.66.91.173: icmp_seq=14 ttl=64 time=0.088 ms
64 bytes from 10.66.91.173: icmp_seq=15 ttl=64 time=0.116 ms

--- 10.66.91.173 ping statistics ---
15 packets transmitted, 6 received, 60% packet loss, time 15007ms
rtt min/avg/max/mdev = 0.088/0.194/0.612/0.187 ms

guest) # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:A9:7C:6C:9F:8F  
          inet addr:10.66.91.162  Bcast:10.66.91.255  Mask:255.255.255.0
          inet6 addr: fe80::a9:7cff:fe6c:9f8f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:849 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:103950 (101.5 KiB)  TX bytes:4844 (4.7 KiB)

eth1      Link encap:Ethernet  HWaddr 02:A9:7C:6C:F9:CE  
          inet addr:10.66.91.196  Bcast:10.66.91.255  Mask:255.255.255.0
          inet6 addr: fe80::a9:7cff:fe6c:f9ce/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:846 errors:0 dropped:0 overruns:0 frame:0
          TX packets:25 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:103626 (101.1 KiB)  TX bytes:4136 (4.0 KiB)

eth2      Link encap:Ethernet  HWaddr 02:A9:7C:6C:D8:6D  
          inet addr:10.66.91.178  Bcast:10.66.91.255  Mask:255.255.255.0
          inet6 addr: fe80::a9:7cff:fe6c:d86d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:411 errors:0 dropped:0 overruns:0 frame:0
          TX packets:46 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:56914 (55.5 KiB)  TX bytes:6454 (6.3 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
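
The loss pattern can be read directly off the eth2 ping run above: the first reply carries icmp_seq=10, so sequence numbers 1 through 9 went unanswered (9 of 15 packets, i.e. 60%). A minimal sketch for pulling that number out of saved ping output, assuming the output format shown in this report; the function name first_reply_seq is made up for illustration:

```shell
#!/bin/sh
# first_reply_seq: read ping output on stdin and print the icmp_seq of the
# first reply line. Prints nothing if no reply was received.
# (Hypothetical helper; assumes the "icmp_seq=N" format shown in this report.)
first_reply_seq() {
    sed -n 's/.*icmp_seq=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example input: the first two reply lines of the eth2 run in this comment.
sample='64 bytes from 10.66.91.173: icmp_seq=10 ttl=64 time=0.612 ms
64 bytes from 10.66.91.173: icmp_seq=11 ttl=64 time=0.127 ms'

first=$(printf '%s\n' "$sample" | first_reply_seq)
echo "first reply: icmp_seq=$first, lost before it: $((first - 1))"
```

Running the same function over the output of each "ping -I ethN" run would show which interfaces lose their leading packets and how many.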

Comment 8 Glauber Costa 2010-08-03 12:46:49 UTC
Can I see the contents of your qemu-ifup scripts ?

Comment 9 Amos Kong 2010-08-04 01:15:57 UTC
(In reply to comment #8)
> Can I see the contents of your qemu-ifup scripts ?    

"switch" is a public bridge on the host.

# cat qemu-ifup-switch 
#!/bin/sh
switch=switch
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif ${switch} $1
/usr/sbin/brctl setfd ${switch} 0
/usr/sbin/brctl stp ${switch} off


[root@dhcp-91-173 network-scripts]# cat ifcfg-eth0
DEVICE="eth0"
DEFROUTE="yes"
HWADDR="00:23:AE:A9:7C:6C"
IPV4_FAILURE_FATAL="yes"
IPV6INIT="no"
NAME="System eth0"
NM_CONTROLLED="yes"
ONBOOT="yes"
OPTIONS="layer2"
PEERDNS="yes"
PEERROUTES="yes"
TYPE="Ethernet"
UUID="5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03"
BRIDGE=switch
[root@dhcp-91-173 network-scripts]# cat ifcfg-br0 
DEVICE=switch
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Bridge

Comment 10 Glauber Costa 2010-08-04 14:41:26 UTC
Ok, I am still investigating the issue, but I will drop the info I have here, in case anybody is also looking at it.

First of all, it is indeed an arp-related problem.

If I pin the ARP entries, the problem goes away. If I let 10 packets fire and then manually drop the ARP entry, it stops working again.

It is also worth noting that one of the interfaces always seems to work.
tcpdump in the guest says this:

10:11:43.375275 ARP, Request who-has virtlab1.virt.bos.redhat.com tell dhcp75-13.virt.bos.redhat.com, length 28
10:11:43.375496 ARP, Reply virtlab1.virt.bos.redhat.com is-at 00:13:20:f5:fe:6b (oui Unknown), length 28

This is eth1, and it works. On eth0, which does NOT work, I see this:

10:11:43.375456 ARP, Request who-has virtlab1.virt.bos.redhat.com tell dhcp75-13.virt.bos.redhat.com, length 28

But no reply. Guest tcpdump does not even show ICMP requests. A little later on, I see another ARP request, an actual reply, and then the ICMP packets start showing up.

I'll keep on it, further tips are appreciated.

Comment 11 Herbert Xu 2010-08-05 14:04:56 UTC
This is not a bug, just rp_filter doing its job.

You should never connect four interfaces on the same machine (even if it's a virtual machine) to the same Ethernet, run DHCP and expect it to work.

Only one interface will work, the one whose subnet route is added last.

All the other ones will drop all inbound packets because of rp_filter.
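
To check whether reverse-path filtering is active on the guest, the per-interface settings can be read from procfs. A minimal sketch, assuming a Linux guest with the standard /proc/sys/net/ipv4/conf layout; the function name and its optional directory argument are illustrative only (the argument exists so the loop can be pointed at a copy of the tree):

```shell
#!/bin/sh
# show_rp_filter: print "<interface> <value>" for every rp_filter knob found
# under the given conf directory (default: the live procfs tree).
# 0 = no source validation, 1 = strict mode; kernels that support it also
# accept 2 = loose mode.
show_rp_filter() {
    conf_dir="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$conf_dir"/*/rp_filter; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        iface=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}

show_rp_filter
```

The kernel's ip-sysctl documentation states that the larger of the "all" value and the per-interface value is the one applied, so both entries matter when reading this output.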

Comment 12 Herbert Xu 2010-08-05 14:09:05 UTC
Even if you disable rp_filter, things won't work consistently because IPv4 addresses are per-host, so ARP replies will appear on random interfaces.

To test this properly, you need to either pin everything down on both sides with static ARP entries, or use a separate Ethernet bridge for each guest interface.
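
The guest half of the static-ARP option might look like the sketch below. It only prints the arp commands so they can be reviewed before being run as root. The host IP and MAC are taken from the guest ARP table earlier in this report; the interface list and the function name pin_host_arp are assumptions for illustration.

```shell
#!/bin/sh
# pin_host_arp: print one static-ARP command per guest interface.
# Usage: pin_host_arp HOST_IP HOST_MAC IFACE...
# Dry run only: nothing is changed until the printed commands are run as root.
pin_host_arp() {
    host_ip=$1
    host_mac=$2
    shift 2
    for iface in "$@"; do
        printf 'arp -i %s -s %s %s\n' "$iface" "$host_ip" "$host_mac"
    done
}

# Host address and MAC as seen from the guest in this report.
pin_host_arp 10.66.91.173 00:23:ae:a9:7c:6c eth0 eth1 eth2 eth3
```

Since the comment asks for entries on both sides, the host would need a matching static entry for each guest IP/MAC pair as well.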