Bug 1080494

Summary: physical host shows different packet counts than VM running on it
Product: Red Hat Enterprise Linux 6
Reporter: Martin Pavlik <mpavlik>
Component: kernel
Assignee: Jiri Benc <jbenc>
Status: CLOSED WORKSFORME
QA Contact: Network QE <network-qe>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.6
CC: ccui, cpelland, danken, gklein, jbenc, lxin, mpavlik, mst, rkhan
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-29 07:09:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 871829, 1062166, 1066570
Attachments:
Description		Flags
/var/log from host	none
/var/log from VM	none

Description Martin Pavlik 2014-03-25 14:34:03 UTC
Created attachment 878488
/var/log from host

Description of problem:

A VM is running on a physical host. The VM is connected to the host via vnet0 on the host:

[root@dell-r210ii-05 ~]# brctl show
bridge name	bridge id		STP enabled	interfaces
;vdsmdummy;		8000.000000000000	no		
rhevm		8000.d067e5f07f02	no		em1
							vnet0

The byte counters under /sys/class/net/<iface>/statistics/ show far more traffic sent by the guest VM (eth0 tx_bytes) than received from it on the physical host (vnet0 rx_bytes).

guest virtual machine:
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

[root@localhost ~]# more /sys/class/net/eth0/statistics/*x_bytes
::::::::::::::
/sys/class/net/eth0/statistics/rx_bytes
::::::::::::::
87784288
::::::::::::::
/sys/class/net/eth0/statistics/tx_bytes
::::::::::::::
13854324130


physical host:

[root@dell-r210ii-05 ~]# uname -a
Linux dell-r210ii-05.rhev.lab.eng.brq.redhat.com 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

[root@dell-r210ii-05 ~]# more /sys/class/net/vnet0/statistics/*x_bytes
::::::::::::::
/sys/class/net/vnet0/statistics/rx_bytes
::::::::::::::
7328381579
::::::::::::::
/sys/class/net/vnet0/statistics/tx_bytes
::::::::::::::
65270403
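
For reference, the gap between the two sides can be worked out directly from the numbers above. This is only a sketch: it assumes both sets of counters were sampled at roughly the same time (the report does not say), and it relies on the tap device mirroring the guest NIC, so guest tx_bytes is compared with host rx_bytes and vice versa.

```python
# Byte counters copied from the outputs above.
guest = {"rx_bytes": 87784288, "tx_bytes": 13854324130}  # eth0 inside the VM
host = {"rx_bytes": 7328381579, "tx_bytes": 65270403}    # vnet0 on the host

# Traffic sent by the guest should show up as received on the host's tap
# device, and traffic sent by the host into the tap as received in the guest.
guest_to_host_gap = guest["tx_bytes"] - host["rx_bytes"]
host_to_guest_gap = host["tx_bytes"] - guest["rx_bytes"]

print("guest tx - host rx:", guest_to_host_gap)
print("host tx - guest rx:", host_to_guest_gap)
```

In the guest-to-host direction the guest reports 6525942551 more bytes than the host; in the other direction the guest reports 22513885 more bytes than the host sent.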



Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Launch a VM in RHEV-M (I used Red Hat Enterprise Virtualization Manager version 3.4.0-0.10.beta2.el6ev; pure qemu might work as well).
2. Create some traffic from the VM to the outside world (I used iperf).
3. In the VM, run: more /sys/class/net/eth0/statistics/*x_bytes
4. On the host, run: more /sys/class/net/vnet0/statistics/*x_bytes
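
Steps 3 and 4 can also be scripted so that both sides are read the same way. The following is a minimal sketch, assuming Python 3 is available on the machine; the function name read_byte_counters and its sysfs_root parameter are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def read_byte_counters(iface, sysfs_root="/sys/class/net"):
    """Return the rx_bytes and tx_bytes counters of one interface as ints.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a copy of the sysfs tree; on a live system the default is used.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    return {
        name: int((stats_dir / name).read_text().strip())
        for name in ("rx_bytes", "tx_bytes")
    }
```

Calling read_byte_counters("eth0") in the VM and read_byte_counters("vnet0") on the host returns the same values that the more commands print.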

Actual results:
The guest VM's eth0 tx_bytes counter is far higher than the vnet0 rx_bytes counter on the physical host.

Expected results:
The physical host displays byte counters consistent with the guest's.

Additional info:

The problem appears with the virtio driver as well as with e1000.

qemu command line:

2014-03-25 11:54:15.824+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name vm2 -S -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 1024 -realtime mlock=off -smp 1,maxcpus=160,sockets=160,cores=1,threads=1 -uuid 92a98893-983f-42f5-a455-2219b92af114 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=6Server-6.5.0.1.el6,serial=4C4C4544-0037-5410-8033-C4C04F39354A,uuid=92a98893-983f-42f5-a455-2219b92af114 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2014-03-25T11:54:15,driftfix=slew -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/00000002-0002-0002-0002-000000000350/4e9d8312-d7bd-4cd8-b925-879c20406266/images/5a3a0fd2-018d-4026-b456-2346c4e84865/cf2c9d8f-99d0-4500-9325-d4bfb38b9579,if=none,id=drive-virtio-disk0,format=raw,serial=5a3a0fd2-018d-4026-b456-2346c4e84865,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=31,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:1a:4a:c0:3f:21,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/92a98893-983f-42f5-a455-2219b92af114.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/92a98893-983f-42f5-a455-2219b92af114.org.qemu.guest_agent.0,server,nowait -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5900,tls-port=5901,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432 -incoming tcp:[::]:49153 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7

Comment 1 Martin Pavlik 2014-03-25 14:34:59 UTC
Created attachment 878489
/var/log from VM

Comment 2 Martin Pavlik 2014-03-25 14:37:50 UTC
[root@dell-r210ii-05 ~]# rpm -qa | grep qemu
qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.6.x86_64
gpxe-roms-qemu-0.9.7-6.10.el6.noarch
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.6.x86_64
qemu-img-rhev-0.12.1.2-2.415.el6_5.6.x86_64

Comment 4 Jiri Benc 2014-04-09 12:19:26 UTC
What's vnet0? The command line suggests qemu is using tap; how is tap0 connected to vnet0?

Comment 5 Rashid Khan 2014-04-09 12:49:30 UTC
Hi JBenc, 
can you please have a look at it when you have a chance. 
We can discuss relative priorities offline. 

Thanks
Rashid

Comment 6 Dan Kenigsberg 2014-04-09 15:38:44 UTC
Yes, vnet0 is a tap device that libvirt creates, adds to a bridge, and then passes to qemu.

Comment 7 Jiri Benc 2014-04-09 16:42:55 UTC
For the record, I have not reproduced this on my machine, but I don't have RHEL as the host.

Comment 8 RHEL Program Management 2014-04-09 21:22:44 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 10 Jiri Benc 2014-04-22 15:30:04 UTC
Reproduced on the provided machine, but only for a short while, when transferring data (using ssh) from the VM to an outside machine. I observed the difference in the direction opposite to the transfer, i.e. rx_bytes on the VM side vs. tx_bytes on the host side (the latter is larger).

When I tried to transfer data from outside to the VM, the counters started to behave correctly. They still behave correctly even after switching the direction again.

Comment 11 Jiri Benc 2014-04-23 12:49:56 UTC
Tried with a different machine with RHEL 6.5 (both guest and host). I cannot reproduce the issue. The tx bytes on the host side and the rx bytes on the guest side differ exactly by the number of bytes that were filtered by the (emulated) hardware because they were not destined to the guest's MAC address (e.g. they're multicast and there's no multicast subscriber).

Looking back at the things I did with the provided machine, this was exactly the same, except it seems the number of foreign packets was much higher (which confused me yesterday).
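
To illustrate the filtering described above, here is a simplified sketch of how the expected host tx vs. guest rx difference could be estimated from a list of frames. It is a model under stated assumptions, not the e1000 or virtio implementation: it assumes the emulated NIC accepts only frames addressed to the guest's own MAC, broadcast frames, and multicast groups the guest has subscribed to, and it ignores promiscuous mode. The function names and the (destination MAC, length) frame format are hypothetical; the guest MAC in the usage note is the one from the qemu command line in the description.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def is_multicast(mac):
    """True for group addresses: the lowest bit of the first octet is set."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def guest_accepts(dst_mac, guest_mac, subscribed_groups=frozenset()):
    """Simplified receive filter; subscribed_groups holds lowercase MACs."""
    dst = dst_mac.lower()
    if dst == guest_mac.lower() or dst == BROADCAST:
        return True
    if is_multicast(dst):
        return dst in subscribed_groups
    return False  # unicast frame for some other station


def filtered_bytes(frames, guest_mac, subscribed_groups=frozenset()):
    """Sum the lengths of frames the guest would drop.

    frames is an iterable of (destination MAC, length in bytes) pairs.
    """
    return sum(
        length
        for dst, length in frames
        if not guest_accepts(dst, guest_mac, subscribed_groups)
    )
```

For example, with guest MAC 00:1a:4a:c0:3f:21 and no multicast subscriptions, a 200-byte frame sent to 01:00:5e:00:00:fb counts toward the host's tx_bytes on vnet0 but not toward the guest's rx_bytes, so filtered_bytes would report 200 for that frame alone.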

I'm sorry, I cannot reproduce the issue, even with the provided machine. I'll need more information on what you've done, what kind of traffic was used, etc. A machine in a state with the bug reproduced would help, too, if possible.

Comment 12 Rashid Khan 2014-05-21 20:00:21 UTC
Hi Martin Pavlik
Any updates?
The 6.6 window is closing fast. If you need us to fix this, please provide a reliable way to reproduce it within the next week or so.

Please see comment 11 above. 


Thanks
Rashid

Comment 13 Jiri Benc 2014-05-27 19:07:55 UTC
Martin, were you able to reproduce the problem?

Comment 14 Martin Pavlik 2014-05-29 07:09:27 UTC
It seems that I cannot reproduce this bug anymore, so closing as WORKSFORME.