Bug 1251379 - [RHEL-7.2] Package is 100% lost when ping from host to Win2012r2 guest with 64000 size
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Vlad Yasevich
QA Contact: Virtualization Bugs
Depends On:
Blocks: 1252757 1262866 1288337
Reported: 2015-08-07 03:21 EDT by Yang Meng
Modified: 2016-11-07 15:35 EST

See Also:
Fixed In Version: qemu-kvm-rhev-2.5.0-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1252757 1262866
Environment:
Last Closed: 2016-11-07 15:35:27 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Description Yang Meng 2015-08-07 03:21:15 EDT
Description of problem:
Packets are lost when pinging the guest from the host with large packets;
pinging from guest to host works fine.

Version-Release number of selected component (if applicable):

qemu: qemu-kvm-rhev-2.3.0-15.el7.x86_64
kernel: kernel-3.10.0-302.el7.x86_64



How reproducible:

I tried three times, both with the autotest script and manually; it happens every time.


Steps to Reproduce:
1) Boot the guest using the following command line:

/usr/libexec/qemu-kvm \
    -S  \
    -name 'virt-tests-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga std  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20150807-143123-2OZwwnr3,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20150807-143123-2OZwwnr3,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control  \
    -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150807-143123-2OZwwnr3,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20150807-143123-2OZwwnr3,path=/tmp/seabios-20150807-143123-2OZwwnr3,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20150807-143123-2OZwwnr3,iobase=0x402 \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/home/autotest/autotest-devel/client/tests/virt/shared/data/images/win2012-64r2-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device rtl8139,mac=9a:79:7a:7b:7c:7d,id=idCwN3Ey,netdev=idqrOgQm,bus=pci.0,addr=05  \
    -netdev tap,id=idqrOgQm \
    -m 8192  \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
    -cpu 'SandyBridge',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=native,media=cdrom,file=/home/autotest/autotest-devel/client/tests/virt/shared/data/isos/windows/winutils.iso \
    -device scsi-cd,id=cd1,drive=drive_cd1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio

2) When the guest has booted, get its IP address through VNC or SPICE, then start sending packets from the host to the guest.

3) Ping results:

[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10
PING 10.66.73.164 (10.66.73.164) 56(84) bytes of data.
64 bytes from 10.66.73.164: icmp_seq=1 ttl=128 time=0.198 ms
64 bytes from 10.66.73.164: icmp_seq=2 ttl=128 time=0.182 ms
64 bytes from 10.66.73.164: icmp_seq=3 ttl=128 time=0.251 ms
64 bytes from 10.66.73.164: icmp_seq=4 ttl=128 time=0.223 ms
64 bytes from 10.66.73.164: icmp_seq=5 ttl=128 time=0.224 ms
64 bytes from 10.66.73.164: icmp_seq=6 ttl=128 time=0.253 ms
64 bytes from 10.66.73.164: icmp_seq=7 ttl=128 time=0.221 ms
64 bytes from 10.66.73.164: icmp_seq=8 ttl=128 time=0.264 ms
64 bytes from 10.66.73.164: icmp_seq=9 ttl=128 time=0.284 ms
64 bytes from 10.66.73.164: icmp_seq=10 ttl=128 time=0.295 ms

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.182/0.239/0.295/0.037 ms
[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 63000
PING 10.66.73.164 (10.66.73.164) 63000(63028) bytes of data.
63008 bytes from 10.66.73.164: icmp_seq=1 ttl=128 time=2.82 ms
63008 bytes from 10.66.73.164: icmp_seq=2 ttl=128 time=2.91 ms
63008 bytes from 10.66.73.164: icmp_seq=3 ttl=128 time=3.35 ms
63008 bytes from 10.66.73.164: icmp_seq=4 ttl=128 time=2.73 ms
63008 bytes from 10.66.73.164: icmp_seq=5 ttl=128 time=2.90 ms
63008 bytes from 10.66.73.164: icmp_seq=6 ttl=128 time=2.92 ms
63008 bytes from 10.66.73.164: icmp_seq=7 ttl=128 time=2.89 ms
63008 bytes from 10.66.73.164: icmp_seq=8 ttl=128 time=2.94 ms
63008 bytes from 10.66.73.164: icmp_seq=9 ttl=128 time=2.84 ms
63008 bytes from 10.66.73.164: icmp_seq=10 ttl=128 time=2.83 ms

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9011ms
rtt min/avg/max/mdev = 2.735/2.917/3.353/0.167 ms
[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 64000
PING 10.66.73.164 (10.66.73.164) 64000(64028) bytes of data.

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms

[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 65000
PING 10.66.73.164 (10.66.73.164) 65000(65028) bytes of data.

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms



Actual results:

Packets are lost when pinging from host to guest with large packets.

Expected results:

Ping succeeds with no packet loss.

Additional info:

Host info:
1) CPU

processor	: 15
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping	: 4
microcode	: 0x428
cpu MHz		: 3025.445
cache size	: 20480 KB
physical id	: 0
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 15
initial apicid	: 15
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 5187.50
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual


2) Host

hp-z420-02.qe.lab.eng.nay.redhat.com

3) I first encountered this problem in the z-stream:

qemu: qemu-kvm-rhev-2.1.2-23.el7_1.7.x86_64
kernel: kernel-3.10.0-229.11.1.el7.x86_64
Comment 1 Yang Meng 2015-08-07 03:36:39 EDT
1. Guest: Win2012r2

2. Pass:
ping 10.66.73.164 -c 10
ping 10.66.73.164 -c 10 -s 63000

3. Fail, starting from -s 64000:
ping 10.66.73.164 -c 10 -s 64000

4. NIC: rtl8139
It works fine with a virtio NIC.
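
The pass/fail boundary above can be put next to the size of the IP datagram that each ping produces. The following is a rough sketch of the fragmentation arithmetic only; it assumes an MTU of 1500 bytes on the tap link and an IPv4 header without options, neither of which is stated in this report, and the helper name fragment_plan is made up for illustration. It does not by itself explain the failure.

```python
import math

ICMP_HEADER = 8    # bytes, ICMP echo header
IPV4_HEADER = 20   # bytes, assumes no IP options


def fragment_plan(icmp_payload, mtu=1500):
    """Estimate how an ICMP echo of the given payload size is fragmented.

    mtu=1500 is an assumption, not a value taken from the report.
    """
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    ip_payload = icmp_payload + ICMP_HEADER
    count = math.ceil(ip_payload / per_fragment)
    return {
        # Matches the figure ping prints in parentheses, e.g. 63000(63028).
        "datagram_bytes": ip_payload + IPV4_HEADER,
        "fragments": count,
        "last_fragment_bytes": ip_payload - (count - 1) * per_fragment,
    }


for size in (63000, 64000, 65000):
    print(size, fragment_plan(size))
```

Under these assumptions, -s 63000 yields 43 fragments, while -s 64000 and -s 65000 each yield 44.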
Comment 2 juzhang 2015-08-07 03:38:12 EDT
> 
> 3) i first met this problem in z stream
> 
> qemu: qemu-kvm-rhev-2.1.2-23.el7_1.7.x86_64
> kernel: kernel-3.10.0-229.11.1.el7.x86_64


We need to identify whether this is a regression.  Could you try qemu-kvm-rhev-2.1.2-23.el7_1.6.x86_64? It's pretty important.

Best Regards,
Junyi
Comment 4 Yang Meng 2015-08-07 04:15:31 EDT
1) I downgraded to qemu-kvm-rhev-2.3.0-13.el7.x86_64, and the problem is still present.

Results:

1. pass:


[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10
PING 10.66.73.164 (10.66.73.164) 56(84) bytes of data.
64 bytes from 10.66.73.164: icmp_seq=1 ttl=128 time=0.483 ms
64 bytes from 10.66.73.164: icmp_seq=2 ttl=128 time=0.243 ms
64 bytes from 10.66.73.164: icmp_seq=3 ttl=128 time=0.244 ms
64 bytes from 10.66.73.164: icmp_seq=4 ttl=128 time=0.215 ms
64 bytes from 10.66.73.164: icmp_seq=5 ttl=128 time=0.279 ms
64 bytes from 10.66.73.164: icmp_seq=6 ttl=128 time=0.240 ms
64 bytes from 10.66.73.164: icmp_seq=7 ttl=128 time=0.225 ms
64 bytes from 10.66.73.164: icmp_seq=8 ttl=128 time=0.272 ms
64 bytes from 10.66.73.164: icmp_seq=9 ttl=128 time=0.242 ms
64 bytes from 10.66.73.164: icmp_seq=10 ttl=128 time=0.225 ms

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9006ms
rtt min/avg/max/mdev = 0.215/0.266/0.483/0.077 ms
[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 63000
PING 10.66.73.164 (10.66.73.164) 63000(63028) bytes of data.
63008 bytes from 10.66.73.164: icmp_seq=1 ttl=128 time=2.91 ms
63008 bytes from 10.66.73.164: icmp_seq=2 ttl=128 time=2.76 ms
63008 bytes from 10.66.73.164: icmp_seq=3 ttl=128 time=2.82 ms
63008 bytes from 10.66.73.164: icmp_seq=4 ttl=128 time=2.89 ms
63008 bytes from 10.66.73.164: icmp_seq=5 ttl=128 time=2.99 ms
63008 bytes from 10.66.73.164: icmp_seq=6 ttl=128 time=2.85 ms
63008 bytes from 10.66.73.164: icmp_seq=7 ttl=128 time=3.28 ms
63008 bytes from 10.66.73.164: icmp_seq=8 ttl=128 time=2.84 ms
63008 bytes from 10.66.73.164: icmp_seq=9 ttl=128 time=2.80 ms
63008 bytes from 10.66.73.164: icmp_seq=10 ttl=128 time=2.76 ms

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9013ms
rtt min/avg/max/mdev = 2.760/2.894/3.280/0.148 ms



2. failed:
[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 64000
PING 10.66.73.164 (10.66.73.164) 64000(64028) bytes of data.

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms

[root@hp-z420-02 ~]# ping 10.66.73.164 -c 10 -s 65000
PING 10.66.73.164 (10.66.73.164) 65000(65028) bytes of data.

--- 10.66.73.164 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms
Comment 5 Vlad Yasevich 2015-08-12 14:41:28 EDT
This is an interesting case.  There are 2 issues here and both are in the qemu
emulation of the rtl8139 driver.

The first issue is that if a packet overflows the available emulated ring buffer,
the packet ends up being dropped by qemu.  This is simple enough to fix.

The other issue, which is a bit more interesting and might have security
implications, is that the emulated driver cannot correctly determine whether the
receive ring is full.  This happens only in standard operating mode, which just
happens to be what Windows uses.
In standard mode, the ring buffer simply uses 2 pointers to track the beginning
and the end of the buffer.  We write new data to the end of the buffer and read
from the beginning.  The buffer is also allowed to wrap.  The emulation assumes
that when the beginning and the end pointers are the same, the buffer is empty.
This is not always the case, as the buffer may also be completely full in this case.
It is possible to send a specially sized packet that, after fragmentation,
fills the emulated buffer completely.  Subsequent sends would then end up
overwriting the original contents of the buffer.
Thus it is possible to partially or completely overwrite packet data currently
queued in the emulated ring buffer.
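
To illustrate the ambiguity described above, here is a toy model of a ring tracked by two offsets only. It is a minimal sketch, not QEMU's rtl8139 code: the class names, the 16-byte ring size, and the byte-count workaround are all invented for the example, and the counter is just one common way to tell "full" from "empty", not necessarily what the actual fix does.

```python
class TwoPointerRing:
    """Toy byte ring tracked only by a read offset and a write offset."""

    def __init__(self, size):
        self.size = size
        self.read = 0    # where the consumer (guest driver) reads next
        self.write = 0   # where the producer (emulated NIC) writes next

    def looks_empty(self):
        # The ambiguous test: equal offsets are taken to mean "empty".
        return self.read == self.write

    def produce(self, nbytes):
        # No fullness check is possible from the two offsets alone.
        self.write = (self.write + nbytes) % self.size

    def consume(self, nbytes):
        self.read = (self.read + nbytes) % self.size


class CountedRing(TwoPointerRing):
    """Same ring plus a byte counter, which removes the ambiguity."""

    def __init__(self, size):
        super().__init__(size)
        self.used = 0    # bytes currently queued

    def produce(self, nbytes):
        if nbytes > self.size - self.used:
            raise BufferError("ring full")
        super().produce(nbytes)
        self.used += nbytes

    def consume(self, nbytes):
        nbytes = min(nbytes, self.used)
        super().consume(nbytes)
        self.used -= nbytes

    def is_empty(self):
        return self.used == 0

    def is_full(self):
        return self.used == self.size


ring = TwoPointerRing(16)
ring.produce(16)            # fill the ring exactly
print(ring.looks_empty())   # True, although 16 bytes are queued
```

After filling the 16-byte ring exactly, the write offset wraps back onto the read offset, so the two-pointer test reports an empty ring and a later write would overwrite unread data. The counted variant reports the same state as full and refuses the write.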
Comment 9 Vlad Yasevich 2015-09-14 09:37:29 EDT
Moving to 7.3.  Not a security issue, not a regression, not a blocker since this
bug has existed for a very long time.
Comment 12 weliao 2016-05-17 03:15:01 EDT
Reproduced on the following versions:
3.10.0-378.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64

Test steps:
1. Launch a Win2012r2 guest with an rtl8139 NIC.
/usr/libexec/qemu-kvm     -S      -name 'virt-tests-vm1'      -sandbox off      -machine pc      -nodefaults      -vga std      -chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20150807-143123-2OZwwnr3,server,nowait     -mon chardev=qmp_id_qmpmonitor1,mode=control      -chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20150807-143123-2OZwwnr3,server,nowait     -mon chardev=qmp_id_catch_monitor,mode=control      -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150807-143123-2OZwwnr3,server,nowait     -device isa-serial,chardev=serial_id_serial0      -chardev socket,id=seabioslog_id_20150807-143123-2OZwwnr3,path=/tmp/seabios-20150807-143123-2OZwwnr3,server,nowait     -device isa-debugcon,chardev=seabioslog_id_20150807-143123-2OZwwnr3,iobase=0x402     -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03     -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=04     -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/home/win2012-64r2-virtio.qcow2    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=13 -device rtl8139,netdev=macvtap0,mac=32:d7:63:b5:a9:54,id=net1 -netdev tap,id=macvtap0,vhost=on     -m 8192      -smp 8,maxcpus=8,cores=4,threads=1,sockets=2      -cpu 'SandyBridge',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time       -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1      -vnc :0      -rtc base=localtime,clock=host,driftfix=slew      -boot order=cdn,once=c,menu=off,strict=off     -enable-kvm     -monitor stdio
2. Ping the guest from the host with sizes 63000 and 64000.

[root@dhcp-8-118 qemu230]#  ping 10.66.10.224 -s 63000
PING 10.66.10.224 (10.66.10.224) 63000(63028) bytes of data.
63008 bytes from 10.66.10.224: icmp_seq=40 ttl=128 time=918 ms
63008 bytes from 10.66.10.224: icmp_seq=41 ttl=128 time=2.50 ms

[root@dhcp-8-118 qemu230]#  ping 10.66.10.224 -s 64000
PING 10.66.10.224 (10.66.10.224) 64000(64028) bytes of data.
From 10.66.10.224 icmp_seq=1 Frag reassembly time exceeded
From 10.66.10.224 icmp_seq=2 Frag reassembly time exceeded

Ping with a 64000-byte size failed, so the bug is reproduced.

--------------------------------------------------------------
Verified this bug with the following versions:
3.10.0-378.el7.x86_64
qemu-kvm-rhev-2.6.0-1.el7.x86_64

The same test steps:
[root@dhcp-8-118 qemu260-1]#  ping 10.66.10.224 -s 63000
PING 10.66.10.224 (10.66.10.224) 63000(63028) bytes of data.
63008 bytes from 10.66.10.224: icmp_seq=45 ttl=128 time=2.74 ms

[root@dhcp-8-118 qemu260-1]#  ping 10.66.10.224 -s 64000
PING 10.66.10.224 (10.66.10.224) 64000(64028) bytes of data.
64008 bytes from 10.66.10.224: icmp_seq=1 ttl=128 time=854 ms
64008 bytes from 10.66.10.224: icmp_seq=2 ttl=128 time=2.39 ms

[root@dhcp-8-118 qemu260-1]#  ping 10.66.10.224 -s 65000
PING 10.66.10.224 (10.66.10.224) 65000(65028) bytes of data.
65008 bytes from 10.66.10.224: icmp_seq=1 ttl=128 time=21.3 ms

Sizes 63000, 64000, and 65000 all work, so the bug is fixed.
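
The manual size sweep used in these test steps could be scripted along the following lines. This is only a sketch: the function names are hypothetical, it assumes the Linux iputils ping used in the transcripts (with its -c and -s options and its summary line format), and it is not the autotest script mentioned in the description.

```python
import re
import subprocess

# Matches the summary line printed by ping, for example:
# "10 packets transmitted, 0 received, 100% packet loss, time 8999ms"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received.*?(\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(output):
    """Return (transmitted, received, loss_percent), or None if not found."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def ping_once(host, size, count=10):
    """Run ping with the given payload size and parse its summary line."""
    proc = subprocess.run(
        ["ping", "-c", str(count), "-s", str(size), host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero on packet loss; that is expected
    )
    return parse_ping_summary(proc.stdout)


def sweep(host, sizes=(56, 63000, 64000, 65000)):
    """Ping the host once per payload size and collect the summaries."""
    return {size: ping_once(host, size) for size in sizes}
```

Calling sweep("10.66.10.224") from the host would return one summary tuple per size; a loss of 100.0 at 64000 and above corresponds to the failing case reported here.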
Comment 15 errata-xmlrpc 2016-11-07 15:35:27 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html
