Bug 1027571

Summary: [virtio-win]win8.1 guest network can not resume automatically after do "set_link tap1 on"
Product: Red Hat Enterprise Linux 7
Reporter: Jun Li <juli>
Component: qemu-kvm
Assignee: Vlad Yasevich <vyasevic>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: acathrow, areis, bcao, dfleytma, ghammer, hhuang, huding, jasowang, juli, juzhang, knoel, michen, mst, qiguo, rhod, sluo, virt-maint, vyasevic, xfu, yvugenfi
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-1.5.3-22.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 12:54:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Jun Li 2013-11-07 06:35:28 UTC
Description of problem:
The win8.1 guest network does not resume automatically after running "set_link tap1 on" in the qemu-kvm monitor. The network works again after disabling and re-enabling the network card inside the guest.

Version-Release number of selected component (if applicable):
# rpm -qa|grep virtio && rpm -qa|grep qemu-kvm
virtio-win-1.6.7-2.el7.noarch
qemu-kvm-rhev-1.5.3-13.el7.x86_64
3.10.0-42.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with virtio-net-pci.
# /usr/libexec/qemu-kvm -S -M pc-i440fx-rhel7.0.0 -cpu SandyBridge -enable-kvm -m 4G -smp 4,sockets=2,cores=2,threads=1 -name juli -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa61 -rtc base=localtime,clock=host,driftfix=slew \
-drive file=iscsi://10.66.90.100:3260/iqn.2001-05.com.equallogic:0-8a0906-6f81f7d03-cdbf49b41f6525ca-s2-sluo-259030-1/0,if=none,id=drive-system-disk,cache=writeback -iscsi id=iqn,initiator-name=iqn.1994-05.com.redhat:sluo, \
-device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0,ioeventfd=off \
-device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=disk,bootindex=1,physical_block_size=4096,logical_block_size=512  \
-drive file=/home/ISO/en_windows_8.1_preview_x86_dvd_2358833.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-0 -device ide-drive,drive=drive-ide1-0-0,id=ide1-0-0,bus=ide.0,unit=0,bootindex=4 \
-device virtio-balloon-pci,id=ballooning \
-global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
-netdev tap,id=tap1,vhost=on,queues=4,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown \
-device virtio-net-pci,netdev=tap1,id=nic1,mq=on,vectors=17,mac=1a:59:0a:4b:5a:94 \
-k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4445,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :3 -spice port=5932,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7445,server,nowait -monitor unix:/tmp/monitor1,server,nowait -drive file=/usr/share/virtio-win/virtio-win-1.6.7.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-2 -device ide-drive,drive=drive-ide1-0-2,id=ide1-0-2,bus=ide.0,unit=1
2. Run "set_link tap1 off" via HMP.
(qemu) set_link tap1 off
3. Reboot the guest.
4. Run "set_link tap1 on" via HMP.
(qemu) set_link tap1 on
5. Check the network inside the guest.
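
For repeated runs, steps 2 and 4 could also be driven over the QMP socket that the command line above exposes (-qmp tcp:0:4445,server,nowait). The following is a minimal sketch, not part of the original report: QMP's set_link command takes "name" and "up" arguments, while the helper names qmp_set_link and send_qmp, the 127.0.0.1 host, and the timeout are assumptions made for illustration.

```python
import json
import socket


def qmp_set_link(name, up):
    """Build the QMP message equivalent to HMP 'set_link <name> on|off'."""
    return {"execute": "set_link", "arguments": {"name": name, "up": bool(up)}}


def send_qmp(commands, host="127.0.0.1", port=4445, timeout=5.0):
    """Send QMP commands after the capabilities handshake; return the replies.

    host/port are assumptions matching '-qmp tcp:0:4445' when run on the host.
    """
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # greeting banner sent by QEMU
        for cmd in [{"execute": "qmp_capabilities"}] + list(commands):
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            # Skip asynchronous events; keep only the return/error reply.
            while True:
                msg = json.loads(stream.readline())
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies
```

With the guest running, send_qmp([qmp_set_link("tap1", False)]) would correspond to step 2 and send_qmp([qmp_set_link("tap1", True)]) to step 4.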

Actual results:
After step 5, the guest network fails.
C:\Users\andrew>ping 10.66.106.4
Pinging 10.66.106.4 with 32 bytes of data:
PING: transmit failed. General failure.
PING: transmit failed. General failure.
PING: transmit failed. General failure.
PING: transmit failed. General failure.
Ping statistics for 10.66.106.4:
    Packets: Sent = 4, Received = 0, Lost = 4 <100% loss>

Expected results:
After step 5, the guest network resumes automatically, with no need to disable and re-enable the network card manually.

Additional info:
Also tested with an e1000 card: the issue does not occur. After "set_link tap1 on", the guest network works well.

Comment 2 Jun Li 2013-11-07 10:00:29 UTC
Retested this issue with the same steps as in comment 0.
After step 4, the guest network resumes automatically after waiting about 2 minutes.

Comment 3 Yvugenfi@redhat.com 2013-11-10 08:11:27 UTC
Trying to understand comment 2.

Can you open windows network connection dialog or in separate window run ipconfig?

It looks like it takes some time for Windows to acquire an IP address from the DHCP server. In this case it is not a bug.

Comment 4 Mike Cao 2013-11-10 09:07:51 UTC
(In reply to Jun Li from comment #2)
> Retest this issue, steps as comments.
> After step 4, wait for 2 minutes, guest network can resume automatically.

lijun

Please try to reproduce this issue on a private subnet to see whether it is a DHCP issue or a virtio-win bug.

Mike

Comment 5 Jun Li 2013-11-11 02:57:30 UTC
(In reply to Yan Vugenfirer from comment #3)
> Trying to understand comment 2.
> 
> Can you open windows network connection dialog or in separate window run
> ipconfig?
> 
When the guest network has not resumed, the guest cannot get an IP address from the DHCP server. The guest IP is 169.254.0.0. After the network resumes, the guest gets a correct IP address from the DHCP server.

> It looks like it takes some time for Windows to acquire an IP address from
> the DHCP server. In this case it is not a bug.

Hi Yan and Mike,
    Retested this bug using the same steps as comment 0. Out of 10 runs, only three needed about 1 minute; the others resumed at once.

I also did another test:
With a static IP address inside the guest, the guest network resumes at once after step 4. So it's due to the DHCP server. Thank you.


Best Regards,
Jun Li
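
A note on the 169.254.0.0 address reported above: addresses in 169.254.0.0/16 are IPv4 link-local addresses, which Windows assigns to itself (APIPA) when it has not obtained a DHCP lease. A small sketch for telling the two states apart from collected ipconfig output, using Python's standard ipaddress module; the function name has_dhcp_failed_address is hypothetical.

```python
import ipaddress


def has_dhcp_failed_address(addr):
    """Return True if addr is an IPv4 link-local (169.254.0.0/16) address,
    i.e. the kind a guest falls back to when no DHCP lease was obtained."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip.is_link_local
```

For the addresses seen in this report, has_dhcp_failed_address("169.254.0.0") is True, while a leased address such as 192.168.1.2 gives False.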

Comment 6 Mike Cao 2013-11-11 03:34:13 UTC
(In reply to Jun Li from comment #5)
> Also do another test:
> Inside guest, using the static IP addr. After step 4, the guest network will
> resume at once. So it's due to the DHCP server. Thank you.

I do not think this proves whether it is a DHCP server issue or a client issue. Please do the test from comment #4 to rule out network congestion.

Comment 8 Jun Li 2013-11-12 05:02:33 UTC
(In reply to Mike Cao from comment #6)

Retested this issue using dnsmasq to provide DHCP, with the same steps as comment 0.
# dnsmasq --strict-order --bind-interfaces --listen-address 192.168.1.1 --dhcp-range 192.168.1.2,192.168.1.254

After step 4, the guest network resumes within 10s.

Best Regards,

Jun Li
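
The recovery times quoted in this thread (2 minutes, 1 minute, 10s) were observed by hand. A sketch of how such a delay could be measured by polling is given below; it is not something used in the original tests. The names seconds_until and host_can_ping are hypothetical, and the ping flags (-c 1 -W 1) assume a Linux iputils ping run from the host side rather than inside the Windows guest.

```python
import subprocess
import time


def seconds_until(check, timeout=180.0, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True; return the elapsed seconds,
    or None if timeout seconds pass without success."""
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def host_can_ping(address):
    """Assumed Linux 'ping': one probe, one-second wait for the reply."""
    result = subprocess.run(["ping", "-c", "1", "-W", "1", address],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

Calling seconds_until(lambda: host_can_ping(guest_ip)) right after "set_link tap1 on" would give a rough figure for how long the guest takes to come back, where guest_ip is whatever address the guest is expected to obtain.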

Comment 9 Vlad Yasevich 2013-11-12 15:41:23 UTC
This is similar to Bug 965396.  The issue is that when performing
set_link tap0 [on|off], you are changing the state of the link on the tap
device, but qemu does NOT change the link state on the guest device.

Rebooting a guest does not change this and the guest ends up booting with
link in a very strange state.  The guest thinks that the link is up, but
there can be no traffic outside of qemu.

The correct solution is to control the guest link state as well.  This way
an OS can correctly detect the link state change and configure the interface.

-vlad
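
The difference between the two behaviors described above can be illustrated with a toy model. This is only a sketch and not QEMU code: the Endpoint class and both functions are hypothetical names, and the model reduces each side of the tap/NIC pair to a single link_down flag.

```python
class Endpoint:
    """One end of a backend/guest-NIC pair with its own link state."""

    def __init__(self, name):
        self.name = name
        self.link_down = False
        self.peer = None


def connect(backend, nic):
    """Pair a backend (e.g. tap) with the guest-visible NIC."""
    backend.peer, nic.peer = nic, backend


def set_link_backend_only(endpoint, up):
    """Problem case: only the named endpoint changes state."""
    endpoint.link_down = not up


def set_link_propagating(endpoint, up):
    """Desired case: the peer's link state follows the change."""
    endpoint.link_down = not up
    if endpoint.peer is not None:
        endpoint.peer.link_down = not up
```

In the first variant, turning the tap endpoint off leaves the NIC's link_down flag False, so a guest that reboots still believes the link is up. In the second, the NIC sees the change and the guest OS can react to it.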

Comment 11 Vlad Yasevich 2013-12-02 15:21:08 UTC
It's already fixed in Stefan's net tree.  The fix is simple enough.

https://github.com/stefanha/qemu/commit/32511186853cc6844e3a23dd6aa749a41cb2c169

Comment 12 Yvugenfi@redhat.com 2013-12-02 15:28:36 UTC
(In reply to Vlad Yasevich from comment #11)
> It's already fixed in Stefans net tree.  The fix is simple enough.
> 
> https://github.com/stefanha/qemu/commit/32511186853cc6844e3a23dd6aa749a41cb2c169

Great!

Comment 13 Mike Cao 2013-12-03 02:51:12 UTC
(In reply to Yan Vugenfirer from comment #12)
> (In reply to Vlad Yasevich from comment #11)
> > It's already fixed in Stefans net tree.  The fix is simple enough.
> > 
> > https://github.com/stefanha/qemu/commit/32511186853cc6844e3a23dd6aa749a41cb2c169
> 
> Great!

Reopening, as there is a solution for it.
Vlad, do we need to move this bug to the qemu-kvm component?

Thanks
Mike

Comment 14 Vlad Yasevich 2013-12-06 14:17:49 UTC
yes.  will do.

thanks
-vlad

Comment 19 huiqingding 2013-12-23 02:51:45 UTC
Reproduced this bug using the following versions:
# rpm -qa|grep virtio && rpm -qa|grep qemu-kvm
virtio-win-1.6.7-2.el7.noarch
qemu-kvm-1.5.3-13.el7.x86_64
host kernel: kernel-3.10.0-63.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with virtio-net-pci.
# /usr/libexec/qemu-kvm -S -M pc-i440fx-rhel7.0.0 -cpu SandyBridge -enable-kvm -m 4G -smp 4,sockets=2,cores=2,threads=1 -name win8_1-32 -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa61 -e=localtime,clock=host,driftfix=slew -drive file=iscsi://10.66.6.82:3260/iqn.2013-11.com.example:storage.disk1.juli.xyz/4,if=none,id=drive-system-disk,cache=writeback -iscsi id=iqn,initiator-name=iqn.1994-05.com.redhat:sluo, -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0,ioeventfd=off -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=disk,bootindex=1,physical_block_size=4096,logical_block_size=512 -drive file=/home/en_windows_8_1_enterprise_x86_dvd_2972289.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-0 -device ide-drive,drive=drive-ide1-0-0,id=ide1-0-0,bus=ide.0,unit=0,bootindex=4 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -netdev tap,id=tap1,vhost=on,queues=4,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown -device virtio-net-pci,netdev=tap1,id=nic1,mq=on,vectors=17,mac=1a:59:0a:4b:5a:94 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4445,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :3 -spice port=5932,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7445,server,nowait -monitor unix:/tmp/monitor1,server,nowait -drive file=/usr/share/virtio-win/virtio-win-1.6.7.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-2 -device ide-drive,drive=drive-ide1-0-2,id=ide1-0-2,bus=ide.0,unit=1
2. Run "set_link tap1 off" via HMP.
(qemu) set_link tap1 off
3. Reboot the guest.
4. Run "set_link tap1 on" via HMP.
(qemu) set_link tap1 on
5. Check the network inside the guest.

Actual results:
After step 5, the guest network fails. After about 13 seconds, the guest network works and can ping the host.

Verified this bug using the following versions:
# rpm -qa|grep virtio && rpm -qa|grep qemu-kvm
virtio-win-1.6.7-2.el7.noarch
qemu-kvm-1.5.3-24.el7.x86_64
host kernel: kernel-3.10.0-63.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with virtio-net-pci.
# /usr/libexec/qemu-kvm -S -M pc-i440fx-rhel7.0.0 -cpu SandyBridge -enable-kvm -m 4G -smp 4,sockets=2,cores=2,threads=1 -name win8_1-32 -uuid 355a2475-4e03-4cdd-bf7b-5d6a59edaa61 -e=localtime,clock=host,driftfix=slew -drive file=iscsi://10.66.6.82:3260/iqn.2013-11.com.example:storage.disk1.juli.xyz/4,if=none,id=drive-system-disk,cache=writeback -iscsi id=iqn,initiator-name=iqn.1994-05.com.redhat:sluo, -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0,ioeventfd=off -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=disk,bootindex=1,physical_block_size=4096,logical_block_size=512 -drive file=/home/en_windows_8_1_enterprise_x86_dvd_2972289.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-0 -device ide-drive,drive=drive-ide1-0-0,id=ide1-0-0,bus=ide.0,unit=0,bootindex=4 -device virtio-balloon-pci,id=ballooning -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -netdev tap,id=tap1,vhost=on,queues=4,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown -device virtio-net-pci,netdev=tap1,id=nic1,mq=on,vectors=17,mac=1a:59:0a:4b:5a:94 -k en-us -boot menu=on,reboot-timeout=-1,strict=on -qmp tcp:0:4445,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :3 -spice port=5932,disable-ticketing -vga qxl -monitor stdio -monitor tcp:0:7445,server,nowait -monitor unix:/tmp/monitor1,server,nowait -drive file=/usr/share/virtio-win/virtio-win-1.6.7.iso,if=none,media=cdrom,format=raw,aio=native,id=drive-ide1-0-2 -device ide-drive,drive=drive-ide1-0-2,id=ide1-0-2,bus=ide.0,unit=1
2. Run "set_link tap1 off" via HMP.
(qemu) set_link tap1 off
3. Reboot the guest.
4. Run "set_link tap1 on" via HMP.
(qemu) set_link tap1 on
5. Check the network inside the guest.

Actual results:
After step 5, the guest network works and can ping the host.

Additional info:
I also tested a RHEL7 guest; the guest network works after "set_link tap1 on".

Based on the above results, I think this bug is fixed.

Comment 20 huiqingding 2013-12-23 03:26:19 UTC
I also verified this bug using the steps of comment #0 with the following combinations of guest and NIC:
RHEL7 guest + e1000 nic card
RHEL7 guest + rtl8139 nic card
win8.1 guest + e1000 nic card
win8.1 guest + rtl8139 nic card

The qemu and kernel versions are as follows:
# rpm -qa|grep virtio && rpm -qa|grep qemu-kvm
virtio-win-1.6.7-2.el7.noarch
qemu-kvm-1.5.3-24.el7.x86_64
host kernel: kernel-3.10.0-63.el7.x86_64

In all cases, after step 5 the guest network works and can ping the host.

Comment 22 Ludek Smid 2014-06-13 12:54:38 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.