Bug 1577054 - [virtio-win][netkvm] Whql job "NDISTest 6.0 - [1 Machine] - 1c_Mini6PerfSend" fails on win10-32/64 and w2k12-64/r2
Summary: [virtio-win][netkvm] Whql job "NDISTest 6.0 - [1 Machine] - 1c_Mini6PerfSend"...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Sameeh Jubran
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-11 06:19 UTC by xiagao
Modified: 2018-10-30 16:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
NO_DOCS
Clone Of:
Environment:
Last Closed: 2018-10-30 16:21:50 UTC
Target Upstream Version:


Attachments
1c_Mini6PerfSend.hlkx (2.89 MB, application/x-gzip)
2018-05-11 13:41 UTC, xiagao
Netkvm Build which might resolve the issue (10.83 MB, application/zip)
2018-05-23 07:15 UTC, Sameeh Jubran
Netkvm Build which might resolve the issue, this build reverts enabling iommu dma feature in NetKVM which is only targeted for Win10 (10.75 MB, application/zip)
2018-05-31 08:29 UTC, Sameeh Jubran


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:3413 | 0 | None | None | None | 2018-10-30 16:22:51 UTC

Description xiagao 2018-05-11 06:19:50 UTC
Description of problem:
When running the WHQL job "NDISTest 6.5 - [2 Machine] - InvalidPackets" on win10-32/64 and w2k12-64/r2, the guest hangs.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-1.el7.x86_64
kernel-3.10.0-884.el7.x86_64
seabios-bin-1.11.0-2.el7.noarch
virtio-win-prewhql-151

How reproducible:
100%

Steps to Reproduce:
1. Boot up a client guest.
/usr/libexec/qemu-kvm -name 151NICW10D32C4H -enable-kvm -m 3G -smp 4 \
  -uuid 8042da8c-a4a0-43fa-ac97-25011e3c8e23 -nodefconfig -nodefaults \
  -cpu host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
  -chardev socket,id=charmonitor,path=/tmp/151NICW10D32C4H,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=localtime,driftfix=slew -boot order=cd,menu=on \
  -device piix3-usb-uhci,id=usb \
  -drive file=151NICW10D32C4H,if=none,id=drive-ide0-0-0,format=raw,cache=none \
  -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
  -drive file=en_windows_10_business_editions_version_1803_updated_march_2018_x86_dvd_12063341.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
  -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
  -drive file=151NICW10D32C4H.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none \
  -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 \
  -device e1000,netdev=hostnet0,id=net0,mac=00:52:2e:20:24:d5 \
  -device usb-tablet,id=input0 -vnc 0.0.0.0:2 -vga std -M pc \
  -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 \
  -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:3f:19:0d:23,bus=pci.0,mq=on,vectors=10 &

2. Run the job "NDISTest 6.5 - [2 Machine] - InvalidPackets".

Actual results:
The job did not finish after more than 8 hours, and the guest hung.

Expected results:
The job passes.

Additional info:
The job passes with virtio-win-prewhql-144, so this is a regression.

Comment 3 xiagao 2018-05-11 08:51:07 UTC
(In reply to xiagao from comment #0)
> Description of problem:
> when run whql job "NDISTest 6.5 - [2 Machine] - InvalidPackets" for

Correction: the job name is "NDISTest 6.0 - [1 Machine] - 1c_Mini6PerfSend".

Comment 4 xiagao 2018-05-11 13:41:43 UTC
Created attachment 1434906 [details]
1c_Mini6PerfSend.hlkx

Comment 5 xiagao 2018-05-11 13:42:30 UTC
(In reply to xiagao from comment #4)
> Created attachment 1434906 [details]
> 1c_Mini6PerfSend.hlkx

job log.

Comment 6 Sameeh Jubran 2018-05-13 15:01:52 UTC
(In reply to xiagao from comment #0)
> Description of problem:
> when run whql job "NDISTest 6.5 - [2 Machine] - InvalidPackets" for
> win10-32/64 and w2k12-64/r2,guest will hang.
> 
> Version-Release number of selected component (if applicable):
> qemu-kvm-rhev-2.12.0-1.el7.x86_64
> kernel-3.10.0-884.el7.x86_64
> seabios-bin-1.11.0-2.el7.noarch
> virtio-win-prewhql-151
> 
> How reproducible:
> 100%
> 
> Steps to Reproduce:
> 1.boot up a client guest.
> /usr/libexec/qemu-kvm -name 151NICW10D32C4H -enable-kvm -m 3G -smp 4 -uuid
> 8042da8c-a4a0-43fa-ac97-25011e3c8e23 -nodefconfig -nodefaults -cpu
> host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff -chardev
> socket,id=charmonitor,path=/tmp/151NICW10D32C4H,server,nowait -mon
> chardev=charmonitor,id=monitor,mode=control -rtc
> base=localtime,driftfix=slew -boot order=cd,menu=on -device
> piix3-usb-uhci,id=usb -drive
> file=151NICW10D32C4H,if=none,id=drive-ide0-0-0,format=raw,cache=none -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive
> file=en_windows_10_business_editions_version_1803_updated_march_2018_x86_dvd_
> 12063341.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
> -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive
> file=151NICW10D32C4H.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none
> -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device
> e1000,netdev=hostnet0,id=net0,mac=00:52:2e:20:24:d5 -device
> usb-tablet,id=input0 -vnc 0.0.0.0:2 -vga std -M pc -netdev
> tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,
> queues=4 -device
> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:3f:19:0d:23,bus=pci.0,mq=on,
> vectors=10 &
> 
> 2.run  job "NDISTest 6.5 - [2 Machine] - InvalidPackets" 
> 
> Actual results:
> Didn't finish in more than 8 hours and guest hang there.
> 
> Expected results:
> pass
> 
> Additional info:
> pass in virtio-win-prewhql-144,so it's a regression

Hi,

Thanks for reporting the issue. Unfortunately, I am unable to reproduce it. Can you please execute "nmi" in the QEMU monitor while the client machine under test is stuck?
The NMI forces Windows to generate a dump file and save it, usually to C:\Windows\MEMORY.DMP.

Please zip the file and send it to me =)
Thanks
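
A minimal sketch of how the NMI request above might be scripted, assuming the monitor socket from the reproduction command line (path=/tmp/151NICW10D32C4H with mode=control, which makes it a QMP monitor). In QMP, the counterpart of the human-monitor "nmi" command is "inject-nmi". The helper names qmp_message, read_reply, and inject_nmi are hypothetical and chosen only for illustration.

```python
import json
import socket


def qmp_message(command):
    """Encode an argument-less QMP command as one JSON line."""
    return (json.dumps({"execute": command}) + "\r\n").encode("utf-8")


def read_reply(stream):
    """Return the next QMP message that is not an asynchronous event."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP socket closed unexpectedly")
        message = json.loads(line)
        if "event" not in message:
            return message


def inject_nmi(socket_path):
    """Negotiate QMP capabilities, then send inject-nmi to the guest."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        with sock.makefile("rwb") as stream:
            read_reply(stream)  # consume the {"QMP": ...} greeting
            for command in ("qmp_capabilities", "inject-nmi"):
                stream.write(qmp_message(command))
                stream.flush()
                reply = read_reply(stream)
                if "error" in reply:
                    raise RuntimeError(reply["error"])
```

With the guest from the description running, calling inject_nmi("/tmp/151NICW10D32C4H") on the host should deliver the NMI; the guest is then expected to bugcheck and write the dump file, provided its crash dump settings allow it.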

Comment 7 xiagao 2018-05-14 05:48:15 UTC
Hi Sameeh,

The dump file is too big to upload to Bugzilla, so please download it via NFS.

http://fileshare.englab.nay.redhat.com/pub/section2/coredump/virtio-win/1c_mini6perfsend-151.DMP.tar.gz

Comment 8 Sameeh Jubran 2018-05-21 08:04:23 UTC
(In reply to xiagao from comment #7)
> Hi sameeh,
> 
> Dump file is too big to upload to bugzilla,pls download it by nfs.
> 
> http://fileshare.englab.nay.redhat.com/pub/section2/coredump/virtio-win/
> 1c_mini6perfsend-151.DMP.tar.gz

I have taken a look at the dump file and found nothing interesting so far. I'll dig a bit more, but I want to make sure that this issue is not caused by the setup. Since the setup you are using is new, did you try the old build (144) on this setup, and did it pass?

Comment 9 xiagao 2018-05-22 09:49:27 UTC
(In reply to Sameeh Jubran from comment #8)
> since the setup you are using is new, did you try the old build (144) on this
> setup and it has passed?

Yes, it passed with the old build (144).

Comment 10 Sameeh Jubran 2018-05-23 07:15:46 UTC
Created attachment 1440449 [details]
Netkvm Build which might resolve the issue

Hi Xiaoling,

Can you please test if the attached build passes?

Comment 11 xiagao 2018-05-23 08:14:01 UTC
(In reply to Sameeh Jubran from comment #10)
> Created attachment 1440449 [details]
> Netkvm Build which might resolve the issue
> 
> Hi Xiaoling,
> 
> Can you please test if the attached build passes?

It still hangs with the attached build. :(

Comment 12 Sameeh Jubran 2018-05-23 08:16:04 UTC
Please execute nmi and attach the resulting dump :)

Comment 13 xiagao 2018-05-23 10:07:35 UTC
(In reply to Sameeh Jubran from comment #12)
> Please execute nmi and attach it :)

dump file:

http://fileshare.englab.nay.redhat.com/pub/section2/coredump/virtio-win/MEMORY-mini6perfsend2.DMP.tar.gz

Comment 14 Sameeh Jubran 2018-05-23 13:43:45 UTC
(In reply to xiagao from comment #13)
> (In reply to Sameeh Jubran from comment #12)
> > Please execute nmi and attach it :)
> 
> dump file:
> 
> http://fileshare.englab.nay.redhat.com/pub/section2/coredump/virtio-win/
> MEMORY-mini6perfsend2.DMP.tar.gz

Thanks for the fast reply. I have taken a look at the dump file and found nothing interesting. The driver doesn't seem to be stuck at all! Can you please give the test a very long time to run and, if it is still stuck, trigger the NMI again?

Comment 15 xiagao 2018-05-29 08:59:00 UTC
The job has run for almost 30 hours and is still running, so I created a crash dump via NMI.


http://fileshare.englab.nay.redhat.com/pub/section2/coredump/virtio-win/mini6perfsend-30h.DMP.tar.gz

Comment 16 Sameeh Jubran 2018-05-30 15:02:06 UTC
Hi Xiaoling,

Thanks a lot for your cooperation. Unfortunately, I can't find anything interesting in the dump; the driver seems to be functioning normally.
I know that I have asked this earlier, but just as a sanity check: are you sure that build 144 passes on this same setup?

Can you provide me with a dump of the 2012R2 machine?
Does this reproduce on 2012/2008 too?

Comment 17 xiagao 2018-05-31 05:30:09 UTC
(In reply to Sameeh Jubran from comment #16)
> Hi xiaoling,
> 
> Thanks a lot for your cooperation, unfortunately I can't find anything
> interesting in the dump. The driver seems to be functioning normally in the
> dump, 
> I know that I have asked this earlier but just a sanity check, are you sure
> that build 144 passes on this same setup?

No problem:)

I ran it again in the same guest (win10-32) with build 144; it passed and ran for about 30 minutes.

> 
> Can you provide me with 2012R2 dump with of the machine?
> Does this reproduce on 2012/2008 too?

I checked the HCK results and found that it passed on 2012-64 with build 151, but it ran for almost 30 hours.
On 2008-r2, it passed with build 151 and ran for almost 30 minutes.

I ran it on 2012-r2 again; it passed with build 151 and ran for almost 30 minutes.

This test case does not exist for 2008-32/64.

Comment 18 Sameeh Jubran 2018-05-31 08:29:47 UTC
Created attachment 1446162 [details]
Netkvm Build which might resolve the issue, this build reverts enabling iommu dma feature in NetKVM which is only targeted for Win10

Hi,

Can you please test the attached build with Win10? If the test still gets stuck, can you please send us a screenshot from within the guest itself? A test client should be running there and should show some information about the state of the test.


Thanks!

Comment 25 xiagao 2018-06-12 09:48:14 UTC
Verified this bug with the new build 154 on win10-32/64 guests.

Comment 26 Danilo Cesar Lemes de Paula 2018-08-21 14:13:14 UTC
Can we have QA, PM and Release flags for this, please?

Comment 28 errata-xmlrpc 2018-10-30 16:21:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3413

