Bug 1113910 - [virtio-win][balloon][RHEL6]Guest BSOD when running hotplug/unplug in a loop with verifier enabled
Summary: [virtio-win][balloon][RHEL6]Guest BSOD when running hotplug/unplug in a loop ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Gal Hammer
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1130853
Blocks: 1131838
Reported: 2014-06-27 07:47 UTC by Mike Cao
Modified: 2015-11-23 03:37 UTC (History)

Fixed In Version: virtio-win-prewhql-0.1-90
Doc Type: Bug Fix
Doc Text:
Cause: The guest hit a BSOD when balloon hotplug/unplug ran in a loop with Driver Verifier enabled.
Consequence: Executing the balloon hotplug/unplug sequence in a loop with verifier enabled led to a BSOD.
Fix: The driver now releases any allocated resources if it fails to enter the D0 state.
Result: The balloon hotplug/unplug sequence can now be executed in a loop with verifier enabled without a BSOD.
Clone Of:
Environment:
Last Closed: 2015-03-05 05:34:19 UTC
Target Upstream Version:


Attachments
minidump (45.76 KB, application/x-zip-compressed)
2014-06-27 08:47 UTC, Mike Cao


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:0289 normal SHIPPED_LIVE virtio-win bug fix and enhancement update 2015-03-05 10:32:54 UTC

Description Mike Cao 2014-06-27 07:47:10 UTC
Description of problem:


Version-Release number of selected component (if applicable):
virtio-win-1.7.1-1(virtio-win-prewhql-79)

How reproducible:
2/2

Steps to Reproduce:
1. Start the VM with virtio-balloon-pci and QMP
CLI:/usr/libexec/qemu-kvm -m 2G -smp 2,sockets=1,cores=2,threads=1 -name rhel6.3 -uuid 4c84db67-faf8-4498-9829-19a3d6431d9d -rtc base=localtime,driftfix=slew -drive file=086BLNBLUE64RYE,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=1,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:2a:42:10:66,bus=pci.0,addr=0x3 -usb -device usb-tablet,id=input0 -monitor stdio -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -vnc :11 -qmp tcp:0:4455,server,nowait -device virtio-balloon-pci,id=balloon1
2. In the guest: # verifier.exe /standard /driver balloon.sys
3. Reboot the guest.
4. Hotplug/unplug the balloon device in a loop:
[root@localhost ~]# cat plug.sh 
#!/bin/bash
# A simple script that hot-unplugs and hotplugs the balloon device in a loop
let i=0
# Open a read/write connection to the QMP socket on file descriptor 3
exec 3<>/dev/tcp/localhost/4455
echo -e "{ 'execute': 'qmp_capabilities' }" >&3
read response <&3
echo "$response"
while [ $i -lt 500 ]
do
    # Unplug the balloon device
    echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'balloon1' }}" >&3
    sleep 2
    read response <&3
    echo "$i: $response"
    sleep 2
    # Plug the balloon device back in
    echo -e "{'execute':'device_add','arguments':{'driver':'virtio-balloon-pci','id':'balloon1','addr':'0x9'}}" >&3
    sleep 2
    read response <&3
    echo "$i: $response"
    let i=$i+1
done
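
Note on the loop above: it waits a fixed 2 seconds between device_del and device_add, so a slow unplug can leave device_add racing the removal. A minimal sketch of one way to tighten that, assuming the QEMU build in use emits the DEVICE_DELETED QMP event (it may be absent on older builds). The helper name wait_for_event is hypothetical and not part of plug.sh; it reads lines from an already-open descriptor until the named event shows up or the timeout expires.

```bash
#!/bin/bash
# Sketch: block until a given QMP event appears on an open descriptor.
# wait_for_event is a hypothetical helper, not part of the original plug.sh.
# Usage: wait_for_event FD EVENT_NAME [TIMEOUT_SECONDS]
wait_for_event() {
    local fd=$1 event=$2 timeout=${3:-10} line
    # Match "event": "<NAME>" regardless of the spacing after the colon.
    local re="\"event\":[[:space:]]*\"${event}\""
    while IFS= read -r -t "$timeout" -u "$fd" line; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "$line"
            return 0
        fi
    done
    return 1   # timed out, or the stream closed before the event arrived
}
```

In plug.sh this could be called as `wait_for_event 3 DEVICE_DELETED 30` right after sending device_del, with device_add sent only when it returns 0.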


Actual results:
BSOD with DRIVER_VERIFIER_DETECTED_VIOLATION

Expected results:
No BSOD occurs.

Additional info:
No full memory dump was generated, only a minidump. I will upload the dump later.

Comment 2 Mike Cao 2014-06-27 08:47:17 UTC
Created attachment 912709
minidump

Comment 3 Gal Hammer 2014-06-29 11:17:27 UTC
Does the BSOD occur even without setting a balloon value?

Comment 4 Mike Cao 2014-06-29 12:37:41 UTC
(In reply to Gal Hammer from comment #3)
> The BSOD occurs even without setting a balloon's value?

Yes, without setting it.

Comment 5 Gal Hammer 2014-07-01 13:49:03 UTC
I'm unable to reproduce.

Is the balloon service running on the guest?
Does it crash on the first device removal?

Comment 6 Mike Cao 2014-07-01 13:57:40 UTC
(In reply to Gal Hammer from comment #5)
> I'm unable to reproduce.
> 
> Does the balloon service is running on the guest?
> Is it crash on first device removal?

I did not enable the balloon service. It was not the first device removal; I just left the script running. Did you try with my hotplug/unplug script?

Mike

Comment 7 Gal Hammer 2014-07-01 14:00:42 UTC
(In reply to Mike Cao from comment #6)
> (In reply to Gal Hammer from comment #5)
> > I'm unable to reproduce.
> > 
> > Does the balloon service is running on the guest?
> > Is it crash on first device removal?
> 
> I did not enable balloon service ,not the first device removal as I just
> make the scripts running  Did you try w/ my hotplug/unplug scripts ?
> 
> Mike

I'm using your plug.sh script.

    Gal.

Comment 8 Mike Cao 2014-07-01 14:34:26 UTC
(In reply to Gal Hammer from comment #7)
> (In reply to Mike Cao from comment #6)
> > (In reply to Gal Hammer from comment #5)
> > > I'm unable to reproduce.
> > > 
> > > Does the balloon service is running on the guest?
> > > Is it crash on first device removal?
> > 
> > I did not enable balloon service ,not the first device removal as I just
> > make the scripts running  Did you try w/ my hotplug/unplug scripts ?
> > 
> > Mike
> 
> I'm using your plug.sh script.
> 
>     Gal.

I hit this issue while writing this test case. I will try to reproduce it again tomorrow.

Mike

Comment 9 Mike Cao 2014-07-02 09:08:22 UTC
(In reply to Gal Hammer from comment #7)
> (In reply to Mike Cao from comment #6)
> > (In reply to Gal Hammer from comment #5)
> > > I'm unable to reproduce.
> > > 
> > > Does the balloon service is running on the guest?
> > > Is it crash on first device removal?
> > 
> > I did not enable balloon service ,not the first device removal as I just
> > make the scripts running  Did you try w/ my hotplug/unplug scripts ?
> > 
> > Mike
> 
> I'm using your plug.sh script.
> 
>     Gal.

Sorry Gal, it looks like doing some ballooning is needed to hit the issue.
I can still reproduce it, but not 100% of the time.
Sometimes Device Manager shows "This device cannot start. (Code 10)
STATUS_DEVICE_POWER_FAILURE" without a BSOD,
or it shows "This device is not working properly because Windows cannot load the driver required for this device. (Code 31)
Insufficient system resources exist to complete the API."

Sometimes it BSODs with DRIVER_VERIFIER_DETECTED_VIOLATION.

I have not found a way to reproduce it 100% of the time.

Comment 10 Mike Cao 2014-07-02 09:42:33 UTC
The following steps seem to reproduce it easily:

1. Do the steps in comment #0.
2. In the QEMU monitor, run: (qemu) balloon 2
   The device then fails with "driver cannot start" (Code 10) or "cannot load the driver required for this device" (Code 31).
3. In the guest, disable/enable the balloon driver while plug.sh keeps running.

The BSOD then occurs.
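
Note: the "balloon 2" step is typed into the human monitor, where the argument is in megabytes. If it were driven over the same QMP socket that plug.sh uses, the QMP "balloon" command expects the target in bytes. A minimal sketch of building that command; the helper name qmp_balloon_cmd is hypothetical.

```bash
#!/bin/bash
# Sketch: build the QMP "balloon" command for a target given in MiB.
# qmp_balloon_cmd is a hypothetical helper name used for illustration.
qmp_balloon_cmd() {
    local mib=$1
    # QMP takes the target in bytes; the HMP "balloon" command takes megabytes.
    printf '{"execute":"balloon","arguments":{"value":%d}}\n' "$(( mib * 1024 * 1024 ))"
}
```

With a QMP connection open on descriptor 3, `qmp_balloon_cmd 2 >&3` would be the counterpart of "(qemu) balloon 2".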

Comment 11 Gal Hammer 2014-07-02 12:47:25 UTC
Which kernel/qemu/windows version are you using?

Comment 13 Gal Hammer 2014-07-02 15:00:17 UTC
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 Beta (Santiago)

# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.427.el6.x86_64

# uname -r
2.6.32-478.el6.x86_64

Is it reproducible on a RHEL-7 host?

Comment 14 Mike Cao 2014-07-04 05:30:03 UTC
(In reply to Gal Hammer from comment #13)
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.6 Beta (Santiago)
> 
> # rpm -q qemu-kvm
> qemu-kvm-0.12.1.2-2.427.el6.x86_64
> 
> # uname -r
> 2.6.32-478.el6.x86_64
> 
> Is it reproducible on a RHEL-7 host?

Yes, I successfully reproduced it on a RHEL7 host:
# uname -r;rpm -q qemu-kvm seabios
3.10.0-121.el7.x86_64
qemu-kvm-1.5.3-62.el7.x86_64
seabios-1.7.2.2-12.el7.x86_64

Steps:
Disable/enable the driver and balloon the guest while hotplug/unplug runs in a loop.

Mike

Comment 16 Gal Hammer 2014-07-27 11:30:17 UTC
I'm still unable to reproduce (with build 87).

Can you please try to reproduce with a fresh installation of Windows?

Comment 17 Mike Cao 2014-07-28 08:32:24 UTC
(In reply to Gal Hammer from comment #16)
> I'm still unable to reproduce (with build 87).
> 
> Can you please try to reproduce with a fresh installation of Windows?

Retested on virtio-win-prewhql-87.

I cannot reproduce it with # verifier.exe /standard /driver balloon.sys
But I can reproduce it with # verifier.exe /flags 0x01FFFFFF /driver balloon.sys (all flags enabled)

Then execute the script from comment #0 and check the balloon device status in the guest Device Manager.

Then run (qemu) balloon 2

When Device Manager shows "This device is not working properly because Windows cannot load the driver required for this device. (Code 31)", disable/enable the driver.

Then the BSOD occurs.
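
Note on the 0x01FFFFFF mask: a rough sketch for inspecting which bits a verifier flags value turns on, using plain shell arithmetic. The helper names flag_is_set and count_bits are hypothetical. The meaning attached to bit 0x4 below (low resources simulation, which is not among the /standard options) is my reading of Microsoft's Driver Verifier documentation and should be treated as an assumption; if it holds, it would fit a device that fails to start under injected allocation failures.

```bash
#!/bin/bash
# Sketch: inspect a Driver Verifier flags mask with shell arithmetic.
# Helper names are hypothetical; bit meanings are NOT encoded here.

# flag_is_set MASK BIT -> exit status 0 when BIT is present in MASK
flag_is_set() {
    (( ($1 & $2) != 0 ))
}

# count_bits MASK -> print how many bits are set in MASK
count_bits() {
    local mask=$(( $1 )) n=0
    while (( mask > 0 )); do
        n=$(( n + (mask & 1) ))
        mask=$(( mask >> 1 ))
    done
    echo "$n"
}

# Assumption: 0x4 is the low resources simulation option.
if flag_is_set 0x01FFFFFF 0x4; then
    echo "0x01FFFFFF includes bit 0x4"
fi
echo "bits set in 0x01FFFFFF: $(count_bits 0x01FFFFFF)"
```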

Comment 18 Gal Hammer 2014-07-29 08:34:17 UTC
(In reply to Mike Cao from comment #17)
> (In reply to Gal Hammer from comment #16)
> > I'm still unable to reproduce (with build 87).
> > 
> > Can you please try to reproduce with a fresh installation of Windows?
> 
> Retested on virtio-win-prewhql-87
> 
> I can not reproduce it w/ # verifier.exe /standard /driver balloon.sys 
> But I can reproduce it w/ #verifier.exe flags 0x01FFFFFF /driver balloon.sys
> (open all flags)
> 
> Then execute the scripts in comment #0 and check the balloon device status
> in guest device manger .
> 
> then (qemu)balloon 2
> 
> When it shows shows "This device is not working properly because Windows
> cannot lod the driver required for this device.(Code31) in device manager ,
> disable/enable the driver 
> 
> 
> Then BSOD occurs

But did you install a fresh copy of Windows (and not from existing template)? Which Windows version?

Comment 19 Mike Cao 2014-07-29 10:08:50 UTC
(In reply to Gal Hammer from comment #18)
> (In reply to Mike Cao from comment #17)
> > (In reply to Gal Hammer from comment #16)
> > > I'm still unable to reproduce (with build 87).
> > > 
> > > Can you please try to reproduce with a fresh installation of Windows?
> > 
> > Retested on virtio-win-prewhql-87
> > 
> > I can not reproduce it w/ # verifier.exe /standard /driver balloon.sys 
> > But I can reproduce it w/ #verifier.exe flags 0x01FFFFFF /driver balloon.sys
> > (open all flags)
> > 
> > Then execute the scripts in comment #0 and check the balloon device status
> > in guest device manger .
> > 
> > then (qemu)balloon 2
> > 
> > When it shows shows "This device is not working properly because Windows
> > cannot lod the driver required for this device.(Code31) in device manager ,
> > disable/enable the driver 
> > 
> > 
> > Then BSOD occurs
> 
> But did you install a fresh copy of Windows (and not from existing
> template)? Which Windows version?

Fresh installation from ISO 
win8.1-64

Mike

Comment 24 Gal Hammer 2014-08-07 10:54:28 UTC
With the help of Mike I was able to reproduce. Thanks!

A patch was posted.

Comment 26 Mike Cao 2014-08-21 02:51:37 UTC
Retested this issue on virtio-win-prewhql-89; I can still hit it.

Steps:
1.Start VM
/usr/libexec/qemu-kvm -name 088BLNBLUE64PDO -enable-kvm -m 6G -smp 4 -uuid 5d1bcb4e-db13-4e7c-a071-ff4091c57295 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/088BLNBLUE64PDO,server,nowait -mon chardev=charmonitor,id=monitor3,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=088BLNBLUE64PDO,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_8_1_enterprise_x64_dvd_2971902.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=088BLNBLUE64PDO.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:6d:1d:de:2f,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -device virtio-balloon-pci,id=balloon1,bus=pci.0,addr=0x7 -monitor stdio -qmp tcp:0:4455,server,nowait

2. Run sh plug.sh (the script can be found in comment #0).
3. After 1 minute, run (qemu) balloon 2
4. After the QMP monitor repeatedly shows
"30: {"return": {}}
31: {"error": {"class": "DuplicateId", "desc": "Duplicate ID 'balloon1' for device", "data": {"object": "device", "id": "balloon1"}}}"
disable/enable the driver in the guest.

Actual results for step 4: the driver is still shown as not loaded properly.

5. Try to reinstall the driver from virtio-win-prewhql-89\win8\amd64

Actual results: guest BSOD with DRIVER_VERIFIER_DETECTED_VIOLATION

Based on the above, this issue has not been fixed yet.

Reassigning this issue.
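
Note on the DuplicateId reply in step 4: plug.sh prints every reply but never checks it, so a failed device_add goes unnoticed by the loop. A minimal sketch of classifying one line of QMP output so the loop could react to it; qmp_classify is a hypothetical helper, and the substring matching is deliberately crude rather than a real JSON parse.

```bash
#!/bin/bash
# Sketch: classify a single line of QMP output by its top-level key.
# qmp_classify is a hypothetical helper; it uses substring matching,
# not a JSON parser, so treat the result as a hint only.
qmp_classify() {
    local line=$1
    case $line in
        *'"event"'*)  echo event ;;    # asynchronous event (checked first,
                                       # since event data may carry an "error" field)
        *'"error"'*)  echo error ;;    # command failed, e.g. class DuplicateId
        *'"return"'*) echo ok ;;       # command succeeded
        *)            echo unknown ;;  # greeting or anything else
    esac
}
```

For example, the loop could stop or retry when `qmp_classify "$response"` prints `error` after device_add.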

Comment 27 Gal Hammer 2014-08-21 07:09:47 UTC
The patch was posted and merged. But it was not included in build 89.

Comment 29 Mike Cao 2014-08-29 12:15:59 UTC
Verified this issue on virtio-win-prewhql-90, using the same steps as comment #26.

Actual results:
No BSOD occurs.

The original issue is gone. We hit a new issue and opened https://bugzilla.redhat.com/show_bug.cgi?id=1133842 to track it.

Based on the above, moving the status to Verified.

Comment 34 errata-xmlrpc 2015-03-05 05:34:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0289.html

