Bug 786299

Summary: Windows 2003 SP1 32bit randomly reboots on a Redhat 6.1 KVM
Product: Red Hat Enterprise Linux 6 Reporter: James Shirley <james.shirley>
Component: virtio-win Assignee: Yvugenfi <yvugenfi>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.1 CC: acathrow, andrew.bellussi, areis, bcao, bsarathy, dallan, hein.soe, michen, rhod, tburke
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-04-11 12:25:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
libvirt log for the guest
none
memory.dmp
none
memory.dmp1
none
memory.dmp2
none
memory.dmp3
none
memory.dmp4
none
Memory Dump frap01.dms Part1 0f 4
none
Memory Dump frap01.dms part 2 of 4
none
Mem Dump frap01.dms Part 3 0f 4
none
Mem Dump frap01.dms Part 4 of 4
none
Mem Dump Frap02.dms part1 of 4
none
Mem Dump frap02.dms Part 2 of 4
none
Mem Dump frap02.dms part 3 of 4
none
Mem Dump frap02.dms part 4 0f 4
none
Mem Dump frap03.dms 1 of 4
none
Mem Dump frap03.dms 2 of 4
none
Mem Dump frap03.dms 3 of 4
none
Mem Dump frap03.dms 4 of 4
none

Description James Shirley 2012-02-01 01:00:38 UTC
Description of problem:

A Windows 2003 SP1 32-bit guest with Red Hat virtio I/O and network drivers randomly reboots.

The libvirt qemu log for the VM reports: 

virtio_ioport_write: unexpected address 0x13 value 0x1

on each occasion it reboots

There are no relevant system event log entries, only one recorded after the reboot saying that an unexpected shutdown occurred.


Version-Release number of selected component (if applicable):

Redhat 6.1

[root@kvm04 libvirt]# uname -a
Linux kvm04.dms 2.6.32-131.17.1.el6.x86_64 #1 SMP Thu Sep 29 10:24:25 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@kvm04 libvirt]# rpm -q libvirt
libvirt-0.8.7-18.el6_1.4.x86_64



How reproducible:

It's random; it has occurred 3 times in the past day.


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

It's on an Intel(R) Xeon(R) CPU E5645 @ 2.40GHz platform.

The command-line arguments for this VM are:

/usr/libexec/qemu-kvm -S -M rhel6.1.0 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name frap01.dms -uuid b458d691-675e-89a9-c291-b22c80ed3aaf -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/frap01.dms.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -drive file=/dev/sr0,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/dev/vg/frap01.dms,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:db:17:f6,bus=pci.0,addr=0x3 -netdev tap,fd=32,id=hostnet1,vhost=on,vhostfd=33 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:40:21:5f,bus=pci.0,addr=0x7 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:8 -vga std -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

This VM server was fine before; however, in the past few days it has been under considerably more CPU/disk/memory/network load.

Cheers James

Comment 2 Dave Allan 2012-02-01 01:34:03 UTC
I'm fairly certain that this BZ shouldn't be filed against libvirt, although I'm not 100% sure what the correct component is.  I've moved it to virtio-win.

Comment 3 Mike Cao 2012-02-03 05:27:29 UTC
Hi, James 

1. What happens if you use an IDE disk and RTL8139 network drivers?
2. Could you check whether there is a MEMORY.DMP located in C:\Windows?
3. Could you tell me which tools you are using to load the guest?
4. Could you upload the full libvirt log for the guest?


Thanks,
Mike

Comment 4 Mike Cao 2012-02-03 07:13:48 UTC
(In reply to comment #3)
> Hi, James 
> 
> 1. What happens if you use an IDE disk and RTL8139 network drivers?
> 2. Could you check whether there is a MEMORY.DMP located in C:\Windows?
> 3. Could you tell me which tools you are using to load the guest?
> 4. Could you upload the full libvirt log for the guest?
> 
> 

Please also provide the virtio-win version.

> Thanks,
> Mike

Comment 5 Yvugenfi@redhat.com 2012-02-05 10:27:16 UTC
Hello,

Printout:
"virtio_ioport_write: unexpected address 0x13 value 0x1" comes from a bug check callback that we have in our virtio-win driver for debugging purposes. It means that there was a crash (blue screen) on the guest. It is not necessarily related to the driver: the message is printed on any blue screen.
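
Since each of these lines corresponds to one guest blue screen, counting them in the guest's qemu log gives a rough crash history. The following is only a minimal sketch: the function name is hypothetical, and the regular expression is modelled on the single line quoted in this bug, so other variants of the message may not match.

```python
import re
from typing import Iterable, List

# Pattern modelled on the one line quoted in this bug report.
# Assumption: address and value are always printed as 0x-prefixed hex.
BUGCHECK_RE = re.compile(
    r"virtio_ioport_write: unexpected address (0x[0-9a-fA-F]+) value (0x[0-9a-fA-F]+)"
)


def find_bugcheck_lines(lines: Iterable[str]) -> List[int]:
    """Return the 1-based numbers of log lines that carry the bug-check marker."""
    return [n for n, line in enumerate(lines, start=1) if BUGCHECK_RE.search(line)]
```

To use it, open the guest's log file and pass the file object to find_bugcheck_lines(); the length of the returned list is the number of recorded blue screens.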

To further debug this issue:
1. Please provide the mini dump or kernel dump that was created as a result of the blue screen.
2. Disable auto restart on crash in the Windows guest (this will let you get some information in case memory dumps are unavailable): Control Panel -> System -> "Advanced" tab -> click the "Settings" button under "Startup and Recovery" -> uncheck the "Automatically restart" check box -> under "Write debugging information" select "Kernel memory dump" from the drop-down list.
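
For several guests, the same two settings can probably be applied through the registry instead of the dialog. The sketch below only renders a .reg fragment as text. It assumes that the dialog options map to the AutoReboot and CrashDumpEnabled values under the CrashControl key, and that CrashDumpEnabled=2 selects a kernel memory dump; please verify this against Microsoft's documentation for your Windows version before importing anything. The function name is hypothetical.

```python
# Assumed registry location of the "Startup and Recovery" crash settings.
CRASH_CONTROL_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"


def crash_control_reg(auto_reboot: bool = False, dump_type: int = 2) -> str:
    """Render a .reg fragment for the crash settings.

    Assumptions: AutoReboot=0 disables the automatic restart, and
    dump_type 2 corresponds to "Kernel memory dump".
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{CRASH_CONTROL_KEY}]",
        f'"AutoReboot"=dword:{int(auto_reboot):08x}',
        f'"CrashDumpEnabled"=dword:{dump_type:08x}',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


print(crash_control_reg())
```

The printed text can be saved to a file and reviewed before it is imported on the guest; a reboot of the guest may be needed for the change to take effect.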


Thanks!

Comment 6 Hein 2012-02-08 02:46:58 UTC
Created attachment 560112 [details]
libvirt log for the guest

Hi Mike,

There is a memory dump file in C:\windows modified on 31 Jan which was like 8 days ago, no crash since then.
virt-manager is used to load the guest.
FYI, libvirt log for this guest is attached.

Thanks
Hein

Comment 7 Mike Cao 2012-02-08 10:27:36 UTC
(In reply to comment #6)
> Created attachment 560112 [details]
> libvirt log for the guest
> 
> Hi Mike,
> 
> There is a memory dump file in C:\windows modified on 31 Jan which was like 8
> days ago, no crash since then.
> virt-manager is used to load the guest.
> FYI, libvirt log for this guest is attached.
> 
> Thanks
> Hein

Hi, Hein

Please try the steps in comment #5 and keep your guest running. Once it happens again, please attach the MEMORY.DMP to this bug.

Best Regards,
Mike

Comment 8 Hein 2012-02-10 02:53:58 UTC
Created attachment 560746 [details]
memory.dmp

Comment 9 Hein 2012-02-10 02:55:23 UTC
Created attachment 560747 [details]
memory.dmp1

Comment 10 Hein 2012-02-10 02:56:20 UTC
Created attachment 560748 [details]
memory.dmp2

Comment 11 Hein 2012-02-10 02:57:17 UTC
Created attachment 560749 [details]
memory.dmp3

Comment 12 Hein 2012-02-10 02:58:09 UTC
Created attachment 560751 [details]
memory.dmp4

Comment 13 Hein 2012-02-10 03:00:48 UTC
Hi Mike,
Please find the attached memory dump file as requested. FYI, the file has been split into 5 MB parts.
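
For anyone who needs to reassemble the parts, here is a minimal sketch. It assumes the dump was split byte-wise into consecutive chunks; the comment does not say which tool was used, so if the parts come from an archiver, use that tool instead. The function name is hypothetical, and the caller must supply the parts in the correct order.

```python
import shutil
from pathlib import Path
from typing import Sequence


def join_parts(parts: Sequence[Path], target: Path) -> int:
    """Concatenate the parts, in the order given, into target.

    Returns the size of the resulting file in bytes.
    Assumption: the parts are plain consecutive byte ranges of one file.
    """
    target = Path(target)
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return target.stat().st_size
```

The result should be checked by opening it in a debugger; a file that does not load most likely means the ordering or the split method was assumed wrongly.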

Thanks
Hein

Comment 15 Yvugenfi@redhat.com 2012-02-13 00:10:48 UTC
I reviewed the crash dumps. They don't seem to point to the virtio-win driver.

The problem looks like http://support.microsoft.com/kb/908369 (the system is idle, so it is quite possible that the CPU is resuming from C1).
Is it possible to follow the steps in this KB and comment if the problem persists?

Also the machine is 2003 SP1, not SP2 - is it intentional?

Comment 16 Hein 2012-02-13 02:01:04 UTC
Hi Yan,

Registry subkey has been updated as per KB, so we will continue to monitor the system to see if this has fixed the issue. 

Thank you

Comment 17 Yvugenfi@redhat.com 2012-02-19 17:41:43 UTC
(In reply to comment #16)
> Hi Yan,
> 
> Registry subkey has been updated as per KB, so we will continue to monitor the
> system to see if this has fixed the issue. 
> 
> Thank you

Any updates?

Thanks,
Yan.

Comment 18 Hein 2012-03-01 05:42:07 UTC
Hi Yan
Even though the registry subkey has been updated, the servers keep crashing. Three Windows 2003 servers show the same characteristics.
Thank you.
Hein

Comment 19 Yvugenfi@redhat.com 2012-03-04 09:05:12 UTC
(In reply to comment #18)
> Hi Yan
> Even though the registry subkey has been updated, the servers keep crashing.
> Three Windows 2003 servers show the same characteristics.
> Thank you.
> Hein

Could you please upload the new crash dumps as well?

Thanks!

Comment 20 Ronen Hod 2012-03-06 10:34:50 UTC
Since we are closing virtio-win for RHEL6.3, and this is not a virtio-win bug, deferring to RHEL6.4.
Waiting for more answers from the customer.
Regards, Ronen.

Comment 21 andrew bellussi 2012-03-15 02:51:36 UTC
Created attachment 570151 [details]
Memory Dump frap01.dms Part1 0f  4

Hi I'm uploading the latest Memory Dump files from all 3 Win 2003 Servers (maxed to 5MB files)

Regards,

Andrew Bellussi

Comment 22 andrew bellussi 2012-03-15 02:53:49 UTC
Created attachment 570152 [details]
Memory Dump frap01.dms part 2 of 4

Comment 23 andrew bellussi 2012-03-15 02:55:40 UTC
Created attachment 570153 [details]
Mem Dump frap01.dms Part 3 0f 4

Comment 24 andrew bellussi 2012-03-15 02:57:23 UTC
Created attachment 570154 [details]
Mem Dump frap01.dms Part 4 of 4

Comment 25 andrew bellussi 2012-03-15 02:59:04 UTC
Created attachment 570155 [details]
Mem Dump Frap02.dms part1 of 4

Comment 26 andrew bellussi 2012-03-15 03:00:55 UTC
Created attachment 570156 [details]
Mem Dump frap02.dms Part 2 of 4

Comment 27 andrew bellussi 2012-03-15 03:02:20 UTC
Created attachment 570157 [details]
Mem Dump frap02.dms part 3 of 4

Comment 28 andrew bellussi 2012-03-15 03:03:41 UTC
Created attachment 570158 [details]
Mem Dump frap02.dms part 4 0f 4

Comment 29 andrew bellussi 2012-03-15 03:05:35 UTC
Created attachment 570160 [details]
Mem Dump frap03.dms 1 of 4

Comment 30 andrew bellussi 2012-03-15 03:08:20 UTC
Created attachment 570161 [details]
Mem Dump frap03.dms 2 of 4

Comment 31 andrew bellussi 2012-03-15 03:10:20 UTC
Created attachment 570162 [details]
Mem Dump frap03.dms 3 of 4

Comment 32 andrew bellussi 2012-03-15 03:12:58 UTC
Created attachment 570163 [details]
Mem Dump frap03.dms 4 of 4

Comment 33 Miya Chen 2012-03-27 05:03:50 UTC
Hi Yan,
Could you please help check the above uploaded dump file? Thanks.

Comment 34 Yvugenfi@redhat.com 2012-04-03 09:07:40 UTC
(In reply to comment #33)
> Hi Yan,
> Could you please help check the above uploaded dump file? Thanks.

Hi, I checked the crash dumps. They all look exactly the same, and they are the same as the initial one.
Unfortunately I don't have any new conclusions.

I will continue to research it.

For now, the only odd thing that I saw in the dump files is that the HDA bus driver was loaded and unloaded several times.

Is it possible to run the VM in the same scenario but without sound? (remove -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0)
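
To illustrate what removing that option means on a raw qemu-kvm argument list, here is a small sketch. The helper name is hypothetical and the command string is shortened from the one in the description. Note that this guest is started through libvirt, which generates the command line itself, so in practice the change would have to be made in the guest's libvirt configuration rather than on the command line.

```python
import shlex
from typing import List, Sequence


def drop_devices(argv: Sequence[str], device_names: Sequence[str]) -> List[str]:
    """Remove every '-device <spec>' pair whose device name is in device_names.

    The device name is taken to be the text before the first comma of the spec.
    """
    out: List[str] = []
    i = 0
    while i < len(argv):
        if argv[i] == "-device" and i + 1 < len(argv):
            name = argv[i + 1].split(",", 1)[0]
            if name in device_names:
                i += 2  # skip the option and its value
                continue
        out.append(argv[i])
        i += 1
    return out


# Shortened form of the command line from the description.
cmd = (
    "/usr/libexec/qemu-kvm -m 2048 -vga std "
    "-device intel-hda,id=sound0,bus=pci.0,addr=0x4 "
    "-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 "
    "-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5"
)
trimmed = drop_devices(shlex.split(cmd), ["hda-duplex"])
print(" ".join(trimmed))
```

Only hda-duplex is dropped here, as requested above. Passing "intel-hda" as well would also remove the sound controller, which is a further step that the comment does not ask for.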

Comment 35 andrew bellussi 2012-04-11 05:04:48 UTC
Hi Folks,

Good News...

Ever since we upgraded our 3 troublesome Win2003 VM servers to SP2 and connected them to our Windows update server, we have not had an unexplained reboot. So for approx. 2 weeks all has been good.

Our thanks to the people below who have helped in trying to solve our problem:

Mike Cao
Yan Vugenfirer
Ronen Hod
Miya Chen.

Regards,

Andrew Bellussi.

Comment 36 James Shirley 2012-04-11 10:25:56 UTC
Please close