Bug 680341

Summary: Win2008_64bit (& Win7-32) BSOD during installation
Product: Red Hat Enterprise Linux 5 Reporter: Xiaoqing Wei <xwei>
Component: kvm Assignee: Ronen Hod <rhod>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 5.7 CC: dyasny, gcosta, gleb, jasowang, juzhang, knoel, lihuang, mkenneth, pbonzini, rdassen, rhod, shuang, tburke, virt-maint, vrozenfe
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-26 08:56:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 580948    
Attachments:
Description                                   Flags
bsod snapshot                                 none
top,gdb,kvm_stat info,hope they will help     none
the unattended file for this installation     none
minidump itself                               none
minidump analyzed                             none
memory dump analyzed                          none
memdump itself part0                          none
memdump itself part1                          none
meet the same bsod when installing win7-32    none
memdump itself part2                          none

Description Xiaoqing Wei 2011-02-25 07:00:11 UTC
Description of problem:
BSOD at stage 'completing installation'

Version-Release number of selected component (if applicable):

kvm-83-225

How reproducible:
2/16

Steps to Reproduce:

1. cmd
"qemu-kvm -drive file='/usr/images/win2008-64.qcow2',index=0,if=ide,media=disk,cache=none,format=qcow2 
-net nic,vlan=0,model=rtl8139,macaddr='9a:92:b2:89:51:78' 
-net tap,vlan=0,ifname='t0-093847-WUVh',script='/usr/scripts/qemu-ifup-switch',downscript='no' 
-m 2048 -smp 2,cores=1,threads=1,sockets=2 -drive file='/usr/isos/ISO/Win2008/64/en_windows_server_2008_datacenter_enterprise_standard_x64_dvd_X14-26714.iso',media=cdrom,index=1 
-drive file='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/isos/windows/winutils.iso',media=cdrom,index=2 
-drive file='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/isos/windows/virtio-win.iso',media=cdrom,index=3 
-cpu qemu64,+sse2 
-fda '/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/images/win2008-sp1-64/answer.vfd' 
-redir tcp:5000::10023 -vnc :0 -rtc-td-hack -M rhel5.6.0 -boot d  -usbdevice tablet"

  
Actual results:
blue screen of death

Expected results:
Installation completes successfully.






Additional info:

1. host info
kernel:x64-2.6.18-245
cpuinfo: tested on two kinds of AMD CPU:



processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 67
model name	: Dual-Core AMD Opteron(tm) Processor 1216
stepping	: 3
cpu MHz		: 1000.000
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips	: 2009.36
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc









processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) Dual Core Processor 4450B
stepping	: 2
cpu MHz		: 1000.000
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy misalignsse
bogomips	: 2004.17
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps


2. Cannot reproduce on an Intel host

3. guest
Win2008.64

Comment 1 Xiaoqing Wei 2011-02-25 07:01:00 UTC
Created attachment 480939 [details]
bsod snapshot

Comment 4 Xiaoqing Wei 2011-05-16 04:59:47 UTC
Created attachment 499078 [details]
top,gdb,kvm_stat info,hope they will help

Comment 5 Xiaoqing Wei 2011-05-16 05:00:52 UTC
Created attachment 499079 [details]
the unattended file for this installation

Comment 6 Xiaoqing Wei 2011-05-16 10:32:29 UTC
Retested with the host heavily loaded: the bug is much easier to reproduce.

It can also be reproduced with virtio devices.


When the qemu-kvm process starts, run 10+ dd processes:
dd if=/dev/urandom of=/tmp/...1 bs=1M count=1000 &
.....
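
The dd load above can be sketched as a small shell helper. This is only an illustration: the function name start_dd_load, the target directory argument and the load-N.img file names are assumptions, not the exact commands used in the test run.

```shell
#!/bin/sh
# start_dd_load N DIR [COUNT_MB]
# Launch N background dd writers reading /dev/urandom, to put the host
# under CPU and I/O pressure while the guest installs.
# The helper name and the load-N.img file names are hypothetical.
start_dd_load() {
    n=$1
    dir=$2
    count_mb=${3:-1000}   # default matches count=1000 with bs=1M
    mkdir -p "$dir"
    i=1
    while [ "$i" -le "$n" ]; do
        dd if=/dev/urandom of="$dir/load-$i.img" bs=1M count="$count_mb" 2>/dev/null &
        i=$((i + 1))
    done
}
```

Calling start_dd_load with a count of 10 and a scratch directory right after qemu-kvm starts should approximate the "10+ dd" case; each writer produces about 1 GB by default, so the scratch directory needs enough free space.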


qemu-kvm -name 08-virt_nic-ins1 -monitor stdio  \
-serial unix:'/tmp/serial-ins1',server,nowait -drive file='win2008-64-virtio1.qcow2',index=0,if=virtio,media=disk,cache=none,boot=on,format=qcow2 -net nic,vlan=0,model=virtio,macaddr='9a:bf:2a:ba:3c:41' \
-net tap,vlan=0,... -m 2048 -smp 2,cores=1,threads=1,sockets=2 \
-drive file='26714.iso',media=cdrom,index=1 \
-drive file='winutils.iso',media=cdrom,index=2 \
-drive file='irtio-win.iso.el5',media=cdrom,index=3 \
-cpu qemu64,+sse2 -soundhw ac97 \
-fda '/home/kvm_autotest_root/images/win2008-sp2-64/answer.vfd'  \
-vnc :1  -rtc-td-hack -M rhel5.6.0 -boot d  -usbdevice tablet




top info(show threads)

top - 17:45:17 up 2 days, 23:24,  6 users,  load average: 6.09, 6.20, 4.61
Tasks:   5 total,   2 running,   3 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.2%us, 94.8%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8177480k total,  3645032k used,  4532448k free,    23284k buffers
Swap: 10223608k total,    12392k used, 10211216k free,  1434544k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                       
29939 root      25   0 2253m 2.0g 2488 R 100.0 25.3   1:36.17 qemu-kvm                     
29940 root      25   0 2253m 2.0g 2488 R 100.0 25.3   1:24.07 qemu-kvm                     
29928 root      15   0 2253m 2.0g 2488 S  0.0 25.3   0:01.19 qemu-kvm                      
29938 root      18   0 2253m 2.0g 2488 S  0.0 25.3   0:00.00 qemu-kvm                      
29959 root      15   0 2253m 2.0g 2488 S  0.0 25.3   0:00.00 qemu-kvm                      



gdb info
(gdb) bt
#0  0x00000037ac2cd722 in select () from /lib64/libc.so.6
#1  0x00000000004093e1 in qemu_select (timeout=0)
    at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:3995
#2  main_loop_wait (timeout=0)
    at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4094
#3  0x000000000050114a in kvm_main_loop ()
    at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/qemu-kvm.c:596
#4  0x000000000040e757 in main_loop (argc=38, argv=0x7fffa3b47408, 
    envp=<value optimized out>)
    at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:4157
#5  main (argc=38, argv=0x7fffa3b47408, envp=<value optimized out>)
    at /usr/src/debug/kvm-83-maint-snapshot-20090205/qemu/vl.c:6559
(gdb) 


kvm_stat info
[root@amd-1216-8-2 ~]# kvm_stat -1
efer_reload                    0         0
exits                   43164505    203030
fpu_reload              40862598    201904
halt_exits                 12704         0
halt_wakeup                11694         0
host_state_reload       41273245    202092
hypercalls                     0         0
insn_emulation            836956         0
insn_emulation_fail            3         0
invlpg                     95202         0
io_exits                40989217    201745
irq_exits                 321246      1284
irq_injections             36717         0
irq_window                     0         0
kvm_request_irq                0         0
largepages                     0         0
mmio_exits                128856         0
mmu_cache_miss            130105         0
mmu_flooded                 1536         0
mmu_pde_zapped              3408         0
mmu_pte_updated            11094         0
mmu_pte_write              11880         0
mmu_recycled                   0         0
mmu_shadow_zapped         173935         0
mmu_unsync                     0         0
mmu_unsync_global              0         0
nmi_injections                 4         0
nmi_window                     0         0
pf_fixed                  951702         0
pf_guest                   72491         0
remote_tlb_flush           44786         0
request_nmi                    0         0
signal_exits                   2         0
tlb_flush                 194058         0
[root@amd-1216-8-2 ~]# 
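
One way to read the snapshot above is to compare the per-second column (the third field) of io_exits against exits. A minimal sketch, assuming the three-column `kvm_stat -1` layout shown here; the helper name io_exit_share is hypothetical:

```shell
#!/bin/sh
# io_exit_share: read `kvm_stat -1` style lines (name, total, per-second)
# on stdin and print io_exits as a percentage of all exits per second.
# Lines that do not start with "exits" or "io_exits" are ignored.
io_exit_share() {
    awk '
        $1 == "exits"    { exits = $3 }
        $1 == "io_exits" { io = $3 }
        END {
            if (exits > 0) printf "%.1f\n", 100 * io / exits
            else print "n/a"
        }'
}
```

Fed the snapshot above (201745 io_exits out of 203030 exits per second), it prints 99.4, i.e. nearly every exit at that moment is an I/O exit.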



CPU info
processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 67
model name	: Dual-Core AMD Opteron(tm) Processor 1216
stepping	: 3
cpu MHz		: 2400.000
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy


RAM info 8G


Since this seems to be a performance-related issue, it may not be CPU-vendor related. I will test on Intel again.

Previously I tested on a small AMD machine (dual core/8G) and a big Intel machine (eight cores/32G); the Intel host may have passed the installation only because it was not under enough pressure.

Comment 7 Xiaoqing Wei 2011-05-17 02:14:22 UTC
(In reply to comment #6)


> Since this seems to be a performance-related issue, it may not be CPU-vendor
> related. I will test on Intel again.
> 
> Previously I tested on a small AMD machine (dual core/8G) and a big Intel
> machine (eight cores/32G); the Intel host may have passed the installation
> only because it was not under enough pressure.


The big Intel machine also reproduces this issue when it is put under enough pressure.

I ran 40 dd operations at the end of the installation and the BSOD happened;
a memory dump and a minidump were generated at the same time.

Attached are the files, both the original dumps and the analysis results.

Comment 8 Xiaoqing Wei 2011-05-17 02:19:16 UTC
Created attachment 499256 [details]
minidump itself

Comment 9 Xiaoqing Wei 2011-05-17 02:19:54 UTC
Created attachment 499257 [details]
minidump analyzed

Comment 10 Xiaoqing Wei 2011-05-17 02:20:50 UTC
Created attachment 499258 [details]
memory dump analyzed

Comment 11 Xiaoqing Wei 2011-05-17 02:40:40 UTC
Created attachment 499261 [details]
memdump itself part0

The memdump is too big, so I split it into parts to fit the Bugzilla upload size limit.
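
For anyone who needs to rebuild the dump from the attached parts, splitting and rejoining can be sketched as below. The helper names, the .part- suffix and the .joined output name are assumptions for illustration; the actual attachment names and chunk size used for this bug are not known.

```shell
#!/bin/sh
# split_for_upload FILE CHUNK_BYTES
# Cut FILE into FILE.part-aa, FILE.part-ab, ... of at most CHUNK_BYTES each.
split_for_upload() {
    split -b "$2" "$1" "$1.part-"
}

# join_parts FILE
# Concatenate FILE.part-* (the glob sorts aa, ab, ...) into FILE.joined.
join_parts() {
    cat "$1".part-* > "$1.joined"
}
```

After rejoining, comparing the result against the original with cmp confirms that no part was lost or reordered.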

Comment 12 Xiaoqing Wei 2011-05-17 02:44:09 UTC
Created attachment 499262 [details]
memdump itself part1

Comment 13 Xiaoqing Wei 2011-05-25 08:53:12 UTC
Created attachment 500762 [details]
meet the same bsod when installing win7-32

Comment 14 Xiaoqing Wei 2011-05-25 08:59:51 UTC
(In reply to comment #13)
> Created attachment 500762 [details]
> meet the same bsod when installing win7-32


Host info:
kernel: 2.6.18-262.el5
kvm: kvm-83-235.el5

CPU:
processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           E5310  @ 1.60GHz
stepping	: 11
cpu MHz		: 1600.002
cache size	: 4096 KB
physical id	: 1
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow vnmi flexpriority
bogomips	: 3191.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual
power management:


[root@intel-5310-32-2 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32108       1209      30898          0         51        412
-/+ buffers/cache:        746      31362
Swap:        34287          0      34287

Comment 15 Xiaoqing Wei 2011-05-26 02:28:13 UTC
Created attachment 500968 [details]
memdump itself part2

Comment 16 Qunfang Zhang 2011-05-26 10:29:43 UTC
*** Bug 667112 has been marked as a duplicate of this bug. ***

Comment 17 Qunfang Zhang 2011-05-31 02:49:30 UTC
*** Bug 643060 has been marked as a duplicate of this bug. ***

Comment 25 Vadim Rozenfeld 2011-07-04 07:37:07 UTC
Guys,
I suggest closing this bug as WONTFIX,
as we did for Bz#667112.
Best regards,
Vadim.

Comment 36 Ronen Hod 2011-08-11 15:07:40 UTC
*** Bug 557660 has been marked as a duplicate of this bug. ***

Comment 37 Ronen Hod 2011-08-11 15:11:39 UTC
In 
https://bugzilla.redhat.com/show_bug.cgi?id=557660#c12

Vadim Rozenfeld 2011-04-06 08:49:19 EDT
Please try disabling the display watchdog.
http://msdn.microsoft.com/en-us/library/ff553890%28v=vs.85%29.aspx

It might be relevant.

Comment 50 Xiaoqing Wei 2011-09-10 10:45:56 UTC
Hi,

Renicing qemu-kvm to a higher priority (NI -5 in the top output below) still results in a BSOD, but it is harder to trigger; the guest is more robust than before.

Here is the info captured when the guest hit the BSOD; I hope it helps.
From the monitor:

info registers 
RAX=fffff8000aa30001 RBX=0000000000000000 RCX=fffff8000c34c3e0 RDX=000000000000b000
RSI=fffff8000aa71640 RDI=0000000000000001 RBP=000000000000000f RSP=fffff8000c34c360
R8 =0000000000000000 R9 =0000000000000000 R10=fffff8000c34c148 R11=fffff8000c34b2e8
R12=0000000000000101 R13=0000000000000001 R14=0000000040000082 R15=0000000000000001
RIP=fffff8000a80e525 RFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300
CS =0010 0000000000000000 00000000 00209b00
SS =0018 0000000000000000 ffffffff 00c09300
DS =002b 0000000000000000 ffffffff 00c0f300
FS =0053 00000000fffd5000 00003c00 0040f300
GS =002b fffff8000a9bd500 ffffffff 00c0f300
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 fffff8000c346070 00000067 00008b00
GDT=     fffff8000c345000 0000006f
IDT=     fffff8000c345070 00000fff
CR0=80050031 CR2=fffff900c019c000 CR3=00000000002dd000 CR4=000006f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
FCW=027f FSW=3800 [ST=7] FTW=80 MXCSR=00000000
FPR0=9fc0000000000000 4008 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
(qemu) info cpus
* CPU #0: pc=0xfffff8000a80e525 thread_id=29221
  CPU #1: pc=0xfffff8000a8dcdee thread_id=29223



top -p pid (show threads)
Tasks:   6 total,   2 running,   4 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.2%us, 46.8%sy,  0.0%ni, 49.5%id,  0.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8106820k total,  7660452k used,   446368k free,    15208k buffers
Swap:  8388600k total,  1988624k used,  6399976k free,  4336448k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                  
29221 root      15  -5 4465m 2.7g 4532 R 100.0 35.5  10:35.89 qemu-kvm                                
29223 root      20  -5 4465m 2.7g 4532 R 100.0 35.5   9:55.67 qemu-kvm                                
29145 root      10  -5 4465m 2.7g 4532 S  0.0 35.5   0:36.92 qemu-kvm                                 
29168 root      10  -5 4465m 2.7g 4532 S  0.0 35.5   0:01.13 qemu-kvm                                 
29253 root      10  -5 4465m 2.7g 4532 S  0.0 35.5   0:00.25 qemu-kvm                                 
29255 root      10  -5 4465m 2.7g 4532 S  0.0 35.5   0:00.07 qemu-kvm                                 




[root@localhost ~]# kvm_stat -1
efer_reload                17603         0
exits                   98251711    238528
fpu_reload              76183510    237386
halt_exits                883519         0
halt_wakeup               757816         0
host_state_reload       76429715    237390
hypercalls                     0         0
insn_emulation           6128524         0
insn_emulation_fail            0         0
invlpg                   1657766         0
io_exits                75371533    237380
irq_exits                2067003      1146
irq_injections           1921074         0
irq_window                380669         0
kvm_request_irq                0         0
largepages                     0         0
mmio_exits                 98379         0
mmu_cache_miss            275386         0
mmu_flooded               159606         0
mmu_pde_zapped            246150         0
mmu_pte_updated           250571         0
mmu_pte_write             329977         0
mmu_recycled               60112         0
mmu_shadow_zapped         254173         0
mmu_unsync                 20048         0
mmu_unsync_global              0         0
nmi_injections                 1         0
nmi_window                     0         0
pf_fixed                10173543         0
pf_guest                  872246         0
remote_tlb_flush          641323         0
request_nmi                    0         0
signal_exits                  57         0
tlb_flush                3130543         0

Comment 55 Ronen Hod 2012-02-01 11:50:15 UTC
*** Bug 781304 has been marked as a duplicate of this bug. ***

Comment 56 Ronen Hod 2012-02-26 08:56:07 UTC
Closing. This bug-fix seems to be too intrusive for RHEL5.9. Things are better in RHEL6.