Bug 999296 - Win2012.64 guest hang on Intel(R) Xeon(R) CPU 5130 host
Summary: Win2012.64 guest hang on Intel(R) Xeon(R) CPU 5130 host
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.0
Assignee: Yvugenfi@redhat.com
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1000882
Reported: 2013-08-21 06:12 UTC by xhan
Modified: 2014-06-18 08:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-15 13:22:24 UTC
Target Upstream Version:
Embargoed:


Attachments:
register info (9.97 KB, text/plain)
2013-08-22 05:50 UTC, xhan

Description xhan 2013-08-21 06:12:24 UTC
Description of problem:

On a host with an "Intel(R) Xeon(R) CPU 5130  @ 2.00GHz", boot a win2012 guest. After boot, start Server Manager; the guest first freezes and eventually turns into a black screen.

Version-Release number of selected component (if applicable):
kernel-2.6.32-412.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.394.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. boot up vm
/usr/libexec/qemu-kvm \
    -S \
    -name 'vm1' \
    -nodefaults \
    -monitor stdio \
    -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x4 \
    -drive file='win2012-64-virtio.raw',index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,snapshot=off,format=raw,aio=native \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0 \
    -device virtio-net-pci,netdev=idLd6elX,mac='9a:3a:3b:3c:3d:3e',bus=pci.0,id='id2MXw9O' \
    -netdev tap,id=idLd6elX,vhost=on,script=qemu-ifup \
    -m 8196 \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
    -cpu 'Conroe' \
    -M rhel6.5.0 \
    -drive file='winutils.iso',index=1,if=none,id=drive-ide0-0-1,media=cdrom,format=raw \
    -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1 \
    -vnc :0 \
    -vga std \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off  \
    -enable-kvm
2. cont (via monitor)
3. view the screen using remote-viewer 
   remote-viewer  vnc://ip_host:5900

Actual results:
The guest freezes first and eventually turns into a black screen.

Expected results:
The guest should work normally.

Additional info:

Host machine info:
# free
             total       used       free     shared    buffers     cached
Mem:      16329084   16156388     172696          0       8436    7204608
-/+ buffers/cache:    8943344    7385740
Swap:     58720240      19408   58700832

# cat /proc/cpuinfo
...
processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping	: 11
cpu MHz		: 249.999
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm dts tpr_shadow vnmi flexpriority
bogomips	: 3990.05
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual

Guest/VM info:
(qemu) info registers 
RAX=00000000000000c0 RBX=000000000000000c RCX=000000000000000c RDX=0000000000000071
RSI=000000000000000c RDI=fffff88004578218 RBP=000000000000000c RSP=fffff88004578158
R8 =fffff88004578218 R9 =0000000000000001 R10=fffff8027f573bcc R11=0000000000000258
R12=0000000000010000 R13=0000000000000000 R14=0000000000000000 R15=fffff8027f24a490
RIP=fffff8027f21d5e2 RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00000000
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =0053 00000000ff946000 00003c00 0040f300 DPL=3 DS   [-WA]
GS =002b fffff8027f56e000 ffffffff 00c0f300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 fffff8027e7b2080 00000067 00008b00 DPL=0 TSS64-busy
GDT=     fffff8027e7b1000 0000007f
IDT=     fffff8027e7b1080 00000fff
CR0=80050031 CR2=000000541cbf72a0 CR3=0000000113045000 CR4=000006f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
FCW=027f FSW=3800 [ST=7] FTW=80 MXCSR=00001f80
FPR0=9fc0000000000000 4008 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=0000000000000000ffffffffffffffff XMM03=3f8000003f8000003f8000003f800000
XMM04=3a8282833a8282833a8282833a828283 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000

kvm_stat -1
efer_reload                    0         0
exits                   44562930      6491
fpu_reload                623753        83
halt_exits                119164        31
halt_wakeup               124813        32
host_state_reload       35218919      6143
hypercalls                     0         0
insn_emulation           1507140       124
insn_emulation_fail          132         0
invlpg                    272094         0
io_exits                34979999      6109
irq_exits                2198737        56
irq_injections            511083        62
irq_window                179590        31
largepages                    72         0
mmio_exits                 49857         0
mmu_cache_miss             24534         0
mmu_flooded                 2186         0
mmu_pde_zapped             26830         0
mmu_pte_updated           135871         0
mmu_pte_write             138148         0
mmu_recycled                   0         0
mmu_shadow_zapped          18591         0
mmu_unsync                  1367         0
nmi_injections                 0         0
nmi_window                     0         0
pf_fixed                 3779591         0
pf_guest                  457431         0
remote_tlb_flush          574626         0
request_irq                    0         0
signal_exits                  60         0
tlb_flush                 566412         0

Comment 2 Andrew Jones 2013-08-21 08:52:15 UTC
Is the guest still running? i.e. are the vcpu threads still consuming cycles (check top), and/or is the rip changing if you check 'info registers' a few times? If so, then it might help to get a Windows DMP.

Comment 3 xhan 2013-08-22 03:10:04 UTC
top command output:
-----------------------------------------------------------------------
Tasks: 144 total,   2 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.6%us,  9.4%sy,  0.0%ni, 76.8%id,  0.8%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:  16329084k total, 16119504k used,   209580k free,    48464k buffers
Swap: 58720240k total,    19308k used, 58700932k free,  7101752k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                     
32619 root      20   0 8647m 8.1g 4948 R 256.4 51.8  42:23.31 qemu-kvm
-----------------------------------------------------------------------

rip only changes from "RIP=fffff8027f21d5e2" to "RIP=fffff801433715dc" in info registers output.

(qemu) info registers 
RAX=000000000000000c RBX=000000000000000c RCX=000000000000000c RDX=0000000000000070
RSI=000000000000000c RDI=fffff80144841698 RBP=000000000000000c RSP=fffff801448415d8
R8 =fffff80144841698 R9 =0000000000000001 R10=0000000000000024 R11=fffffa8006e79f10
R12=0000000000000300 R13=fffffa8006e78300 R14=0000000000000000 R15=fffff8014339e490
RIP=fffff801433715dc RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 ffffffff 00000000
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =0053 00000000745f0000 00003c00 0040f300 DPL=3 DS   [-WA]
GS =002b fffff80142f0e000 ffffffff 00c0f300 DPL=3 DS   [-WA]
LDT=0000 0000000000000000 ffffffff 00000000
TR =0040 fffff80144835080 00000067 00008b00 DPL=0 TSS64-busy
GDT=     fffff80144834000 0000007f
IDT=     fffff80144834080 00000fff
CR0=80050031 CR2=000007ff482ea0c0 CR3=0000000000187000 CR4=000006f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
FCW=027f FSW=3800 [ST=7] FTW=80 MXCSR=00001f80
FPR0=9fc0000000000000 4008 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=ea9835d47887d91d97b73beca6df947e XMM01=0000000000000011fffffa8006fffe10
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
XMM08=00000000000000000000000000000000 XMM09=00000000000000000000000000000000
XMM10=00000000000000000000000000000000 XMM11=00000000000000000000000000000000
XMM12=00000000000000000000000000000000 XMM13=00000000000000000000000000000000
XMM14=00000000000000000000000000000000 XMM15=00000000000000000000000000000000
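
The RIP comparison asked for in comment 2 can be automated roughly as follows. This is only a sketch: it assumes the 'info registers' output has been saved as text, and the helper names extract_rip and rip_is_changing are made up for illustration, not part of QEMU. The two sample lines are the RIP lines from the dumps in this bug.

```python
import re


def extract_rip(dump: str) -> int:
    """Return the RIP value found in the text of an 'info registers' dump."""
    match = re.search(r"\bRIP=([0-9a-fA-F]+)", dump)
    if match is None:
        raise ValueError("no RIP field found in dump")
    return int(match.group(1), 16)


def rip_is_changing(dumps) -> bool:
    """True if at least two of the sampled dumps show different RIP values."""
    return len({extract_rip(d) for d in dumps}) > 1


# RIP lines copied from the two dumps in this report.
sample_a = "RIP=fffff8027f21d5e2 RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0"
sample_b = "RIP=fffff801433715dc RFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0"

print(rip_is_changing([sample_a, sample_b]))
```

A changing RIP only shows that the sampled vcpu is still executing instructions; it does not by itself tell a busy loop from real progress.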

Comment 4 xhan 2013-08-22 03:32:32 UTC
When the guest hangs, qemu-kvm occasionally outputs this error message:
virtio_ioport_write: unexpected address 0x13 value 0x0.

Comment 5 xhan 2013-08-22 05:46:37 UTC
1. The guest works well on a Sandy Bridge host with the -cpu SandyBridge, -cpu Westmere, and -cpu Conroe CPU models.

host cpuinfo:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping	: 9
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips	: 6784.34
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual

2. Cannot reproduce with Win7.64


3. top after guest hang

Tasks: 144 total,   2 running, 142 sleeping,   0 stopped,   0 zombie
Cpu0  : 42.9%us, 57.1%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 20.0%us, 80.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 16.7%us, 50.0%sy,  0.0%ni, 33.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16329084k total, 11878968k used,  4450116k free,    53088k buffers
Swap: 58720240k total,    19308k used, 58700932k free,  7093656k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  WCHAN     COMMAND                                                                           
  499 root      20   0 4525m 4.0g 4920 R 246.3 25.9  11:17.89 -         qemu-kvm   

4. kvm_stat

kvm statistics

 efer_reload                  0       0
 exits                  4499473    6374
 fpu_reload              194411      86
 halt_exits               31250      25
 halt_wakeup              30558      27
 host_state_reload	1802632    6057
 hypercalls                   0       0
 insn_emulation          526907     131
 insn_emulation_fail         65       0
 invlpg                  138937       0
 io_exits               1690371    6001
 irq_exits               308306      50
 irq_injections          110721      50
 irq_window               11007      25
 largepages                  29       0
 mmio_exits               61049      31
 mmu_cache_miss           12140       0
 mmu_flooded               1500       0
 mmu_pde_zapped           18627       0
 mmu_pte_updated          12350       0
 mmu_pte_write            85467       0
 mmu_recycled                 0       0
 mmu_shadow_zapped         8560       0
 mmu_unsync                 853       0
 nmi_injections               0       0
 nmi_window                   0       0
 pf_fixed               1254739       0
 pf_guest                219912       0
 remote_tlb_flush        151012       0
 request_irq                  0       0
 signal_exits                 1       0
 tlb_flush               296338       0


5. cmd:
/usr/libexec/qemu-kvm -name vm1 -nodefaults -monitor stdio -device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=0x4 \
-drive file=/win2012-64-virtio.raw,index=0,if=none,id=drive-ide0-0-0,media=disk,cache=none,snapshot=off,format=raw,aio=native \
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0 \
-device e1000,netdev=idLd6elX,mac=9a:3a:3b:3c:3d:3e,id=id2MXw9O \
-netdev tap,id=idLd6elX,script=/scripts/qemu-ifup-switch \
-m 4G -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 -cpu Conroe -M rhel6.5.0 -vnc :0 -vga std \
-rtc base=localtime,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off -enable-kvm

6. guest: Win2012.64

7. result:
The guest hangs, CPU usage is at 100%, and the guest cannot be pinged from the host.

8. host on which the guest hangs

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping	: 11
cpu MHz		: 249.999
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm dts tpr_shadow vnmi flexpriority
bogomips	: 3990.05
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual

[root@intel-5130-16-1 windows]# free -m
             total       used       free     shared    buffers     cached
Mem:         15946      11652       4293          0         53       6980
-/+ buffers/cache:       4619      11327
Swap:        57343         18      57325

Comment 6 xhan 2013-08-22 05:50:10 UTC
Created attachment 789044 [details]
register info

Comment 7 Andrew Jones 2013-08-22 09:18:26 UTC
(In reply to xhan from comment #5)
> kvm statistics
> 
>  efer_reload                  0       0
>  exits                  4499473    6374
>  fpu_reload              194411      86
>  halt_exits               31250      25
>  halt_wakeup              30558      27
>  host_state_reload	1802632    6057
>  hypercalls                   0       0
>  insn_emulation          526907     131
>  insn_emulation_fail         65       0
>  invlpg                  138937       0
>  io_exits               1690371    6001
>  irq_exits               308306      50
>  irq_injections          110721      50
>  irq_window               11007      25
>  largepages                  29       0
>  mmio_exits               61049      31
>  mmu_cache_miss           12140       0
>  mmu_flooded               1500       0
>  mmu_pde_zapped           18627       0
>  mmu_pte_updated          12350       0
>  mmu_pte_write            85467       0
>  mmu_recycled                 0       0
>  mmu_shadow_zapped         8560       0
>  mmu_unsync                 853       0
>  nmi_injections               0       0
>  nmi_window                   0       0
>  pf_fixed               1254739       0
>  pf_guest                219912       0
>  remote_tlb_flush        151012       0
>  request_irq                  0       0
>  signal_exits                 1       0
>  tlb_flush               296338       0
> 

Are any of these counts climbing quickly? Such as the interrupts? If so, then the symptoms are quite similar to a problem we had with the e1000 model and win2012, but this time there's no e1000. Or wait, is there? The cmdline in comment 0 doesn't have one configured, but the command line in comment 5 does.

Comment 8 xhan 2013-08-23 02:14:20 UTC
kvm statistics

 efer_reload                  0       0
 exits                 12472328    8336
 fpu_reload              400786     111
 halt_exits               32808       0
 halt_wakeup              33796       0
 host_state_reload	7120683    6389
 hypercalls                   0       0
 insn_emulation          862899     241
 insn_emulation_fail        137       0
 invlpg                  176507      31
 io_exits               6947633    6372
 irq_exits              1017898    1397
 irq_injections          255049     103
 irq_window               69279      45
 largepages                  75       0
 mmio_exits               49857       0
 mmu_cache_miss           15704       0
 mmu_flooded               1686       0
 mmu_pde_zapped           24986       0
 mmu_pte_updated          66547       0
 mmu_pte_write           102439       0
 mmu_recycled                 0       0
 mmu_shadow_zapped         9908       0
 mmu_unsync                1153       1
 nmi_injections               0       0
 nmi_window                   0       0
 pf_fixed               2367442      41
 pf_guest                292476       8
 remote_tlb_flush        320097      25
 request_irq                  0       0
 signal_exits                95       0
 tlb_flush               426751      52

1) io_exits and irq_exits climb quickly.
2) Comment 5 uses e1000 to rule out virtio problems.
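
To pick out the fast-climbing counters without reading the table by eye, something like the sketch below could be used. It assumes each kvm_stat line has the form "name total rate", with the last column being the change over the most recent sampling interval. The function names parse_kvm_stat and fastest_climbing are hypothetical, and the sample is a subset of the lines from this comment.

```python
def parse_kvm_stat(text: str) -> dict:
    """Map counter name -> (total, rate) for every 'name total rate' line."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip headers and blank lines
        name, total, rate = fields
        if total.isdigit() and rate.isdigit():
            counters[name] = (int(total), int(rate))
    return counters


def fastest_climbing(counters: dict, top: int = 5) -> list:
    """Return up to `top` (name, rate) pairs, highest rate first, rate > 0 only."""
    ranked = sorted(counters.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(name, rate) for name, (_, rate) in ranked[:top] if rate > 0]


# Subset of the kvm_stat output from this comment.
SAMPLE = """
kvm statistics

 exits                 12472328    8336
 halt_exits               32808       0
 host_state_reload      7120683    6389
 insn_emulation          862899     241
 io_exits               6947633    6372
 irq_exits              1017898    1397
"""

print(fastest_climbing(parse_kvm_stat(SAMPLE), top=4))
# → [('exits', 8336), ('host_state_reload', 6389), ('io_exits', 6372), ('irq_exits', 1397)]
```

On this sample, io_exits accounts for 6372 of the 8336 exits in the interval, roughly three quarters, which matches point 1) above.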

Comment 9 xhan 2013-08-23 08:13:36 UTC
(In reply to xhan from comment #4)
> When the guest hangs, qemu-kvm occasionally outputs this error message:
> virtio_ioport_write: unexpected address 0x13 value 0x0.

Retested the guest with the latest prewhql virtio driver; the error message no longer appears.

The hang still occurs.

Comment 10 Andrew Jones 2013-08-23 09:45:07 UTC
Please remove the e1000 from the config. If the problem is hit again, then check kvm stats again for the io and irq_exits. Thanks.

Comment 15 Andrew Jones 2013-08-26 08:25:26 UTC
Hmm, it's looking like win2012's interrupt handling is sensitive to running as a KVM guest in general - not necessarily just with the e1000 driver.

Comment 23 RHEL Program Management 2013-10-14 02:38:15 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 24 Yvugenfi@redhat.com 2014-01-14 13:17:29 UTC
Is this bug still reproducible?

Thanks,
Yan.

Comment 25 xhan 2014-01-15 06:30:22 UTC
Started the win2012 guest with the command in the description. The guest works well; this bug can no longer be reproduced.

Tested package versions:
qemu-kvm-1.5.3-10.el7.x86_64
kernel-3.10.0-35.el7.x86_64

Comment 26 Yvugenfi@redhat.com 2014-01-15 13:22:24 UTC
Closing according to comment #25

