Bug 821377

Summary: Guest for win2k8-R2 with -smp 64 and -m 256GB often stuck after resuming from S4.
Product: Red Hat Enterprise Linux 7
Reporter: dawu
Component: qemu-kvm
Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: amit.shah, areis, bcao, bsarathy, flang, juzhang, knoel, michen, mkenneth, rbalakri, tburke, virt-bugs, virt-maint, vrozenfe, xfu, yvugenfi
Target Milestone: rc
Target Release: 7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-03-04 05:47:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 720669, 761491, 923626
Attachments:
Description                           Flags
System_stuck_afterResumingFromS4_1    none
System_stuck_afterResumingFromS4_2    none
systemEventLogs.txt                   none

Description dawu 2012-05-14 09:46:19 UTC
Description of problem:
Found this issue when running the SVVP job "Common Scenario Stress With IO": a win2k8-R2 guest with -smp 64 and -m 256GB often gets stuck after resuming from S4.

Version-Release number of selected component (if applicable):
kernel-2.6.32-269.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.290.el6.x86_64
Netkvm virtio-win-rewhql-0.1-27
Block  virtio-win-rewhql-0.1-26
seabios-0.6.1.2-19.el6.x86_64

How reproducible:
70% 

Steps to Reproduce:
The main steps of the SVVP job "Common Scenario Stress With IO" are as follows:
1. Disable/enable the driver repeatedly, about 10 times.
2. Run simpleIOStress.
3. Enter S4 after the IO stress work is done.
4. Resume from S4.
5. Repeat steps 1-4 many times, e.g. 12 times.
Actual results:
Guest gets stuck after resuming from S4. This mainly happens during the first several S4 cycles. Please refer to the attached "System_stuck_afterResumingFromS4_1.png".

Expected results:
Guest should resume successfully after S4.

Additional info:
After issuing system_reset from the qemu monitor, the guest asks whether to "continue with system resume". If that option is selected, the resume completes successfully. Please refer to the attached "System_stuck_afterResumingFromS4_2.png".
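
For reference, the system_reset step can also be issued from the host without an interactive monitor session. The following is a minimal Python sketch and was not part of the original report. It assumes the monitor is exposed as a UNIX socket in readline (HMP) mode, as in the -chardev/-mon options of the command line in comment 9 (path /tmp/intel-max-sut). The helper names send_hmp_command and _read_until are hypothetical, and the prompt handling is a simplification that may need adjusting for a given qemu-kvm build.

```python
import socket

PROMPT = b"(qemu) "  # prompt printed by the human monitor (assumed unchanged)


def _read_until(sock, marker):
    """Read from sock until the data ends with marker, EOF, or timeout."""
    buf = b""
    try:
        while not buf.endswith(marker):
            chunk = sock.recv(4096)
            if not chunk:  # peer closed the connection
                break
            buf += chunk
    except socket.timeout:
        pass  # return whatever arrived before the timeout
    return buf


def send_hmp_command(sock_path, command, timeout=5.0):
    """Send one monitor command over a UNIX socket and return the raw reply text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        _read_until(sock, PROMPT)  # skip the banner and the first prompt
        sock.sendall(command.encode("ascii") + b"\n")
        reply = _read_until(sock, PROMPT)  # echo and output up to the next prompt
    return reply.decode("utf-8", errors="replace")


# Example (assumes the socket path from comment 9):
#   send_hmp_command("/tmp/intel-max-sut", "system_reset")
```

The reply is returned unparsed because the readline monitor echoes input with terminal control sequences; for system_reset only the side effect on the guest matters.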

Comment 1 dawu 2012-05-14 09:47:45 UTC
Created attachment 584307 [details]
System_stuck_afterResumingFromS4_1

Comment 2 dawu 2012-05-14 09:48:21 UTC
Created attachment 584308 [details]
System_stuck_afterResumingFromS4_2

Comment 3 dawu 2012-05-14 09:52:00 UTC
Hi Vadim,

If there are any mistakes in the steps I described for the SVVP job "Common Scenario Stress With IO", please correct me. Thanks!

Best Regards,
Dawn

Comment 5 Vadim Rozenfeld 2012-05-14 10:19:55 UTC
Hi Dawn,
I don't see any mistake in your description.
Could you please export the System Event Log to a file and upload it as an 
attachment?

Best regards,
Vadim.

Comment 6 Dor Laor 2012-05-14 14:11:47 UTC
Please also provide the guest configuration - are virtio drivers used?

Comment 7 Mike Cao 2012-05-14 15:00:36 UTC
(In reply to comment #6)
> Please also provide the guest configuration - are virtio drivers used?

Yes, we use virtio drivers. We hit this when running the SVVP test "(system) Sleep stress with IO" job.

Mike

Comment 8 dawu 2012-05-15 05:10:04 UTC
Created attachment 584534 [details]
systemEventLogs.txt

Comment 9 dawu 2012-05-21 05:48:51 UTC
CLI:

/usr/libexec/qemu-kvm -m 256G -smp 64 \
    -cpu cpu64-rhel6,+x2apic,family=0xf \
    -usb -device usb-tablet \
    -drive file=Intel_Max_Sut.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 \
    -netdev tap,sndbuf=0,id=hostnet0,vhost=on,script=/etc/qemu-ifup0,downscript=no \
    -device virtio-net-pci,netdev=hostnet0,mac=00:10:1a:75:50:01,bus=pci.0,addr=0x4,id=virtio-net-pci0 \
    -uuid e8bfe003-1426-4924-b0ac-bd42360a1c36 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -no-kvm-pit-reinjection \
    -chardev socket,id=111a,path=/tmp/intel-max-sut,server,nowait \
    -mon chardev=111a,mode=readline \
    -name intel-max-sut \
    -vnc :1 \
    -drive file=en_windows_server_2008_r2_standard_enterprise_datacenter_and_web_with_sp1_x64_dvd_617601.iso,media=cdrom,id=cdrom,if=none \
    -device ide-drive,drive=cdrom

Comment 11 RHEL Program Management 2012-07-10 06:08:55 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 12 RHEL Program Management 2012-07-11 02:07:38 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 13 Ronen Hod 2012-07-17 09:19:33 UTC
Might be the same issue as Bug 820112

Comment 14 Ronen Hod 2012-07-17 09:21:02 UTC
This is a tough bug that requires a lot of time and patience.
The price-performance of working on it currently is not good, with all the Win8/HCK issues.
Deferring to 6.5.

Comment 20 Ronen Hod 2014-08-06 08:58:19 UTC
QE,
There were so many changes since RHEL6.4. Please recheck.

Comment 21 juzhang 2014-08-11 08:53:29 UTC
(In reply to Ronen Hod from comment #20)
> QE,
> There were so many changes since RHEL6.4. Please recheck.

Hi Xiangchun,

Can you handle this?

Best Regards,
Junyi

Comment 22 FuXiangChun 2014-08-11 09:03:43 UTC
(In reply to juzhang from comment #21)
> (In reply to Ronen Hod from comment #20)
> > QE,
> > There were so many changes since RHEL6.4. Please recheck.
> 
> Hi Xiangchun,
> 
> Can you handle this?
> 
> Best Regards,
> Junyi

Junyi,
I will find a machine to re-test it.

Comment 23 FuXiangChun 2014-08-12 10:38:40 UTC
QE could not find a machine with enough memory, so this was re-tested with -m 12G -smp 64.

Steps followed comment 0.

Result:
S4 pass, resume pass.

QE will continue testing once a big-memory machine is found and will update the test result ASAP.