Bug 821377 - Guest for win2k8-R2 with -smp 64 and -m 256GB often stuck after resuming from S4.
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Target Milestone: rc
Target Release: 7.0
Assigned To: Vadim Rozenfeld
QA Contact: Virtualization Bugs
Depends On:
Blocks: 720669 761491 Virt-S3/S4-7.0
 
Reported: 2012-05-14 05:46 EDT by dawu
Modified: 2015-03-04 00:47 EST

Doc Type: Bug Fix
Last Closed: 2015-03-04 00:47:59 EST
Type: Bug


Attachments
System_stuck_afterResumingFromS4_1 (73.54 KB, image/png)
2012-05-14 05:47 EDT, dawu
System_stuck_afterResumingFromS4_2 (36.78 KB, image/png)
2012-05-14 05:48 EDT, dawu
systemEventLogs.txt (1.19 MB, text/plain)
2012-05-15 01:10 EDT, dawu

Description dawu 2012-05-14 05:46:19 EDT
Description of problem:
Found this issue when running the SVVP job "Common Scenario Stress With IO": a win2k8-R2 guest started with -smp 64 and -m 256GB often gets stuck after resuming from S4.

Version-Release number of selected component (if applicable):
kernel-2.6.32-269.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.290.el6.x86_64
Netkvm virtio-win-rewhql-0.1-27
Block  virtio-win-rewhql-0.1-26
seabios-0.6.1.2-19.el6.x86_64

How reproducible:
70% 

Steps to Reproduce:
The main steps of the SVVP job "Common Scenario Stress With IO" are as follows:
1. Disable/enable the driver repeatedly, about 10 times.
2. Run simpleIOStress.
3. Enter S4 after the IO stress work is done.
4. Resume from S4.
5. Repeat steps 1-4 many times, e.g. 12 times.

Actual results:
The guest gets stuck after resuming from S4. This mainly happens during the first several S4 cycles. Please refer to the attached "System_stuck_afterResumingFromS4_1.png".

Expected results:
Guest should resume successfully after S4.

Additional info:
When system_reset is issued from the qemu monitor, the guest prompts whether to "continue with system resume". If that option is selected, the resume continues successfully. Please refer to the attached "System_stuck_afterResumingFromS4_2.png".
Comment 1 dawu 2012-05-14 05:47:45 EDT
Created attachment 584307
System_stuck_afterResumingFromS4_1
Comment 2 dawu 2012-05-14 05:48:21 EDT
Created attachment 584308
System_stuck_afterResumingFromS4_2
Comment 3 dawu 2012-05-14 05:52:00 EDT
Hi Vadim,

If there are any mistakes in the steps described in this bug for the SVVP job "Common Scenario Stress With IO", please correct me. Thanks!

Best Regards,
Dawn
Comment 5 Vadim Rozenfeld 2012-05-14 06:19:55 EDT
Hi Dawn,
I don't see any mistake in your description.
Could you please export the System Event Log to a file and upload it as an 
attachment?

Best regards,
Vadim.
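
A minimal sketch of one way to produce such an export from inside the Windows guest, assuming Python is installed there and the stock wevtutil tool is on the PATH. The function names and the output file name are illustrative only and are not taken from this bug.

```python
import subprocess


def build_export_command(log_name="System"):
    """Return the wevtutil argv that queries an event log as plain text."""
    # "qe" queries events from the named log; "/f:text" selects text output.
    return ["wevtutil", "qe", log_name, "/f:text"]


def export_event_log(out_path, log_name="System"):
    """Write the text form of an event log to out_path (Windows guest only)."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if wevtutil reports a failure.
        subprocess.run(build_export_command(log_name), stdout=out, check=True)
```

Calling export_event_log("systemEventLogs.txt") in the guest would write a text file that can be uploaded as an attachment.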
Comment 6 Dor Laor 2012-05-14 10:11:47 EDT
Please also provide the guest configuration - are virtio drivers used?
Comment 7 Mike Cao 2012-05-14 11:00:36 EDT
(In reply to comment #6)
> Please also provide the guest configuration - are virtio drivers used?

Yes, we use virtio drivers. We hit this when running the SVVP test job "(system) Sleep stress with IO".

Mike
Comment 8 dawu 2012-05-15 01:10:04 EDT
Created attachment 584534
systemEventLogs.txt
Comment 9 dawu 2012-05-21 01:48:51 EDT
CLI:

/usr/libexec/qemu-kvm -m 256G -smp 64 -cpu cpu64-rhel6,+x2apic,family=0xf \
  -usb -device usb-tablet \
  -drive file=Intel_Max_Sut.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop \
  -device virtio-blk-pci,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 \
  -netdev tap,sndbuf=0,id=hostnet0,vhost=on,script=/etc/qemu-ifup0,downscript=no \
  -device virtio-net-pci,netdev=hostnet0,mac=00:10:1a:75:50:01,bus=pci.0,addr=0x4,id=virtio-net-pci0 \
  -uuid e8bfe003-1426-4924-b0ac-bd42360a1c36 \
  -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection \
  -chardev socket,id=111a,path=/tmp/intel-max-sut,server,nowait \
  -mon chardev=111a,mode=readline \
  -name intel-max-sut -vnc :1 \
  -drive file=en_windows_server_2008_r2_standard_enterprise_datacenter_and_web_with_sp1_x64_dvd_617601.iso,media=cdrom,id=cdrom,if=none \
  -device ide-drive,drive=cdrom
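
The command line above exposes a readline-mode (HMP) monitor on the UNIX socket /tmp/intel-max-sut, so the system_reset workaround from the description could be issued from the host roughly as sketched below. This is only a sketch under assumptions: the helper names send_hmp and reset_guest and the 2-second read timeout are illustrative and do not come from this bug.

```python
import socket

# Path taken from "-chardev socket,id=111a,path=/tmp/intel-max-sut" above.
MONITOR_PATH = "/tmp/intel-max-sut"


def send_hmp(sock, command, timeout=2.0):
    """Send one monitor command and return the text received before the timeout."""
    sock.settimeout(timeout)
    sock.sendall(command.encode("ascii") + b"\n")
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break  # peer closed the connection
            chunks.append(data)
    except socket.timeout:
        pass  # the monitor stays open, so stop reading once it goes quiet
    return b"".join(chunks).decode("ascii", errors="replace")


def reset_guest(path=MONITOR_PATH):
    """Connect to the monitor socket and issue system_reset."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        return send_hmp(sock, "system_reset")
```

Calling reset_guest() on the host while the guest is stuck would send the same system_reset that was typed manually in the description. The guest-side "continue with system resume" prompt still has to be answered in the guest console.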
Comment 11 RHEL Product and Program Management 2012-07-10 02:08:55 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 12 RHEL Product and Program Management 2012-07-10 22:07:38 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 13 Ronen Hod 2012-07-17 05:19:33 EDT
Might be the same issue as Bug 820112
Comment 14 Ronen Hod 2012-07-17 05:21:02 EDT
This is a tough bug that requires a lot of time and patience.
The price-performance of working on it currently is not good, with all the Win8/HCK issues.
Deferring to 6.5.
Comment 20 Ronen Hod 2014-08-06 04:58:19 EDT
QE,
There were so many changes since RHEL6.4. Please recheck.
Comment 21 juzhang 2014-08-11 04:53:29 EDT
(In reply to Ronen Hod from comment #20)
> QE,
> There were so many changes since RHEL6.4. Please recheck.

Hi Xiangchun,

Can you handle this?

Best Regards,
Junyi
Comment 22 FuXiangChun 2014-08-11 05:03:43 EDT
(In reply to juzhang from comment #21)
> (In reply to Ronen Hod from comment #20)
> > QE,
> > There were so many changes since RHEL6.4. Please recheck.
> 
> Hi Xiangchun,
> 
> Can you handle this?
> 
> Best Regards,
> Junyi

Junyi,
I will find a machine to re-test it.
Comment 23 FuXiangChun 2014-08-12 06:38:40 EDT
QE could not find a large-memory machine, so this was re-tested with -m 12G -smp 64.

Steps followed comment 0.

Result:
S4 pass, resume pass.

QE will continue testing once a large-memory machine is available and will update the test result as soon as possible.
