996038 – it takes 8~30 minutes or more to resume rhel guest from S4


Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	qemu-kvm
Sub Component:
Version:	6.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Marcel Apfelbaum
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1135383
Depends On:
Blocks:	912287 1056252

Reported:	2013-08-12 09:29 UTC by Chao Yang
Modified:	2014-09-01 01:43 UTC
CC List:	15 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2014-03-23 12:50:32 UTC
Target Upstream Version:
Embargoed:

Attachments:
dmesg from serial (50.92 KB, text/x-log) 2013-08-12 09:29 UTC, Chao Yang

Description Chao Yang 2013-08-12 09:29:14 UTC

Description of problem:
Booted a RHEL 6.4 guest with a newer kernel installed, suspended it to disk, then tried to resume it; the resume took a long time.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.382.el6.x86_64
seabios-0.6.1.2-28.el6.x86_64
kernel 2.6.32-407.el6.x86_64 (both guest and host)


How reproducible:
100%

Steps to Reproduce:
1. Boot a RHEL guest with:
 /usr/libexec/qemu-kvm -name test -M rhel6.5.0 -cpu host -enable-kvm -m 4096 \
   -smp 4,sockets=4,cores=2,threads=1,maxcpus=8 \
   -rtc base=utc,clock=host,driftfix=slew \
   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
   -device ich9-usb-ehci1,id=ehci,addr=3.0 \
   -drive file=/home/usb-storage.qcow2,if=none,id=usb-storage,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device usb-storage,drive=usb-storage,id=usb_storage_1 \
   -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=4.0 \
   -chardev socket,id=channel0,path=/tmp/socket,server,nowait \
   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=channel0,name=port0 \
   -device virtio-scsi-pci,id=scsi,addr=5.0 \
   -drive file=/home/scsi-storage.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device scsi-disk,bus=scsi.0,drive=drive-virtio-disk0,id=virtio-disk0,lun=0 \
   -drive file=/home/ide.qcow2,if=none,id=ide,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native \
   -device ide-drive,drive=ide,bus=ide.1,unit=1 \
   -drive file=/home/device_interrupt.raw,if=none,id=drive-virtio-0-0,media=disk,format=raw,cache=none \
   -device virtio-blk-pci,drive=drive-virtio-0-0,id=virt0-0-0,bootindex=1,addr=6.0 \
   -netdev tap,id=hostnet1 \
   -device e1000,netdev=hostnet1,id=net1,mac=00:1a:4a:42:48:12,bus=pci.0,addr=a.0 \
   -netdev tap,id=hostnet0,vhost=on \
   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:42:48:ab,bus=pci.0,addr=7.0 \
   -spice port=5900,disable-ticketing,seamless-migration=on \
   -k en-us -vga cirrus -vnc :1 \
   -device intel-hda,id=sound0,bus=pci.0,addr=8.0 \
   -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=9.0 \
   -monitor stdio \
   -serial unix:/tmp/serial,server,nowait \
   -global PIIX4_PM.disable_s3=0 \
   -global PIIX4_PM.disable_s4=0

2. Suspend the guest to disk with:
 echo disk > /sys/power/state

3. Resume it by launching qemu-kvm with exactly the same command line as in step 1.
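
Editor's note: step 2 only works if the guest kernel advertises S4. A minimal guest-side sketch of that check follows. The function names supports_s4 and suspend_to_disk and the STATE_FILE override are illustrative assumptions, not part of the original report; only /sys/power/state and the "disk" keyword come from the steps above.

```shell
#!/bin/sh
# Guest-side sketch: confirm "disk" (S4) is advertised before suspending.
# STATE_FILE is overridable (assumption) so the check can be tried without root.
STATE_FILE="${STATE_FILE:-/sys/power/state}"

# supports_s4 FILE: succeed if FILE lists "disk" among its states.
supports_s4() {
    # The file holds a space-separated list of states, e.g. "mem disk".
    for state in $(cat "$1" 2>/dev/null); do
        if [ "$state" = "disk" ]; then
            return 0
        fi
    done
    return 1
}

# suspend_to_disk: same write as step 2, guarded by the check above.
suspend_to_disk() {
    if supports_s4 "$STATE_FILE"; then
        echo disk > "$STATE_FILE"
    else
        echo "S4 is not advertised in $STATE_FILE" >&2
        return 1
    fi
}
```

Sourcing the file and calling suspend_to_disk as root inside the guest performs the same write as step 2. If "disk" is missing, checking that the guest was started with -global PIIX4_PM.disable_s4=0 is a reasonable first step.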

Actual results:
Resuming took more than 8 minutes.

Expected results:


Additional info:
1. The guest appeared to hang at "Suspending console(s) (use no_console_suspend to debug)" for a long time.
2. The guest caught up to real time after resuming from S4.
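
Editor's note: one way to locate where the time goes in a serial log is to look for the largest jump between consecutive kernel timestamps. The sketch below assumes the log lines carry printk timestamps of the form "[  123.456789]"; whether the attached log has them is not stated in this report. Kernel timestamps may also not advance across the suspend itself, so treat the result as a hint only. The function name largest_gap is illustrative.

```shell
#!/bin/sh
# largest_gap FILE: print the biggest jump (in seconds) between consecutive
# printk timestamps, followed by a tab and the line that ends the jump.
# Lines without a "[ seconds.fraction]" timestamp are ignored.
largest_gap() {
    awk '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets and convert the timestamp to a number.
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (seen && ts - prev > gap) {
                gap = ts - prev
                line = $0
            }
            prev = ts
            seen = 1
        }
        END {
            if (seen) printf "%.3f\t%s\n", gap, line
        }
    ' "$1"
}
```

Usage, with a hypothetical file name: largest_gap dmesg-from-serial.log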

Comment 1 Chao Yang 2013-08-12 09:29:56 UTC

Created attachment 785624
dmesg from serial

Comment 3 Chao Yang 2013-08-12 10:01:08 UTC

I also tested S3 with the same command line; the guest resumed quickly instead of taking many minutes.

Comment 4 Ademar Reis 2013-08-12 22:37:12 UTC

(In reply to chayang from comment #0)
> Description of problem:
> Booted a rhel6.4 guest with newer kernel installed, suspended to disk, then
> tried to resume it, but it took a long time to get resumed. 
> 

Looks like a regression then, can you confirm? Even though we don't support S3/S4, it would be good to keep this use case working, especially considering it's a RHEL6/RHEL6 setup.

Comment 5 Chao Yang 2013-08-13 03:14:20 UTC

(In reply to Ademar de Souza Reis Jr. from comment #4)
> (In reply to chayang from comment #0)
> > Description of problem:
> > Booted a rhel6.4 guest with newer kernel installed, suspended to disk, then
> > tried to resume it, but it took a long time to get resumed. 
> > 
> 
> Looks like a regression then, can you confirm? Even though we don't support
> S3/S4, it would be good to keep this use-case working, specially considering
> it's RHEL6/RHEL6 setup.

Yes, this is a regression.
I tried qemu-kvm-0.12.1.2-2.355.el6.x86_64 with exactly the same CLI as in comment 0. The guest took less than 20 seconds to resume from S4. Adding the 'Regression' keyword.

Comment 7 Amit Shah 2013-08-13 11:09:29 UTC

Please narrow down which build introduced the regression. Right now we know -355 is the good build and -382 is the bad one.

Comment 8 Chao Yang 2013-08-14 14:51:01 UTC

(In reply to Amit Shah from comment #7)
> Please narrow down the builds which introduce the regression.  Right now we
> know -355 is the good build and -382 is the bad one.

I retested with -382 and -381; this time it took only about 1 minute to resume. With -377, however, it took only about 5 seconds.
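
Editor's note: narrowing a range between a known-good and a known-bad build is a bisection. The sketch below assumes that build numbers are consecutive integers and that every intermediate build can be installed, neither of which is confirmed in this report. The function names bisect_builds and is_bad are illustrative, and the interactive prompt is a placeholder for whatever test is actually run. Since the timings in this thread vary between runs, each build probably needs more than one test before answering.

```shell
#!/bin/sh
# is_bad BUILD: return 0 if the build shows the slow resume.
# Placeholder (assumption): ask the tester; replace with an automated check.
is_bad() {
    printf 'Is build -%s slow to resume from S4? [y/n] ' "$1" >&2
    read -r answer
    [ "$answer" = "y" ]
}

# bisect_builds GOOD BAD: print the first bad build in the range (GOOD, BAD].
# Requires GOOD to be known good and BAD to be known bad.
bisect_builds() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad"
}
```

With the numbers from this comment, the call would be: bisect_builds 377 381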

Comment 9 Qunfang Zhang 2013-08-15 02:50:10 UTC

Some of our team members report that it takes more than half an hour to resume a guest from S4, so our test cases that include S4 steps will be blocked.

Comment 10 langfang 2013-08-15 03:11:51 UTC

Hit this problem on the latest version:
Host:
# uname -r
2.6.32-412.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.387.el6.x86_64

Guest:
2.6.32-412.el6.x86_64

Steps:
1. Boot guest:
 /usr/libexec/qemu-kvm -M rhel6.5.0 -m 4G -smp 2 \
   -device virtio-scsi-pci,bus=pci.0,addr=0x5,id=scsi0 \
   -drive file=/mnt/rhel6.5-newinstall.qcow2,if=none,id=drive-scsi0-0-0,media=disk,cache=none,format=qcow2,werror=stop,rerror=stop,aio=native \
   -device scsi-hd,drive=drive-scsi0-0-0,bus=scsi0.0,scsi-id=0,lun=0,id=flang,bootindex=1 \
   -spice port=5840,disable-ticketing \
   -vga qxl \
   -qmp tcp:0:5556,server,nowait \
   -global PIIX4_PM.disable_s3=0 \
   -global PIIX4_PM.disable_s4=0 \
   -serial unix:/tmp/tty0,server,nowait \
   -boot menu=on \
   -monitor stdio \
   -device virtio-balloon-pci,bus=pci.0,id=balloon0 \
   -netdev tap,vhost=on,id=hostnet0,script=/etc/qemu-ifup \
   -device virtio-net-pci,netdev=hostnet0,mac=00:10:20:2c:45:23,bus=pci.0,addr=0x4,id=net0 \
   -drive file=/home/RHEL6.4-20130123.0-Server-x86_64-DVD1.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
   -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x7 \
   -chardev socket,id=channel0,path=/tmp/serial,server,nowait \
   -device virtserialport,chardev=channel0,name=org.linux-kvm.port.0,bus=virtio-serial0.0,id=port1

2. Do S4

Results:

Do S4 (waited about 4 minutes, then qemu quit) --> boot the guest with the same CLI --> waited about 20 minutes, still could not resume

Comment 13 Amit Shah 2013-08-21 08:31:01 UTC

(In reply to chayang from comment #8)
> (In reply to Amit Shah from comment #7)
> > Please narrow down the builds which introduce the regression.  Right now we
> > know -355 is the good build and -382 is the bad one.
> 
> I retested with -382 and -381, this time it just took about 1 minute to
> resume. But with -377, it only took about 5 seconds to resume.

The diff between -377 and -381 doesn't show anything that touches ACPI or anything else that should affect S4.

Comment 18 langfang 2013-08-27 12:13:36 UTC

Tested this on the latest version:

Host:
# uname -r 
2.6.32-414.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.398.el6.x86_64
# rpm -q seabios
seabios-0.6.1.2-28.el6.x86_64


Guest: Windows


Results:
win8 (32-bit) --> S4 --> resume --> successful
win8 (64-bit) --> S4 --> resume --> successful
win7 (32-bit) --> S4 --> resume --> successful
win7 (64-bit) --> S4 --> resume --> BSOD

Bug 1001616 - win7-64 guest bsod while enter s4 state

Comment 22 Qunfang Zhang 2014-09-01 01:43:38 UTC

*** Bug 1135383 has been marked as a duplicate of this bug. ***
